Package: creditmodel 1.3.1

creditmodel: Toolkit for Credit Modeling, Analysis and Visualization

Provides a highly efficient R tool suite for Credit Modeling, Analysis and Visualization. Contains infrastructure functionalities such as data exploration and preparation, missing values treatment, outliers treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyperparameters, data mining and visualization, model evaluation, strategy analysis, etc. This package is designed to make the development of binary classification models (machine learning based models as well as credit scorecards) simpler and faster. References: (1) Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS; (2) Bezdek, James C. FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences (0098-3004), <doi:10.1016/0098-3004(84)90020-7>.

Authors: Dongping Fan [aut, cre]

creditmodel_1.3.1.tar.gz
creditmodel_1.3.1.zip (r-4.5), creditmodel_1.3.1.zip (r-4.4), creditmodel_1.3.1.zip (r-4.3)
creditmodel_1.3.1.tgz (r-4.4-any), creditmodel_1.3.1.tgz (r-4.3-any)
creditmodel_1.3.1.tar.gz (r-4.5-noble), creditmodel_1.3.1.tar.gz (r-4.4-noble)
creditmodel_1.3.1.tgz (r-4.4-emscripten), creditmodel_1.3.1.tgz (r-4.3-emscripten)
creditmodel.pdf | creditmodel.html
creditmodel/json (API)

# Install 'creditmodel' in R:
install.packages('creditmodel', repos = c('https://fanhansen.r-universe.dev', 'https://cloud.r-project.org'))

This package does not link to any Github/Gitlab/R-forge repository. No issue tracker or development information is available.

181 exports 4 stars 0.49 score 44 dependencies 14 scripts 698 downloads

Last updated 3 years ago from: a4f0795017. Checks: OK: 3, NOTE: 4. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Sep 01 2024
R-4.5-win | NOTE | Sep 01 2024
R-4.5-linux | NOTE | Sep 01 2024
R-4.4-win | NOTE | Sep 01 2024
R-4.4-mac | NOTE | Sep 01 2024
R-4.3-win | OK | Sep 01 2024
R-4.3-mac | OK | Sep 01 2024

Exports: %alike%, %islike%, add_variable_process, address_varieble, analysis_nas, analysis_outliers, as_percent, auc_value, avg_x, char_cor, char_cor_vars, char_to_num, checking_data, city_varieble, city_varieble_process, cnt_x, cohort_plot, cohort_table_plot, colAllnas, colAllzeros, colMaxMins, color_ramp_palette, colSds, cor_heat_plot, cor_plot, cos_sim, customer_segmentation, cut_equal, cv_split, data_cleansing, data_exploration, date_cut, de_one_hot_encoding, de_percent, derived_interval, derived_partial_acf, derived_pct, derived_ts, derived_ts_vars, digits_num, entropy_weight, entry_rate_na, euclid_dist, fast_high_cor_filter, feature_selector, fuzzy_cluster, fuzzy_cluster_means, gather_data, gbm_filter, gbm_params, get_auc_ks_lambda, get_bins_table, get_bins_table_all, get_breaks, get_breaks_all, get_correlation_group, get_iv, get_iv_all, get_logistic_coef, get_median, get_names, get_nas_random, get_partial_dependence_plots, get_psi, get_psi_all, get_psi_iv, get_psi_iv_all, get_psi_plots, get_score_card, get_shadow_nas, get_sim_sign_lambda, get_tree_breaks, get_x_list, high_cor_filter, high_cor_selector, is_date, knn_nas_imp, ks_plot, ks_psi_plot, ks_table, ks_table_plot, ks_value, lasso_filter, lift_plot, lift_value, local_outlier_factor, log_trans, log_vars, loop_function, love_color, low_variance_filter, lr_params, lr_params_search, lr_vif, max_min_norm, max_x, merge_category, min_max_norm, min_x, model_key_index, model_result_plot, multi_grid, multi_left_join, n_char, null_blank_na, one_hot_encoding, outliers_detection, outliers_kmeans_lof, p_to_score, partial_dependence_plot, PCA_reduce, perf_table, plot_colors, plot_oot_perf, plot_table, plot_theme, pred_score, process_nas, process_nas_var, process_outliers, psi_iv_filter, psi_plot, quick_as_df, ranking_percent_dict, ranking_percent_dict_x, ranking_percent_proc, ranking_percent_proc_x, re_code, re_name, read_data, reduce_high_cor_filter, remove_duplicated, replace_value, replace_value_x, require_packages, rf_params, roc_plot, rowAll, rowAllnas, rowAny, rowCVs, rowMaxMins, rowMaxs, rowMins, rowSds, save_data, score_distribution_plot, score_transfer, select_best_breaks, select_best_class, select_cor_group, select_cor_list, sim_str, split_bins, split_bins_all, sql_hive_text_parse, start_parallel_computing, stop_parallel_computing, str_match, sum_table, sum_x, term_filter, term_idf, term_tfidf, time_series_proc, time_transfer, time_variable, time_vars_process, tnr_value, train_lr, train_test_split, train_xgb, training_model, var_group_proc, variable_process, woe_trans, woe_trans_all, xgb_data, xgb_filter, xgb_params, xgb_params_search

Dependencies: cli, codetools, colorspace, data.table, doParallel, dplyr, fansi, farver, foreach, generics, ggplot2, glmnet, glue, gtable, isoband, iterators, jsonlite, labeling, lattice, lifecycle, magrittr, MASS, Matrix, mgcv, munsell, nlme, pillar, pkgconfig, R6, RColorBrewer, Rcpp, RcppEigen, rlang, rpart, scales, shape, survival, tibble, tidyselect, utf8, vctrs, viridisLite, withr, xgboost

Introduction to creditmodel

Rendered from introduction.Rmd using knitr::rmarkdown on Sep 01 2024.

Last update: 2020-11-09
Started: 2019-10-23

Readme and manuals

Help Manual

Help page | Topics
creditmodel: toolkit for credit modeling and data analysis | creditmodel-package creditmodel
Fuzzy String matching | %alike%
Fuzzy String matching | %islike%
add_variable_process | add_variable_process
address_varieble | address_varieble
Missing Analysis | analysis_nas
Outliers Analysis | analysis_outliers
Percent Format | as_percent
auc_value: 'auc_value' is used to get the best lambda required in 'lasso_filter'. | auc_value
Cramer's V matrix between categorical variables | char_cor char_cor_vars
Character to number | char_to_num
Checking Data | checking_data
city_varieble | city_varieble
Processing of Address Variables | city_varieble_process
cohort_table_plot: 'cohort_table_plot' is for plotting the cohort (vintage) analysis table. | cohort_plot cohort_table_plot
Correlation Heat Plot | cor_heat_plot
Correlation Plot | cor_plot
cos_sim | cos_sim
Customer Segmentation | customer_segmentation
Generating Initial Equal Size Sample Bins | cut_equal
Stratified Folds | cv_split
Data Cleaning | data_cleansing
Data Exploration | data_exploration
Date Time Cut Point | date_cut
Recovery One-Hot Encoding | de_one_hot_encoding
Recovery Percent Format | de_percent
derived_interval | derived_interval
derived_partial_acf | derived_partial_acf
derived_pct | derived_pct
Derivation of Behavioral Variables | derived_ts derived_ts_vars
Number of digits | digits_num
Entropy Weight Method | entropy_weight
Max Percent of Missing Value | entry_rate_na
euclid_dist | euclid_dist
Functions of xgboost feval | eval_auc eval_ks eval_lift eval_tnr
Entropy Weight Method Data | ewm_data
high_cor_filter | fast_high_cor_filter high_cor_filter
Feature Selection Wrapper | feature_selector
Fuzzy Cluster Means | fuzzy_cluster fuzzy_cluster_means
Gather or aggregate data | gather_data
Select Features using GBM | gbm_filter
GBM Parameters | gbm_params
get_auc_ks_lambda: 'get_auc_ks_lambda' is used to get the best lambda required in 'lasso_filter'. | get_auc_ks_lambda
Table of Binning | get_bins_table get_bins_table_all
Generates Best Breaks for Binning | get_breaks get_breaks_all
get_correlation_group | get_correlation_group select_cor_group select_cor_list
Calculate Information Value (IV): 'get_iv' calculates the Information Value (IV) of an independent variable; 'get_iv_all' loops through IV for all specified independent variables. | get_iv get_iv_all
Get logistic coef | get_logistic_coef
Get central value | get_median
Get Variable Names | get_names
get_nas_random | get_nas_random
Calculate Population Stability Index (PSI): 'get_psi' calculates the Population Stability Index (PSI) of an independent variable; 'get_psi_all' loops through PSI for all specified independent variables. | get_psi get_psi_all
Calculate IV & PSI | get_psi_iv get_psi_iv_all
Plot PSI (Population Stability Index) | get_psi_plots psi_plot
Score Card | get_score_card
get_shadow_nas | get_shadow_nas
get_sim_sign_lambda: 'get_sim_sign_lambda' is used to get the best lambda required in 'lasso_filter'. | get_sim_sign_lambda
Getting the breaks for terminal nodes from decision tree | get_tree_breaks
Get X List | get_x_list
Compare the two highly correlated variables | high_cor_selector
is_date | is_date
Impute NAs using KNN | knn_nas_imp
ks_table & plot | ks_psi_plot ks_table ks_table_plot model_key_index
ks_value | ks_value
Variable selection by LASSO | lasso_filter
Lending Club data | lendingclub
lift_value | lift_value
local_outlier_factor: 'local_outlier_factor' calculates the LOF factor for a data set using kNN. This function is not intended to be used by end users. | local_outlier_factor
Logarithmic transformation | log_trans log_vars
Loop Function: 'loop_function' is an iterator to loop through | loop_function
love_color | love_color
Filtering Low Variance Variables | low_variance_filter
Logistic Regression & Scorecard Parameters | lr_params lr_params_search
Variance-Inflation Factors | lr_vif
Max Min Normalization | max_min_norm
Merge Category | merge_category
Min Max Normalization | min_max_norm
Model result plots: 'model_result_plot' is a wrapper of the following: 'perf_table' generates a model performance table; 'ks_plot' is for K-S; 'roc_plot' is for ROC; 'lift_plot' is for the Lift Chart; 'score_distribution_plot' plots the score distribution. | ks_plot lift_plot model_result_plot perf_table roc_plot score_distribution_plot
Arrange list of plots into a grid | multi_grid
multi_left_join | multi_left_join
The length of a string | n_char
Encode NAs | null_blank_na
One-Hot Encoding | one_hot_encoding
Outliers Detection: 'outliers_detection' detects outliers using Kmeans and Local Outlier Factor (lof). | outliers_detection
Entropy | e_ij p_ij
Prob to score | p_to_score
partial_dependence_plot | get_partial_dependence_plots partial_dependence_plot
PCA Dimension Reduction | PCA_reduce
Plot Colors | color_ramp_palette plot_colors
plot_oot_perf: 'plot_oot_perf' is for plotting performance of cross time samples in the future. | plot_oot_perf
plot_table | plot_table
plot_theme | plot_theme
pred_score | pred_score
Missing Treatment | process_nas process_nas_var
Outliers Treatment | outliers_kmeans_lof process_outliers
Variable reduction based on Information Value & Population Stability Index filter | psi_iv_filter
List as data.frame quickly | quick_as_df
Ranking Percent Process | ranking_percent_dict ranking_percent_dict_x ranking_percent_proc ranking_percent_proc_x
re_code: 're_code' searches for matches to argument pattern within each element of a character vector. | re_code
Rename | re_name
Read data | check_data_format read_data
Filtering highly correlated variables with reduce method | reduce_high_cor_filter
Remove Duplicated Observations | remove_duplicated
Replace Value | replace_value replace_value_x
Packages required and installation | require_packages
Random Forest Parameters | rf_params
Functions for vector operation | avg_x cnt_x colAllnas colAllzeros colMaxMins colSds max_x min_x rowAll rowAllnas rowAny rowCVs rowMaxMins rowMaxs rowMins rowSds sum_x
Save data | save_data
Score Transformation | score_transfer
Generates Best Binning Breaks | select_best_breaks select_best_class
sim_str | sim_str
split_bins | split_bins
Split bins all | split_bins_all
Automatic production of hive SQL | sql_hive_text_parse
Parallel computing and export variables to global Env. | start_parallel_computing
Stop parallel computing | stop_parallel_computing
String match: 'str_match' searches for matches to argument pattern within each element of a character vector. | str_match
Summary table | sum_table
TF-IDF | term_filter term_idf term_tfidf
Process time series data | time_series_proc
Time Format Transferring | time_transfer
time_variable | time_variable
Processing of Time or Date Variables | time_vars_process
tnr_value | tnr_value
Training LR model | train_lr
Train-Test-Split | train_test_split
Training XGboost | train_xgb
Training model | training_model
UCI Credit Card data | UCICreditCard
Process group numeric variables | var_group_proc
variable_process | variable_process
WOE Transformation | woe_trans woe_trans_all
XGboost data | xgb_data
Select Features using XGB | xgb_filter
XGboost Parameters | xgb_params xgb_params_search
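
Several help topics above refer to Weight of Evidence (WOE), Information Value (IV) and the Population Stability Index (PSI). As a rough illustration of what these quantities measure, here is a minimal base R sketch using the common textbook definitions. It does not call creditmodel and may differ from the package's own implementation (for example in the WOE sign convention or in how empty bins are handled). The helper names `woe_iv_sketch` and `psi_sketch` and the toy data are hypothetical.

```r
# Hypothetical helper (not part of creditmodel): WOE and IV for one binned variable.
# Assumes target is coded 0/1 with 1 = bad, and that every bin contains both classes.
woe_iv_sketch <- function(bins, target) {
  tab <- table(bins, target)                  # counts per bin and class
  good_dist <- tab[, "0"] / sum(tab[, "0"])   # share of all goods falling in each bin
  bad_dist  <- tab[, "1"] / sum(tab[, "1"])   # share of all bads falling in each bin
  woe <- log(good_dist / bad_dist)            # sign convention here is an assumption
  iv  <- sum((good_dist - bad_dist) * woe)    # IV does not depend on the WOE sign
  list(woe = woe, iv = iv)
}

# Hypothetical helper: PSI between an expected and an actual binned sample.
# Assumes every bin is non-empty in both samples.
psi_sketch <- function(expected_bins, actual_bins) {
  lv <- union(unique(expected_bins), unique(actual_bins))
  e <- prop.table(table(factor(expected_bins, levels = lv)))  # expected bin shares
  a <- prop.table(table(factor(actual_bins, levels = lv)))    # actual bin shares
  sum((a - e) * log(a / e))
}

# Toy data, made up for illustration only
bins   <- c("low", "low", "low", "low", "high", "high", "high", "high")
target <- c(0, 0, 0, 1, 0, 1, 1, 1)

res <- woe_iv_sketch(bins, target)
psi_same <- psi_sketch(bins, bins)  # identical samples, so no population shift
```

In this toy example the two bins mirror each other, so the WOE values have equal magnitude and opposite sign, and comparing a sample with itself gives a PSI of zero.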