train/test

by 안행주의 2020. 7. 9.

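Home Credit offers three main types of products.

  • POS loans: financing provided at the point of sale (in store or online), at the time of purchase, for the goods/services the customer is buying
  • Cash loans: loans typically provided to customers without specifying the goods/services to be purchased, regardless of local regulatory requirements and sales-related matters
  • Revolving loans: credit typically provided to existing customers, up to their personal credit limit, for purchases of goods/services on a revolving basis; this includes credit cards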

 

1,application_{train|test}.csv,SK_ID_CURR,ID of loan in our sample,

Loan ID

 

2,application_{train|test}.csv,TARGET,"Target variable (1 - client with payment difficulties: he/she had late payment more than X days on at least one of the first Y installments of the loan in our sample, 0 - all other cases)",

['TARGET'].value_counts() 

0    282686 
1     24825 : clients with payment difficulties (payment more than X days late on at least one of the first Y installments of the loan)
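
The two counts above imply a strongly imbalanced target. The following is a minimal sketch that recomputes the class shares from those counts only; the variable names are illustrative and nothing here reads the actual CSV.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
target_counts = pd.Series({0: 282686, 1: 24825}, name="TARGET")

# Share of each class among all applications.
target_share = target_counts / target_counts.sum()
positive_rate = target_share.loc[1]

print(f"share of TARGET == 1: {positive_rate:.4f}")
```

Roughly 8% of the applications are labelled 1, so plain accuracy is probably a poor metric for this target.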

 

5,application_{train|test}.csv,NAME_CONTRACT_TYPE,Identification if loan is cash or revolving,

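Whether the loan is a cash loan or a revolving loan

*Revolving: a scheme in which only a minimum amount is paid on the agreed due date and the remaining balance is carried over as a loan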

 

6,application_{train|test}.csv,CODE_GENDER,Gender of the client, gender [F/M/XNA]

7,application_{train|test}.csv,FLAG_OWN_CAR,Flag if the client owns a car, car ownership [Y/N]

8,application_{train|test}.csv,FLAG_OWN_REALTY,Flag if client owns a house or flat, house/flat ownership [Y/N]

 

9,application_{train|test}.csv,CNT_CHILDREN,Number of children the client has, number of children

10,application_{train|test}.csv,AMT_INCOME_TOTAL,Income of the client, income

 

11,application_{train|test}.csv,AMT_CREDIT,Credit amount of the loan, total loan amount

12,application_{train|test}.csv,AMT_ANNUITY,Loan annuity, amount to be paid each month (including interest)

13,application_{train|test}.csv,AMT_GOODS_PRICE,For consumer loans it is the price of the goods for which the loan is given, total price of the goods the loan was taken out to buy


14,application_{train|test}.csv,NAME_TYPE_SUITE,Who was accompanying client when he was applying for the loan,

['Unaccompanied' 'Family' 'Spouse, partner' 'Children' 'Other_A' nan 'Other_B' 'Group of people']

Person accompanying the client when applying for the loan

 

15,application_{train|test}.csv,NAME_INCOME_TYPE,"Clients income type (businessman, working, maternity leave,... )",

['Working' 'State servant' 'Commercial associate' 'Pensioner' 'Unemployed' 'Student' 'Businessman' 'Maternity leave']

Income type

 

16,application_{train|test}.csv,NAME_EDUCATION_TYPE,Level of highest education the client achieved,

['Secondary / secondary special' 'Higher education' 'Incomplete higher' 'Lower secondary' 'Academic degree']

Highest level of education

 

17,application_{train|test}.csv,NAME_FAMILY_STATUS,Family status of the client,

['Single / not married' 'Married' 'Civil marriage' 'Widow' 'Separated' 'Unknown']

Family status


18,application_{train|test}.csv,NAME_HOUSING_TYPE,"What is the housing situation of the client (renting, living with parents, ...)",

['House / apartment' 'Rented apartment' 'With parents' 'Municipal apartment' 'Office apartment' 'Co-op apartment']

Client's housing situation


19,application_{train|test}.csv,REGION_POPULATION_RELATIVE,Normalized population of region where client lives (higher number means the client lives in more populated region),normalized

Normalized population of the region where the client lives (a higher number means a more populated region)

 

20,application_{train|test}.csv,DAYS_BIRTH,Client's age in days at the time of application,time only relative to the application

Client's age in days on the application date

 

21,application_{train|test}.csv,DAYS_EMPLOYED,How many days before the application the person started current employment,time only relative to the application

Number of days the client has been in the current job, relative to the application date


22,application_{train|test}.csv,DAYS_REGISTRATION,How many days before the application did client change his registration,time only relative to the application

Number of days before the application that the client changed their registration


23,application_{train|test}.csv,DAYS_ID_PUBLISH,How many days before the application did client change the identity document with which he applied for the loan,time only relative to the application

Number of days before the application that the client changed the identity document used to apply for the loan
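
The DAYS_* columns above are offsets relative to the application date, so they are easier to read once converted to years. The sketch below assumes that dates in the past are stored as negative day counts; the sample values and the AGE_YEARS / EMPLOYED_YEARS column names are hypothetical.

```python
import pandas as pd

# Hypothetical rows; past dates are assumed to be stored as negative offsets.
sample = pd.DataFrame({
    "DAYS_BIRTH": [-9461, -16765],
    "DAYS_EMPLOYED": [-637, -1188],
})

# Flip the sign and divide by 365 to get approximate years.
sample["AGE_YEARS"] = -sample["DAYS_BIRTH"] / 365
sample["EMPLOYED_YEARS"] = -sample["DAYS_EMPLOYED"] / 365

print(sample.round(1))
```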


24,application_{train|test}.csv,OWN_CAR_AGE,Age of client's car,

Age of the client's car


25,application_{train|test}.csv,FLAG_MOBIL,"Did client provide mobile phone (1=YES, 0=NO)",

Whether the client provided a mobile phone number

26,application_{train|test}.csv,FLAG_EMP_PHONE,"Did client provide work phone (1=YES, 0=NO)",

Whether the client provided a work phone number

27,application_{train|test}.csv,FLAG_WORK_PHONE,"Did client provide home phone (1=YES, 0=NO)",

Whether the client provided a home phone number

28,application_{train|test}.csv,FLAG_CONT_MOBILE,"Was mobile phone reachable (1=YES, 0=NO)",

Whether the client's mobile phone was reachable

29,application_{train|test}.csv,FLAG_PHONE,"Did client provide home phone (1=YES, 0=NO)",

Whether the client provided a home phone number

30,application_{train|test}.csv,FLAG_EMAIL,"Did client provide email (1=YES, 0=NO)",

Whether the client provided an email address

31,application_{train|test}.csv,OCCUPATION_TYPE,What kind of occupation does the client have,

Client's occupation

 

32,application_{train|test}.csv,CNT_FAM_MEMBERS,How many family members does client have,

Number of family members of the client

 

33,application_{train|test}.csv,REGION_RATING_CLIENT,"Our rating of the region where client lives (1,2,3)",

Our rating of the region where the client lives (1, 2, 3)

 

34,application_{train|test}.csv,REGION_RATING_CLIENT_W_CITY,"Our rating of the region where client lives with taking city into account (1,2,3)",

Our rating of the region where the client lives, taking the city into account (1, 2, 3)

 

35,application_{train|test}.csv,WEEKDAY_APPR_PROCESS_START,On which day of the week did the client apply for the loan,

Day of the week on which the client applied for the loan

 

36,application_{train|test}.csv,HOUR_APPR_PROCESS_START,Approximately at what hour did the client apply for the loan,rounded

Approximate hour at which the client applied for the loan


37,application_{train|test}.csv,REG_REGION_NOT_LIVE_REGION,"Flag if client's permanent address does not match contact address (1=different, 0=same, at region level)",

Whether the client's permanent address differs from the contact address (region level; 1 = different)

38,application_{train|test}.csv,REG_REGION_NOT_WORK_REGION,"Flag if client's permanent address does not match work address (1=different, 0=same, at region level)",

Whether the client's permanent address differs from the work address (region level; 1 = different)

39,application_{train|test}.csv,LIVE_REGION_NOT_WORK_REGION,"Flag if client's contact address does not match work address (1=different, 0=same, at region level)",

Whether the client's contact address differs from the work address (region level; 1 = different)

40,application_{train|test}.csv,REG_CITY_NOT_LIVE_CITY,"Flag if client's permanent address does not match contact address (1=different, 0=same, at city level)",

Whether the client's permanent address differs from the contact address (city level; 1 = different)

41,application_{train|test}.csv,REG_CITY_NOT_WORK_CITY,"Flag if client's permanent address does not match work address (1=different, 0=same, at city level)",

Whether the client's permanent address differs from the work address (city level; 1 = different)

42,application_{train|test}.csv,LIVE_CITY_NOT_WORK_CITY,"Flag if client's contact address does not match work address (1=different, 0=same, at city level)",

Whether the client's contact address differs from the work address (city level; 1 = different)


43,application_{train|test}.csv,ORGANIZATION_TYPE,Type of organization where client works,

Type of organization where the client works


44,application_{train|test}.csv,EXT_SOURCE_1,Normalized score from external data source,normalized

Normalized score from an external data source

45,application_{train|test}.csv,EXT_SOURCE_2,Normalized score from external data source,normalized

46,application_{train|test}.csv,EXT_SOURCE_3,Normalized score from external data source,normalized

 

47,application_{train|test}.csv,APARTMENTS_AVG,"Normalized information about building where the client lives, What is average (_AVG suffix), modus (_MODE suffix), median (_MEDI suffix) apartment size, common area, living area, age of building, number of elevators, number of entrances, state of the building, number of floor",normalized

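Normalized information about the building where the client lives: average (_AVG suffix), mode (_MODE suffix), median (_MEDI suffix) of apartment size, common area, living area, age of building, number of elevators, number of entrances, state of the building, number of floors

47. APARTMENTS_AVG: average apartment size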

48. BASEMENTAREA_AVG 

49. YEARS_BEGINEXPLUATATION_AVG

50. YEARS_BUILD_AVG

51. COMMONAREA_AVG

52. ELEVATORS_AVG

53. ENTRANCES_AVG

54. FLOORSMAX_AVG

55. FLOORSMIN_AVG

56. LANDAREA_AVG

57. LIVINGAPARTMENTS_AVG

58. LIVINGAREA_AVG

59. NONLIVINGAPARTMENTS_AVG

60. NONLIVINGAREA_AVG

 

61. APARTMENTS_MODE

62. BASEMENTAREA_MODE

63. YEARS_BEGINEXPLUATATION_MODE

64. YEARS_BUILD_MODE

65. COMMONAREA_MODE

66. ELEVATORS_MODE

67. ENTRANCES_MODE

68. FLOORSMAX_MODE

69. FLOORSMIN_MODE

70. LANDAREA_MODE

71. LIVINGAPARTMENTS_MODE

72. LIVINGAREA_MODE

73. NONLIVINGAPARTMENTS_MODE

74. NONLIVINGAREA_MODE

 

75. APARTMENTS_MEDI

76. BASEMENTAREA_MEDI

77. YEARS_BEGINEXPLUATATION_MEDI

78. YEARS_BUILD_MEDI

79. COMMONAREA_MEDI

80. ELEVATORS_MEDI

81. ENTRANCES_MEDI

82. FLOORSMAX_MEDI

83. FLOORSMIN_MEDI

84. LANDAREA_MEDI

85. LIVINGAPARTMENTS_MEDI

86. LIVINGAREA_MEDI

87. NONLIVINGAPARTMENTS_MEDI

88. NONLIVINGAREA_MEDI

 

89. FONDKAPREMONT_MODE

90. HOUSETYPE_MODE

91. TOTALAREA_MODE

92. WALLSMATERIAL_MODE

93. EMERGENCYSTATE_MODE

 

94,application_{train|test}.csv,OBS_30_CNT_SOCIAL_CIRCLE,How many observation of client's social surroundings with observable 30 DPD (days past due) default,

Number of people in the client's social circle observed for a possible 30+ days past due (DPD) default

95,application_{train|test}.csv,DEF_30_CNT_SOCIAL_CIRCLE,How many observation of client's social surroundings defaulted on 30 DPD (days past due),

Number of people in the client's social circle who actually defaulted at 30+ days past due

96,application_{train|test}.csv,OBS_60_CNT_SOCIAL_CIRCLE,How many observation of client's social surroundings with observable 60 DPD (days past due) default,

Number of people in the client's social circle observed for a possible 60+ days past due (DPD) default

97,application_{train|test}.csv,DEF_60_CNT_SOCIAL_CIRCLE,How many observation of client's social surroundings defaulted on 60 (days past due) DPD,

Number of people in the client's social circle who actually defaulted at 60+ days past due

 

98,application_{train|test}.csv,DAYS_LAST_PHONE_CHANGE,How many days before application did client change phone,

How many days before the application did the client change their phone?

 

Did the client provide each document?

99,application_{train|test}.csv,FLAG_DOCUMENT_2,Did client provide document 2,
100,application_{train|test}.csv,FLAG_DOCUMENT_3,Did client provide document 3,
101,application_{train|test}.csv,FLAG_DOCUMENT_4,Did client provide document 4,
102,application_{train|test}.csv,FLAG_DOCUMENT_5,Did client provide document 5,
103,application_{train|test}.csv,FLAG_DOCUMENT_6,Did client provide document 6,
104,application_{train|test}.csv,FLAG_DOCUMENT_7,Did client provide document 7,
105,application_{train|test}.csv,FLAG_DOCUMENT_8,Did client provide document 8,
106,application_{train|test}.csv,FLAG_DOCUMENT_9,Did client provide document 9,
107,application_{train|test}.csv,FLAG_DOCUMENT_10,Did client provide document 10,
108,application_{train|test}.csv,FLAG_DOCUMENT_11,Did client provide document 11,
109,application_{train|test}.csv,FLAG_DOCUMENT_12,Did client provide document 12,
110,application_{train|test}.csv,FLAG_DOCUMENT_13,Did client provide document 13,
111,application_{train|test}.csv,FLAG_DOCUMENT_14,Did client provide document 14,
112,application_{train|test}.csv,FLAG_DOCUMENT_15,Did client provide document 15,
113,application_{train|test}.csv,FLAG_DOCUMENT_16,Did client provide document 16,
114,application_{train|test}.csv,FLAG_DOCUMENT_17,Did client provide document 17,
115,application_{train|test}.csv,FLAG_DOCUMENT_18,Did client provide document 18,
116,application_{train|test}.csv,FLAG_DOCUMENT_19,Did client provide document 19,
117,application_{train|test}.csv,FLAG_DOCUMENT_20,Did client provide document 20,
118,application_{train|test}.csv,FLAG_DOCUMENT_21,Did client provide document 21,
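
Since each FLAG_DOCUMENT_* column is a 0/1 flag, summing them per row gives the number of documents a client provided. This is a sketch on hypothetical rows; DOCUMENT_COUNT is an illustrative column name, not part of the dataset.

```python
import pandas as pd

# FLAG_DOCUMENT_2 ... FLAG_DOCUMENT_21, as listed above.
doc_cols = [f"FLAG_DOCUMENT_{i}" for i in range(2, 22)]

# Hypothetical data: three clients, all flags start at 0.
sample = pd.DataFrame(0, index=range(3), columns=doc_cols)
sample.loc[0, "FLAG_DOCUMENT_3"] = 1
sample.loc[1, ["FLAG_DOCUMENT_3", "FLAG_DOCUMENT_6"]] = 1

# Row-wise sum of the flags = number of documents provided.
sample["DOCUMENT_COUNT"] = sample[doc_cols].sum(axis=1)

print(sample["DOCUMENT_COUNT"].tolist())
```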

 

Credit Bureau enquiries before the loan application

119,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_HOUR,Number of enquiries to Credit Bureau about the client one hour before application,

Number of enquiries to the Credit Bureau (CB) about the client in the hour before the application

120,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_DAY,Number of enquiries to Credit Bureau about the client one day before application (excluding one hour before application),

Number of enquiries to the CB about the client in the day before the application (excluding the last hour)

121,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_WEEK,Number of enquiries to Credit Bureau about the client one week before application (excluding one day before application),

Number of enquiries to the CB about the client in the week before the application (excluding the last day)
122,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_MON,Number of enquiries to Credit Bureau about the client one month before application (excluding one week before application),

Number of enquiries to the CB about the client in the month before the application (excluding the last week)

123,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_QRT,Number of enquiries to Credit Bureau about the client 3 month before application (excluding one month before application),

Number of enquiries to the CB about the client in the three months before the application (excluding the last month)

124,application_{train|test}.csv,AMT_REQ_CREDIT_BUREAU_YEAR,Number of enquiries to Credit Bureau about the client one day year (excluding last 3 months before application),

Number of enquiries to the CB about the client in the year before the application (excluding the last three months)
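
Going by the descriptions above, each window excludes the shorter one before it, so the six columns should not overlap and their sum would be the total number of enquiries in the year before the application. A minimal sketch under that assumption, on hypothetical rows (AMT_REQ_CREDIT_BUREAU_TOTAL is an illustrative name):

```python
import pandas as pd

bureau_cols = [
    "AMT_REQ_CREDIT_BUREAU_HOUR",
    "AMT_REQ_CREDIT_BUREAU_DAY",
    "AMT_REQ_CREDIT_BUREAU_WEEK",
    "AMT_REQ_CREDIT_BUREAU_MON",
    "AMT_REQ_CREDIT_BUREAU_QRT",
    "AMT_REQ_CREDIT_BUREAU_YEAR",
]

# Hypothetical enquiry counts for two clients.
sample = pd.DataFrame([[0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 2, 3]], columns=bureau_cols)

# Non-overlapping windows, so the row sum is the one-year total.
sample["AMT_REQ_CREDIT_BUREAU_TOTAL"] = sample[bureau_cols].sum(axis=1)

print(sample["AMT_REQ_CREDIT_BUREAU_TOTAL"].tolist())
```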


[col for col in train.columns if train[col].dtype == 'object']

['NAME_CONTRACT_TYPE', 'CODE_GENDER', 'FLAG_OWN_CAR', 'FLAG_OWN_REALTY', 'NAME_TYPE_SUITE', 'NAME_INCOME_TYPE', 'NAME_EDUCATION_TYPE', 'NAME_FAMILY_STATUS', 'NAME_HOUSING_TYPE', 'OCCUPATION_TYPE', 'WEEKDAY_APPR_PROCESS_START', 'ORGANIZATION_TYPE', 'FONDKAPREMONT_MODE', 'HOUSETYPE_MODE', 'WALLSMATERIAL_MODE', 'EMERGENCYSTATE_MODE']

 

Columns with two categories: label-encode as [0, 1]

5,NAME_CONTRACT_TYPE

Cash loans         278232
Revolving loans     29279
 

 

6,CODE_GENDER

F      202448 
M      105059 
XNA         4  --> remove

 

7,FLAG_OWN_CAR

N    202924
Y    104587

 

8,FLAG_OWN_REALTY

Y    213312
N     94199
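
The label-encoding step for the two-category columns above could look like the sketch below. It uses pandas `factorize` on hypothetical rows; the post does not say which encoder was actually used, so treat this as one possible approach.

```python
import pandas as pd

# Hypothetical rows using the category values listed above.
sample = pd.DataFrame({
    "NAME_CONTRACT_TYPE": ["Cash loans", "Revolving loans", "Cash loans", "Cash loans"],
    "CODE_GENDER": ["F", "M", "XNA", "F"],
    "FLAG_OWN_CAR": ["N", "Y", "N", "Y"],
    "FLAG_OWN_REALTY": ["Y", "N", "Y", "Y"],
})

# Drop the few XNA gender rows so CODE_GENDER has exactly two categories.
sample = sample[sample["CODE_GENDER"] != "XNA"].reset_index(drop=True)

# Encode every two-category column as 0/1 (categories sorted alphabetically).
binary_cols = [col for col in sample.columns if sample[col].nunique() == 2]
for col in binary_cols:
    codes, uniques = pd.factorize(sample[col], sort=True)
    sample[col] = codes

print(sample)
```

With `sort=True` the mapping is alphabetical, for example F becomes 0 and M becomes 1.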

 

Columns with more than two categories: one-hot encoding

14,NAME_TYPE_SUITE

Unaccompanied      248526
Family              40149
Spouse, partner     11370
Children             3267
Other_B              1770
Other_A               866
Group of people       271

 

15,NAME_INCOME_TYPE

Working                 158774
Commercial associate     71617
Pensioner                55362
State servant            21703
Unemployed                  22
Student                     18
Businessman                 10
Maternity leave              5

 

16,NAME_EDUCATION_TYPE

Secondary / secondary special    218391
Higher education                  74863
Incomplete higher                 10277
Lower secondary                    3816
Academic degree                     164

 

17,NAME_FAMILY_STATUS

Married                 196432
Single / not married     45444
Civil marriage           29775
Separated                19770
Widow                    16088
Unknown                      2

 

18,NAME_HOUSING_TYPE

House / apartment      272868
With parents            14840
Municipal apartment     11183
Rented apartment         4881
Office apartment         2617
Co-op apartment          1122

 

31,OCCUPATION_TYPE

Laborers                 55186
Sales staff              32102
Core staff               27570
Managers                 21371
Drivers                  18603
High skill tech staff    11380
Accountants               9813
Medicine staff            8537
Security staff            6721
Cooking staff             5946
Cleaning staff            4653
Private service staff     2652
Low-skill Laborers        2093
Waiters/barmen staff      1348
Secretaries               1305
Realty agents              751
HR staff                   563
IT staff                   526

 

35,WEEKDAY_APPR_PROCESS_START

TUESDAY      53901
WEDNESDAY    51934
MONDAY       50714
THURSDAY     50591
FRIDAY       50338
SATURDAY     33852
SUNDAY       16181

 

43,ORGANIZATION_TYPE  Type of organization where the client works

Business Entity Type 3    67992
XNA                       55374
Self-employed             38412
Other                     16683
Medicine                  11193
Business Entity Type 2    10553
Government                10404
School                     8893
Trade: type 7              7831
Kindergarten               6880
Construction               6721
Business Entity Type 1     5984
Transport: type 4          5398
Trade: type 3              3492
Industry: type 9           3368
Industry: type 3           3278
Security                   3247
Housing                    2958
Industry: type 11          2704
Military                   2634
Bank                       2507
Agriculture                2454
Police                     2341
Transport: type 2          2204
Postal                     2157
Security Ministries        1974
Trade: type 2              1900
Restaurant                 1811
Services                   1575
University                 1327
Industry: type 7           1307
Transport: type 3          1187
Industry: type 1           1039
Hotel                       966
Electricity                 950
Industry: type 4            877
Trade: type 6               631
Industry: type 5            599
Insurance                   597
Telecom                     577
Emergency                   560
Industry: type 2            458
Advertising                 429
Realtor                     396
Culture                     379
Industry: type 12           369
Trade: type 1               348
Mobile                      317
Legal Services              305
Cleaning                    260
Transport: type 1           201
Industry: type 6            112
Industry: type 10           109
Religion                     85
Industry: type 13            67
Trade: type 4                64
Trade: type 5                49
Industry: type 8             24

 

89,FONDKAPREMONT_MODE  regional program for the overhaul of common property in apartment buildings

reg oper account         73830  a valid account is recorded
reg oper spec account    12080
not specified             5687  not specified
org spec account          5619

fond kapital remont

 

90,HOUSETYPE_MODE

block of flats      150503
specific housing      1499
terraced house        1212     

 

92,WALLSMATERIAL_MODE

Panel           66040
Stone, brick    64815
Block            9253
Wooden           5362
Mixed            2296
Monolithic       1779
Others           1625

 

93,EMERGENCYSTATE_MODE

No     159428 
Yes      2328
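
One-hot encoding of the multi-category columns above can be sketched with `pandas.get_dummies`. The rows are hypothetical; a missing value is included because the HOUSETYPE_MODE counts above add up to fewer rows than the other columns, which suggests that column contains nulls.

```python
import pandas as pd

# Hypothetical rows; None stands in for a missing HOUSETYPE_MODE value.
sample = pd.DataFrame({
    "NAME_HOUSING_TYPE": ["House / apartment", "With parents", "House / apartment"],
    "HOUSETYPE_MODE": ["block of flats", None, "terraced house"],
})

# One 0/1 column per category; a missing value yields all zeros for that column group.
encoded = pd.get_dummies(
    sample, columns=["NAME_HOUSING_TYPE", "HOUSETYPE_MODE"], dtype=int
)

print(encoded.columns.tolist())
```

Passing `dummy_na=True` would add an explicit indicator column for the missing values instead.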

 

 

196,previous_application.csv,CODE_REJECT_REASON,Why was the previous application rejected,

['XAP' 'HC' 'LIMIT' 'CLIENT' 'SCOFR' 'SCO' 'XNA' 'VERIF' 'SYSTEM']

Reason the previous application was rejected


197,previous_application.csv,NAME_TYPE_SUITE,Who accompanied client when applying for the previous application

[nan 'Unaccompanied' 'Spouse, partner' 'Family' 'Children' 'Other_B' 'Other_A' 'Group of people']

Person who accompanied the client when applying for the previous application

 

198,previous_application.csv,NAME_CLIENT_TYPE,Was the client old or new client when applying for the previous application,

['Repeater' 'New' 'Refreshed' 'XNA']

Whether the client was a new or an existing client when applying for the previous loan

 

199,previous_application.csv,NAME_GOODS_CATEGORY,What kind of goods did the client apply for in the previous application,

['Mobile' 'XNA' 'Consumer Electronics' 'Construction Materials' 'Auto Accessories' 'Photo / Cinema Equipment' 'Computers' 'Audio/Video' 'Medicine' 'Clothing and Accessories' 'Furniture' 'Sport and Leisure' 'Homewares' 'Gardening' 'Jewelry' 'Vehicles' 'Education' 'Medical Supplies' 'Other' 'Direct Sales' 'Office Appliances' 'Fitness' 
 'Tourism' 'Insurance' 'Additional Service' 'Weapon' 'Animals' 'House Construction']

What kind of goods did the client apply for in the previous application?


200,previous_application.csv,NAME_PORTFOLIO,"Was the previous application for CASH, POS, CAR, ",

['POS' 'Cash' 'XNA' 'Cards' 'Cars']

What was the previous application for?

 

201,previous_application.csv,NAME_PRODUCT_TYPE,Was the previous application x-sell o walk-in,

['XNA' 'x-sell' 'walk-in']

Whether the previous application was x-sell (cross-sell) or walk-in

 

202,previous_application.csv,CHANNEL_TYPE,Through which channel we acquired the client on the previous application,

['Country-wide' 'Contact center' 'Credit and cash offices' 'Stone' 'Regional / Local' 'AP+ (Cash loan)' 'Channel of corporate sales' 'Car dealer']

Channel through which the client was acquired for the previous application

 

203,previous_application.csv,SELLERPLACE_AREA,Selling area of seller place of the previous application,

[  35   -1  200 ... 2233  887 2420]

Selling area of the seller's place for the previous loan

 

204,previous_application.csv,NAME_SELLER_INDUSTRY,The industry of the seller,

['Connectivity' 'XNA' 'Consumer electronics' 'Industry' 'Clothing' 'Furniture' 'Construction' 'Jewelry' 'Auto technology' 'MLM partners' 'Tourism']

Industry of the seller

 

205,previous_application.csv,CNT_PAYMENT,Term of previous credit at application of the previous application,

Term of the previous credit at the time of the previous application

 

206,previous_application.csv,NAME_YIELD_GROUP,Grouped interest rate into small medium and high of the previous application,grouped

['middle' 'low_action' 'high' 'low_normal' 'XNA']

Interest rate of the previous application, grouped into low/middle/high

 

207,previous_application.csv,PRODUCT_COMBINATION,Detailed product combination of the previous application,

['POS mobile with interest' 'Cash X-Sell: low' 'Cash X-Sell: high' 'Cash X-Sell: middle' 'Cash Street: high' 'Cash' 
 'POS household without interest' 'POS household with interest' 'POS other with interest' 'Card X-Sell' 'POS mobile without interest' 'Card Street' 'POS industry with interest' 'Cash Street: low' 'POS industry without interest' 'Cash Street: middle' 'POS others without interest' nan]

Detailed product combination of the previous application

 

208,previous_application.csv,DAYS_FIRST_DRAWING,Relative to application date of current application when was the first disbursement of the previous application,time only relative to the application

Date of the first disbursement of the previous application, relative to the application date of the current application

['DAYS_FIRST_DRAWING'].value_counts() 

365243.0    934444 
-228.0          123 
-224.0          121 
-212.0          121 
-223.0          119

 

209,previous_application.csv,DAYS_FIRST_DUE,Relative to application date of current application when was the first due supposed to be of the previous application,time only relative to the application

Date the first payment of the previous application was due, relative to the application date of the current application

['DAYS_FIRST_DUE'].value_counts() 

 365243.0    40645 
-334.0         772 
-509.0         760 
-208.0         751 
-330.0         750


210,previous_application.csv,DAYS_LAST_DUE_1ST_VERSION,Relative to application date of current application when was the first due of the previous application,time only relative to the application

First due date of the previous application, relative to the application date of the current application

['DAYS_LAST_DUE_1ST_VERSION'].value_counts() 

365243.0    93864 
9.0           720 
8.0           706 
0.0           705 
5.0           702

 

211,previous_application.csv,DAYS_LAST_DUE,Relative to application date of current application when was the last due date of the previous application,time only relative to the application

Last due date of the previous application, relative to the application date of the current application

['DAYS_LAST_DUE'].value_counts() 

 365243.0    211221 
-245.0          658 
-188.0          650 
-239.0          642 
-167.0          638

 

212,previous_application.csv,DAYS_TERMINATION,Relative to application date of current application when was the expected termination of the previous application,time only relative to the application

Expected termination date of the previous application, relative to the application date of the current application

['DAYS_TERMINATION'].value_counts() 

 365243.0    225913 
-233.0          786 
-184.0          770 
-170.0          770 
-163.0          769
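
In each of the five DAYS_* outputs above, the most frequent value is 365243, which is roughly 1,000 years and cannot be a real offset. A reasonable guess is that it is a placeholder for "no date"; under that assumption it can be replaced with NaN before computing statistics. The rows below are hypothetical.

```python
import numpy as np
import pandas as pd

days_cols = [
    "DAYS_FIRST_DRAWING",
    "DAYS_FIRST_DUE",
    "DAYS_LAST_DUE_1ST_VERSION",
    "DAYS_LAST_DUE",
    "DAYS_TERMINATION",
]

# Hypothetical rows: one placeholder value and two real offsets per column.
sample = pd.DataFrame({col: [365243.0, -228.0, -224.0] for col in days_cols})

# Assumption: 365243 marks a missing date, so treat it as NaN.
sample[days_cols] = sample[days_cols].replace(365243.0, np.nan)

print(sample.isna().sum())
```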

 

213,previous_application.csv,NFLAG_INSURED_ON_APPROVAL,Did the client requested insurance during the previous application,

[ 0.  1. nan]

Whether the client requested insurance during the previous application


214,installments_payments.csv,SK_ID_PREV ,"ID of previous credit in Home credit related to loan in our sample. (One loan in our sample can have 0,1,2 or more previous loans in Home Credit)",hashed

ID of the previous credit in Home Credit related to a loan in our sample (one loan can have 0, 1, 2 or more previous loans)

len(inst_pay['SK_ID_PREV'].unique()) --> 997752

 

215,installments_payments.csv,SK_ID_CURR,ID of loan in our sample,hashed

ID of the loan in our sample

len(inst_pay['SK_ID_CURR'].unique()) --> 339587

 

216,installments_payments.csv,NUM_INSTALMENT_VERSION,Version of installment calendar (0 is for credit card) of previous credit. Change of installment version from month to month signifies that some parameter of payment calendar has changed,

[1. 0. 2. 4. 3. 5. 7. 8. 6. 13. 9. 21. 22. 12. 17. 18. 11. 14. 34. 33. 19. 16. 15. 10. 26. 27. 20. 25. 23. 24. 31. 32. 28. 35. 29. 30. 43. 39. 40. 36. 41. 42. 37. 38. 68. 44. 45. 46. 178. 52. 51. 53. 54. 49. 50. 58. 57. 55. 56. 48. 47. 72. 59. 73. 61. ]

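Version of the installment calendar of the previous credit (0 is for credit cards). A change of installment version from month to month means that some parameter of the payment calendar has changed.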

['NUM_INSTALMENT_VERSION'].value_counts()

1.0      8485004 
0.0      4082498 
2.0       620283 
3.0       237063 
4.0        55274

[178. 73. 72.(7) 68. 61.(8) 59. 58. ...]

   

217,installments_payments.csv,NUM_INSTALMENT_NUMBER,On which installment we observe payment,

Installment number (which installment the payment is observed on)

1      1004160 
2       985716 
3       968279 
4       943502 
5       880007 
        ...    
266          2 
273          2 
276          1 
274          1 
277          1 

 

218,installments_payments.csv,DAYS_INSTALMENT,When the installment of previous credit was supposed to be paid (relative to application date of current loan),time only relative to the application

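Due date: when the installment of the previous credit was supposed to be paid (relative to the application date of the current loan); time is only relative to the application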

219,installments_payments.csv,DAYS_ENTRY_PAYMENT,When was the installments of previous credit paid actually (relative to application date of current loan),time only relative to the application

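[-120, -180, -150, ...-2922, -2, -1]  # contains null values (2905)

Actual payment date: when the installment of the previous credit was actually paid (relative to the application date of the current loan); time is only relative to the application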

   DAYS_INSTALMENT  DAYS_ENTRY_PAYMENT
0          -1180.0             -1187.0
1          -2156.0             -2156.0
2            -63.0               -63.0
3          -2418.0             -2426.0
4          -1383.0             -1366.0

 

220,installments_payments.csv,AMT_INSTALMENT,What was the prescribed installment amount of previous credit on this installment,

Installment amount: the prescribed amount due on the previous credit for this installment

221,installments_payments.csv,AMT_PAYMENT,What the client actually paid on previous credit on this installment,

Payment amount: what the client actually paid on the previous credit for this installment  # contains null values (2905)

   AMT_INSTALMENT  AMT_PAYMENT
0        6948.360     6948.360
1        1716.525     1716.525
2       25425.000    25425.000
3       24350.130    24350.130
4        2160.585     2165.040

 

['DAYS_ENTRY_PAYMENT'], ['AMT_PAYMENT'] --> no data for customers with recent transactions; presumably the payment date has not arrived yet

(inst_pay['AMT_PAYMENT'].isnull() & inst_pay['DAYS_ENTRY_PAYMENT'].isnull()).sum() = 2905
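
Comparing the scheduled and actual columns row by row shows whether an installment was paid late or short. The sketch below rebuilds the five sample rows shown earlier; DAYS_LATE and AMT_SHORTFALL are hypothetical feature names, not columns of the dataset.

```python
import pandas as pd

# The five sample rows shown above.
inst_pay = pd.DataFrame({
    "DAYS_INSTALMENT": [-1180.0, -2156.0, -63.0, -2418.0, -1383.0],
    "DAYS_ENTRY_PAYMENT": [-1187.0, -2156.0, -63.0, -2426.0, -1366.0],
    "AMT_INSTALMENT": [6948.360, 1716.525, 25425.000, 24350.130, 2160.585],
    "AMT_PAYMENT": [6948.360, 1716.525, 25425.000, 24350.130, 2165.040],
})

# Positive = paid after the due date, negative = paid early.
inst_pay["DAYS_LATE"] = inst_pay["DAYS_ENTRY_PAYMENT"] - inst_pay["DAYS_INSTALMENT"]

# Positive = paid less than prescribed, negative = overpaid.
inst_pay["AMT_SHORTFALL"] = inst_pay["AMT_INSTALMENT"] - inst_pay["AMT_PAYMENT"]

print(inst_pay[["DAYS_LATE", "AMT_SHORTFALL"]])
```

In these rows only the last installment was paid late (17 days), and it was slightly overpaid. Rows where the payment columns are null would give NaN for both features.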