[python] Splitting X and y, and splitting into train and test

지방이의 Data Science Lab

[지현] 2020. 2. 9. 15:41

When training on imbalanced data, split the dataset so that the class proportions are preserved (a stratified split).
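As a minimal sketch of what preserving the class proportions means, assume a synthetic label vector with a 90/10 imbalance. The names `X_demo` and `y_demo` and the sizes are illustrative only and are not taken from the original data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 90 samples of class 0, 10 of class 1.
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.array([0] * 90 + [1] * 10)

# stratify=y_demo asks for the same class ratio in train and test.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

train_counts = np.bincount(y_tr)  # class counts in the training part
test_counts = np.bincount(y_te)   # class counts in the test part
print("train:", train_counts, "test:", test_counts)
```

With a 20% test size on this toy data, both parts should keep the 9:1 ratio, so the minority class is not left out of the test set by chance.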

from sklearn.model_selection import (
    GroupShuffleSplit,
    StratifiedShuffleSplit,
    train_test_split,
)

# Separate the features (X) from the target (y).
X = flatten.drop('KIS_credit_&_2018', axis=1)
y = flatten['KIS_credit_&_2018']


# Method 1: train_test_split with stratify=y keeps the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y, shuffle=True
)


# Method 2: GroupShuffleSplit keeps all rows with the same 'Name' on one side.
# It splits by group and does not stratify on y.
# next() takes only the first of the n_splits candidate splits.
gss = GroupShuffleSplit(test_size=.3, n_splits=10, random_state=7)
train_inds, test_inds = next(gss.split(flatten, groups=flatten['Name']))
train = flatten.iloc[train_inds]
test = flatten.iloc[test_inds]


# Method 3 (recommended): StratifiedShuffleSplit
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

# split() yields positional indices, so select rows with .iloc rather than .loc.
for train_index, test_index in split.split(X, y):
    strat_train_set = flatten.iloc[train_index]
    strat_test_set = flatten.iloc[test_index]
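One detail worth illustrating: the `split()` method of scikit-learn splitters returns row positions, not index labels. The sketch below assumes a small hypothetical DataFrame (`df`, with made-up `feature` and `label` columns) whose index does not start at 0, to show why positional selection with `.iloc` is the safer choice.

```python
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical frame whose index labels (100..119) differ from row positions (0..19).
df = pd.DataFrame(
    {"feature": range(20), "label": [0] * 16 + [1] * 4},
    index=range(100, 120),
)

splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_pos, test_pos = next(splitter.split(df[["feature"]], df["label"]))

# Positional selection works regardless of the index labels.
train_part = df.iloc[train_pos]
test_part = df.iloc[test_pos]

# Label-based selection fails here because labels 0..19 do not exist.
try:
    df.loc[train_pos]
    loc_failed = False
except KeyError:
    loc_failed = True

print(len(train_part), len(test_part), loc_failed)
```

When the DataFrame has a default 0..n-1 index, `.loc` and `.iloc` happen to select the same rows, which is why the difference is easy to miss until rows have been filtered or the index has been changed.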
Code to check the result after splitting:

strat_test_set.groupby(['KIS_credit_&_2018'])['Name'].count()
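Raw counts per class are easier to judge when put next to the proportions in the full data. The following is a sketch under assumed data: the frame `df` and its `name` and `label` columns are hypothetical stand-ins for the real dataset.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 'label' is the imbalanced target, 'name' an identifier column.
df = pd.DataFrame({
    "name": [f"id_{i}" for i in range(200)],
    "label": ["A"] * 160 + ["B"] * 30 + ["C"] * 10,
})

train_df, test_df = train_test_split(
    df, test_size=0.2, random_state=42, stratify=df["label"]
)

# Put the class shares of the full data and both parts side by side.
shares = pd.DataFrame({
    "full": df["label"].value_counts(normalize=True),
    "train": train_df["label"].value_counts(normalize=True),
    "test": test_df["label"].value_counts(normalize=True),
})
print(shares)
```

If the split is stratified, the three columns should be nearly identical. A large gap in any row suggests the split was not stratified on that column.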

License: Attribution, NonCommercial, ShareAlike
