지방이의 Data Science Lab



[python] imputation

[지현] 2020. 2. 8. 16:51
 
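import numpy as np
import pandas as pd

#1. Values wrongly recorded as 0: replace them with the mean of the non-zero values
pledge = pd.read_csv('train_pledge.csv', engine='python')
non_combat = np.array(pledge['non_combat_play_time'])
non_combat_mean = non_combat[np.nonzero(non_combat)].mean()
# Fill only rows that have combat_play_time > 0 (the original condition) and a zero non-combat time
pledge['non_combat_play_time'] = np.where((pledge['combat_play_time'] > 0) & (pledge['non_combat_play_time'] == 0), non_combat_mean, pledge['non_combat_play_time'])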
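The zero-to-mean idea in #1 is easy to check on a tiny table. The sketch below uses made-up numbers rather than train_pledge.csv, and the helper name `impute_zero_with_mean` is hypothetical. It also assumes that every zero in the column is a recording error, which may not hold for the real data.

```python
import pandas as pd

# Toy data invented for illustration; these values are not from train_pledge.csv.
toy = pd.DataFrame({
    "combat_play_time": [1.0, 2.0, 0.0, 3.0],
    "non_combat_play_time": [4.0, 0.0, 0.0, 8.0],
})

def impute_zero_with_mean(df, col):
    """Return df[col] with zeros replaced by the mean of the non-zero values."""
    nonzero_mean = df.loc[df[col] != 0, col].mean()
    # Series.where keeps values where the condition is True and
    # substitutes nonzero_mean where it is False (i.e. where the value is 0).
    return df[col].where(df[col] != 0, nonzero_mean)

toy["non_combat_play_time"] = impute_zero_with_mean(toy, "non_combat_play_time")
print(toy)
```

Computing the mean over the non-zero rows only matters here: including the zeros would pull the fill value down toward 0.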
 
 
#2. Replace every NA value with -1
# `data` is assumed to be a DataFrame that has already been loaded
data = data.fillna(value=-1)
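As a quick check of the -1 fill in #2, here is a minimal sketch on a made-up table (the column names `level` and `gold` are hypothetical). It counts the missing values per column first, then fills the whole frame in one call. A sentinel like -1 is only safe when -1 cannot occur as a real value in the column.

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration.
toy = pd.DataFrame({
    "level": [10.0, np.nan, 30.0],
    "gold": [np.nan, 5.0, np.nan],
})

# Count missing values per column before filling.
n_missing = toy.isna().sum()
print(n_missing)

# fillna on the whole DataFrame fills every column at once,
# so no loop over the columns is needed.
filled = toy.fillna(-1)
print(filled)
```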
 

 
