[python] one row to multiple rows

지방이의 Data Science Lab

[지현] 2020. 2. 7. 15:18

* To remove items from a list, use del. (To add items, use append.)

* Use this technique when you want to tidy up a bulky dataset and keep it in a lighter form.
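As a minimal sketch of the first point, here is del and append applied to a made-up list of column names (the names are hypothetical and not taken from the original data):

```python
# Hypothetical column names, for illustration only
cols = ["Company", "2014", "2015", "2016"]

id_cols = cols[0:1]   # copy the identifier column name(s) first
del cols[0:1]         # then remove them in place from the original list

cols.append("2017")   # append adds one item to the end of the list

print(id_cols)  # ['Company']
print(cols)     # ['2014', '2015', '2016', '2017']
```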

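In other words, the goal is to take data shaped like this:

Company	2014_credit_rating	2015_credit_rating	2016_credit_rating
Samsung	AAA	AAA	AAA
		...

and reshape it into this:

Company	Year	Rating
Samsung	2014	AAA
Samsung	2015	AAA
Samsung	2016	AAA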
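For comparison, the same wide-to-long reshape can be sketched with pandas' built-in melt. This is an alternative to the ravel-based approach used in this post, not the author's method, and the toy DataFrame and its column names are assumptions made for illustration.

```python
import pandas as pd

# Toy wide-format data (hypothetical values and column names)
wide = pd.DataFrame({
    "Company": ["Samsung"],
    "2014_rating": ["AAA"],
    "2015_rating": ["AAA"],
    "2016_rating": ["AAA"],
})

# melt stacks the rating columns into one (Year, Rating) pair per row
long_df = wide.melt(id_vars=["Company"], var_name="Year", value_name="Rating")

# Keep only the 4-digit year from names such as '2014_rating'
long_df["Year"] = long_df["Year"].str[:4].astype(int)

# Order rows by company, then year
long_df = long_df.sort_values(["Company", "Year"]).reset_index(drop=True)
print(long_df)
```

melt handles the repetition of the identifier columns on its own, so it may be the simpler choice when the yearly columns can be told apart by name.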

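The key step is ravel('C'), which gathers the ratings spread across the yearly columns into a single column. Note that 'C' stands for C-style (row-major) order, not "column": the values are read row by row, so each company's ratings come out consecutively in year order.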
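A small sketch of what ravel's order argument does, on a made-up array: 'C' reads the values row by row (C-style, row-major), while 'F' reads them column by column. The array contents are hypothetical.

```python
import numpy as np

# Two "companies" (rows) with ratings for three "years" (columns)
ratings = np.array([["AAA", "AA", "A"],
                    ["BBB", "BB", "B"]])

row_major = ratings.ravel("C")     # row by row
column_major = ratings.ravel("F")  # column by column

print(row_major.tolist())     # ['AAA', 'AA', 'A', 'BBB', 'BB', 'B']
print(column_major.tolist())  # ['AAA', 'BBB', 'AA', 'BB', 'A', 'B']
```

With 'C', all of one company's ratings stay together, which is why the identifier rows can simply be repeated in place to line up with them.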

The company and year columns are then repeated to match the increased number of rows.
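The repetition step can be sketched on toy data as follows. The company names and years are hypothetical, and np.tile is used for the year labels here, whereas the post's own code uses itertools.cycle to the same effect.

```python
import numpy as np
import pandas as pd

# Hypothetical identifier column and year labels
ids = pd.DataFrame({"Company": ["Samsung", "LG"]})
years = [2014, 2015, 2016]

# Repeat each identifier row once per year: Samsung x3, then LG x3
long_ids = ids.loc[ids.index.repeat(len(years))].reset_index(drop=True)

# Tile the year labels so they restart for every company
long_ids["Year"] = np.tile(years, len(ids))

print(long_ids)
```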

colnames holds the names of the columns to be stacked into a single column, and company holds the remaining column names.

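import pandas as pd
from itertools import cycle

# Read the CP949-encoded CSV, skipping its first 6 lines
y = pd.read_csv("../data/y.csv", encoding='cp949', skiprows=6)
colnames = list(y.columns)
company = colnames[0:3]   # the first three columns identify the company
del colnames[0:3]         # colnames now holds only the yearly rating columns

# Flatten the rating columns row by row ('C' = row-major order)
credit = pd.Series(y
          .set_index(company)[colnames]
          .values.ravel('C'))
credit = pd.DataFrame({'KIS_credit': credit})
# Repeat each identifier row 5 times, once per year
company = y.loc[y.index.repeat(5)].reset_index(drop=True)[['KIS', 'Stock', 'Name']]

y = pd.concat([company, credit], axis=1)

# Label the rows with the years, restarting for every company
year = cycle([2014, 2015, 2016, 2017, 2018])
y['year'] = [next(year) for count in range(y.shape[0])]

# Keep rows with at least 5 non-null values, i.e. no missing field
y = y.dropna(thresh=5)
# Keep only companies that still have more than 4 rows (all 5 years)
y = y.groupby(['Name']).filter(lambda x: x.shape[0] > 4)
y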
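Since y.csv is not included with the post, here is a self-contained sketch of the same steps on an in-memory DataFrame. The data, the two-year range, and the names id_cols, rating_cols, and out are assumptions made for illustration; only the overall technique (ravel in 'C' order, repeat the identifiers, then filter) follows the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for y.csv: 3 identifier columns + 2 rating columns
y = pd.DataFrame({
    "KIS": [1, 2],
    "Stock": ["A001", "A002"],
    "Name": ["Alpha", "Beta"],
    "2014": ["AAA", "AA"],
    "2015": ["AAA", None],   # Beta has no 2015 rating
})
id_cols = ["KIS", "Stock", "Name"]
rating_cols = ["2014", "2015"]
years = [2014, 2015]

# Flatten the ratings row by row, and repeat the identifiers to match
credit = y[rating_cols].to_numpy().ravel("C")
out = y.loc[y.index.repeat(len(rating_cols)), id_cols].reset_index(drop=True)
out["KIS_credit"] = credit
out["year"] = np.tile(years, len(y))

# Drop rows with a missing rating, then keep only companies rated in every year
out = out.dropna(subset=["KIS_credit"])
out = out.groupby("Name").filter(lambda g: len(g) == len(years))
print(out)
```

Deriving the repeat count from len(rating_cols) instead of hard-coding 5 keeps the identifiers and ratings aligned if the number of yearly columns changes.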
 

License: Attribution, NonCommercial, ShareAlike
