pandas: map, apply, transform, agg

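aggregate (agg) summarizes data, typically per group after a groupby
map applies a function to each input element (Series only)
apply also applies an operation (e.g. a lambda) to each element of a Series; on a DataFrame it is called on each column or row instead (available for both Series and DataFrame)
applymap works element by element and is available only on DataFrame (renamed to DataFrame.map in pandas 2.1)
transform returns a new object of the same size as the input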
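
The summary above can be sketched with a small example. The data and variable names here are made up for illustration, and the `hasattr` check is only an assumption about which pandas version is installed (element-wise `DataFrame.map` exists from pandas 2.1, `applymap` before that).

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
s = pd.Series([1, 2, 3])
df = pd.DataFrame({"key": ["a", "b", "a"], "x": [1, 2, 3], "y": [10, 20, 30]})

# Series.map: element-wise; also accepts a dict as a lookup table.
mapped = s.map(lambda v: v * 2)
looked_up = s.map({1: "one", 2: "two"})  # 3 has no entry, so it becomes NaN

# Series.apply: calls the function on each element.
s_applied = s.apply(lambda v: v + 1)

# DataFrame.apply: calls the function on each column (axis=0) or row (axis=1).
col_sums = df[["x", "y"]].apply(lambda col: col.sum())
row_sums = df[["x", "y"]].apply(lambda row: row["x"] + row["y"], axis=1)

# Element-wise on a DataFrame: applymap, or DataFrame.map on newer pandas.
numeric = df[["x", "y"]]
elementwise = numeric.map if hasattr(numeric, "map") else numeric.applymap
squared = elementwise(lambda v: v ** 2)

# groupby + agg: one row per group.
agg_result = df.groupby("key")[["x", "y"]].agg("sum")

# groupby + transform: same index and length as the input.
transformed = df.groupby("key")[["x", "y"]].transform("sum")
```

Note that `agg_result` has one row per key, while `transformed` has one row per input row.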

map and apply (on a Series) work at the element level
transform works at the column level
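
To make the element level vs. column level distinction concrete, here is a minimal sketch on made-up data. With `map` the function only ever sees one value, while with a grouped `transform` on a single column it receives that column's values for each group as a Series, so it can use something like the group mean.

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
df = pd.DataFrame({"key": ["a", "b", "a"], "x": [1.0, 2.0, 3.0]})

# Element level: the function sees one value at a time.
doubled = df["x"].map(lambda v: v * 2)

# Column level: the function sees the whole column of each group as a Series,
# so it can use statistics of that column, such as its mean.
centered = df.groupby("key")["x"].transform(lambda col: col - col.mean())
```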

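In the example below,

apply returned one result row per Label, while

transform kept the index of the input df and returned a DataFrame of the same length. (From the result alone you cannot tell which label each index belonged to.)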

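# df has a string column 'Label' and numeric columns 'Quantity' and 'Values'
label_grouping = df.groupby('Label')
# select the numeric columns so the mean is not taken over the string column 'Label'
label_grouping[['Quantity', 'Values']].apply(lambda x: x.mean())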
# output:

      Quantity   Values
Label
-----------------------
A       6.5       1.5
B       6.0       1.0
C       8.0       3.0

label_grouping.transform(lambda x: x.mean())
# note how `transform` keeps the input index labels in the output
# output:

    Quantity   Values
------------------------
V     6.5       1.5
W     6.0       1.0
X     8.0       3.0
Y     6.5       1.5
Z     8.0       3.0
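
The post does not show how df was built, so the following is only a sketch: the values in this hypothetical df were picked so that the group means match the outputs above, and the explicit column selection (`cols`) is an addition to keep the string column out of the mean. It also shows one way to get the label back next to the transform result, since the preserved index lines up with df.

```python
import pandas as pd

# Hypothetical input: the original df is not shown, so these values were
# chosen only so that the group means match the outputs above.
df = pd.DataFrame(
    {
        "Label": ["A", "B", "C", "A", "C"],
        "Quantity": [6, 6, 7, 7, 9],
        "Values": [1, 1, 2, 2, 4],
    },
    index=["V", "W", "X", "Y", "Z"],
)
cols = ["Quantity", "Values"]
label_grouping = df.groupby("Label")[cols]

per_label = label_grouping.apply(lambda x: x.mean())    # index: A, B, C
per_row = label_grouping.transform(lambda x: x.mean())  # index: V, W, X, Y, Z

# Because the index is preserved, the result aligns with df, so the label
# can be recovered by joining the two on the index.
with_label = df[["Label"]].join(per_row.add_suffix("_mean"))
print(with_label)
```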

apply is the most flexible of these methods, so

if you are not sure which one is better, try apply first and then compare the speed!
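
One way to do that comparison is with `timeit`. This is just a sketch on randomly generated data (the sizes, seed, and function names are arbitrary choices for illustration). Built-in aggregations passed by name are typically faster than a Python lambda, but it is worth measuring on your own data.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 10,000 rows spread over up to 100 labels.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "Label": rng.integers(0, 100, size=10_000),
        "Quantity": rng.random(10_000),
    }
)
grouped = df.groupby("Label")["Quantity"]


def with_apply():
    # Flexible: any Python function, called once per group.
    return grouped.apply(lambda x: x.mean())


def with_agg():
    # Built-in aggregation selected by name.
    return grouped.agg("mean")


t_apply = timeit.timeit(with_apply, number=20)
t_agg = timeit.timeit(with_agg, number=20)
print(f"apply: {t_apply:.4f}s, agg: {t_agg:.4f}s")
```

Both functions return the same per-label means, so only the timing differs.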

