Why isn't label encoding working?

[퇴근후딴짓] Big Data Analysis Engineer practical exam (Task Types 1, 2, 3)

Past exams (Task Type 2): solving them with a single method 🆕 updated 2024.6

Resolved question


import pandas as pd
train = pd.read_csv("5_train.csv")
test = pd.read_csv("5_test.csv")

#EDA
train.head()
train.shape, test.shape
# train.info()
# train['price'].value_counts()
train.isnull().sum()
test.isnull().sum()
cols = train.select_dtypes(include='O').columns

print(train.shape, test.shape)

# #Label Encoding 
from sklearn.preprocessing import LabelEncoder 
for col in cols:
  le = LabelEncoder()
  train[col] = le.fit_transform(train[col])
  test[col] = le.transform(test[col])

print(train.shape, test.shape)

Hello, teacher!

I don't get any error, but the label encoding doesn't seem to be applied, and I can't figure out what the problem is.

Tags: python, machine learning, big data, pandas, Big Data Analysis Engineer

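1 answer

퇴근후딴짓

Instructor

Please copy and paste the error message as well!

I don't see any major problem in the code, so please run it once more, then print the unique values of the categorical columns in the train and test datasets and compare them.

They may differ.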
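
The comparison suggested above could be sketched roughly as follows. The two small DataFrames and the column names `color` and `size` are made-up stand-ins for the real `train` and `test`, and the helper `unseen_categories` is hypothetical, not something from the course material.

```python
import pandas as pd

# Made-up stand-ins for the real train/test files (assumption for illustration)
train = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "M", "L"]})
test = pd.DataFrame({"color": ["blue", "green"], "size": ["M", "S"]})

# In the post, cols comes from train.select_dtypes(...); listed by hand here
cols = ["color", "size"]


def unseen_categories(train, test, cols):
    """Return {column: sorted values that appear in test but not in train}."""
    result = {}
    for col in cols:
        train_values = set(train[col].dropna().unique())
        test_values = set(test[col].dropna().unique())
        diff = test_values - train_values
        if diff:
            result[col] = sorted(diff)
    return result


# Print the unique values side by side for a visual comparison
for col in cols:
    print(col, "train:", sorted(train[col].dropna().unique()))
    print(col, "test: ", sorted(test[col].dropna().unique()))

# Collect only the columns where test holds a value train never saw
unseen = unseen_categories(train, test, cols)
print(unseen)
```

If a column shows up in `unseen`, an encoder fitted on `train` alone has never seen that value, and `LabelEncoder.transform` raises a `ValueError` for unseen labels. That is one reason the exact error message, if there is one, is worth pasting.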

olive h

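Asker

I'm working through round 4 of the past exam problems you posted (multiclass classification). There is no error at all; the only thing that doesn't work is the label encoding!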

# 1. Problem definition
# Metric: f1 macro
# target: Segmentation
# Final file: result.csv (one column, pred)
# Confirm that the test data has 2154 rows
# 2. Load libraries and data
import pandas as pd
train = pd.read_csv("4_train.csv")
test = pd.read_csv("4_test.csv")

train.head()
test.head()
train['Segmentation'].value_counts()
train.shape, test.shape
# train.info()
train.isnull().sum()
test.isnull().sum()
print(train.describe(include='O'))
print(test.describe(include='O'))
cols = train.select_dtypes(include='object').columns
print(train.shape, test.shape)

#Label Encoding
from sklearn.preprocessing import LabelEncoder
for col in cols:
  le = LabelEncoder()
  train[col] = le.fit_transform(train[col])
  test[col] = le.transform(test[col])
print(train.shape, test.shape)
# train.head()

So in any case, you're saying it isn't a syntax problem, right?
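
One thing that may be worth checking in the code above: the only output before and after the loop is `print(train.shape, test.shape)`. Label encoding replaces values inside existing columns without adding or removing any, so the two shapes are expected to be identical whether or not the encoding ran. Whether that is what is happening here is only a guess. A minimal sketch of a more direct check, using a made-up DataFrame (the `city` and `age` columns are assumptions, not the exam data), looks at dtypes and values instead:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up data standing in for the exam files (assumption for illustration)
train = pd.DataFrame({"city": ["Seoul", "Busan", "Seoul"], "age": [30, 41, 25]})
test = pd.DataFrame({"city": ["Busan", "Seoul"], "age": [33, 52]})

cols = ["city"]  # the categorical columns to encode
shape_before = train.shape

for col in cols:
    le = LabelEncoder()
    train[col] = le.fit_transform(train[col])  # learn the mapping on train
    test[col] = le.transform(test[col])        # reuse the same mapping on test

# The shape is unchanged, so it cannot show whether encoding happened
print(shape_before, train.shape)

# dtypes and the first rows show the change directly
print(train.dtypes)
print(train.head())

# True when every encoded column now holds numbers
all_numeric = all(pd.api.types.is_numeric_dtype(train[c]) for c in cols)
print(all_numeric)
```

If `train.dtypes` shows integer dtypes for the columns in `cols` after the loop, the encoding was applied even though the shapes did not change.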
