Question about Recursive Feature Elimination

Question

Hello. I am enjoying the lectures.

In the hands-on lecture "<New> Recursive Feature Elimination and SelectFromModel", there is the following code:

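from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# X, y: the feature matrix and labels prepared earlier in the lecture.
svc = SVC(kernel="linear")
# Train and evaluate with RFECV while repeatedly eliminating features.
rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2),
              scoring='accuracy', verbose=2)
rfecv.fit(X, y)
print("Optimal number of features : %d" % rfecv.n_features_)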

I do not fully understand the verbose output printed when I run this code, so I would like to ask about it.

Fitting estimator with 25 features.
Fitting estimator with 24 features.
Fitting estimator with 23 features.
Fitting estimator with 22 features.
...
Fitting estimator with 4 features.
Fitting estimator with 3 features.
Fitting estimator with 2 features.
Fitting estimator with 25 features.
Fitting estimator with 24 features.
Fitting estimator with 23 features.
Fitting estimator with 22 features.
...
Fitting estimator with 4 features.
Fitting estimator with 3 features.
Fitting estimator with 2 features.
Fitting estimator with 25 features.
Fitting estimator with 24 features.
Fitting estimator with 23 features.
Fitting estimator with 22 features.
...
Fitting estimator with 7 features.
Fitting estimator with 6 features.
Fitting estimator with 5 features.
Fitting estimator with 4 features.
Optimal number of features : 3

Since cv=2, the code appears to run twice over 25 down to 2 features,

but why does it then run once more from 25 down to 4 (optimal number of features + 1)?

Additionally,

plt.ylabel("Cross validation score (nb of correct classifications)")

Is nb in "nb of correct classifications" here short for number?

That is all. Thank you.

권 철민 · Answer

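Hello,

As you point out, the elimination loop does appear to run cv + 1 times. I am not certain of the exact reason, but my guess is that after one pass per CV fold, RFECV internally performs feature selection once more at the end. That would be consistent with your log: the last pass stops after fitting with 4 features, which leaves the 3 features reported as optimal.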
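
The cv + 1 pattern can be checked with a small experiment. The sketch below is not the lecture notebook: it assumes a synthetic dataset from make_classification with 25 features as a stand-in for the lecture data, and it relies on RFECV's default min_features_to_select=1. As far as I can tell from scikit-learn's behavior, each fold eliminates down to 1 feature (logging 25 through 2), and a final pass on the full data stops once the selected number of features remains.

```python
import contextlib
import io

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the lecture data (assumption): 25 features, 3 informative.
X, y = make_classification(n_samples=200, n_features=25, n_informative=3,
                           n_redundant=2, random_state=0)

n_splits = 2
rfecv = RFECV(estimator=SVC(kernel="linear"), step=1,
              cv=StratifiedKFold(n_splits), scoring="accuracy", verbose=2)

# Capture the verbose output so the "Fitting estimator ..." lines can be counted.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    rfecv.fit(X, y)
fit_lines = [line for line in buffer.getvalue().splitlines()
             if line.startswith("Fitting estimator with")]

n_features = X.shape[1]
# Each CV fold eliminates down to 1 feature, so it logs 25, 24, ..., 2.
per_fold = n_features - 1
# The final pass on all of X logs 25, 24, ..., n_features_ + 1.
final_pass = n_features - rfecv.n_features_

print("selected features:", rfecv.n_features_)
print("logged fits:", len(fit_lines),
      "expected:", n_splits * per_fold + final_pass)
```

With the numbers from the question (25 features, cv=2, 3 selected), the same arithmetic gives 2 * 24 + 22 = 70 logged fits, which matches the three passes in the log.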
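Yes, nb stands for number. However, since scoring='accuracy' is used, the plotted values are best read as accuracy (the fraction of correct classifications) rather than a raw count.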
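
To make the difference between a fraction and a count concrete, here is a minimal sketch using scikit-learn's accuracy_score. The labels and predictions are made-up values for a single hypothetical validation fold, not data from the lecture.

```python
from sklearn.metrics import accuracy_score

# Hypothetical labels and predictions for one validation fold (made-up values).
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

# Default behavior: fraction of correct predictions, as with scoring="accuracy".
fraction_correct = accuracy_score(y_true, y_pred)
# normalize=False returns the number of correct predictions instead.
count_correct = accuracy_score(y_true, y_pred, normalize=False)

print("fraction correct:", fraction_correct)
print("number correct:", count_correct)
```

Here 6 of the 8 predictions are correct, so the fraction is 0.75. The y-axis label in the plot suggests a count, but the values RFECV records under scoring='accuracy' are of the first kind.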

Thank you.

Inflearn Community Q&A
