[Machine Learning] Ensemble Techniques: Boosting

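• Methods such as AdaBoost assign larger weights to the important data points, that is, the samples the earlier learners got wrong.

• Methods such as Gradient Boosting feed the difference between the true values and the current predictions (the residual) back into training, then use the gradient to improve the model.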

• Naïve gradient boosting is a greedy algorithm, so it easily overfits the training dataset.

• This can be mitigated with regularization methods or by adding a penalty to the algorithm:

• Tree constraints: limit the depth of the trees or the number of trees.

• Weighted updates: adjust the learning rate, which scales how much each new tree contributes.

• Random sampling: fit each tree on a random subset of the features and samples. This variant is called stochastic gradient boosting.
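The residual-fitting loop and the three regularization knobs above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular library implements it: it assumes squared-error loss (so the negative gradient is simply the residual), 1-D inputs, and depth-1 trees (stumps). The names fit_stump, predict_stump, gradient_boost, predict, and mse are hypothetical helpers made up for this example.

```python
import random


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree on 1-D inputs by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x identical): fall back to predicting the mean.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def gradient_boost(xs, ys, n_estimators=50, learning_rate=0.1, subsample=1.0, seed=0):
    """Return (base prediction, learning rate, list of stumps)."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)  # start from the mean of the targets
    preds = [base] * len(ys)
    stumps = []
    n_sub = max(2, int(subsample * len(xs)))
    for _ in range(n_estimators):
        # With squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Random sampling: subsample < 1.0 gives stochastic gradient boosting.
        if subsample < 1.0:
            idx = rng.sample(range(len(xs)), n_sub)
        else:
            idx = list(range(len(xs)))
        stump = fit_stump([xs[i] for i in idx], [residuals[i] for i in idx])
        stumps.append(stump)
        # Weighted update: shrink each tree's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (made up for the example): y = x^2 on a small grid.
xs = [i / 10 for i in range(40)]
ys = [x ** 2 for x in xs]

baseline = (sum(ys) / len(ys), 0.1, [])  # predicts the mean only
model = gradient_boost(xs, ys, n_estimators=50, learning_rate=0.1)
stochastic = gradient_boost(xs, ys, n_estimators=50, learning_rate=0.1, subsample=0.8)

baseline_mse = mse(baseline, xs, ys)
boosted_mse = mse(model, xs, ys)
stochastic_mse = mse(stochastic, xs, ys)
print(baseline_mse, boosted_mse, stochastic_mse)
```

Each round fits a weak learner to what the current ensemble still gets wrong. The tree constraint here is the fixed depth of 1 together with n_estimators, the weighted update is learning_rate, and random sampling is subsample.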

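1) Explore the number of trees: set with the n_estimators argument of GradientBoostingClassifier (default: 100).

2) Explore the number of samples: set with the subsample argument of GradientBoostingClassifier. It is the fraction of training samples used to fit each tree, not an absolute count (default: 1.0; values below 1.0 give stochastic gradient boosting).

3) Explore the number of features: set with the max_features argument of GradientBoostingClassifier (default: None, meaning all features are considered at each split).

4) Explore the learning rate: set with the learning_rate argument of GradientBoostingClassifier (default: 0.1).

5) Explore the tree depth: set with the max_depth argument of GradientBoostingClassifier (default: 3).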
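The five explorations above could be run roughly as follows with scikit-learn. This is a sketch under assumptions rather than a tuned experiment: the synthetic dataset from make_classification, the candidate values in settings, the 3-fold cross-validation, and the helper name evaluate are all choices made for this example. Each hyperparameter is varied on its own while the others stay at their defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (an assumption for this example).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)


def evaluate(**params):
    """Mean 3-fold cross-validated accuracy for one hyperparameter setting."""
    model = GradientBoostingClassifier(random_state=0, **params)
    return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()


# Candidate values are illustrative, not recommendations.
settings = {
    "n_estimators": [10, 50, 100],     # 1) number of trees
    "subsample": [0.5, 1.0],           # 2) fraction of samples per tree
    "max_features": [3, None],         # 3) features considered per split
    "learning_rate": [0.01, 0.1, 1.0], # 4) shrinkage applied to each tree
    "max_depth": [1, 3, 5],            # 5) depth of each tree
}

results = {}
for name, values in settings.items():
    for value in values:
        results[(name, value)] = evaluate(**{name: value})
        print(f"{name}={value}: accuracy={results[(name, value)]:.3f}")
```

Varying one argument at a time shows each knob's effect in isolation. In practice the arguments interact (a smaller learning_rate usually needs more trees, for example), so a joint search is often the next step.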


홍이지