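loss : loss function minimized by gradient descent (for GradientBoostingClassifier the default is log loss; MSE is the default for GradientBoostingRegressor)
learning_rate : learning rate, i.e. how much each weak learner's contribution is shrunk (default 0.1)
n_estimators : number of weak learners (default 100)
subsample : fraction of the training samples drawn to fit each weak learner (default 1.0)
> if overfitting is a concern, use a value smaller than 1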
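As a minimal sketch of how these parameters map onto the constructor, the snippet below builds a `GradientBoostingClassifier` with each of them set explicitly and fits it on iris. The specific values (for example `subsample=0.8`) are illustrative assumptions, not tuned or recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)

# Values below are illustrative assumptions, not tuned settings.
clf = GradientBoostingClassifier(
    learning_rate=0.1,   # shrinkage applied to each weak learner (default 0.1)
    n_estimators=100,    # number of boosting stages (default 100)
    subsample=0.8,       # < 1.0: each stage is fit on a random 80% of the rows
    random_state=1234,
)
clf.fit(X, y)

# One row per boosting stage; iris has 3 classes, so 3 regression trees per stage.
print(clf.estimators_.shape)
```

Setting `subsample` below 1.0 is what is usually called stochastic gradient boosting: each stage sees a different random subset of rows, which tends to trade a little bias for lower variance.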
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
# accuracy, confusion matrix (true, predicted), classification report
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
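# load the iris data: 4 features, 3 classes
dt_iris = datasets.load_iris()
X = dt_iris.data
y = dt_iris.target
# hold out 30% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=124)
# fit the scaler on the training set only, then apply it to both sets
sc = StandardScaler()
X_train_ss = sc.fit_transform(X_train)
X_test_ss = sc.transform(X_test)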
params_boost = {
    'n_estimators': [100, 150, 200, 250, 400],
    'learning_rate': [0.01, 0.02, 0.03, 0.05, 0.1],
    'max_depth': [3, 4, 5, 7, 10],
    'subsample': [0.9, 0.7, 0.5, 0.3, 0.2]
}
model = GradientBoostingClassifier(random_state=1234)
# exhaustive search over every combination, scored with 3-fold cross-validation
gboost_cv = GridSearchCV(model, param_grid=params_boost, cv=3, verbose=2)
gboost_cv.fit(X_train_ss, y_train)
print(gboost_cv.best_params_)
print(gboost_cv.best_score_)
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.3; total time= 1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.3; total time= 1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time= 1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time= 1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time= 1.2s
{'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 400, 'subsample': 0.3}
1.0
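The log above is the tail of an exhaustive search, so it can help to count the fits before launching one. A rough sketch using only the standard library, with the grid copied from the code above and `cv_folds = 3` matching the `cv=3` argument:

```python
from itertools import product

# Grid copied from the search above.
grid = {
    'n_estimators': [100, 150, 200, 250, 400],
    'learning_rate': [0.01, 0.02, 0.03, 0.05, 0.1],
    'max_depth': [3, 4, 5, 7, 10],
    'subsample': [0.9, 0.7, 0.5, 0.3, 0.2],
}
cv_folds = 3

# Every combination of values is one candidate model.
n_candidates = len(list(product(*grid.values())))
# Each candidate is fit once per fold (the final refit on all training data is extra).
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)  # 625 1875
```

With the largest settings taking about 1.2 s per fit in the log, passing `n_jobs=-1` to `GridSearchCV` to spread the fits over all CPU cores is one way to shorten the run.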
# model = GradientBoostingClassifier(max_depth=4, learning_rate=0.01, random_state=1234, subsample=0.8)
# model.fit(X_train_ss, y_train)
model = gboost_cv.best_estimator_
y_pred = model.predict(X_test_ss)
print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
0.9111111111111111
[[14  0  0]
 [ 0 11  3]
 [ 0  1 16]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       0.92      0.79      0.85        14
           2       0.84      0.94      0.89        17

    accuracy                           0.91        45
   macro avg       0.92      0.91      0.91        45
weighted avg       0.91      0.91      0.91        45
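The report can be cross-checked by hand from the confusion matrix. The sketch below recomputes accuracy and the per-class precision and recall in plain Python; rows are true classes and columns are predicted classes, which is scikit-learn's convention for `confusion_matrix(y_test, y_pred)`.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class.
cm = [
    [14, 0, 0],
    [0, 11, 3],
    [0, 1, 16],
]
n_classes = len(cm)
total = sum(sum(row) for row in cm)

# Accuracy: correctly classified samples (the diagonal) over all samples.
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

precisions, recalls = [], []
for k in range(n_classes):
    predicted_k = sum(cm[i][k] for i in range(n_classes))  # column sum
    actual_k = sum(cm[k])                                   # row sum
    precisions.append(cm[k][k] / predicted_k)
    recalls.append(cm[k][k] / actual_k)

print(round(accuracy, 4))                # 0.9111
print([round(p, 2) for p in precisions]) # [1.0, 0.92, 0.84]
print([round(r, 2) for r in recalls])    # [1.0, 0.79, 0.94]
```

The four errors are all confusions between classes 1 and 2; class 0 is separated perfectly.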
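# inspect the refitted best estimator (parameters left at their defaults, such as max_depth=3, are not shown)
best_model = gboost_cv.best_estimator_
best_model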
GradientBoostingClassifier(learning_rate=0.01, n_estimators=400,
random_state=1234, subsample=0.3)
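In the search above the scaler was fit on the whole training set before cross-validation, so every validation fold had already contributed to the scaling statistics. One alternative, sketched here with a much smaller grid so it runs quickly, is to put both steps in a `Pipeline` so the scaler is refit inside each fold. The step names `scaler` and `gb` and the grid values are arbitrary choices for illustration. Tree-based boosting is largely insensitive to feature scaling, so this is unlikely to change the scores much here; the pattern matters more for scale-sensitive models.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=124
)

# Step names 'scaler' and 'gb' are arbitrary labels chosen for this sketch.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('gb', GradientBoostingClassifier(random_state=1234)),
])

# Deliberately small grid; parameters of a pipeline step are addressed
# as '<step name>__<parameter>'.
small_grid = {
    'gb__n_estimators': [100, 200],
    'gb__subsample': [0.5, 0.9],
}

search = GridSearchCV(pipe, param_grid=small_grid, cv=3)
search.fit(X_train, y_train)  # raw features: scaling happens inside each fold

print(search.best_params_)
print(search.score(X_test, y_test))  # accuracy on the held-out test set
```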