
14. GBoost 02

HicKee 2023. 3. 10. 15:36

loss : the cost function that gradient descent minimizes (log loss by default for GradientBoostingClassifier; MSE applies to the regressor)

 

learning_rate : learning rate; shrinks each tree's contribution (default 0.1)

 

n_estimators : number of weak learners (default 100)

 

subsample : the fraction of the training data sampled to fit each weak learner (default 1)

> use a value smaller than 1 if overfitting is a concern (see the sketch below)
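
For reference, these options map directly onto the GradientBoostingClassifier constructor. A minimal sketch, assuming scikit-learn >= 1.1 (where the classifier's loss is named 'log_loss'); the values are illustrative, not recommendations:

from sklearn.ensemble import GradientBoostingClassifier

# illustrative values only; the defaults are loss='log_loss',
# learning_rate=0.1, n_estimators=100, subsample=1.0
gb = GradientBoostingClassifier(
    loss='log_loss',     # cost function minimized by gradient descent
    learning_rate=0.1,   # shrinks each tree's contribution
    n_estimators=100,    # number of weak learners
    subsample=0.8        # < 1: each tree sees a random 80% of the rows
)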

 

from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
# accuracy, confusion matrix (true vs. predicted), classification report
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

dt_iris = datasets.load_iris()
X = dt_iris.data
y = dt_iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=124)

sc = StandardScaler()

X_train_ss = sc.fit_transform(X_train)
X_test_ss = sc.transform(X_test)

params_boost = {
    'n_estimators': [100, 150, 200, 250, 400],
    'learning_rate': [0.01, 0.02, 0.03, 0.05, 0.1],
    'max_depth': [3, 4, 5, 7, 10],
    'subsample': [0.9, 0.7, 0.5, 0.3, 0.2]
}
model = GradientBoostingClassifier(random_state=1234)
gboost_cv = GridSearchCV(model, param_grid=params_boost, cv=3, verbose=2)
gboost_cv.fit(X_train_ss, y_train)
print(gboost_cv.best_params_)
print(gboost_cv.best_score_)

[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.3; total time=   1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.3; total time=   1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time=   1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time=   1.2s
[CV] END learning_rate=0.1, max_depth=10, n_estimators=400, subsample=0.2; total time=   1.2s
{'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 400, 'subsample': 0.3}
1.0
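
The grid search reports a perfect 3-fold CV score here, while the held-out test accuracy below turns out lower, so it is worth looking past the single best combination. A minimal sketch, assuming the fitted gboost_cv from above, that ranks the searched candidates via cv_results_ (pandas is an extra dependency here):

import pandas as pd

# rank the searched combinations by mean cross-validation score
cv_df = pd.DataFrame(gboost_cv.cv_results_)
cols = ['params', 'mean_test_score', 'std_test_score', 'rank_test_score']
print(cv_df[cols].sort_values('rank_test_score').head())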

# model = GradientBoostingClassifier(max_depth=4, learning_rate=0.01, random_state=1234, subsample=0.8)
# model.fit(X_train_ss, y_train)
# instead of fitting by hand, reuse the refit best model from the grid search
model = gboost_cv.best_estimator_


y_pred = model.predict(X_test_ss)

print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

0.9111111111111111
[[14  0  0]
 [ 0 11  3]
 [ 0  1 16]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       0.92      0.79      0.85        14
           2       0.84      0.94      0.89        17

    accuracy                           0.91        45
   macro avg       0.92      0.91      0.91        45
weighted avg       0.91      0.91      0.91        45
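
Because the best estimator is a tree ensemble, it also exposes feature_importances_. A short sketch, assuming the fitted model and dt_iris from above, to see which iris measurements the trees relied on:

# per-feature importance of the fitted gradient boosting model
for name, imp in zip(dt_iris.feature_names, model.feature_importances_):
    print(f'{name}: {imp:.3f}')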

# best_estimator_ returns the fitted model itself, not predictions
best_model = gboost_cv.best_estimator_
best_model

GradientBoostingClassifier(learning_rate=0.01, n_estimators=400,
                           random_state=1234, subsample=0.3)
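
To reuse the tuned model later without re-running the grid search, it can be saved with joblib; the file name 'gboost_iris.joblib' is just an example:

import joblib

# persist the tuned model (file name is arbitrary)
joblib.dump(gboost_cv.best_estimator_, 'gboost_iris.joblib')
# reload later with: joblib.load('gboost_iris.joblib')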
