I am doing this to learn machine learning.

I compared Gradient Boosted Regression, XGBoost, Lasso, Ridge, ElasticNetCV, Support Vector Regression, and LightGBM.

After calculating the mean squared error for each algorithm, I wanted to plot the training error to compare their performance.

However, the plot does not match the numbers I calculated.

For example, in the **image** below, the mean squared error for Lasso is at about **0.006**. But when I calculate it with the code below, the result is **0.0082**.

**Algorithm**

```
lsr = Lasso(alpha=0.00047)
```

**Mean Squared Error calculation**

```
-cross_val_score(lsr, train_dummies, y, scoring="neg_mean_squared_error").mean()
```
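As a sanity check, here is a minimal, self-contained sketch of the same calculation on synthetic data (assumption: `make_regression` stands in for my real `train_dummies`/`y`, so the numbers will differ):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for train_dummies / y.
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

lsr = Lasso(alpha=0.00047)

# cross_val_score returns one *validation* score per fold (default cv=5);
# negating turns "neg_mean_squared_error" back into a positive MSE.
scores = -cross_val_score(lsr, X, y, scoring="neg_mean_squared_error")
print(scores.shape)   # one score per fold
print(scores.mean())  # cross-validated MSE on the held-out folds
```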

Here are the rest of the algorithms that I ran:

```
svr = make_pipeline(RobustScaler(), SVR(C=20, epsilon=0.008, gamma=0.0003))
gbr = GradientBoostingRegressor(max_depth=4, n_estimators=150)
xgbr = XGBRegressor(max_depth=5, n_estimators=400)
rr = Ridge(alpha=13)
lgbm = LGBMRegressor(objective='regression',
                     num_leaves=4,
                     learning_rate=0.01,
                     n_estimators=5000,
                     max_bin=200,
                     bagging_fraction=0.75,
                     bagging_freq=5,
                     bagging_seed=7,
                     feature_fraction=0.2,
                     feature_fraction_seed=7,
                     verbose=-1,
                     )
```

**These are for ElasticNet**

```
e_alphas = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005, 0.0006, 0.0007]
e_l1ratio = [0.8, 0.85, 0.9, 0.95, 0.99, 1]
en = make_pipeline(RobustScaler(), ElasticNetCV(max_iter=10**7, alphas=e_alphas, cv=5, l1_ratio=e_l1ratio))
```
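To make sure the grid is actually being used, I also fit this sketch on synthetic data (assumption: `make_regression` stands in for my real data) and inspected which hyper-parameters `ElasticNetCV` selected:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for train_dummies / y.
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

e_alphas = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005, 0.0006, 0.0007]
e_l1ratio = [0.8, 0.85, 0.9, 0.95, 0.99, 1]
en = make_pipeline(RobustScaler(),
                   ElasticNetCV(max_iter=10**7, alphas=e_alphas, cv=5,
                                l1_ratio=e_l1ratio))
en.fit(X, y)

# The fitted step is reachable under its lowercased class name.
enet = en.named_steps["elasticnetcv"]
print(enet.alpha_, enet.l1_ratio_)  # the hyper-parameters chosen by CV
```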

Here is the code for the **learning curve**

```
pl_mo = {'GBR': GradientBoostingRegressor(max_depth=4, n_estimators=150),
         'XGB': XGBRegressor(max_depth=5, n_estimators=400),
         'Lasso': Lasso(alpha=0.00047),
         'Ridge': Ridge(alpha=13),
         'ENet': make_pipeline(RobustScaler(), ElasticNetCV(max_iter=10**7, alphas=e_alphas, l1_ratio=e_l1ratio)),
         'SVR': make_pipeline(RobustScaler(), SVR(C=20, epsilon=0.008, gamma=0.0003)),
         'LGBM': LGBMRegressor(objective='regression',
                               num_leaves=4,
                               learning_rate=0.01,
                               n_estimators=5000,
                               max_bin=200,
                               bagging_fraction=0.75,
                               bagging_freq=5,
                               bagging_seed=7,
                               feature_fraction=0.2,
                               feature_fraction_seed=7,
                               verbose=-1,
                               )
         }
plt.figure(figsize=(10, 7))
for k, v in pl_mo.items():
    (train_sizes,
     train_scores,
     test_scores) = learning_curve(v,
                                   train_dummies,
                                   y,
                                   cv=5,
                                   scoring='neg_mean_squared_error')
    train_scores = -train_scores
    train_mean = np.mean(train_scores, axis=1)
    plt.plot(train_sizes, train_mean, label=k)
plt.title("Training Error")
plt.xlabel("Training Set Size")
plt.ylabel("Mean Squared Error")
plt.legend()
plt.show()
```
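To see how the two numbers relate, I also ran this sketch on synthetic data (assumption: `make_regression` stands in for my real data). If I read the docs right, `learning_curve` returns both training and validation scores, and it is the *validation* score at the largest training size that should line up with `cross_val_score`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score, learning_curve

# Synthetic stand-in for train_dummies / y.
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)
lsr = Lasso(alpha=0.00047)

# learning_curve gives scores on the data the model was fit on (train_scores)
# and on the held-out folds (test_scores), at increasing training sizes.
train_sizes, train_scores, test_scores = learning_curve(
    lsr, X, y, cv=5, scoring="neg_mean_squared_error")

train_mse = -train_scores.mean(axis=1)  # training error (what my plot shows)
valid_mse = -test_scores.mean(axis=1)   # validation error

cv_mse = -cross_val_score(lsr, X, y, scoring="neg_mean_squared_error").mean()

# With the same cv splitter, the validation MSE at the largest training size
# reproduces the cross_val_score number; the training MSE is a different curve.
print(valid_mse[-1], cv_mse)
```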

Here is the **plot result image**.

If anyone could point me in the right direction, I would be eternally grateful.