Untitled1.ipynb - Colaboratory

The document contains information about a student named Daniel Ergawanto, including their name, class, and student number. It then describes setting up a Google Drive folder for the project, installing some Python packages for logging and data processing, and loading and inspecting a water quality dataset to prepare it for an ANN model.

Name: DANIEL ERGAWANTO L | Student ID (NIM): 204308096 | Class: TKA 6B

from datetime import datetime

#### PROJECT DESCRIPTION
notebook_version = '2.0.0'
notebook_title = 'kualitas_air_ann_so' + '_' + notebook_version
prefix = datetime.utcnow().strftime("%Y%m%d_%H%M")
project_title = prefix + '_' + notebook_title

print(f'Judul Notebook: {notebook_title}')
print(f'Judul Proyek: {project_title}')
print(f'Nama  : DANIEL ERGAWANTO L')
print(f'Kelas : TKA-6B')
print(f'NPM   : 204308096')

Judul Notebook: kualitas_air_ann_so_2.0.0
Judul Proyek: 20230414_0154_kualitas_air_ann_so_2.0.0
Nama : DANIEL ERGAWANTO L
Kelas : TKA-6B
NPM : 204308096

#### Mounting Google Drive (to store the training results)
from google.colab import drive
drive.mount('/content/gdrive')
drop_path = '/content/gdrive/My Drive/Colab Notebooks/_dropbox'

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).

#### Installing a personal Python package (for logging)
#### https://github.com/taruma/umakit
!pip install umakit
from umakit.logtool import LogTool
mylog = LogTool()
mylog._reset()
#### Installing the hidrokit package (for plotting and column transformation during data preprocessing)
#### https://github.com/taruma/hidrokit
!pip install hidrokit

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


Requirement already satisfied: umakit in /usr/local/lib/python3.9/dist-packages (0.1.1)
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: hidrokit in /usr/local/lib/python3.9/dist-packages (0.4.1)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.9/dist-packages (from hidrokit) (3.7.1)
Requirement already satisfied: pandas in /usr/local/lib/python3.9/dist-packages (from hidrokit) (1.5.3)
Requirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (from hidrokit) (1.22.4)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (23.0)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (8.4.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (4.39.3)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (1.0.7)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (1.4.4)
Requirement already satisfied: importlib-resources>=3.2.0 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (5.
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.9/dist-packages (from matplotlib->hidrokit) (0.11.0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/dist-packages (from pandas->hidrokit) (2022.7.1)
Requirement already satisfied: zipp>=3.1.0 in /usr/local/lib/python3.9/dist-packages (from importlib-resources>=3.2.0->matplotlib->
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/dist-packages (from python-dateutil>=2.7->matplotlib->hidrokit)

#### Uploading the dataset
from google.colab import files
uploaded = files.upload()

Choose Files · No file chosen (the upload widget is only available in an interactive Colab session)

#### 1. Import the dataset into a pandas.DataFrame
import pandas as pd
## Import the dataset
dataset = pd.read_excel('PLOT DATA DUGA AIR 2013.xlsx', skiprows=[0])
## Name the columns
dataset.columns = ["tanggal","JAN","FEB","MAR","APR","MEI","JUN","JUL","AGS","SEP","OKT","NOV","DES"]
## Set the dataframe index to the date column
dataset = dataset.set_index('tanggal')
# Convert the index to datetime
# (note: integer index values are interpreted as nanoseconds since the Unix
# epoch, which is why the output below shows 1970-01-01 timestamps)
dataset.index = pd.to_datetime(dataset.index, format='%Y-%m-%d')
# Output
print(dataset.info())

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 29 entries, 1970-01-01 00:00:00.000000003 to 1970-01-01 00:00:00.000000031
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 JAN 29 non-null float64
1 FEB 26 non-null float64
2 MAR 29 non-null float64
3 APR 28 non-null float64
4 MEI 29 non-null float64
5 JUN 28 non-null float64
6 JUL 29 non-null float64
7 AGS 29 non-null float64
8 SEP 28 non-null float64
9 OKT 29 non-null float64
10 NOV 28 non-null float64
11 DES 29 non-null float64
dtypes: float64(12)
memory usage: 2.9 KB
None
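The 1970-01-01 index above is worth a note: when `pd.to_datetime` receives plain integers, it treats them as nanoseconds since the Unix epoch, not as day numbers, and the `format` argument does not change that. A minimal sketch of the behaviour and one possible workaround (the year and month used here are purely illustrative, not taken from the dataset):

```python
import pandas as pd

# Plain integers are interpreted as nanoseconds since the Unix epoch...
idx = pd.to_datetime(pd.Index([3, 4, 5]))
print(idx[0])  # 1970-01-01 00:00:00.000000003

# ...so assemble real dates explicitly instead (illustrative year/month)
dates = pd.to_datetime({'year': [2013, 2013, 2013],
                        'month': [1, 1, 1],
                        'day': [3, 4, 5]})
print(dates.iloc[0])
```

Assembling from a dict (or DataFrame) with `year`/`month`/`day` keys produces proper calendar dates, which would also make the later `.loc` slices more readable.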

## Dataset overview
dataset.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 29 entries, 1970-01-01 00:00:00.000000003 to 1970-01-01 00:00:00.000000031
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 JAN 29 non-null float64
1 FEB 26 non-null float64
2 MAR 29 non-null float64
3 APR 28 non-null float64
4 MEI 29 non-null float64
5 JUN 28 non-null float64
6 JUL 29 non-null float64
7 AGS 29 non-null float64
8 SEP 28 non-null float64
9 OKT 29 non-null float64
10 NOV 28 non-null float64
11 DES 29 non-null float64
dtypes: float64(12)
memory usage: 2.9 KB

from hidrokit.viz import graph
graph.subplots(dataset, ncols=1, nrows=12, figsize=(15, 15));
## Pairplot to inspect the relationship between columns
import seaborn as sns
# new_dataset is assumed to be a cleaned copy of dataset (e.g. with missing
# values handled); the cell defining it is missing from this export
g = sns.pairplot(new_dataset)
g.fig.set_size_inches(15, 15)
#### Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Split the dataset into training and test sets
training_dataset = new_dataset.loc[:"1970-01-01 00:00:00.000000020", :]
test_dataset = new_dataset.loc["1970-01-01 00:00:00.000000027":, :]
## Training set and testing set information
print("Informasi training set: {} baris, {} kolom".format(
    training_dataset.shape[0], training_dataset.shape[1])
)
print("Informasi testing set: {} baris, {} kolom".format(
    test_dataset.shape[0], test_dataset.shape[1])
)
## Display the training set
training_dataset.index = training_dataset.index.strftime("%Y-%m-%d")
training_dataset.head()

Informasi training set: 18 baris, 12 kolom
Informasi testing set: 29 baris, 12 kolom
JAN FEB MAR APR MEI JUN JUL AGS SEP OKT NOV DES

TANGGAL

1970-01-01 1.30 1.60 2.28 1.26 1.00 1.46 1.25 1.60 1.40 1.23 1.43 1.16

1970-01-01 1.46 1.45 1.60 1.28 0.93 1.55 1.20 1.33 1.35 2.21 1.40 1.55

1970-01-01 1.75 1.45 1.45 1.26 1.10 1.50 1.31 1.20 1.25 1.43 1.29 1.41

1970-01-01 1.58 1.58 1.45 1.35 1.06 1.57 1.26 1.30 1.20 1.45 1.65 1.38

1970-01-01 1.70 1.45 1.58 1.43 0.93 1.15 1.30 1.45 1.20 2.28 1.51 1.38
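The split above uses `.loc` label slicing, which (unlike positional `.iloc` slicing) includes both endpoints of the slice. A quick sketch with a toy datetime index (values here are illustrative, not from the dataset):

```python
import pandas as pd

# Label-based .loc slicing includes BOTH endpoints of the slice
df = pd.DataFrame({'v': range(5)},
                  index=pd.to_datetime(['2013-01-01', '2013-01-02', '2013-01-03',
                                        '2013-01-04', '2013-01-05']))

head = df.loc[:'2013-01-03']   # rows 01..03, endpoint included
tail = df.loc['2013-01-04':]   # rows 04..05
print(len(head), len(tail))  # 3 2
```

This endpoint inclusivity is why label-based train/test boundaries must be chosen so the two slices do not overlap.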

array_train = training_dataset.values
array_train[:1, :]

array([[1.3 , 1.6 , 2.28, 1.26, 1. , 1.46, 1.25, 1.6 , 1.4 , 1.23, 1.43,
1.16]])

#### Scaling the dataset
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range=(0,1))
array_train = sc.fit_transform(array_train)
array_train[:1, :]

array([[0.18181818, 0.6       , 1.        , 0.        , 0.13333333,
        0.42465753, 0.14285714, 0.88888889, 0.52631579, 0.        ,
        0.43589744, 0.59171598]])
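`MinMaxScaler(feature_range=(0, 1))` maps each column independently to [0, 1] via x' = (x − min) / (max − min). A hand-rolled sketch of the same transform (synthetic values, not the dataset's actual columns):

```python
import numpy as np

def minmax_scale(col):
    """Scale a 1-D array to [0, 1], mimicking MinMaxScaler(feature_range=(0, 1))."""
    cmin, cmax = col.min(), col.max()
    return (col - cmin) / (cmax - cmin)

x = np.array([1.23, 1.30, 2.28])
scaled = minmax_scale(x)
print(scaled)  # minimum maps to 0.0, maximum to 1.0
```

Because the min/max are learned per column on the training set (`fit_transform`), the same fitted scaler must be reused on the test set later (`transform` only), as the notebook does.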

# Build a new dataframe after scaling
training_dataset_scale = pd.DataFrame(
    data=array_train,
    columns=training_dataset.columns,
    index=training_dataset.index
)
# Build the timesteps (lag) table
from hidrokit.prep import timeseries
n_timesteps = 2
df_train_ts = timeseries.timestep_table(training_dataset_scale, timesteps=n_timesteps)
array_train_ts = df_train_ts.values
print("Dimensi array setelah diberi kolom timesteps: {}".format(array_train_ts.shape))

Dimensi array setelah diberi kolom timesteps: (16, 36)
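The shape (16, 36) follows from the lagging: 2 timesteps drop 2 of the 18 rows, and each of the 12 columns gains lag-1 and lag-2 copies (12 × 3 = 36). The same effect can be sketched with `pandas.shift`; this approximates what hidrokit's `timestep_table` produces (the `_tmin{k}` suffixes mirror the column names above, but this is not hidrokit's actual implementation):

```python
import pandas as pd

def timestep_table(df, timesteps=2):
    """Append lag-k copies of each column; drop the rows lost to lagging."""
    parts = {}
    for col in df.columns:
        for k in range(timesteps + 1):
            parts[f"{col}_tmin{k}"] = df[col].shift(k)
    # the first `timesteps` rows contain NaN lags, so discard them
    return pd.DataFrame(parts).iloc[timesteps:]

df = pd.DataFrame({'A': range(5), 'B': range(10, 15)})
ts = timestep_table(df, timesteps=2)
print(ts.shape)  # (3, 6): 5 - 2 rows, 2 cols * (2 + 1) lags
```

Applied to an 18 × 12 training frame with 2 timesteps, this yields exactly (16, 36).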

## Display the timestep columns as a pandas.DataFrame
df_train_ts.head()

            JAN_tmin0  JAN_tmin1  JAN_tmin2  FEB_tmin0  FEB_tmin1  FEB_tmin2  MAR_tmin0  MAR_tmin1  MAR_tmin2  APR_tmin0  ...  SEP_tmin
TANGGAL
1970-01-01   1.000000   0.472727   0.181818   0.400000   0.400000   0.600000   0.023529   0.200000   1.000000   0.000000  ...  0.526316
1970-01-01   0.690909   1.000000   0.472727   0.573333   0.400000   0.400000   0.023529   0.023529   0.200000   0.067164  ...  0.394737
1970-01-01   0.909091   0.690909   1.000000   0.400000   0.573333   0.400000   0.176471   0.023529   0.023529   0.126866  ...  0.131579
1970-01-01   0.818182   0.909091   0.690909   0.666667   0.400000   0.573333   0.023529   0.176471   0.023529   0.179104  ...  0.000000
1970-01-01   0.909091   0.818182   0.909091   0.946667   0.666667   0.400000   0.258824   0.023529   0.176471   0.037313  ...  0.000000

5 rows × 36 columns

## Split X_train and y_train for the
## single-output regression neural network case
target_col = ["DES_tmin0"]
drop_col = ["OKT_tmin0", "NOV_tmin0", "DES_tmin0"]
df_X_train = df_train_ts.drop(drop_col, axis=1)
df_y_train = df_train_ts[target_col]
X_train = df_X_train.values
y_train = df_y_train.values.flatten()
print(f"Dimensi X_train = {X_train.shape}")
print(f"Dimensi y_train = {y_train.shape}")

Dimensi X_train = (16, 33)
Dimensi y_train = (16,)

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasRegressor

def build_model(optimizer='adam', activation='sigmoid', first_layer=10,
                hidden_layers=[30], p=0, message=True):
    global idx
    model = Sequential()
    model.add(Dense(first_layer, activation=activation, input_dim=33))
    model.add(Dropout(p))
    if hidden_layers:
        for x in hidden_layers:
            model.add(Dense(x, activation=activation))
            # halve the dropout rate after the last hidden layer
            if x == hidden_layers[-1]:
                model.add(Dropout(p / 2))
            else:
                model.add(Dropout(p))
    model.add(Dense(1))
    model.compile(optimizer=optimizer, loss='mean_squared_error',
                  metrics=['mse', 'mae'])
    # countdown progress indicator driven by the global idx
    # (guarded so that a missing idx does not raise a NameError)
    if message and ('idx' in globals()):
        print(f"{idx}>", end="")
        idx -= 1
        if (idx % 10) == 0:
            print()
    return model
model = KerasRegressor(build_fn=build_model, verbose=0)

<ipython-input-64-1ca7dc9a53c2>:26: DeprecationWarning: KerasRegressor is deprecated, use Sci-Keras (https://github.com/adriangb/sc
  model = KerasRegressor(build_fn=build_model, verbose=0)
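For orientation, the size of the dense stacks searched next follows the usual rule units_in × units_out + units_out parameters per fully connected layer (Dropout adds none). A small sketch computing it for one candidate from the grid (33 inputs, first_layer=10, hidden_layers=[20], one output):

```python
def dense_param_count(layer_sizes):
    """Total trainable parameters of a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total

# 33 inputs -> Dense(10) -> Dense(20) -> Dense(1)
print(dense_param_count([33, 10, 20, 1]))  # 340 + 220 + 21 = 581
```

With only 16 training rows, even these small candidates have far more parameters than samples, which helps explain the train/test gap reported later.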
from sklearn.model_selection import GridSearchCV
param_grid = dict(epochs=[100, 150, 200],
                  batch_size=[5, 10, 20],
                  first_layer=[10, 20, 30],
                  hidden_layers=[[20], [30]],
                  activation=['sigmoid'],
                  optimizer=['adam'],
                  )
# Skip k-fold cross-validation: a single "split" that uses the
# full training set for both fitting and scoring
cv = [(slice(None), slice(None))]
# cv = 3
grid_search = GridSearchCV(estimator=model,
                  param_grid=param_grid,
                  cv=cv,
                  return_train_score=True,
                  verbose=1,
                  scoring='neg_mean_squared_error',
                  )

# idx: countdown counter for build_model's progress printout
search_steps = 1
for key, val in param_grid.items():
    search_steps *= len(val)
idx = search_steps*cv if (type(cv) is int) else search_steps
# Fitting
print(mylog.add_savepoint("START FITTING", 'fit'))
grid_search = grid_search.fit(X_train, y_train, verbose=0, validation_split=0.2)
print(mylog.add_savepoint("END FITTING", 'fit'))
print(mylog.add_duration('fit'))

[2023-04-14 01:57:17] START FITTING


Fitting 1 folds for each of 54 candidates, totalling 54 fits
54>53>52>51>
50>49>48>47>46>45>44>43>42>41>
40>39>38>37>36>35>34>33>32>31>
30>29>28>27>26>25>24>23>22>21>
20>19>18>17>16>15>14>13>12>11>
10>9>8>7>6>5>4>3>2>1>
0>[2023-04-14 02:04:20] END FITTING
0:7:2
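The 54 candidates reported above are simply the product of the grid's option counts, which is also how the countdown `idx` is initialised:

```python
import math

param_grid = dict(epochs=[100, 150, 200],
                  batch_size=[5, 10, 20],
                  first_layer=[10, 20, 30],
                  hidden_layers=[[20], [30]],
                  activation=['sigmoid'],
                  optimizer=['adam'])

# 3 * 3 * 3 * 2 * 1 * 1 candidate combinations
search_steps = math.prod(len(v) for v in param_grid.values())
print(search_steps)  # 54; with a single CV split this means 54 fits
```

With `cv = 3` instead, the total number of fits would be 54 × 3 = 162.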

# Keep the underlying Keras object in final_model
final_model = grid_search.best_estimator_.model
# Store the grid search results in a dataframe
df_cv = pd.DataFrame(grid_search.cv_results_)

import os
# Create the output directory if it does not exist
if not os.path.exists(drop_path):
  os.makedirs(drop_path)

# Save the model architecture as JSON
fmodel_json = final_model.to_json()
fmodel_j_path = drop_path + '/{}.json'.format(project_title)
with open(fmodel_j_path, 'w') as json_file:
    json_file.write(fmodel_json)
mylog.add(f'Model JSON disimpan di {fmodel_j_path}')
print('save: {}'.format(fmodel_j_path))

# Save the model weights
fmodel_w_path = drop_path + '/{}_weights.h5'.format(project_title)
final_model.save_weights(fmodel_w_path)
mylog.add(f'Model Weights disimpan di {fmodel_w_path}')
print('save: {}'.format(fmodel_w_path))

# Save the full model and the grid_search object
save_model_path = drop_path + '/' + project_title + '.h5'
final_model.save(save_model_path)
mylog.add(f'Model disimpan di {save_model_path}')
print('save: {}'.format(save_model_path))

# Save the GridSearch results
save_grid_path = drop_path + '/{}.csv'.format(project_title)
df_cv.to_csv(save_grid_path)
mylog.add(f'Tabel GridSearch disimpan di {save_grid_path}')
print('save: {}'.format(save_grid_path))

save: /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.0.0.json


save: /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.0.0_weights.h5
save: /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.0.0.h5
save: /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.0.0.csv

# load_model_path = drop_path + '/20190512_2037_kualitas_air_ann_so.h5'
# load_cvgrid_path = drop_path + '/20190512_2037_kualitas_air_ann_so.csv'
# from keras.models import load_model
# final_model = load_model(load_model_path)
# df_cv = pd.read_csv(load_cvgrid_path, index_col=[0])
# df_cv.head()

## Display the test dataset
test_dataset.head()

JAN FEB MAR APR MEI JUN JUL AGS SEP OKT NOV DES

TANGGAL

1970-01-01 00:00:00.000000003 1.30 1.60 2.28 1.26 1.00 1.46 1.25 1.60 1.40 1.23 1.43 1.16

1970-01-01 00:00:00.000000004 1.46 1.45 1.60 1.28 0.93 1.55 1.20 1.33 1.35 2.21 1.40 1.55

1970-01-01 00:00:00.000000005 1.75 1.45 1.45 1.26 1.10 1.50 1.31 1.20 1.25 1.43 1.29 1.41

1970-01-01 00:00:00.000000006 1.58 1.58 1.45 1.35 1.06 1.57 1.26 1.30 1.20 1.45 1.65 1.38

1970-01-01 00:00:00.000000007 1.70 1.45 1.58 1.43 0.93 1.15 1.30 1.45 1.20 2.28 1.51 1.38

## pandas.DataFrame to numpy.array
array_test = test_dataset.values
array_test = sc.transform(array_test)

test_dataset_scale = pd.DataFrame(
    data=array_test,
    columns=test_dataset.columns,
    index=test_dataset.index
)

## timestep table
df_test = timeseries.timestep_table(test_dataset_scale, timesteps=n_timesteps)
array_test_ts = df_test.values
df_test.head()

                               JAN_tmin0  JAN_tmin1  JAN_tmin2  FEB_tmin0  FEB_tmin1  FEB_tmin2  MAR_tmin0  MAR_tmin1  MAR_tmin2  APR_tmin0  ...
TANGGAL
1970-01-01 00:00:00.000000005   1.000000   0.472727   0.181818   0.400000   0.400000   0.600000   0.023529   0.200000   1.000000   0.000000  ...
1970-01-01 00:00:00.000000006   0.690909   1.000000   0.472727   0.573333   0.400000   0.400000   0.023529   0.023529   0.200000   0.067164  ...
1970-01-01 00:00:00.000000007   0.909091   0.690909   1.000000   0.400000   0.573333   0.400000   0.176471   0.023529   0.023529   0.126866  ...
1970-01-01 00:00:00.000000008   0.818182   0.909091   0.690909   0.666667   0.400000   0.573333   0.023529   0.176471   0.023529   0.179104  ...
1970-01-01 00:00:00.000000009   0.909091   0.818182   0.909091   0.946667   0.666667   0.400000   0.258824   0.023529   0.176471   0.037313  ...

5 rows × 36 columns

## Split X_test and y_test for the
## single-output regression neural network case
## Reviewing output_amonia

df_X_test = df_test.drop(drop_col, axis=1)
df_y_test = df_test[target_col]
X_test = df_X_test.values
y_test = df_y_test.values.flatten()
print(f"Dimensi X_test = {X_test.shape}")
print(f"Dimensi y_test = {y_test.shape}")

Dimensi X_test = (27, 33)
Dimensi y_test = (27,)

# Prediction
predict = final_model.predict(X_test)
truth = y_test

1/1 [==============================] - 0s 80ms/step

# Transfer attribute from MinMax Scaler (specific for last column (output) only)
sc_test = MinMaxScaler()
sc_test.min_, sc_test.scale_, sc_test.data_min_, sc_test.data_max_ = sc.min_[-1], sc.scale_[-1], sc.data_min_[-1], sc.data_max_[-1]
# Return to the original scale
predict_real = sc_test.inverse_transform(predict.reshape(-1,1))
truth_real = sc_test.inverse_transform(truth.reshape(-1,1))
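Transplanting the last column's attributes works because MinMaxScaler's forward and inverse transforms are per-column affine maps: scaled = x · scale_ + min_, so x = (scaled − min_) / scale_. A self-contained sketch with synthetic numbers (not the dataset's actual column range):

```python
import numpy as np

# One column's forward map: scaled = x * scale + offset
data_min, data_max = 0.93, 2.62          # synthetic column range
scale = 1.0 / (data_max - data_min)      # plays the role of MinMaxScaler's scale_
offset = -data_min * scale               # plays the role of MinMaxScaler's min_

x = np.array([0.93, 1.50, 2.62])
scaled = x * scale + offset              # forward transform, done by hand
recovered = (scaled - offset) / scale    # inverse_transform, done by hand
print(np.allclose(recovered, x))  # True
```

Copying `min_`, `scale_`, `data_min_`, and `data_max_` for the last column therefore reproduces exactly what a scaler fitted on that column alone would do.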

# As a pandas.DataFrame
diff_table = pd.DataFrame(dict(predict=predict_real.flatten(),
                          truth=truth_real.flatten(),
                         ))
diff_table['diff'] = (diff_table.predict - diff_table.truth).abs()
diff_table.T

0 1 2 3 4 5 6 7 8 9 ... 17 18

predict 1.351687 1.526927 1.292052 1.326772 1.223638 1.246197 0.559753 1.638954 1.65678 1.547413 ... 1.744475 1.789237 1.8135

truth 1.410000 1.380000 1.380000 1.350000 1.260000 1.550000 0.160000 1.600000 1.70000 1.550000 ... 1.030000 0.960000 1.0000

diff 0.058313 0.146927 0.087948 0.023228 0.036362 0.303803 0.399753 0.038954 0.04322 0.002587 ... 0.714475 0.829237 0.8135

3 rows × 27 columns

metrics_train = final_model.evaluate(X_train, y_train, verbose=0)
metrics_test = final_model.evaluate(X_test, y_test, verbose=0)
for i, metrics in enumerate(final_model.metrics_names):
    print(f"Metrics: {metrics}")
    print(f"Train: {metrics_train[i]:.5f}")
    print(f"Test: {metrics_test[i]:.5f}")
    print()

Metrics: loss
Train: 0.02342
Test: 0.05267

Metrics: mse
Train: 0.02342
Test: 0.05267

Metrics: mae
Train: 0.10121
Test: 0.16308

## Compute MSE and MAE of the test set on the original scale
from sklearn.metrics import mean_squared_error, mean_absolute_error
mse_real = mean_squared_error(truth_real, predict_real)
mae_real = mean_absolute_error(truth_real, predict_real)
print(f"MSE (Original Scale): {mse_real:.4f}")
print(f"MAE (Original Scale): {mae_real:.4f}")

MSE (Original Scale): 0.1504
MAE (Original Scale): 0.2756
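Because the inverse transform is affine, error metrics convert between the scaled and original units deterministically: MAE scales by the column range (data_max − data_min) and MSE by its square, since the additive offset cancels in every difference. A quick numeric check with synthetic values (the range here is illustrative, not the target column's actual min/max):

```python
import numpy as np

rng = 1.69                               # synthetic data_max - data_min
truth_s = np.array([0.2, 0.5, 0.9])      # scaled "observed" values
pred_s = np.array([0.25, 0.4, 0.95])     # scaled "predicted" values

mae_scaled = np.abs(truth_s - pred_s).mean()
mse_scaled = ((truth_s - pred_s) ** 2).mean()

# Converting the metrics directly matches computing them on rescaled values
mae_real = mae_scaled * rng
mse_real = mse_scaled * rng ** 2
print(np.isclose(mae_real, np.abs(truth_s * rng - pred_s * rng).mean()))  # True
```

This is why reporting both scaled and original-scale errors, as the notebook does, carries the same information up to a known constant factor.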

#### PLOT OUT_AMONIA PREDICTION AND TRUTH (OBSERVED VALUE)
plt.plot(truth_real, 'b', label='Data Asli')
plt.plot(predict_real, 'k--', label='Data Prediksi')
plt.title('Grafik Nilai Prediksi dan Observasi')
plt.legend()
plt.show()
# PLOT TRUTH vs. PREDICT
plt.scatter(y=predict_real, x=truth_real)
plt.xlabel('Truth Value')
plt.ylabel('Prediction Value')
plt.title('Plot Titik Prediksi dan Observasi')
plt.show()

# Using seaborn
sns.jointplot(x='truth', y='predict', kind='reg', data=diff_table);
# Descriptive statistics of the differences (residuals)
diff_table['diff'].describe()

count 27.000000
mean 0.275612
std 0.278106
min 0.002587
25% 0.050766
50% 0.146927
75% 0.457852
max 0.829237
Name: diff, dtype: float64

# plot histogram
diff = diff_table['diff'].values
plt.hist(diff)
plt.xlabel('Nilai Beda/Residu')
plt.ylabel('Frekuensi')
plt.title('Histogram Residu Nilai Prediksi dan Asli')
plt.show()

# Strip the [] brackets from the hidden_layers entries in df_cv
df_cv['param_hidden_layers'] = df_cv['param_hidden_layers'].apply(
    lambda x: str(x)[1:-1] if str(x)[0] == '[' else x)
# Select the columns used for interpretation
col_grid = ['param_activation', 'param_batch_size', 'param_epochs',
            'param_first_layer', 'param_hidden_layers', 'param_optimizer',
            'mean_test_score', 'rank_test_score']
df_grid = df_cv[col_grid]
print(df_grid.shape)
df_grid.head()

(54, 8)
param_activation param_batch_size param_epochs param_first_layer param_hidden_layers param_optimizer mean_test_score rank_

0 sigmoid 5 100 10 20 adam -0.046197

1 sigmoid 5 100 10 30 adam -0.032158

2 sigmoid 5 100 20 20 adam -0.033082

3 sigmoid 5 100 20 30 adam -0.040537

4 sigmoid 5 100 30 20 adam -0.028902

# Sort by mean_test_score / rank_test_score
df_grid_sorted = df_grid.sort_values('rank_test_score')
df_grid_sorted.head()

   param_activation param_batch_size param_epochs param_first_layer param_hidden_layers param_optimizer mean_test_score rank
14 sigmoid          5                200          20                20                  adam            -0.022776
15 sigmoid          5                200          20                30                  adam            -0.023117
16 sigmoid          5                200          30                20                  adam            -0.027168
23 sigmoid          10               100          30                30                  adam            -0.028620
10 sigmoid          5                150          30                20                  adam            -0.028655

# Worst grid results
df_grid_sorted.tail()

   param_activation param_batch_size param_epochs param_first_layer param_hidden_layers param_optimizer mean_test_score rank
25 sigmoid          10               150          10                30                  adam            -0.044756
48 sigmoid          20               200          10                20                  adam            -0.045086
0  sigmoid          5                100          10                20                  adam            -0.046197
37 sigmoid          20               100          10                30                  adam            -0.046315
36 sigmoid          20               100          10                20                  adam            -0.051604

# Evaluate the 20 best results
df_grid_top = df_grid_sorted.iloc[:20, :].copy()
# Count the unique values in each parameter column
for col in df_grid_top.columns[:-2]:
    print(df_grid_top[col].value_counts())

sigmoid 20
Name: param_activation, dtype: int64
5 11
10 6
20 3
Name: param_batch_size, dtype: int64
200 8
150 7
100 5
Name: param_epochs, dtype: int64
30 10
20 5
10 5
Name: param_first_layer, dtype: int64
20 10
30 10
Name: param_hidden_layers, dtype: int64
adam 20
Name: param_optimizer, dtype: int64

# Plot rank_test_score against mean_test_score
sns.set(style='ticks')
relplot = sns.relplot(x='rank_test_score', y='mean_test_score', data=df_grid,
                      kind='line')
plt.gca().set_title("Kurva antara rank_test_score dengan mean_test_score")
plt.gca().invert_xaxis()

# Log summary
print(mylog.summary())
[2023-04-14 01:57:17] START FITTING
[2023-04-14 02:04:20] END FITTING
[2023-04-14 02:04:20] Duration: 0:7:2
[2023-04-14 02:06:34] Model JSON disimpan di /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.
[2023-04-14 02:06:34] Model Weights disimpan di /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so
[2023-04-14 02:06:34] Model disimpan di /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann_so_2.0.0.h
[2023-04-14 02:06:34] Tabel GridSearch disimpan di /content/gdrive/My Drive/Colab Notebooks/_dropbox/20230414_0154_kualitas_air_ann
