
Linear Regression

The document describes building a linear regression model in TensorFlow to predict company profits based on population size data. It loads and explores a dataset of 97 data points containing population and profit figures for different companies. It then sets up placeholders and variables for the input, output, weights, and bias in a linear regression model. The model is trained over 800 iterations to minimize the loss function using gradient descent optimization. The trained weights and bias are saved for future use.

Uploaded by

WONDYE DESTA

1/23/24, 11:11 AM linear_regression.ipynb - Colaboratory

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# If TensorFlow is not installed, uncomment the line below.
#pip install tensorflow

#import tensorflow as tf  # try this first; if it fails, use the next two lines instead

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/tensorflow/python/compat/v2_compat.py:108: disable_resource_variables (from tensorflow.python.ops.variable_scope) i


Instructions for updating:
non-resource variables are not supported in the long term

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

#pip install tensorflow  (or: pip3 install tensorflow)

path = '/content/drive/MyDrive/Machine-Learning/linear_data/population_profit.txt'
data = pd.read_csv(path, header=None, names=['Population','Profit'])
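The loading step can also be tried without the Drive file. The following is a minimal sketch that reads an in-memory copy of the five rows printed by data.head(); the `sample_text` string, the `sample` name, and the use of io.StringIO are assumptions made for illustration and are not part of the notebook.

```python
import io
import pandas as pd

# Hypothetical stand-in for population_profit.txt: the first five rows
# printed by data.head() in this notebook, in comma-separated form.
sample_text = """6.1101,17.5920
5.5277,9.1302
8.5186,13.6620
7.0032,11.8540
5.8598,6.8233
"""

# header=None: the file has no header row; names= supplies the column labels.
sample = pd.read_csv(io.StringIO(sample_text), header=None,
                     names=['Population', 'Profit'])
print(sample.shape)    # (rows, columns)
print(sample.dtypes)   # both columns are parsed as floats
```

Passing header=None matters here: without it, pandas would treat the first data row as column names and silently drop one sample.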

data.head()

   Population   Profit
0      6.1101  17.5920
1      5.5277   9.1302
2      8.5186  13.6620
3      7.0032  11.8540
4      5.8598   6.8233

data.tail(4)

    Population   Profit
93      5.3054  1.98690
94      8.2934  0.14454
95     13.3940  9.05510
96      5.4369  0.61705

https://colab.research.google.com/drive/19QQe4ulOcsyQS5d6irtediXca_rmUhkp?authuser=1#scrollTo=FBaM7O0YLlCI&printMode=true 1/10
data.shape
#{}, [], ()

(97, 2)

data.Population

0 6.1101
1 5.5277
2 8.5186
3 7.0032
4 5.8598
...
92 5.8707
93 5.3054
94 8.2934
95 13.3940
96 5.4369
Name: Population, Length: 97, dtype: float64

data['Profit'].head()

0 17.5920
1 9.1302
2 13.6620
3 11.8540
4 6.8233
Name: Profit, dtype: float64

data.describe()

       Population     Profit
count   97.000000  97.000000
mean     8.159800   5.839135
std      3.869884   5.510262
min      5.026900  -2.680700
25%      5.707700   1.986900
50%      6.589400   4.562300
75%      8.578100   7.046700
max     22.203000  24.147000

data.info()
#data.describe()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 97 entries, 0 to 96
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Population  97 non-null     float64
 1   Profit      97 non-null     float64
dtypes: float64(2)
memory usage: 1.6 KB

num_of_sample = data.shape[0]
num_of_attribute = data.shape[1]
print('Number of sample', num_of_sample)
print('Number of attribute', num_of_attribute)

Number of sample 97
Number of attribute 2

data_x, data_y = data['Population'], data['Profit']
data.plot(kind='scatter', x='Population', y='Profit', figsize=(8,6))

<Axes: xlabel='Population', ylabel='Profit'>

plt.plot(data_x, data_y, '*')
plt.xlabel("Population in 10K's")
plt.ylabel("Profit in 10K's")
plt.legend(['Data Point'], bbox_to_anchor=(1,1), loc=3)


<matplotlib.legend.Legend at 0x79e26e128a00>

print(data[0:10].shape, data_y[5:9].shape)

(10, 2) (4,)

print(data_x[10:15], data_y[8:13])

10 5.7107
11 14.1640
12 5.7340
13 8.4084
14 5.6407
Name: Population, dtype: float64 8 6.5987
9 3.8166
10 3.2522
11 15.5050
12 3.1551
Name: Profit, dtype: float64

# Simple linear regression, y = m*x + b: X is the population input, Y is the profit output
with tf.name_scope('inputs'):
    X = tf.placeholder(tf.float32, name="input")
    Y = tf.placeholder(tf.float32, name="output")

# Creating weight and bias variables initialized to zero
with tf.name_scope('parameters'):
    w = tf.Variable(0.0, name='Weights')
    b = tf.Variable(0.0, name='bias')

# Defining a linear regression model to predict profit output from population input
# Y_predict = X*w + b
with tf.name_scope('regression_model'):
    Y_predict = X*w + b

# Defining the loss: mean of the squared difference between target and prediction
with tf.name_scope('loss_function'):
    loss = tf.reduce_mean(tf.square(Y-Y_predict, name='loss'))
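The model and loss defined in the two cells above can be mirrored in plain NumPy, which makes it easy to inspect individual values outside a TensorFlow session. This is only a sketch: the helper names `predict` and `mse_loss` are introduced here for illustration, and the five data points are the rows printed by data.head().

```python
import numpy as np

def predict(x, w, b):
    """Linear model: the same expression as Y_predict = X*w + b."""
    return x * w + b

def mse_loss(y_true, y_pred):
    """Mean of squared differences, mirroring tf.reduce_mean(tf.square(...))."""
    return np.mean(np.square(y_true - y_pred))

# First five rows of the dataset, copied from data.head().
x = np.array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598])
y = np.array([17.5920, 9.1302, 13.6620, 11.8540, 6.8233])

# With w = b = 0 (the notebook's initial values) every prediction is 0,
# so the loss reduces to the mean of y squared.
initial_loss = mse_loss(y, predict(x, 0.0, 0.0))
print(initial_loss)
```

When a single (x, y) pair is fed, as in the training loop of this notebook, the mean is taken over one element and the loss is simply the squared error of that pair.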

#### Gradient descent optimizer for optimizing weight and bias
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = optimizer.minimize(loss)
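Because each training step in this notebook feeds a single (x, y) pair, the update that plain gradient descent applies can be written out by hand. The derivation below is a sketch; the symbols \hat{y} (prediction) and \eta (learning rate, 0.001 here) are notation introduced for this note.

```latex
\begin{aligned}
\hat{y} &= w x + b, \qquad L(w, b) = (y - \hat{y})^2 \\
\frac{\partial L}{\partial w} &= -2\,x\,(y - \hat{y}), \qquad
\frac{\partial L}{\partial b} = -2\,(y - \hat{y}) \\
w &\leftarrow w - \eta\,\frac{\partial L}{\partial w} = w + 2\,\eta\,x\,(y - \hat{y}) \\
b &\leftarrow b - \eta\,\frac{\partial L}{\partial b} = b + 2\,\eta\,(y - \hat{y})
\end{aligned}
```

The weight update is scaled by x while the bias update is not, so with populations between roughly 5 and 22 the weight moves much faster than the bias at the same learning rate.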

#### Creating summaries for TensorBoard
w_hist = tf.summary.histogram("Weights", w)
b_hist = tf.summary.histogram("Biases", b)
y_hist = tf.summary.histogram("y_predicted", Y_predict)
cost = tf.summary.scalar("loss", loss)
merged_summary = tf.summary.merge_all()

#### Saver object used to save the trained model
saver = tf.train.Saver()

#### Calling TF operations through a session
cost_history = np.empty(shape=[0], dtype=float)  # start empty so no uninitialized value is plotted
with tf.Session() as sess:
    ### Summary writer
    Summary_writter = tf.summary.FileWriter('/content/drive/MyDrive/Machine-Learning/linear_data/linear-model-test/linear_reg_summary', sess.graph)
    sess.run(tf.global_variables_initializer())
    for i in range(800):
        # One gradient step per (population, profit) pair
        for x, y in zip(data_x, data_y):
            _, loss_v, summary = sess.run([train_op, loss, merged_summary], feed_dict={X:x, Y:y})
        # Record the loss of the last sample once per epoch
        cost_history = np.append(cost_history, loss_v)
        if i % 20 == 0:
            print("the loss is : ", loss_v, " at ", i)
            Summary_writter.add_summary(summary, i)
    # Read the trained parameters and save the model
    w_value, b_value = sess.run([w, b])
    saver.save(sess, "/content/drive/MyDrive/Machine-Learning/linear_data/linear-model-test/model-final-lin-reg")

the loss is : 5.8409615 at 0
the loss is : 3.011311 at 20
the loss is : 1.8686026 at 40
the loss is : 1.3639505 at 60
the loss is : 1.1241888 at 80
the loss is : 1.0042403 at 100
the loss is : 0.94222325 at 120
the loss is : 0.9095278 at 140
the loss is : 0.89209586 at 160
the loss is : 0.88274276 at 180
the loss is : 0.8777063 at 200
the loss is : 0.87499225 at 220
the loss is : 0.8735263 at 240
the loss is : 0.87273496 at 260

the loss is : 0.8723056 at 280
the loss is : 0.8720758 at 300
the loss is : 0.8719516 at 320
the loss is : 0.87187946 at 340
the loss is : 0.87183625 at 360
the loss is : 0.8718193 at 380
the loss is : 0.8718144 at 400
the loss is : 0.87181044 at 420
the loss is : 0.87180996 at 440
the loss is : 0.87180996 at 460
the loss is : 0.87180996 at 480
the loss is : 0.87180996 at 500
the loss is : 0.87180996 at 520
the loss is : 0.87180996 at 540
the loss is : 0.87180996 at 560
the loss is : 0.87180996 at 580
the loss is : 0.87180996 at 600
the loss is : 0.87180996 at 620
the loss is : 0.87180996 at 640
the loss is : 0.87180996 at 660
the loss is : 0.87180996 at 680
the loss is : 0.87180996 at 700
the loss is : 0.87180996 at 720
the loss is : 0.87180996 at 740
the loss is : 0.87180996 at 760
the loss is : 0.87180996 at 780
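The training loop performs one gradient step per sample and prints the loss of the last sample fed. The same procedure can be sketched in plain NumPy to make each update explicit. This is an illustration under stated assumptions, not the notebook's code: `sgd_fit` is a hypothetical helper, and only the five rows shown by data.head() are used, so the fitted values will differ from the notebook's.

```python
import numpy as np

def sgd_fit(xs, ys, learning_rate=0.001, epochs=800):
    """Per-sample gradient descent for y = w*x + b with squared-error loss."""
    w, b = 0.0, 0.0                      # same zero initialization as the notebook
    last_losses = []
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - (w * x + b)      # residual before the update
            loss = error ** 2
            # Gradients of (y - (w*x + b))**2 are -2*x*error and -2*error.
            w += learning_rate * 2.0 * x * error
            b += learning_rate * 2.0 * error
        last_losses.append(loss)         # loss of the last sample in this epoch
    return w, b, np.array(last_losses)

# First five rows of the dataset, copied from data.head().
xs = np.array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598])
ys = np.array([17.5920, 9.1302, 13.6620, 11.8540, 6.8233])

w_fit, b_fit, losses = sgd_fit(xs, ys)
print(w_fit, b_fit, losses[-1])
```

Note that the recorded value is the loss on one sample, not the mean over the dataset, which is why a printed loss that stops changing says only that the parameters have settled, not how well the line fits all points.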

w_value, b_value

(0.9960518, -3.9217443)

# Number of recorded loss values (avoid shadowing the built-in len)
iters = cost_history.shape[0]
fig, ax = plt.subplots(figsize=(6,4))
ax.plot(np.arange(iters), cost_history, 'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs Training Epoch')


Text(0.5, 1.0, 'Error vs Training Epoch')

w_value

0.9960518

b_value

-3.9217443

print(data_x[0:10], data_y[0:10])

0 6.1101
1 5.5277
2 8.5186
3 7.0032
4 5.8598
5 8.3829
6 7.4764
7 8.5781
8 6.4862
9 5.0546
Name: Population, dtype: float64 0 17.5920
1 9.1302
2 13.6620
3 11.8540
4 6.8233
5 11.8860
6 4.3483
7 12.0000
8 6.5987
9 3.8166
Name: Profit, dtype: float64
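A common sanity check for a gradient-descent fit is to compare it with the closed-form least-squares line. The sketch below does this with np.polyfit on the ten rows printed above; the names `x10`, `y10`, `slope`, and `intercept` are introduced here for illustration. Because only ten of the 97 rows are used, the result is not expected to match w_value and b_value.

```python
import numpy as np

# The ten (Population, Profit) rows printed above.
x10 = np.array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598,
                8.3829, 7.4764, 8.5781, 6.4862, 5.0546])
y10 = np.array([17.5920, 9.1302, 13.6620, 11.8540, 6.8233,
                11.8860, 4.3483, 12.0000, 6.5987, 3.8166])

# A degree-1 polyfit returns [slope, intercept] of the least-squares line.
slope, intercept = np.polyfit(x10, y10, deg=1)

# Equivalent solve with an explicit design matrix [x, 1].
A = np.column_stack([x10, np.ones_like(x10)])
(slope_ne, intercept_ne), *_ = np.linalg.lstsq(A, y10, rcond=None)

print(slope, intercept)
```

Running the same two lines on the full data_x and data_y arrays would give the reference line against which the trained weight and bias can be judged.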

data_x[0:10]

0 6.1101
1 5.5277
2 8.5186
3 7.0032
4 5.8598
5 8.3829
6 7.4764
7 8.5781
8 6.4862
9 5.0546
Name: Population, dtype: float64

np.array(data_x[0:10])

array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598, 8.3829, 7.4764, 8.5781,
       6.4862, 5.0546])

print(np.array(data_x[0:10]))
print(np.array(data_y[0:10]))

[6.1101 5.5277 8.5186 7.0032 5.8598 8.3829 7.4764 8.5781 6.4862 5.0546]
[17.592 9.1302 13.662 11.854 6.8233 11.886 4.3483 12. 6.5987
3.8166]

####### Test: predict the first ten samples with the trained weight and bias
x_test = np.array(data_x[0:10])
y_test = np.array(data_y[0:10])
y_test_predicted = x_test*w_value + b_value
print('predicted value', y_test_predicted)
print('TRUE VALUE', y_test)
print(w_value, b_value)

predicted value [2.16423169 1.58413112 4.56322242 3.05380554 1.91491992 4.42805819
 3.52513724 4.6224875 2.53884676 1.11289902]
TRUE VALUE [17.592 9.1302 13.662 11.854 6.8233 11.886 4.3483 12. 6.5987
3.8166]
0.9960518 -3.9217443
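Printing predictions next to true values shows the gap but does not quantify it. A short sketch of two standard error measures for these ten samples follows; the weight, bias, and data are the values printed above, while the names `residuals`, `mse`, and `mae` are introduced here.

```python
import numpy as np

w_value, b_value = 0.9960518, -3.9217443   # trained values printed above

x_test = np.array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598,
                   8.3829, 7.4764, 8.5781, 6.4862, 5.0546])
y_test = np.array([17.5920, 9.1302, 13.6620, 11.8540, 6.8233,
                   11.8860, 4.3483, 12.0000, 6.5987, 3.8166])

y_pred = x_test * w_value + b_value
residuals = y_test - y_pred              # positive means the model underpredicts

mse = np.mean(residuals ** 2)            # mean squared error
mae = np.mean(np.abs(residuals))         # mean absolute error
print(mse, mae)
```

For these ten rows every residual is positive, so the fitted line lies below all of them. These samples were also part of the training data, so this is a check on the fit rather than a held-out evaluation.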

print(w_value, b_value)  ###### test
x_test = np.array(data_x)
y_test = np.array(data_y)
y_test_predicted = x_test*w_value + b_value
###print(y_test_predicted)

0.9960518 -3.9217443

plt.plot(x_test, y_test, "o", x_test, y_test_predicted, "-")
plt.xlabel("Population in 10K's")
plt.ylabel("Profit in 10K USD")
plt.legend(['Data', 'Linear Regression'], bbox_to_anchor = (1,1), loc = 4)
plt.show()


plt.scatter(x_test, y_test, marker="o")
plt.plot(x_test, y_test_predicted, "b")
plt.xlabel("Population in 10K's")
plt.ylabel("Profit in 10K USD")
plt.legend(['Data', 'Linear Regression'], bbox_to_anchor = (1,1), loc = 4)
plt.show()


# Rebuild the saved graph structure from the .meta file in a fresh default graph
tf.reset_default_graph()
with tf.Session() as sess:
    imp_meta = tf.train.import_meta_graph("/content/drive/MyDrive/Machine-Learning/linear_data/linear-model-test/model-final-lin-reg.meta")
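Importing the meta graph only rebuilds the graph structure; the variable values still have to be loaded from the checkpoint. The following self-contained round trip is a sketch of how that is usually done with the TF1-style API. The temporary directory, the placeholder values 0.5 and -1.0, and the tensor names `parameters/Weights:0` and `parameters/bias:0` (derived from the name scope and variable names used earlier) are assumptions for illustration, not output from this notebook.

```python
import os
import tempfile
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

# Hypothetical checkpoint location in a temporary directory
# (the notebook saves to Google Drive instead).
ckpt_prefix = os.path.join(tempfile.mkdtemp(), "model-final-lin-reg")

# Build and save a tiny graph with the same variable names as the notebook.
tf.reset_default_graph()
with tf.name_scope('parameters'):
    w = tf.Variable(0.5, name='Weights')    # placeholder value, not a trained one
    b = tf.Variable(-1.0, name='bias')      # placeholder value, not a trained one
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, ckpt_prefix)

# Restore: rebuild the graph from the .meta file, then load the values.
tf.reset_default_graph()
with tf.Session() as sess:
    imp_meta = tf.train.import_meta_graph(ckpt_prefix + ".meta")
    imp_meta.restore(sess, ckpt_prefix)
    w_restored, b_restored = sess.run(['parameters/Weights:0',
                                       'parameters/bias:0'])

print(w_restored, b_restored)
```

Fetching tensors by name avoids redefining the variables in Python, which is the main benefit of restoring from the .meta file rather than rebuilding the model code.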
