
Train Model with StackOverflow Question Dataset Using TensorFlow in Python
TensorFlow is an open-source machine learning framework developed by Google. It is used in conjunction with Python to implement algorithms, deep learning applications and much more. It is used both in research and in production.
It has optimization techniques that help in performing complicated mathematical operations quickly.
This is because it works with multi-dimensional arrays, much like NumPy does. These multi-dimensional arrays are also known as ‘tensors’. The framework supports working with deep neural networks. It is highly scalable, and comes with many popular datasets. It uses GPU computation and automates the management of resources. It comes with a multitude of machine learning libraries, and is well-supported and documented. The framework has the ability to run deep neural network models, train them, and create applications that predict relevant characteristics of the respective datasets.
The ‘tensorflow’ package can be installed on Windows using the following command −
pip install tensorflow
A tensor is the data structure used in TensorFlow. Tensors are the edges of a flow diagram known as the ‘Data flow graph’, and they carry data between its operations. Tensors are nothing but multidimensional arrays or lists. They can be identified using three main attributes −
Rank − It is the number of dimensions of the tensor, also called the order of the tensor. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, and so on for n-dimensional tensors.
Type − It is the data type associated with the elements of the tensor, for example, a 32-bit float or a 32-bit integer.
Shape − It is the number of elements along each dimension. For a two-dimensional tensor, it is the number of rows and columns together.
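The three attributes can be inspected directly on a tensor. The following is only a minimal sketch; the tensor names and values are made up for illustration and are not part of the StackOverflow example.

```python
import tensorflow as tf

# Illustrative tensors: a scalar, a vector and a matrix.
scalar = tf.constant(7)                       # rank 0
vector = tf.constant([1.0, 2.0, 3.0])         # rank 1
matrix = tf.constant([[1, 2, 3], [4, 5, 6]])  # rank 2, two rows and three columns

for name, tensor in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    rank = int(tf.rank(tensor).numpy())   # number of dimensions
    dtype = tensor.dtype.name             # data type of the elements
    shape = tensor.shape.as_list()        # number of elements along each dimension
    print(name, "rank:", rank, "type:", dtype, "shape:", shape)
```

Python integers default to a 32-bit integer type and Python floats to a 32-bit float type when passed to ‘tf.constant’ without an explicit ‘dtype’.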
We are using Google Colaboratory to run the below code. Google Colab or Colaboratory helps run Python code in the browser. It requires zero configuration and provides free access to GPUs (Graphics Processing Units). Colaboratory has been built on top of Jupyter Notebook.
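Before running the example, a quick check along the following lines can confirm which TensorFlow version is installed and whether a GPU is visible to it. This is only a sketch; the variable names are illustrative, and the printed values depend on the environment.

```python
import tensorflow as tf

# Version string of the installed TensorFlow package.
tf_version = tf.__version__
print("TensorFlow version:", tf_version)

# GPUs visible to TensorFlow; the list is empty on a CPU-only runtime.
gpus = tf.config.list_physical_devices('GPU')
print("Number of GPUs available:", len(gpus))
```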
Example
Following is the code snippet −
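import tensorflow as tf
from tensorflow.keras import layers, losses

# binary_train_ds and binary_val_ds are the binary-vectorized training and
# validation datasets. They are assumed to be defined beforehand
# (see the tutorial credited below).
print("A bag-of-words linear model is built to train the stackoverflow dataset")

# A single Dense layer with 4 units and no activation
binary_model = tf.keras.Sequential([layers.Dense(4)])
binary_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])

# Train for 10 epochs and report validation metrics after every epoch
history = binary_model.fit(
    binary_train_ds, validation_data=binary_val_ds, epochs=10)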
Code credit − https://www.tensorflow.org/tutorials/load_data/text
Output
A bag-of-words linear model is built to train the stackoverflow dataset
Epoch 1/10
188/188 [==============================] - 4s 19ms/step - loss: 1.2450 - accuracy: 0.5243 - val_loss: 0.9285 - val_accuracy: 0.7645
Epoch 2/10
188/188 [==============================] - 1s 3ms/step - loss: 0.8304 - accuracy: 0.8172 - val_loss: 0.7675 - val_accuracy: 0.7895
Epoch 3/10
188/188 [==============================] - 1s 3ms/step - loss: 0.6615 - accuracy: 0.8625 - val_loss: 0.6824 - val_accuracy: 0.8050
Epoch 4/10
188/188 [==============================] - 1s 3ms/step - loss: 0.5604 - accuracy: 0.8833 - val_loss: 0.6291 - val_accuracy: 0.8125
Epoch 5/10
188/188 [==============================] - 1s 3ms/step - loss: 0.4901 - accuracy: 0.9034 - val_loss: 0.5923 - val_accuracy: 0.8210
Epoch 6/10
188/188 [==============================] - 1s 3ms/step - loss: 0.4370 - accuracy: 0.9178 - val_loss: 0.5656 - val_accuracy: 0.8255
Epoch 7/10
188/188 [==============================] - 1s 3ms/step - loss: 0.3948 - accuracy: 0.9270 - val_loss: 0.5455 - val_accuracy: 0.8290
Epoch 8/10
188/188 [==============================] - 1s 3ms/step - loss: 0.3601 - accuracy: 0.9325 - val_loss: 0.5299 - val_accuracy: 0.8295
Epoch 9/10
188/188 [==============================] - 1s 3ms/step - loss: 0.3307 - accuracy: 0.9408 - val_loss: 0.5177 - val_accuracy: 0.8335
Epoch 10/10
188/188 [==============================] - 1s 3ms/step - loss: 0.3054 - accuracy: 0.9472 - val_loss: 0.5080 - val_accuracy: 0.8340
Explanation
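The neural network is created using the ‘Sequential’ API. It contains a single ‘Dense’ layer with 4 units and no activation function, so it outputs one raw score (logit) per class. This is why the loss is configured with ‘from_logits=True’.
For data that has been vectorized in ‘binary’ format, a bag-of-words model is trained. Since the network has only one dense layer, it is a linear model.
Over the 10 epochs, the training accuracy rises from about 0.52 to about 0.95, while the validation accuracy levels off at about 0.83.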
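To make the idea of a binary bag-of-words linear model more concrete, the following pure-Python sketch walks through the three steps by hand: turning a text into a 0/1 vector, computing one logit per class with a linear layer, and computing the sparse categorical cross-entropy from those logits. The vocabulary, weights, sample text and helper function names are all hypothetical and chosen only for illustration; the tutorial itself relies on Keras layers and learned weights instead.

```python
import math

def binary_bag_of_words(text, vocabulary):
    """Return a 0/1 vector: 1.0 if the vocabulary word occurs in the text."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocabulary]

def linear_logits(features, weights, biases):
    """One logit per class: sum_i features[i] * weights[i][c] + biases[c]."""
    return [
        sum(f * row[c] for f, row in zip(features, weights)) + biases[c]
        for c in range(len(biases))
    ]

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy computed from raw logits and an integer class label."""
    max_logit = max(logits)  # subtracted for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# Hypothetical toy vocabulary and hand-picked weights (4 words x 4 classes).
vocabulary = ["python", "java", "def", "class"]
weights = [
    [1.0, -1.0, 0.0, 0.0],   # "python"
    [-1.0, 1.0, 0.0, 0.0],   # "java"
    [0.5, 0.0, 0.0, 0.0],    # "def"
    [0.0, 0.5, 0.0, 0.0],    # "class"
]
biases = [0.0, 0.0, 0.0, 0.0]

# Word order and repetition are ignored; only presence matters.
features = binary_bag_of_words("How do I def a function in Python python", vocabulary)
logits = linear_logits(features, weights, biases)
loss = sparse_categorical_crossentropy(logits, label=0)

print("features:", features)
print("logits:", logits)
print("loss:", loss)
```

Because the vector records only whether a word is present, repeating "python" in the sample text does not change the features. Training adjusts the weights and biases so that the loss for the correct label becomes smaller.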