Python Programming Tutorial For Machine Learning Beginners Using
Google Colab is an online platform that lets you write and run Python code through the browser. It
is particularly useful for machine learning projects because it provides free access to GPUs (Graphics
Processing Units), which can speed up the computation needed for machine learning.
2. Understanding Comments
In Python, the hash symbol # is used to start a comment in the code. A comment is a line of text
in your program that is not executed as part of the program. Its primary purpose is to annotate
the code to help programmers understand the code's functionality or intent more easily.
Comments can also be used to temporarily disable parts of your code during testing without
deleting it.
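As a small illustration of both uses described above, here is a minimal sketch. The variable name price and the discount line are made up for this example.

```python
# This whole line is a comment; Python ignores it when running the program.
price = 100  # A comment can also follow a statement on the same line.

# The next line is "commented out", so the discount is not applied:
# price = price * 0.9

print(price)  # prints 100
```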
2. Exploring Dictionaries:
● Purpose: Dictionaries hold data as key-value pairs, much as a printed dictionary pairs each
word with its definition.
● Example:
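A minimal sketch of a dictionary in use. The name glossary and its entries are illustrative assumptions, not part of any library.

```python
# A small dictionary that maps words (keys) to definitions (values).
glossary = {
    "array": "a collection of items of the same type",
    "comment": "text that Python does not execute",
}

# Look up a value by its key.
print(glossary["array"])

# Add a new key-value pair, then overwrite an existing one.
glossary["alias"] = "a short name given to an imported module"
glossary["comment"] = "a note written for human readers"

# .get() returns a fallback value instead of raising KeyError for a missing key.
print(glossary.get("tensor", "not defined yet"))
```

Keys must be unique, so assigning to an existing key replaces its value rather than adding a second entry.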
What is an Array?
An array is a data structure that stores a collection of items. In programming, arrays are used to
organize data so that a related set of values can be easily sorted or searched. Unlike Python's
built-in list type, which can store items of different data types, arrays typically require all
elements to be of the same type, making them more efficient for certain operations.
Arrays in Python can be created and manipulated using the NumPy library, which provides a
high-performance array object that is central to doing numerical computations.
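The difference between a list and an array can be seen in a short sketch like the one below. The variable names are chosen for illustration only, and the exact integer type NumPy picks depends on your platform.

```python
import numpy as np

# A Python list can hold items of different types side by side.
mixed_list = [1, "two", 3.0]

# A NumPy array stores a single element type (its "dtype").
numbers = np.array([1, 2, 3])       # all integers -> an integer dtype
promoted = np.array([1, 2.5, 3])    # an integer mixed with a float -> all floats

print(numbers.dtype)    # an integer type (the exact width is platform-dependent)
print(promoted.dtype)   # float64
print(promoted)         # every element is now stored as a float
```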
What is import?
In Python, import is a keyword that loads the code contained in another module, such as another
Python source file or an installed library. In other words, import allows you to access and use
functions, classes, and variables defined in that module.
For example:
import numpy
This line tells Python to load the NumPy library, making NumPy's functions and features
available in your script under the name numpy (for example, numpy.array()).
The as keyword in Python is used in conjunction with the import statement to create an alias
for the imported module. This lets you refer to the module with a different name, usually a
shorter one, which is handy if you need to call it frequently in your code.
For example:
import numpy as np
Here, np is an alias for numpy. This means that instead of typing numpy.array() to create a
new array, you can use the shorter np.array().
What is NumPy?
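NumPy, which stands for Numerical Python, is an open-source Python library that is widely used
in data science and scientific computing. It's best known for its powerful array object, but it also
provides mathematical functions, linear algebra routines, Fourier transforms, and random
number generation.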
This example shows how to create a NumPy array and then add a number to every element in
the array using NumPy's ability to handle vectorized operations efficiently. Such capabilities
make NumPy an invaluable tool for data processing in Python.
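A minimal sketch of the kind of operation described above. The array contents and the number 10 are arbitrary choices for illustration.

```python
import numpy as np

# Create a NumPy array from a Python list.
values = np.array([1, 2, 3, 4])

# Add 10 to every element at once; no explicit loop is needed.
shifted = values + 10

print(shifted)  # [11 12 13 14]
```

With a plain Python list, the same result would need a loop or a list comprehension, since values + 10 is not defined for lists.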
The Titanic dataset includes various passenger attributes such as age, sex, passenger class, and
whether the passenger survived the sinking of the Titanic.
1. Importing Libraries:
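import pandas as pd  # pandas provides read_csv and the DataFrame type

# Load the Titanic CSV file directly from its URL into a DataFrame
url = 'https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv'
data = pd.read_csv(url)
print(data.head())  # show the first few rows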
Here’s how the output might look, displaying key data for the first few passengers:
   PassengerId  Survived  Pclass                     Name   Sex   Age  SibSp  Parch     Ticket    Fare  Cabin  Embarked
0            1         0       3  Braund, Mr. Owen Harris  male  22.0      1      0  A/5 21171  7.2500    NaN         S
It's useful to get a quick overview of the dataset, including checking for missing values and
understanding the distribution of numerical values.
print(data.describe())
print(data.isnull().sum())
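To see what these two calls report without downloading anything, the sketch below runs them on a tiny hand-made table. The DataFrame sample and its values are hypothetical and are not taken from the real Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-dataset with a few deliberately missing values.
sample = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Cabin": [None, "C85", None, "C123"],
})

# describe() summarizes the numeric columns (count, mean, std, min, quartiles, max).
summary = sample.describe()
print(summary)

# isnull().sum() counts the missing entries in each column.
missing = sample.isnull().sum()
print(missing)
```

Here describe() skips the text column Cabin by default, and the count for Age is 3 rather than 4 because missing values are not counted.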
Visualizing data can provide insights that are not obvious from raw numbers alone.
This approach with the Titanic dataset makes it easier for beginners to grasp basic data handling
and analysis operations in Python, providing a foundation to build on for more complex
machine learning tasks.
K-means clustering is an unsupervised learning algorithm that seeks to partition a set of data
points into a specified number of clusters, K. The goal is to divide the data such that the sum of
the squared distances between the data points and the centroid (mean) of their respective
clusters is minimized.
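The quantity being minimized can be computed by hand for a given assignment of points to clusters. The sketch below does this with NumPy. The function name within_cluster_ss, the six points, and the label assignment are all assumptions made for illustration, and the code only evaluates the objective; it does not run the K-means algorithm itself.

```python
import numpy as np

# Six 2-D points and an assumed assignment of each point to cluster 0 or 1.
points = np.array([[1, 1], [2, 1], [3, 2], [6, 6], [7, 7], [8, 8]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])

def within_cluster_ss(points, labels):
    """Sum of squared distances from each point to its cluster's centroid."""
    total = 0.0
    for k in np.unique(labels):
        members = points[labels == k]       # points assigned to cluster k
        centroid = members.mean(axis=0)     # the mean of those points
        total += ((members - centroid) ** 2).sum()
    return total

objective = within_cluster_ss(points, labels)
print(objective)  # about 6.67 for this assignment
```

A worse grouping, for example moving the point (3, 2) into the second cluster, produces a larger value, which is why K-means prefers the assignment shown.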
1. Importing Libraries
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
● pandas is used for data manipulation and analysis.
● sklearn.cluster contains the K-means clustering algorithm.
● matplotlib.pyplot is used for creating visualizations to see the results.
2. Creating an Example Dataset
X = pd.DataFrame({
"x": [1, 2, 3, 6, 7, 8],
"y": [1, 1, 2, 6, 7, 8]
})
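# Ask for two clusters, then fit the model to the points in X
kmeans = KMeans(n_clusters=2)
kmeans.fit(X)
# Read the cluster label assigned to each row and store it in a new column
labels = kmeans.labels_
X['Cluster'] = labels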
● kmeans.labels_ retrieves the cluster labels for each data point in X. These
labels indicate the cluster to which each data point belongs.
● Adding these labels to the DataFrame X as a new column named Cluster
helps in tracking and visualizing which point belongs to which cluster.
5. Visualizing the Clusters
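One possible way to plot the clusters is sketched below. It rebuilds the example dataset so that it runs on its own. The n_init and random_state arguments, the color map, and the centroid markers are choices made for this sketch rather than requirements.

```python
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Rebuild the example dataset and cluster it into two groups.
X = pd.DataFrame({
    "x": [1, 2, 3, 6, 7, 8],
    "y": [1, 1, 2, 6, 7, 8],
})
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)  # fixed seed for repeatable results
X["Cluster"] = kmeans.fit_predict(X[["x", "y"]])

# Scatter plot of the points, colored by their cluster label.
fig, ax = plt.subplots()
ax.scatter(X["x"], X["y"], c=X["Cluster"], cmap="viridis")

# Mark each cluster's centroid with a red "x".
centers = kmeans.cluster_centers_
ax.scatter(centers[:, 0], centers[:, 1], marker="x", color="red", s=100, label="Centroids")

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("K-means clusters (K = 2)")
ax.legend()
plt.show()
```

Points that share a color belong to the same cluster, so the three lower-left points and the three upper-right points should appear as two separate groups.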
Thank you for following this Python Programming Tutorial for Machine Learning Beginners.
If you have any questions or feedback, please feel free to reach out!
Happy coding!