
Machine Learning Lab File: Submitted To: Submitted by

1. The document is a machine learning lab file submitted by Vishal Rathi that describes experiments with Python libraries for machine learning. 2. The first experiment covers basic Python libraries like NumPy, Matplotlib, and Pandas. NumPy is used for scientific computing. Matplotlib produces publication-quality figures. Pandas enables data analysis and modeling in Python. 3. The second experiment covers reading data from a CSV file using Pandas and displaying the first 5 lines of the loaded data. This demonstrates how to load data from external files for analysis.

Uploaded by

Vishal Rathi

MACHINE LEARNING LAB FILE

SUBMITTED TO:
SANJAY PATHIDAR
ASSOCIATE PROFESSOR

SUBMITTED BY:
Vishal Rathi
2K17/CO/378
BATCH A5
SLOT C

EXPERIMENT 1

AIM: To study basic Python libraries


THEORY:

NumPy is the fundamental package for scientific computing with Python. It contains, among other things:

• a powerful N-dimensional array object
• sophisticated (broadcasting) functions
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number capabilities
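
The features listed above can be illustrated with a minimal sketch. The array values and variable names here are illustrative assumptions, not part of the original lab file.

```python
import numpy as np

# A 2-D array (2 rows, 3 columns); the values are illustrative only.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Broadcasting: the 1-D array `row` is applied to every row of `a`.
row = np.array([10.0, 20.0, 30.0])
shifted = a + row

# Linear algebra: solve the 2x2 system M @ x = b.
M = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(M, b)

print(a.shape)
print(shifted)
print(x)
```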

FUNCTIONS OF NUMPY LIBRARY

MATHEMATICAL FUNCTIONS
ARCSIN, ARCCOS and ARCTAN functions return the trigonometric
inverse of sin, cos, and tan of the given value. The result is an angle in radians.

NUMPY.AROUND() is a function that returns the value rounded to the desired precision.

NUMPY.FLOOR() is a function that returns the largest integer not greater than the input parameter.
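
A short sketch of the mathematical functions described above. The input values are made up for illustration.

```python
import numpy as np

# Inverse sine: inputs are sine values, outputs are angles in radians.
values = np.array([0.0, 0.5, 1.0])
angles = np.arcsin(values)
angles_deg = np.degrees(angles)  # converted to degrees for readability

# Rounding to one decimal place and taking the floor.
samples = np.array([1.234, -1.5, 2.567])
rounded = np.around(samples, decimals=1)
floored = np.floor(samples)

print(angles_deg)
print(rounded)
print(floored)
```

Note that the floor of a negative number moves away from zero, so the floor of -1.5 is -2.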

STRING FUNCTIONS
These functions belong to the NUMPY.CHAR module and operate element-wise on arrays of strings.
ADD() is a function that returns element-wise string concatenation for two arrays of str or Unicode.
MULTIPLY() is a function that returns each string repeated (concatenated with itself) the given number of times, element-wise.

CENTER() is a function that returns a copy of each string centered in a string of the specified length.

SPLIT() is a function that returns a list of the words in each string, split on the given separator (delimiter).
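
A minimal sketch of these string functions, which are found in the numpy.char module. The sample strings are illustrative assumptions.

```python
import numpy as np

first = np.array(["hello", "good"])
second = np.array([" world", " day"])

# Element-wise concatenation of two string arrays.
joined = np.char.add(first, second)

# Each string repeated three times.
repeated = np.char.multiply(np.array(["ab", "x"]), 3)

# Each string centered in a field of width 6, padded with '*'.
centered = np.char.center(np.array(["hi"]), 6, fillchar="*")

# Each string split on a comma; the result holds one list per input string.
parts = np.char.split(np.array(["a,b,c"]), sep=",")

print(joined)
print(repeated)
print(centered)
print(parts)
```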

SORTING FUNCTIONS
NUMPY.SORT() function returns a sorted copy of the input array.

NUMPY.ARGSORT() function performs an indirect sort on the input array along the given axis, using the specified
kind of sort, and returns the array of indices that would sort the data.

NUMPY.LEXSORT() function performs an indirect sort using a sequence of keys. Each key can be seen as a
column in a spreadsheet; the last key in the sequence is the primary sort key.
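
The difference between a direct sort and the two indirect sorts can be sketched as follows. The data is illustrative only.

```python
import numpy as np

scores = np.array([30, 10, 20])

# Direct sort: a sorted copy; `scores` itself is unchanged.
sorted_scores = np.sort(scores)

# Indirect sort: the indices that would put `scores` in order.
order = np.argsort(scores)

# lexsort: sort by surname first, then by first name.
# The LAST key in the tuple is the primary key.
surnames = np.array(["Hertz", "Galilei", "Hertz"])
first_names = np.array(["Heinrich", "Galileo", "Gustav"])
idx = np.lexsort((first_names, surnames))

print(sorted_scores)
print(order)
print([f"{surnames[i]}, {first_names[i]}" for i in idx])
```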
STATISTICAL FUNCTIONS
NUMPY.AMIN() and NUMPY.AMAX() functions return the minimum and the maximum from the elements in the
given array along the specified axis.

NUMPY.PTP() function returns the range (maximum-minimum) of values along an axis.

NUMPY.MEDIAN() returns the median, the value separating the higher half of a data sample from the lower half.

NUMPY.PERCENTILE() returns the percentile (or centile), a measure used in statistics indicating the value
below which a given percentage of the observations in a group falls.
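
A brief sketch of the statistical functions on a small 3x3 array. The numbers are illustrative assumptions.

```python
import numpy as np

data = np.array([[3, 7, 5],
                 [8, 4, 3],
                 [2, 4, 9]])

row_min = np.amin(data, axis=1)   # minimum of each row
col_max = np.amax(data, axis=0)   # maximum of each column
value_range = np.ptp(data)        # maximum minus minimum over all elements
middle = np.median(data)          # median over all elements
p50 = np.percentile(data, 50)     # the 50th percentile equals the median

print(row_min, col_max, value_range, middle, p50)
```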

Matplotlib is a Python 2D plotting library which produces publication quality figures in a
variety of hardcopy formats and interactive environments across platforms. Matplotlib
can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web
application servers, and four graphical user interface toolkits.

Matplotlib tries to make easy things easy and hard things possible. You can generate
plots, histograms, power spectra, bar charts, errorcharts, scatterplots, etc., with just a few
lines of code. For examples, see the sample plots and thumbnail gallery.

For simple plotting the pyplot module provides a MATLAB-like interface, particularly
when combined with IPython. For the power user, you have full control of line styles,
font properties, axes properties, etc, via an object oriented interface or via a set of
functions familiar to MATLAB users.
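
The two interfaces mentioned above can be compared in a minimal sketch. The data, labels, and the choice of the non-interactive "Agg" backend are assumptions made so the example runs without a display; in an interactive session you would normally call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# pyplot (MATLAB-like) style: functions act on the current figure.
plt.figure()
plt.plot(x, y, linestyle="--")
plt.title("pyplot interface")

# Object-oriented style: methods are called on explicit Figure and Axes objects.
fig, ax = plt.subplots()
(line,) = ax.plot(x, y, linestyle="--", linewidth=2)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("object-oriented interface")
```

The object-oriented style tends to be easier to manage once a script holds several figures or subplots, because each call names the object it changes.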

FUNCTIONS OF MATPLOTLIB LIBRARY


Matplotlib comes with a wide variety of plots. Plots help us to understand trends and patterns and
to make correlations. They are typically instruments for reasoning about quantitative
information. Some of the sample plots are covered here.
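• LINE PLOT
# importing matplotlib module
from matplotlib import pyplot as plt

# sample data (illustrative values, not from the original)
x, y = [1, 2, 3, 4], [1, 4, 9, 16]

# Function to plot
plt.plot(x, y)

# function to show the plot
plt.show()

• BAR PLOT
# importing matplotlib module
from matplotlib import pyplot as plt

# sample data (illustrative values, not from the original)
x, y = [1, 2, 3, 4], [1, 4, 9, 16]

# Function to plot the bar
plt.bar(x, y)

# function to show the plot
plt.show()

• HISTOGRAM
# importing matplotlib module
from matplotlib import pyplot as plt

# sample data (illustrative values, not from the original)
y = [1, 4, 9, 16]

# Function to plot histogram
plt.hist(y)

# Function to show the plot
plt.show()

• SCATTER PLOT
# importing matplotlib module
from matplotlib import pyplot as plt

# sample data (illustrative values, not from the original)
x, y = [1, 2, 3, 4], [1, 4, 9, 16]

# Function to plot scatter
plt.scatter(x, y)

# function to show the plot
plt.show()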


3. Pandas:
Python has long been great for data munging and preparation, but less so for data analysis and
modeling. pandas helps fill this gap, enabling you to carry out your entire data analysis workflow in
Python without having to switch to a more domain specific language like R.

pandas does not implement significant modeling functionality outside of linear and panel
regression; for this, look to statsmodels and scikit-learn. More work is still needed to make Python a
first class statistical modeling environment, but we are well on our way toward that goal.

FUNCTIONS OF PANDAS LIBRARY


# imports needed by the snippets below
import numpy as np
import pandas as pd

1. Index
dataflair_index = pd.date_range('1/1/2000', periods=8)

2. Series
dataflair_s1 = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])

3. DataFrame
dataflair_df1 = pd.DataFrame(np.random.randn(8, 3),
                             index=dataflair_index, columns=['A', 'B', 'C'])

4. Panel
# Note: pd.Panel was removed in pandas 0.25, so this snippet only runs on older versions.
dataflair_wp1 = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'],
                         major_axis=pd.date_range('1/1/2000', periods=5),
                         minor_axis=['A', 'B', 'C', 'D'])
EXPERIMENT 2

AIM: To study reading data from a CSV file using pandas


THEORY:
Tabular data is often stored in CSV (comma-separated values) files. This is a text format
intended for the presentation of tabular data. Each line of the file is one row of the table.
The values of individual columns are separated by a separator symbol: a comma (,), a
semicolon (;) or another symbol. CSV can be easily read and processed by Python.
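
As a small illustration of the separator symbol, the sketch below reads semicolon-separated text with pandas. The sample text, column names, and use of io.StringIO in place of a real file are assumptions for the example.

```python
import io
import pandas as pd

# Two data rows under a header line, with ';' as the separator (illustrative data).
text = "name;score\nAsha;81\nRavi;74\n"

# `sep` tells read_csv which symbol separates the columns.
table = pd.read_csv(io.StringIO(text), sep=";")

print(table)
```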

LIBRARY USED: pandas

PROCEDURE:
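With Python's built-in csv module, you read a CSV file by calling the reader function to generate a reader object.

The reader takes each row of the file and makes a list of all its columns.
Then, you choose the column you want the variable data from.

With pandas, the read_csv function instead loads the whole file into a DataFrame in a single call, as in the steps that follow.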
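
The reader-based procedure corresponds to Python's standard csv module. A minimal sketch follows, using in-memory text instead of a file; the sample data and variable names are assumptions for illustration.

```python
import csv
import io

# Illustrative CSV text standing in for a file on disk.
text = "name,score\nAsha,81\nRavi,74\n"

# The reader object yields each row as a list of column values.
reader = csv.reader(io.StringIO(text))
header = next(reader)   # first row holds the column names
rows = list(reader)     # remaining rows hold the data

# Choose the column of interest by its position in the header.
score_index = header.index("score")
scores = [int(row[score_index]) for row in rows]

print(header)
print(scores)
```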

# Load the Pandas libraries with alias 'pd'
import pandas as pd

# Read data from file 'filename.csv'
# (in the same directory that your python process is based)
# Control delimiters, rows, column names with read_csv (see later)
data = pd.read_csv("filename.csv")

# Preview the first 5 lines of the loaded data
data.head()

Function    Description
read_csv    Read a comma-separated values (csv) file into DataFrame. Also supports optionally iterating or breaking of the file into chunks.
head        Preview the first 5 lines of the loaded data.
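
The chunked reading mentioned in the table can be sketched with the chunksize parameter of read_csv. The in-memory data and the chunk size of 4 are assumptions chosen to keep the example small.

```python
import io
import pandas as pd

# Ten rows of illustrative data under a single 'value' column.
text = "value\n" + "\n".join(str(i) for i in range(10)) + "\n"

total = 0
chunk_sizes = []

# With chunksize set, read_csv returns an iterator of DataFrames
# instead of loading the whole file at once.
for chunk in pd.read_csv(io.StringIO(text), chunksize=4):
    chunk_sizes.append(len(chunk))
    total += int(chunk["value"].sum())

print(chunk_sizes)
print(total)
```

Reading in chunks is mainly useful when a file is too large to hold in memory as one DataFrame.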
CONCLUSION:
We read a csv file and stored it in the variable, ‘data’. We successfully displayed the first five
lines of our dataset.
INDEX
Sr. No. TOPIC DATE SIGN
