NUMPY
Mathematical Operation
Faster than a Python list and consumes less memory
Create an Array
1. To know the dimension of an array (ar.ndim)
2. To set the minimum number of dimensions yourself (ndmin)
3. To create an array of zeros
a. ar = np.zeros(4)
b. ar_1 = np.zeros((3,3))
4. To create an array of ones
a. ar = np.ones(3)
b. ar_1 = np.ones((3,2))
5. To create an empty (uninitialized) array; its values are arbitrary until assigned
a. ar = np.empty(4)
6. To create an array filled with a specified value
a. ar = np.full((3,3) , 5)
7. To create an array over a particular range (the stop value is excluded)
a. ar = np.arange(5)
b. ar_1 = np.arange(4,10)
8. To create an array with a specified number of equally spaced values
a. ar = np.linspace(0,10,5)
9. To create a diagonal-element array or identity array (eye function)
a. ar = np.eye(3)
b. ar_1 = np.eye(3,5)
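A minimal sketch of the creation functions listed above. The variable names and sample values are only illustrative.

```python
import numpy as np

ar = np.array([1, 2, 3], ndmin=2)   # force at least 2 dimensions
print(ar.ndim, ar.shape)

zeros = np.zeros((3, 3))            # 3x3 array of 0.0
ones = np.ones((3, 2))              # 3x2 array of 1.0
full = np.full((3, 3), 5)           # every element is 5
rng = np.arange(4, 10)              # 4, 5, ..., 9 (10 is excluded)
lin = np.linspace(0, 10, 5)         # 5 equally spaced values, both ends included
eye = np.eye(3, 5)                  # 3x5 array with ones on the main diagonal
print(rng, lin, eye, sep="\n")
```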
Arithmetic Operation
1. Add = ar+3 , ar1+ar2 , np.add(ar1,ar2)
2. Subtract = ar-3 , ar1-ar2 , np.subtract(ar1,ar2)
3. Multiply = ar*3 , ar1*ar2 , np.multiply(ar1,ar2)
4. Divide = ar/2 , ar1/ar2 , np.divide(ar1,ar2)
5. Mod = ar%2 , ar1%ar2 , np.mod(ar1,ar2)
6. Power = ar**2 , ar1**ar2 , np.power(ar1,ar2)
7. Reciprocal = np.reciprocal(ar) (takes one array and returns 1/ar; use a float array)
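The operators and the np functions above give the same element-wise results. A small sketch with made-up sample arrays:

```python
import numpy as np

ar1 = np.array([1.0, 2.0, 4.0])
ar2 = np.array([2.0, 2.0, 2.0])

added = np.add(ar1, ar2)          # same as ar1 + ar2
product = np.multiply(ar1, ar2)   # same as ar1 * ar2 (element by element)
remainder = np.mod(ar1, ar2)      # same as ar1 % ar2
powered = np.power(ar1, ar2)      # same as ar1 ** ar2
recip = np.reciprocal(ar1)        # 1 / ar1, element by element
print(added, product, remainder, powered, recip)
```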
Arithmetic Function
1. Min - np.min(ar) , np.min(ar ,axis = 1) , np.min(ar ,axis = 0)
2. Max - np.max(ar) , np.max(ar ,axis = 1) , np.max(ar ,axis = 0)
3. Argmin/argmax - gives the position (index) of the Min and Max - np.argmin(ar) / np.argmax(ar)
4. Sqrt - np.sqrt(ar)
5. Sin - np.sin(ar)
6. Cos - np.cos(ar)
7. Cumsum - np.cumsum(ar) / np.cumsum(ar ,axis = 0) / np.cumsum(ar ,axis = 1)
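A short sketch of how the axis argument changes these functions on a 2D array (axis = 0 works down the columns, axis = 1 works along the rows). The sample array is made up.

```python
import numpy as np

ar = np.array([[3, 1, 2],
               [6, 5, 4]])

overall_min = np.min(ar)            # smallest value in the whole array
col_min = np.min(ar, axis=0)        # one minimum per column
row_max = np.max(ar, axis=1)        # one maximum per row
pos_max = np.argmax(ar)             # position in the flattened array
running = np.cumsum(ar, axis=1)     # running total along each row
print(overall_min, col_min, row_max, pos_max, running, sep="\n")
```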
Broadcasting
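1. Rule: 1) Shapes are compared dimension by dimension, starting from the last; two dimensions are compatible when they are the same or when one of them is 1
2) 1x3 and 3x1 = possible (each mismatched dimension has a 1), result is 3x3
3) 1x3 and 1x4 = not possible (3 and 4 differ and neither is 1), so the arithmetic operation fails
4) 2x1 and 2x3 = possible, result is 2x3
2. Arithmetic operations follow the broadcasting rule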
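The shape examples above can be checked directly. This is only a sketch with made-up values; NumPy raises ValueError when two shapes cannot be broadcast.

```python
import numpy as np

row = np.array([[1, 2, 3]])            # shape (1, 3)
col = np.array([[10], [20], [30]])     # shape (3, 1)
grid = row + col                       # both are stretched to (3, 3)

two_by_one = np.array([[1], [2]])      # shape (2, 1)
stretched = two_by_one + np.ones((2, 3))   # result has shape (2, 3)

try:
    np.ones((1, 3)) + np.ones((1, 4))  # 3 and 4 differ, neither is 1
    compatible = True
except ValueError:
    compatible = False
print(grid, stretched, compatible, sep="\n")
```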
Indexing and Slicing
Indexing : print(ar[0]) / print(ar[0,3]) /print(ar[1,-1]) / print(ar[0,1,2])
Iteration
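Print elements one by one, e.g. for x in np.nditer(ar): print(x)
ar.copy() - new array with its own data; later changes to the original do not affect it
ar.view() - new array object that shares data with the original; changes show up in both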
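A small sketch of iteration and of the difference between copy and view. Sample values are made up.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])
elements = [int(x) for x in np.nditer(grid)]   # visits every element, one by one

ar = np.array([1, 2, 3])
copied = ar.copy()    # owns its data
viewed = ar.view()    # shares data with ar
ar[0] = 99            # change the original afterwards

print(elements)
print(copied)         # the copy keeps the old value
print(viewed)         # the view shows the new value
```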
Split:
np.array_split(ar,2) – 1D
np.array_split(ar,3,axis = 1) – 2D
Boolean indexing (keeps the elements where the mask is True):
ar_1 = [True,False,True,False,True]
print(ar[ar_1])
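A sketch of splitting and boolean indexing with made-up data. np.array_split allows pieces of unequal size, so 5 elements split in 2 gives pieces of 3 and 2.

```python
import numpy as np

ar = np.array([10, 20, 30, 40, 50])
parts = np.array_split(ar, 2)                  # 1D: pieces of size 3 and 2

grid = np.arange(12).reshape(2, 6)
col_parts = np.array_split(grid, 3, axis=1)    # 2D: three blocks of 2 columns each

ar_1 = [True, False, True, False, True]
picked = ar[ar_1]                              # keeps positions 0, 2 and 4
print(parts, col_parts, picked, sep="\n")
```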
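Arithmetic Function
1. Shuffle - np.random.shuffle(ar) (shuffles the array in place)
2. Unique - np.unique(ar ,return_index = True ,return_counts = True) / gives the unique numbers with the index of their first occurrence and their counts
3. Resize – Resize the array / np.resize(ar,(3,3))
4. Flatten – Convert to a 1D array, always returns a copy / ar.flatten()
5. Ravel - Convert to a 1D array, returns a view when possible / ar.ravel()
a. ar.ravel(order ="F") - column-major order
b. ar.ravel(order = "A") - column-major only if the array is stored that way in memory, otherwise row-major
c. ar.ravel(order = "K") - the order in which the elements occur in memory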
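A sketch of unique, resize, flatten and ravel on made-up arrays. Note that np.resize repeats the data when the new shape needs more elements than the original has.

```python
import numpy as np

ar = np.array([3, 1, 3, 2, 1, 3])
values, first_index, counts = np.unique(ar, return_index=True, return_counts=True)

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
flat = grid.flatten()              # 1D copy
rav_c = grid.ravel()               # 1D, row by row (default order "C")
rav_f = grid.ravel(order="F")      # 1D, column by column
resized = np.resize(grid, (3, 3))  # repeats elements to fill 3x3
print(values, first_index, counts)
print(flat, rav_c, rav_f, resized, sep="\n")
```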
Matrix
For a matrix, all arithmetic operations work as they do for an array except multiplication: matrix multiplication (dot) gives a different result from element-wise array multiplication (*)
1. Dot - ar.dot(ar_1)
2. Transpose - np.transpose(ar)
3. Swapaxes - np.swapaxes(ar ,0,1)
4. Inverse - np.linalg.inv(ar)
5. Power – np.linalg.matrix_power(ar, n); for n = 2 we can also use dot / ar.dot(ar)
a. When n = 0 it gives the identity matrix / np.linalg.matrix_power(ar, 0)
b. When n < 0 it inverts the matrix and raises the inverse to the power |n| / np.linalg.matrix_power(ar , -2)
c. When n > 0 it multiplies the matrix by itself n times (repeated dot) / np.linalg.matrix_power(ar , 2)
6. Determinant - np.linalg.det(ar)
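A sketch comparing element-wise and matrix multiplication, followed by the linalg functions listed above. The 2x2 sample matrices are made up and chosen to be invertible.

```python
import numpy as np

a = np.array([[2.0, 0.0],
              [1.0, 1.0]])
b = np.array([[1.0, 2.0],
              [3.0, 4.0]])

elementwise = a * b                       # multiplies matching positions
matrix_prod = a.dot(b)                    # rows of a times columns of b
inverse = np.linalg.inv(a)
identity = np.linalg.matrix_power(a, 0)
squared = np.linalg.matrix_power(a, 2)    # same as a.dot(a)
inv_sq = np.linalg.matrix_power(a, -2)    # same as inverse.dot(inverse)
det = np.linalg.det(a)
print(elementwise, matrix_prod, inverse, squared, inv_sq, det, sep="\n")
```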
Pandas
Allows us to handle large data sets
Cleans messy data and puts it in a readable format
More flexible than NumPy
1. Series
Creating from a list : ar = [1,2,3,4]
ar_1 = pd.Series(ar)
Give index labels of your choice and also change the datatype of the elements:
ar = pd.Series([1,2,3,4] ,index = ["a","b","c","d"] ,dtype = "f")
You can also name your data:
ar = pd.Series([1,2,3] ,index = ["a","b","c"] ,name = "Python")
We can pass a tuple or a dict
r = pd.Series({"Name" :["Ravi","Nisha","Renu"] ,"Age" : [23,34,23],
"Sex" :["M","F","F"] ,"No.":[11,2,13,14]})
Create a single-element Series : ar = pd.Series(12)
Assign one value to multiple index labels : ar = pd.Series(23 ,index = ["a","b","c","d"])
We don't have to follow the broadcasting rule for arithmetic functions and operations: Series are aligned by index label, and a label missing from either Series gives NaN
ar = pd.Series([11,12,13,14,15] ,index = [1,2,3,4,5])
ar_1 = pd.Series([12,13,14] ,index = [1,2,3])
print(ar+ar_1)
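A sketch of Series creation and of index alignment, using the same two Series as above. The add(..., fill_value = 0) call is an extra shown only for comparison; it treats a missing label as 0 instead of producing NaN.

```python
import pandas as pd

labelled = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"], dtype="f", name="Python")
repeated = pd.Series(23, index=["a", "b", "c", "d"])   # one value for every label

ar = pd.Series([11, 12, 13, 14, 15], index=[1, 2, 3, 4, 5])
ar_1 = pd.Series([12, 13, 14], index=[1, 2, 3])

total = ar + ar_1                     # labels 4 and 5 exist only in ar -> NaN
filled = ar.add(ar_1, fill_value=0)   # missing labels count as 0
print(labelled, repeated, total, filled, sep="\n")
```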
2. DataFrame
Creating from a list : ar = [1,2,3,4,5]
ar_1 = pd.DataFrame(ar)
In a dictionary, all the lists must have equal length:
print(pd.DataFrame({"Name" :["Ravi","Renu","Nisha"] ,"Age":[11,12,13] ,"Sex":["M","F","F"]}))
Selecting particular columns (when ar is a dict of columns)
print(pd.DataFrame(ar ,columns = ["Name","Age"]))
Changing the index labels
print(pd.DataFrame(ar ,index = ["a","b","c"]))
How to get a particular element in a DataFrame
ar_1["Name"][1] -- column name and index label (ar_1.loc[1 ,"Name"] does the same in one step)
We can pass uneven lists to a DataFrame; the missing positions become NaN
print(pd.DataFrame([[1,2,3,4,5],[1,2,3]]))
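A sketch of the DataFrame points above, reusing the Name/Age/Sex data. The variable names data, df, subset and uneven are only illustrative.

```python
import pandas as pd

data = {"Name": ["Ravi", "Renu", "Nisha"], "Age": [11, 12, 13], "Sex": ["M", "F", "F"]}

df = pd.DataFrame(data, index=["a", "b", "c"])       # custom index labels
subset = pd.DataFrame(data, columns=["Name", "Age"])  # keep only two columns
age_b = df.loc["b", "Age"]                            # row label, then column name

uneven = pd.DataFrame([[1, 2, 3, 4, 5], [1, 2, 3]])   # short row is padded with NaN
print(df, subset, age_b, uneven, sep="\n")
```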
Arithmetic Operation
1. Insert
2. Delete
ar.pop("Name") -- 1st method (removes the column and returns it)
del ar["Python"] -- 2nd method
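The notes give no example for inserting a column, so the following is only a sketch with made-up data. DataFrame.insert places a column at a chosen position, plain assignment appends it at the end, and pop and del remove columns as listed above.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Renu"], "Age": [23, 34]})

df.insert(1, "City", ["Delhi", "Pune"])   # new column at position 1
df["Sex"] = ["M", "F"]                    # new column at the end
columns_after_insert = list(df.columns)

ages = df.pop("Age")                      # removes "Age" and returns it as a Series
del df["City"]                            # removes "City" without returning it
print(columns_after_insert, list(df.columns), ages.tolist())
```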
1. To write a file
ar.to_csv("ec.csv")
2. To Read
pd.read_csv("orders.csv")
To read only the first rows pass nrows
pd.read_csv("orders.csv" ,nrows = 3)
To skip lines at the top of the file pass skiprows; if skiprows is 1 the original header line is skipped and the first data row becomes the column header
pd.read_csv("orders.csv" ,skiprows = 1)
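A round-trip sketch of writing and reading a CSV file. The orders data is made up and the file goes to a temporary folder, so nothing here depends on a real orders.csv.

```python
import os
import tempfile

import pandas as pd

orders = pd.DataFrame({"Order Id": [1, 2, 3, 4],
                       "City": ["Delhi", "Pune", "Agra", "Goa"]})
path = os.path.join(tempfile.mkdtemp(), "orders.csv")
orders.to_csv(path, index=False)         # index=False leaves the row index out of the file

everything = pd.read_csv(path)
first_rows = pd.read_csv(path, nrows=3)  # only the first 3 data rows
shifted = pd.read_csv(path, skiprows=1)  # header line skipped, first data row becomes the header
print(everything, first_rows, shifted, sep="\n")
```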
1. Dropna
Drop every row that has a missing value - Data.dropna()
Drop every column that has a missing value - Data.dropna(axis = 1)
Drop a row only when all of its values are NaN (fully blank row) - Data.dropna(how = "all")
Drop the rows in which a specific column is null -
Data.dropna(subset = ["Order Id"])
Apply the change to the original data instead of returning a new dataset - Data.dropna(inplace = True)
Keep only the rows that have at least the given number of non-null values; thresh = 1 drops only the rows that are entirely NaN -
Data.dropna(thresh = 1)
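A sketch of the dropna variants on a small made-up table. Row 0 is complete, rows 1 and 2 each miss one value, and row 3 is fully blank.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Order Id": [1, np.nan, 3, np.nan],
    "City": ["Delhi", "Pune", np.nan, np.nan],
    "State": ["DL", "MH", "UP", np.nan],
})

any_missing = data.dropna()                     # keeps only the complete row
all_missing = data.dropna(how="all")            # drops only the fully blank row
by_column = data.dropna(axis=1)                 # every column has a NaN, so none remain
by_subset = data.dropna(subset=["Order Id"])    # looks at "Order Id" only
at_least_three = data.dropna(thresh=3)          # needs 3 non-null values per row
print(any_missing, all_missing, by_column, by_subset, at_least_three, sep="\n")
```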
2. Fillna
Fill all null values with something - Data.fillna("python")
Fill each column with its own value –
Data.fillna({"Order Id":"Python","City":"Ghaziabad","State":"Uttar Pradesh"})
Fill null values with the previous valid value (forward fill) - Data.fillna(method = "ffill") ; recent pandas versions deprecate the method argument in favour of Data.ffill()
Fill null values with the next valid value (backward fill) - Data.fillna(method = "bfill") or Data.bfill()
Forward fill along the column axis (left to right within a row) - Data.fillna(method = "ffill" ,axis = 1) or Data.ffill(axis = 1)
inplace changes the actual data - Data.fillna(12 ,inplace = True)
limit sets the maximum number of null values to fill; here at most 2 per column -
Data.fillna("Python" ,limit = 2)
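A sketch of the fillna variants on made-up data. It uses Data.ffill() and Data.bfill() rather than the method argument, on the assumption that a recent pandas version is installed.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"Order Id": [1.0, np.nan, np.nan, 4.0],
                     "City": ["Delhi", None, "Agra", None]})

per_column = data.fillna({"Order Id": 0, "City": "Ghaziabad"})  # one fill value per column
forward = data.ffill()      # copy the previous valid value downwards
backward = data.bfill()     # copy the next valid value upwards
limited = data[["Order Id"]].fillna(0, limit=1)   # fill at most 1 null value per column
print(per_column, forward, backward, limited, sep="\n")
```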
Merge on the basis of a common key - pd.merge(ar ,ar_1 ,on = "A") -- the default is an inner join, so keys missing from either side are dropped and only the common elements are shown
Left join (all left values plus the matching right values) - pd.merge(ar ,ar_1 ,how = "left")
Right join (all right values plus the matching left values) - pd.merge(ar ,ar_1 ,how = "right")
Inner join (only common values) - pd.merge(ar ,ar_1 ,how = "inner")
Outer join (all values), optionally with an indicator column showing which side each row came from - pd.merge(ar ,ar_1 ,how = "outer" ,indicator = True)
To merge on the row index instead of a key column (for example when both frames have the same column headers), pass the index parameters
pd.merge(ar ,ar_1 ,right_index = True ,left_index = True)
We can also add suffixes if the column names are the same –
pd.merge(ar ,ar_1 ,right_index = True ,left_index = True ,suffixes = ("Name","Python"))
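A sketch of the join types on two small made-up frames that share the key column "A". The suffixes "_left" and "_right" are only illustrative.

```python
import pandas as pd

ar = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})
ar_1 = pd.DataFrame({"A": [2, 3, 4], "C": ["p", "q", "r"]})

inner = pd.merge(ar, ar_1, on="A")                  # default how="inner": keys 2 and 3
left = pd.merge(ar, ar_1, on="A", how="left")       # keys 1, 2, 3
right = pd.merge(ar, ar_1, on="A", how="right")     # keys 2, 3, 4
outer = pd.merge(ar, ar_1, on="A", how="outer", indicator=True)  # adds a "_merge" column

# Merge on the row index; both frames have a column "A", so suffixes keep them apart
by_index = pd.merge(ar, ar_1, left_index=True, right_index=True,
                    suffixes=("_left", "_right"))
print(inner, left, right, outer, by_index, sep="\n")
```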
1. Group By
2. Join
var1.join(var2)
inner join - var1.join(var2 ,how = "inner")
outer join - var1.join(var2 ,how = "outer")
left join - var1.join(var2 ,how = "left")
Right Join - var1.join(var2 ,how = "right")
Giving suffixes (when columns have the same name); use different suffixes for left and right so the names stay unique
var1.join(var2 ,how = "left" ,lsuffix = "_1" ,rsuffix = "_2")
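A sketch of join, which matches rows by index label. The two frames are made up and both have a column "A", so suffixes are needed.

```python
import pandas as pd

var1 = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "b", "c"])
var2 = pd.DataFrame({"A": [10, 20]}, index=["b", "d"])

left = var1.join(var2, how="left", lsuffix="_1", rsuffix="_2")    # labels a, b, c
inner = var1.join(var2, how="inner", lsuffix="_1", rsuffix="_2")  # label b only
outer = var1.join(var2, how="outer", lsuffix="_1", rsuffix="_2")  # labels a, b, c, d
print(left, inner, outer, sep="\n")
```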
Melt and Pivot table
2. PIVOT
Matplotlib (import matplotlib.pyplot as plt)
1. Bar Graph
plt.xlabel("") - name of the X label
plt.ylabel("") – name of the Y label
plt.legend() - shows the legend built from the label values
plt.title("") – name of the title
fontsize – sets the size of the label and title text (for example 0 to 100)
width – bar graph width (0 to 1); above 1 the bars overlap
color – color of the bars
align - can be two types, "edge" and "center" (position of the bar relative to its x value)
edgecolor – border color of the bars
linewidth - width of the bar edge (for example 0 to 100)
linestyle – style of the bar edge, can be dotted (for dotted pass ":")
label – only shown when plt.legend() is called
alpha – transparency of the graph color (0 to 1)
Create a bar graph
plt.bar(x,y ,width = 0.4 ,color = c ,align = "edge" )
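A sketch that puts the bar graph parameters together. The data and label texts are made up, and the "Agg" backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend (assumption: no display is needed)
import matplotlib.pyplot as plt

x = ["C++", "C", "Python", "Java", "R"]
y = [3, 5, 9, 6, 2]     # made-up values

plt.figure()
bars = plt.bar(x, y, width=0.4, color="c", align="edge",
               edgecolor="black", linewidth=2, linestyle=":",
               alpha=0.7, label="Popularity")
plt.xlabel("Language", fontsize=12)
plt.ylabel("Score", fontsize=12)
plt.title("Bar graph", fontsize=14)
plt.legend()            # without this call the label is not shown
```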
2. Scatter Plot
Create a scatter plot
plt.scatter(day,no , color = c ,sizes = s ,alpha = 0.7 ,marker = "*" ,edgecolor = "g" ,linewidth = 2)
3. Histogram Plot
plt.hist(no ,color = "b" ,edgecolor = "black" ,range = (0,100))
l = [10,20,30,40,50,60]
plt.hist(no ,color = "b" ,edgecolor = "black" ,bins = l)
plt.hist(no ,color = "b" ,edgecolor = "black" ,cumulative = -1 ,bottom = 10 )
plt.hist(no ,color = "b" ,edgecolor = "r" ,bottom = 10 ,align = "left" ,histtype = "step")
plt.hist(no ,color = "b" ,edgecolor = "r" ,bottom = 10 ,align = "left" ,orientation = "horizontal" ,rwidth = 0.8)
plt.hist(no ,color = "b" ,edgecolor = "r" ,log = True)
plt.axvline(35 , color = "y" ,label = "Average")
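A sketch of a histogram with explicit bin edges and a vertical line at the average. The values in no are made up; plt.hist returns the count per bin, which makes the effect of bins easy to check.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend (assumption: no display is needed)
import matplotlib.pyplot as plt

no = [12, 25, 33, 35, 41, 47, 52, 58, 60]   # made-up values
l = [10, 20, 30, 40, 50, 60]                # bin edges

plt.figure()
counts, edges, patches = plt.hist(no, color="b", edgecolor="black", bins=l)
average = sum(no) / len(no)
plt.axvline(average, color="y", label="Average")
plt.legend()
print(counts)   # number of values that fall in each bin
```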
4. Pie Chart
plt.pie(x ,labels = y ,colors = c, explode =ex ,autopct = "%0.1f%%" ,shadow = True ,radius = 1 ,
labeldistance = 1.2 ,startangle = 180 ,textprops = {"fontsize":12} ,counterclock = False ,
wedgeprops ={"linewidth":2 ,"edgecolor" : "m"}
)
5. Stem Plot
Create a Stem Plot
plt.stem(x,y ,linefmt = ":" ,markerfmt = "r+" ,bottom = 0 ,basefmt= "g",label= "python" )
Horizontal Stem Plot
plt.stem(x,y ,linefmt = ":" ,markerfmt = "r+" ,bottom = 0 ,basefmt= "g" ,label= "python" ,orientation = "horizontal")
For a stack plot (plt.stackplot): baseline = "zero" , "sym" or "wiggle" changes the shape of the graph
8. Step plot
plt.step(x,y ,marker = "o" ,color = "r" ,ms = 10 ,mfc = "g" ,label = "python")
9. Fill-between Plot
Fill the color under the graph
plt.plot(x,y ,color = "r")
plt.fill_between(x,y)
Use Of where
plt.plot(x,y)
plt.fill_between(x,y,color = "r" ,where=(x>=2) & (x<=4) ,alpha = 0.5)
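A sketch of fill_between with where. The condition (x>=2) & (x<=4) is an element-wise comparison, so x is assumed to be a NumPy array here (a plain list would not work). The data is made up.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend (assumption: no display is needed)
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])

mask = (x >= 2) & (x <= 4)    # True only where the area should be filled

plt.figure()
plt.plot(x, y)
region = plt.fill_between(x, y, color="r", where=mask, alpha=0.5)
print(mask)
```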
10. Subplot
Used for multiple graph plotting
plt.subplot(2,2,1)
plt.plot(x,y)
plt.subplot(2,2,2)
plt.pie([1] ,colors = ["r"])
11. Savefig
plt.savefig("R2" ,dpi = 2000 ,facecolor = "y" ,transparent = True ,bbox_inches = "tight")
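saves the image as PDF, JPG or PNG in a folder (the format follows the file extension; PNG is the default)
transparent – makes the axes background transparent (and the figure background too, unless facecolor is given)
facecolor – fills the color outside the plot area (figure background)
bbox_inches – "tight" trims the empty space around the graph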
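A sketch of saving the same figure as PNG and as PDF. The file goes to a temporary folder, and a smaller dpi than the 2000 used above keeps the example quick. Data and file names are made up.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # non-interactive backend (assumption: no display is needed)
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 8])

out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "R2.png")
pdf_path = os.path.join(out_dir, "R2.pdf")

plt.savefig(png_path, dpi=100, facecolor="y", bbox_inches="tight")  # yellow figure background
plt.savefig(pdf_path, transparent=True, bbox_inches="tight")        # format chosen by extension
plt.close()
print(os.listdir(out_dir))
```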
12. Import Image
from PIL import Image
fname = r'C:\Users\ravib\Desktop\Passport.jpg'
# open the image using PIL
image = Image.open(fname)
plt.imshow(image)
plt.savefig("R" ,dpi = 2000)
13. Xticks , Yticks,xlim,ylim,axis
plt.xticks(x ,labels = ["C++","C","Python","Java","R"]) – sets the tick positions and the tick labels on the X axis
plt.yticks(y) - sets the tick positions on the Y axis
plt.xlim(0,10) – sets the range of the X axis
plt.ylim(0,15) – sets the range of the Y axis
plt.axis([0,11,0,6]) – sets the range of both axes at once as [xmin, xmax, ymin, ymax]
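A sketch of ticks and axis limits on a made-up line plot. Called with no arguments, plt.xlim() and plt.ylim() return the current range, which is used here to show that plt.axis overrides the earlier limits.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend (assumption: no display is needed)
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]     # made-up values

plt.figure()
plt.plot(x, y)
plt.xticks(x, labels=["C++", "C", "Python", "Java", "R"])   # one label per tick position
plt.yticks(y)

plt.xlim(0, 10)
plt.ylim(0, 15)
limits_before = (plt.xlim(), plt.ylim())

plt.axis([0, 11, 0, 6])                 # xmin, xmax, ymin, ymax
limits_after = (plt.xlim(), plt.ylim())
print(limits_before, limits_after)
```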
Types of statistics –
1. Descriptive – Collection, analysis and interpretation of the data
Understanding the main features of the data
2. Inferential – Drawing conclusions about a population from a sample of the data
Population
Sample