NUMPY
Used for mathematical operations on arrays
Faster than Python lists and consumes less memory than lists
Create an Array
1. To know the dimension of an array (ar.ndim)
2. To set the minimum number of dimensions of your choice (ndmin)
3. To create array with Zero Values
a. ar = np.zeros(4)
b. ar_1 = np.zeros((3,3))
4. To create an array filled with ones
a. ar = np.ones(3)
b. ar_1 = np.ones((3,2))
5. To create an empty (uninitialized) array; its values are arbitrary
a. ar = np.empty(4)
6. To Create array with specified Value
a. ar = np.full((3,3) , 5)
7. To create a particular range array
a. ar = np.arange(5)
b. ar_1 = np.arange(4,10)
8. To create an array with a specified number of equally spaced values
a. ar = np.linspace(0,10,5)
9. To create Diagonal Element array or Identity array (eye function)
a. ar = np.eye(3)
b. ar_1 = np.eye(3,5)
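The creation functions listed above can be tried together. This is a minimal sketch; the variable names are only illustrative.

```python
import numpy as np

zeros = np.zeros((3, 3))               # 3x3 array of 0.0
ones = np.ones((3, 2))                 # 3x2 array of 1.0
full = np.full((3, 3), 5)              # 3x3 array where every value is 5
rng = np.arange(4, 10)                 # 4..9, the stop value 10 is excluded
lin = np.linspace(0, 10, 5)            # 5 equally spaced values, both ends included
eye = np.eye(3, 5)                     # 3x5 array with ones on the main diagonal
nested = np.array([1, 2, 3], ndmin=2)  # ndmin forces at least 2 dimensions

print(rng)          # [4 5 6 7 8 9]
print(lin)          # [ 0.   2.5  5.   7.5 10. ]
print(nested.ndim)  # 2
```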
Random Value Array
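1. rand() - gives values between 0 and 1 (uniform distribution)
2. randn() - gives values from the standard normal distribution: close to 0, can be positive or negative
3. ranf() - gives values in [0.0 , 1.0): includes 0.0 but not 1.0
4. randint() - gives random integers in a given range
Example:
1. To create a random array with values between 0 and 1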
a. ar = np.random.rand(5)
b. ar_1 = np.random.rand(3,3)
2. To create a random value array with values close to 0 (can be positive or negative)
a. ar = np.random.randn(5)
b. ar_1 = np.random.randn(3,3)
3. To create a random value array with values in [0.0 ,1.0): includes 0.0 but not 1.0
a. ar = np.random.ranf(4)
b. ar_1 = np.random.ranf((2,4))
4. To create an array of random integers, giving the start, the end and the number of values
a. ar = np.random.randint(1,10,4) --[start, end (excluded), no. of values]
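A short check of the ranges described above. The seed is an assumption added here only to make the run repeatable; it is not part of the notes.

```python
import numpy as np

np.random.seed(0)                # assumption: fixed seed for repeatable output

u = np.random.rand(3, 3)         # uniform values in [0.0, 1.0)
n = np.random.randn(5)           # standard normal values, positive or negative
i = np.random.randint(1, 10, 4)  # 4 integers from 1 to 9 (10 is excluded)

print(u.min() >= 0.0, u.max() < 1.0)  # True True
print(i)
```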
DATA TYPES
1. To know the datatype of a given array
a. ar.dtype
2. To set the datatype while creating an array
a. Method 1
ar = np.array([1,2,3,4] , dtype = np.int8)
b. Method 2
ar_1 = np.array([1,2,3,4] ,dtype= 'f')
3. Datatype conversion Function (astype)
a. ar = np.array([1,2,3,4])
ar_1 = ar.astype(float)
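A small sketch of the difference between setting the datatype at creation and converting afterwards with astype, which returns a new array and leaves the original unchanged:

```python
import numpy as np

ar = np.array([1, 2, 3, 4], dtype=np.int8)  # datatype chosen at creation
ar_1 = ar.astype(float)                     # converted copy; ar keeps int8

print(ar.dtype)    # int8
print(ar_1.dtype)  # float64
```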
Shape and Reshape
1. ar.shape - To know the shape of an array (it is an attribute, so no parentheses)
2. ar.reshape(2,3), ar.reshape(2,2,3) - To reshape your array (the total number of elements must stay the same)
3. ar.flatten() - To convert a multi-dimensional array to 1-D
4. ar.ndim - To know the dimension of an array
5. To set the minimum dimension of an array at creation, ndmin
ar = np.array([1,2,3,4] , ndmin = 3)
6. To know the shape of an array
ar.shape
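The shape functions above can be sketched on a 12-element array (an arbitrary size chosen for illustration, since 12 = 3x4 = 2x2x3):

```python
import numpy as np

ar = np.arange(12)          # 1-D array with 12 elements
m = ar.reshape(3, 4)        # 2-D, 3 rows and 4 columns
cube = ar.reshape(2, 2, 3)  # 3-D, element count is still 12
flat = cube.flatten()       # back to 1-D

print(m.shape)    # (3, 4)
print(cube.ndim)  # 3
print(flat.shape) # (12,)
```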
Arithmetic Operation
1. SUM = ar+3, ar1+ar2 , np.add(ar1,ar2)
2. SUBTRACT = ar-3, ar1-ar2 ,np.subtract(ar1,ar2)
3. Multiply = ar*3 ,ar1*ar2, np.multiply(ar1,ar2)
4. Divide = ar/2 , ar1/ar2 , np.divide(ar1,ar2)
5. Mod = ar%2 , ar1%ar2 , np.mod(ar1,ar2)
6. Power = ar**2 , ar1**ar2 , np.power(ar1,ar2)
7. Reciprocal = np.reciprocal(ar) (takes one array; use a float array, integer arrays give integer results)
Arithmetic Function
1. Min - np.min(ar) , np.min(ar ,axis = 1) , np.min(ar ,axis = 0)
2. Max - np.max(ar) , np.max(ar ,axis = 1) , np.max(ar ,axis = 0)
3. Argmin/argmax -Tells the position of Max and Min - np.argmax(ar)/ np.argmin(ar)
4. Sqrt - np.sqrt(ar)
5. Sin - np.sin(ar)
6. Cos - np.cos(ar)
7. Cumsum - np.cumsum(ar) / np.cumsum(ar ,axis = 0) / np.cumsum(ar ,axis = 1)
Broadcasting
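1. Rule: two arrays can be combined when, in each dimension, the sizes are equal or one of them is 1
1) Same shape = possible
2) 1x3 and 3x1 = possible (each mismatched dimension has a 1), result is 3x3
3) 1x3 and 1x4 = not possible (3 and 4 differ and neither is 1)
4) 2x1 and 2x3 = possible, result is 2x3
2. Arithmetic operations follow the broadcasting rule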
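The cases above can be checked directly. A minimal sketch; the flag name `failed` is only illustrative.

```python
import numpy as np

row = np.array([[1, 2, 3]])         # shape (1, 3)
col = np.array([[10], [20], [30]])  # shape (3, 1)
grid = row + col                    # both are stretched to (3, 3)

wide = np.ones((2, 1)) + np.ones((2, 3))  # (2, 1) with (2, 3) gives (2, 3)

try:
    np.ones((1, 3)) + np.ones((1, 4))     # 3 and 4 differ and neither is 1
    failed = False
except ValueError:
    failed = True

print(grid.shape, wide.shape, failed)  # (3, 3) (2, 3) True
```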
Indexing and Slicing
Indexing : print(ar[0]) / print(ar[0,3]) /print(ar[1,-1]) / print(ar[0,1,2])
Slicing : print(ar[1:3]) / print(ar[::2]) / print(ar[1,1:4])
Iteration
Print elements one by one
For 1D array - one for loop
For 2D array - two nested for loops
For 3D array - three nested for loops
1. Nditer - gives elements one by one without nested loops / for i in np.nditer(var) /
for i in np.nditer(var ,flags =["buffered"],op_dtypes = ["S"]) - the buffer holds the values converted to the requested datatype
2. Ndenumerate - gives the position (index) and value of each element one by one / for i, v in np.ndenumerate(ar)
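A small sketch of both iterators on a 2x2 array, collecting the results in lists so they are easy to inspect:

```python
import numpy as np

ar = np.array([[1, 2], [3, 4]])

# nditer visits every element with a single loop, whatever the dimension
values = [int(v) for v in np.nditer(ar)]

# ndenumerate yields (index, value) pairs
positions = [(idx, int(v)) for idx, v in np.ndenumerate(ar)]

print(values)     # [1, 2, 3, 4]
print(positions)  # [((0, 0), 1), ((0, 1), 2), ((1, 0), 3), ((1, 1), 4)]
```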
Copy and View
A copy owns its data (new array) and a view does not own its data (it shares the original data), so any change made in
the view is reflected in the original array and vice versa, while a copy is not affected.
ar.copy()
ar.view()
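The behaviour described above can be verified with a few assignments:

```python
import numpy as np

ar = np.array([1, 2, 3, 4])
c = ar.copy()  # independent data
v = ar.view()  # shares data with ar

ar[0] = 99     # change the original
v[1] = 42      # change the view

print(c)   # [1 2 3 4]      the copy is untouched
print(v)   # [99 42  3  4]  the view follows the original
print(ar)  # [99 42  3  4]  the original follows the view
```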
Join and Split
Join :
1. Concatenate - np.concatenate((ar,ar_1)) / np.concatenate((ar,ar_1),axis = 0) /
np.concatenate((ar,ar_1),axis = 1)
2. Stack - Similar to concatenate, but joins along a new axis / np.stack((ar,ar_1),axis = 1) / np.stack((ar,ar_1),axis = 0)
a. Vstack - np.vstack((ar,ar_1)) / along vertically
b. Hstack - np.hstack((ar,ar_1)) / along horizontally
c. Dstack - np.dstack((ar,ar_1)) / along depth (third axis)
Split:
np.array_split(ar,2) – 1D
np.array_split(ar,3,axis = 1) – 2D
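Comparing the resulting shapes makes the difference between the join functions clearer. A minimal sketch with two 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)  # (4, 2): b placed under a
cols = np.concatenate((a, b), axis=1)  # (2, 4): b placed beside a
stacked = np.stack((a, b), axis=0)     # (2, 2, 2): a new axis is created
deep = np.dstack((a, b))               # (2, 2, 2): joined along the third axis

# array_split also accepts sizes that do not divide evenly
parts = np.array_split(np.arange(7), 3)
print([p.tolist() for p in parts])     # [[0, 1, 2], [3, 4], [5, 6]]
```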
Search ,Sort and Filter
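Search : gives the indices where a condition is true / np.where(ar/2==0) (true only where ar is 0) / np.where(ar%2==0) (even values)
Search Sorted : gives the index where a value should be inserted to keep a sorted array sorted / np.searchsorted(ar,5) /
np.searchsorted(ar,5, side = "right")/ np.searchsorted(ar,5, side = "left") /
np.searchsorted(ar,[3,5,7], side = "right")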
Sort : np.sort(ar) / np.sort(ar ,axis = 0) / np.sort(ar ,axis = 1)
Filter : ar = np.array([1,2,3,4,5])
ar_1 = [True,False,True,False,True]
print(ar[ar_1]) -- keeps only the elements where ar_1 is True : [1 3 5]
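A short sketch of search, searchsorted, sort and filter together. The sample values are chosen only for illustration.

```python
import numpy as np

ar = np.array([1, 2, 3, 4, 5, 6])
even_idx = np.where(ar % 2 == 0)[0]  # indices of the even values

sorted_ar = np.array([1, 3, 5, 5, 7])
left = np.searchsorted(sorted_ar, 5)                 # before the existing 5s
right = np.searchsorted(sorted_ar, 5, side="right")  # after the existing 5s

ordered = np.sort(np.array([3, 1, 2]))
big = ar[ar > 3]  # the mask can also come from a condition

print(even_idx)     # [1 3 5]
print(left, right)  # 2 4
print(ordered)      # [1 2 3]
print(big)          # [4 5 6]
```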
Arithmetic Function
1. Shuffle - np.random.shuffle(ar) / shuffles the array in place
2. Unique - np.unique(ar ,return_index = True ,return_counts = True) / gives the unique values with the index of their
first occurrence and their count
3. Resize - Resize the array / np.resize(ar,(3,3))
4. Flatten - Convert to a 1D array (always a copy) / ar.flatten()
5. Ravel - ar.ravel() / like flatten, but returns a view when possible
a. ar.ravel(order ="F")
b. ar.ravel(order = "A")
c. ar.ravel(order = "K")
Insert and Delete
Insert : np.insert(var,0,22) / np.insert(var,(0,4),22) / np.insert(ar,2,9,axis = 0) /
np.insert(ar,2,9,axis = 1) / np.insert(ar , 2 ,[11,12,13] , axis = 0) /
np.insert(ar , 2 ,[11,12] , axis = 1)
Append : adds values at the end / np.append(ar,23) / np.append(ar ,[[11,12,13]] , axis = 0) /
np.append(ar ,[[11,12],[1,6]], axis = 1)
Delete : np.delete(ar , 0 ) / the element at position 0 will be deleted
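Insert, append and delete all return a new array and leave the original unchanged, as this small sketch shows:

```python
import numpy as np

ar = np.array([1, 2, 3, 4])
inserted = np.insert(ar, 0, 22)  # 22 placed at position 0
appended = np.append(ar, 23)     # 23 added at the end
deleted = np.delete(ar, 0)       # element at position 0 removed

grid = np.array([[1, 2, 3], [4, 5, 6]])
new_row = np.insert(grid, 1, [11, 12, 13], axis=0)  # new row at index 1

print(inserted)  # [22  1  2  3  4]
print(appended)  # [ 1  2  3  4 23]
print(deleted)   # [2 3 4]
print(ar)        # [1 2 3 4]
```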
Matrix
For a matrix all arithmetic operations are the same as for an array except multiplication: * multiplies element by
element, while matrix multiplication uses dot, so the result is different for array and matrix
1. Dot - ar.dot(ar_1)
2. Transpose - np.transpose(ar)
3. Swapaxes - np.swapaxes(ar ,0,1)
4. Inverse - np.linalg.inv(ar)
5. Power - np.linalg.matrix_power(ar, n) ; for n = 2 we can also use dot / ar.dot(ar) /
a. When n = 0 it gives the identity matrix / np.linalg.matrix_power(ar, 0)
b. When n<0 it uses the inverse of the matrix, raised to the power |n| / np.linalg.matrix_power(ar , -2)
c. When n>0 it repeats the dot function n times / np.linalg.matrix_power(ar , 2)
6. Determinant - np.linalg.det(ar)
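A worked sketch of the matrix functions on a small 2x2 matrix. The matrix is chosen here because its determinant is 1, which keeps the inverse easy to read.

```python
import numpy as np

m = np.array([[2.0, 1.0], [1.0, 1.0]])

elementwise = m * m  # element by element: [[4, 1], [1, 1]]
product = m.dot(m)   # matrix product:     [[5, 3], [3, 2]]

inv = np.linalg.inv(m)  # [[1, -1], [-1, 2]]
det = np.linalg.det(m)  # 2*1 - 1*1 = 1

same_as_dot = np.allclose(np.linalg.matrix_power(m, 2), product)
is_identity = np.allclose(np.linalg.matrix_power(m, 0), np.eye(2))
is_inverse = np.allclose(np.linalg.matrix_power(m, -1), inv)

print(same_as_dot, is_identity, is_inverse)  # True True True
```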
Pandas
Allow us to handle large data sets
Clean messy data and make them in a readable format
More flexible than Numpy
Two Data Structure :Series and DataFrame
1. Series
Creating From List : ar = [1,2,3,4]
ar_1 = pd.Series(ar)
Give index number of your Choice and also change the datatype of element:
o ar = pd.Series([1,2,3,4] ,index = ["a","b","c","d"],dtype = "f" )
Also can name your data :
o ar = pd.Series([1,2,3] ,index = ["a","b","c"] ,name = "Python")
We can pass a tuple or a dict
o r = pd.Series({"Name" :["Ravi","Nisha","Renu"] ,"Age" : [23,34,23],
"Sex" :["M","F","F"] ,"No.":[11,2,13,14]})
Create single dataseries : ar = pd.Series(12)
Assign Value to multiple index : ar = pd.Series(23 ,index= ["a","b","c","d"])
We don't have to follow the broadcasting rule for arithmetic functions and operations: Series are aligned on their index, and an index present in only one Series gives NaN
o ar = pd.Series([11,12,13,14,15] ,index = [1,2,3,4,5])
ar_1 = pd.Series([12,13,14] ,index = [1,2,3])
print(ar+ar_1)
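The example above, with the result spelled out. The fill_value variant is an addition for illustration; it treats a missing index as 0 instead of producing NaN.

```python
import pandas as pd

ar = pd.Series([11, 12, 13, 14, 15], index=[1, 2, 3, 4, 5])
ar_1 = pd.Series([12, 13, 14], index=[1, 2, 3])

total = ar + ar_1                    # index 4 and 5 exist only in ar, so NaN
filled = ar.add(ar_1, fill_value=0)  # missing index counted as 0

print(total.tolist())   # [23.0, 25.0, 27.0, nan, nan]
print(filled.tolist())  # [23.0, 25.0, 27.0, 14.0, 15.0]
```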
2. DataFrame
Creating From the List : ar = [1,2,3,4,5]
ar_1 = pd.DataFrame(ar)
In a dictionary, we have to pass lists of equal length:
o print(pd.DataFrame({"Name" :["Ravi","Renu","Nisha"] ,"Age":[11,12,13],"Sex":
["M","F","F"]}))
Printing particular columns
o print(pd.DataFrame(ar ,columns = ["Name","Age"]))
Changing Index number
o print(pd.DataFrame(ar ,index = ["a","b","c"]))
How to get particular element in DataFrame
o ar_1["Name"][1] -- column_name and Index Number
We can pass uneven lists in DataFrame; the missing values become NaN
o print(pd.DataFrame([[1,2,3,4,5],[1,2,3]]))
Arithmetic Operation
Each operation below adds a new column
1. Addition ,Subtraction ,Multiplication and Division
ar["Add"] = ar["A"]+ ar["B"]
ar["subtract"] = ar["A"]- ar["B"]
ar["Multiply"] = ar["A"]* ar["B"]
ar["Divide"] = ar["A"]/ ar["B"]
2. Putting Condition to Check
ar["Right"] = ar["Add"]>8
it will give new column of True and False according to condition
Insert and Delete Function
1. Insert
It will add a new column at the given position (column index)
ar.insert(0,"Nam",["a","b","c","d"])
For adding at the end of the table
ar["Python"] = ar["Add"][0:2] -- it will add the values of the Add column at index 0 and 1; the other rows get NaN.
2. Delete
ar.pop("Nam") -- 1st Method
del ar["Python"] -- 2nd Method
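The column operations above can be followed step by step. This is a sketch on a small table with columns A and B; the values are made up for illustration.

```python
import pandas as pd

ar = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

ar["Add"] = ar["A"] + ar["B"]  # 6, 8, 10, 12
ar["Right"] = ar["Add"] > 8    # False, False, True, True

ar.insert(0, "Nam", ["a", "b", "c", "d"])  # new first column
ar["Python"] = ar["Add"][0:2]              # rows 0 and 1 filled, rows 2 and 3 NaN
missing = int(ar["Python"].isna().sum())

popped = ar.pop("Nam")  # removes the column and returns it
del ar["Python"]        # removes the column, returns nothing

print(list(ar.columns))  # ['A', 'B', 'Add', 'Right']
print(missing)           # 2
```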
Write and Read a File
1. To write a file
ar.to_csv("ec.csv")
2. To Read
pd.read_csv("orders.csv")
To read only the first n rows pass nrows
pd.read_csv("orders.csv" ,nrows = 3)
To read particular columns pass usecols and the column names
pd.read_csv("orders.csv" ,usecols = ["City","Country"])
To read particular columns pass usecols and the column index numbers
pd.read_csv("orders.csv" ,usecols = [0,2])
To skip rows at the start of the file pass skiprows; if skiprows is 1 the header line is skipped and the 1st data row becomes the column header
pd.read_csv("orders.csv" ,skiprows = 1)
To replace the default index column with one of the columns pass index_col
pd.read_csv("orders.csv" ,index_col = "Order Id")
To use any row as the column heading pass header
pd.read_csv("orders.csv" ,header = 2)
To give your own column names pass names (pass header = 0 as well to replace an existing header row)
pd.read_csv("orders.csv" ,names =["a","b","c","d"])
To read the file without a header pass header = None
pd.read_csv("orders.csv" ,header = None )
Change Datatype of any column
pd.read_csv("orders.csv" ,dtype = {"Order Id" : "float"})
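The read_csv parameters can be tried without a real file by reading from a string. The CSV content below is made up for illustration and only stands in for orders.csv.

```python
import io
import pandas as pd

# assumption: a tiny stand-in for orders.csv
csv_text = "Order Id,City,Country\n1,Delhi,India\n2,Paris,France\n3,Rome,Italy\n"

first_two = pd.read_csv(io.StringIO(csv_text), nrows=2)
cities = pd.read_csv(io.StringIO(csv_text), usecols=["City", "Country"])
by_id = pd.read_csv(io.StringIO(csv_text), index_col="Order Id")
no_header = pd.read_csv(io.StringIO(csv_text), header=None)  # header line becomes data
typed = pd.read_csv(io.StringIO(csv_text), dtype={"Order Id": "float"})

print(len(first_two))            # 2
print(no_header.shape)           # (4, 3)
print(typed["Order Id"].dtype)   # float64
```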
Table Related Query
Get index Number and also count of entries -Data.index
Get all columns Name -Data.columns
Changing column names to lower case and replacing spaces with '_'
Data.columns = Data.columns.str.lower()
Data.columns = Data.columns.str.replace(" ","_")
Get Statistics of the data -Data.describe()
Access the data of Particular Range -Data[5:8]
Convert DataFrame index row into array -Data.index.array
Convert DataFrame into array
1st Method - Data.to_numpy()
2nd Method - np.asarray(Data)
Sort Data on the basis of Index -Data.sort_index(axis = 0 ,ascending = False).head(5)
Changing Row element -Data["region"][0] = "Python" (chained assignment, may raise a warning; the LOC function is the safer way)
LOC Function
1st Changing the element - Data.loc[0,'region'] = "Golgappe"
2nd Get element - Data.loc[[2,3],["city","region"]]
3rd Get column data - Data.loc[:,["city","region"]]
4th Get particular data from index number - Data.iloc[0,3] -- iloc function
Drop a Column -Data.drop("city",axis = 1)
Drop a Row - Data.drop(0,axis = 0)
Count null value - Data.isnull().sum()
Dropna and Fillna
1. Dropna
Drop rows which have any missing value - Data.dropna()
Drop columns which have any missing value - Data.dropna(axis = 1)
Drop a row only if all its values are NaN (fully blank row) - Data.dropna(how = "all")
Drop the rows where a specific column has a null value -
Data.dropna(subset = ["Order Id"])
Make the change in the original data instead of returning a new table - Data.dropna(inplace = True)
Keep only the rows which have at least the given number of non-null values (here 1) -
Data.dropna(thresh = 1)
2. Fillna
Fill all null value with something - Data.fillna("python")
Fill column with specific value –
Data.fillna({"Order Id":"Python","City":"Ghaziabad","State":"Uttar Pradesh"})
Fill null value with forward value (the previous value) - Data.fillna(method = "ffill") ; in recent pandas versions use Data.ffill()
Fill null value with backward value (the next value) - Data.fillna(method = "bfill") ; in recent pandas versions use Data.bfill()
Fill null value with forward value along axis - Data.fillna(method = "ffill" , axis = 1)
Inplace changes the actual data - Data.fillna(12,inplace = True)
Limit parameter sets the maximum number of null values to fill (here at most 2 in each column) -
Data.fillna("Python" ,limit = 2)
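A small sketch of dropna and forward fill on a 4-row table. The table is made up for illustration, and Data.ffill() is used for the forward fill.

```python
import numpy as np
import pandas as pd

Data = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [np.nan, np.nan, 3.0, 4.0],
})  # row 1 is fully blank, only row 3 is complete

dropped_any = Data.dropna()             # keeps only the complete row 3
dropped_all = Data.dropna(how="all")    # drops only the fully blank row 1
kept_thresh = Data.dropna(thresh=1)     # rows with at least 1 non-null value
subset_a = Data.dropna(subset=["a"])    # rows where column a is not null
forward = Data.ffill()                  # previous value carried forward

print(len(dropped_any), len(dropped_all), len(kept_thresh), len(subset_a))  # 1 3 3 2
print(forward["a"].tolist())  # [1.0, 1.0, 1.0, 4.0]
```

The first value of column b stays NaN after the forward fill because there is no earlier value to carry forward.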
Replace and Interpolate
1. Replace - replaces given values with other values (also useful for handling missing values)
Replace a particular value with something else –
Data.replace(to_replace = 240 ,value = 22).head(2)
Data.replace(to_replace = "Consumer" ,value = "Python")
Replace a list of values with one value - Data.replace([1,2,3,4,5,6],34)
Replace alphabets with other -
Data.replace("[A-Za-z]","python" ,regex = True)
Data.replace("[A-Z]","python" ,regex = True)
Forward replace (with the previous value) - Data.replace(1,method = "ffill") ; the method parameter is deprecated in recent pandas versions
Backward replace (with the next value) - Data.replace(1,method = "bfill")
Limit parameter (replaces only a limited number of values) = Data.replace(2,method = "ffill",limit = 1)
Inplace (Permanent Changes) - Data.replace(to_replace = 240 ,value = 22 ,inplace = True)
2. Interpolate - null values are filled automatically on the basis of the neighbouring values
It does not work on string and text values
Fill all null values from their sequence - Data.interpolate()
Linearly fill all the null values - Data.interpolate(method = "linear")
Along the axis - Data.interpolate(method = "linear" ,axis = 0) .head(5) -- For axis = 1 the datatype
must be the same, not a mix of string and int
Limit Parameter (If you don't want to fill all the null values) - Data.interpolate(limit = 2)
Give Direction to the limit –
Data.interpolate(limit_direction = "forward" ,limit = 2)
Data.interpolate(limit_direction = "backward" ,limit = 2)
Data.interpolate(limit_direction = "both" ,limit = 2)
Limit area –
Data.interpolate(limit_area = "inside")
Data.interpolate(limit_area = "outside")
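The effect of limit and limit_area is easiest to see on a short Series. A minimal sketch, using a Series instead of the Data table for brevity:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan])

full = s.interpolate()                       # gaps filled linearly, last value carried forward
limited = s.interpolate(limit=1)             # at most 1 consecutive null filled
inside = s.interpolate(limit_area="inside")  # only nulls between valid values

print(full.tolist())     # [1.0, 2.0, 3.0, 4.0, 4.0]
print(limited.tolist())  # [1.0, 2.0, nan, 4.0, 4.0]
print(inside.tolist())   # [1.0, 2.0, 3.0, 4.0, nan]
```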
Merge and Concat
1. Merge (Similar to Join in SQL)
Merge on the basis of a common key - pd.merge(ar ,ar_1 ,on = "A") -- by default only the keys present in
both tables are shown (inner join)
Left join (All left value will show and common value) - pd.merge(ar ,ar_1 ,how = "left")
Right join (All right value will show and common value) - pd.merge(ar ,ar_1 ,how = "right")
Inner join (only common value) - pd.merge(ar ,ar_1 ,how = "inner")
Outer join (All Values) and also add Indicator - pd.merge(ar ,ar_1 ,how = "outer" ,indicator = True)
If both tables have the same column names, you can merge on the index by giving the index parameters
pd.merge(ar ,ar_1 ,right_index = True,left_index = True)
We can also add suffixes if the column names are the same -
pd.merge(ar ,ar_1 ,right_index = True,left_index = True , suffixes =("Name","Python"))
2. Concat (Similar to Union in SQL)
Series Concat - pd.concat([ar,ar_1])
DataFrame Concat - pd.concat([ar,ar_1])
Concat along axis - pd.concat([var1,var2] , axis = 1)
Concat as Outer join - pd.concat([var1,var2],axis = 1 ,join = "outer")
Concat as Inner join - pd.concat([var1,var2],axis = 1 ,join = "inner")
You can also give the key to your dataframe - pd.concat([var1,var2],axis =1 ,keys =
["d1","d2"])
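A sketch of the join types on two small tables that share the key column A. The tables are made up for illustration.

```python
import pandas as pd

ar = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})
ar_1 = pd.DataFrame({"A": [2, 3, 4], "C": ["p", "q", "r"]})

inner = pd.merge(ar, ar_1, on="A")               # keys 2 and 3 only
left = pd.merge(ar, ar_1, how="left")            # keys 1, 2, 3; C is NaN for key 1
outer = pd.merge(ar, ar_1, how="outer", indicator=True)  # keys 1..4 plus _merge column

stacked = pd.concat([ar, ar_1])                        # rows of ar_1 placed under ar
side = pd.concat([ar, ar_1], axis=1, keys=["d1", "d2"])  # side by side, labelled

print(inner["A"].tolist())  # [2, 3]
print(len(outer))           # 4
print(stacked.shape)        # (6, 3)
print(side.shape)           # (3, 4)
```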
Group By and Join
1. Group By
Group By along with some Indicator –
var_new = var.groupby("Name")
How to show the grouped table -
for x,y in var_new:
    print(x)
    print(y)
    print()
How to get the rows of one particular group - var_new.get_group("a")
How to access min and mean (pass numeric_only = True to mean if the table has text columns) -
var_new.min()
var_new.mean()
2. Join
var1.join(var2)
inner join - var1.join(var2 ,how = "inner")
outer join - var1.join(var2 ,how = "outer")
left join - var1.join(var2 ,how = "left")
Right Join - var1.join(var2 ,how = "right")
Giving suffix (when column has similar name)
var1.join(var2 ,how = "left" ,lsuffix = "_1",rsuffix = "_2")
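A short sketch of group by and join. The tables and the column name Marks are made up for illustration.

```python
import pandas as pd

var = pd.DataFrame({
    "Name": ["a", "b", "a", "b", "c"],
    "Marks": [10, 20, 30, 40, 50],
})
var_new = var.groupby("Name")

group_a = var_new.get_group("a")  # the two rows where Name is "a"
mins = var_new["Marks"].min()     # a: 10, b: 20, c: 50
means = var_new["Marks"].mean()   # a: 20, b: 30, c: 50

# join works on the index; suffixes separate the two columns named x
var1 = pd.DataFrame({"x": [1, 2]}, index=["r1", "r2"])
var2 = pd.DataFrame({"x": [3, 4]}, index=["r2", "r3"])
joined = var1.join(var2, how="outer", lsuffix="_l", rsuffix="_r")

print(means.to_dict())       # {'a': 20.0, 'b': 30.0, 'c': 50.0}
print(list(joined.columns))  # ['x_l', 'x_r']
```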
Melt and Pivot table
1. Melt() - reshapes the table from wide to long format, with one variable column and one value column
pd.melt(var1)
Keep days as the identifier column - pd.melt(var1 ,id_vars = ["days"])
Changing the variable name - pd.melt(var1 ,id_vars = ["days"] ,var_name = "Python")
Changing the value name –
pd.melt(var1 ,id_vars = ["days"] ,var_name = "Python" ,value_name = "Ravi")
2. PIVOT
Set index as Days
var1.pivot(index = "days" ,columns="st_name")
Only for specific value - var1.pivot(index = "days" ,columns="st_name" ,values ="eng")
Show value as Mean –
var1.pivot_table(index = "st_name" ,columns = "days" ,aggfunc="mean")
Show value as Sum - var1.pivot_table(index = "st_name" ,columns = "days" ,aggfunc="sum")
Show the total margin as per their function -
var1.pivot_table(index = "st_name" ,columns = "days" ,aggfunc="mean" ,margins = True)
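Melt and pivot on a small table. The column names days, st_name and eng follow the notes; the values are made up for illustration.

```python
import pandas as pd

var1 = pd.DataFrame({
    "days": [1, 1, 2, 2],
    "st_name": ["a", "b", "a", "b"],
    "eng": [10, 20, 30, 40],
})

# wide to long: every non-id column becomes (variable, value) rows
long = pd.melt(var1, id_vars=["days"], var_name="Python", value_name="Ravi")

# long to wide: one row per day, one column per student
wide = var1.pivot(index="days", columns="st_name", values="eng")

# pivot_table aggregates; margins adds an "All" row and column
table = var1.pivot_table(index="st_name", columns="days", values="eng",
                         aggfunc="mean", margins=True)

print(long.shape)                # (8, 3)
print(wide.loc[2, "b"])          # 40
print(table.loc["All", "All"])   # 25.0
```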
MATPLOTLIB
Used for Data Visualization
Types - Matplotlib - low level ,provides lots of freedom
Pandas visualization - easy to use interface ,built on matplotlib
Seaborn - high level interface, great default styles
ggplot - based on R's ggplot2 ,uses grammar of graphics
plotly - can create interactive plots
Graphs - Line ,Scatter,Bar,Stem,Step,Hist,Box,Pie,Fill_between Plot
1. Bar Graph
plt.xlabel("") - name of the X label
plt.ylabel("") - name of the Y label
plt.legend() - shows the legend built from the label of each plot
plt.title("") - name of the title
Fontsize - sets the size of the label and title text
Width - bar width (0 to 1); above 1 the bars overlap
Color - color of the bars
Align - can be of two types, edge and center (position of the bar relative to its x value)
Edgecolor - border color of the bars
Linewidth - width of the bar border
Linestyle - style of the bar border, can be dotted (for dotted pass ":")
Label - name shown in the legend; only visible when plt.legend() is called
Alpha - transparency of the graph color (0 to 1)
Create a bar graph
plt.bar(x,y ,width = 0.4 ,color = c ,align = "edge" )
Double bar graph
width = 0.4
p = np.arange(len(x))
p1 = [i + width for i in p ]
plt.bar(p,y ,width ,color = "r" ,label = "Ranking")
plt.bar(p1,z ,width ,color = "y" ,label = "Ranking 1" ) # Align = center ,edge (x label)
plt.xticks(p + width/2,x ,rotation = 10 )
Horizontal Bar graph
plt.barh(p,y ,width ,color = "r" ,label = "Ranking")
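The double bar graph above as a complete sketch. The data values and the output file name are made up for illustration, and the Agg backend is an assumption added so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = ["C", "Python", "Java"]  # illustrative data
y = [3, 5, 4]
z = [2, 6, 1]

width = 0.4
p = np.arange(len(x))  # positions of the first set of bars
p1 = p + width         # second set shifted right by one bar width

plt.figure()
bars_y = plt.bar(p, y, width, color="r", label="Ranking")
bars_z = plt.bar(p1, z, width, color="y", label="Ranking 1")
plt.xticks(p + width / 2, x, rotation=10)  # tick centred between the two bars
plt.xlabel("Language")
plt.ylabel("Rank")
plt.legend()
plt.savefig("double_bar.png")  # hypothetical output file name
```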
2. Scatter Plot
Create a scatter plot
plt.scatter(day,no , color = c ,sizes = s ,alpha = 0.7 ,marker = "*" ,edgecolor = "g" ,linewidth = 2)
Color in Range(color bar)
colors = [10,25,57,22,68,12,70]
s = [100,200,130,214,461,230,150]
plt.scatter(day,no , c = colors,sizes = s ,cmap = "Accent" )
t = plt.colorbar()
t.set_label("Color Bar")
Multiple Scatter graph
plt.scatter(day,no , c = colors,sizes = s ,cmap = "Accent" )
plt.scatter(day,no2 ,color = "r",sizes = s )
3. Histogram Plot
plt.hist(no ,color = "b" ,edgecolor = "black" ,range = (0,100))
l = [10,20,30,40,50,60]
plt.hist(no ,color = "b" ,edgecolor = "black" ,bins = l)
plt.hist(no ,color = "b" ,edgecolor = "black" ,cumulative = -1 ,bottom = 10 )
plt.hist(no ,color = "b" ,edgecolor = "r" ,bottom = 10 ,align = "left" ,histtype = "step")
plt.hist(no ,color = "b" ,edgecolor = "r" ,bottom = 10 ,align = "left" ,orientation =
"horizontal",rwidth = 0.8)
plt.hist(no ,color = "b" ,edgecolor = "r" ,log = True)
plt.axvline(35 , color = "y" ,label = "Average")
4. Pie Chart
plt.pie(x ,labels = y ,colors = c, explode =ex ,autopct = "%0.1f%%" ,shadow = True ,radius = 1 ,
labeldistance = 1.2 ,startangle = 180 ,textprops = {"fontsize":12} ,counterclock = False ,
wedgeprops ={"linewidth":2 ,"edgecolor" : "m"}
)
5. Stem Plot
Create a Stem Plot
plt.stem(x,y ,linefmt = ":" ,markerfmt = "r+" ,bottom = 0 ,basefmt= "g",label= "python" )
Horizontal Stem Plot
plt.stem(x,y ,linefmt = ":" ,markerfmt = "r+" ,bottom = 0 ,basefmt= "g",label=
"python" ,orientation = "horizontal")
linefmt = ":" ,To make the dotted line
markerfmt = "r+" , To change the color and shape of points
bottom = 0 , To change the position of the baseline on the y-axis
basefmt= "g" , To change the bottom line color
orientation = "horizontal" , To change from vertical to horizontal
6. Box and Whisker Plot
Single box plot
plt.boxplot(x ,widths = 0.4 ,labels = ["Python"] ,showmeans = True ,sym = "g+" ,boxprops =
{"color" :"r"} ,capprops = dict(color = "g")
, whiskerprops = dict(color = "b") ,flierprops = dict(markeredgecolor = "black"))
Multiple Box Plot
y = [x,x1]
plt.boxplot(y ,widths = 0.4 ,labels = ["Python" ,"C"] ,showmeans = True ,sym = "g+" ,boxprops =
{"color" :"r"} ,capprops = dict(color = "g")
, whiskerprops = dict(color = "b") ,flierprops = dict(markeredgecolor = "black"))
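notch = (True or False) ,draws the box with a notch around the median
vert = (True or False) ,makes the graph vertical or horizontal
widths = (0 to 1) ,changes the width of the box
patch_artist = True ,to fill the inside of the box with color
showmeans = True ,will show the mean
whis = 3 ,sets the whisker length as a multiple of the interquartile range (default 1.5), so larger values include the outliers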
sym="g+" to change the color and shape of outlier
boxprops = {"color" :"r"} to change the outline color of box
capprops = dict(color = "g") to change the color the max and min line
whiskerprops = dict(color = "b") to change the color of line that join box to max and min
flierprops = dict(markeredgecolor = "Red") To change the color of outlier symbol
7. Stack and Area Plot
Single Stack Plot
plt.stackplot(x,Area1,colors = ["c"])
Multiple Stack Plot
plt.stackplot(x,Area1,Area2,Area3 ,labels = l ,colors = ["r","g","m"] ,baseline = "wiggle")
baseline = "zero" ,"sym" ,"wiggle" or "weighted_wiggle" ,changes the shape of the graph
8. Step plot
plt.step(x,y ,marker = "o" ,color = "r" ,ms = 10 ,mfc = "g" ,label = "python")
marker = "o" highlights the points of x and y
ms – marker size (0 to 100)
mfc – marker face color
9. Fill-between Plot
Fill the color under the graph
plt.plot(x,y ,color = "r")
plt.fill_between(x,y)
Fill the color in particular range
plt.plot(x,y)
plt.fill_between(x = [3,5],y1 = 4,y2 = 6 ,color = "yellow" )
Use Of where
plt.plot(x,y)
plt.fill_between(x,y,color = "r" ,where=(x>=2) & (x<=4) ,alpha = 0.5)
10. Subplot
Used for multiple graph plotting
plt.subplot(2,2,1)
plt.plot(x,y)
plt.subplot(2,2,2)
plt.pie([1] ,colors = ["r"])
11. Savefig
plt.savefig("R2" ,dpi = 2000 ,facecolor = "y" ,transparent = True ,bbox_inches = "tight")
saves the image as a PDF, JPG or PNG in a folder (the format follows the file extension, PNG by default)
transparent - makes the background of the graph transparent
facecolor - fills the color of the area outside the axes (figure background)
bbox_inches - "tight" reduces the empty space around the graph
12. Import Image
from PIL import Image
fname = r'C:\Users\ravib\Desktop\Passport.jpg'
# open the image using PIL
image = Image.open(fname)
plt.imshow(image)
plt.savefig("R" ,dpi = 2000)
13. Xticks , Yticks,xlim,ylim,axis
plt.xticks(x ,labels = ["C++","C","Python","Java","R"]) - sets the tick positions and their names on the X axis
plt.yticks(y) - sets the tick positions on the Y axis
plt.xlim(0,10) - gives the range of the X axis
plt.ylim(0,15) - gives the range of the Y axis
plt.axis([0,11,0,6]) - gives the range of both axes as [xmin, xmax, ymin, ymax]
14. Text and Annotate
o plt.text(2,4,"JAVA",fontsize = 15 ,style = "italic" ,bbox = {"facecolor":"r"}) - gives a text box
in a graph
o plt.annotate("python" ,xy = (3,2),xytext =(4,4) ,arrowprops = dict(facecolor="black",shrink=100)) - gives the text with an arrow pointing to xy
SAS
1. Import the datafile
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.IMPORT;
GETNAMES=YES;
RUN;
Proc Import DataFile= "C:\Users\ravib\Desktop\New folder\StudentsPerformance.csv"
Out = Student_Performance
DBMS = csv;
Run;
2. Finding Mean and Other stats of the data
Proc Means Data = Work.Order n mean mode median q1 q3 uclm lclm var std stderr ;
Var Cost_price;
Run;
3. Finding the Freq of any Data
Proc Freq Data = Work.Order;
Tables Segment * Country /Nocum;
Run;
4. Creating the Chart
Proc gchart Data = Work.Order;
Vbar3D Segment; /* vbar = vertical ,hbar = horizontal */
Run;
5. Printing the Data into Table
Proc Print Data = Work.Order;
Run;
6. Sort the data
Proc sort Data = Work.Order out = New;
By Cost_price ;
Run;
7. Creating the Table
Dm 'log; clear;';
Data HowtoCreate;
Input No Name $ Age;
Datalines;
1 Ravi 26
2 Nisha 28
3 Renu 32
;
Run;
Statistics
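Statistics is the science of collecting, organizing and analysing data.
Types -
1. Descriptive - collecting, summarizing and presenting the data
Understanding the main features of the data
2. Inferential - drawing conclusions about a population from a sample of the data
Population - the whole group being studied
Sample - the part of the population that is actually observed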