Introduction to Python

Unit 3 of the data science course introduces Python programming, highlighting its features, data types, and basic syntax. It covers user input, type casting, variables, and the use of NumPy for numerical computing, including array creation, indexing, slicing, and arithmetic operations. Additionally, it introduces Pandas for data manipulation and analysis, detailing DataFrame creation, row and column selection, filtering, and modifications.

Uploaded by

gsid4600

Unit 3 data science

#introduction to Python:

*Python is a popular programming language. It was created by Guido van Rossum at CWI (Centrum Wiskunde & Informatica) in the Netherlands and first released in 1991.

*Python is a general-purpose, high-level programming language.

*Python is dynamically typed.

#why Python:

1. simple & easy to learn

2. platform independent

3. free & open source

4. interpreted (source code is compiled to bytecode, which the interpreter then runs)

5. embeddable & extensible

6. portable & robust

7. rich library support

Use: web frameworks and applications, GUI-based desktop apps, machine learning, data science

"hello world" in Python:

print("hello world")

#input from user:

Ex 1:

name = input("Enter your name: ")
print("Hello, " + name)
print(type(name))

Output:

Enter your name: Aman
Hello, Aman
<class 'str'>

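Ex 2:

num = input("Enter a number: ")
print(num)
print(type(num))
num = num + 2   # fails: input() always returns a str, and str + int is not allowed
print(num)

Output:

Enter a number: 10
10
<class 'str'>
TypeError: can only concatenate str (not "int") to str

(Python 3.6 and earlier word this error as "must be str, not int".)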

#type casting:

Wrapping input() in int() converts the entered text to an integer, so the addition now works.

num = int(input("Enter a number: "))
print(num)
print(type(num))
num = num + 2
print(num)

Output:

Enter a number: 10
10
<class 'int'>
12
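If the user types something that is not a whole number, int() raises a ValueError. The following is a minimal sketch of guarding the conversion. The helper name to_int and its default parameter are assumptions made for illustration and do not come from the notes.

```python
def to_int(text, default=None):
    """Convert text to int; return default if text is not a valid integer.

    to_int is a hypothetical helper used only for this illustration.
    """
    try:
        return int(text)   # int() tolerates surrounding whitespace, e.g. " 10 "
    except ValueError:
        return default


print(to_int("10") + 2)    # 12
print(to_int("ten"))       # None
print(float("3.5"))        # 3.5  (str -> float)
print(str(12) + "2")       # 122  (int -> str, then string concatenation)
```

In an interactive script the same helper could wrap the call as to_int(input("Enter a number: ")).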

#variables in Python:

Variables are used to store and manage data.

A variable name must start with a letter or the underscore character; it cannot start with a number. A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _).

Variable names are case sensitive.

A variable name cannot be any of the Python keywords.

Ex: var1 = 5

Var_1 = 5

_var1 = 5

1var, var 1 -> invalid
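The naming rules can be checked programmatically. This is a small sketch using the built-in str.isidentifier() method and the standard keyword module. The function name is_valid_name is a hypothetical helper, not something defined in the notes. Note that Python also accepts non-ASCII letters in names, which the rules above do not mention.

```python
import keyword


def is_valid_name(name):
    """Return True if name can be used as a Python variable name.

    is_valid_name is a hypothetical helper for illustration.
    """
    # str.isidentifier() checks the character rules;
    # keyword.iskeyword() rejects reserved words such as "for" or "class".
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["var1", "Var_1", "_var1", "1var", "var 1", "for"]:
    print(candidate, "->", is_valid_name(candidate))
```

The first three candidates are reported as valid; "1var" starts with a digit, "var 1" contains a space, and "for" is a keyword.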

#data types in Python:

1. numeric types: int, float, complex

2. text type: str

3. boolean type: bool

4. sequence types: list, tuple, range

5. mapping type: dict

6. set type: set

7. binary types: bytes, bytearray

8. none type: NoneType
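The built-in type() function reports which of these types a value has. The sketch below uses one arbitrary sample value per type; the values themselves are illustrative choices.

```python
samples = [
    10, 3.14, 2 + 3j,          # numeric
    "hello",                   # text
    True,                      # boolean
    [1, 2], (1, 2), range(3),  # sequence
    {"a": 1},                  # mapping
    {1, 2},                    # set
    b"ab", bytearray(b"ab"),   # binary
    None,                      # none
]

for value in samples:
    print(repr(value), "->", type(value).__name__)

# Collected type names, in the same order as the samples
type_names = [type(value).__name__ for value in samples]
```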

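#mutable and immutable data types in Python:

1. Mutable data types

Lists, sets, dictionaries

Both read and write: the object can be changed in place after it is created.

2. Immutable data types

Numbers, strings, tuples

Read only: the object cannot be changed after it is created; any "change" produces a new object.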
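The difference shows up as soon as a value is modified. A short sketch with arbitrary sample values:

```python
numbers = [1, 2, 3]
alias = numbers          # both names refer to the same list object
numbers.append(4)        # in-place change, visible through alias too
print(alias)             # [1, 2, 3, 4]

point = (1, 2, 3)
try:
    point[0] = 99        # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)

text = "data"
upper = text.upper()     # returns a new string; text itself is unchanged
print(text, upper)       # data DATA
```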

#NumPy:

*NumPy stands for Numerical Python. It is a powerful Python library that is widely used for scientific computing, data analysis, and numerical computing tasks.

*Install it with: pip install numpy

*In NumPy, the array (ndarray) is the fundamental object.

*Arrays are used to store homogeneous data elements in a contiguous block of memory.
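"Homogeneous" means every element of an array shares one data type. The sketch below, assuming NumPy is installed, shows that mixing integers and floats makes NumPy store all elements as floats, and that a freshly created array occupies one contiguous block.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])       # int and float mixed -> all stored as float

print(ints.dtype.kind)              # 'i' (a signed integer type)
print(mixed.dtype)                  # float64
print(mixed)                        # [1.  2.5 3. ]
print(mixed.flags['C_CONTIGUOUS'])  # True: elements sit in one contiguous block
```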

#how to create a NumPy array:

Ex 1: from a list

import numpy as np
List1 = [10, 20, 30, 40, 50]
Array1 = np.array(List1)
print(Array1)
type(Array1)

Output:

[10 20 30 40 50]
numpy.ndarray

Ex 2: from a nested list (every inner list must have the same length)

import numpy as np
List1 = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
Array1 = np.array(List1)
print(Array1)

Output:

[[10 20 30]
 [40 50 60]
 [70 80 90]]

Ex 3: np.arange(start, stop); stop is excluded

import numpy as np
Array1 = np.arange(1, 8)
print(Array1)

Output:

[1 2 3 4 5 6 7]

Ex 4: np.arange() combined with reshape()

import numpy as np
Array1 = np.arange(11, 17).reshape((3, 2))
print(Array1)

Output:

[[11 12]
 [13 14]
 [15 16]]

Ex 5: array of zeros

import numpy as np
Array1 = np.zeros(4)
print(Array1)

Output:

[0. 0. 0. 0.]

Ex 6: array of ones

import numpy as np
Array1 = np.ones(4)
print(Array1)

Output:

[1. 1. 1. 1.]

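#attributes of a NumPy array:

1. ndim: number of dimensions (axes)

2. shape: tuple giving the length along each dimension

3. size: total number of elements

4. dtype: data type of the elements

5. itemsize: size of one element in bytes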

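Ex 1:

import numpy as np
List1 = [10, 20, 30, 40, 50]
Array1 = np.array(List1)
print(Array1.ndim)
print(Array1.shape)
print(Array1.size)
print(Array1.dtype)
print(Array1.itemsize)

Output:

1
(5,)
5
int32
4

(The default integer dtype depends on the platform and NumPy version; on many systems it is int64, with itemsize 8.)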

Ex 2:

import numpy as np
List1 = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
Array1 = np.array(List1)
print(Array1.ndim)
print(Array1.shape)
print(Array1.size)
print(Array1.dtype)
print(Array1.itemsize)

Output:

2
(3, 3)
9
int32
4

Ex 3:

import numpy as np
Array3 = np.array([[[1, 2, 3], [4, 5, 6]],
                   [[7, 8, 9], [10, 11, 12]]])
print(Array3)
print(Array3.ndim)
print(Array3.shape)
print(Array3.size)
print(Array3.dtype)
print(Array3.itemsize)

Output:

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
3
(2, 2, 3)
12
int32
4
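These attributes are related to one another: ndim is the length of shape, and size is the product of the entries of shape. The sketch below checks this on a 2x2x3 array. It also uses nbytes, a standard ndarray attribute that the notes do not list, and fixes the dtype to int32 (an assumption made here) so the byte counts are the same on every platform.

```python
import numpy as np

# dtype is fixed to int32 here so the byte counts do not depend on the platform
a = np.arange(1, 13, dtype=np.int32).reshape(2, 2, 3)

print(a.ndim == len(a.shape))            # True
print(a.size == 2 * 2 * 3)               # True: size is the product of shape
print(a.itemsize)                        # 4 bytes per int32 element
print(a.nbytes == a.size * a.itemsize)   # True: 12 * 4 = 48 bytes
```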

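#indexing in a NumPy array:

Ex 1: 1D array

import numpy as np
Array1 = np.array([10, 20, 30, 40, 50])
print(Array1[0])    # first element
print(Array1[-1])   # last element

Output:

10
50

Ex 2: 2D array, indexed as [row, column]

array1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(array1[1, 2])   # row 1, column 2
print(array1[0, :])   # all of row 0
print(array1[:, 1])   # all of column 1

Output:

60
[10 20 30]
[20 50 80]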

#slicing in a NumPy array:

Slicing is a way to extract a subset of data from a NumPy array. The general form is [start:stop:step]; the element at stop is excluded.

import numpy as np
Array1 = np.array([10, 20, 30, 40, 50, 60, 70])
print(Array1[1:3])
print(Array1[1:6:2])
print(Array1[-1:-3:-1])   # a negative step walks backwards
print(Array1[::2])
print(Array1[::-1])       # the whole array, reversed

Output:

[20 30]
[20 40 60]
[70 60]
[10 30 50 70]
[70 60 50 40 30 20 10]

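Slicing a 2D array: [row slice, column slice]

import numpy as np
Array1 = np.array([[15, 16, 17], [25, 26, 27], [35, 36, 37], [45, 46, 47]])
print(Array1[1, ])        # row 1
print(Array1[:, 1])       # column 1
print(Array1[1:3, 1:3])   # rows 1 and 2, columns 1 and 2
print(Array1[1:3, ])      # rows 1 and 2
print(Array1[:, 1:3])     # columns 1 and 2
print(Array1[1:3, 1])     # rows 1 and 2 of column 1
print(Array1[1:3, :1])    # rows 1 and 2, column 0 only
print(Array1[1:3, 1:])    # rows 1 and 2, columns 1 onward

Visualization (row index down the side, column index across the top):

     0   1   2
0   15  16  17
1   25  26  27
2   35  36  37
3   45  46  47

Output:

[25 26 27]
[16 26 36 46]
[[26 27]
 [36 37]]
[[25 26 27]
 [35 36 37]]
[[16 17]
 [26 27]
 [36 37]
 [46 47]]
[26 36]
[[25]
 [35]]
[[26 27]
 [36 37]]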
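One detail worth knowing when extracting a subset: a basic slice of a NumPy array is a view onto the same data, not an independent copy, so writing into the slice changes the original array. The sketch below illustrates this with arbitrary values and uses .copy() to get independent data.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

view = a[1:4]                 # basic slice: a view onto a's data
view[0] = 99                  # writes through to a
print(a)                      # [10 99 30 40 50]

independent = a[1:4].copy()   # explicit copy owns its own data
independent[0] = -1
print(a)                      # unchanged: [10 99 30 40 50]

print(np.shares_memory(a, view), np.shares_memory(a, independent))  # True False
```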

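#arithmetic operations:

The operators +, -, *, /, //, ** and % work element by element on arrays of the same shape; @ performs matrix multiplication.

import numpy as np
x = np.array([[1, 2],
              [3, 4]])
y = np.array([[11, 12],
              [13, 14]])
z = x - y     # element-wise subtraction
z = x % y     # remainder
z = x // y    # floor division: the quotient is rounded down to a whole number
z = x @ y     # matrix multiplication
print(x.transpose())
print(z)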
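The difference between the element-wise product (*) and the matrix product (@) is easy to miss. The sketch below reuses the x and y arrays from the example above and prints each result.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[11, 12], [13, 14]])

print(x * y)    # element-wise product: [[11 24] [39 56]]
print(x @ y)    # matrix product:       [[37 40] [85 92]]
print(x // y)   # all zeros, since every element of x is smaller than y's
print(x % y)    # equals x for the same reason
print(x.T)      # shorthand for x.transpose(): [[1 3] [2 4]]
```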

#sorting:

1. np.sort(): returns a sorted copy of an array.

2. np.argsort(): returns the indices that would sort an array.

3. ndarray.sort(): called on the array itself; sorts it in place.

import numpy as np
x = np.array([[12, 11, 15],
              [21, 25, 20],
              [18, 27, 16]])
y = np.sort(x, axis=0)      # sort down each column
y = np.argsort(x, axis=1)   # indices that would sort each row
print(y)

In place:

x.sort()    # sorts along the last axis (each row) by default
print(x)

#sort in 1D:

import numpy as np
x = np.array([7, 2, 3, 9, 6])

1. sorted copy:

y = np.sort(x)
print(y)

2. indices that would sort the array:

y = np.argsort(x)
print(y)

3. sort in place:

x.sort()
print(x)

Mean of the elements:

y = np.mean(x)
print(y)
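The indices returned by np.argsort() can be used to reorder the array itself. A short sketch on the same 1D values:

```python
import numpy as np

x = np.array([7, 2, 3, 9, 6])

order = np.argsort(x)    # positions of the elements in ascending order
print(order)             # [1 2 4 0 3]
print(x[order])          # [2 3 6 7 9], same as np.sort(x)
print(x[order][::-1])    # descending: [9 7 6 3 2]
print(x)                 # original is untouched: [7 2 3 9 6]
```

The same index array could also reorder a second array of equal length, which is the usual reason to prefer argsort() over sort().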

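#statistical operations:

1. max() 2. min() 3. sum() 4. mean() 5. median() 6. prod()

7. var() 8. std()

All of these are available as NumPy functions, for example np.max(x). All except median() are also array methods, for example x.max(); the median is only available as np.median(x).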
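A quick sketch of these operations on the 1D array used in the sorting example. By default var() and std() compute the population variance and standard deviation (dividing by n).

```python
import numpy as np

x = np.array([7, 2, 3, 9, 6])

print(x.max(), x.min(), x.sum())   # 9 2 27
print(x.mean())                    # 5.4
print(np.median(x))                # 6.0
print(x.prod())                    # 2268
print(x.var())                     # population variance, approximately 6.64
print(x.std())                     # square root of the variance, approximately 2.577
```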

#pandas:

pandas stores tabular data using a DataFrame.

A DataFrame is a two-dimensional labelled data structure, like a table in a database.

Every DataFrame contains rows and columns, and therefore has both a row index and a column index.

Each column can hold a different type of value.

Ex 1: from a list of tuples

import pandas as pd
std_data = [(1, "varun", 30, "male", "chandigarh"),
            (2, "ravi", 31, "male", "delhi"),
            (3, "Preeti", 29, "female", "Jaipur"),
            (4, "amrit", 32, "male", "Mumbai"),
            (5, "pinki", 28, "female", "banglore")]
df = pd.DataFrame(std_data, columns=['stu_id', 'name', 'age', 'gender', 'address'])
df

Ex 2: from a CSV file

import pandas as pd
df = pd.read_csv("student.csv")
df

df.head()     # first five rows
df.tail()     # last five rows
df.shape      # (number of rows, number of columns)
df.size       # total number of cells (rows x columns)
df.columns    # names of the columns
df.dtypes     # data type of each column
df.values     # the data as a NumPy array
df.index      # the row labels
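Since student.csv is not included with these notes, the sketch below builds the DataFrame in memory from the sample student records and then inspects it. It assumes pandas is installed; the records are the ones listed in the notes.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, "varun", 30, "male", "chandigarh"),
     (2, "ravi", 31, "male", "delhi"),
     (3, "Preeti", 29, "female", "Jaipur"),
     (4, "amrit", 32, "male", "Mumbai"),
     (5, "pinki", 28, "female", "banglore")],
    columns=["stu_id", "name", "age", "gender", "address"],
)

print(df.head(2))         # head() and tail() accept the number of rows to show
print(df.tail(2))
print(df.shape)           # (5, 5)
print(df.size)            # 25
print(list(df.columns))   # ['stu_id', 'name', 'age', 'gender', 'address']
print(df.dtypes)          # one dtype per column
print(list(df.index))     # [0, 1, 2, 3, 4]
```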

#selecting rows and columns:

# selecting a single column
df['age']

# selecting multiple columns
df[['name', 'address']]

# selecting a single row by index label
df.loc[0]

# selecting multiple rows by index label
df.loc[[0, 2, 4]]

# selecting a single row by integer position
df.iloc[0]

# selecting multiple rows by integer position
df.iloc[[0, 2]]
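With the default index 0, 1, 2, ... the results of loc and iloc look identical, which hides the difference between them. The sketch below uses made-up row labels (101 to 103, an assumption for illustration) so that labels and positions differ.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["varun", "ravi", "Preeti"], "age": [30, 31, 29]},
    index=[101, 102, 103],   # hypothetical row labels, chosen to differ from positions
)

print(df.loc[101, "name"])                  # by label    -> varun
print(df.iloc[0]["name"])                   # by position -> varun
print(df.iloc[-1]["name"])                  # last row by position -> Preeti
print(df.loc[[101, 103], "age"].tolist())   # [30, 29]

try:
    df.loc[0]                               # no row is labelled 0
except KeyError:
    print("KeyError: label 0 does not exist")
```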

#filtering rows:

df[df['age'] > 29]   # keeps only the rows where age is greater than 29
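The expression inside the brackets is a boolean Series with one True/False per row, and pandas keeps the rows marked True. The sketch below shows the mask on its own and then combines two conditions; it uses three columns of the sample student data.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["varun", "ravi", "Preeti", "amrit", "pinki"],
    "age": [30, 31, 29, 32, 28],
    "gender": ["male", "male", "female", "male", "female"],
})

mask = df["age"] > 29            # boolean Series, one True/False per row
print(mask.tolist())             # [True, True, False, True, False]

older = df[mask]
print(older["name"].tolist())    # ['varun', 'ravi', 'amrit']

# Combine conditions with & (and) or | (or); each condition needs parentheses
young_women = df[(df["age"] < 30) & (df["gender"] == "female")]
print(young_women["name"].tolist())   # ['Preeti', 'pinki']
```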

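#adding a new column to a DataFrame:

df['phone_no'] = [10, 20, 30, 40, 50]   # appends the column at the end

Or, to place the column at a given position (use one of the two; insert() raises a ValueError if the column already exists):

df.insert(3, 'phone_no', [10, 20, 30, 40, 50])

print(df)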

#deleting a column from a DataFrame:

df = df.drop(columns=['phone_no'])
df

#renaming a column, general form:

# df = df.rename(columns={'old_name': 'new_name'})

df = df.rename(columns={'age': 'student_age'})
df

#deleting a column with del (an alternative to drop(); the column must still exist):

del df['phone_no']

#deleting a row from a DataFrame:

df = df.drop(4)   # drops the row whose index label is 4
df

#adding a new row to an existing DataFrame:

df.loc[4] = [5, 'pinki', 28, 'female', 'banglore']

#updating a value:

df.loc[2, 'student_age'] = 71
df

#updating multiple values:

df.loc[[0, 2], 'address'] = ['andaman', 'nicobar']
