457 Labs

Uploaded by

fatialqaffas31
print ("Hello Qm457 Students welcome to the first lesson in python")

Hello Qm457 Students welcome to the first lesson in python

Integers
# Integers are numbers without decimals
type(1)
print(1)

floats
# Floats are numbers with decimals
type(4.5)
print(4.5)

4.5

Variable Definitions
age= 19
print (age)

19

string
# Strings contain a sequence of characters
color = "blue"
print (color)
type(color)

age= 25
print ("my age is: ", age)
print("A students")

blue
my age is: 25
A students

## Quotes within strings


"I'm 19 years old"
print ("I'm 19 years old")
'my favorite course is "Qm457"'

I'm 19 years old

'my favorite course is "Qm457"'

String indexing
# This is string
# String M o h d
# Index 0 1 2 3
"Mohd"

'Mohd'

my_string="Mohd"
my_string[0]

'M'

my_string[1]

'o'

my_string[2]

'h'

my_string[3]

'd'

my_string[4]

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[17], line 1
----> 1 my_string[4]

IndexError: string index out of range

my_string="Mohd"
my_string[-1]

my_string[-2]

my_string[-3]
my_string[-4]
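
Negative indices count from the end of the string, so -1 is the last character and -4 is the first character of "Mohd". The short sketch below illustrates how the two index styles relate; the names `n`, `last_char`, and `first_char` are introduced here only for illustration.

```python
my_string = "Mohd"
n = len(my_string)  # 4 characters

# For every valid negative index i, my_string[i] equals my_string[n + i].
for i in range(-1, -n - 1, -1):
    print(i, my_string[i], my_string[n + i])

last_char = my_string[-1]   # same as my_string[3]
first_char = my_string[-4]  # same as my_string[0]
```

Going one step further, to my_string[-5], raises the same IndexError as my_string[4].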

String Slicing
# string_variable[start:stop:step]
# here we use only start and stop: string_variable[start:stop]
Mohadhasanali="Mohadhasanali"

Mohadhasanali[0:5]

Mohadhasanali[5:10]

Mohadhasanali[10:13]

# Default start and step


Mohadhasanali[:10]

# Default end and step


Mohadhasanali[10:]
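
The slices above use only start and stop. As a small illustrative sketch, the third slot (step) can be used too; the variable names here are assumptions made for the example, not part of the lab.

```python
name = "Mohadhasanali"

first_part = name[0:5]      # characters at indices 0 to 4 (stop is excluded)
every_second = name[::2]    # default start and stop, step of 2
reversed_name = name[::-1]  # a step of -1 walks the string backwards

print(first_part)
print(every_second)
print(reversed_name)
```

Because the stop index is excluded, name[0:5] + name[5:] always rebuilds the full string.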

f-strings
# To define an f-string, add an f before the opening single or double quote.
# Inside the string, surround variables or expressions with curly braces {}.
# Their values are substituted into the string when the program runs.
first_name="Mohamed"
favorite_language="python"
print(f"Hi, I'm {first_name}. I'm learning {favorite_language}.")

value = 20
print (f"{value} multiplied by 3 is:{value * 3}")
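
Curly braces in an f-string can also carry a format specification after a colon, which helps when printing numbers such as incomes or shares. This is a minimal sketch; `share` is a made-up value used only to show percentage formatting.

```python
income = 52237.97544642857
share = 0.2067  # hypothetical fraction, for illustration only

rounded = f"{income:.2f}"       # two decimal places
with_commas = f"{income:,.0f}"  # thousands separator, no decimals
as_percent = f"{share:.1%}"     # fraction shown as a percentage

print(rounded, with_commas, as_percent)
```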

Booleans
# True and False must be capitalized
# type(true) and type(false) raise NameError, because lowercase true and false are not defined

type(True)
type(False)
lists
#list of numbers
[1, 2, 3, 4, 5]

#float
[3.4, 2.4, 2.6, 3.5]

#letters
["a", "b", "c", "d"]

# mixed types: integers, strings, and floats


[1, "Mohamed", 3.4]

letters=["mohd", "hasan", "ali"]


print(letters)

letters[0]="mohd"
letters[1]="hasan"
letters[2]="ali"
letters

my_list=['mohd', 'hasan', 'ali']


my_list.append("thabet")
my_list

my_list=['mohd', 'hasan', 'ali']


my_list.remove("hasan")
my_list
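
Lists are mutable, so besides append and remove they can be changed in place in other ways. The sketch below shows three common operations on the same kind of list; the replacement value "hassan" and the name `last` are illustrative assumptions.

```python
my_list = ["mohd", "hasan", "ali"]

my_list[1] = "hassan"        # replace the element at index 1
my_list.insert(0, "thabet")  # insert a new element at position 0
last = my_list.pop()         # remove and return the last element

print(my_list)
print(last)
print(len(my_list))
```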

Tuples
print (1, 2, 3, 4, 5)
print ("a", "b", "c", "d")
print (3.4, 2.4, 2.6, 3.5)

my_tuple=(1, 2, 3, 4, 5)

my_tuple[0]

my_tuple[1]

my_tuple[2]

my_tuple[3]

my_tuple[4]

my_tuple[6]  # IndexError: valid indices for this tuple are 0 to 4

my_tuple[-1]
my_tuple[-2]

my_tuple[-3]

my_tuple[-4]

my_tuple[-5]

my_tuple[-6]  # IndexError: valid negative indices are -1 to -5
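
Tuples are indexed like lists, but unlike lists they cannot be changed after creation. A minimal sketch of what happens on an attempted assignment, and of building a new tuple instead (the names `error_message` and `new_tuple` are introduced for the example):

```python
my_tuple = (1, 2, 3, 4, 5)

try:
    my_tuple[0] = 100  # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Build a new tuple rather than modifying the existing one
new_tuple = (100,) + my_tuple[1:]
print(new_tuple)
```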

Tuple length
my_tuple=(1, 2, 3, 4, 100, 50, 60)
len(my_tuple)

Nested Tuples
my_tuple=([1, 2, 3], (4, 5, 6))

my_tuple[2]  # IndexError: this tuple has only two elements (indices 0 and 1)

my_tuple[1]

my_tuple[0]

Tuple assignment
a, b= 1,2
a

a=1
b=2
a, b=b, a
a

Dictionaries
{"a":1, "b":2, "c":3}
my_dict={"a":1, "b":2, "c":3}
print(my_dict)
{'a': 1, 'b': 2, 'c': 3}
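
A dictionary maps keys to values, so elements are looked up by key rather than by position. The following sketch shows a lookup, adding a pair, a safe lookup with a default, and iteration; the key "z" and the variable names are assumptions for illustration.

```python
my_dict = {"a": 1, "b": 2, "c": 3}

value_b = my_dict["b"]         # look up a value by its key
my_dict["d"] = 4               # add a new key-value pair
missing = my_dict.get("z", 0)  # .get returns the default instead of raising KeyError

for key, value in my_dict.items():
    print(key, value)
```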

#data structures and data frames


import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

mc = pd.read_excel("mc.xlsx")

print("shape of dataframe:", mc.shape)

shape of dataframe: (2240, 29)

print("\nFirst 5 rows of the dataframe")

First 5 rows of the dataframe

mc.head()

     ID  Year_Birth   Education  Marital_Status   Income  Kidhome  Teenhome  \
0  5524        1957  Graduation          Single  58138.0        0         0
1  2174        1954  Graduation          Single  46344.0        1         1
2  4141        1965  Graduation        Together  71613.0        0         0
3  6182        1984  Graduation        Together  26646.0        1         0
4  5324        1981         PhD         Married  58293.0        1         0

   Dt_Customer  Recency  MntWines  ...  NumWebVisitsMonth  AcceptedCmp3  \
0   2012-09-04       58       635  ...                  7             0
1   2014-03-08       38        11  ...                  5             0
2   2013-08-21       26       426  ...                  4             0
3   2014-02-10       26        11  ...                  6             0
4   2014-01-19       94       173  ...                  5             0

   AcceptedCmp4  AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  Complain  \
0             0             0             0             0         0
1             0             0             0             0         0
2             0             0             0             0         0
3             0             0             0             0         0
4             0             0             0             0         0

   Z_CostContact  Z_Revenue  Response
0              3         11         1
1              3         11         0
2              3         11         0
3              3         11         0
4              3         11         0

[5 rows x 29 columns]

print("\nLast 5 rows of the dataframe")

Last 5 rows of the dataframe

mc.tail()

         ID  Year_Birth   Education  Marital_Status   Income  Kidhome  \
2235  10870        1967  Graduation         Married  61223.0        0
2236   4001        1946         PhD        Together  64014.0        2
2237   7270        1981  Graduation        Divorced  56981.0        0
2238   8235        1956      Master        Together  69245.0        0
2239   9405        1954         PhD         Married  52869.0        1

      Teenhome  Dt_Customer  Recency  MntWines  ...  NumWebVisitsMonth  \
2235         1   2013-06-13       46       709  ...                  5
2236         1   2014-06-10       56       406  ...                  7
2237         0   2014-01-25       91       908  ...                  6
2238         1   2014-01-24        8       428  ...                  3
2239         1   2012-10-15       40        84  ...                  7

      AcceptedCmp3  AcceptedCmp4  AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  \
2235             0             0             0             0             0
2236             0             0             0             1             0
2237             0             1             0             0             0
2238             0             0             0             0             0
2239             0             0             0             0             0

      Complain  Z_CostContact  Z_Revenue  Response
2235         0              3         11         0
2236         0              3         11         0
2237         0              3         11         0
2238         0              3         11         0
2239         0              3         11         1

[5 rows x 29 columns]

mc.head(10)

     ID  Year_Birth   Education  Marital_Status   Income  Kidhome  Teenhome  \
0  5524        1957  Graduation          Single  58138.0        0         0
1  2174        1954  Graduation          Single  46344.0        1         1
2  4141        1965  Graduation        Together  71613.0        0         0
3  6182        1984  Graduation        Together  26646.0        1         0
4  5324        1981         PhD         Married  58293.0        1         0
5  7446        1967      Master        Together  62513.0        0         1
6   965        1971  Graduation        Divorced  55635.0        0         1
7  6177        1985         PhD         Married  33454.0        1         0
8  4855        1974         PhD        Together  30351.0        1         0
9  5899        1950         PhD        Together   5648.0        1         1

   Dt_Customer  Recency  MntWines  ...  NumWebVisitsMonth  AcceptedCmp3  \
0   2012-09-04       58       635  ...                  7             0
1   2014-03-08       38        11  ...                  5             0
2   2013-08-21       26       426  ...                  4             0
3   2014-02-10       26        11  ...                  6             0
4   2014-01-19       94       173  ...                  5             0
5   2013-09-09       16       520  ...                  6             0
6   2012-11-13       34       235  ...                  6             0
7   2013-05-08       32        76  ...                  8             0
8   2013-06-06       19        14  ...                  9             0
9   2014-03-13       68        28  ...                 20             1

   AcceptedCmp4  AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  Complain  \
0             0             0             0             0         0
1             0             0             0             0         0
2             0             0             0             0         0
3             0             0             0             0         0
4             0             0             0             0         0
5             0             0             0             0         0
6             0             0             0             0         0
7             0             0             0             0         0
8             0             0             0             0         0
9             0             0             0             0         0

   Z_CostContact  Z_Revenue  Response
0              3         11         1
1              3         11         0
2              3         11         0
3              3         11         0
4              3         11         0
5              3         11         0
6              3         11         0
7              3         11         0
8              3         11         1
9              3         11         0

[10 rows x 29 columns]

mc.columns

Index(['ID', 'Year_Birth', 'Education', 'Marital_Status', 'Income', 'Kidhome',
       'Teenhome', 'Dt_Customer', 'Recency', 'MntWines', 'MntFruits',
       'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts',
       'MntGoldProds', 'NumDealsPurchases', 'NumWebPurchases',
       'NumCatalogPurchases', 'NumStorePurchases', 'NumWebVisitsMonth',
       'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp1',
       'AcceptedCmp2', 'Complain', 'Z_CostContact', 'Z_Revenue', 'Response'],
      dtype='object')

mc.size

64960

mc.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2240 entries, 0 to 2239
Data columns (total 29 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ID 2240 non-null int64
1 Year_Birth 2240 non-null int64
2 Education 2240 non-null object
3 Marital_Status 2240 non-null object
4 Income 2216 non-null float64
5 Kidhome 2240 non-null int64
6 Teenhome 2240 non-null int64
7 Dt_Customer 2240 non-null datetime64[ns]
8 Recency 2240 non-null int64
9 MntWines 2240 non-null int64
10 MntFruits 2240 non-null int64
11 MntMeatProducts 2240 non-null int64
12 MntFishProducts 2240 non-null int64
13 MntSweetProducts 2240 non-null int64
14 MntGoldProds 2240 non-null int64
15 NumDealsPurchases 2240 non-null int64
16 NumWebPurchases 2240 non-null int64
17 NumCatalogPurchases 2240 non-null int64
18 NumStorePurchases 2240 non-null int64
19 NumWebVisitsMonth 2240 non-null int64
20 AcceptedCmp3 2240 non-null int64
21 AcceptedCmp4 2240 non-null int64
22 AcceptedCmp5 2240 non-null int64
23 AcceptedCmp1 2240 non-null int64
24 AcceptedCmp2 2240 non-null int64
25 Complain 2240 non-null int64
26 Z_CostContact 2240 non-null int64
27 Z_Revenue 2240 non-null int64
28 Response 2240 non-null int64
dtypes: datetime64[ns](1), float64(1), int64(25), object(2)
memory usage: 507.6+ KB

mc.Marital_Status.value_counts()

Marital_Status
Married 864
Together 580
Single 480
Divorced 232
Widow 77
Alone 3
Absurd 2
YOLO 2
Name: count, dtype: int64
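
value_counts() returns absolute frequencies sorted from most to least common. Passing normalize=True gives shares instead, which can be easier to compare across categories. This is a sketch on a made-up six-row Series, not on the real mc.xlsx data.

```python
import pandas as pd

# Hypothetical mini-sample, not taken from mc.xlsx
status = pd.Series(["Married", "Single", "Married", "Together", "Married", "Single"])

counts = status.value_counts()                # absolute frequencies, sorted descending
shares = status.value_counts(normalize=True)  # relative frequencies that sum to 1

print(counts)
print(shares.round(2))
```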

plt.figure(figsize = (15,10))
plt.hist(mc.Marital_Status)

(array([480., 580., 864., 0., 232., 77., 0., 3., 2., 2.]),
array([0. , 0.7, 1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.3, 7. ]),
<BarContainer object of 10 artists>)

mc.Education.value_counts()

Education
Graduation 1127
PhD 486
Master 370
2n Cycle 203
Basic 54
Name: count, dtype: int64
plt.figure(figsize=(15,10))
plt.hist(mc.Education)

(array([1127.,    0.,  486.,    0.,    0.,  370.,    0.,   54.,    0.,  203.]),
 array([0. , 0.4, 0.8, 1.2, 1.6, 2. , 2.4, 2.8, 3.2, 3.6, 4. ]),
 <BarContainer object of 10 artists>)

mc.isnull().sum()

ID 0
Year_Birth 0
Education 0
Marital_Status 0
Income 24
Kidhome 0
Teenhome 0
Dt_Customer 0
Recency 0
MntWines 0
MntFruits 0
MntMeatProducts 0
MntFishProducts 0
MntSweetProducts 0
MntGoldProds 0
NumDealsPurchases 0
NumWebPurchases 0
NumCatalogPurchases 0
NumStorePurchases 0
NumWebVisitsMonth 0
AcceptedCmp3 0
AcceptedCmp4 0
AcceptedCmp5 0
AcceptedCmp1 0
AcceptedCmp2 0
Complain 0
Z_CostContact 0
Z_Revenue 0
Response 0
dtype: int64

mc["Income"].median()

51381.5

m=mc["Income"].median()
m

51381.5

# Assign the result back to the column. Calling fillna(..., inplace=True) on a
# column selection raises a chained-assignment FutureWarning and will stop
# working in pandas 3.0.
mc["Income"] = mc["Income"].fillna(m)

mc.isnull().sum()

ID 0
Year_Birth 0
Education 0
Marital_Status 0
Income 0
Kidhome 0
Teenhome 0
Dt_Customer 0
Recency 0
MntWines 0
MntFruits 0
MntMeatProducts 0
MntFishProducts 0
MntSweetProducts 0
MntGoldProds 0
NumDealsPurchases 0
NumWebPurchases 0
NumCatalogPurchases 0
NumStorePurchases 0
NumWebVisitsMonth 0
AcceptedCmp3 0
AcceptedCmp4 0
AcceptedCmp5 0
AcceptedCmp1 0
AcceptedCmp2 0
Complain 0
Z_CostContact 0
Z_Revenue 0
Response 0
dtype: int64
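
The median imputation step can be checked on a tiny example where the result is easy to verify by hand. The sketch below uses a made-up five-row DataFrame (not mc.xlsx) and assigns the filled column back to the DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing incomes
df = pd.DataFrame({"Income": [30000.0, np.nan, 50000.0, 70000.0, np.nan]})

median_income = df["Income"].median()  # missing values are skipped by default
df["Income"] = df["Income"].fillna(median_income)
n_missing = int(df["Income"].isnull().sum())

print(median_income)
print(df["Income"].tolist())
print(n_missing)
```

The median is often preferred over the mean for income because it is less affected by a few very large values.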

# Count the number of unique values in Income


unique_income = mc["Income"].nunique()
print(unique_income)

1975

# What are the max, min, and mean of Income?


print(mc["Income"].max(axis=0))
print(mc["Income"].min(axis=0))
print(mc["Income"].mean(axis=0))

666666.0
1730.0
52237.97544642857

#BOX PLOT
plt.figure(figsize=(10, 6))
sns.boxplot(x="Income",data=mc, color='lightblue')

<Axes: xlabel='Income'>
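
The points a box plot draws beyond its whiskers are, with seaborn's default setting, values more than 1.5 times the interquartile range (IQR) away from the quartiles. As a rough sketch, the same rule can be applied directly to list the outliers. The seven incomes below are made up for illustration, apart from reusing the minimum and maximum printed above.

```python
import pandas as pd

# Hypothetical incomes; only 1730.0 and 666666.0 come from the output above
incomes = pd.Series([1730.0, 30000.0, 45000.0, 52000.0, 60000.0, 75000.0, 666666.0])

q1 = incomes.quantile(0.25)
q3 = incomes.quantile(0.75)
iqr = q3 - q1

lower = q1 - 1.5 * iqr  # values below this are flagged
upper = q3 + 1.5 * iqr  # values above this are flagged

outliers = incomes[(incomes < lower) | (incomes > upper)]
print(lower, upper)
print(outliers.tolist())
```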
# Distribution of Income (histogram with a density curve)
# sns.distplot is deprecated and will be removed in seaborn v0.14.0;
# histplot is the axes-level replacement for histograms
sns.histplot(mc["Income"], kde=True, stat="density")

<Axes: xlabel='Income', ylabel='Density'>


mc["Income"].mean()

52237.97544642857

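# Total number of campaigns each customer accepted (parentheses let the sum span two lines)
AcceptCmp = (mc["AcceptedCmp1"] + mc["AcceptedCmp2"] + mc["AcceptedCmp3"]
             + mc["AcceptedCmp4"] + mc["AcceptedCmp5"])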
mc["AcceptCmp"]=AcceptCmp
mc.head(5)

     ID  Year_Birth   Education  Marital_Status   Income  Kidhome  Teenhome  \
0  5524        1957  Graduation          Single  58138.0        0         0
1  2174        1954  Graduation          Single  46344.0        1         1
2  4141        1965  Graduation        Together  71613.0        0         0
3  6182        1984  Graduation        Together  26646.0        1         0
4  5324        1981         PhD         Married  58293.0        1         0

   Dt_Customer  Recency  MntWines  ...  AcceptedCmp3  AcceptedCmp4  \
0   2012-09-04       58       635  ...             0             0
1   2014-03-08       38        11  ...             0             0
2   2013-08-21       26       426  ...             0             0
3   2014-02-10       26        11  ...             0             0
4   2014-01-19       94       173  ...             0             0

   AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  Complain  Z_CostContact  \
0             0             0             0         0              3
1             0             0             0         0              3
2             0             0             0         0              3
3             0             0             0         0              3
4             0             0             0         0              3

   Z_Revenue  Response  AcceptCmp
0         11         1          0
1         11         0          0
2         11         0          0
3         11         0          0
4         11         0          0

[5 rows x 30 columns]

plt.hist(mc.AcceptCmp)

(array([1777.,    0.,  325.,    0.,    0.,   83.,    0.,   44.,    0.,   11.]),
 array([0. , 0.4, 0.8, 1.2, 1.6, 2. , 2.4, 2.8, 3.2, 3.6, 4. ]),
 <BarContainer object of 10 artists>)

variable=sns.countplot(x="AcceptCmp", data=mc)
value_counts = mc["AcceptCmp"].value_counts()
print(value_counts)

AcceptCmp
0 1777
1 325
2 83
3 44
4 11
Name: count, dtype: int64
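
Adding the five campaign columns one by one works, but a row-wise sum over a list of column names is a more compact alternative. The sketch below uses a made-up three-customer DataFrame with the same column names; `cmp_cols` is a helper name introduced for this example.

```python
import pandas as pd

# Hypothetical three-customer sample with the same column names as mc
df = pd.DataFrame({
    "AcceptedCmp1": [0, 1, 0],
    "AcceptedCmp2": [0, 0, 0],
    "AcceptedCmp3": [0, 1, 1],
    "AcceptedCmp4": [0, 1, 0],
    "AcceptedCmp5": [0, 0, 0],
})

cmp_cols = [f"AcceptedCmp{i}" for i in range(1, 6)]  # AcceptedCmp1 ... AcceptedCmp5
df["AcceptCmp"] = df[cmp_cols].sum(axis=1)           # axis=1 sums across columns, per row

print(df["AcceptCmp"].tolist())
```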
