
Python Summary

Uploaded by

Thông Nguyễn Minh


1   "String".upper()
2   Series.index
3   Series +, -, *, / number
4   Series.size
5   Series.is_unique
6   Series.values
7   Series1 + Series2
      NaN where there is no matching index between the two Series.
8   sales_h1 = sales_q1.add(sales_q2, fill_value=0)
      Series.add(other, level=None, fill_value=None, axis=0)
9   Series.value_counts()
      Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
      dropna=False: also show NaN.
      normalize=True: the object returned will contain the relative frequencies of the unique values.
10  dict(series)
11  sorted(series)
12  Series.squeeze(axis=None)
      Series or DataFrames with a single element are squeezed to a scalar.
      DataFrames with a single column or a single row are squeezed to a Series.
      Otherwise the object is unchanged.
13  Series.sort_values()
      Series.sort_values(*, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)
14  Series.sort_index()
15  value in []  or  "value" in series.values
16  Series.get(key, default=None)
      Returns the default value if the key is not found. Can be used for a DataFrame as well.
17  pokemon[[1, 2, 4]] = ["Firemon", "Flamemon", "Blazemon"]
      Overwrites the values at those positions.
18  pokemon_df = pd.read_csv("pokemon.csv", usecols = ["Pokemon"])
    pokemon_series = pokemon_df.squeeze("columns").copy()
19  google = google.sort_values()
      Reassigning the result recreates google.sort_values(inplace = True).
20  google.describe()
21  Series.apply()
22  Series.map(arg, na_action=None)
      arg: mapping correspondence
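Entries 7, 8, 9 and 16 are easier to follow with data. The following is a minimal sketch, assuming pandas is installed; the fruit labels and numbers are hypothetical and only illustrate index alignment, `fill_value`, `normalize` and `get`.

```python
import pandas as pd

# Hypothetical quarterly sales; "pears" and "kiwis" each appear in one quarter only.
sales_q1 = pd.Series({"apples": 10, "pears": 5})
sales_q2 = pd.Series({"apples": 7, "kiwis": 3})

# Plain addition aligns on the index; a label missing on either side gives NaN.
plain_sum = sales_q1 + sales_q2

# add() with fill_value=0 treats a missing label as 0 instead.
sales_h1 = sales_q1.add(sales_q2, fill_value=0)

# value_counts(normalize=True) returns relative frequencies instead of counts.
colors = pd.Series(["red", "blue", "red", "red"])
shares = colors.value_counts(normalize=True)

# get() returns the default instead of raising KeyError for a missing label.
missing = sales_q1.get("plums", default=0)
```

With `fill_value=0`, a label that exists in only one Series keeps its own value rather than becoming NaN.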
Time method
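list vs. tuple

Element order
    list:   yes (sequence)
    tuple:  yes (sequence)

Mutability
    list:   mutable
    tuple:  immutable

Duplicates
    list:   duplicate elements allowed
    tuple:  duplicate elements allowed

Notes
    list:   cannot be used as a dictionary key
            cannot be an element of a set
    tuple:  uses less memory than a list
            can be used as a dictionary key (if every element is hashable)
            can be an element of a set (if every element is hashable)
            named tuples can stand in for a simple class

Create empty
    list:   l1 = list()
            l1 = []
    tuple:  t1 = tuple()
            t1 = ()

Initialize
    list:   l1 = ['a', 'b', 'c']
    tuple:  t1 = ('a', 'b', 'c')
            t1 = 'a', 'b', 'c'
            # do not forget the comma for a single element
            t1 = ('a',)
            t1 = 'a',

Initialize (via the class)
    list:   l1 = list(['a', 'b', 'c'])
            l1 = list(('a', 'b', 'c'))
    tuple:  t1 = tuple(('a', 'b', 'c'))
            t1 = tuple(['a', 'b', 'c'])

Number of elements
    list:   len(l1)
    tuple:  len(t1)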
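Three of the tuple notes above (the single-element comma, use as a dictionary key, and named tuples) can be checked directly. This is a small sketch; the city names, the distance value and the `Point` type are hypothetical and used for illustration only.

```python
from collections import namedtuple

# A single-element tuple needs the trailing comma; parentheses alone do nothing.
not_a_tuple = ('a')
one_tuple = ('a',)

# A tuple of hashable items can be a dict key; a list cannot.
distances = {('Tokyo', 'Osaka'): 400}   # hypothetical value, for illustration
try:
    bad = {['Tokyo', 'Osaka']: 400}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# namedtuple gives field names without writing a full class.
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
```

Here `not_a_tuple` is a plain string, while `p` can be read both by field name (`p.x`) and by position (`p[1]`).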

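Add
    list:   # at the end
            l1.append('d')
            l1 += ['d']
            # at a specific position
            l1.insert(1, 'e')
            l1[1:1] = ['e']
    tuple:  -

Replace
    list:   l1 = ['a', 'b', 'c']
            l1[2] = 'x'
            # IndexError if the index does not exist
            l1[9] = 'x'
            # this is OK (appends at the end)
            l1[9:] = ['x']
    tuple:  -

Delete (by position)
    list:   del l1[2]
    tuple:  -

Delete (by value)
    list:   l1.remove('a')
    tuple:  -

Delete (by key)
    list:   -
    tuple:  -

Delete (clear all)
    list:   l1.clear()
            del l1[:]
    tuple:  # a workaround: rebind the name to a new empty tuple
            t1 = tuple()

Element access (slicing)
    list:   # first element
            l1[0]
            # start:end
            l1[0:2]
            # last element
            l1[-1]
            # every second element
            l1[::2]
    tuple:  same as list

Get and remove
    list:   # default is the last element (-1)
            # IndexError if the list is empty
            l1.pop()
            # at a given position
            l1.pop(2)
    tuple:  -

LIFO (stack)
    list:   # add
            append()
            # take out (same as pop(-1))
            pop()
    tuple:  -

FIFO (queue)
    list:   # add
            append()
            # take out
            pop(0)
    tuple:  -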
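The slice-assignment and stack/queue rows can be sketched as follows. The `deque` at the end is an addition that does not appear in the table; it is shown only as a common alternative for queues, since `pop(0)` on a list has to shift every remaining element.

```python
from collections import deque

l1 = ['a', 'b', 'c']
l1[1:1] = ['e']        # insert at position 1
l1[9:] = ['x']         # a slice past the end appends

try:
    l1[9] = 'y'        # plain index assignment past the end fails
    index_assign_ok = True
except IndexError:
    index_assign_ok = False

# Stack (LIFO): append() and pop() both work at the end of the list.
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()

# Queue (FIFO): pop(0) takes the oldest element from the front.
queue = [1, 2, 3]
first = queue.pop(0)

# deque (not from the table) removes from the left without shifting.
dq = deque([1, 2, 3])
first_dq = dq.popleft()
```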
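Position of an element
    list:   l1.index('b')
    tuple:  same as list

Membership check
    list:   # True/False
            'a' in l1
    tuple:  same as list

Two-dimensional
    list:   l1 = [[1, 2], [3, 4], [5, 6]]
            # element access
            l1[1][1]
    tuple:  t1 = ((1, 2), (3, 4), (5, 6))
            # element access
            t1[1][1]

Merge
    list:   l1 = [1, 2, 3]
            l2 = [4, 5, 6]
            l1.extend(l2)
    tuple:  -

Merge (with + or +=)
    list:   l1 = ['a', 'b', 'c']
            l2 = ['d', 'e', 'f']
            l1 += l2
    tuple:  t1 = (1, 2, 3)
            t2 = (4, 5, 6)
            t3 = t1 + t2

Merge (2)
    list:   # this gives a different result
            l1 = ['a', 'b']
            l2 = ['c', 'd']
            l1.append(l2)
            --> ['a', 'b', ['c', 'd']]
    tuple:  -

Count of elements with a given value
    list:   l1.count('a')
    tuple:  t1.count('a')

Sort (in place)
    list:   l1.sort()
            l1.sort(reverse=True)
    tuple:  -

Sort (not in place)
    list:   # sorted() returns a list
            l2 = sorted(l1)
    tuple:  t2 = tuple(sorted(t1))

Reverse the order (in place)
    list:   l1.reverse()
    tuple:  -

Reverse the order (not in place)
    list:   l2 = list(reversed(l1))   # reversed() alone returns an iterator
            l2 = l1[::-1]
    tuple:  t2 = tuple(reversed(t1))
            t2 = t1[::-1]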
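The difference between `extend` and `append`, and between the in-place and not-in-place variants of sorting and reversing, can be sketched as follows. The variable names are illustrative only.

```python
l1 = ['a', 'b']
l2 = ['c', 'd']

extended = l1.copy()
extended.extend(l2)      # adds each element of l2

appended = l1.copy()
appended.append(l2)      # adds l2 itself as one nested element

nums = [3, 1, 2]
ordered = sorted(nums)           # new list; nums is unchanged
rev_list = list(reversed(nums))  # reversed() gives an iterator, so wrap it
nums.sort(reverse=True)          # sorts nums in place and returns None
```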
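Assignment (no copy; both names refer to the same object)
    list:   a = [1, 2, 3]
            b = a
    tuple:  a = (1, 2, 3)
            b = a

Copy (shallow; nested objects are still shared)
    list:   a = [1, 2, 3]
            b = a.copy()
            ---
            c = list(a)
            ---
            d = a[:]
    tuple:  import copy
            a = (1, 2, 3)
            b = copy.copy(a)
            ---
            c = tuple(a)
            ---
            d = a[:]

Copy (deep; nested objects are copied too)
    list:   import copy
            a = [1, 2, 3]
            b = copy.deepcopy(a)
    tuple:  import copy
            a = (1, 2, 3)
            b = copy.deepcopy(a)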
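The three kinds of "copy" only behave differently once the list contains another mutable object. A minimal sketch, using a nested list chosen for illustration:

```python
import copy

a = [1, [2, 3]]

alias = a                   # no copy: same object
shallow = a.copy()          # new outer list, shared inner list
deep = copy.deepcopy(a)     # new outer list and new inner list

a[0] = 99                   # visible through alias only
a[1].append(4)              # visible through alias and shallow, not deep
```

After these two changes, `shallow` still has `1` in the first position but sees the appended `4`, while `deep` is unaffected by both.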

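Sum of values
    list:   sum(l1)
    tuple:  same as list

Maximum value
    list:   max(l1)
    tuple:  same as list

Minimum value
    list:   min(l1)
    tuple:  same as list

Convert (to string)
    list:   l1 = ['a', 'b', 'c']
            ','.join(l1)
            --> a,b,c
    tuple:  same as list

Convert (to list)
    list:   -
    tuple:  list(t1)

Convert (to tuple)
    list:   tuple(l1)
    tuple:  -

Convert (to set)
    list:   set(l1)
    tuple:  set(t1)

Convert (to dict)
    list:   l1 = [['a', 'b'], ['c', 'd'], ['e', 'f']]
            d1 = dict(l1)
            ---
            k = ['a', 'b', 'c']
            v = [1, 2, 3]
            d1 = dict(zip(k, v))
    tuple:  t1 = (('a', 'b'), ('c', 'd'), ('e', 'f'))
            d1 = dict(t1)

Take items from several sequences in step
    list:   zip(l1, l2)
    tuple:  same as list

Comprehension
    list:   [x for x in l1]
    tuple:  tuple(x for x in t1)

Storing mutable objects
    list:   possible
            l1 = ['a', [1, 2, 3]]
    tuple:  possible
            t1 = ('a', [1, 2, 3])

Set operations (union, difference, intersection, symmetric difference)
    list:   -
    tuple:  -

Access by key
    list:   -
    tuple:  -

Get keys and loop / get values and loop / get key and value pairs and loop
    list:   -
    tuple:  -

Swap keys and values
    list:   -
    tuple:  -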
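Two of the conversion rows have details worth seeing in action: `join` only accepts strings, and a tuple that stores a list is immutable itself but its inner list can still change. A short sketch, reusing the `k` and `v` names from the table:

```python
k = ['a', 'b', 'c']
v = [1, 2, 3]

d1 = dict(zip(k, v))                        # pair keys with values
joined = ','.join(k)                        # join() needs strings
joined_nums = ','.join(str(n) for n in v)   # convert non-strings first

# The tuple cannot be reassigned, but the list inside it can be modified.
t1 = ('a', [1, 2, 3])
t1[1].append(4)
```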
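set vs. dictionary

Element order
    set:    none
    dict:   yes since Python 3.7 (insertion order)

Mutability
    set:    mutable
    dict:   mutable

Duplicates
    set:    duplicate elements not allowed
    dict:   duplicate keys not allowed
            duplicate values allowed

Notes
    set:    set operations are possible
            elements are unique
            adding and replacing are both an upsert
            can be used to remove duplicates from a list or tuple
    dict:   keys must be unique
            if a key is repeated, its value is overwritten (upsert)

Create empty
    set:    s1 = set()
    dict:   d1 = dict()
            d1 = {}

Initialize
    set:    s1 = {'a', 'b', 'c'}
    dict:   d1 = {'a': 1, 'b': 2, 'c': 3}

Initialize (via the class)
    set:    s1 = set({'a', 'b', 'c'})
            s1 = set(['a', 'b', 'c'])
            s1 = set(('a', 'b', 'c'))
    dict:   d1 = dict(a=1, b=2, c=3)
            d1 = dict({'a':1, 'b':2, 'c':3})
            d1 = dict((('a',1), ('b',2), ('c',3)))

Number of elements
    set:    len(s1)
    dict:   len(d1)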
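The notes on duplicate removal and upsert can be sketched as follows. `dict.fromkeys` is an addition that does not appear in the table; it is shown because a set discards the original order while dictionary keys keep it.

```python
# Removing duplicates with a set loses the original order.
items = ['b', 'a', 'b', 'c', 'a']
unique_unordered = set(items)

# dict keys are unique and keep insertion order (Python 3.7+),
# so dict.fromkeys() removes duplicates and preserves first-seen order.
unique_ordered = list(dict.fromkeys(items))

# Assigning to a key is an upsert: update if present, insert if not.
d1 = {'a': 1, 'b': 2}
d1['a'] = 10      # update
d1['c'] = 3       # insert
```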
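Add
    set:    s1.add('d')
            s1 |= {'d'}
    dict:   d1[key] = val
            d1.update({'e': 4})
            d1.update(e=4)
            d1.update(dict(e=4))

Replace
    set:    same as add (upsert)
    dict:   same as add (upsert)

Delete (by position)
    set:    -
    dict:   -

Delete (by value)
    set:    s1.remove('d')
            s1 -= {'d'}
    dict:   -

Delete (by key)
    set:    -
    dict:   del d1[key]

Delete (clear all)
    set:    s1.clear()
            s1 = set()
    dict:   d1.clear()
            d1 = {}

Element access (slicing)
    set:    -
    dict:   -

Get and remove
    set:    # set.pop() takes no argument and removes an arbitrary element
            # KeyError if the set is empty
            s1.pop()
    dict:   # KeyError if the key does not exist
            d1.pop(key)
            # returns default if the key does not exist
            d1.pop(key, default)

LIFO (stack), FIFO (queue), position of an element
    set:    -
    dict:   -

Membership check
    set:    same as list
    dict:   key in d1 # True/False
            val in d1.values() # True/False

Nesting
    set:    s1 = {(1,2), (3,4)}
            # sets cannot be nested
            × s1 = {{1, 2}, {3, 4}}
    dict:   # a dict can be stored as a value
            d1 = {'a': {'x': 1}, 'b': {'y': 2}}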
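Removal behaves differently for sets and dictionaries, so a short sketch may help. `set.pop()` accepts no argument, and `discard` (an addition not listed in the table) is the set method that removes a specific element without raising an error.

```python
s1 = {'a', 'b', 'c'}

s1.discard('z')            # missing element: no error
try:
    s1.remove('z')         # missing element: KeyError
    remove_ok = True
except KeyError:
    remove_ok = False

popped = s1.pop()          # no argument; removes an arbitrary element

d1 = {'a': 1, 'b': 2}
value = d1.pop('a')              # returns 1 and removes the key
fallback = d1.pop('zzz', None)   # default returned instead of KeyError
```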
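Merge
    set:    s1 = {1, 2, 3}
            s2 = {4, 5, 6}
            s3 = s1.union(s2)
    dict:   d1 = {'a': 1, 'b': 2}
            d2 = {'b': 9, 'c':3}
            d1.update(d2)
            Note: when a key is repeated, the later value (from d2) wins

Merge (with an operator)
    set:    s1 = {1, 2, 3}
            s2 = {4, 5, 6}
            s3 = s1 | s2
    dict:   -

Merge (2)
    set:    -
    dict:   -

Count of elements with a given value
    set:    -
    dict:   d1 = {'a': 3, 'b': 2, 'c': 1, 'd': 3}
            len({k: v for k, v in d1.items() if v == 3})
            ---
            sum(v == 3 for v in d1.values())

Sort (in place)
    set:    -
    dict:   -

Sort (not in place)
    set:    # sorted() returns a list
            s2 = set(sorted(s1))
            # use case unclear: a set has no order
    dict:   d2 = sorted(d1.items(), key=lambda x: x[1])

Reverse the order
    set:    -
    dict:   -

Assignment (no copy; both names refer to the same object)
    set:    a = {1, 2, 3}
            b = a
    dict:   a = {'a': 1, 'b': 2, 'c': 3}
            b = a

Copy (shallow)
    set:    a = {1, 2, 3}
            b = a.copy()
            ---
            c = set(a)
    dict:   a = {'a': 1, 'b': 2, 'c': 3}
            b = a.copy()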
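The dictionary rows for merging, counting and sorting can be sketched together. The `scores` name is hypothetical; its contents are the `d1` example from the counting row.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 9, 'c': 3}
d1.update(d2)                      # 'b' takes the value from d2

scores = {'a': 3, 'b': 2, 'c': 1, 'd': 3}
threes = sum(v == 3 for v in scores.values())   # True counts as 1

# sorted() on items() returns a list of (key, value) tuples ...
by_value = sorted(scores.items(), key=lambda kv: kv[1])
# ... which dict() turns back into a dict ordered by value.
sorted_dict = dict(by_value)
```

Because Python's sort is stable, keys with equal values ('a' and 'd') keep their original relative order.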

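Sum of values
    set:    same as list
    dict:   sum(d1.keys())
            sum(d1.values())

Maximum value
    set:    same as list
    dict:   max(d1.keys())
            max(d1.values())

Minimum value
    set:    same as list
    dict:   min(d1.keys())
            min(d1.values())

Convert (to string)
    set:    same as list
    dict:   ','.join(d1.keys())
            ','.join(d1.values())   # the values must be strings

Convert (to list)
    set:    list(s1)
    dict:   list(d1.keys())
            list(d1.values())

Convert (to tuple)
    set:    tuple(s1)
    dict:   tuple(d1.keys())
            tuple(d1.values())
            tuple(d1.items())

Convert (to set)
    set:    -
    dict:   set(d1.keys())
            set(d1.values())
            set(d1.items())

Convert (to dict)
    set:    s1 = {('a',1),('b',2),('c', 3)}
            d1 = dict(s1)
            ---
            s1 = {'a', 'b', 'c'}
            s2 = {1, 2, 3}
            d1 = dict(zip(s1, s2))
    dict:   -

Take items from several collections in step
    set:    zip(s1, s2) is possible, but the pairing and the order are not guaranteed
            s1 = {'a', 'b', 'c'}
            s2 = {1, 2, 3}
            l3 = zip(s1, s2)   # zip() returns an iterator
            --> for example ('a', 1), ('c', 3), ('b', 2)
    dict:   -

Comprehension
    set:    {x for x in s1}
    dict:   {k: v for k, v in d1.items()}

Storing mutable objects
    set:    not possible
            s1 = {'a', [1, 2, 3]}
            --> TypeError
    dict:   not possible as a key (TypeError)
            d1 = {[1, 2, 3]: 1}
            possible as a value
            d1 = {'a': [1, 2, 3]}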
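Whether an object can be a set element or a dictionary key comes down to whether it is hashable. The helper `is_hashable` below is a hypothetical name used for illustration, and `frozenset` is an addition that does not appear in the table; it is one way to get the "set of sets" that plain sets do not allow.

```python
def is_hashable(obj):
    """Return True if obj can be a set element or a dict key."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

list_ok = is_hashable([1, 2, 3])          # list: unhashable
tuple_ok = is_hashable((1, 2, 3))         # tuple of ints: hashable
mixed_ok = is_hashable(('a', [1, 2, 3]))  # tuple holding a list: unhashable
frozen_ok = is_hashable(frozenset({1, 2}))

# frozenset (not from the table) allows a set whose elements are sets.
nested = {frozenset({1, 2}), frozenset({3, 4})}
```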

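Set operation (union)
    set:    s1 | s2
            s1.union(s2)
    dict:   -

Set operation (difference)
    set:    s1 - s2
            s1.difference(s2)
    dict:   -

Set operation (intersection)
    set:    s1 & s2
            s1.intersection(s2)
    dict:   -

Set operation (symmetric difference)
    set:    s1 ^ s2
            s1.symmetric_difference(s2)
    dict:   -

Access by key
    set:    -
    dict:   # KeyError if the key does not exist
            d1[key]
            # returns None if the key does not exist
            d1.get(key)
            # returns default if the key does not exist
            d1.get(key, default)

Get keys and loop
    set:    -
    dict:   d1.keys()
            for key in d1.keys():

Get values and loop
    set:    -
    dict:   d1.values()
            for val in d1.values():

Get key and value pairs and loop
    set:    -
    dict:   # each (k, v) pair comes back as a tuple
            d1.items()
            for key, value in d1.items():

Swap keys and values
    set:    -
    dict:   d2 = {v: k for k, v in d1.items()}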
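The four set operations and the dictionary access patterns can be sketched with small values. The sample sets and dictionary are illustrative only.

```python
s1 = {1, 2, 3}
s2 = {3, 4}

union = s1 | s2
difference = s1 - s2
intersection = s1 & s2
symmetric = s1 ^ s2

d1 = {'a': 1, 'b': 2}
missing = d1.get('z')               # None instead of KeyError
missing_default = d1.get('z', 0)

pairs = [(k, v) for k, v in d1.items()]
# Swapping is only safe when the values are unique and hashable.
inverted = {v: k for k, v in d1.items()}
```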
              Category      Continuous
Category      Chi square    t-test, Anova
Continuous    t-test        Correlation

Paired t test
    ・A paired t-test is used when we are interested in the difference between two variables for the same subject.
    ・Often the two variables are separated by time.
    ・For example, in the Dixon and Massey data set we have cholesterol levels in 1952 and cholesterol levels in 1962 for each subject.

Two samples t test
    A method used to test whether the unknown population means of two groups are equal or not.

Confidence interval for difference of two means, dependent samples
Weight loss example, lbs

Background  The 365 team has developed a diet and an exercise program for losing weight. It seems that it works like a charm. However, you are interested in how much weight you are likely to lose.
            You have a sample of 10 people who have already completed the 12-week program. The second sheet shows the data in kg, if you feel more comfortable using kg as a unit of measurement.
Task 1      Calculate the mean and standard deviation of the dataset
Task 2      Determine the appropriate statistic to use
Task 3      Calculate the 95% confidence interval
Task 4      Interpret the result
Optional    You can try to calculate the 90% and 99% confidence intervals to see the difference. There is no solution provided for these cases.

Solution:

Subject   Weight before (lbs)   Weight after (lbs)   Difference
1         228.58                204.74               -23.83
2         244.01                223.95               -20.06
3         262.46                232.94               -29.52
4         224.32                212.04               -12.28
5         202.14                191.74               -10.41
6         246.98                233.47               -13.51
7         195.86                177.60               -18.25
8         231.88                213.85               -18.03
9         243.32                218.85               -24.47
10        266.74                236.86               -29.87

Task 1:   Mean            -20.02
          St. deviation   6.86

Task 2:   Population variance is unknown
          We have a small sample
          We assume that the population is normally distributed
          The appropriate statistic to use is the t-statistic

Task 3:   95% CI, t(9, 0.025) = 2.26

          T      CI low    CI high
          95%    -24.93    -15.12

Task 4:   You are 95% confident that you will lose between 15.12 lbs and 24.93 lbs,
          given that you follow the program as strictly as the sample

Note that the solution procedure is exactly the same no matter the unit of measurement
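The calculation in Tasks 1 and 3 can be reproduced with the standard library. This is a sketch under two assumptions: the differences are taken as printed in the table (already rounded to two decimals), and the critical value is entered by hand as 2.262, the t-table value that the sheet shows rounded to 2.26.

```python
import math
import statistics

# Differences (after minus before, in lbs) copied from the solution table.
differences = [-23.83, -20.06, -29.52, -12.28, -10.41,
               -13.51, -18.25, -18.03, -24.47, -29.87]

n = len(differences)
mean = statistics.mean(differences)
st_dev = statistics.stdev(differences)   # sample standard deviation (n - 1)

# Critical value t(9, 0.025) from a t-table, entered by hand (assumption).
t_crit = 2.262

# Margin of error for the mean of the paired differences.
margin = t_crit * st_dev / math.sqrt(n)
ci_low, ci_high = mean - margin, mean + margin
print(f"mean={mean:.2f} sd={st_dev:.2f} CI=({ci_low:.2f}, {ci_high:.2f})")
```

For the optional 90% and 99% intervals, only `t_crit` changes; the rest of the calculation stays the same.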
# A custom IQR function
def iqr(column):
    return column.quantile(0.75) - column.quantile(0.25)

# Print IQR of the temperature_c column
# (sales is assumed to be a pandas DataFrame that is already loaded)
print(sales["temperature_c"].agg(iqr))
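The snippet above depends on a `sales` DataFrame that is not defined in these notes. A self-contained sketch, assuming pandas is installed and using a small hypothetical stand-in for `sales`:

```python
import pandas as pd

def iqr(column):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return column.quantile(0.75) - column.quantile(0.25)

# Hypothetical stand-in for the sales DataFrame.
sales = pd.DataFrame({"temperature_c": [10.0, 20.0, 30.0, 40.0, 50.0]})

# agg() passes the whole column to iqr and returns a single number.
result = sales["temperature_c"].agg(iqr)
```

With these five values the quartiles fall exactly on 20.0 and 40.0, so the interquartile range is 20.0.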
