Python Summary
1 upper()
2 Series.index
3 Series +-* number
4 Series.size
5 Series.is_unique
6 Series.values
7 Series1 + Series2: NaN if there is no matching index label between the two series
8 sales_h1 = sales_q1.add(sales_q2, fill_value=0)
  Series.add(other, level=None, fill_value=None, axis=0)
9 Series.value_counts()
  Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
  normalize=True: the returned object contains the relative frequencies of the unique values
  dropna=False: also show NaN
10 dict(series)
11 sorted(series)
12 Series.squeeze(axis=None)
   Series or DataFrames with a single element are squeezed to a scalar. DataFrames with a single column or a single row are squeezed to a Series. Otherwise the object is unchanged.
13 Series.sort_values()
   Series.sort_values(*, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)
14 Series.sort_index()
15 "value" in series checks the index; "value" in series.values checks the values
16 Series.get(key, default=None): returns the default value if the key is not found
17 pokemon[[1, 2, 4]] = ["Firemon", "Flamemon", "Blazemon"]  # overwrite values
18 pokemon_df = pd.read_csv("pokemon.csv", usecols=["Pokemon"])
   pokemon_series = pokemon_df.squeeze("columns").copy()
19 google = google.sort_values()  # or google.sort_values(inplace=True)
20 google.describe()
21 Series.apply()
22 Series.map(arg, na_action=None)  arg: mapping correspondence
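A short sketch tying several of the entries above together (the sales_q1/sales_q2 names follow entry 8; the fruit labels and numbers are made up):

import pandas as pd

# Two quarterly sales Series with partly different index labels (made-up data)
sales_q1 = pd.Series({"apple": 10, "banana": 5})
sales_q2 = pd.Series({"banana": 7, "cherry": 3})

# Plain + gives NaN where a label exists in only one Series (entry 7);
# add() with fill_value=0 keeps every label instead (entry 8)
sales_h1 = sales_q1.add(sales_q2, fill_value=0)

# Relative frequencies of the unique values (entry 9)
freq = sales_h1.value_counts(normalize=True)

# Sorting, membership tests and safe lookup (entries 13, 15, 16)
sales_sorted = sales_h1.sort_values(ascending=False)
print("apple" in sales_h1)        # True: "in" checks the index
print(12.0 in sales_h1.values)    # True: .values checks the values
print(sales_h1.get("durian", 0))  # 0: default when the key is missing

# map() with a mapping correspondence; unmapped values become NaN (entry 22)
labels = sales_h1.map({10.0: "high", 12.0: "high", 3.0: "low"})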
list / tuple
Duplicates: both list and tuple allow duplicate elements.
Notes: a list cannot be a dict key or a set element; a tuple uses less memory than a list, can be a dict key, can be a set element, and namedtuple can be used as a lightweight substitute for a class.
Create empty:
l1 = list()  or  l1 = []
t1 = tuple()  or  t1 = ()
Initialize:
l1 = ['a', 'b', 'c']
t1 = ('a', 'b', 'c')  or  t1 = 'a', 'b', 'c'
# single-element tuple: do not forget the comma
t1 = ('a',)  or  t1 = 'a',
Initialize (via the class constructor):
l1 = list(['a', 'b', 'c'])  or  l1 = list(('a', 'b', 'c'))
t1 = tuple(('a', 'b', 'c'))  or  t1 = tuple(['a', 'b', 'c'])
Merge:
l1 = ['a', 'b', 'c']; l2 = ['d', 'e', 'f']; l1.extend(l2)  or  l1 += l2
t1 = (1, 2, 3); t2 = (4, 5, 6); t3 = t1 + t2
Merge (2): append() gives a different result, it nests the second list
l1 = ['a', 'b']; l2 = ['c', 'd']
l1.append(l2)  --> ['a', 'b', ['c', 'd']]
Count elements with a given value: l1.count('a')  /  t1.count('a')
Sort (in place): l1.sort()  or  l1.sort(reverse=True)  (tuple: -)
Sort (copy): l2 = sorted(l1)  # sorted() => list  /  t2 = tuple(sorted(t1))
Reverse (in place): l1.reverse()  (tuple: -)
Reverse (copy): l2 = reversed(l1)  or  l2 = l1[::-1]  /  t2 = tuple(reversed(t1))  or  t2 = t1[::-1]
Assignment (same object): a = [1, 2, 3]; b = a  /  a = (1, 2, 3); b = a
Copy (shallow):
list: b = a.copy()  or  c = list(a)  or  d = a[:]
tuple: import copy; b = copy.copy(a)  or  c = tuple(a)  or  d = a[:]
Copy (deep): import copy; b = copy.deepcopy(a)
Convert to list: (list: -)  /  list(t1)
Convert to tuple: tuple(l1)  /  (tuple: -)
Convert to dict:
l1 = [['a', 'b'], ['c', 'd'], ['e', 'f']]; d1 = dict(l1)
t1 = (('a', 'b'), ('c', 'd'), ('e', 'f')); d1 = dict(t1)
or from two sequences: k = ['a', 'b', 'c']; v = [1, 2, 3]; d1 = dict(zip(k, v))
Iterate multiple sequences in parallel: zip(l1, l2)  (tuple: same as list)
Set operations (union / difference / intersection / symmetric difference): not available for list or tuple.
Key lookup, looping over keys / values / key-value pairs, swapping keys and values: not available for list or tuple.
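A small sketch of the rows above that differ most between the two types (the variable names are illustrative):

from collections import namedtuple

# A one-element tuple needs the trailing comma
t1 = ('a',)

# extend() merges in place; append() nests the second list instead
l1 = ['a', 'b']
l1.extend(['c', 'd'])        # ['a', 'b', 'c', 'd']
l2 = ['a', 'b']
l2.append(['c', 'd'])        # ['a', 'b', ['c', 'd']]

# Tuples have no sort(); build a new tuple from sorted(), which returns a list
t2 = tuple(sorted((3, 1, 2)))   # (1, 2, 3)

# Tuples can be dict keys and set elements; lists cannot
lookup = {('x', 'y'): 1}
points = {(1, 2), (3, 4)}

# namedtuple as a lightweight substitute for a class
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)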
set / dictionary
Ordering: set: none. dict: insertion order is preserved from Python 3.7.
Mutability: both are mutable.
Duplicates:
set: duplicate elements are not allowed (elements are unique); set operations are available; useful for removing duplicates from a list or tuple.
dict: duplicate keys are not allowed (keys must be unique); if a key is repeated the value is overwritten (add/replace is an upsert); duplicate values are allowed.
Create empty: s1 = set()  /  d1 = dict()  or  d1 = {}
Length: len(s1)  /  len(d1)
Add:
s1.add('d')  or  s1 |= {'d'}
d1[key] = val  or  d1.update({'e': 4})  or  d1.update(e=4)  or  d1.update(dict(e=4))
Replace: same as add for both (upsert).
Remove: s1.remove('d')  or  s1 -= {'d'}  /  del d1[key]
Clear: s1.clear()  or  s1 = set()  /  d1.clear()  or  d1 = {}
Pop:
set: s1.pop() removes and returns an arbitrary element (it takes no argument); use s1.discard('a') to remove a specific element without an error if it is missing.
dict: d1.pop(key)  # KeyError if missing  /  d1.pop(key, default)  # default if missing
Membership test:
set: same as list (x in s1)
dict: key in d1  # True/False  /  val in d1.values()  # True/False
Nesting:
set: tuples can be elements: s1 = {(1, 2), (3, 4)}; nested sets are not allowed: s1 = {{1, 2}, {3, 4}} --> TypeError
dict: a value can hold a dict: d1 = {'a': {'x': 1}, 'b': {'y': 2}}
Merge:
s1 = {1, 2, 3}; s2 = {4, 5, 6}; s3 = s1.union(s2)  or  s3 = s1 | s2
d1 = {'a': 1, 'b': 2}; d2 = {'b': 9, 'c': 3}; d1.update(d2)  # on a key collision the later value (d2) wins
Sort:
set: s2 = set(sorted(s1))  # sorted() => list (use case unclear)
dict: d2 = sorted(d1.items(), key=lambda x: x[1])  # returns a list of (key, value) tuples
sum: set: same as list  /  dict: sum(d1.keys())  or  sum(d1.values())
max: set: same as list  /  dict: max(d1.keys())  or  max(d1.values())
min: set: same as list  /  dict: min(d1.keys())  or  min(d1.values())
join: set: same as list  /  dict: ','.join(d1.keys())  or  ','.join(d1.values())
Convert to list: list(s1)  /  list(d1.keys())  or  list(d1.values())
Convert to tuple: tuple(s1)  /  tuple(d1.keys())  or  tuple(d1.values())  or  tuple(d1.items())
Convert to set: (set: -)  /  set(d1.keys())  or  set(d1.values())  or  set(d1.items())
Convert to dict:
s1 = {('a', 1), ('b', 2), ('c', 3)}; d1 = dict(s1)
or: s1 = {'a', 'b', 'c'}; s2 = {1, 2, 3}; d1 = dict(zip(s1, s2))
# zip(s1, s2) works, but the pairing and order are not guaranteed, e.g. --> {('a', 1), ('c', 3), ('b', 2)}
Comprehension: {x for x in s1}  /  {k: v for k, v in d1.items()}
Mutable elements / keys:
set: not allowed: s1 = {'a', [1, 2, 3]} --> TypeError
dict: a mutable key is not allowed: d1 = {[1, 2, 3]: 1} --> TypeError; mutable values are allowed.
Set operations (set only):
union: s1 | s2  or  s1.union(s2)
difference: s1 - s2  or  s1.difference(s2)
intersection: s1 & s2  or  s1.intersection(s2)
symmetric difference: s1 ^ s2  or  s1.symmetric_difference(s2)
Key lookup (dict only):
d1[key]  # KeyError if the key is missing
d1.get(key)  # returns None if missing
d1.get(key, default)  # returns default if missing
Loop over keys: d1.keys()  /  for key in d1.keys():
Loop over values: d1.values()  /  for val in d1.values():
Loop over key-value pairs: d1.items()  # (k, v) pairs as tuples  /  for key, value in d1.items():
Swap keys and values: d2 = {v: k for k, v in d1.items()}
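A small sketch of the main set/dict rows above (the values are made up):

# De-duplicating a list with a set, plus the basic set operations
s1 = set([1, 2, 2, 3])        # {1, 2, 3}
s2 = {3, 4, 5}
print(s1 | s2, s1 - s2, s1 & s2, s1 ^ s2)

# dict add/replace is an upsert; update() overwrites on a key collision
d1 = {'a': 1, 'b': 2}
d1['c'] = 3                   # insert
d1['a'] = 10                  # overwrite
d1.update({'b': 9, 'd': 4})   # 'b' is overwritten, 'd' is inserted

# Safe lookup and the standard loops
print(d1.get('zzz', 0))       # default instead of KeyError
for key, value in d1.items():
    pass

# Swap keys and values
d2 = {v: k for k, v in d1.items()}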
Which test to use, by variable type:

              Category      Continuous
Category      Chi-square    t-test / ANOVA
Continuous    t-test        Correlation
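A minimal sketch of the three cells of the table using scipy.stats (all numbers are made up):

from scipy import stats

# Category vs Category: chi-square test on a contingency table
observed = [[20, 30],
            [25, 25]]
chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)

# Category vs Continuous, three or more groups: one-way ANOVA
g1, g2, g3 = [5.1, 4.9, 5.3], [6.0, 6.2, 5.8], [7.1, 6.9, 7.3]
f_stat, p_anova = stats.f_oneway(g1, g2, g3)

# Continuous vs Continuous: Pearson correlation
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r, p_corr = stats.pearsonr(x, y)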
Paired t-test
・A paired t-test is used when we are interested in the difference between two variables for the same subject.
・Often the two variables are separated by time.
・For example, in the Dixon and Massey data set we have cholesterol levels in 1952 and cholesterol levels in 1962 for each subject.
Two-sample t-test: a method used to test whether the unknown population means of two groups are equal or not.
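A short sketch of both tests with scipy.stats (the cholesterol-style numbers are invented for illustration):

from scipy import stats

# Paired t-test: the same subjects measured at two points in time
chol_1952 = [240, 230, 250, 260, 245, 255]
chol_1962 = [235, 224, 252, 248, 241, 250]
t_paired, p_paired = stats.ttest_rel(chol_1952, chol_1962)

# Two-sample t-test: two independent groups
group_a = [5.1, 4.8, 5.5, 5.0, 5.2]
group_b = [4.2, 4.6, 4.4, 4.1, 4.5]
t_ind, p_ind = stats.ttest_ind(group_a, group_b)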
Background  The 365 team has developed a diet and an exercise program for losing weight. It seems that it works like a charm. However, you have a sample of 10 people who have already completed the 12-week program. The second sheet shows the data.
Task 1  Calculate the mean and standard deviation of the dataset
Task 2  Determine the appropriate statistic to use
Task 3  Calculate the 95% confidence interval
Task 4  Interpret the result
Optional  You can try to calculate the 90% and 99% confidence intervals to see the difference. There is no solution provided for these.
Solution:
Task 3: see the sketch below
Task 4: You are 95% confident that you will lose between 15.12 lbs and 24.93 lbs, given that you follow the program as strictly as the sample did.
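A hedged sketch of Tasks 1-3; the weight-loss values below are placeholders, not the numbers from the exercise spreadsheet:

import numpy as np
from scipy import stats

# Placeholder weight-loss values for the 10 participants (not the real data)
loss = np.array([18.1, 22.4, 19.7, 25.0, 16.3, 21.8, 23.5, 20.2, 17.9, 24.1])

n = len(loss)
mean = loss.mean()                      # Task 1: sample mean
std = loss.std(ddof=1)                  # Task 1: sample standard deviation
sem = std / np.sqrt(n)

# Task 2: population variance unknown and n is small, so use the t-statistic
t_crit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% interval

# Task 3: 95% confidence interval for the mean weight loss
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem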