Merge Append Python Code
Merge Append Python Code
Merge DataFrames
“Merging” two datasets is the process of bringing two datasets together into one, and aligning the rows from
each based on common attributes or columns.
In [1]:
import numpy as np
import pandas as pd
In [2]:
user_usage = pd.read_csv('data/user_usage.csv')
user_usage.head(3)
Out[2]:
In [6]:
user_usage.shape
Out[6]:
(240, 4)
In [3]:
user_device = pd.read_csv('data/user_device.csv')
user_device.head(3)
Out[3]:
In [7]:
user_device.shape
Out[7]:
(272, 6)
In [4]:
android_devices = pd.read_csv('data/android_devices.csv')
android_devices.head(3)
Out[4]:
Unnamed: Unnamed: U
use_id user_id platform platform_version device use_type_id
6 7
SM-
2 22784 28473 android 5.1 1 NaN NaN
G903F
Sample problem
We would like to determine if the usage patterns for users differ between different devices. For example, do
users using Samsung devices use more call minutes than those using LG devices?
We want to form a single dataframe with columns for user usage figures (calls per month, sms per month etc)
and also columns with device information (model, manufacturer, etc). We will need to “merge” (or “join”) our
sample datasets together into one single dataset for analysis.
Lets see how we can correctly add the “device” and “platform” columns to the user_usage dataframe using the
Pandas Merge command.
In [5]:
result = pd.merge(user_usage,
user_device[['use_id', 'platform', 'device']],
on='use_id')
result.head()
Out[5]:
GT-
0 21.97 4.82 1557.33 22787 android
I9505
SM-
1 1710.08 136.88 7267.55 22788 android
G930F
SM-
2 1710.08 136.88 7267.55 22789 android
G930F
SM-
4 71.59 79.26 1557.33 22792 android
G361F
In [8]:
result.shape
Out[8]:
(159, 6)
In [21]:
agg_device = result.groupby("device").count()['platform']
agg_device
Out[21]:
device
A0001 2
C6603 1
D2303 2
D5503 2
D5803 1
D6603 2
E6653 1
EVA-L09 1
F3111 4
GT-I8190N 1
GT-I9195 3
GT-I9300 2
GT-I9505 11
GT-I9506 1
GT-I9515 3
GT-N7100 2
HTC Desire 510 5
HTC Desire 530 1
HTC Desire 620 1
HTC Desire 626 2
HTC Desire 825 3
HTC One M9 1
HTC One S 2
HTC One mini 2 3
HTC One_M8 1
HUAWEI CUN-L01 1
HUAWEI VNS-L31 1
LG-H815 1
Lenovo K51c78 1
Moto G (4) 4
MotoE2(4G-LTE) 1
Nexus 5X 1
ONE A2003 2
ONEPLUS A3003 7
SM-A300FU 4
SM-A310F 2
SM-A500FU 1
SM-G360F 2
SM-G361F 5
SM-G531F 1
SM-G800F 1
SM-G900F 30
SM-G903F 2
SM-G920F 5
SM-G925F 4
SM-G930F 3
SM-G935F 5
SM-J320FN 5
SM-N9005 1
SM-N910F 5
VF-795 1
Vodafone Smart ultra 6 1
X11 2
localhost:8888/notebooks/Machine Learning/Data Analytics/4.3_Merge_append.ipynb 4/5
2/24/2020 4.3_Merge_append - Jupyter Notebook
iPhone6,2 1
iPhone7,2 1
Name: platform, dtype: int64
Append Dataframes
In [23]:
#df = df1.append(df2)
In [ ]: