
Facebook - Jupyter Notebook

The document discusses loading and manipulating data from CSV files into Pandas dataframes. It shows how to import CSV data, view the first few rows, check column names, create subsets of columns, and merge/concatenate multiple dataframes.

Uploaded by

xifavo8319

In [1]: import pandas as pd

In [5]: df=pd.read_csv(r"C:\Users\Admin\eclipse\Downloads\pseudo_facebook.csv\pseudo_f

In [6]: df.head()

Out[6]:
    userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initi
0  2094382   14       19      1999         11    male   266.0             0
1  1192601   14        2      1999         11  female     6.0             0
2  2083884   14       16      1999         11    male    13.0             0
3  1203168   14       25      1999         12  female    93.0             0
4  1733186   14        4      1999         12    male    82.0             0

 

In [38]: df.columns

Out[38]: Index(['userid', 'age', 'dob_day', 'dob_year', 'dob_month', 'gender', 'tenure',
                'friend_count', 'friendships_initiated', 'likes', 'likes_received',
                'mobile_likes', 'mobile_likes_received', 'www_likes',
                'www_likes_received'],
               dtype='object')

a. Create data subsets

In [39]: sub1=df['mobile_likes_received']

In [40]: sub1

Out[40]: 0 0
1 0
2 0
3 0
4 0
...
98998 11887
98999 10592
99000 11462
99001 5760
99002 9530
Name: mobile_likes_received, Length: 99003, dtype: int64

In [10]: subsets=df[['dob_day','likes','gender','mobile_likes','friendships_initiated']]

In [11]: subsets

Out[11]:
       dob_day  likes  gender  mobile_likes  friendships_initiated
    0       19      0    male             0                      0
    1        2      0  female             0                      0
    2       16      0    male             0                      0
    3       25      0  female             0                      0
    4        4      0    male             0                      0
  ...      ...    ...     ...           ...                    ...
98998        4   3996  female          3505                    341
98999       12   4401  female          4399                   1720
99000       10  11959  female         11959                   1524
99001       11   4506  female          4506                    185
99002       15   9410  female          9410                    768

99003 rows × 5 columns
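The two selections above return different types: a single label gives a Series, while a list of labels gives a DataFrame. A minimal sketch on a toy frame (the values here are made up for illustration and are not from pseudo_facebook.csv):

```python
import pandas as pd

# Toy stand-in for the Facebook data; values are illustrative only.
toy = pd.DataFrame({
    "dob_day": [19, 2, 16],
    "likes": [0, 5, 12],
    "gender": ["male", "female", "male"],
})

one_col = toy["likes"]               # single label -> pandas Series
two_cols = toy[["likes", "gender"]]  # list of labels -> DataFrame

print(type(one_col).__name__, one_col.shape)
print(type(two_cols).__name__, two_cols.shape)
```

The double brackets are simply a Python list passed to the indexing operator, which is why the column order in the result follows the order of the list.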

b. Merge Data

In [12]: df2=pd.read_csv(r"C:\Users\Admin\Desktop\ml_data\startup_funding.csv")

In [14]: df2.head(10)

Out[14]:
   SNo  Date        StartupName       IndustryVertical   SubVertical                                    CityLocation  InvestorsName
0  0    01/08/2017  TouchKin          Technology         Predictive Care Platform                       Bangalore     Kae Capital
1  1    02/08/2017  Ethinos           Technology         Digital Marketing Agency                       Mumbai        Triton Investment Advisors
2  2    02/08/2017  Leverage Edu      Consumer Internet  Online platform for Higher Education Services  New Delhi     Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...
3  3    02/08/2017  Zepo              Consumer Internet  DIY Ecommerce platform                         Mumbai        Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...
4  4    02/08/2017  Click2Clinic      Consumer Internet  healthcare service aggregator                  Hyderabad     Narottam Thudi, Shireesh Palle
5  5    01/07/2017  Billion Loans     Consumer Internet  Peer to Peer Lending platform                  Bangalore     Reliance Corporate Advisory Services Ltd
6  6    03/07/2017  Ecolibriumenergy  Technology         Energy management solutions provider           Ahmedabad     Infuse Ventures, JLL
7  7    04/07/2017  Droom             eCommerce          Online marketplace for automobiles             Gurgaon       Asset Management (Asia) Ltd, Digital Garage Inc
8  8    05/07/2017  Jumbotail         eCommerce          online marketplace for food and grocery        Bangalore     Kalaari Capital, Nexus India Capital Advisors
9  9    05/07/2017  Moglix            eCommerce          B2B marketplace for Industrial products        Noida         International Finance Corporation, Rocketship,...

 

In [17]: df3=pd.concat([df,df2],axis=1)
         # axis=0 would stack the frames vertically, adding rows
         # axis=1 places the frames side by side, adding columns (rows are aligned on the index)

In [18]: df3.head()

Out[18]:
    userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initi
0  2094382   14       19      1999         11    male   266.0             0
1  1192601   14        2      1999         11  female     6.0             0
2  2083884   14       16      1999         11    male    13.0             0
3  1203168   14       25      1999         12  female    93.0             0
4  1733186   14        4      1999         12    male    82.0             0

5 rows × 25 columns
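Although this step is titled "Merge Data", the notebook uses pd.concat, which glues frames together by position or index and does not match rows on a shared key. The sketch below contrasts the two on toy frames; the names left, right and key are hypothetical and do not come from either CSV file.

```python
import pandas as pd

# Hypothetical toy frames sharing a 'key' column.
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})

# axis=0: rows are stacked; columns are the union (key, a, b),
# and cells with no source value become NaN.
stacked = pd.concat([left, right], axis=0)

# axis=1: frames sit side by side, aligned on the row index.
# Note the result has two columns both labelled 'key'.
side_by_side = pd.concat([left, right], axis=1)

# If one frame is shorter, the missing rows are padded with NaN.
padded = pd.concat([left, right.head(2)], axis=1)

# merge: rows are matched on the values of 'key' (inner join here).
joined = pd.merge(left, right, on="key", how="inner")

print(stacked.shape, side_by_side.shape, padded.shape, joined.shape)
```

This padding behaviour would explain why df3 keeps all 99003 rows of df, assuming startup_funding.csv has fewer rows than pseudo_facebook.csv.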
 
c. Sort Data

In [22]: df3.sort_values(by='StartupName',ascending=False)

Out[22]:
        userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_
   56  1264260   14       11      1999          7    male    18.0             0
 2230  2008255   21       21      1992         10  female    25.0             1
 1173  1073170   35        1      1978          1    male     7.0             0
  526  2143083   23        5      1990          9    male   101.0             0
  878  1992445   28       10      1985          5    male   255.0             0
  ...      ...  ...      ...       ...        ...     ...     ...           ...
98998  1268299   68        4      1945          4  female   541.0          2118
98999  1256153   18       12      1995          3  female    21.0          1968
99000  1195943   15       10      1998          5  female   111.0          2002
99001  1468023   23       11      1990          4  female   416.0          2560
99002  1397896   39       15      1974          5  female   397.0          2049

99003 rows × 25 columns
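Two details of sort_values are easy to miss in the output above: it returns a new frame rather than reordering the original, and missing values go to the end by default regardless of the sort direction. A small sketch on made-up data (only the column name StartupName is borrowed from the notebook):

```python
import numpy as np
import pandas as pd

# Toy frame with one missing name; values are illustrative only.
toy = pd.DataFrame({
    "StartupName": ["Zepo", np.nan, "Droom", "Moglix"],
    "userid": [1, 2, 3, 4],
})

# Descending sort; the NaN row is placed last by default.
desc = toy.sort_values(by="StartupName", ascending=False)

# na_position="first" moves the NaN row to the top instead.
nan_first = toy.sort_values(by="StartupName", ascending=False,
                            na_position="first")

print(desc.index.tolist())
print(nan_first.index.tolist())
print(toy.index.tolist())  # the original frame is left unchanged
```

To keep the sorted order, assign the result to a name, for example df3_sorted = df3.sort_values(...), where df3_sorted is a hypothetical variable not used in the notebook.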


 

d. Transposing Data

In [23]: result = df3.transpose()
         # transpose() swaps rows and columns

In [24]: result.head()

Out[24]:
                 0        1        2        3        4        5        6        7        8
userid     2094382  1192601  2083884  1203168  1733186  1524765  1136133  1680361  1365174
age             14       14       14       14       14       14       13       13       13
dob_day         19        2       16       25        4        1       14        4        1
dob_year      1999     1999     1999     1999     1999     1999     2000     2000     2000
dob_month       11       11       11       12       12       12        1        1        1

5 rows × 99003 columns
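As a rough illustration of what transposing does to a frame's layout, the sketch below flips a tiny frame with made-up rows. The original column labels become the row index, and with mixed numeric and text columns the transposed columns generally fall back to a generic object dtype.

```python
import pandas as pd

# Two toy rows with mixed numeric and text columns.
toy = pd.DataFrame({
    "userid": [2094382, 1192601],
    "age": [14, 14],
    "gender": ["male", "female"],
})

flipped = toy.T  # .T is shorthand for .transpose()

print(toy.shape, "->", flipped.shape)
print(flipped.index.tolist())  # former column labels
print(flipped.dtypes)          # inspect the resulting column dtypes
```

Because each transposed column now mixes values from several original columns, numeric operations on the flipped frame may need an explicit type conversion first.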


 

e. Shape and reshape Data

In [25]: df3.shape

Out[25]: (99003, 25)

In [36]: df.values.reshape((-1,1))

Out[36]: array([[2094382],
[14],
[19],
...,
[9530],
[0],
[2913]], dtype=object)
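The call reshape((-1,1)) above flattens the whole table into a single column, with -1 telling NumPy to infer the number of rows. A minimal sketch on a toy frame (columns a and b are hypothetical) shows the order in which the values are laid out:

```python
import pandas as pd

toy = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

values = toy.to_numpy()            # 2-D array, shape (3, 2)
column = values.reshape((-1, 1))   # -1 is inferred as 3 * 2 = 6 rows

# Values are read row by row: 1, 4, 2, 5, 3, 6.
print(column.ravel().tolist())

# Reshaping back to the original shape recovers the same layout.
back = column.reshape(toy.shape)
print((back == values).all())
```

Here to_numpy() is used in place of the notebook's .values; both return the underlying array for a plain frame like this one.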
