
Facebook - Jupyter Notebook

The document discusses loading and manipulating data from CSV files into Pandas dataframes. It shows how to import CSV data, view the first few rows, check column names, create subsets of columns, and merge/concatenate multiple dataframes.

Uploaded by

xifavo8319


In [1]: import pandas as pd

In [5]: df=pd.read_csv(r"C:\Users\Admin\eclipse\Downloads\pseudo_facebook.csv\pseudo_f

In [6]: df.head()

Out[6]:
    userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initi
0  2094382   14       19      1999         11    male   266.0             0
1  1192601   14        2      1999         11  female     6.0             0
2  2083884   14       16      1999         11    male    13.0             0
3  1203168   14       25      1999         12  female    93.0             0
4  1733186   14        4      1999         12    male    82.0             0

 

In [38]: df.columns

Out[38]: Index(['userid', 'age', 'dob_day', 'dob_year', 'dob_month', 'gender', 'tenure',
       'friend_count', 'friendships_initiated', 'likes', 'likes_received',
       'mobile_likes', 'mobile_likes_received', 'www_likes',
       'www_likes_received'],
      dtype='object')
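
The load-and-inspect steps above can be tried without the original file. The following is a minimal sketch, assuming a tiny in-memory CSV: the three rows are copied from the head() output above, while the names `sample_csv`, `sample`, `first_two`, and `column_names` are hypothetical and not part of the notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for pseudo_facebook.csv: three rows, four columns.
sample_csv = io.StringIO(
    "userid,age,gender,tenure\n"
    "2094382,14,male,266.0\n"
    "1192601,14,female,6.0\n"
    "2083884,14,male,13.0\n"
)

sample = pd.read_csv(sample_csv)       # read_csv accepts a path or a file-like object
first_two = sample.head(2)             # head(n) returns the first n rows
column_names = list(sample.columns)    # column labels as a plain Python list

print(first_two)
print(column_names)
```

Reading from `io.StringIO` keeps the sketch independent of any local path; with the real file, only the argument to `pd.read_csv` would change.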

a. Create data subsets

In [39]: sub1=df['mobile_likes_received']

In [40]: sub1

Out[40]: 0 0
1 0
2 0
3 0
4 0
...
98998 11887
98999 10592
99000 11462
99001 5760
99002 9530
Name: mobile_likes_received, Length: 99003, dtype: int64

In [10]: subsets = df[['dob_day', 'likes', 'gender', 'mobile_likes', 'friendships_initiated']]

In [11]: subsets

Out[11]:
       dob_day  likes  gender  mobile_likes  friendships_initiated
0           19      0    male             0                      0
1            2      0  female             0                      0
2           16      0    male             0                      0
3           25      0  female             0                      0
4            4      0    male             0                      0
...        ...    ...     ...           ...                    ...
98998        4   3996  female          3505                    341
98999       12   4401  female          4399                   1720
99000       10  11959  female         11959                   1524
99001       11   4506  female          4506                    185
99002       15   9410  female          9410                    768

99003 rows × 5 columns
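
The two subsetting cells above differ in what they return: indexing with a single label gives a Series, while indexing with a list of labels gives a DataFrame. A small sketch of that difference, assuming a made-up three-row frame (`demo`, `one_col`, and `two_cols` are hypothetical names):

```python
import pandas as pd

# Hypothetical miniature of the Facebook frame.
demo = pd.DataFrame({
    "dob_day": [19, 2, 16],
    "likes": [0, 0, 5],
    "gender": ["male", "female", "male"],
})

one_col = demo["likes"]                   # single label -> Series
two_cols = demo[["dob_day", "gender"]]    # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__)
```

The double brackets in `demo[[...]]` are a list inside the indexing operator, which is why the closing `]]` is needed.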

b. Merge Data

In [12]: df2=pd.read_csv(r"C:\Users\Admin\Desktop\ml_data\startup_funding.csv")

In [14]: df2.head(10)

Out[14]:
   SNo  Date        StartupName       IndustryVertical   SubVertical                                    CityLocation  InvestorsName
0  0    01/08/2017  TouchKin          Technology         Predictive Care Platform                       Bangalore     Kae Capital
1  1    02/08/2017  Ethinos           Technology         Digital Marketing Agency                       Mumbai        Triton Investment Advisors
2  2    02/08/2017  Leverage Edu      Consumer Internet  Online platform for Higher Education Services  New Delhi     Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...
3  3    02/08/2017  Zepo              Consumer Internet  DIY Ecommerce platform                         Mumbai        Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...
4  4    02/08/2017  Click2Clinic      Consumer Internet  healthcare service aggregator                  Hyderabad     Narottam Thudi, Shireesh Palle
5  5    01/07/2017  Billion Loans     Consumer Internet  Peer to Peer Lending platform                  Bangalore     Reliance Corporate Advisory Services Ltd
6  6    03/07/2017  Ecolibriumenergy  Technology         Energy management solutions provider           Ahmedabad     Infuse Ventures, JLL
7  7    04/07/2017  Droom             eCommerce          Online marketplace for automobiles             Gurgaon       Asset Management (Asia) Ltd, Digital Garage Inc
8  8    05/07/2017  Jumbotail         eCommerce          online marketplace for food and grocery        Bangalore     Kalaari Capital, Nexus India Capital Advisors
9  9    05/07/2017  Moglix            eCommerce          B2B marketplace for Industrial products        Noida         International Finance Corporation, Rocketship,...

 

In [17]: df3 = pd.concat([df, df2], axis=1)
         # axis=0 appends rows (stacks the frames vertically)
         # axis=1 appends columns (places the frames side by side, aligned on the index)

In [18]: df3.head()

Out[18]:
    userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initi
0  2094382   14       19      1999         11    male   266.0             0
1  1192601   14        2      1999         11  female     6.0             0
2  2083884   14       16      1999         11    male    13.0             0
3  1203168   14       25      1999         12  female    93.0             0
4  1733186   14        4      1999         12    male    82.0             0

5 rows × 25 columns
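
Note that pd.concat with axis=1 lines rows up by index label rather than by a shared key column, so when the two frames have different lengths the shorter one is padded with NaN. A minimal sketch of both axis settings, assuming two made-up frames (`users` and `startups` are hypothetical stand-ins for df and df2):

```python
import pandas as pd

# Hypothetical stand-ins: three "user" rows and two "startup" rows.
users = pd.DataFrame({"userid": [1, 2, 3], "age": [14, 15, 16]})
startups = pd.DataFrame({"SNo": [0, 1], "StartupName": ["A", "B"]})

# axis=1: columns are added; rows are matched on the index (0, 1, 2).
side_by_side = pd.concat([users, startups], axis=1)

# axis=0: rows are appended; columns missing from one frame are filled with NaN.
stacked = pd.concat([users, startups], axis=0)

print(side_by_side)
print(stacked)
```

In `side_by_side`, index 2 has no match in `startups`, so its `SNo` and `StartupName` values are NaN. The same effect would explain why df3 above keeps all 99003 rows of the Facebook frame.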
 
c. Sort Data

In [22]: df3.sort_values(by='StartupName',ascending=False)

Out[22]:
        userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_
56     1264260   14       11      1999          7    male    18.0             0
2230   2008255   21       21      1992         10  female    25.0             1
1173   1073170   35        1      1978          1    male     7.0             0
526    2143083   23        5      1990          9    male   101.0             0
878    1992445   28       10      1985          5    male   255.0             0
...        ...  ...      ...       ...        ...     ...     ...           ...
98998  1268299   68        4      1945          4  female   541.0          2118
98999  1256153   18       12      1995          3  female    21.0          1968
99000  1195943   15       10      1998          5  female   111.0          2002
99001  1468023   23       11      1990          4  female   416.0          2560
99002  1397896   39       15      1974          5  female   397.0          2049

99003 rows × 25 columns
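
In the output above, the rows at the bottom keep their original index order (98998 to 99002). That is consistent with sort_values placing rows whose sort key is missing at the end by default (`na_position='last'`). A small sketch of that behaviour, assuming a made-up four-row frame (`frame` and `ordered` are hypothetical names):

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which two rows have no StartupName.
frame = pd.DataFrame({
    "userid": [10, 11, 12, 13],
    "StartupName": ["Zepo", np.nan, "Droom", np.nan],
})

# Descending sort on a text column; rows with NaN keys are placed last by default.
ordered = frame.sort_values(by="StartupName", ascending=False)

print(ordered)
```

sort_values returns a new frame and leaves the original untouched, so the result has to be assigned (as here) if it is needed later.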

 

d. Transposing Data

In [23]: result = df3.transpose()  # transpose() turns rows into columns and vice versa

In [24]: result.head()

Out[24]:
                 0        1        2        3        4        5        6        7        8
userid     2094382  1192601  2083884  1203168  1733186  1524765  1136133  1680361  1365174
age             14       14       14       14       14       14       13       13       13
dob_day         19        2       16       25        4        1       14        4        1
dob_year      1999     1999     1999     1999     1999     1999     2000     2000     2000
dob_month       11       11       11       12       12       12        1        1        1

5 rows × 99003 columns
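
The effect of transpose() is easier to see on a frame small enough to print in full. A minimal sketch, assuming a two-row frame built from values in the head() output (`small` and `flipped` are hypothetical names):

```python
import pandas as pd

# Two rows, three columns, values taken from the first rows shown earlier.
small = pd.DataFrame({
    "userid": [2094382, 1192601],
    "age": [14, 14],
    "gender": ["male", "female"],
})

flipped = small.transpose()   # equivalent to small.T

print(small.shape)     # rows x columns before
print(flipped.shape)   # columns x rows after
print(flipped)
```

After transposing, the old column labels become the index and the old index labels (0, 1) become the column labels, which matches the 0 to 8 column headers in the output above.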

 

e. Shape and reshape Data

In [25]: df3.shape

Out[25]: (99003, 25)

In [36]: df.values.reshape((-1,1))

Out[36]: array([[2094382],
[14],
[19],
...,
[9530],
[0],
[2913]], dtype=object)
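
The reshape((-1, 1)) call above flattens the whole table into a single column, reading the values row by row; the -1 asks NumPy to work out the number of rows itself. A small sketch on a made-up all-integer frame (`tiny` and `flat` are hypothetical names):

```python
import pandas as pd

# Hypothetical 2 x 3 frame of integers.
tiny = pd.DataFrame({"userid": [1, 2], "age": [14, 15], "likes": [0, 7]})

# .values gives the underlying NumPy array; (-1, 1) means "one column,
# as many rows as needed", so 2 x 3 = 6 values become a 6 x 1 array.
flat = tiny.values.reshape((-1, 1))

print(tiny.shape)
print(flat.shape)
print(flat[:, 0].tolist())
```

The values come out in row-major order (all of row 0, then all of row 1), which is why the notebook output starts with the userid, age, and dob_day of the first user. The dtype=object in that output arises because the real frame mixes numeric and text columns.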

In [ ]:
