
1 Pandas Basic I

article_read[(article_read.country == 'country_2')][['user_id', 'country', 'topic']].head()

Uploaded by

Arsyil Rohman

PANDAS BASIC I

How to Open Data Files in Pandas


There are two types of data structures in pandas: Series and DataFrames.

1. SERIES: A pandas Series is a one-dimensional data structure ("a one-dimensional ndarray") that can store values, and for every value it also holds an index label.
2. DATAFRAME: A pandas DataFrame is a two (or more) dimensional data structure, basically a table with rows and columns. The columns have names and the rows have indexes.
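As a minimal sketch of the two structures, the snippet below builds one of each by hand. The variable names and the three sample values are illustrative assumptions, not part of the slides.

```python
import pandas as pd

# A Series: one-dimensional values, each paired with an index label.
water_need = pd.Series([500, 300, 200], index=['elephant', 'tiger', 'zebra'])

# A DataFrame: a table whose columns have names and whose rows have an index.
zoo_sample = pd.DataFrame({
    'animal': ['elephant', 'tiger', 'zebra'],
    'water_need': [500, 300, 200],
})

print(water_need.ndim)           # number of dimensions of the Series
print(zoo_sample.ndim)           # number of dimensions of the DataFrame
print(list(zoo_sample.columns))  # the column names of the DataFrame
```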
Loading a .csv file into a pandas DataFrame
Data sources:
1. Make a .csv file
2. Download a .csv file directly from a Python script
3. Import a .csv file
4. Load the .csv data using the URL directly
1. Make .csv File
Create a .csv file with the following data:
animal,uniq_id,water_need
elephant,1001,500
elephant,1002,600
elephant,1003,550
tiger,1004,300
tiger,1005,320
tiger,1006,330
tiger,1007,290
tiger,1008,310
zebra,1009,200
zebra,1010,220
zebra,1011,240
zebra,1012,230
zebra,1013,220
zebra,1014,100
zebra,1015,80
lion,1016,420
lion,1017,600
lion,1018,500
lion,1019,390
kangaroo,1020,410
kangaroo,1021,430
kangaroo,1022,410

Save it as zoo.csv.
Loading the zoo.csv file into a pandas DataFrame
import pandas as pd
import numpy as np
pd.read_csv('zoo.csv', delimiter=',')
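The whole step can be sketched from Python as well: write the file, then read it back. This is only an illustration. It uses a shortened copy of the zoo data and a temporary folder (both assumptions made here so that no existing file is overwritten).

```python
import tempfile
from pathlib import Path

import pandas as pd

# First rows of the zoo data from the slide (shortened for brevity).
zoo_text = """animal,uniq_id,water_need
elephant,1001,500
elephant,1002,600
tiger,1004,300
zebra,1009,200
"""

# Write the file to a temporary folder (an assumption, so nothing is overwritten).
csv_path = Path(tempfile.mkdtemp()) / 'zoo.csv'
csv_path.write_text(zoo_text)

# The separator must match the file exactly: a bare comma, without spaces.
zoo = pd.read_csv(csv_path, delimiter=',')
print(zoo)
```

If the delimiter does not match the file, pandas cannot split the lines and every row ends up in a single column, so checking `zoo.columns` after loading is a quick sanity test.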
2. Download .CSV File with the wget module
You can download pandas_tutorial_read.csv from http://46.101.230.157/dilan/pandas_tutorial_read.csv to your server and then load it into Jupyter using the wget module:
➢ Install the wget module
➢ Import the wget module
➢ Write the address the data will be downloaded from
➢ Save the file on the same drive as the script
Check whether the pandas_tutorial_read.csv file was downloaded to your server.

See? It’s there…
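The download-and-check steps could look roughly like the sketch below. Note the substitution: it uses `urllib.request` from the standard library instead of the wget module named on the slide, so that it runs without installing anything. The helper names `download_csv` and `is_downloaded` are hypothetical.

```python
import urllib.request
from pathlib import Path


def download_csv(url, target='pandas_tutorial_read.csv'):
    """Download url to target and return the path of the saved file.

    Hypothetical helper: it uses urllib from the standard library,
    not the wget module named on the slide.
    """
    saved_path, _headers = urllib.request.urlretrieve(url, target)
    return Path(saved_path)


def is_downloaded(path):
    """Return True if the file exists and is not empty."""
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0
```

Calling `download_csv(...)` with the address of the file and then `is_downloaded('pandas_tutorial_read.csv')` covers the "check if it is there" step. With the third-party wget package installed, the corresponding call should be `wget.download(url)`.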


Loading the pandas_tutorial_read.csv File Into a Pandas DataFrame
pd.read_csv('pandas_tutorial_read.csv', delimiter=';')
Make a Data Header
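A header can be supplied at load time through the `names` parameter of `read_csv`. The sketch below assumes the file has no header row of its own and uses two made-up rows in place of the real file. The column names are the ones this deck uses for `article_read`.

```python
import io

import pandas as pd

# Two made-up rows in the same semicolon-separated, headerless layout
# (hypothetical values, not taken from the real file).
raw = io.StringIO(
    "2018-01-01 00:01:01;read;country_7;1001;SEO;Asia\n"
    "2018-01-01 00:03:20;read;country_2;1002;AdWords;Europe\n"
)

column_names = ['my_datetime', 'event', 'country', 'user_id', 'source', 'topic']

# Passing names= tells pandas the file has no header row of its own,
# so the first line is kept as data.
article_sample = pd.read_csv(raw, delimiter=';', names=column_names)
print(article_sample)
```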
3. Import .CSV File
➢ You can download the pandas_tutorial_read.csv data to your computer by clicking this link: http://46.101.230.157/dilan/pandas_tutorial_read.csv
➢ Save it to your computer on the same drive as the Python script
➢ Load the .csv file into a pandas DataFrame as in slide 9 before
4. Loading pandas_tutorial_read.csv data using the URL directly
The data won't be downloaded to your computer.
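Reading straight from an address might be sketched as follows. `read_csv` accepts a URL in place of a local path. The wrapper name `load_articles` is hypothetical, and the column names are the ones this deck uses for `article_read`.

```python
import pandas as pd


def load_articles(url):
    """Read the semicolon-separated file at url straight into a DataFrame."""
    column_names = ['my_datetime', 'event', 'country', 'user_id', 'source', 'topic']
    # read_csv accepts a URL in place of a local path; no local copy is kept.
    return pd.read_csv(url, delimiter=';', names=column_names)
```

Calling `load_articles(...)` with the address of pandas_tutorial_read.csv returns the DataFrame without leaving a file next to the script, which also means the data has to be fetched again on every run.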
Selecting data from a dataframe in pandas
1. Print the whole dataframe
2. Print a sample of your dataframe
3. Select specific columns of your dataframe
4. Filter for specific values in your dataframe
1. Print The Whole Dataframe
First, give the data a name so you can refer to it more easily:
>>> article_read = pd.read_csv('pandas_tutorial_read.csv', delimiter=';',
...     names=['my_datetime', 'event', 'country', 'user_id', 'source', 'topic'])
>>> article_read
2. Print a Sample of Your Dataframe
PRINT ONLY THE FIRST 5 LINES
>>> article_read.head()

PRINT THE LAST FEW LINES
>>> article_read.tail()

PRINT A FEW RANDOM LINES
>>> article_read.sample(5)
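The three sampling methods can be tried on a small made-up table, sketched below. `article_sample` is a hypothetical stand-in for `article_read`, and `random_state=0` is added only so the random pick is repeatable.

```python
import pandas as pd

# Hypothetical stand-in for article_read: ten made-up rows.
article_sample = pd.DataFrame({
    'user_id': range(1, 11),
    'country': ['country_%d' % (i % 3) for i in range(1, 11)],
})

first_rows = article_sample.head()       # first 5 rows by default
last_rows = article_sample.tail(3)       # last 3 rows
random_rows = article_sample.sample(5, random_state=0)  # 5 random rows, repeatable

print(first_rows)
print(last_rows)
print(random_rows)
```

Both `head()` and `tail()` take an optional row count and fall back to 5 when it is omitted.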
3. Select Specific Columns of Your Dataframe
PRINT THE 'country' AND THE 'user_id' COLUMNS ONLY
>>> article_read[['country', 'user_id']]

CHANGE THE ORDER OF THE COLUMNS
>>> article_read[['user_id', 'country']]
Sometimes (especially in predictive analytics projects), you want to get Series objects instead of DataFrames. You can get a Series using either of these two syntaxes (selecting only one column):
>>> article_read.user_id
>>> article_read['user_id']
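The difference between the bracket styles is easiest to see by checking the type of each result. A minimal sketch on two made-up rows (`article_sample` is a hypothetical stand-in for `article_read`):

```python
import pandas as pd

# Hypothetical stand-in for article_read with two made-up rows.
article_sample = pd.DataFrame({
    'country': ['country_2', 'country_7'],
    'user_id': [1001, 1002],
})

as_series = article_sample['user_id']        # single brackets: a Series
same_series = article_sample.user_id         # attribute access: the same Series
as_frame = article_sample[['user_id']]       # double brackets: a one-column DataFrame
reordered = article_sample[['user_id', 'country']]  # columns in the order listed

print(type(as_series).__name__)
print(type(as_frame).__name__)
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so the bracket form is the safer default.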
4. Filter for specific values in your dataframe
See a list of only the users who came from the 'SEO' source. In this case you have to filter for the 'SEO' value in the 'source' column:

>>> article_read[article_read.source == 'SEO']
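The filter works in two steps, which the sketch below spells out on made-up rows. `article_sample` is a hypothetical stand-in for `article_read`, and the source values other than 'SEO' are invented.

```python
import pandas as pd

# Hypothetical stand-in for article_read with four made-up rows.
article_sample = pd.DataFrame({
    'user_id': [1001, 1002, 1003, 1004],
    'source': ['SEO', 'AdWords', 'SEO', 'Reddit'],
})

# Step 1: the comparison yields a boolean Series, one True/False per row.
is_seo = article_sample.source == 'SEO'

# Step 2: indexing with that mask keeps only the rows marked True.
seo_users = article_sample[is_seo]

print(is_seo.tolist())
print(seo_users)
```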
Test Yourself!
Select the user_id, the country and the topic columns for the users who are from country_2! Print the first five rows only!
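One possible solution, sketched on made-up rows rather than the real file, chains the three steps: filter, select columns, take the head. `article_sample` and all its values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for article_read with eight made-up rows.
article_sample = pd.DataFrame({
    'my_datetime': ['2018-01-01 00:01:01'] * 8,
    'event': ['read'] * 8,
    'country': ['country_2', 'country_7', 'country_2', 'country_2',
                'country_5', 'country_2', 'country_2', 'country_2'],
    'user_id': [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008],
    'source': ['SEO', 'SEO', 'AdWords', 'Reddit', 'SEO', 'AdWords', 'SEO', 'SEO'],
    'topic': ['Asia', 'Europe', 'Asia', 'Africa', 'Europe', 'Asia', 'Europe', 'Asia'],
})

# Filter the rows, then pick the columns, then keep the first five rows.
from_country_2 = article_sample[article_sample.country == 'country_2']
result = from_country_2[['user_id', 'country', 'topic']].head()
print(result)
```

On the real data the same chain applies with `article_read` in place of `article_sample`.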
