2 - 2 Pandas Series

The document provides an overview of using Pandas Series to analyze the population of the G7 countries. It explains how to create Series from lists and dictionaries, access elements, and perform operations such as conditional selection and aggregation. The document also highlights the similarities between Series and other data structures like numpy arrays and Python dictionaries.

Uploaded by

bisratengda613
Copyright
© All Rights Reserved
    import pandas as pd
    import numpy as np

Pandas Series

We'll start by analyzing "The Group of Seven" (G7), a political forum formed by Canada, France, Germany, Italy, Japan, the United Kingdom and the United States. We'll begin with population, and for that we'll use a pandas.Series object.

    # In millions
    g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])
    g7_pop

    0     35.467
    1     63.951
    2     80.940
    3     60.665
    4    127.061
    5     64.511
    6    318.523
    dtype: float64

Someone might not know we're representing population in millions of inhabitants. A Series can have a name, to better document its purpose:

    g7_pop.name = 'G7 Population in millions'
    g7_pop

    0     35.467
    1     63.951
    2     80.940
    3     60.665
    4    127.061
    5     64.511
    6    318.523
    Name: G7 Population in millions, dtype: float64

Series are pretty similar to numpy arrays:

    g7_pop.dtype

    dtype('float64')

    g7_pop.values

    array([ 35.467,  63.951,  80.94 ,  60.665, 127.061,  64.511, 318.523])

They're actually backed by numpy arrays:

    type(g7_pop.values)

    numpy.ndarray

They look like simple Python lists or numpy arrays, but they're actually more similar to Python dicts. A Series has an index, which is similar to the automatic index assigned to Python's lists:

    g7_pop

    0     35.467
    1     63.951
    2     80.940
    3     60.665
    4    127.061
    5     64.511
    6    318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop[0]

    35.467

    g7_pop[1]

    63.951

    g7_pop.index

    RangeIndex(start=0, stop=7, step=1)

    l = ['a', 'b', 'c']

But, in contrast to lists, we can explicitly define the index:

    g7_pop.index = [
        'Canada',
        'France',
        'Germany',
        'Italy',
        'Japan',
        'United Kingdom',
        'United States',
    ]
    g7_pop

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

Compare it with the following table:

                      (Expressed in millions)
    Canada             35.467
    France             63.951
    Germany            80.94
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523

We can say that Series look like "ordered dictionaries". We can actually create Series out of dictionaries:

    pd.Series({
        'Canada': 35.467,
        'France': 63.951,
        'Germany': 80.94,
        'Italy': 60.665,
        'Japan': 127.061,
        'United Kingdom': 64.511,
        'United States': 318.523
    }, name='G7 Population in millions')

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    pd.Series(
        [35.467, 63.951, 80.94, 60.665, 127.061, 64.511, 318.523],
        index=['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom',
               'United States'],
        name='G7 Population in millions')

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

You can also create Series out of other Series, specifying indexes. Labels that are missing from the original Series (here, Spain) get NaN:

    pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])

    France     63.951
    Germany    80.940
    Italy      60.665
    Spain         NaN
    Name: G7 Population in millions, dtype: float64

    g7_pop

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop['Canada']

    35.467

    g7_pop['Japan']

    127.061

Numeric positions can also be used, with the iloc attribute:

    g7_pop.iloc[0]

    35.467

    g7_pop.iloc[-1]

    318.523

Selecting multiple elements at once:

    g7_pop[['Italy', 'France']]

    Italy     60.665
    France    63.951
    Name: G7 Population in millions, dtype: float64

(The result is another Series.)

    g7_pop.iloc[[0, 1]]

    Canada    35.467
    France    63.951
    Name: G7 Population in millions, dtype: float64

Slicing also works, but note an important difference: in Pandas, when slicing by label, the upper limit is also included:

    g7_pop['Canada': 'Italy']

    Canada     35.467
    France     63.951
    Germany    80.940
    Italy      60.665
    Name: G7 Population in millions, dtype: float64

Conditional selection (boolean arrays)

The same boolean array techniques we saw applied to numpy arrays can be used for Pandas Series:

    g7_pop

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop > 70

    Canada            False
    France            False
    Germany            True
    Italy             False
    Japan              True
    United Kingdom    False
    United States      True
    Name: G7 Population in millions, dtype: bool

    g7_pop[g7_pop > 70]

    Germany           80.940
    Japan            127.061
    United States    318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop.mean()

    107.30257142857144

    g7_pop[g7_pop > g7_pop.mean()]

    Japan            127.061
    United States    318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop.std()

    97.24996987121581

The operators used to combine boolean arrays are:

    # ~ not
    # | or
    # & and

    g7_pop[(g7_pop > g7_pop.mean() - g7_pop.std() / 2) | (g7_pop > g7_pop.mean() + …

    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

Operations and methods

Series also support vectorized operations and aggregation functions, as numpy arrays do:

    g7_pop

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop.mean()

    107.30257142857144

    np.log(g7_pop)

    Canada            3.568603
    France            4.158117
    Germany           4.393708
    Italy             4.105367
    Japan             4.844667
    United Kingdom    4.166836
    United States     5.763695
    Name: G7 Population in millions, dtype: float64

    g7_pop['France': 'Italy'].mean()

    68.51866666666666

Boolean arrays

(Work in the same way as numpy)

    g7_pop

    Canada             35.467
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop > 80

    Canada            False
    France            False
    Germany            True
    Italy             False
    Japan              True
    United Kingdom    False
    United States      True
    Name: G7 Population in millions, dtype: bool

    g7_pop[g7_pop > 80]

    Germany           80.940
    Japan            127.061
    United States    318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop[(g7_pop > 80) | (g7_pop < 40)]

    Canada            35.467
    Germany           80.940
    Japan            127.061
    United States    318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop[(g7_pop > 80) & (g7_pop < 200)]

    Germany     80.940
    Japan      127.061
    Name: G7 Population in millions, dtype: float64

    g7_pop['Canada'] = 40.5
    g7_pop

    Canada             40.500
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     318.523
    Name: G7 Population in millions, dtype: float64

    g7_pop.iloc[-1] = 500
    g7_pop

    Canada             40.500
    France             63.951
    Germany            80.940
    Italy              60.665
    Japan             127.061
    United Kingdom     64.511
    United States     500.000
    Name: G7 Population in millions, dtype: float64

    g7_pop[g7_pop < 70]

    Canada            40.500
    France            63.951
    Italy             60.665
    United Kingdom    64.511
    Name: G7 Population in millions, dtype: float64

    g7_pop[g7_pop < 70] = 99.99
    g7_pop

    Canada             99.990
    France             99.990
    Germany            80.940
    Italy              99.990
    Japan             127.061
    United Kingdom     99.990
    United States     500.000
    Name: G7 Population in millions, dtype: float64
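The slicing, boolean-mask, and assignment rules used in this notebook can be checked together in one short sketch. It rebuilds the G7 Series from the original population values. Two things here are assumptions of the sketch and not part of the notebook: the helper names (by_label, by_position, mid_sized, updated) are introduced only for illustration, and the assignment is done on a copy so that g7_pop itself is not modified.

```python
import pandas as pd

# Same data as the notebook, before any values were modified.
g7_pop = pd.Series(
    [35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523],
    index=['Canada', 'France', 'Germany', 'Italy', 'Japan',
           'United Kingdom', 'United States'],
    name='G7 Population in millions',
)

# Label-based slicing includes the upper limit ('Italy' is part of the result)...
by_label = g7_pop['Canada':'Italy']
# ...while position-based slicing with iloc excludes it, like Python lists.
by_position = g7_pop.iloc[0:3]

# Boolean masks combine with & (and), | (or) and ~ (not).
# Each comparison needs its own parentheses.
mid_sized = g7_pop[(g7_pop > 60) & ~(g7_pop > 100)]

# Assign through a mask on a copy, so g7_pop keeps its original values.
updated = g7_pop.copy()
updated[updated < 70] = 99.99

print(len(by_label), len(by_position))
print(list(mid_sized.index))
print(updated['Canada'], g7_pop['Canada'])
```

Working on a copy is a design choice for this sketch: the notebook assigns directly to g7_pop, which is fine for exploration but means every later cell sees the modified values.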
