Python_assignment pandas

The document provides an overview of the Pandas library in Python, detailing its data structures such as Series and DataFrame, along with sample declarations. It also explains various types of indexes, methods for data manipulation, and how to perform specific data analysis tasks, including displaying highest sales and total sales in a specific region. Additionally, it includes code snippets demonstrating the use of these features.

Uploaded by

Glauben Caduan
Copyright: © All Rights Reserved

John Glauben J. Caduan
01/25/2025

Fundamentals of Analytics Assignment

1. What is Pandas?

Pandas is a Python library used for working with data sets. It has
functions for analyzing, cleaning, exploring, and manipulating data.

2. Types of data structures in Pandas, with sample declarations

a. Series: A one-dimensional labeled array capable of holding data of any type.

import pandas as pd
series = pd.Series([1, 2, 3, 4])

b. DataFrame: A two-dimensional labeled data structure with columns of potentially different types.

dataframe = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
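As a slightly fuller sketch of the two declarations above, the following builds a Series with explicit labels and a DataFrame whose columns hold different types. The variable names, labels, and values are made up for illustration and are not part of the assignment data.

```python
import pandas as pd

# Series with explicit labels instead of the default 0..n-1 index
# (labels and values are illustrative)
prices = pd.Series([1.5, 2.0, 3.25], index=['apple', 'banana', 'cherry'])
print(prices['banana'])    # look up a value by its label

# DataFrame whose columns hold different types
people = pd.DataFrame({
    'name': ['Ana', 'Ben'],      # strings
    'age': [31, 27],             # integers
    'height_m': [1.62, 1.80],    # floats
})
print(people.dtypes)       # one dtype per column
print(people['age'])       # selecting a single column returns a Series
```

Selecting one column of a DataFrame gives back a Series, which is why the two structures share many methods.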

3. Types of indexes in Pandas and how to use them

a. Default Index: Sequential integers starting from 0.

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]})
# Default index: 0, 1, 2

b. Custom Index: User-defined index values.

df = pd.DataFrame({'A': [1, 2, 3]}, index=['a', 'b', 'c'])

c. MultiIndex: Hierarchical indexing with multiple levels.

multi_index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1)])
df = pd.DataFrame({'Value': [10, 20, 30]}, index=multi_index)

d. DatetimeIndex: Index based on datetime objects.

dates = pd.date_range('2023-01-01', periods=3)
df = pd.DataFrame({'A': [1, 2, 3]}, index=dates)

e. CategoricalIndex: Index with categorical data.

# Passing a Categorical as the index produces a CategoricalIndex
categories = pd.Categorical(['low', 'medium', 'high'])
df = pd.DataFrame({'Value': [10, 20, 30]}, index=categories)
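The declarations above show how each index is created. As a rough sketch of how they might be used for selection, the example below rebuilds the same frames under separate, hypothetical variable names (`df_custom`, `df_multi`, `df_dates`, `df_cat`) and looks up rows with `.loc` and `.iloc`. The level names `group` and `item` are also assumptions added for readability.

```python
import pandas as pd

# Custom index: select a row by label with .loc, or by position with .iloc
df_custom = pd.DataFrame({'A': [1, 2, 3]}, index=['a', 'b', 'c'])
row_b = df_custom.loc['b']            # row labeled 'b'
first_row = df_custom.iloc[0]         # first row by position

# MultiIndex: an outer-level label selects every row under it
multi_index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1)], names=['group', 'item'])
df_multi = pd.DataFrame({'Value': [10, 20, 30]}, index=multi_index)
group_a = df_multi.loc['A']                 # rows ('A', 1) and ('A', 2)
single = df_multi.loc[('B', 1), 'Value']    # one cell via the full key

# DatetimeIndex: date strings work for label-based slicing
# (both endpoints of the slice are included)
dates = pd.date_range('2023-01-01', periods=3)
df_dates = pd.DataFrame({'A': [1, 2, 3]}, index=dates)
first_two_days = df_dates.loc['2023-01-01':'2023-01-02']

# CategoricalIndex: built explicitly here; label lookups work as usual
cat_index = pd.CategoricalIndex(['low', 'medium', 'high'])
df_cat = pd.DataFrame({'Value': [10, 20, 30]}, index=cat_index)
medium_value = df_cat.loc['medium', 'Value']
```

In each case `.loc` selects by label, whatever the index type, while `.iloc` always selects by integer position.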

4. Enumerate the Series and DataFrame methods


Data Manipulation
• add(), sub(), mul(), div()
• append() (removed in pandas 2.0; use pd.concat() instead)
• combine()
• update()
• replace()
• map()
• apply()
• drop()
• fillna(), bfill(), ffill()
Index & Access
• at[], iat[]
• loc[], iloc[]
• get()
Aggregation & Statistics
• mean(), sum(), prod()
• min(), max(), idxmin(), idxmax()
• median(), mode()
• std(), var()
• cumsum(), cumprod(), cummin(), cummax()
Sorting & Filtering
• sort_values()
• sort_index()
• where(), mask()
Conversion
• to_frame() (Series only)
• astype()
• to_list() (Series only)
Other
• describe()
• value_counts()
• unique() (Series only)
• isna(), notna()

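To make the list more concrete, here is a minimal sketch that applies one or two methods from each group to a small Series. The sample values and variable names are invented for illustration. Note that `append()` was removed in pandas 2.0, so the last line uses `pd.concat()` in its place.

```python
import pandas as pd

s = pd.Series([3.0, None, 1.0, 3.0])   # illustrative data with one missing value

# Data manipulation
filled = s.fillna(0)                    # replace missing values with 0
doubled = filled.mul(2)                 # element-wise multiplication
labels = filled.map({0.0: 'none', 1.0: 'low', 3.0: 'high'})

# Index & access
first = filled.iloc[0]                  # value at position 0

# Aggregation & statistics
total = filled.sum()
largest_at = filled.idxmax()            # label of the first maximum

# Sorting & filtering
ordered = filled.sort_values()
only_positive = filled.where(filled > 0)   # non-matching entries become NaN

# Conversion
as_frame = filled.to_frame(name='value')
as_ints = filled.astype(int).to_list()

# Other
counts = filled.value_counts()
n_missing = s.isna().sum()

# pd.concat() stacks objects; it takes the place of the removed append()
combined = pd.concat([filled, pd.Series([9.0])], ignore_index=True)
```

Most of these calls return a new object rather than modifying `s`, so results are assigned to new names here.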
5. Given the data below, answer the following:
a. How to display the highest sales
b. How to display the total sales in the East region

import pandas as pd

# Create the DataFrame
data = {
    'Name': ['William', 'Emma', 'Sofia', 'Markus', 'Edward'],
    'Region': ['East', 'North', 'East', 'South', 'West'],
    'Sales': [50000, 52000, 90000, 34000, 42000],
    'Expense': [42000, 43000, 50000, 44000, 38000],
}
df = pd.DataFrame(data)

# a. Display the highest sales (the full row with the largest Sales value)
highest_sales = df.loc[df['Sales'].idxmax()]
print("Highest Sales:")
print(highest_sales)

# b. Display the total sales in the East region
total_sales_east = df.loc[df['Region'] == 'East', 'Sales'].sum()
print("\nTotal Sales in East Region:", total_sales_east)
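The same two questions can be answered in other ways. The sketch below is one possible alternative, not the assignment's required method: `max()` returns only the number, `nlargest()` returns the top row as a DataFrame, and `groupby()` computes the total for every region at once. The variable names are illustrative.

```python
import pandas as pd

data = {
    'Name': ['William', 'Emma', 'Sofia', 'Markus', 'Edward'],
    'Region': ['East', 'North', 'East', 'South', 'West'],
    'Sales': [50000, 52000, 90000, 34000, 42000],
    'Expense': [42000, 43000, 50000, 44000, 38000],
}
df = pd.DataFrame(data)

# a. Highest sales as a single number, and the row it belongs to
highest_value = df['Sales'].max()
top_row = df.nlargest(1, 'Sales')       # one-row DataFrame

# b. Total sales per region, then pick out East
sales_by_region = df.groupby('Region')['Sales'].sum()
east_total = sales_by_region['East']

print("Highest sales value:", highest_value)
print(top_row)
print("Total sales in East region:", east_total)
```

Grouping is the more general choice when totals for several regions are needed, while the boolean filter is simpler for a single region.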
