GR Py 14

The document outlines an assignment for creating a DataFrame using Python's pandas library from a dictionary of lists. It details the steps to create the DataFrame and utilize methods like head(), tail(), info(), and describe() for data exploration and summarization. The conclusion emphasizes the importance of these techniques in managing and analyzing data effectively.

Uploaded by

Nikhil Jain

Project Based Learning-Python

Student Name Grishma Rupnar

SRN No 31242529

Roll No 110

Program Computer Engineering

Year Second Year

Division E

Subject Project Based Learning-Python (BTECCE23306)

Assignment No 14

Assignment - 14

Title/Problem Statement:
Write a program to create a DataFrame from a dictionary of lists. Use methods like head(), tail(), info(), and
describe() to explore and summarize the DataFrame.

Description:
In this exercise, you will write a program to create a DataFrame using a dictionary of lists in Python's
pandas library. You will utilize methods such as `head()` to view the first few rows,
`tail()` to see the last few rows, `info()` to get a summary of the DataFrame’s structure and data types, and
`describe()` to generate descriptive statistics. This exercise helps in understanding DataFrame creation and
basic data exploration techniques.

Theory:
Creating a DataFrame from a dictionary of lists is a common task in data analysis using the pandas library
in Python. A DataFrame is a two-dimensional labeled data structure with columns of potentially different
types, similar to a table in a database or an Excel spreadsheet. Here’s a step-by-step guide to achieve this:

1. Importing pandas: First, you need to import the pandas library, which provides the
DataFrame structure.

import pandas as pd

2. Creating a Dictionary of Lists: Construct a dictionary where the keys are column names and the
values are lists representing the data for each column.

3. Creating the DataFrame: Use the pandas `DataFrame` constructor to create a DataFrame from
the dictionary.

df = pd.DataFrame(data)

4. Exploring the DataFrame: Utilize various methods to explore and summarize the DataFrame:

• head(): Displays the first few rows of the DataFrame (default is 5).
print(df.head())

• tail(): Displays the last few rows of the DataFrame (default is 5).
print(df.tail())

• info(): Prints a concise summary of the DataFrame, including the index dtype and column dtypes,
non-null counts, and memory usage. Because info() prints the summary itself and returns None, it is
called without print().

df.info()

• describe(): Generates descriptive statistics for numeric columns, such as count, mean, standard
deviation, min, quartiles, and max values.

print(df.describe())

These methods are fundamental for initial data exploration, helping you understand the structure and basic
statistics of the DataFrame. This process is essential in the data analysis pipeline, providing insights and
guiding further data cleaning, processing, and analysis steps.

By following these steps, you can efficiently create, explore, and summarize data using pandas.
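The steps above can be sketched on a small made-up dataset. This is only an illustration under stated assumptions: the names `scores` and `scores_df` and their values are hypothetical and are not part of the assignment. It also shows that head() and tail() accept an optional row count, and that pandas raises a ValueError when the lists have different lengths.

```python
import pandas as pd

# Hypothetical data, not from the assignment: each key becomes a column name
# and each list becomes the values of that column.
scores = {
    "Student": ["Asha", "Ben", "Chen"],
    "Marks": [82, 91, 77],
}
scores_df = pd.DataFrame(scores)
print(scores_df.shape)  # (number of rows, number of columns)

# head() and tail() take an optional number of rows to show.
print(scores_df.head(2))
print(scores_df.tail(1))

# Lists of unequal length cannot be aligned into columns, so pandas refuses.
try:
    pd.DataFrame({"Student": ["Asha", "Ben"], "Marks": [82]})
except ValueError as err:
    print("Could not build DataFrame:", err)
```

Every list in the dictionary needs the same number of entries, since each position across the lists forms one row.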
Experimental Setup / Experimental Outcome:

# Write a program to create a DataFrame from a dictionary of lists. Use methods like head(), tail(),
# info(), and describe() to explore and summarize the DataFrame.

import pandas as pd

# Step 1: Create a DataFrame from a dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Hannah'],
    'Age': [24, 27, 22, 32, 29, 25, 28, 30],
    'Salary': [50000, 54000, 58000, 62000, 65000, 48000, 52000, 60000]
}

df = pd.DataFrame(data)

# Step 2: Display the first few rows of the DataFrame
print("First few rows of the DataFrame:")
print(df.head())

# Step 3: Display the last few rows of the DataFrame
print("\nLast few rows of the DataFrame:")
print(df.tail())

# Step 4: Display information about the DataFrame
# info() prints its summary directly and returns None, so it is not wrapped in print()
print("\nDataFrame information:")
df.info()

# Step 5: Display basic statistics of the DataFrame
print("\nBasic statistics of the DataFrame:")
print(df.describe())

OUTPUT:

Explanation:
1. Creating the Dictionary: The data dictionary contains three keys ('Name', 'Age', 'Salary'),
each mapped to a list of values.
2. Creating the DataFrame: The pd.DataFrame(data) function converts the dictionary into a
DataFrame.
3. Exploring the DataFrame:
o head(): Shows the first 5 rows of the DataFrame.
o tail(): Shows the last 5 rows of the DataFrame.
o info(): Provides a summary of the DataFrame, including column data types and non-null
counts.
o describe(): Computes and displays descriptive statistics for numeric columns, such as
count, mean, standard deviation, minimum, quartiles, and maximum.

This program provides a comprehensive way to initialize a DataFrame and analyze its basic structure and
statistics, making it easier to understand the dataset's characteristics.
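One detail of info() is worth a short sketch: it writes its summary to standard output and returns None, so wrapping it in print() adds a stray "None" line. The example below uses a hypothetical two-row DataFrame (the name `small_df` is an assumption for illustration) and shows how the `buf` parameter of info() can capture the summary as text instead.

```python
import io

import pandas as pd

# Hypothetical two-row frame, used only to keep the output short.
small_df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [24, 27]})

result = small_df.info()  # prints the summary as a side effect
print(result)             # info() has no return value

# To keep the summary as a string, pass a text buffer through buf.
buffer = io.StringIO()
small_df.info(buf=buffer)
summary_text = buffer.getvalue()
print(len(summary_text.splitlines()), "lines captured")
```

Capturing the text this way can be handy when the summary needs to go into a log file or report rather than the console.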

1. Creating a DataFrame: The program starts by defining a dictionary called data with keys representing
column names ('Name', 'Age', and 'Salary') and values as lists containing the data for each column. Then,
pd.DataFrame(data) creates a DataFrame from this dictionary.
2. Displaying the First Few Rows: Using df.head() shows the first 5 rows of the DataFrame, giving a quick
look at the initial entries.
3. Displaying the Last Few Rows: Using df.tail() shows the last 5 rows, useful for viewing recent or ending
data in the dataset.
4. Displaying DataFrame Information: df.info() displays metadata, such as column names, data types,
number of non-null entries, and memory usage. This is helpful for understanding the structure and format
of the data.
5. Displaying Basic Statistics: df.describe() calculates summary statistics (count, mean, std, min, max, etc.)
for numeric columns ('Age' and 'Salary' in this case), providing a quick statistical overview of the data's
distribution.
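Because describe() returns a DataFrame indexed by statistic name, individual values can be read back from it. The following is a minimal sketch that reuses the assignment's data; the variable name `stats` and the derived salary range are illustrative additions. It also shows `include="all"`, which asks describe() to summarize non-numeric columns such as 'Name' as well.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Hannah'],
    'Age': [24, 27, 22, 32, 29, 25, 28, 30],
    'Salary': [50000, 54000, 58000, 62000, 65000, 48000, 52000, 60000]
}
df = pd.DataFrame(data)

# describe() returns a DataFrame whose rows are count, mean, std, min,
# 25%, 50%, 75%, and max.
stats = df.describe()
print("Mean age:", stats.loc["mean", "Age"])

# The range is not reported directly, but it follows from min and max.
salary_range = stats.loc["max", "Salary"] - stats.loc["min", "Salary"]
print("Salary range:", salary_range)

# include="all" adds non-numeric columns to the summary.
full_stats = df.describe(include="all")
print(full_stats)
```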
Conclusion:

In conclusion, creating a DataFrame from a dictionary of lists and utilizing methods like `head()`, `tail()`,
`info()`, and `describe()` provides a powerful way to manage and analyze data in Python using the pandas library.
By constructing a DataFrame, you can effectively organize data into a structured format, and these methods offer
valuable tools for initial data exploration. `head()` and `tail()` allow you to quickly view the beginning and end of
your dataset, while `info()` provides a summary of the DataFrame's structure and data types. The `describe()`
method gives a statistical overview of numeric data, helping to understand data distribution and key metrics.
Together, these techniques facilitate a comprehensive understanding of your data, supporting effective analysis and
decision-making.

In summary, this program demonstrates how to create a DataFrame from a dictionary of lists and explore it using
pandas methods. By displaying the first and last few rows (head() and tail()), examining structural information
(info()), and summarizing numeric data (describe()), we gain a quick yet comprehensive view of the dataset. This
process helps identify data patterns, check for missing values, and understand the distribution of numeric columns.
Such initial exploration is essential for effectively preparing data for further analysis or visualization.
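The assignment's dataset has no missing entries, so the check for missing values can only be illustrated with made-up data. The sketch below assumes a hypothetical DataFrame named `people` with one missing age. It counts missing entries per column with isna().sum(), which gives the same information as comparing the non-null counts reported by info() against the number of rows.

```python
import pandas as pd

# Hypothetical data with one missing age (None becomes NaN in a numeric column).
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [24, None, 22],
})

# Number of missing entries in each column.
missing_per_column = people.isna().sum()
print(missing_per_column)

# count() skips missing values, which is why the 'count' row of describe()
# can be smaller than the number of rows.
non_missing_ages = people["Age"].count()
print("Non-missing ages:", non_missing_ages, "of", len(people))
```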