The document discusses importing datasets into Jupyter Notebook using Pandas and calculating measures of central tendency (mean, median, mode) from the datasets using either the Statistics or Numpy library. It provides code examples to import datasets from CSV files into a dataframe, display the dataframe, and calculate the mean, median, and mode of columns in the dataframe.
Python Codes
Importing dataset into Jupyter Notebook
• Before you can import a dataset, you must import the Pandas library/package.
• Import Pandas:
import pandas as pd
# read a dataset from the tip.csv file and store it in a dataframe named df
df = pd.read_csv('tip.csv')
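# three ways to display the content of the dataset
# (df on its own displays the dataframe only when it is the last line of a cell)
df
display(df)
print(df)
# to load a dataset whose values are separated by a character other than a comma (here, a semicolon)
# read a dataset from the revenue-profit.csv file and store it in a dataframe named df
df = pd.read_csv('revenue-profit.csv', sep=';')
display(df)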
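The effect of the sep argument can be sketched without any file on disk. The example below is only an illustration: the column names and values are made up, and io.StringIO stands in for a file such as revenue-profit.csv.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated content; these values are invented for the example.
csv_text = "month;revenue;profit\nJan;1000;200\nFeb;1500;350\nMar;1200;250\n"

# With sep=';' each value lands in its own column.
df_semicolon = pd.read_csv(io.StringIO(csv_text), sep=';')
print(df_semicolon)

# With the default separator (a comma) every line is read as one single column.
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default.shape)
```

If a dataframe shows only one column after loading, checking the separator used in the file is a reasonable first step.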
Additional codes:
# to show all rows and columns of the dataset
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
# to know the data type of each column
df.info()
# to see summary statistics for the columns that contain numerical values (the columns you can use for calculation)
df.describe()
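As a small sketch of what these two calls report, the example below builds a dataframe by hand instead of reading tip.csv. The column names and values are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical data: one text column and two numerical columns.
df_example = pd.DataFrame({
    "day": ["Sat", "Sun", "Sun"],
    "total_bill": [10.5, 20.0, 15.5],
    "tip": [1.5, 3.0, 2.5],
})

# info() prints each column's name, non-null count and data type.
df_example.info()

# describe() returns count, mean, std, min, quartiles and max.
# By default it keeps only the numerical columns when the dataframe has mixed types.
summary = df_example.describe()
print(summary)
```

Here the text column day is left out of the summary, which is how describe() helps identify the columns suitable for mean and median.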
(By: Madam Azimah, ICT Department, CFSIIUM)
Calculating Measures of Central Tendency (Mean, Median, Mode)
• Before you can calculate mean/median/mode, you must import the Statistics or Numpy library/package.
• Import Statistics or Numpy:
import statistics as st
import numpy as np
• Choose whether you want to use Statistics or Numpy.
Using Statistics:
mean = st.mean(df.column_name)
median = st.median(df.column_name)
mode = st.mode(df.column_name)
Using Numpy: (Note: mode does not exist in Numpy)
mean = np.mean(df.column_name)
median = np.median(df.column_name)
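The two approaches can be compared side by side on a small example. This is a sketch only: the tip column and its values are made up, and converting the column to a plain list before passing it to Statistics is a cautious choice made here, not something the handout requires. Because Numpy has no mode function, the last lines show the dataframe column's own mode() method as one possible alternative.

```python
import statistics as st

import numpy as np
import pandas as pd

# Hypothetical column of tip amounts.
df_tips = pd.DataFrame({"tip": [2.0, 3.0, 3.0, 5.0, 7.0]})
tips = df_tips["tip"].tolist()  # plain Python floats

# Using Statistics
st_mean = st.mean(tips)
st_median = st.median(tips)
st_mode = st.mode(tips)

# Using Numpy (no mode function available)
np_mean = np.mean(df_tips["tip"])
np_median = np.median(df_tips["tip"])

# Alternative for the mode: pandas returns a Series, since there can be more than one mode.
pd_mode = df_tips["tip"].mode()

print(st_mean, st_median, st_mode)
print(np_mean, np_median)
print(pd_mode.tolist())
```

Square-bracket access such as df_tips["tip"] is used instead of df_tips.tip because it also works when a column name contains spaces or other characters that are not valid in a Python name.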