DWM Lab Workbook Sample
LAB WORKBOOK
22DSB3202 DATA WAREHOUSING & MINING (DWM)
LABORATORY WORKBOOK
STUDENT NAME:
REGN NO:
YEAR: III (2024-25)
SEMESTER: Odd semester
SECTION:
FACULTY: Dr. Shahin Fatima
TABLE OF CONTENTS
Code
import pandas as pd
OUTPUT
2. Dataset Creation
TIME SERIES DATA
1. Import Libraries:
   o Import the pandas library as pd to handle data manipulation and DataFrame creation.
   o Import the numpy library as np to generate random numbers.
2. Define Date Range:
   o Create a date range starting from '2024-01-01' with a total of 100 periods (days).
   o Use a daily frequency ('D') to generate the sequence of dates.
3. Generate Random Data:
   o Create a dictionary data where:
      - 'Date' is assigned the date range created in step 2.
      - 'Value' is assigned a numpy array of 100 random numbers drawn from a standard normal distribution and scaled by 100, giving a mean of 0 and a standard deviation of 100.
4. Create DataFrame:
   o Convert the dictionary data into a pandas DataFrame object named df.
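5. Save DataFrame to CSV:
   o Write df to a CSV file (time_series_dataset.csv in the code listing) with index=False so that the row index is not written to the file.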
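Steps 1 to 5 can be sketched as a small function followed by a save and read-back check. This is a minimal sketch rather than the workbook's own listing: the function name make_time_series, the seeded random generator, and the temporary output folder are assumptions added so that the run is repeatable and does not depend on a particular machine's directories.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def make_time_series(start="2024-01-01", periods=100, seed=0):
    """Build the Date/Value frame described in steps 2 to 4."""
    # Step 2: daily dates starting at `start`.
    date_range = pd.date_range(start=start, periods=periods, freq="D")
    # Step 3: standard-normal draws scaled by 100 (mean 0, std 100).
    # The seeded Generator is an assumption added for repeatability.
    rng = np.random.default_rng(seed)
    data = {"Date": date_range, "Value": rng.standard_normal(periods) * 100}
    # Step 4: dictionary -> DataFrame.
    return pd.DataFrame(data)


df = make_time_series()

# Step 5: save without the row index, then read the file back
# to confirm that the columns and dates survived the round trip.
csv_path = os.path.join(tempfile.gettempdir(), "time_series_dataset.csv")
df.to_csv(csv_path, index=False)
restored = pd.read_csv(csv_path, parse_dates=["Date"])

print(restored.shape)
print(restored["Date"].min(), restored["Date"].max())
```

Passing parse_dates=["Date"] when reading matters here: a CSV file stores dates as text, so without it the Date column would come back as plain strings.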
Code
CATEGORICAL DATA
import pandas as pd
# Define data
data = {
    # (column names and their lists of values go here)
}
# Create DataFrame
df = pd.DataFrame(data)
# Save to CSV
#df.to_csv('categorical_dataset.csv', index=False)
df.to_csv('C:/Users/lenovo/Documents/categorical_dataset.csv', index=False)
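The listing above does not show what goes inside the data dictionary. As an illustration only, a categorical dataset could be defined along the following lines. The column names Color and Size and all of their values are hypothetical, and the conversion to the pandas category dtype is an optional extra that the listing does not include.

```python
import pandas as pd

# Hypothetical columns and values, for illustration only.
data = {
    "Color": ["Red", "Blue", "Green", "Blue", "Red"],
    "Size": ["S", "M", "L", "M", "S"],
}

# Create DataFrame
df = pd.DataFrame(data)

# Store each column with the pandas 'category' dtype.
df = df.astype("category")

# Count how often each category occurs in one column.
color_counts = df["Color"].value_counts()

print(df.dtypes)
print(color_counts)
```

Saving this frame works the same way as in the listing, with df.to_csv(..., index=False). Note that the category dtype is not stored in a CSV file, so the columns are read back as ordinary text.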
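TIME SERIES DATA
import pandas as pd
import numpy as np
# Define date range: 100 daily periods starting 2024-01-01
date_range = pd.date_range(start='2024-01-01', periods=100, freq='D')
# Define data
data = {
    'Date': date_range,
    'Value': np.random.randn(100) * 100
}
# Create DataFrame
df = pd.DataFrame(data)
# Save to CSV
df.to_csv('C:/Users/lenovo/Documents/time_series_dataset.csv', index=False)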
OUTPUT
Code
import pandas as pd
OUTPUT
Code
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
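The imports above point to a decision tree classifier evaluated on a train/test split. The workbook's dataset and remaining code are not shown, so the following is only a minimal sketch on made-up data. The column names feature and label, the 25% test size, and random_state=0 are all assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one numeric feature and a binary label.
df = pd.DataFrame({
    "feature": list(range(1, 11)) + list(range(21, 31)),
    "label": [0] * 10 + [1] * 10,
})

X = df[["feature"]]  # 2-D feature table
y = df["label"]      # 1-D target

# Hold out 25% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the tree on the training rows only.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Scoring on rows the model has not seen gives a fairer estimate than scoring on the training rows, which an unpruned tree can usually fit perfectly.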
OUTPUT
Code
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
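The imports above suggest a Gaussian Naive Bayes classifier evaluated on a train/test split. Since the workbook's data is not shown, the following is a minimal sketch under stated assumptions: the columns f1, f2 and label, their values, the 25% test size, and the two new rows passed to predict are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: two numeric features and a binary label.
df = pd.DataFrame({
    "f1": list(range(1, 11)) + list(range(21, 31)),
    "f2": list(range(5, 15)) + list(range(40, 50)),
    "label": [0] * 10 + [1] * 10,
})

X = df[["f1", "f2"]]
y = df["label"]

# Hold out 25% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# GaussianNB models each feature as a normal distribution per class.
model = GaussianNB()
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# Classify two new, made-up rows and get their class probabilities.
new_rows = pd.DataFrame({"f1": [3, 27], "f2": [8, 45]})
predictions = model.predict(new_rows)
probabilities = model.predict_proba(new_rows)

print(f"Test accuracy: {accuracy:.2f}")
print(predictions)
```

predict_proba returns one row per input with one column per class, which is useful when a confidence value is needed alongside the predicted label.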
OUTPUT