Data Profiling
• order_id,order_date,customer_id,city,province,product_id,brand,quantity,item_price
# city column
length_city = len(retail_raw['city'])
print('Length of city column:', length_city)
count_city = retail_raw['city'].count()
print('Count of city column:', count_city)
Missing Values
• With Length and Count, we can now
compute the number of missing values.
The number of missing values is the
difference between Length and Count.
import pandas as pd
import numpy as np
import io
import pandas_profiling
retail_raw = pd.read_csv('file_data')
# city column
length_city = len(retail_raw['city'])
count_city = retail_raw['city'].count()
# product_id column
length_product_id = len(retail_raw['product_id'])
count_product_id = retail_raw['product_id'].count()
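The difference between Length and Count can be sketched on a small in-memory table. This is only an illustration: the `sample` DataFrame and the `missing_count` helper are hypothetical stand-ins for `retail_raw`, and the values in them are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for retail_raw (made-up values, not the real data)
sample = pd.DataFrame({
    'city': ['Jakarta', np.nan, 'Bandung', 'Surabaya', np.nan],
    'product_id': ['P001', 'P002', None, 'P004', 'P005'],
})

def missing_count(frame, column):
    """Return Length minus Count for one column (hypothetical helper)."""
    length = len(frame[column])      # all rows, including missing ones
    count = frame[column].count()    # only the non-missing rows
    return length - count

missing_city = missing_count(sample, 'city')
missing_product_id = missing_count(sample, 'product_id')

print('Missing values in city column:', missing_city)
print('Missing values in product_id column:', missing_product_id)

# Cross-check against the built-in pandas count of missing values
builtin_missing = sample.isna().sum()
print(builtin_missing)
```

The same result is available directly from `isna().sum()`, which the last lines use as a cross-check. The Length minus Count form is kept here because it mirrors the definition given above.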
2. Restart
3. Run the following code
import pandas as pd
import matplotlib.pyplot as plt
from pandas_profiling import ProfileReport
%matplotlib inline
pd.set_option('display.max_colwidth', None)
df = pd.read_csv('file_data')
profile = ProfileReport(df, title='Data retail', explorative=True)
profile.to_notebook_iframe()
profile.to_file('analisis_Sigit.html')