Exploratory Data Analysis (EDA) on Power BI
This document provides an overview of exploratory data analysis techniques in Power BI, including data import, transformation, cleaning, validation, exploration, visualization, statistical analysis, time series analysis, geographic analysis, and data quality checks. The techniques are demonstrated through examples of Power Query M code.
# [ Exploratory Data Analysis (EDA) on Power BI ] [ cheatsheet ]
1. Data Import
● Import data from CSV: Source = Csv.Document(File.Contents("file.csv"), [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None])
● Import data from Excel: Source = Excel.Workbook(File.Contents("file.xlsx"), null, true)
● Import data from SQL Server: Source = Sql.Database("server", "database", [Query="SELECT * FROM table"])
● Import data from Web: Source = Web.Page(Web.Contents("https://example.com"))
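The import bullets return raw tables; Csv.Document in particular yields generic Column1, Column2, ... names. A minimal sketch of a complete query that promotes the header row and assigns types is shown here. The column names "Category" and "Value" are hypothetical and are not taken from the cheatsheet.

```powerquery
let
    // "file.csv" is the placeholder path used in the bullets above
    Source = Csv.Document(
        File.Contents("file.csv"),
        [Delimiter = ",", Encoding = 1252, QuoteStyle = QuoteStyle.None]
    ),
    // Use the first row as column headers instead of Column1, Column2, ...
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // "Category" and "Value" are assumed column names, for illustration only
    Typed = Table.TransformColumnTypes(
        Promoted,
        {{"Category", type text}, {"Value", type number}}
    )
in
    Typed
```

Each step takes the previous step's name as its input, which is how the single-line snippets in this cheatsheet are normally chained inside one query.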
3. Data Cleaning
● Replace values: Table.ReplaceValue(Source, "OldValue", "NewValue", Replacer.ReplaceText, {"Column"})
● Fill null values: Table.FillDown(Source, {"Column"})
● Handle errors: Table.ReplaceErrorValues(Source, {{"Column", "DefaultValue"}})
● Trim whitespace: Table.TransformColumns(Source, {{"Column", Text.Trim, type text}})
● Remove non-printable characters: Table.TransformColumns(Source, {{"Column", Text.Clean, type text}})
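The cleaning steps above are usually applied one after another. The following is a small sketch of such a chain on an inline table; the sample rows and the column names "Name" and "Region" are made up for illustration.

```powerquery
let
    // Hypothetical sample data standing in for Source
    Source = #table(
        type table [Name = nullable text, Region = nullable text],
        {{"  Alice ", "North"}, {"Bob", null}, {"Ca#(tab)rol", "South"}}
    ),
    // Strip leading and trailing whitespace
    Trimmed = Table.TransformColumns(Source, {{"Name", Text.Trim, type text}}),
    // Remove control characters such as the embedded tab
    Cleaned = Table.TransformColumns(Trimmed, {{"Name", Text.Clean, type text}}),
    // Replace each null Region with the nearest non-null value above it
    Filled = Table.FillDown(Cleaned, {"Region"})
in
    Filled
```

Note that Table.FillDown only makes sense when the row order carries meaning, for example in exports where a group label appears once and is left blank on the following rows.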
By: Waleed Mousa
4. Data Validation
● Check for null values: Table.AddColumn(Source, "IsNull", each if [Column] = null then "Yes" else "No")
● Check for empty values: Table.AddColumn(Source, "IsEmpty", each if [Column] = "" then "Yes" else "No")
● Check for duplicate values: Table.AddColumn(Source, "IsDuplicate", each if List.Count(List.Select(Source[Column], (v) => v = [Column])) > 1 then "Yes" else "No")
● Check for valid data types: Table.AddColumn(Source, "IsValid", each if Value.Is([Column], type text) then "Yes" else "No")
● Check for valid ranges: Table.AddColumn(Source, "IsInRange", each if [Column] >= 0 and [Column] <= 100 then "Yes" else "No")
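A row-by-row duplicate check scans the whole column once per row, which can be slow on large tables. One alternative, sketched here on made-up data, counts occurrences once with Table.Group and joins the counts back to the rows. The step names and the "Occurrences" column are illustrative.

```powerquery
let
    // Hypothetical sample data standing in for Source
    Source = #table(type table [Column = text], {{"a"}, {"b"}, {"a"}}),
    // Count how many rows share each value
    Counts = Table.Group(
        Source,
        {"Column"},
        {{"Occurrences", each Table.RowCount(_), Int64.Type}}
    ),
    // Attach the count to every original row
    Joined = Table.NestedJoin(Source, {"Column"}, Counts, {"Column"}, "Counts", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "Counts", {"Occurrences"}),
    // Flag rows whose value appears more than once
    Flagged = Table.AddColumn(
        Expanded,
        "IsDuplicate",
        each if [Occurrences] > 1 then "Yes" else "No",
        type text
    )
in
    Flagged
```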
5. Data Exploration
● View column data types: Table.Schema(Source)
● Set column data types: Table.TransformColumnTypes(Source, {{"Column1", type text}, {"Column2", type number}})
● View column statistics: Table.Profile(Table.SelectColumns(Source, {"Column"}))
● View unique values: Table.Distinct(Table.SelectColumns(Source, {"Column"}))
● View top N rows: Table.FirstN(Source, 10)
● View bottom N rows: Table.LastN(Source, 10)
● View sample rows (systematic sample: the first row, then every 10th row): Table.AlternateRows(Source, 1, 9, 1)
● View missing values: Table.AddColumn(Source, "IsMissing", each if [Column] = null then 1 else 0)
● View data distribution (row count per distinct value): Table.Group(Source, {"Column"}, {{"Count", each Table.RowCount(_), Int64.Type}})
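For a quick overview of every column at once, Table.Profile can be called on the whole table and trimmed to the columns of interest. This sketch uses an invented four-row table; only the profile column names come from Table.Profile itself.

```powerquery
let
    // Hypothetical sample data standing in for Source
    Source = #table(
        type table [Category = nullable text, Value = nullable number],
        {{"A", 10}, {null, 20}, {"B", null}, {"A", 40}}
    ),
    // One output row per input column, with summary statistics
    Profile = Table.Profile(Source),
    // Keep the fields most useful for a first look at data quality
    Summary = Table.SelectColumns(
        Profile,
        {"Column", "Count", "NullCount", "DistinctCount", "Min", "Max"}
    )
in
    Summary
```

The NullCount field gives the per-column missing-value total that the row-level IsMissing flag would otherwise have to be summed to produce.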
6. Data Visualization
Power Query does not draw charts itself; each query below shapes the data that the corresponding Power BI visual consumes.
● Prepare data for a bar chart: BarChart = Table.Group(Source, {"Category"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a line chart: LineChart = Table.Group(Source, {"Date"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a pie chart: PieChart = Table.Group(Source, {"Category"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a scatter plot: ScatterPlot = Table.Group(Source, {"X", "Y"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a treemap: Treemap = Table.Group(Source, {"Category", "Subcategory"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a heatmap (assumes Source has only the Row, Column, and Value columns, and that Column holds text): Heatmap = Table.Pivot(Source, List.Distinct(Source[Column]), "Column", "Value", List.Sum)
● Prepare data for a funnel chart: FunnelChart = Table.Group(Source, {"Stage"}, {{"Value", each List.Sum([Value]), type number}})
● Prepare data for a gauge chart: GaugeChart = Table.Group(Source, {"Category"}, {{"Value", each List.Sum([Value]), type number}})
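The heatmap case differs from the other chart preparations because it pivots rather than groups. A small sketch with invented data shows how Table.Pivot turns the distinct values of one column into new columns; the region and quarter labels are hypothetical.

```powerquery
let
    // Hypothetical long-format data: one row per (Row, Column) observation
    Source = #table(
        type table [Row = text, Column = text, Value = number],
        {{"North", "Q1", 10}, {"North", "Q2", 15}, {"South", "Q1", 7}, {"South", "Q1", 3}}
    ),
    // Each distinct value of "Column" becomes a new column; List.Sum
    // combines rows that land in the same cell (the two South/Q1 rows)
    Heatmap = Table.Pivot(Source, List.Distinct(Source[Column]), "Column", "Value", List.Sum)
in
    Heatmap
```

Table.Pivot takes the list of pivot values, the attribute column, and the value column, in that order; any columns not named act as the row keys.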
7. Statistical Analysis
● Calculate mean: Mean = List.Average(Source[Column])
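The mean alone says little about spread or skew. As a sketch, other standard List functions can be collected into one record alongside it; the sample values and field names are illustrative.

```powerquery
let
    // Hypothetical sample data standing in for Source
    Source = #table(type table [Column = number], {{2}, {4}, {4}, {5}, {10}}),
    // Drop nulls so they cannot affect any of the aggregates
    Values = List.RemoveNulls(Source[Column]),
    Stats = [
        Mean = List.Average(Values),
        Median = List.Median(Values),
        // List.StandardDeviation returns a sample-based estimate
        StdDev = List.StandardDeviation(Values),
        Min = List.Min(Values),
        Max = List.Max(Values)
    ]
in
    Stats
```

Comparing Mean with Median is a cheap first check for skew or outliers, such as the value 10 in this sample.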
8. Time Series Analysis
● Convert to date type: Table.TransformColumnTypes(Source, {{"Date", type date}})
● Extract year from date: Table.AddColumn(Source, "Year", each Date.Year([Date]), Int64.Type)
● Extract month from date: Table.AddColumn(Source, "Month", each Date.Month([Date]), Int64.Type)
● Extract day from date: Table.AddColumn(Source, "Day", each Date.Day([Date]), Int64.Type)
● Extract day of week: Table.AddColumn(Source, "DayOfWeek", each Date.DayOfWeek([Date]), Int64.Type)
● Extract day of year: Table.AddColumn(Source, "DayOfYear", each Date.DayOfYear([Date]), Int64.Type)
● Extract quarter from date: Table.AddColumn(Source, "Quarter", each Date.QuarterOfYear([Date]), Int64.Type)
● Calculate a 3-row moving average (assumes a zero-based Index column, e.g. from Table.AddIndexColumn(Source, "Index", 0, 1)): Table.AddColumn(Source, "MovingAverage", each if [Index] < 2 then null else List.Average(List.Range(Source[Value], [Index] - 2, 3)), type nullable number)
● Calculate year-over-year growth (first step: total Value per year; assumes a Year column): Table.Group(Source, {"Year"}, {{"Value", each List.Sum([Value])}})
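Grouping by year gives yearly totals but not the growth rate itself. A possible way to finish the calculation, sketched on invented data, is to sort the totals, add an index, and compare each year with the one before it. The step names and the "YoYGrowth" column are illustrative, and the sketch assumes the years are consecutive.

```powerquery
let
    // Hypothetical sample data standing in for Source
    Source = #table(
        type table [Date = date, Value = number],
        {
            {#date(2022, 3, 1), 100},
            {#date(2022, 9, 1), 50},
            {#date(2023, 3, 1), 180},
            {#date(2024, 3, 1), 270}
        }
    ),
    WithYear = Table.AddColumn(Source, "Year", each Date.Year([Date]), Int64.Type),
    // Total per year
    Yearly = Table.Group(WithYear, {"Year"}, {{"Value", each List.Sum([Value]), type number}}),
    Sorted = Table.Sort(Yearly, {{"Year", Order.Ascending}}),
    Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
    // Buffer the totals so positional lookups do not re-evaluate the table
    Totals = List.Buffer(Indexed[Value]),
    // Growth = (this year - previous year) / previous year; null for the first year
    WithGrowth = Table.AddColumn(
        Indexed,
        "YoYGrowth",
        each if [Index] = 0 then null
             else ([Value] - Totals{[Index] - 1}) / Totals{[Index] - 1},
        type nullable number
    )
in
    Table.RemoveColumns(WithGrowth, {"Index"})
```

With these sample rows the yearly totals are 150, 180, and 270, so the growth figures are 0.2 for 2023 and 0.5 for 2024.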
9. Geographic Analysis
● Create a map visualization: Map = Table.AddColumn(Source, "Location", […]
[…] {"OrderID", "ProductID"})), {"OrderID"}, {{"Products", each Text.Combine([ProductID], ","), type text}}), "SupportCount", each Table.RowCount(Table.SelectRows(Source, each List.Contains(Text.Split([Products], ","), [ProductID]))))
● Perform customer segmentation: CustomerSegmentation = Table.AddColumn(Table.Group(Source, {"CustomerID"}, {{"TotalSpend", each List.Sum([TotalAmount]), type number}, {"VisitFrequency", each Table.RowCount(_), type number}, {"Recency", each Date.From(List.Max([OrderDate])), type date}}), "Segment", each if [TotalSpend] > 1000 and [VisitFrequency] > 10 and Date.IsInPreviousNMonths([Recency], 3) then "High Value" else if [TotalSpend] > 500 and [VisitFrequency] > 5 and Date.IsInPreviousNMonths([Recency], 6) then "Mid Value" else "Low Value")
● Perform cohort analysis: CohortAnalysis = Table.Group(Table.AddColumn(Source, "CohortMonth", each Date.StartOfMonth([OrderDate])), {"CohortMonth", "CustomerID"}, {{"TotalSpend", each List.Sum([TotalAmount]), type number}, {"VisitFrequency", each Table.RowCount(_), type number}})
● Perform RFM analysis: RFMAnalysis = Table.AddColumn(Table.Group(Source, {"CustomerID"}, {{"Recency", each Date.From(List.Max([OrderDate])), type date}, {"Frequency", each Table.RowCount(_), type number}, {"Monetary", each List.Sum([TotalAmount]), type number}}), "RFMScore", each Text.Combine({Text.Range(Text.From(Date.DayOfYear([Recency])), 0, 1), Text.Range(Text.From([Frequency]), 0, 1), Text.Range(Text.From([Monetary]), 0, 1)}))
● Perform customer lifetime value analysis: CustomerLifetimeValue = Table.AddColumn(Table.Group(Source, {"CustomerID"}, {{"TotalSpend", each List.Sum([TotalAmount]), type number}, {"VisitFrequency", each Table.RowCount(_), type number}, {"AverageOrderValue", each List.Average([TotalAmount]), type number}, {"CustomerLifetime", each Duration.Days(Date.From(DateTime.LocalNow()) - Date.From(List.Min([OrderDate]))), type number}}), "CLV", each [AverageOrderValue] * [VisitFrequency] * [CustomerLifetime] / 365)
15. Data Storytelling
● Create a KPI visual: KPIVisual = Table.AddColumn(Table.Group(Source, {"Category"}, {{"TotalSales", each List.Sum([Sales]), type number}, {"TargetSales", each List.Sum([Target]), type number}}), "Status", each if [TotalSales] >= [TargetSales] then "Meeting Target" else "Below Target")
● Create a trend visual: TrendVisual = let Grouped = Table.Group(Source, {"Date"}, {{"Sales", each List.Sum([Sales]), type number}}), Indexed = Table.AddIndexColumn(Table.Sort(Grouped, {{"Date", Order.Ascending}}), "Index", 0, 1) in Table.AddColumn(Indexed, "PreviousSales", each if [Index] = 0 then null else Indexed[Sales]{[Index] - 1}, type nullable number)
● Create a comparison visual: ComparisonVisual = Table.Group(Source, {"Category", "Date"}, {{"ThisYearSales", each List.Sum([This Year Sales]), type number}, {"LastYearSales", each List.Sum([Last Year Sales]), type number}})
● Create a distribution visual: DistributionVisual = Table.Group(Source, {"AgeGroup"}, {{"Sales", each List.Sum([Sales]), type number}})
● Create a relationship visual: RelationshipVisual = Table.NestedJoin(Table.Group(Source, {"CustomerID"}, {{"TotalSales", each List.Sum([Sales]), type number}}), {"CustomerID"}, Table.Group(Source, {"CustomerID", "ProductCategory"}, {{"CategorySales", each List.Sum([Sales]), type number}}), {"CustomerID"}, "CustomerProduct", JoinKind.Inner)