Data Analyst Cheat Sheet

The document is a comprehensive cheat sheet for data analysts covering essential tools and techniques in MySQL, Python, and Excel. It includes key commands and functions for data retrieval, manipulation, and visualization, along with best practices and troubleshooting tips. Each section provides concise explanations and examples to facilitate quick reference and learning.

Uploaded by

Budoor
Copyright
© © All Rights Reserved
Data Analyst Cheat Sheet

CONTENT:

1. MySQL
2. Python
3. Excel
4. Tableau
5. Microsoft Power BI
6. IBM Cognos
7. Microsoft Visio
8. Google Looker Studio
MySQL

1. Basics
Connect to a Database
USE database_name;
Explanation: Switches to the specified database for running queries.

Show All Databases


SHOW DATABASES;
Explanation: Lists all databases available in your MySQL instance.

Show All Tables


SHOW TABLES;
Explanation: Lists all tables in the current database.

Describe a Table
DESCRIBE table_name;
Explanation: Shows the structure of the table (columns, data types, keys).

2. Retrieving Data
Select All Columns
SELECT * FROM table_name;
Explanation: Retrieves all rows and columns from the table.

Select Specific Columns


SELECT column1, column2 FROM table_name;
Explanation: Fetches only the specified columns.

Limit Results
SELECT * FROM table_name LIMIT 10;
Explanation: Limits the number of rows returned.

Filter Data (WHERE Clause)


SELECT * FROM table_name WHERE column_name = 'value';
Explanation: Filters rows based on conditions.

Comparison Operators
Operator    Description               Example
=           Equal to                  WHERE age = 25
<> or !=    Not equal to              WHERE age <> 25
<           Less than                 WHERE age < 30
>           Greater than              WHERE age > 30
<=          Less than or equal to     WHERE age <= 30
>=          Greater than or equal to  WHERE age >= 30
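To try the comparison operators without a MySQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption; the people table and its rows are made up for illustration). These operators behave the same way in MySQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ana", 25), ("Ben", 30), ("Cy", 35)],
)

def names_where(condition):
    """Return names matching a WHERE condition (hard-coded, trusted text only)."""
    query = f"SELECT name FROM people WHERE {condition} ORDER BY name"
    return [name for (name,) in conn.execute(query)]

equal_25 = names_where("age = 25")
not_25 = names_where("age <> 25")
at_most_30 = names_where("age <= 30")
over_30 = names_where("age > 30")
```

The helper formats the condition into the SQL text only because the conditions here are fixed strings; values that come from users should be passed as parameters instead.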

3. Aggregating Data
Count Rows
SELECT COUNT(*) FROM table_name;
Explanation: Returns the total number of rows.
Sum Values
SELECT SUM(column_name) FROM table_name;
Explanation: Calculates the sum of numeric values.

Average
SELECT AVG(column_name) FROM table_name;
Explanation: Computes the average of a numeric column.

Minimum and Maximum


SELECT MIN(column_name), MAX(column_name) FROM table_name;
Explanation: Finds the smallest and largest values.

4. Sorting and Grouping


Order By
SELECT * FROM table_name ORDER BY column_name ASC;
SELECT * FROM table_name ORDER BY column_name DESC;
Explanation: Sorts rows in ascending or descending order.

Group By
SELECT column_name, COUNT(*)
FROM table_name
GROUP BY column_name;
Explanation: Groups rows sharing a common value and performs aggregate functions.
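The aggregate functions and GROUP BY can be tried together in one query. The sketch below assumes a hypothetical orders table and again uses sqlite3 in place of MySQL; the SQL itself is plain standard syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 70.0)],
)

# One output row per region, with every aggregate computed inside that group.
summary = conn.execute(
    """
    SELECT region, COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount)
    FROM orders
    GROUP BY region
    ORDER BY SUM(amount) DESC
    """
).fetchall()
```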

5. Joins
Inner Join
SELECT a.column1, b.column2
FROM table1 a
INNER JOIN table2 b
ON a.id = b.id;
Explanation: Returns rows with matching values in both tables.

Left Join
SELECT a.column1, b.column2
FROM table1 a
LEFT JOIN table2 b
ON a.id = b.id;
Explanation: Returns all rows from the left table and matched rows from the right table.

Right Join
SELECT a.column1, b.column2
FROM table1 a
RIGHT JOIN table2 b
ON a.id = b.id;
Explanation: Returns all rows from the right table and matched rows from the left table.

Full Outer Join


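SELECT a.column1, b.column2
FROM table1 a
LEFT JOIN table2 b
ON a.id = b.id
UNION
SELECT a.column1, b.column2
FROM table1 a
RIGHT JOIN table2 b
ON a.id = b.id;
Explanation: MySQL does not support FULL OUTER JOIN, so combine the LEFT JOIN and RIGHT JOIN results with UNION. UNION also removes duplicate rows from the combined result.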
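MySQL has no FULL OUTER JOIN keyword, so the usual workaround is a UNION of two outer joins. The sketch below shows the idea with sqlite3 as a stand-in and two tiny made-up tables; it writes the second half as a LEFT JOIN with the tables swapped, which is equivalent to a RIGHT JOIN and runs on older SQLite versions too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, column1 TEXT);
    CREATE TABLE table2 (id INTEGER, column2 TEXT);
    INSERT INTO table1 VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO table2 VALUES (2, 'b2'), (3, 'b3');
    """
)

# Rows from both sides are kept; missing partners show up as NULL (None).
full_outer = conn.execute(
    """
    SELECT a.column1, b.column2
    FROM table1 a LEFT JOIN table2 b ON a.id = b.id
    UNION
    SELECT a.column1, b.column2
    FROM table2 b LEFT JOIN table1 a ON a.id = b.id
    """
).fetchall()
```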
6. Conditional Logic
Case Statement
SELECT column_name,
CASE
WHEN condition1 THEN 'Value1'
WHEN condition2 THEN 'Value2'
ELSE 'Default'
END AS alias_name
FROM table_name;
Explanation: Adds conditional logic to queries.
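As a concrete instance of the CASE template, this sketch bands made-up scores into three labels. The table, thresholds, and labels are illustrative assumptions, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ana", 92), ("Ben", 75), ("Cy", 40)],
)

# WHEN branches are tested top to bottom; the first true one wins.
graded = conn.execute(
    """
    SELECT student,
           CASE
               WHEN score >= 90 THEN 'High'
               WHEN score >= 60 THEN 'Medium'
               ELSE 'Low'
           END AS band
    FROM scores
    ORDER BY student
    """
).fetchall()
```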

7. Advanced Queries
Subqueries
SELECT column_name
FROM table_name
WHERE column_name IN (
SELECT column_name
FROM another_table
WHERE condition
);
Explanation: Nested query that runs within another query.

CTE (Common Table Expressions)


WITH cte_name AS (
SELECT column_name
FROM table_name
WHERE condition
)
SELECT * FROM cte_name;
Explanation: Creates a temporary named result set for use in the query that follows. CTEs require MySQL 8.0 or later.
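A CTE and a subquery in the FROM clause can express the same logic; the CTE just gives the intermediate result a name. The following sketch (hypothetical sales table, sqlite3 standing in for MySQL) runs both forms and keeps the results for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (customer TEXT, amount REAL);
    INSERT INTO sales VALUES ('Ana', 120), ('Ana', 80), ('Ben', 30), ('Cy', 250);
    """
)

# CTE form: name the per-customer totals, then filter them.
with_cte = conn.execute(
    """
    WITH customer_totals AS (
        SELECT customer, SUM(amount) AS total
        FROM sales
        GROUP BY customer
    )
    SELECT customer, total
    FROM customer_totals
    WHERE total >= 100
    ORDER BY total DESC
    """
).fetchall()

# Equivalent subquery form.
with_subquery = conn.execute(
    """
    SELECT customer, total
    FROM (SELECT customer, SUM(amount) AS total FROM sales GROUP BY customer) AS t
    WHERE total >= 100
    ORDER BY total DESC
    """
).fetchall()
```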

8. Data Manipulation
Insert Data
INSERT INTO table_name (column1, column2)
VALUES ('value1', 'value2');
Explanation: Adds a new row to the table.

Update Data
UPDATE table_name
SET column_name = 'value'
WHERE condition;
Explanation: Modifies existing rows.

Delete Data
DELETE FROM table_name
WHERE condition;
Explanation: Deletes rows based on a condition.

9. Indexing
Create an Index
CREATE INDEX index_name ON table_name(column_name);
Explanation: Speeds up lookups, joins, and sorts on the indexed column, at the cost of extra storage and slightly slower writes.
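One way to confirm that an index is actually used is to look at the query plan before and after creating it. This sketch uses SQLite's EXPLAIN QUERY PLAN as a stand-in (MySQL's counterpart is EXPLAIN, whose output looks different); the table, data, and index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"cust{i % 50}", float(i)) for i in range(1000)],
)

query = "SELECT amount FROM orders WHERE customer = 'cust7'"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer)")
plan_after = plan(query)   # the plan now names the new index
```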
10. Best Practices
1. Use Aliases: Simplify table/column references.
SELECT t1.column_name AS alias_name FROM table_name t1;
2. Optimize Joins: Ensure indexed columns are used for join conditions.
3. Limit Large Queries: Use LIMIT for large datasets to avoid slow queries.
4. Avoid SELECT *: Only fetch required columns for efficiency.
5. Backup Data: Always back up before running DELETE or UPDATE.
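Alongside a backup, running a DELETE or UPDATE inside a transaction gives a chance to undo it when it touches more rows than expected. The sketch below is one possible guard, not a standard recipe: the guarded_delete helper, the customers table, and the expected row count are all hypothetical, and sqlite3 stands in for a transactional MySQL table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO customers (status) VALUES (?)",
    [("active",), ("inactive",), ("active",)],
)
conn.commit()

def guarded_delete(sql, params, expected_rows):
    """Run a DELETE, then commit only if it touched the expected number of rows."""
    cursor = conn.execute(sql, params)
    if cursor.rowcount == expected_rows:
        conn.commit()
        return True
    conn.rollback()  # undo the unexpected change
    return False

# Forgotten WHERE clause: three rows would be deleted, so it is rolled back.
kept_mistake = guarded_delete("DELETE FROM customers", (), expected_rows=1)
# Intended statement: exactly one inactive customer is removed.
kept_fix = guarded_delete(
    "DELETE FROM customers WHERE status = ?", ("inactive",), expected_rows=1
)
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```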

11. Visualization Examples


Example: Aggregating Sales Data
Query:
SELECT product_category, SUM(sales) AS total_sales
FROM sales_table
GROUP BY product_category
ORDER BY total_sales DESC;

Visualization:
●​ Bar Chart: x-axis = product_category, y-axis = total_sales.

Example: Sales Trend Over Time


Query:
SELECT DATE(sales_date) AS date, SUM(sales) AS daily_sales
FROM sales_table
GROUP BY date
ORDER BY date;

Visualization:
●​ Line Chart: x-axis = date, y-axis = daily_sales.

12. Troubleshooting Tips


1.​ Syntax Errors: Double-check commas, semicolons, and parentheses.
2. Case Sensitivity: Keywords and column names are case-insensitive in MySQL; database and table names can be case-sensitive depending on the OS (typically on Linux) and the lower_case_table_names setting.
3.​ Debugging Joins: Validate join keys for NULLs or mismatches.
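For the join-debugging tip, a quick check of the key columns often explains missing or duplicated rows. This is a small sketch in plain Python; the join_key_report helper and the sample keys are hypothetical, with None standing for SQL NULL.

```python
def join_key_report(left_keys, right_keys):
    """Summarise NULL and unmatched join keys between two key columns."""
    left = {key for key in left_keys if key is not None}
    right = {key for key in right_keys if key is not None}
    return {
        "null_left": sum(key is None for key in left_keys),
        "null_right": sum(key is None for key in right_keys),
        "only_left": sorted(left - right),    # would be dropped by an INNER JOIN
        "only_right": sorted(right - left),
    }

report = join_key_report([1, 2, None, 4], [2, 3, 4, 4])
```

NULL keys never match in a join condition, so a non-zero null count is usually the first thing to investigate.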
Python

1. Python Basics
Data Types and Operations
Data Types: int, float, str, list, dict, tuple, set, bool​
x = 10 # int
y = 3.14 # float
name = "Ammar" # str
items = [1, 2, 3] # list
data = {'key': 'value'} # dict
●​ List Comprehension:​
squared = [x**2 for x in range(10)]
●​ Dictionary Comprehension:​
squares = {x: x**2 for x in range(10)}

2. File Handling
Read File:​
with open('file.txt', 'r') as f:
    content = f.read()

Write File:​
with open('file.txt', 'w') as f:
    f.write("Hello World")

3. NumPy
Array Creation
Create Arrays:​
import numpy as np
arr = np.array([1, 2, 3])
zeros = np.zeros((3, 3)) # 3x3 array of zeros
ones = np.ones((2, 2)) # 2x2 array of ones
random = np.random.rand(3, 3) # Random numbers

Array Operations
Element-wise Operations:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2 # Element-wise addition

4. Python Basics for Data Analysis


1. Essential Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
●​ NumPy: Numerical operations on arrays.
●​ Pandas: Data manipulation and analysis.
●​ Matplotlib: Basic plotting.
●​ Plotly: Interactive visualizations.

5. NumPy Essentials
1. Array Creation
array = np.array([1, 2, 3, 4])
zeros = np.zeros((3, 3))
ones = np.ones((3, 3))
random_array = np.random.rand(3, 3)
●​ np.array: Create arrays.
●​ np.zeros, np.ones: Initialize arrays with zeros or ones.
● np.random.rand: Random array with values drawn uniformly from [0, 1).

2. Array Operations
arr = np.array([1, 2, 3, 4])
arr_mean = arr.mean()
arr_sum = arr.sum()
arr_max = arr.max()
●​ .mean(): Calculate mean.
●​ .sum(): Sum of elements.
●​ .max(): Maximum value.

3. Element-wise Operations
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2 # Element-wise addition
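Element-wise behaviour applies to the other arithmetic and comparison operators as well. A short sketch, assuming NumPy is installed, with variable names chosen for illustration:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

added = arr1 + arr2          # pairs up elements: 1+4, 2+5, 3+6
multiplied = arr1 * arr2     # element-wise product, not a matrix product
scaled = arr1 * 10           # the scalar is broadcast to every element
dot = arr1 @ arr2            # dot product: 1*4 + 2*5 + 3*6
above_four = arr2[arr2 > 4]  # a boolean mask selects matching elements
```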

6. Pandas Essentials
1. Data Creation
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
●​ pd.DataFrame: Create a DataFrame.

2. Data Inspection
df.head() # First 5 rows
df.tail() # Last 5 rows
df.info() # Summary
df.describe() # Statistical summary

3. Data Selection
df['Name'] # Select column
df[['Name', 'Age']] # Select multiple columns
df.iloc[0] # Select row by integer position
df.loc[df['Age'] > 25] # Filter rows

4. Data Cleaning
df.dropna() # Return a copy without rows that contain missing values
df.fillna(0) # Return a copy with NaNs replaced by 0
df.rename(columns={'Age': 'Years'}) # Return a copy with the column renamed

5. Aggregation
df.groupby('Name').mean(numeric_only=True) # Group and calculate the mean of numeric columns
6. Visualization with Pandas
df['Age'].plot(kind='bar')
plt.show()

7. Matplotlib Essentials
1. Basic Plot
x = [1, 2, 3]
y = [4, 5, 6]
plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Basic Line Plot')
plt.show()

2. Scatter Plot
plt.scatter(x, y, color='red')
plt.show()

3. Histogram
plt.hist([1, 2, 2, 3, 3, 3, 4])
plt.show()

8. Plotly Essentials
1. Interactive Line Plot
import plotly.express as px
data = {'x': [1, 2, 3], 'y': [4, 5, 6]}
fig = px.line(data, x='x', y='y', title='Interactive Line Plot')
fig.show()

2. Interactive Bar Chart


fig = px.bar(data, x='x', y='y', title='Interactive Bar Chart')
fig.show()

3. Scatter Plot
fig = px.scatter(data, x='x', y='y', title='Scatter Plot')
fig.show()

9. Additional Tips
1.​ Data Wrangling:
○​ Use df.apply() for custom functions.
○​ Use pd.pivot_table() for multi-dimensional summaries.
2. Performance:
○ Use .to_numpy() (or the older .values) to convert to NumPy arrays for faster computation.
○ Use .iterrows() sparingly; vectorized operations are much faster.
3.​ Visualization:
○ Prefer Plotly over Matplotlib when you need interactivity (hover, zoom, pan).
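The wrangling and performance tips can be seen side by side in a small sketch. It assumes pandas is installed; the DataFrame contents and column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "B"],
    "units": [10, 5, 8, 2],
    "price": [2.0, 4.0, 2.0, 4.0],
})

# Slow pattern: a Python-level loop over rows.
revenue_loop = [row["units"] * row["price"] for _, row in df.iterrows()]

# Vectorized pattern: one column-wise operation gives the same numbers.
df["revenue"] = df["units"] * df["price"]

# Multi-dimensional summary: regions as rows, products as columns.
pivot = pd.pivot_table(
    df, values="revenue", index="region", columns="product", aggfunc="sum"
)
```

On four rows the speed difference is invisible; it becomes noticeable as the row count grows, because the vectorized form avoids building a Series for every row.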
Excel

🔍 1. Logical Functions
IFS
●​ Use: Replaces complex nested IFs for cleaner logic.
●​ Syntax: =IFS(condition1, result1, condition2, result2, ..., TRUE, default)​

Example:​
=IFS(A2>90,"Overdue", A2=90,"Due", A2<90,"Not Due")
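IFS returns the result paired with the first condition that is true. The same first-match-wins logic can be sketched in Python; the ifs and status helpers below are illustrative, not an Excel API. One difference worth noting: Python evaluates every argument before the call, so this version does not short-circuit.

```python
def ifs(*pairs):
    """Return the result paired with the first true condition, like Excel's IFS."""
    if len(pairs) % 2:
        raise ValueError("expected condition/result pairs")
    for condition, result in zip(pairs[::2], pairs[1::2]):
        if condition:
            return result
    raise LookupError("no condition was true (Excel would show #N/A)")

def status(days):
    # A final True acts as the default branch, as in the syntax line above.
    return ifs(days > 90, "Overdue", days == 90, "Due", True, "Not Due")
```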

➕ 2. Math & Statistical Functions


SUMIFS
●​ Use: Sum values with multiple criteria.
●​ Syntax: =SUMIFS(sum_range, criteria_range1, criteria1, ...)​

Example:​
=SUMIFS(D2:D100, A2:A100, "North", B2:B100, "John")

COUNTIFS / AVERAGEIFS / MINIFS / MAXIFS


●​ Same syntax as SUMIFS.
●​ Use:
○​ =COUNTIFS(...) – count matches.
○​ =AVERAGEIFS(...) – average values.
○​ =MINIFS(...) / =MAXIFS(...) – min/max based on conditions.
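All of the *IFS functions follow the same pattern: keep the rows where every criterion matches, then aggregate one column. A rough Python equivalent, with made-up rows and a hypothetical matching helper:

```python
rows = [
    {"region": "North", "rep": "John", "sales": 100},
    {"region": "North", "rep": "Mary", "sales": 250},
    {"region": "South", "rep": "John", "sales": 75},
    {"region": "North", "rep": "John", "sales": 40},
]

def matching(records, **criteria):
    """Keep records where every named field equals the given value."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]

north_john = [r["sales"] for r in matching(rows, region="North", rep="John")]

sumifs = sum(north_john)
countifs = len(north_john)
averageifs = sumifs / countifs
minifs = min(north_john)
maxifs = max(north_john)
```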

🔁 3. Lookup Functions
XLOOKUP (Modern) vs. VLOOKUP (Legacy)
●​ XLOOKUP Syntax:​
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode], [search_mode])
●​ VLOOKUP Syntax:​
=VLOOKUP(lookup_value, table_array, col_index, FALSE)
●​ Advantages of XLOOKUP:
○​ Looks left or right (VLOOKUP only looks right).
○​ No need for column index numbers.
○​ Built-in error handling.​

Example:​
=XLOOKUP("Gloves", A2:A100, B2:B100, "Not Found")

INDEX + MATCH (Flexible alternative to VLOOKUP)


●​ =INDEX(return_range, MATCH(lookup_value, lookup_range, 0))
● Works with rows and columns, and keeps working when columns are inserted or reordered because it references ranges rather than a column number.
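The exact-match behaviour of XLOOKUP and INDEX + MATCH can be sketched in a few lines of Python. The function names mirror the Excel ones for readability only, and the product data is invented.

```python
products = ["Hat", "Gloves", "Scarf"]
prices = [12.0, 8.5, 15.0]

def xlookup(value, lookup_list, return_list, if_not_found=None):
    """Exact match: the first hit wins, otherwise the fallback is returned."""
    for key, result in zip(lookup_list, return_list):
        if key == value:
            return result
    return if_not_found

def index_match(value, lookup_list, return_list):
    """MATCH finds the position, INDEX reads the same position elsewhere."""
    position = lookup_list.index(value)  # raises ValueError if missing (like #N/A)
    return return_list[position]

found = xlookup("Gloves", products, prices, "Not Found")
missing = xlookup("Boots", products, prices, "Not Found")
reverse = xlookup(8.5, prices, products)  # lookup and return ranges swap freely
scarf_price = index_match("Scarf", products, prices)
```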

🧠 4. Error Handling
IFERROR / IFNA
●​ Use: Handle errors gracefully.
●​ IFERROR Syntax: =IFERROR(value, value_if_error)
●​ IFNA Syntax: =IFNA(value, value_if_na)

=IFERROR(VLOOKUP(A2, Table1, 2, FALSE), "Missing")

📅 5. Date Functions
EOMONTH
●​ Use: Get last day of the month from a date.
●​ Syntax: =EOMONTH(start_date, months)
●​ Example: =EOMONTH(TODAY(), 1) → Last day of next month​

EDATE
●​ Use: Shift a date forward/backward by months.
●​ Syntax: =EDATE(start_date, months)
●​ Example: =EDATE(A2, -1) → Previous month​
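The month arithmetic behind EDATE and EOMONTH can be reproduced with the standard library. This is a sketch of the behaviour described above rather than Excel's own implementation; in particular, clamping the day to the end of a shorter month is how the edate helper handles cases such as 31 January plus one month.

```python
import calendar
from datetime import date

def edate(start, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    month_index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))

def eomonth(start, months):
    """Last day of the month that lies `months` months away from start."""
    shifted = edate(start.replace(day=1), months)
    last_day = calendar.monthrange(shifted.year, shifted.month)[1]
    return shifted.replace(day=last_day)
```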

NETWORKDAYS.INTL
●​ Use: Calculate workdays between two dates, including custom weekends.
●​ Syntax: =NETWORKDAYS.INTL(start_date, end_date, [weekend], [holidays])
●​ Weekend Codes: "0000011" means Mon–Fri are workdays.​

Example:​
=NETWORKDAYS.INTL(A2, B2, 1, {"2024-12-25","2024-12-26"})
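The weekend string is easiest to understand by walking the dates one at a time. The sketch below is a simplified stand-in for NETWORKDAYS.INTL, assuming the start date is not after the end date; it supports only the 7-character string form of the weekend argument, where position 1 is Monday and "1" marks a non-working day.

```python
from datetime import date, timedelta

def networkdays_intl(start, end, weekend="0000011", holidays=()):
    """Count working days from start to end, both inclusive."""
    if len(weekend) != 7 or set(weekend) - {"0", "1"}:
        raise ValueError("weekend must be seven characters of 0 and 1")
    skipped = set(holidays)
    count = 0
    day = start
    while day <= end:
        # date.weekday() is 0 for Monday, matching the string's first position.
        if weekend[day.weekday()] == "0" and day not in skipped:
            count += 1
        day += timedelta(days=1)
    return count

christmas = [date(2024, 12, 25), date(2024, 12, 26)]
week_with_holidays = networkdays_intl(date(2024, 12, 23), date(2024, 12, 27), holidays=christmas)
full_week = networkdays_intl(date(2024, 12, 23), date(2024, 12, 29))
sunday_only = networkdays_intl(date(2024, 12, 23), date(2024, 12, 29), weekend="0000001")
```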

📑 6. Pivot Tools
GETPIVOTDATA
●​ Use: Dynamically extract values from PivotTables.
●​ Syntax: Excel auto-generates this.​
=GETPIVOTDATA("Total", $A$3, "Region", "USA")
●​ Benefit: Stays accurate even if PivotTable layout changes

🔂 7. Dynamic Arrays (Excel 365/2021+)


FILTER
●​ Use: Filter rows based on criteria.
●​ Syntax: =FILTER(array, include, [if_empty])​

Example:​
=FILTER(A2:C100, C2:C100="Sales", "No records")

UNIQUE
●​ Use: Return unique or distinct values.
●​ Syntax: =UNIQUE(array, [by_col], [exactly_once])​

Example:​
=UNIQUE(A2:A100) // distinct values
=UNIQUE(A2:A100,,TRUE) // only values that appear once

SORT
●​ Use: Sort a dataset.
●​ Syntax: =SORT(array, [sort_index], [sort_order], [by_col])​

Example:​
=SORT(A2:C100, 2, 1) // sort by 2nd column, ascending

SEQUENCE
●​ Use: Generate a series of numbers or dates.
●​ Syntax: =SEQUENCE(rows, [columns], [start], [step])​

Example:​
=SEQUENCE(5,1,1,2) → 1, 3, 5, 7, 9
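For readers who think in code, each dynamic array function maps onto a familiar list operation. The Python below is an analogy with invented rows, not a description of how Excel computes the results.

```python
from collections import Counter

rows = [
    ("Ann", "Sales", 300),
    ("Bob", "IT", 150),
    ("Cat", "Sales", 120),
    ("Dan", "IT", 150),
]

# FILTER(array, include, if_empty): matching rows, or a fallback when none match.
sales_rows = [r for r in rows if r[1] == "Sales"] or "No records"
hr_rows = [r for r in rows if r[1] == "HR"] or "No records"

# UNIQUE(array): distinct values in first-seen order.
distinct_amounts = list(dict.fromkeys(r[2] for r in rows))

# UNIQUE(array,,TRUE): only values that appear exactly once.
counts = Counter(r[2] for r in rows)
once_only = [value for value in distinct_amounts if counts[value] == 1]

# SORT(array, 3, 1): sort by the third column, ascending.
by_amount = sorted(rows, key=lambda r: r[2])

# SEQUENCE(rows, 1, start, step) for a single column.
def sequence(count, start=1, step=1):
    return [start + i * step for i in range(count)]

odd_numbers = sequence(5, 1, 2)
```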

⚙️ 8. Power Query (Automation & Data Transformation)


What is Power Query?
A tool built into Excel (under Data → Get & Transform Data) to automate data cleaning.
Use Cases:
●​ Combine multiple files or sheets.
●​ Clean data: remove duplicates, blanks, split columns, transform formats.
●​ Reshape tables without formulas.
●​ Apply steps once, then refresh.​

How it Works:
●​ Data → From Table/Range → Power Query Editor opens.
●​ Each step you take is recorded as a transformation.
●​ Press “Close & Load” to save back to Excel.

🆚 Feature Comparisons
Feature         VLOOKUP  XLOOKUP            INDEX + MATCH
Look Left?      ❌       ✅                 ✅
Easier to Read  ✅       ✅                 ❌
Flexible        ❌       ✅                 ✅
Error Handling  ❌       ✅ (if_not_found)  ✅ (with IFERROR)
1. Navigation & Shortcuts
●​ CTRL + Arrow Keys: Jump to the edge of a range of data.
●​ CTRL + SHIFT + Arrow Keys: Select a range of cells.
●​ CTRL + SPACE: Select an entire column.
●​ SHIFT + SPACE: Select an entire row.
●​ ALT + =: Auto-sum selected cells.
Tip: Use shortcuts to speed up your workflow and navigate large datasets efficiently.

2. Data Cleaning
Remove Duplicates
●​ Path: Data → Remove Duplicates.
●​ Use Case: Identify and remove duplicate rows based on one or more columns.
●​ Example:

Original Data:
Name   Age  City
John   25   New York
John   25   New York
Alice  30   Toronto

After Removing Duplicates:
Name   Age  City
John   25   New York
Alice  30   Toronto

TRIM Function
●​ Syntax: =TRIM(A1)
●​ Use Case: Removes unnecessary spaces from text.
●​ Example:
○​ Input: " Hello World "
○​ Output: Hello World
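TRIM does more than strip the ends: it also collapses runs of spaces between words to a single space. When cleaning the same data in Python, str.strip() alone is therefore not equivalent. A small sketch, where excel_trim is a hypothetical helper name:

```python
def excel_trim(text):
    """Drop leading/trailing spaces and collapse inner runs of spaces to one."""
    return " ".join(part for part in text.split(" ") if part)

raw = "  Hello   World  "
trimmed = excel_trim(raw)
stripped_only = raw.strip()  # inner spacing is left untouched
```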

Text-to-Columns
●​ Path: Data → Text to Columns.
●​ Use Case: Split text in one column into multiple columns based on a delimiter (e.g., commas, spaces).

3. Formulas and Functions


Statistical Functions
●​ AVERAGE: =AVERAGE(A1:A10) — Calculates the mean of a range.
●​ MEDIAN: =MEDIAN(A1:A10) — Finds the middle value in a range.
●​ MODE: =MODE.SNGL(A1:A10) — Returns the most frequent value.
● STDEV: =STDEV.P(A1:A10) — Calculates the population standard deviation (use =STDEV.S(A1:A10) when the data is a sample).
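Python's statistics module offers the same measures and is a convenient way to cross-check a spreadsheet. In this sketch the values are made up; pstdev divides by n, as STDEV.P does, and stdev divides by n - 1, as STDEV.S does.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)          # AVERAGE
median = statistics.median(values)      # MEDIAN
mode = statistics.mode(values)          # MODE.SNGL
population_sd = statistics.pstdev(values)  # STDEV.P
sample_sd = statistics.stdev(values)       # STDEV.S
```

The sample figure is always the larger of the two for the same data, which is a quick way to tell which formula a worksheet used.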

Logical Functions
●​ IF Function:
○​ Syntax: =IF(condition, value_if_true, value_if_false)
○​ Example: =IF(A1>50, "Pass", "Fail").
○​ Input: A1 = 60
○​ Output: Pass.
●​ AND Function:
○​ Syntax: =AND(condition1, condition2, ...)
○​ Example: =AND(A1>50, B1<100).
●​ OR Function:
○​ Syntax: =OR(condition1, condition2, ...)
○​ Example: =OR(A1>50, B1<100).

Lookup Functions
●​ VLOOKUP: Searches for a value in the first column of a range and returns a value in the same row.
○​ Syntax: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]).
Example:
ID   Name   Salary
101  John   50000
102  Alice  60000

○​ Formula: =VLOOKUP(101, A2:C3, 2, FALSE) → John.
●​ HLOOKUP: Similar to VLOOKUP but works horizontally.
●​ INDEX & MATCH: More flexible than VLOOKUP.
○​ Syntax:
■​ INDEX(array, row_num, [column_num])
■​ MATCH(lookup_value, lookup_array, [match_type])
○​ Combine: =INDEX(A1:C3, MATCH(101, A1:A3, 0), 2).

4. Data Visualization
Creating Charts
●​ Path: Insert → Select Chart Type (e.g., Line, Bar, Pie).
●​ Tips:
○​ Use bar charts for categorical data.
○​ Use line charts for trends over time.
○​ Use scatter plots for relationships between two variables.

Pivot Tables
●​ Path: Insert → PivotTable.
●​ Use Case: Summarize large datasets.
●​ Steps:
1.​ Select your data.
2.​ Drag fields into Rows, Columns, Values, and Filters.
3.​ Use slicers to interactively filter data.

Conditional Formatting
●​ Path: Home → Conditional Formatting.
●​ Examples:
○​ Highlight values >50: Highlight Cell Rules → Greater Than.
○​ Color scales for numeric ranges: Color Scales.

5. Data Analysis Tools


Goal Seek
●​ Path: Data → What-If Analysis → Goal Seek.
●​ Use Case: Find the input value needed to achieve a desired output.

Solver
● Path: Data → Solver (the Solver add-in must be enabled first).
●​ Use Case: Solve optimization problems.
Descriptive Statistics
● Path: Data → Data Analysis → Descriptive Statistics (requires the Analysis ToolPak add-in).
●​ Use Case: Generate mean, median, variance, etc., for a dataset.

6. Power Query
●​ Path: Data → Get & Transform Data → From Table/Range.
●​ Use Case: Clean and transform data without manual effort.
●​ Example:
○​ Combine multiple files into one dataset.
○​ Remove duplicates or filter rows automatically.

7. Advanced Techniques
Dynamic Named Ranges
●​ Use Case: Create a range that adjusts automatically as data changes.
●​ Steps:
1.​ Define a name (Formulas → Name Manager).
2.​ Use the formula: =OFFSET(Sheet1!$A$1, 0, 0, COUNTA(Sheet1!$A:$A), 1).

Array Formulas
●​ Use Case: Perform calculations across multiple cells.
●​ Example:
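○ {=SUM(A1:A10*B1:B10)} — Multiplies the two ranges row by row and sums the products (same result as =SUMPRODUCT(A1:A10, B1:B10)). In versions before Excel 365, confirm with Ctrl + Shift + Enter instead of typing the braces.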

Data Validation
●​ Path: Data → Data Validation.
●​ Use Case: Restrict user input.
●​ Example: Create a dropdown list.

8. Useful Tips
1. Format as Table:
○ Path: Insert → Table.
○ Advantage: Formulas and charts that reference the table expand automatically as rows are added; PivotTables built on it pick up the new rows on refresh.
2.​ Freeze Panes:
○​ Path: View → Freeze Panes.
○​ Locks rows or columns for easier navigation.
3.​ Use Named Ranges:
○​ Define ranges for readability and easier reference.

9. Keyboard Shortcuts
Shortcut          Action
CTRL + T          Create a table
CTRL + D          Fill down
CTRL + R          Fill right
CTRL + SHIFT + L  Toggle filters
CTRL + 1          Open format cells dialog
CTRL + ALT + V    Open Paste Special dialog
ALT + F1          Create a default chart

Tableau

1. Getting Started with Tableau


Key Concepts:
●​ Dimensions and Measures:
○​ Dimensions: Categorical data (e.g., Country, Product Name).
○​ Measures: Quantitative data (e.g., Sales, Profit).
●​ Data Connection:
○​ Connect to Data: Use the Data Pane to connect to CSV, Excel, SQL databases, or online sources.
○ Joins: Inner, Left, Right, and Full Outer joins for combining tables on a common field.

2. Tableau Interface Overview


● Data Pane: Lists the dimensions and measures; drag them from here onto the shelves to create visualizations.
●​ Sheet Tabs: Bottom tabs for creating multiple sheets (charts, tables, dashboards).
●​ Marks Card: Customize visualizations (e.g., color, size, text, detail).

3. Creating Basic Visualizations


Common Chart Types and When to Use Them:
1.​ Bar Chart:
○​ Use for comparing categories.
○​ Drag a dimension (e.g., Product) to Columns and a measure (e.g., Sales) to Rows.
○​ Example: Sales by Product.
2.​ Line Chart:
○​ Use for trends over time.
○​ Drag Date to Columns and a Measure (e.g., Profit) to Rows.
○​ Example: Monthly Sales Trend.
3.​ Scatter Plot:
○​ Use for correlation between two measures.
○​ Drag two measures to Columns and Rows, respectively.
○​ Example: Profit vs. Sales.
4.​ Pie Chart:
○​ Use for proportions.
○​ Drag a measure to Angle on the Marks Card and a dimension to Color.
○​ Example: Market Share by Region.
5.​ Heatmap:
○​ Use for identifying patterns.
○​ Drag dimensions to Rows and Columns, and a measure to Color.
○​ Example: Profit by Category and Region.
6.​ Map:
○​ Use for geographical data.
○​ Drag a geographic dimension (e.g., State) to the canvas; Tableau automatically maps it.

4. Essential Tableau Calculations


Basic Calculations:
1.​ Calculated Field:
○​ Right-click the data pane → Create Calculated Field.
○​ Syntax: IF [Sales] > 500 THEN "High" ELSE "Low" END
○​ Example: Categorize sales as "High" or "Low".
2.​ Aggregate Functions:
○​ SUM([Sales]): Total Sales.
○​ AVG([Profit]): Average Profit.
○​ COUNT([Order ID]): Count of Orders.
3.​ Date Calculations:
○​ YEAR([Order Date]): Extract year.
○​ DATEDIFF('day', [Start Date], [End Date]): Calculate the difference between two dates.
4.​ Table Calculations:
○​ Right-click a field → Add Table Calculation.
○​ Examples:
■​ Running Total: Accumulate a measure over time.
■​ Percent of Total: Contribution of each value to the total.
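The two table calculations named above are simple to state in code, which can help when checking a dashboard's numbers. A minimal sketch with made-up monthly figures:

```python
from itertools import accumulate

monthly_sales = [100, 150, 50, 200]

# Running Total: each value plus everything before it.
running_total = list(accumulate(monthly_sales))

# Percent of Total: each value's share of the grand total.
total = sum(monthly_sales)
percent_of_total = [round(100 * value / total, 1) for value in monthly_sales]
```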

5. Filters and Parameters


Filters:
●​ Dimension Filter: Filter categorical data (e.g., show only "USA").
●​ Measure Filter: Filter numerical values (e.g., Sales > $1000).
● Context Filter: Applied before the other filters, which then work only on the rows it passes; this can improve performance.

Parameters:
●​ Dynamic values for interactive dashboards.
○​ Create a parameter → Add it to a calculated field.
○​ Example: Parameter to toggle between showing Sales or Profit.

6. Dashboard Tips
●​ Combine Views: Drag multiple sheets into a single dashboard.
●​ Add Interactivity:
○​ Use Actions for filters, highlights, or URL navigation.
○​ Example: Clicking a region filters charts to show region-specific data.
●​ Best Practices:
○​ Keep dashboards clean and minimal.
○​ Use consistent color schemes and fonts.

7. Advanced Features
Level of Detail (LOD) Expressions:
●​ Syntax: { FIXED [Dimension] : SUM([Measure]) }
○​ FIXED: Calculate at a specific granularity.
○​ INCLUDE: Include additional dimensions in aggregation.
○​ EXCLUDE: Ignore dimensions.
●​ Example:
○​ { FIXED [Region] : SUM([Sales]) }: Total Sales by Region.
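A FIXED expression computes an aggregate at its own granularity and then attaches that value to every row, regardless of what else is in the view. The Python below imitates { FIXED [Region] : SUM([Sales]) } on invented rows; it is an analogy for the concept, not Tableau's engine.

```python
from collections import defaultdict

rows = [
    {"region": "East", "city": "Boston", "sales": 100},
    {"region": "East", "city": "New York", "sales": 300},
    {"region": "West", "city": "Seattle", "sales": 200},
]

# Step 1: aggregate at the FIXED granularity (one total per region).
region_total = defaultdict(int)
for row in rows:
    region_total[row["region"]] += row["sales"]

# Step 2: attach the regional total to each city-level row.
for row in rows:
    row["region_sales"] = region_total[row["region"]]
    row["share_of_region"] = row["sales"] / row["region_sales"]
```

Having the regional total on each row is what makes ratios such as a city's share of its region possible in a city-level view.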
Blending Data:
●​ Combine data from multiple sources using a common field.
●​ Example: Blend Salesforce data with Excel.

8. Tips for Performance Optimization


1.​ Extracts over Live Connections:
○​ Use data extracts for faster performance.
2.​ Filter Minimization:
○​ Avoid excessive filters and use context filters.
3.​ Reduce Dashboard Complexity:
○​ Use fewer sheets and avoid overloading dashboards.

9. Practical Example: Profit Analysis Dashboard


Visualization Flow:
1.​ Bar Chart: Profit by Category.
2.​ Map: Profit by State.
3.​ Trend Line: Profit Over Time.
4.​ Filters: Add Region and Year as filters.
5.​ KPI Card: Create calculated fields for key metrics like Total Sales and Average Profit Margin.
Steps:
●​ Drag and drop the required dimensions and measures.
●​ Use Filters to allow interactivity.
●​ Publish the dashboard to Tableau Server or Tableau Public.

10. Shortcuts and Key Commands


● Ctrl + Z: Undo.
● Ctrl + D: Connect to a data source.
● Ctrl + M: Create a new worksheet.
● Ctrl + Left/Right: Make columns narrower/wider.
● F7: Toggle presentation mode.

11. Resources for Practice


●​ Tableau Public: Explore dashboards created by others.
●​ Sample Datasets: Use Tableau’s built-in Superstore dataset.
Microsoft Power BI

1. Data Loading and Transformation


1.​ Import Data:
○​ Get Data: Home > Get Data
■​ Used to load data from various sources (Excel, SQL Server, Web, JSON, CSV, etc.).
■​ Shortcut: Ctrl + Shift + D
○​ Example:
■​ Loading an Excel file:
1.​ Click Get Data.
2.​ Select Excel Workbook.
3.​ Choose the file and load.
2.​ Transform Data:
○​ Navigate to Transform Data to enter Power Query Editor for cleaning and shaping data.
■​ Remove Columns: Select columns > Right-click > Remove Columns.
■​ Split Columns: Select column > Transform Tab > Split Column.
■​ Replace Values: Select column > Transform > Replace Values.
○​ Example:
■​ Splitting "Full Name" into "First Name" and "Last Name":
1.​ Select the column.
2.​ Choose Split Column > By Delimiter > Space.
3.​ Data Types:
○​ Define the correct data type for each column to avoid errors in analysis.
■​ Shortcut: Select column > Transform > Data Type dropdown.

2. DAX (Data Analysis Expressions)


1.​ Common DAX Functions:
○​ SUM: Adds up values in a column.
■​ Example: Total Sales = SUM(Sales[Amount])
○​ AVERAGE: Calculates the mean.
■​ Example: Average Sales = AVERAGE(Sales[Amount])
○​ IF: Conditional logic.
■ Example (as a calculated column): High Sales = IF(Sales[Amount] > 1000, "Yes", "No")
○​ CALCULATE: Apply filters to calculations.
■​ Example: Sales in 2023 = CALCULATE(SUM(Sales[Amount]), Sales[Year] = 2023)
○​ RELATED: Access columns from related tables.
■​ Example: Product Category = RELATED(Products[Category])
○​ DATEADD: Manipulate dates.
■​ Example: Last Year Sales = CALCULATE(SUM(Sales[Amount]), DATEADD(Sales[Date], -1, YEAR))
2.​ Creating Measures vs. Calculated Columns:
○​ Measure:
■​ Dynamic; calculates on the fly.
■​ Example: Total Profit = SUM(Sales[Amount]) - SUM(Cost[Amount])
○​ Calculated Column:
■​ Static; tied to individual rows.
■​ Example: Profit Margin = Sales[Profit] / Sales[Amount]
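The practical difference between the two shows up with ratios: a calculated column stores one ratio per row, while a measure recomputes the ratio from whatever rows are currently in the filter context. A plain-Python analogy with invented numbers (the function and variable names are illustrative):

```python
sales = [
    {"amount": 100.0, "profit": 10.0},
    {"amount": 900.0, "profit": 360.0},
]

# Calculated-column style: computed once per row and stored with the row.
for row in sales:
    row["profit_margin"] = row["profit"] / row["amount"]

# Measure style: computed on demand over the rows currently in scope.
def margin_measure(rows):
    return sum(r["profit"] for r in rows) / sum(r["amount"] for r in rows)

overall_margin = margin_measure(sales)
large_orders_margin = margin_measure([r for r in sales if r["amount"] > 500])

# Averaging the stored per-row ratios gives a different (unweighted) answer.
average_of_rows = sum(r["profit_margin"] for r in sales) / len(sales)
```

Here the overall margin is 37% while the average of the row margins is 25%, which is why ratios are usually better expressed as measures.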

3. Visualizations
1. Types of Visuals:
○ Bar Chart: Compare categories.
○​ Line Chart: Track trends over time.
○​ Scatter Plot: Identify relationships between variables.
○​ Pie Chart: Show proportions.
○​ Map: Visualize geographical data.
2.​ Customizing Visuals:
○​ Filters: Add filters directly to visuals.
○​ Formatting:
■​ Adjust axes, labels, and colors.
■​ Example: Use Data Colors to highlight key data points.
○​ Tooltips:
■​ Enhance insights by showing additional information when hovering over data points.
3.​ Visual Hierarchies:
○​ Drag fields to create drill-down capabilities.
○​ Example:
■​ Year > Quarter > Month drill-down in a Line Chart.

4. Relationships
1.​ Model Relationships:
○​ Found in the Model view.
○​ Types:
■​ One-to-Many (most common).
■​ Many-to-Many.
○​ Example:
■​ Link Sales table to Products table via ProductID.
2.​ Edit Relationships:
○​ Double-click the relationship line > Set Cardinality and Cross-filter Direction.
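A one-to-many relationship works like a keyed lookup: each row on the "many" side finds its single partner on the "one" side through the shared key. The sketch below imitates the Sales to Products link via ProductID in plain Python; the data and the "(blank)" label for unmatched keys are assumptions for illustration.

```python
from collections import defaultdict

# "One" side: each product ID appears exactly once.
products = {
    101: {"name": "Hat", "category": "Apparel"},
    102: {"name": "Mug", "category": "Kitchen"},
}

# "Many" side: a product ID can repeat across sales rows.
sales = [
    {"product_id": 101, "amount": 20.0},
    {"product_id": 102, "amount": 8.0},
    {"product_id": 101, "amount": 25.0},
    {"product_id": 999, "amount": 5.0},  # no matching product
]

sales_by_category = defaultdict(float)
for sale in sales:
    product = products.get(sale["product_id"])
    category = product["category"] if product else "(blank)"
    sales_by_category[category] += sale["amount"]
```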

5. Filters and Slicers


1.​ Filters:
○​ Types:
■​ Visual-Level: Filters applied to a specific visual.
■​ Page-Level: Filters applied to the current page.
■​ Report-Level: Filters applied across the entire report.
○​ Example:
■​ Add a filter for "Region" to show only "North America".
2.​ Slicers:
○​ Interactive filters.
○​ Example:
■​ Add a slicer for "Year" to allow users to select a specific year.

6. Performance Optimization
1.​ Data Reduction:
○​ Remove unnecessary columns and rows.
○​ Use summarized data for analysis.
2.​ Efficient DAX:
○​ Avoid using CALCULATE within calculated columns.
○​ Use measures wherever possible.
3.​ Pre-Aggregated Tables:
○​ Pre-aggregate data before importing to Power BI.

7. Exporting and Sharing


1.​ Export Data:
○​ Right-click on a visual > Export Data to Excel/CSV.
2.​ Publish Reports:
○​ Publish to Power BI Service:
■​ Home > Publish > Select Workspace.
3.​ Embedding Reports:
○​ Embed dashboards in SharePoint, Teams, or custom apps.

8. Advanced Analytics
1.​ Bookmarks:
○​ Capture a view of the report for storytelling.
○​ View Tab > Bookmarks.
2.​ Drillthrough Pages:
○​ Create dedicated pages for detailed insights.
○​ Example:
■​ Drillthrough to view sales by individual customer.
3.​ What-If Parameters:
○​ Add interactive sliders for scenario analysis.

9. Keyboard Shortcuts
1.​ Common Shortcuts:
○​ Open File: Ctrl + O
○​ Save: Ctrl + S
○​ Add Visualization: Drag fields to canvas.
○​ Undo: Ctrl + Z
○​ Redo: Ctrl + Y

10. Common Tips


1.​ Naming Conventions:
○​ Use clear, descriptive names for tables, columns, and measures.
2.​ Version Control:
○​ Maintain a history of changes to reports and datasets.
3.​ Test and Validate:
○​ Cross-check calculations and visuals against raw data.
IBM Cognos

1. Understanding the Basics


●​ Cognos Overview:
○​ IBM Cognos is a business intelligence tool for data analysis and reporting.
○​ Components include:
■​ Cognos Analytics: For self-service data visualization.
■​ Framework Manager: For data modeling.
■​ Cognos Query Studio: For simple queries and reporting.

2. Navigating the Interface


●​ Dashboard Layout:
○​ Welcome Page: Central hub to access dashboards, reports, and datasets.
○​ Data Module: Used for creating data models.
○​ Reporting Tool: For designing reports with advanced layouts.

3. Data Preparation Commands and Features


●​ Uploading Data:
○​ Drag and drop your dataset (CSV, Excel, or SQL database) into the data module.
○​ Use the "Manage Data" option to verify and clean your data.
●​ Data Cleansing Commands:
○​ Remove Duplicates:
■​ Use the Deduplicate function to ensure unique rows in a column.
○​ Filter Rows:
■​ FILTER(column_name, condition)
■​ Example: FILTER(Sales, Sales > 1000).
●​ Joining Tables:
○​ Use Relationships to create joins between tables.
○​ Types of Joins:
■​ Inner Join: Only matching rows.
■​ Outer Join: Includes unmatched rows.
●​ Data Grouping:
○​ GROUPBY(column_name) to summarize data by a specific column.
○​ Example: Group sales data by region.

4. Report Building
●​ Creating a New Report:
○​ Use the "+" button to start a new report.
○​ Select a layout: Tabular, Chart, Crosstab, or Custom.
●​ Common Report Elements:
○​ Text Items: Add headers or explanatory text.
○​ Lists: Tabular view for data.
○​ Crosstabs: Data comparison with rows and columns.
○​ Charts: Visualize trends and comparisons.
●​ Report Filters:
○​ Static Filters: Predefined conditions.
○​ Dynamic Filters: Allow user interaction.
1.​ Example: Filter sales by year using a dropdown.
●​ Conditional Formatting:
○​ Highlight cells based on conditions.
○​ Example: Highlight sales > $10,000 in green:
1.​ Select column.
2.​ Apply "Conditional Style."
3.​ Define rule: Sales > 10,000.
5. Data Visualization Tips
●​ Types of Visualizations:
○​ Line Charts: For trends over time.
○​ Bar Charts: For comparing categories.
○​ Pie Charts: For proportions.
○​ Heatmaps: For density or magnitude.
●​ Customization Options:
○​ Change colors, add data labels, and adjust axes for clarity.
○​ Example: In a bar chart, right-click the axis to rename it for better understanding.

6. Advanced Analytics
●​ Calculations:
○​ Basic Calculations:
■​ SUM(column_name)
■​ AVERAGE(column_name)
■​ Example: TOTAL(Sales) / COUNT(Region) for average sales by region.
○​ Custom Expressions:
■​ CASE WHEN condition THEN value ELSE value END
■​ Example: CASE WHEN Sales > 5000 THEN 'High' ELSE 'Low' END.
●​ Drill-Through Reports:
○​ Create links to detailed reports.
○​ Use "Drill-Through Definitions" to pass context (e.g., Region -> Region Details).
●​ Forecasting:
○​ Enable "Predictive Analytics" to forecast future trends based on historical data.

7. Collaboration and Sharing


●​ Exporting Reports:
○​ Options: PDF, Excel, CSV, or XML.
○​ Use "Export" in the toolbar to save reports for offline use.
●​ Sharing Dashboards:
○​ Use "Share" to generate a link or embed dashboards in websites.
●​ Scheduled Reports:
○​ Automate report delivery with the "Schedules" feature.

8. SQL and Query Editor


●​ Running SQL Queries:
○​ Switch to "Query Mode" for advanced data manipulations.
○​ Example:​
SELECT Region, SUM(Sales)
FROM SalesData
WHERE Year = 2024
GROUP BY Region
●​ Parameterization:
○​ Add prompts for user input.
○​ Example:​
SELECT *
FROM Sales
WHERE Region = #prompt('Region', 'string')#
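Outside Cognos, the same idea of prompting for a value and passing it to the query as a parameter (rather than pasting it into the SQL text) can be sketched with Python's sqlite3 module. The Sales table, its rows, and the sales_for_region helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sales (Region TEXT, Year INTEGER, Amount REAL);
    INSERT INTO Sales VALUES
        ('East', 2024, 100), ('East', 2024, 200),
        ('West', 2024, 50), ('East', 2023, 999);
    """
)

def sales_for_region(region):
    """Total 2024 sales for one region; the value is bound, not concatenated."""
    return conn.execute(
        "SELECT Region, SUM(Amount) FROM Sales "
        "WHERE Region = ? AND Year = 2024 GROUP BY Region",
        (region,),
    ).fetchall()

east = sales_for_region("East")
# A hostile input is treated as an ordinary (non-matching) region name.
injected = sales_for_region("x' OR '1'='1")
```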

9. Optimization Tips
●​ Use data caching to speed up queries.
●​ Reduce the number of joins for faster performance.
●​ Aggregate data at the source before importing.

10. Troubleshooting
●​ Common Errors:
○​ Data mismatch: Ensure all columns used in joins have compatible data types.
○​ Report rendering issues: Check dataset size and reduce unnecessary calculations.
●​ Debugging Tools:
○​ Use the Validation feature to check queries and reports.

11. Keyboard Shortcuts


Action Shortcut

Save Report Ctrl + S

Run Report Ctrl + R

Toggle Fullscreen F11

Refresh Data F5

12. Visual Example of a Workflow


A typical workflow runs as follows:
1.​ Data Import: Connect to a database or upload CSV.
2.​ Data Preparation: Clean, filter, and join datasets.
3.​ Report Creation: Use tables, charts, and cross-tabs.
4.​ Analysis: Add calculations, filters, and predictions.
5.​ Sharing: Export or schedule the report.
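
The five steps above can be mirrored end to end in a few lines of Python, which is handy for prototyping a report before building it in the BI tool. This is only a sketch: the CSV text is invented, and "report" here simply means totals per region.

```python
import csv
import io

# 1. Data import: a small CSV stands in for a database or uploaded file.
raw_csv = "Region,Sales\nEast,100\nWest,\nEast,50\nWest,70\n"
records = list(csv.DictReader(io.StringIO(raw_csv)))

# 2. Data preparation: drop rows with a missing Sales value and convert to numbers.
clean = [
    {"Region": r["Region"], "Sales": float(r["Sales"])}
    for r in records
    if r["Sales"]
]

# 3-4. Report creation and analysis: total sales per region.
report = {}
for r in clean:
    report[r["Region"]] = report.get(r["Region"], 0.0) + r["Sales"]

# 5. Sharing: export the result as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["Region", "TotalSales"])
for region_name in sorted(report):
    writer.writerow([region_name, report[region_name]])
exported = out.getvalue()
```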

Microsoft Visio

1. Key Components of Visio


Shapes and Stencils
●​ Shapes: The basic building blocks of Visio diagrams. They represent entities, processes, or relationships in a data model.
○​ Drag-and-Drop: Use the left panel to drag shapes into the workspace.
○​ Common Shapes:
■​ Rectangle: Represents processes.
■​ Diamond: Represents decisions in a flowchart.
■​ Ellipse: Represents start or end points.
○​ Shortcut: Use Ctrl + Drag to duplicate shapes.
●​ Stencils: Collections of related shapes (e.g., Flowchart, Data Visualization).
○​ Access via the Shapes Pane on the left.
Templates
●​ Predefined diagram layouts tailored to specific scenarios (e.g., Flowcharts, Org Charts, UML Diagrams).
○​ Open via File > New > Templates.

2. Data Linking and Importing


Link Data to Diagrams
●​ Data Import Options:
○​ Excel, Access, SQL Server, SharePoint, ODBC, or other external sources.
○​ Steps:
1.​ Go to Data > Link Data to Shapes.
2.​ Select the data source.
3.​ Map fields to shape data.
●​ Auto-Link Shapes:
○​ Use the Automatically Link feature to map rows in your dataset to Visio shapes based on matching data fields.
○​ Navigate to Data > Automatically Link > Match Columns.
Data Visualizer Add-In (Excel Integration)
●​ Generate flowcharts and other diagrams in Visio automatically from data kept in Excel.
○​ Available in Excel under Insert > Add-ins (search for Data Visualizer).
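
Automatic linking boils down to a key match between a column in the data and a field stored on each shape. The sketch below shows that matching logic in plain Python; it does not call Visio, and the shape names, the StepID field, and the data rows are all hypothetical.

```python
# Hypothetical shapes, each carrying one shape-data field used as the match key.
shapes = [
    {"name": "Process A", "StepID": "S1"},
    {"name": "Process B", "StepID": "S2"},
    {"name": "Process C", "StepID": "S9"},
]

# Hypothetical rows from the external data source.
data_rows = [
    {"StepID": "S1", "Owner": "Ana", "Status": "Done"},
    {"StepID": "S2", "Owner": "Raj", "Status": "Open"},
]

# Index the rows by the matching column for fast lookup.
rows_by_key = {row["StepID"]: row for row in data_rows}

linked = {}    # shape name -> matched data row
unlinked = []  # shapes with no matching row
for shape in shapes:
    row = rows_by_key.get(shape["StepID"])
    if row is None:
        unlinked.append(shape["name"])
    else:
        linked[shape["name"]] = row
```

Shapes that end up in unlinked usually point to a typo or a type mismatch in the key column, which is worth checking before linking.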

3. Common Diagrams for Data Analysis


Flowcharts
●​ Purpose: Visualize processes.
●​ Key Commands:
○​ Use the Flowchart stencil for shapes.
○​ Add Connectors (Ctrl + 3 for Connector Tool; Ctrl + 1 returns to the Pointer Tool).
○​ Double-click a shape to add text; right-click it to edit shape data.
Swimlane Diagrams
●​ Purpose: Separate processes into functional areas.
●​ Steps:
1.​ Use the Cross-Functional Flowchart template.
2.​ Add lanes to represent departments or roles.
3.​ Drag shapes into lanes for clarity.
ER Diagrams (Entity Relationship)
●​ Purpose: Model database structures.
●​ Key Features:
○​ Use Database Model Diagram template.
○​ Represent entities, relationships, and cardinality.
○​ Use connectors to define primary key-foreign key relationships.

4. Key Commands and Shortcuts


Basic Navigation
●​ Pan Canvas: Hold Space + Drag.
●​ Zoom: Use Ctrl + Scroll, or press Ctrl + Shift + W to fit the whole page in the window. Ctrl + 1/2/3 select the Pointer, Text, and Connector tools, not zoom levels.
●​ Center Object: Select shape and press Ctrl + Shift + C.
Align and Distribute
●​ Align shapes:
○​ Home > Arrange > Align.
○​ Keyboard Shortcut: F8 opens the Align Shapes dialog (Ctrl + L rotates the selection left instead).
●​ Distribute shapes evenly:
○​ Home > Arrange > Distribute.
Connector Commands
●​ Dynamic Connectors:
○​ Automatically adjust when shapes are moved.
○​ Add via Ctrl + 3 or the Connector Tool on the Home tab.
●​ Straight Connectors:
○​ Right-click the connector and choose Straight Connector, or set the style for the whole page under Design > Connectors.
Format Shapes
●​ Fill Color: Ctrl + Shift + F for color dialog.
●​ Line Style: Right-click > Format Shape > Line.
●​ Quick Styles: Select shapes, then use Design > Quick Styles.

5. Advanced Tips
Data Graphics
●​ Overlay dynamic data onto shapes for enhanced visualization.
○​ Steps:
1.​ Import data.
2.​ Go to Data > Display Data Graphics.
3.​ Customize display options (text, icons, or bars).
Layer Management
●​ Organize and control visibility of diagram elements.
○​ Steps:
1.​ Navigate to Home > Layers > Layer Properties.
2.​ Create new layers and assign shapes.
Validation
●​ Ensure diagram accuracy by checking rules.
○​ Go to Process > Check Diagram to validate against predefined rules (e.g., for flowcharts or BPMN diagrams).
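
To make the idea of rule-based validation concrete, here is a small Python sketch that checks two flowchart rules on a diagram described as shapes and connectors. The rules are illustrative assumptions, not Visio's built-in rule set, and check_diagram is a hypothetical helper.

```python
# Hypothetical diagram: shape name -> shape kind, plus directed connectors.
shapes = {
    "Start": "terminator",
    "Process A": "process",
    "Decision?": "decision",
    "End": "terminator",
}
connectors = [
    ("Start", "Process A"),
    ("Process A", "Decision?"),
    ("Decision?", "End"),
]


def check_diagram(shapes, connectors):
    """Return a list of rule violations (empty when the diagram passes)."""
    outgoing = {name: 0 for name in shapes}
    incoming = {name: 0 for name in shapes}
    for source, target in connectors:
        outgoing[source] += 1
        incoming[target] += 1

    issues = []
    for name, kind in shapes.items():
        # Rule 1 (assumed): every shape must be connected to something.
        if outgoing[name] == 0 and incoming[name] == 0:
            issues.append(f"{name}: not connected")
        # Rule 2 (assumed): a decision must branch at least two ways.
        if kind == "decision" and outgoing[name] < 2:
            issues.append(f"{name}: decision needs at least two outgoing connectors")
    return issues


issues = check_diagram(shapes, connectors)
```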

6. Visualization Examples
Flowchart Example
[Start] --> [Process A] --> [Decision?]
                             |       |
                             V       V
                           [Yes]    [No]
●​ Use Decision Diamonds to branch flows.

ER Diagram Example
Entity      Attributes
Customer    ID (PK), Name, Email
Orders      OrderID (PK), Date, CustomerID (FK)


●​ Link the entities by having the Foreign Key (FK) Orders.CustomerID reference the Primary Key (PK) of Customer.
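
The same model can be expressed as tables to confirm that the key relationship behaves as drawn. The sketch below uses Python's built-in sqlite3 module; the column types and sample values are assumptions, since the diagram only names the attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript(
    """
    CREATE TABLE Customer (
        ID    INTEGER PRIMARY KEY,
        Name  TEXT,
        Email TEXT
    );
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        Date       TEXT,
        CustomerID INTEGER REFERENCES Customer(ID)
    );
    """
)

conn.execute("INSERT INTO Customer VALUES (1, 'Ana', 'ana@example.com')")
conn.execute("INSERT INTO Orders VALUES (10, '2024-01-15', 1)")

# An order pointing at a customer that does not exist should be rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (11, '2024-01-16', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```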

Swimlane Diagram Example


Marketing | [Generate Leads] -------> [Pass to Sales]
Sales | [Qualify] ---> [Close Deal]
Operations | [Deliver Product]

7. Export and Sharing


Export Options
●​ Image/PDF: Go to File > Save As and choose the format.
●​ Share Online:
○​ File > Share > OneDrive/Teams.
Integration with Power BI
●​ Import Visio diagrams into Power BI for interactive dashboards.
○​ Install the Visio Visual from Power BI marketplace.
○​ Link diagrams and data for live updates.

8. Troubleshooting Tips
●​ Slow Performance:
○​ Reduce file size by simplifying diagrams.
○​ Turn off shape shadows and 3D effects.
●​ Connector Issues:
○​ Ensure shapes are properly grouped.
○​ Use Ctrl + Shift + O to view connection points.

Google Looker Studio

1. Data Connections
●​ Connect to Data Sources:​
Looker Studio supports over 800 data connectors.
○​ Popular options: Google Sheets, BigQuery, Google Analytics, MySQL, PostgreSQL.
○​ Use Extract Data Connector to cache frequently used data for faster dashboards.
●​ Tips:
○​ Ensure datasets have proper field naming conventions.
○​ Pre-clean your data to avoid unnecessary calculations.

2. Data Transformation
●​ Calculated Fields:
○​ Create new fields using Looker Studio's formulas.
○​ Examples:
■​ SUM(Sales): Total sales.
■​ CASE WHEN condition THEN result ELSE default END: Conditional labels, for example:​
CASE
WHEN Age >= 18 THEN "Adult"
ELSE "Minor"
END
●​ Blending Data:
○​ Combine datasets on common keys (e.g., blending sales and customer demographics using CustomerID).
○​ Ensure the join key is consistent across datasets.
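
Blending behaves like a join, so it helps to see what happens to rows whose key has no match. The Python sketch below assumes a left outer join on CustomerID (one of several possible join types) and uses invented sample data.

```python
# Hypothetical left-hand table: one row per sale.
sales = [
    {"CustomerID": 1, "Revenue": 120.0},
    {"CustomerID": 2, "Revenue": 80.0},
    {"CustomerID": 3, "Revenue": 45.0},
]

# Hypothetical right-hand table: customer demographics.
demographics = [
    {"CustomerID": 1, "AgeGroup": "Adult"},
    {"CustomerID": 2, "AgeGroup": "Minor"},
]

demo_by_id = {row["CustomerID"]: row for row in demographics}

# Left outer join: keep every sales row and fill missing matches with None.
blended = [
    {**sale, "AgeGroup": demo_by_id.get(sale["CustomerID"], {}).get("AgeGroup")}
    for sale in sales
]

# Consistency check: join keys present in sales but absent from demographics.
unmatched_keys = [
    sale["CustomerID"] for sale in sales if sale["CustomerID"] not in demo_by_id
]
```

A key stored as text in one source and as a number in the other ("1" versus 1) would also show up as unmatched, which is the most common cause of empty blended fields.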

3. Visualization Options
●​ Charts:
○​ Time Series Chart: Trends over time.
○​ Bar/Column Charts: Compare categories.
○​ Pie Charts: Show proportions.
○​ Geo Maps: Visualize data by location.
●​ Pro Tip: Avoid pie charts for more than 5 categories; use bar or stacked bar charts instead.
●​ Interactive Elements:
○​ Use filters and controls (e.g., dropdowns, date range pickers) for dynamic dashboards.
○​ Set default filters for the most common views.

4. Best Practices for Dashboards
●​ Design:
○​ Keep dashboards clean and intuitive.
○​ Group related metrics together (e.g., KPIs at the top, detailed breakdowns below).
○​ Use consistent colors for better readability.
●​ KPIs:
○​ Display key metrics using Scorecards.
○​ Example:
■​ Total Sales: $1,200,000
■​ YOY Growth: 15%
●​ Visualization:
○​ Use color-coded scorecards to indicate performance (e.g., red for negative growth, green for positive).
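
For reference, year-over-year growth and the color rule can be written out as below. This is a Python sketch rather than a Looker Studio formula; the prior-year total is a hypothetical figure, and the gray color for zero growth is an added assumption.

```python
def yoy_growth(current, previous):
    """Year-over-year growth as a fraction, e.g. 0.15 for 15%."""
    if previous == 0:
        raise ValueError("previous-period value must be non-zero")
    return (current - previous) / previous


def scorecard_color(growth):
    """Green for positive growth, red for negative, gray (assumed) for flat."""
    if growth > 0:
        return "green"
    if growth < 0:
        return "red"
    return "gray"


# Total Sales from the example above against a hypothetical prior-year total.
growth = yoy_growth(1_200_000, 1_000_000)
label = f"{growth:.0%}"
color = scorecard_color(growth)
```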

5. Key Looker Studio Functions


Function             Syntax                                    Description
SUM()                SUM(Sales)                                Returns the sum of a field. Useful for total metrics like revenue or expenses.
AVG()                AVG(Profit Margin)                        Calculates the average of a field.
COUNT()              COUNT(CustomerID)                         Counts the number of records.
CASE                 CASE WHEN condition THEN result END       Creates conditional logic (similar to IF).
IF()                 IF(condition, true_value, false_value)    Simplifies conditional expressions.
DATEDIFF()           DATEDIFF(Date1, Date2)                    Returns the difference between two dates in days.
CONCAT()             CONCAT(FirstName, " ", LastName)          Combines strings into one.
REGEXP_MATCH()       REGEXP_MATCH(Field, "regex")              Checks if a field matches a regex pattern.
EXTRACT()            EXTRACT(YEAR FROM Date)                   Extracts components from a date (YEAR, MONTH, DAY).
FORMAT_DATETIME()    FORMAT_DATETIME("%Y-%m-%d", Date)         Formats dates into custom formats.
TRUNC()              TRUNC(Number, Digits)                     Truncates a number to a specific number of decimal places.
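
When checking dashboard numbers against a script, it helps to have rough Python equivalents of a few functions from the table above. The helpers below are approximations written for this sheet, not Looker Studio code. Two behaviors are assumptions: that DATEDIFF returns the first date minus the second, and that REGEXP_MATCH succeeds only when the whole value matches.

```python
import math
import re
from datetime import date


def datediff(end, start):
    """Days between two dates, computed as end minus start."""
    return (end - start).days


def concat(*parts):
    """Join strings into one, like CONCAT."""
    return "".join(parts)


def regexp_match(value, pattern):
    """True only when the entire value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def trunc(number, digits=0):
    """Drop digits past the given decimal place without rounding."""
    factor = 10 ** digits
    return math.trunc(number * factor) / factor


days = datediff(date(2024, 3, 1), date(2024, 2, 1))
full_name = concat("Ada", " ", "Lovelace")
formatted = date(2024, 3, 1).strftime("%Y-%m-%d")  # same pattern as the FORMAT_DATETIME row
```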

6. Advanced Visualization Techniques


●​ Heat Maps:
○​ Use conditional formatting in tables to highlight values.
○​ Example:
■​ Revenue Growth by Region: Apply gradient colors to show performance.
●​ Custom Calculations in Charts:
○​ Add calculated metrics directly to charts.
○​ Example:
■​ Profit Margin = (Revenue - Cost) / Revenue.
●​ Dynamic KPIs:
○​ Use date range controls to dynamically update KPIs.
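
One detail of the profit margin formula is easy to get wrong: the ratio should be computed on aggregated totals, not averaged row by row. The Python sketch below shows the difference on two invented orders; in a chart the aggregated form would typically be written as (SUM(Revenue) - SUM(Cost)) / SUM(Revenue).

```python
# Hypothetical orders: one large low-margin order, one small high-margin order.
rows = [
    {"Revenue": 1000.0, "Cost": 900.0},  # 10% margin
    {"Revenue": 100.0, "Cost": 50.0},    # 50% margin
]

total_revenue = sum(r["Revenue"] for r in rows)
total_cost = sum(r["Cost"] for r in rows)

# Margin on aggregated totals: weights each order by its revenue.
aggregate_margin = (total_revenue - total_cost) / total_revenue

# Average of per-row margins: treats both orders as equally important.
average_of_row_margins = sum(
    (r["Revenue"] - r["Cost"]) / r["Revenue"] for r in rows
) / len(rows)
```

Here the aggregated margin is about 13.6%, while the row-level average is 30%; the aggregated figure is the one that reflects overall profitability.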

7. Filters and Segments


●​ Types of Filters:
○​ Dimension Filters: Filter data by category (e.g., filter by product type).
○​ Metric Filters: Filter data by range (e.g., revenue > $10,000).
●​ Segmentation:
○​ Example (High-Value Customers):​
CASE
WHEN Revenue > 10000 THEN "High Value"
ELSE "Regular"
END
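
The difference between a dimension filter, a metric filter, and a segment is easiest to see on a tiny dataset. The following Python sketch uses made-up customers and reuses the 10,000 revenue threshold from the example above.

```python
# Hypothetical customer rows.
customers = [
    {"Name": "Ana", "ProductType": "Hardware", "Revenue": 15000},
    {"Name": "Raj", "ProductType": "Software", "Revenue": 10000},
    {"Name": "Lee", "ProductType": "Hardware", "Revenue": 4000},
]

# Dimension filter: keep rows by category.
hardware_only = [c for c in customers if c["ProductType"] == "Hardware"]

# Metric filter: keep rows by numeric range.
over_10k = [c for c in customers if c["Revenue"] > 10000]


def segment(revenue):
    """Mirror of the CASE expression: strictly above 10000 is 'High Value'."""
    return "High Value" if revenue > 10000 else "Regular"


# Segmentation: label every row instead of removing any.
segments = {c["Name"]: segment(c["Revenue"]) for c in customers}
```

A customer with revenue of exactly 10,000 lands in "Regular" because the comparison is strict; use >= if the boundary should count as high value.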

8. Optimization Tips
●​ Performance:
○​ Use aggregated data to reduce computation time.
○​ Avoid overloading dashboards with unnecessary charts.

●​ Caching:
○​ Use the Extract Data Connector to cache static datasets.
○​ Be aware that Google Analytics may sample large datasets, which speeds up queries but reduces accuracy.
●​ Version Control:
○​ Make a copy of the report, or note the current entry in version history, before making major changes.

9. Sharing and Collaboration


●​ Sharing Dashboards:
○​ Control access by role (Viewer, Editor, Owner).
○​ Embed dashboards in websites or share via a link.
●​ PDF Export:
○​ Download the report as a PDF, or schedule email delivery of dashboard snapshots.

10. Pro-Tips for Analysts


1.​ Start with the Question: Define what you need to analyze before building.
2.​ Storytelling with Data:
○​ Highlight key insights and actionable trends.
○​ Use annotations on charts for context.
3.​ Iterative Approach: Test and refine dashboards based on stakeholder feedback.
4.​ Use Templates: Leverage pre-built Looker Studio templates for faster setup.

11. Example Dashboard Layout


Section 1: Key Metrics (KPIs)
●​ Scorecards: Total Revenue, Average Order Value, YOY Growth.
Section 2: Trends
●​ Time Series Chart: Monthly Revenue.
Section 3: Breakdown
●​ Bar Chart: Sales by Product Category.
●​ Geo Map: Revenue by Region.
Section 4: Filters
●​ Date Range, Product Category, Region.
