0% found this document useful (0 votes)
539 views6 pages

Informatica Kickoff Jan

The document outlines the Informatica Challenge kickoff on January 25, detailing the setup of Oracle SQL tables for a Superstore dataset, including tables for cleaned data, top customers, and discount analysis. It describes the data preparation process, including loading data from a CSV file and executing various Informatica tasks to clean data, identify top customers, and analyze discounts. Additionally, it provides instructions for exporting the resulting data into CSV files.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
539 views6 pages

Informatica Kickoff Jan

The document outlines the Informatica Challenge kickoff on January 25, detailing the setup of Oracle SQL tables for a Superstore dataset, including tables for cleaned data, top customers, and discount analysis. It describes the data preparation process, including loading data from a CSV file and executing various Informatica tasks to clean data, identify top customers, and analyze discounts. Additionally, it provides instructions for exporting the resulting data into CSV files.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

Informatica Challenge Kick off Jan-25

1. Oracle SQL Setup

● Table Creation Queries: 1.1 Superstore Table

CREATE TABLE Superstore (


Category VARCHAR(255),
City VARCHAR(255),
Country_Region VARCHAR(255),
Customer_ID VARCHAR(255),
Customer_Name VARCHAR(255),
Order_Date VARCHAR(255),
Order_ID VARCHAR(255),
Order_Profitable VARCHAR(255),
Postal_Code NUMBER,
Product_ID VARCHAR(255),
Product_Name VARCHAR(255),
Region VARCHAR(255),
Row_ID NUMBER,
Segment VARCHAR(255),
Ship_Date VARCHAR(255),
Ship_Mode VARCHAR(255),
Ship_Status VARCHAR(255),
State VARCHAR(255),
Sub_Category VARCHAR(255),
Days_to_Ship_Actual NUMBER,
Days_to_Ship_Scheduled NUMBER,
Discount VARCHAR(255),
Profit NUMBER,
Quantity NUMBER,
Sales NUMBER,
Sales_Forecast NUMBER
);

1.2 Superstore_Cleaned_Data Table

CREATE TABLE Superstore_Cleaned_Data (


Category VARCHAR(255),
City VARCHAR(255),
Customer_ID VARCHAR(255),
Customer_Name VARCHAR(255),
Order_Date VARCHAR(255),
Order_ID VARCHAR(255),
Order_Profitable VARCHAR(255),
Product_ID VARCHAR(255),
Product_Name VARCHAR(255),
Region VARCHAR(255),
Row_ID NUMBER,
Segment VARCHAR(255),
Ship_Date VARCHAR(255),
Ship_Mode VARCHAR(255),
Ship_Status VARCHAR(255),
State VARCHAR(255),
Sub_Category VARCHAR(255),
Days_to_Ship_Actual NUMBER,
Days_to_Ship_Scheduled NUMBER,
Discount VARCHAR(255),
Profit NUMBER,
Quantity NUMBER,
Sales NUMBER,
Sales_Forecast NUMBER
);

1.3 Top_15 Table

CREATE TABLE Top_15 (


Category VARCHAR(255),
City VARCHAR(255),
Customer_ID VARCHAR(255),
Customer_Name VARCHAR(255),
Order_Date VARCHAR(255),
Order_ID VARCHAR(255),
Order_Profitable VARCHAR(255),
Region VARCHAR(255),
Discount VARCHAR(255),
Profit NUMBER,
Quantity NUMBER,
Sales NUMBER,
Total_Profit NUMBER
);

1.4 Discount_Analysis Table

CREATE TABLE Discount_Analysis (


Category VARCHAR(255),
City VARCHAR(255),
Customer_ID VARCHAR(255),
Customer_Name VARCHAR(255),
Order_Date VARCHAR(255),
Order_ID VARCHAR(255),
Order_Profitable VARCHAR(255),
Product_ID VARCHAR(255),
Product_Name VARCHAR(255), Region VARCHAR(255),
Row_ID NUMBER,
Segment VARCHAR(255),
Ship_Date VARCHAR(255),
Ship_Mode VARCHAR(255),
Ship_Status VARCHAR(255),
State VARCHAR(255),
Sub_Category VARCHAR(255),
Days_to_Ship_Actual NUMBER,
Days_to_Ship_Scheduled NUMBER,
Discount VARCHAR(255),
Profit NUMBER,
Quantity NUMBER,
Sales NUMBER,
Sales_Forecast NUMBER,
Discount_Updated NUMBER,
Avg_Profit NUMBER,
Avg_Sales NUMBER
);

2. Data Preparation

● Dataset Location:
~/Desktop/Project/kickoffs-information-tableau-superstore/Superstore.csv
● Load Data: Use Oracle SQL Developer to import Superstore.csv into the Superstore table.

3. Informatica Tasks

Task 1: Cleaned Data

● Mapping Name: m_cleaned_data

● Workflow Name: w_cleaned_data

● Session Name: s_cleaned_data

● Target Table: Superstore_Cleaned_Data

● CSV Name: superstore_cleaned_data.csv


Operations:

1. Import the Superstore table as the source.

2. Remove all rows with null values using Expression Transformation (e.g.,

ISNULL(Column_Name) = 0).

3. Delete columns Country_Region and Postal_Code.

4. Load cleaned data into Superstore_Cleaned_Data.

Sample output

Table count: 977

CUSTO ORDE ORDER_


CUSTOMER_ MER_N R_DA ORDER PROFITA PRODU
CATEGORY CITY ID AME TE _ID BLE CT_ID
SEGMEN SHIP_ SHIP_ SHIP_ST
PRODUCT_NAME REGION ROW_ID T DATE MODE ATUS STATE
DAYS_TO_ DAYS_TO_S SALES_
SHIP_ACTU HIP_SCHEDU DISCOU PROFI QUANT FORECA
SUB_CATEGORY AL LED NT T ITY SALES ST
CA-
New York Custome 2/17/ 2022- Profitabl FUR-
Furniture City C0037 r 47 2022 228926 e 2681
Global Deluxe
High-Back Corporat 2/25/ Standar Shipped Californ
Manager's Chair East 1 e 2022 d Class on Time ia
Furniture 4 2 0.15% 93.92 5 434.26 413.98

Task 2: Top 15 Customers by Category


● Mapping Name: m_top_15
● Workflow Name: w_top_15
● Session Name: s_top_15
● Target Table: Top_15
● CSV Name: top_15.csv
Operations:
1. Import Superstore_Cleaned_Data as the source.
2. Split data into three categories: Technology, Office Supplies, Furniture.
3. Filter rows where Order_Profitable = "Profitable".
4. Calculate Total_Profit (sum of Profit) grouped by Customer_Name.
5. Select Top 5 Customers per category based on Total_Profit.
6. Load results into the Top_15 table.

Sample output

Table count: 15

CUST CUSTO ORDE ORDER_ RE DIS PR QU S TOTA


CATE OMER MER_N R_DA ORDE PROFITA GI COU OF ANT AL L_PR
GORY CITY _ID AME TE R_ID BLE ON NT IT ITY ES OFIT
CA-
Office Phila 2022- 12 85
Suppli delp Custom 2/15/ 54650 Profitabl Eas 0.01 7.6 5.
es hia C0075 er 11 2022 0 e t % 4 5 2 371

Task 3: Discount Analysis


● Mapping Name: m_discount_analysis
● Workflow Name: w_discount_analysis
● Session Name: s_discount_analysis
● Target Table: Discount_Analysis
● CSV Name: discount_analysis.csv Operations:
1. Import Superstore_Cleaned_Data as the source.
2. Create Discount_Updated by removing "%" from the Discount column (e.g., 0.00% →
0.00).
3. Split data into segments: Consumer and Corporate.
4. Calculate Avg_Profit and Avg_Sales grouped by Region.
5. Load results into the Discount_Analysis table.

Sample output:

Table count: 8

CUSTO CUSTOM ORDE ORDER_P


MER_I ER_NAM R_DA ORDER ROFITABL PRODU PRODUCT_N
CATEGORY CITY D E TE _ID E CT_ID AME
SHIP_ SUB_C
ROW SEGM SHIP_DA MOD
SHIP_ST ATEGO DAYS_TO_SHI
REGION _ID ENT TE EATUS STATE RY P_ACTUAL
DAYS_TO_SH DISC SALES_ DISCOUN
IP_SCHEDUL OUN PROFI QUANTI FORECA T_UPDAT AVG_P
ED T T TY SALES ST ED ROFIT AVG_SALES
India CA- Hon Deluxe
napol Custome 1/28/ 2022- FUR- Fabric Micro
Furniture is C0087 r 92 2022 619416 Profitable 6289 Chair
Secon
Corpor 2/14/20 d Shipped Furnitu
Central 969 ate 22 Class Late Texas re 3
0.05 172.9
4 % 143.73 3 1 189.03 0 45 554

4. Export CSV Files

1. Export Steps in Oracle SQL Developer:


○ Run SELECT * FROM [Table_Name];
○ Right-click query results → Export → Format: CSV.
○ Save to output folder.

Required CSV Files:


○ superstore_cleaned_data.csv
○ top_15.csv
○ discount_analysis.csv

You might also like