
Creating Text Columns For Data Analysis Slides

The document discusses how to split product description strings into multiple columns for analysis using DAX functions. It describes using SUBSTITUTE to replace delimiters, SEARCH and FIND to locate text positions, and LEFT, MID, RIGHT to cut out attributes into new columns. It also covers capitalizing text with UPPER and REPLACE, looking up values with RELATED and RELATEDTABLE, joining text with COMBINEVALUES, and creating aggregated columns using RELATEDTABLE. The goal is to transform unstructured text into a structured format suitable for analysis.


Creating Text Columns for Data Analysis

Andrew McSwiggan
BUSINESS INTELLIGENCE SPECIALIST
Overview
- Use a series of text manipulation functions
- How to search through text strings to extract specific values
- Split the product description into multiple columns
- A firm foundation for product analysis
- Add custom columns to your dimension tables
Creating Columns
Analysis Requirements

- Season
- Product Item
- Product Type
- Color
- Size
Add Columns
- Filters
- Slicers
- Display

Creating Product Attributes

The Product Description is made up of a string of data elements

Autumn Winter Dress Classic White Large

We will use DAX to identify each element in the string

Season Product Type Color Size

Cut out each element from every record in the table

Create a new column attribute for each element

Save the new attributes and values in the product dimension table
Lists of values
Product descriptions
Look at the Data
How is it delimited?
Spaces!
Create a new description column
- Change the spaces
- To another value
- Not in the data
- Easy to see
Replace a String with SUBSTITUTE
SUBSTITUTE(
    <column_name>,
    <old_text>,
    <new_text>,
    <instance_num> )

SUBSTITUTE
Column: Name of the column (or text string)
Search Text: Text string to search for
Replacement Text: New text string
Instance: Identifies which occurrence to change


SUBSTITUTE(
    [Product_Description],
    " ",
    "&",
    1 )
Data Model
Row Context
Autumn Winter -> Autumn&Winter
Spring Summer -> Spring&Summer
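
As a calculated column, the single-instance replacement shown above might be written as in the sketch below. The table name 'Products' and the column name Season Joined are assumptions made for illustration; only [Product_Description] and the SUBSTITUTE arguments come from the slides.

```dax
// Calculated column on a hypothetical 'Products' table.
// The instance number 1 limits the change to the first space in each row.
Season Joined =
SUBSTITUTE ( 'Products'[Product_Description], " ", "&", 1 )
// "Autumn Winter Dress Classic White Large"
//   returns "Autumn&Winter Dress Classic White Large"
```

This only groups the season correctly when the season is always two words, as in the Autumn Winter and Spring Summer rows shown here.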
Create Column
- Invokes "Row Context"
- Iterates each row
- Before users interact with the model
Row Context
Create a DAX Expression
- More beneath the surface
- Create new column
- Expression populates each row
NESTING

SUBSTITUTE(
    SUBSTITUTE( [Product_Description], " ", "&", 1 ),
    " ",
    "-" )
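
A minimal sketch of the nested expression as a calculated column follows. The column name Product Base is taken from the SEARCH example later in the deck, on the assumption that this is how that column is built; the table name 'Products' is hypothetical.

```dax
// Calculated column on a hypothetical 'Products' table.
Product Base =
SUBSTITUTE (
    // Inner call: only the first space becomes "&", joining the two season words.
    SUBSTITUTE ( 'Products'[Product_Description], " ", "&", 1 ),
    " ",
    "-"   // Outer call: no instance number, so every remaining space becomes "-".
)
// "Autumn Winter Dress Classic White Large"
//   returns "Autumn&Winter-Dress-Classic-White-Large"
```

Leaving out the instance number in the outer call is what makes it replace all remaining spaces rather than a single occurrence.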
Identify Text Positions with SEARCH & FIND
Locate the Position

FIND: Case sensitive
SEARCH: Not case sensitive
SEARCH(
    "-",
    [Product Base],
    1,
    -1 )

SEARCH: Returns the character position
Find Text: String to find
Within Text: Column name or expression
Start Num: Character position in the text to start from (default = 1)
Not Found Value: Value returned when the text is not found (set to -1 here)


Locate the Text Positions

Season - Product - Product Type - Color - Size

14 22 30 35
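
One way to locate each delimiter is to start every SEARCH one character after the previous hit. The sketch below assumes a hypothetical 'Products' table; [Season Gap], [Product Gap] and [Post Color Gap] are the column names used in the cutting examples in this deck, while [Product Type Gap] is a name invented here for the third delimiter.

```dax
// Each definition is a separate calculated column on a hypothetical 'Products' table.

// Position of the first "-" (the one after the season).
Season Gap =
SEARCH ( "-", 'Products'[Product Base], 1, -1 )

// Position of the second "-": start looking just past the first one.
Product Gap =
SEARCH ( "-", 'Products'[Product Base], 'Products'[Season Gap] + 1, -1 )

// Position of the third "-" (hypothetical column name).
Product Type Gap =
SEARCH ( "-", 'Products'[Product Base], 'Products'[Product Gap] + 1, -1 )

// Position of the fourth "-" (the one after the color).
Post Color Gap =
SEARCH ( "-", 'Products'[Product Base], 'Products'[Product Type Gap] + 1, -1 )
```

The positions differ from row to row because they depend on the length of each element. For the value "Autumn&Winter-Dress-Classic-White-Large" the four columns evaluate to 14, 20, 28 and 34.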
Cut Out Text Parts with LEFT MID & RIGHT
Function Collection

"Cutting functions" is an educational label, not an official term
Cut Out the Attributes

LEFT MID RIGHT

Season - Product - Product Type - Color - Size

14 22 30 35
New Columns

- Season
- Product Item
- Product Type
- Color
- Size
LEFT (
    [Product Base],
    [Season Gap] - 1 )

LEFT: Returns the given number of characters from the start of a string
Source Column: The column that contains the string to evaluate
Number of Characters: The number of characters to take from the left


MID (
    [Product Base],
    [Season Gap] + 1,
    [Product Gap] - [Season Gap] - 1 )

MID: Cuts out a substring from the middle of a string
Column: A column that contains the block of source text
Start Num: The position in the middle of the string to start from
Length: The length of the text to cut out (number of characters)


RIGHT (
    [Product Base],
    LEN([Product Base]) - [Post Color Gap] )

RIGHT: Returns the given number of characters from the end of a string
Source Column: The column that contains the string to evaluate
Number of Characters: The number of characters to take from the right
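
Put together, the three cutting functions could populate new attribute columns as sketched below. The table name 'Products' and the new column names are assumptions; the gap columns are assumed to hold the delimiter positions found with SEARCH.

```dax
// Each definition is a separate calculated column on a hypothetical 'Products' table.
// Comments show the result for "Autumn&Winter-Dress-Classic-White-Large",
// where Season Gap = 14, Product Gap = 20 and Post Color Gap = 34.

// Everything before the first delimiter.
Season =
LEFT ( 'Products'[Product Base], 'Products'[Season Gap] - 1 )    // "Autumn&Winter"

// Text between the first and second delimiters.
Product Item =
MID (
    'Products'[Product Base],
    'Products'[Season Gap] + 1,                           // start after the first "-"
    'Products'[Product Gap] - 'Products'[Season Gap] - 1  // characters between the two "-"
)                                                         // "Dress"

// Everything after the last delimiter.
Size =
RIGHT (
    'Products'[Product Base],
    LEN ( 'Products'[Product Base] ) - 'Products'[Post Color Gap]
)                                                         // "Large"
```

The middle attributes (Product Type and Color) follow the same MID pattern using the next pair of delimiter positions.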


Cutting Functions

LEFT MID RIGHT


Data Cleansing with SUBSTITUTE
Messy Text
Clean the Text String

SUBSTITUTE

Light Blue -> LightBlue
Light Green -> LightGreen
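
A cleansing step like this could run before the spaces are turned into delimiters, so that a two-word color is not split into two elements. The sketch below is one possible form; the table name 'Products' and the column name Clean Description are hypothetical.

```dax
// Calculated column on a hypothetical 'Products' table.
// Each SUBSTITUTE removes the space inside one known two-word color.
Clean Description =
SUBSTITUTE (
    SUBSTITUTE ( 'Products'[Product_Description], "Light Blue", "LightBlue" ),
    "Light Green",
    "LightGreen"
)
```

SUBSTITUTE is case sensitive, so a value such as "light blue" would pass through unchanged under this sketch.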
Capitalising Text with UPPER & REPLACE
Product
- Lower case
Other Attributes
- First letter is Capital
CAPITALISE

UPPER ( <column> )

UPPER ( LEFT ( [Product],1 ) )

REPLACE ( [Product], 1 , 1 , UPPER(LEFT([Product],1)) )
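
As a calculated column, the REPLACE expression above might be used as follows. The table name 'Products' and the column name Product Capitalised are assumptions for illustration.

```dax
// Calculated column on a hypothetical 'Products' table.
Product Capitalised =
REPLACE (
    'Products'[Product],
    1,                                           // start at the first character
    1,                                           // replace exactly one character
    UPPER ( LEFT ( 'Products'[Product], 1 ) )    // with its upper-case version
)
// "dress" returns "Dress"
```

Only the first letter is touched; the rest of the text keeps whatever case it already has.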


LOOKUP Values

Lookup

Dimension Table Fact Table


Logic
- IF

Summary

Manipulate Text with DAX
- SUBSTITUTE
- SEARCH
- FIND
Cutout Text
- LEFT
- MID
- RIGHT

Lookup Values
- RELATED
- RELATEDTABLE
Join Text Strings with COMBINEVALUES
Combine Text Strings

[Product Type] & "/" & [Product]

CONCATENATE( CONCATENATE( [Product Type] , "/" ) , [Product] )

COMBINEVALUES( "/" , [Product Type] , [Product] )
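
The three forms above produce the same text. A sketch of the COMBINEVALUES version as a calculated column is shown below; the table name 'Products' and the column name Type and Product are hypothetical.

```dax
// Calculated column on a hypothetical 'Products' table.
// The first argument is the delimiter; the remaining arguments are joined in order.
Type and Product =
COMBINEVALUES ( "/", 'Products'[Product Type], 'Products'[Product] )
// With Product Type "Classic" and Product "Dress" this returns "Classic/Dress"
```

CONCATENATE accepts only two arguments, which is why it has to be nested to join three pieces, while COMBINEVALUES takes the delimiter once and any further number of expressions.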


Lookup Values with RELATED and RELATEDTABLE
Lookup Functions

RELATED: Single value
RELATEDTABLE: Table of values

RELATED ( <table name>[<column name>] )

RELATED
Table Name: The name of the related table
Column Name: The name of the column


Compare Dates

Registration Date Sales Date


Fact Table Row Context
Data Model
Customers First Day Sales

First Day: Sales on date of registration
Other Days: Sales on a later date
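
The date comparison could be sketched as a calculated column on the fact table, using RELATED to pull the registration date across the relationship. The table names 'Sales' and 'Customers' are assumptions, as is the existence of a many-to-one relationship from the sales table to the customer table.

```dax
// Calculated column on a hypothetical 'Sales' fact table.
// RELATED follows the relationship from the current sales row to its customer row.
First Day Sales =
IF (
    'Sales'[Sales Date] = RELATED ( 'Customers'[Registration Date] ),
    "First Day",     // sale on the date of registration
    "Other Days"     // sale on a later date
)
```

Copying a dimension value onto fact rows in this way is a form of denormalisation.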
Denormalisation
Create an Aggregated Column with RELATEDTABLE
Multi Sale
Single Sale: Sale Count = 1 (customer has only purchased 1 item)
Multiple Sales: Sale Count > 1 (customer has purchased more than 1 item)
Lookup Values
Key Learning Points

Create columns Row Context


Demo
Create two columns in the Customer table
- Sale Count
- Multi Sale
SUMX(
    RELATEDTABLE ( <Table name> ),
    <Column Name> )

RELATEDTABLE
SUMX: Aggregation function that gives a total
Table Name: The name of the related table
Column Name: The name of the column
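
The two demo columns might be defined along the lines below. This is a sketch only: the table names 'Customers' and 'Sales' and the column 'Sales'[Quantity] are assumptions, since the slides show the pattern with placeholders rather than real names.

```dax
// Each definition is a separate calculated column on a hypothetical 'Customers' table.

// RELATEDTABLE returns the sales rows linked to the current customer row;
// SUMX totals the (hypothetical) Quantity column over those rows.
Sale Count =
SUMX ( RELATEDTABLE ( 'Sales' ), 'Sales'[Quantity] )

// Classify each customer from the aggregated value.
Multi Sale =
IF ( 'Customers'[Sale Count] > 1, "Multiple Sales", "Single Sale" )
```

Under this sketch a customer with no related sales rows gets a blank Sale Count and is labelled "Single Sale", so a third category may be worth adding if such customers exist.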


Row Context
