Evaluating and changing column data types
Introduction
At this point in the course, you should know that data types in Microsoft Power BI are used to
classify values to ensure a better-organized and structured dataset. You should also know
that there are three main data groups: structured, semi-structured, and unstructured.
In this reading, you'll learn how data types influence how you evaluate datasets. You'll also
explore how to change data types where required.
For example, let's consider a dataset used to analyze an online store's sales. This dataset may
contain information such as:
product name
sales date
sales amount
customer name
and customer address.
Each piece of information in the online store dataset has a data type.
Identifying these data types is essential for data analysis and processing. For example,
knowing that a sales amount has a numerical data type is necessary to perform a SUM
operation. Likewise, knowing that customer addresses have a text data type is important for
analyzing them.
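The point about SUM can be illustrated outside Power BI with a minimal Python sketch. The sales values and the total_sales helper below are hypothetical and only show why a numerical type is needed before amounts can be added up.

```python
# Hypothetical sales amounts as they might arrive from a text-based source.
sales_as_text = ["19.99", "5.00", "12.50"]


def total_sales(values):
    """Sum sales amounts after converting each text value to a number."""
    return sum(float(value) for value in values)


# Summing the raw text values fails, because text cannot be added to a number.
try:
    sum(sales_as_text)
    text_sum_worked = True
except TypeError:
    text_sum_worked = False

# After conversion to a numerical type, the sum works as expected.
total = total_sales(sales_as_text)
print(text_sum_worked, round(total, 2))
```

Once the values are treated as numbers, the total can be calculated. Treated as text, the same operation raises an error.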
As another example, let's consider a hospital's patient data. This data may include
information such as name, age, gender, diagnosis, and medication use.
Name          Text
Age           Numerical
Gender        Text
Diagnosis     Text
Medication    Text
Each of these columns has a data type. The Age column has a numerical data type, while the
name, gender, diagnosis, and medication columns use a text data type. Identifying these data types
is crucial for analyzing a patient's health condition and creating a suitable treatment plan.
For example, knowing that the age column has a numerical data type is necessary for using it
in mathematical formulas to create a treatment plan based on a patient’s age. So, correctly
identifying data types in data sources and using appropriate methods for data analysis is
essential.
If the connected data source is a standard database system or a spreadsheet with strict rules,
there is little chance of errors or inconsistencies in the type of data being read. However,
errors can occur in data sources where data is entered manually, such as classic CSV
(comma-separated values) or Excel files. These types of data sources don't impose type
restrictions on each column as strictly as a database does, so any inconsistent data that
appears in these columns can lead to errors or inconsistencies in detecting the column type.
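To make this concrete, here is a simplified Python sketch of how a single manually entered value can change the type detected for a whole column. It is an illustrative stand-in, not Power BI's actual detection logic. The CSV content and the detect_column_types helper are assumptions made for the example.

```python
import csv
import io

# Hypothetical CSV content; "N/A" was typed by hand into a numeric column.
RAW_CSV = """product,sales_amount,quantity
Keyboard,49.90,2
Mouse,N/A,1
Monitor,199.00,3
"""


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def detect_column_types(csv_text):
    """Label a column "number" only if every one of its values is numeric."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    types = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        types[column] = "number" if all(is_number(v) for v in values) else "text"
    return types


detected = detect_column_types(RAW_CSV)
print(detected)
```

In this sketch, the sales_amount column is labeled as text because of the single "N/A" entry, even though the other values are numeric.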
So, it's important to check the column types in Power Query Editor before loading the
data into the reporting environment. If you detect an incorrect data type, you can correct it
by changing it to the correct type. You can also update the format of the data.
1. On the left side of the column header, select the data type icon and then select the correct
data type from the drop-down list.
2. Alternatively, in the Transform tab, select Data Type and then select the correct data type
from the list.
3. When you save this change, it is recorded as a step called Changed Type, which is reapplied
every time the data is refreshed.
4. In the Home tab, select the Close & Apply menu. Then, choose the Apply option to apply
the changes and keep the window open.
5. Or choose Close & Apply to apply the changes and close the Power Query Editor.
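The idea of a recorded type change that is reapplied on every refresh can be sketched in Python as follows. This is only a conceptual analogy under stated assumptions: the column names, the load_rows stand-in, and the refresh function are hypothetical and are not part of Power BI.

```python
from datetime import date

# Hypothetical recorded step: column name -> target type converter,
# loosely mirroring what a Changed Type step remembers.
changed_type_step = {
    "sales_date": date.fromisoformat,
    "sales_amount": float,
}


def load_rows():
    """Stand-in for reading the source; every value arrives as text."""
    return [
        {"product": "Keyboard", "sales_date": "2024-01-15", "sales_amount": "49.90"},
        {"product": "Mouse", "sales_date": "2024-01-16", "sales_amount": "15.00"},
    ]


def apply_changed_type(rows, step):
    """Return new rows with each listed column converted to its target type."""
    return [
        {col: step[col](val) if col in step else val for col, val in row.items()}
        for row in rows
    ]


def refresh(loader, step):
    """Simulate a refresh: reload the source, then reapply the recorded step."""
    return apply_changed_type(loader(), step)


typed_rows = refresh(load_rows, changed_type_step)
print(typed_rows[0])
```

Because the conversion is stored as a step rather than applied once by hand, every refresh produces correctly typed columns from the raw source.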
Conclusion
Power BI can connect to many different types of data sources. In such sources, data is
classified into three main groups called structured, semi-structured, and unstructured data.
When you connect to a data source, Power BI imports the contents and tries to detect the
data type for each column.
In this reading, you learned about the importance of the data type, and you explored how to
check if the data type has been detected correctly before using it. If you detect an incorrect
data type, you should now know how to correct it by changing its type.
You can also refer to the following Microsoft Learn article for details about evaluating and
changing column data types in Power BI.