Tutorial - Next Steps in Scripting
Qlik® Sense
1.1
Copyright © 1993-2015 QlikTech International AB. All rights reserved.
Qlik®, QlikTech®, Qlik® Sense, QlikView®, Sense™ and the Qlik logo are trademarks which have been
registered in multiple countries or otherwise used as trademarks by QlikTech International AB. Other
trademarks referenced herein are the trademarks of their respective owners.
In Qlik Sense, scripting is mainly used to specify what data to load from your data sources. In this tutorial you
will learn how to transform and manipulate data from databases and files using the data load editor. The tutorial covers the following topics:
• Editing scripts
• Transforming data
• Data cleansing
• Hierarchical data
• Dollar-sign expansions
When you have completed the tutorial, you should have a fair understanding of some of the more advanced
steps involved in scripting in Qlik Sense. A deeper understanding of scripting can be gained by taking a
training course that is available from the Qlik website.
1.2 Prerequisites
To get the most out of this tutorial, we recommend that you have fulfilled the following prerequisites before
you begin:
• You have completed the Tutorial - Scripting for Beginners that is available at help.qlik.com.
At help.qlik.com, you find the Qlik Sense online help and a number of downloadable guides.
These are all valuable sources of information and are highly recommended.
During the data load, Qlik Sense identifies common fields from different tables (key fields) to associate the
data. The resulting data structure of the data in the app can be monitored in the data model viewer. Changes
to the data structure can be achieved by renaming fields to obtain different associations between tables.
After the data has been loaded into Qlik Sense, it is stored in the app. The app is the heart of the program's
functionality and it is characterized by the unrestricted manner in which data is associated, its large number
of possible dimensions, its speed of analysis and its compact size. The app is held in RAM when it is open.
Before you start looking more closely at the script in the data load editor, you need to create an empty app
and a couple of data connections to be able to load data into Qlik Sense.
If you do not already have the latest tutorial files, you can download them from the Qlik website. The
example files that you need for this tutorial are:
• Product.xlsx
• Transactions.csv
• Employees.xlsx
• Salesman.xlsx
• Data.xlsx
• Event.txt
• Intervals.txt
• Winedistricts.txt
Do the following:
You can find more detailed descriptions of how to create data connections, and of the supported file types, in the Tutorial - Scripting for Beginners, available on the Qlik Sense help site. The following sections are simply reminders of the procedures used in that tutorial.
You need to have an ODBC data source for the database you want to access. This is
configured through the ODBC Data Source Administrator tool. If you do not have a data source
already, you need to add it and configure it to point to, for example, a Microsoft Access
database.
Do the following:
The data connection has been created and you are now ready to connect to the database and to start
selecting which data to load.
Do the following:
1. Click Create new connection and select OLE DB from the drop-down list.
2. Select Provider from the list of available providers.
3. Type the name of the Data source to connect to. This can be a server name, or in some cases, the
path to a database file. This depends on which OLE DB provider you are using.
4. Select which type of credentials to use if required:
l Windows integrated security: With this option you use existing Windows credentials.
l Specific user name and password: With this option you need to enter User name and
Password.
If the data source does not require credentials, leave User name and Password empty.
5. If you want to test the connection, click the Test connection button.
6. If you want to use a name different to the default provider name, edit Name.
7. Click Save.
The Save button is only enabled if connection details have been entered correctly, and
the automatic connection test was successful.
The connection is now added to Data connections and, provided the connection string is correctly entered, you can connect to and select data from the OLE DB data source.
Do the following:
1. In the app Advanced Scripting tutorial, which you created in the previous section, open the data load
editor.
2. Click Create new connection and select Folder.
3. Locate the folder you want to connect to and give it a name.
4. Click Save.
The folder connection is now complete and you are now ready to connect to the files and start selecting which
data to load.
The script, which must be written using the Qlik Sense script syntax, is color coded to make it easy to
distinguish the different elements. Comments are highlighted in green, whereas Qlik Sense syntax keywords
are highlighted in blue. Each script line is numbered.
There are a number of functions available in the editor to assist you in developing the load script:
Detailed syntax help There are two ways to access detailed syntax help for a Qlik Sense syntax
keyword:
• Click the syntax help button in the toolbar to enter syntax help mode. In syntax help mode you can click on a syntax keyword (marked in blue and underlined) to access syntax help.
• Place the cursor inside or at the end of the keyword and press Ctrl+H.
Auto-complete If you start to type a Qlik Sense script keyword, you get an auto-complete list
of matching keywords to select from. The list is narrowed down as you
continue to type. For example, type month.
Do the following:
Tooltips When you type an open parenthesis after a Qlik Sense script function, a
tooltip displays the syntax of the function, including parameters, return value
types and additional statements.
Prepared test script You can insert a prepared test script that will load a set of inline data fields.
You can use this to quickly create a data set for test purposes.
Do the following:
• Press Ctrl+00
Indenting code You can indent and outdent script lines.
Do the following:
• Press Tab (indent) or Shift+Tab (outdent)
Search and replace You can search and replace text throughout script sections.
Selecting all code You can select all code in the current script section.
Do the following:
• Press Ctrl+A
You cannot create connections, edit connections, select data, save the script or load new data
while you are running in debug mode, that is, from when you have started debug execution until
the script is executed or execution has been ended.
Debug toolbar
The debug panel for the data load editor has a toolbar with the following options to control the debug
execution:
Limited load Select the check box to limit how many rows of data to load from each data source.
This is useful for large data sources, as it reduces the execution time.
Enter the number of rows you want to limit the load to.
Run Start or continue execution in debug mode until the next breakpoint is reached.
Step Step to the next line of code.
End here End execution here.
Output
Output displays all messages that are generated during debug execution. You can lock the output from scrolling when new messages are displayed by clicking the lock button.
3 Transforming data
This section introduces you to the transformation and manipulation of data that you can perform through the
data load editor before using the data in your app.
One of the advantages of data manipulation is that you can choose to load only a subset of the data from a
file, such as a few chosen columns from a table, to make the data handling more efficient. You can also load
the data more than once to split up the raw data into several new logical tables. It is also possible to load data
from more than one source and merge it into one table in Qlik Sense.
In this section you will learn how to load data using Crosstable. You will also learn how to join tables, use
inter-record functions such as Peek and Previous, and load the same row several times using While Load.
Crosstable prefix
In the input table below you have one column per month and one row per product.
If this table is simply loaded into Qlik Sense the result is a table with one field for Product and one field for
each of the months. But if you want to analyze this data, it is much easier to have all numbers in one field and
all months in another, that is, in a three-column table, one for each category (Product, Month, Sales).
The Crosstable prefix converts the data to a table with one column for Month and another for Sales. Another
way to express it is to say that it takes field names and converts these to field values.
Example:
Crosstable (Month, Sales) LOAD Product, [Jan 2014], [Feb 2014], [Mar 2014], … From … ;
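To see the effect end to end, here is a minimal self-contained sketch using inline data; the table name and values are illustrative, not taken from the tutorial files:
tmpProducts:
CrossTable(Month, Sales)
LOAD * INLINE [
Product, Jan 2014, Feb 2014, Mar 2014
Shoes, 100, 120, 90
Hats, 50, 60, 70
];
The result is a three-column table (Product, Month, Sales) with one row per product and month.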
Do the following:
1. In the app, Advanced Scripting tutorial, open the data load editor.
2. Click Create new connection and select Folder.
3. Locate the folder where the tutorial file Product.xlsx is stored, and give it the name Tutorial Files.
4. Click Save.
8. Click the select data icon on the Tutorial Files data connection to select a file to load data from.
9. Select Product.xlsx and click Select.
10. Select Product.
11. Click Insert script.
Under Field names, make sure that Embedded field names is selected to include the
names of the table fields when you load the data.
This script produces a table with one field for product and one for each of the months.
12. To view the table, open the data model viewer, highlight the table and click Preview to verify your data.
13. Select Data load editor and open the Product script.
14. Add a new row at the top of the script and add the following to that row:
CrossTable(Month, Sales)
17. Click the navigation dropdown and select Data model viewer to verify that your data is loaded.
The loaded table is a three-column table, one column for each category (Product, Month, Sales):
Usually the input data has only one column as a qualifier field, serving as an internal key (Product in the above example), but you can have several. If so, all qualifying fields must be listed before the attribute fields in the LOAD statement, and the third parameter to the Crosstable prefix must be used to define the number of qualifying fields.
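As a sketch, assuming a hypothetical source file with two qualifying columns (Region and Product), the call would pass 2 as the third parameter:
CrossTable(Month, Sales, 2)
LOAD Region, Product, [Jan 2014], [Feb 2014], [Mar 2014]
FROM 'lib://Tutorial Files/ProductRegion.xlsx'
(ooxml, embedded labels, table is Product);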
It is not possible to have a preceding LOAD or a prefix in front of the Crosstable keyword. Auto-concatenate
can, however, be used.
The numeric interpretation will not work for the attribute fields. This means that if you have months as
column headers, these will not be automatically interpreted. The work-around is to use the crosstable prefix
to create a temporary table, and to run a second pass through it to make the interpretations as in the
following example:
tmpData:
Crosstable (MonthText, Sales)
LOAD Product, [Jan 2014], [Feb 2014], … From 'lib://Tutorial Files/Product.xlsx'
(ooxml, embedded labels, table is Product);
Final:
LOAD Product,
Date(Date#(MonthText,'MMM YYYY'),'MMM YYYY') as Month,
Sales
Resident tmpData;
Drop Table tmpData;
The following examples show the various options available with Drop Table:
The command Drop also lets you delete one or several fields:
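The original examples are not reproduced here; the following forms are consistent with the documented syntax (table and field names are illustrative):
Drop Table tmpData;
Drop Tables tmpData1, tmpData2;
Drop Field MonthText;
Drop Fields Field1, Field2 From tmpData;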
As you may notice, the keywords FIELD and TABLE can be written in the plural (FIELDS, TABLES) even if you only delete one single table or field.
Joining tables has some potential drawbacks:
• The loaded tables often become larger, and Qlik Sense works slower.
• Some information may be lost: the frequency (number of records) within the original table may no longer be available.
The Keep functionality, which has the effect of reducing one or both of the two tables to the intersection of
table data before the tables are stored in Qlik Sense, has been designed to reduce the number of cases
where explicit joins need to be used.
In this documentation, the term join is usually used for joins made before the internal tables are
created. The association, made after the internal tables are created, is however essentially
also a join.
Join
The simplest way to make a join is with the Join prefix in the script, which joins the internal table with another
named table or with the last previously created table. The join will be an outer join, creating all possible
combinations of values from the two tables.
Example:
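The original example script is not shown here; a sketch consistent with the description below (two hypothetical source files, joined over the common field a):
LOAD a, b, c FROM 'lib://Tutorial Files/table1.csv'
(txt, embedded labels, delimiter is ',');
join LOAD a, d FROM 'lib://Tutorial Files/table2.csv'
(txt, embedded labels, delimiter is ',');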
The resulting internal table has the fields a, b, c and d. The number of records differs depending on the field
values of the two tables.
The names of the fields to join over must be exactly the same. The number of fields to join over
is arbitrary. Usually the tables should have one or a few fields in common. No field in common
will render the cartesian product of the tables. All fields in common is also possible, but usually
makes no sense. Unless a table name of a previously loaded table is specified in the Join
statement the Join prefix uses the last previously created table. The order of the two
statements is thus not arbitrary.
Using Join
The explicit Join prefix in the Qlik Sense script language performs a full join of the two tables. The result is
one table. In many cases such joins will result in very large tables.
Do the following:
1. In the app, Advanced Scripting tutorial, open the data load editor.
5. Click the select data icon on the Tutorial Files data connection to select a file to load data from.
6. Select Transactions.csv and click Select.
7. Click Insert script.
8. Click the select data icon on the Tutorial Files data connection to select a file to load data from.
9. Select Salesman.xlsx and click Select.
10. Select Salesman.
11. Click Insert script.
Under Field names, make sure that Embedded field names is selected to include the names
of the table fields when you load the data.
LOAD
"Transaction ID",
"Salesman ID",
Product,
"Serial No",
"ID Customer",
"List Price",
"Gross Margin"
FROM 'lib://Tutorial Files/Transactions.csv'
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);
LOAD
"Salesman ID",
Salesman,
"Distributor ID"
FROM 'lib://Tutorial Files/Salesman.xlsx'
(ooxml, embedded labels, table is Salesman);
Clicking on Load data at this point would produce the following data model:
However, having the Transactions and Salesman tables separated may not be the required result. It may be better to join the two tables.
Do the following:
1. To set a name for the joined table, add a new row at the top of the script and enter the following:
Transactions:
2. To join the Transactions and Salesman tables, on the empty line above the second LOAD statement, add the following:
Join (Transactions)
Your script should now look like this:
Transactions:
LOAD
"Transaction ID",
"Salesman ID",
Product,
"Serial No",
"ID Customer",
"List Price",
"Gross Margin"
FROM 'lib://Tutorial Files/Transactions.csv'
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);
Join (Transactions)
LOAD
"Salesman ID",
Salesman,
"Distributor ID"
FROM 'lib://Tutorial Files/Salesman.xlsx'
(ooxml, embedded labels, table is Salesman);
3. Click Load data.
4. When the script execution is finished, click Close in the Progress pop-up.
5. Click the navigation dropdown and select Data model viewer to verify that your data is loaded.
All the fields of the Transactions and Salesman tables are now combined into a single Transactions table.
Keep
One of the main features of Qlik Sense is its ability to make associations between tables instead of joining
them, which reduces space in memory, increases speed and gives enormous flexibility. The keep
functionality has been designed to reduce the number of cases where explicit joins need to be used.
The Keep prefix between two LOAD or SELECT statements has the effect of reducing one or both of the
two tables to the intersection of table data before they are stored in Qlik Sense. The Keep prefix must always
be preceded by one of the keywords Inner, Left or Right. The selection of records from the tables is made in
the same way as in a corresponding join. However, the two tables are not joined and will be stored in Qlik
Sense as two separately named tables.
Inner
The Join and Keep prefixes in the Qlik Sense script language can be preceded by the prefix Inner.
If used before Join, it specifies that the join between the two tables should be an inner join. The resulting
table contains only combinations between the two tables with a full data set from both sides.
If used before Keep, it specifies that the two tables should be reduced to their common intersection before
being stored in Qlik Sense.
Example:
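The example tables themselves are not reproduced here. For the Inner, Left and Right examples that follow, assume two small tables sharing the field A, for instance loaded inline as below; the SELECT statements in the examples read the corresponding tables from a database instead:
Table1:
LOAD * INLINE [
A, B
1, aa
2, cc
3, ee
];
Table2:
LOAD * INLINE [
A, C
1, xx
4, yy
];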
First, we perform an Inner Join on the tables, resulting in VTable, containing only one row, the only record
existing in both tables, with data combined from both tables.
VTable:
SELECT * from Table1;
inner join SELECT * from Table2;
If we perform an Inner Keep instead, we will still have two tables. The two tables are of course associated via the common field A.
VTab1:
SELECT * from Table1;
VTab2:
inner keep SELECT * from Table2;
Left
The Join and Keep prefixes in the Qlik Sense script language can be preceded by the prefix left.
If used before Join, it specifies that the join between the two tables should be a left join. The resulting table
only contains combinations between the two tables with a full data set from the first table.
If used before Keep, it specifies that the second table should be reduced to its common intersection with the
first table before being stored in Qlik Sense.
Example:
First, we perform a Left Join on the tables, resulting in VTable, containing all rows from Table1, combined
with fields from matching rows in Table2.
VTable:
SELECT * from Table1;
left join SELECT * from Table2;
If we perform a Left Keep instead, we will still have two tables. The two tables are of course associated via the common field A.
VTab1:
SELECT * from Table1;
VTab2:
left keep SELECT * from Table2;
Right
The Join and Keep prefixes in the Qlik Sense script language can be preceded by the prefix right.
If used before Join, it specifies that the join between the two tables should be a right join. The resulting table
only contains combinations between the two tables with a full data set from the second table.
If used before Keep, it specifies that the first table should be reduced to its common intersection with the
second table before being stored in Qlik Sense.
Example:
First, we perform a Right Join on the tables, resulting in VTable, containing all rows from Table2, combined
with fields from matching rows in Table1.
VTable:
SELECT * from Table1;
right join SELECT * from Table2;
If we perform a Right Keep instead, we will still have two tables. The two tables are of course associated via the common field A.
VTab1:
SELECT * from Table1;
VTab2:
right keep SELECT * from Table2;
In this part of the tutorial we will be examining the Peek, Previous and Exists functions. More detailed
information on these functions can be found on the Qlik Sense help site.
Peek
Peek() finds the value of a field in a table for a row that has already been loaded or that exists in internal
memory. The row number can be specified, as can the table.
Syntax:
Peek(fieldname [ , row [ , tablename ] ])
Row must be an integer. 0 denotes the first record, 1 the second and so on. Negative numbers indicate order from the end of the table. -1 denotes the last record read.
tablename is a table label without the ending colon. If no tablename is stated, the current table is assumed. If used outside the LOAD statement or referring to another table, the tablename must be included.
Previous
Previous() finds the value of the expr expression using data from the previous input record that has not been
discarded because of a where clause. In the first record of an internal table, the function will return NULL.
Syntax:
Previous(expression)
The Previous function may be nested in order to access records further back. Data are fetched directly from
the input source, making it possible to refer also to fields which have not been loaded into Qlik Sense, that is, even if they have not been stored in the associated database.
Exists
Exists() determines whether a specific field value has already been loaded into the field in the data load
script. The function returns TRUE or FALSE, so it can be used in the where clause of a LOAD statement or an
IF statement.
Syntax:
Exists(field [ , expression ] )
The field must exist in the data loaded so far by the script. Expression is an expression evaluating to the
field value to look for in the specified field. If omitted, the current record’s value in the specified field will be
assumed.
Currently this only collects data for month, hires and terminations, so we are going to add fields for Employee
Count and Employee Var, using the Peek and Previous functions, to see the monthly difference in total
employees.
Do the following:
1. In the app, Advanced Scripting tutorial, open the data load editor.
5. Click the select data icon on the Tutorial Files data connection to select a file to load data from.
6. Select Employees.xlsx and click Select.
7. Click Insert script.
Under Field names, make sure that Embedded field names is selected to include the
names of the table fields when you load the data.
LOAD
"Date",
Hired,
Terminated
FROM 'lib://Tutorial Files/Employees.xlsx'
(ooxml, embedded labels, table is Sheet2);
9. Click Save.
The Peek() function lets you identify any value loaded for a defined field.
Notice too that in the Peek() function we are using a (-1). This tells Qlik Sense to look at the record
above the current record. If the (-1) is not specified, Qlik Sense will assume that you want to look at
the previous record.
10. Add the following below the script you have just modified:
[Employee Count]:
LOAD
Row,
Date,
Hired,
Terminated,
[Employee Count],
If(rowno()=1,0,[Employee Count]-Previous([Employee Count])) as [Employee Var]
Resident [Employees Init] Order By Row asc;
The script should now look like this:
[Employees Init]:
LOAD
rowno() as Row,
Date(Date) as Date,
Hired,
Terminated,
If(rowno()=1, Hired-Terminated, peek([Employee Count], -1)+(Hired-Terminated)) as
[Employee Count]
FROM 'lib://Tutorial Files/Employees.xlsx'
(ooxml, embedded labels, table is Sheet2);
[Employee Count]:
LOAD
Row,
Date,
Hired,
Terminated,
[Employee Count],
If(rowno()=1,0,[Employee Count]-Previous([Employee Count])) as [Employee Var]
Resident [Employees Init] Order By Row asc;
If, in a new sheet in the app overview, you now create a standard table using Date, Hired, Terminated,
Employee Count and Employee Var as the columns of the table, you should get a result similar to this:
Peek() and Previous() allow users to target defined rows within a table. The biggest difference between the
two functions is that the Peek() function allows the user to look into a field that was not previously loaded into
the script whereas the Previous() function can only look into a previously loaded field. Previous() operates
on the input to the LOAD statement, whereas Peek() operates on the output of the LOAD statement.
(Same as the difference between RecNo() and RowNo().) This means that the two functions will behave
differently if you have a Where-clause.
The Previous() function is therefore better suited when a user needs to show the current value versus the
previous value. In the example we calculated the employee variance from month to month.
The Peek() function would be better suited when the user is targeting either a field that has not been
previously loaded into the table or if the user needs to target a specific row. This was shown in the example
where we calculated the Employee Count by peeking into the previous month’s Employee Count and adding
the difference between the hired and terminated employees for the current month. Remember that
Employee Count was not a field in the original file.
Using Exists()
The Exists() function is often used with the Where clause in the script in order to load data if related data has
already been loaded in the data model.
In the following example we are also using the Dual() function to assign numeric values to strings.
Do the following:
In the script, the Age and AgeBucket fields are loaded only if the PersonID has already been loaded in
the data model.
Notice in the AgeTemp table that there are ages listed for PersonID 11 and 12 but since those IDs
were not loaded in the data model (in the People table), they are excluded by the Where Exists
(PersonID) clause. This clause can also be written like this: Where Exists(PersonID, PersonID).
If none of the PersonIDs in the AgeTemp table had been loaded into the data model, then the Age
and AgeBucket fields would not have been joined to the People table. Using the Exists function can
help to prevent orphan records/data in the data model, that is, Age and AgeBucket fields that do not
have any associated people.
7. Add a bar chart to the sheet with the dimension AgeBucket, and the measure Count([AgeBucket]).
8. Adjust the properties of the table and bar chart to your preference and click Done.
The Dual() function is very useful in the script, or in a chart expression, when there is the need to assign a
numeric value to a string.
LOAD
PersonID,
Age,
If(IsNull(Age) or Age='', Dual('No age', 5),
If(Age<25, Dual('Under 25', 1),
If(Age >=25 and Age <35, Dual('25-34', 2),
If(Age>=35 and Age<50, Dual('35-49' , 3),
If(Age>=50, Dual('50 or over', 4)
))))) as AgeBucket
Resident AgeTemp
Where Exists(PersonID);
In this script you load ages, and you have decided to put those ages in buckets so that you can create visualizations based on the age buckets rather than the actual ages. There is a bucket for
people under 25, between 25 and 35, and so on. By using the Dual() function, the age buckets can be
assigned a numeric value that can later be used to sort the age buckets in a list box or in a chart. So, as in
the app sheet, the sort puts "No age" at the end of the list.
By far the simplest way to solve this problem in Qlik Sense is to use the IntervalMatch prefix in front of either a LOAD or a SELECT statement. The LOAD/SELECT statement needs to contain only two fields, the “From” and the “To” fields defining the intervals. The IntervalMatch prefix will then generate all combinations between the loaded intervals and a previously loaded numeric field, specified as a parameter to the prefix.
Do the following:
Intervals:
LOAD
IntervalID,
IntervalAttribute,
IntervalBegin,
IntervalEnd
FROM 'lib://Tutorial Files/Intervals.txt'
(txt, utf8, embedded labels, delimiter is '\t', msq);
BridgeTable:
IntervalMatch (EventDate)
LOAD distinct IntervalBegin, IntervalEnd Resident Intervals;
The data model contains a composite key (the IntervalBegin and IntervalEnd fields) which will manifest itself
as a Qlik Sense synthetic key:
The data model contains three tables:
• The Events table that contains exactly one record per event.
• The Intervals table that contains exactly one record per interval.
• The bridge table that contains exactly one record per combination of event and interval, and that links the two previous tables.
Note that an event may belong to several intervals if the intervals are overlapping. And an interval can of
course have several events belonging to it.
This data model is optimal, in the sense that it is normalized and compact. The Events table and the
Intervals table are both unchanged and contain the original number of records. All Qlik Sense calculations
operating on these tables, for example, Count(EventID), will work and will be evaluated correctly.
A loop inside the LOAD statement can be created using the While clause.
Such a LOAD statement will loop over each input record and load this over and over as long as the
expression in the While clause is true. The IterNo() function returns “1” in the first iteration, “2” in the second,
and so on.
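As a self-contained sketch (table and field names are illustrative), the following generates one record per day between two dates:
Dates:
LOAD
Date(StartDate + IterNo() - 1) as Date
While StartDate + IterNo() - 1 <= EndDate;
LOAD
MakeDate(2014, 1, 1) as StartDate,
MakeDate(2014, 1, 5) as EndDate
AutoGenerate 1;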
You have a primary key for the intervals, the IntervalID, so the only difference in the script will be how the
bridge table is created:
BridgeTable:
LOAD distinct * Where Exists(EventDate);
LOAD IntervalBegin + IterNo() - 1 as EventDate, IntervalID
Resident Intervals
While IntervalBegin + IterNo() - 1 <= IntervalEnd ;
Do the following:
In the general case, the solution with three tables is the best one, because it allows for a many-to-many relationship between intervals and events. But a very common situation is that you know that an event can only belong to one single interval. In such a case, the bridge table is really not necessary: the IntervalID can be stored directly in the event table. There are several ways to achieve this, but the most useful is to join BridgeTable with the Events table.
If you have a case where the intervals are overlapping and a number can belong to more than one interval,
you usually need to use closed intervals.
However, in some cases you do not want overlapping intervals, you want a number to belong to one interval
only. Hence, you will get a problem if one point is the end of one interval and, at the same time, the
beginning of next. A number with this value will be attributed to both intervals. Hence, you want half-open
intervals.
A practical solution to this problem is to subtract a very small amount from the end value of all intervals, thus
creating closed, but non-overlapping intervals. If your numbers are dates, the simplest way to do this is to use
the function DayEnd() which returns the last millisecond of the day:
Intervals:
LOAD …, DayEnd(IntervalEnd - 1) as IntervalEnd From Intervals;
But you can also subtract a small amount manually. If you do, make sure the subtracted amount isn’t too
small since the operation will be rounded to 52 significant binary digits (14 decimal digits).
If the amount you subtract is too small, the difference will not be significant and you will be back using the original number.
It could be as in the table below where you have currency rates for multiple currencies. Each currency rate
change is on its own row; each with a new conversion rate. Also, the table contains rows with empty dates
corresponding to the initial conversion rate, before the first change was made.
This table defines a set of non-overlapping intervals, where the begin date is called “Change Date” and the
end date is defined by the beginning of the following interval. But since the end date isn’t explicitly stored in a
column of its own, we need to create such a column, so that the new table will become a list of intervals.
Do the following:
1. Create a file called Rates.xlsx containing the table shown above and store it ready for loading.
Make sure that the dates in the Change Date column are in the same format as the local date format.
2. Determine which time range you want to work with. The beginning of the range must be before the
first date in the data and the end of the range must be after the last.
3. Load the source data, but change empty dates to the beginning of the range defined in the previous
bullet. The change date should be loaded as “From Date”.
4. Sort the table first according to Currency, then according to the “From Date” descending so that you
have the latest dates on top.
5. Run a second pass through the data where you calculate the “To Date”. If the current record has a
different currency from the previous record, then it is the first record of a new currency (but its last
interval), so you should use the end of the range defined in step 1. If it is the same Currency, you
should take the “From Date” from the previous record, subtract a small amount of time, and use this
value as “To Date” in the current record.
The script listed below will update the source table in the following manner:
Rates:
LOAD Currency, Rate, FromDate,
Date(If( Currency=Peek(Currency),
Peek(FromDate) - $(#vEpsilon),
$(#vEndTime)
)) as ToDate
Resident Tmp_Rates
Order By Currency, FromDate Desc;
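Note that this script refers to a Tmp_Rates table and to the variables vEpsilon and vEndTime, which are defined in earlier steps not reproduced here. A minimal sketch of what those definitions could look like; the range dates, epsilon value and sheet name are illustrative assumptions:
Let vBeginTime = Num(MakeDate(2013, 1, 1));
Let vEndTime = Num(MakeDate(2015, 12, 31));
Let vEpsilon = Pow(2, -27);
Tmp_Rates:
LOAD Currency, Rate,
Date(If(IsNum([Change Date]), [Change Date], $(#vBeginTime))) as FromDate
FROM 'lib://Tutorial Files/Rates.xlsx'
(ooxml, embedded labels, table is Sheet1);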
When this script is run, you will have a table listing the intervals correctly. Use the Preview section of the
data model viewer to view the resulting table.
This table can subsequently be used in a comparison with an existing date using the IntervalMatch method.
4 Data cleansing
There are times when the source data that we load into Qlik Sense is not necessarily how we want it in our
Qlik Sense application. Qlik Sense provides a host of functions and statements that allow us to transform our
data into a format that works for us.
Mapping can be used in a Qlik Sense script to replace or modify field values or names when the script is run,
so mapping can be used to clean up data and make it more consistent or to replace parts or all of a field
value.
When loading data from different tables, field values denoting the same thing are not always consistently
named. Since this lack of consistency hinders associations, the problem needs to be solved. This can be
done in an elegant way by creating a mapping table for the comparison of field values.
Rules:
• A mapping table must have two columns, the first one containing the comparison values and the second the desired mapping values.
• The two columns must be named, but the names have no relevance in themselves. The column names have no connection to field names in regular internal tables.
• Mapping prefix
• ApplyMap()
• MapSubstring()
• Unmap statement
Mapping prefix
The Mapping prefix is used in a script to create a mapping table. The mapping table can then be used with
the ApplyMap() function, the MapSubstring() function or the Map … Using statement.
Do the following:
CountryMap:
MAPPING LOAD * INLINE [
Country, NewCountry
U.S.A., US
U.S., US
United States, US
United States of America, US
];
The CountryMap table stores two columns: Country and NewCountry. The Country column stores the various
ways country has been entered in the Country field. The NewCountry column stores how the values will be
mapped. This mapping table will be used to store consistent US country values in the Country field. For
instance, if U.S.A. is stored in the Country field, map it to be US.
ApplyMap() function
ApplyMap() allows the user to replace data in a field based on a previously created mapping table. The mapping table needs to be loaded before the ApplyMap() function can be used. In the sample file, Data.xlsx,
data that includes people and the country they reside in is loaded. The raw data looks like this:
In the table above, notice that the country is entered in various ways. In order to make the country field
consistent, the mapping table is loaded and then the ApplyMap() function is used.
Do the following:
1. In the data load editor, click the select data icon on the data connection you created in the previous section to select a file to load data from.
2. Select Data.xlsx and click Select.
3. Click Insert script.
4. Insert a line above the newly created LOAD statement and enter the following:
Data:
The script should now look like this:
CountryMap:
MAPPING LOAD * INLINE [
Country, NewCountry
U.S.A., US
U.S., US
United States, US
United States of America, US
];
Data:
LOAD
ID,
Name,
ApplyMap('CountryMap', Country) as Country,
Code
FROM 'lib://Tutorial Files/Data.xlsx'
(ooxml, embedded labels, table is Sheet1);
The first parameter of the ApplyMap() function has the map name enclosed in single quotes. The
second parameter is the field that has the data that is to be replaced.
Use the Preview section of the data model viewer to view the resulting table.
The various spellings of the United States have all been changed to US. There is one record that was not spelled correctly, so the ApplyMap() function did not change that field value. With the ApplyMap() function, you can use a third parameter to add a default value that is applied when the mapping table does not have a matching value.
7. Add 'US' as the third parameter of the ApplyMap() function, to handle such cases when the country
may have been entered incorrectly:
ApplyMap('CountryMap', Country, 'US') as Country
The script should now look like this:
CountryMap:
MAPPING LOAD * INLINE [
Country, NewCountry
U.S.A., US
U.S., US
United States, US
United States of America, US
];
Data:
LOAD
ID,
Name,
ApplyMap('CountryMap', Country, 'US') as Country,
Code
FROM [lib://Tutorial Files/Data.xlsx]
(ooxml, embedded labels, table is Sheet1);
MapSubstring() function
The MapSubstring() function allows you to map parts of a field.
In the table created by ApplyMap() we now want the numbers to be written as text, so the MapSubstring()
function will be used to replace the numeric data with text.
Do the following:
1. In the Data load editor, add the following script lines at the end of the CountryMap section, but before the Data section:
CodeMap:
MAPPING LOAD * INLINE [
F1, F2
1, one
2, two
3, three
4, four
5, five
11, eleven
];
The script should look like this:
CountryMap:
MAPPING LOAD * INLINE [
Country, NewCountry
U.S.A., US
U.S., US
United States, US
United States of America, US
];
CodeMap:
MAPPING LOAD * INLINE [
F1, F2
1, one
2, two
3, three
4, four
5, five
11, eleven
];
Data:
LOAD
ID,
Name,
ApplyMap('CountryMap', Country, 'US') as Country,
Code
FROM [lib://Tutorial Files/Data.xlsx]
(ooxml, embedded labels, table is Sheet1);
2. In the Data section of the script modify the Code statement as follows:
MapSubString('CodeMap', Code) as Code
The Data section of the script should now look like this:
Data:
LOAD
ID,
Name,
ApplyMap('CountryMap', Country, 'US') as Country,
MapSubString('CodeMap', Code) as Code
FROM 'lib://Tutorial Files/Data.xlsx'
(ooxml, embedded labels, table is Sheet1);
Now let us take a look at the results of the MapSubstring() function. Use the Preview section of the data model viewer to view the resulting table.
The numeric characters were replaced with text in the Code field. If a number appears more than once as it
does for ID=3, and ID=4, the text is also repeated. ID=4, Susan McDaniels had a 6 in her code. Since 6 was
not mapped in the CodeMap table, it remains unchanged. ID=5, Dean Smith, had 111 in his code. This has
been mapped as 'elevenone'.
Map … Using
The Map … Using statement can also be used to apply a map to a field, but it works a little differently than ApplyMap(). While ApplyMap() handles the mapping every time the field name is encountered, Map … Using handles the mapping when the value is stored under the field name in the internal table.
Let’s take a look at an example. Assume we were loading the Country field multiple times in the script and
wanted to apply a map every time the field was loaded. The ApplyMap() function could be used as illustrated
earlier in this tutorial or Map … Using can be used.
If Map … Using is used, then the map is applied when the field value is stored to the internal table. So in the example below, the map is applied to the Country field in the Data1 table, but it would not be applied to the Country2 field in the Data2 table. This is because the Map … Using statement is only applied to fields named Country. When the Country2 field is stored to the internal table, it is no longer named Country. If you want the map to be applied to the Country2 field, then you would need to use the ApplyMap() function.
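The example script itself is not reproduced here; the following sketch is consistent with the description above (file and sheet names are illustrative):
Map Country Using CountryMap;
Data1:
LOAD
ID,
Name,
Country
FROM 'lib://Tutorial Files/Data.xlsx'
(ooxml, embedded labels, table is Sheet1);
Data2:
LOAD
ID,
Country as Country2
FROM 'lib://Tutorial Files/Data.xlsx'
(ooxml, embedded labels, table is Sheet1);
Unmap Country;
The map is applied to Country in Data1, but not to Country2 in Data2, because the value is stored under a different field name.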
Unmap
The Unmap statement ends the Map … Using statement so if Country were to be loaded after the Unmap
statement, the CountryMap would not be applied.
5 Handling hierarchical data
From the top of a hierarchy to the bottom, the members are progressively more detailed. For example, in a
dimension that has the levels Market, Country, State and City, the member Americas appears in the top level
of the hierarchy, the member U.S.A. appears in the second level, the member California appears in the third
level and San Francisco in the bottom level. California is more specific than U.S.A., and San Francisco is
more specific than California.
Storing hierarchies in a relational model is a common challenge for which there are several approaches.
For the purposes of this tutorial we will be creating an Ancestor list since it presents the hierarchy in a form
that is directly usable in a query. Further information on the other approaches can be found in the Qlik
Community.
5.1 Hierarchy prefix
The Hierarchy prefix will transform a loaded table into an expanded nodes table: a table with a number of additional columns, one for each level of the hierarchy.
Do the following:
3. In the data load editor, click the select data icon on the data connection to select a file to load data from.
4. Select Winedistricts.txt and click Select.
5. Uncheck the Lbound and RBound fields so they are not loaded.
6. Click Insert script.
The loaded script should look like this:
LOAD
NodeID,
ParentID,
NodeName
FROM [lib://Tutorial Files/Winedistricts.txt]
(txt, utf8, embedded labels, delimiter is '\t', msq);
7. Insert a new line above the LOAD statement, and enter the following:
Hierarchy (NodeID, ParentID, NodeName)
Use the Preview section of the data model viewer to view the resulting table.
The resulting expanded nodes table has exactly the same number of records as its source table: One per
node.
The expanded nodes table is very practical since it fulfills a number of requirements for analyzing a hierarchy
in a relational model:
• All the node names exist in one and the same column, so that this can be used for searches.
• In addition, the different node levels have been expanded into one field each; fields that can be used in drill-down groups.
• It can be made to contain a path unique for the node, listing all ancestors in the right order.
• It can be made to contain the depth of the node, that is, the distance from the root.
5.2 HierarchyBelongsTo prefix
Also here, the LOAD statement needs to have at least three fields: An ID that is a unique key for the node, a
reference to the parent and a name. The prefix will transform the loaded table into an ancestor table, a table
that has every combination of an ancestor and a descendant listed as a separate record. Hence, it is very
easy to find all ancestors or all descendants of a specific node.
Do the following:
1. In the data load editor, modify the Hierarchy statement so that it reads as follows:
HierarchyBelongsTo (NodeID, ParentID, NodeName, BelongsToID, BelongsTo)
Use the Preview section of the data model viewer to view the resulting table.
The ancestor table is very practical since it fulfills a number of requirements for analyzing a hierarchy in a
relational model:
• If the node ID represents the single nodes, the ancestor ID represents the entire trees and sub-trees of the hierarchy.
• All the node names exist both in the role as nodes and in the role as trees, and both can be used for searches.
• It can be made to contain the depth difference between the node depth and the ancestor depth, that is, the distance from the root of the sub-tree.
Authorization
It is not uncommon that a hierarchy is used for authorization. One example is an organizational hierarchy.
Each manager should obviously have the right to see everything pertaining to their own department,
including all its sub-departments. But they should not necessarily have the right to see other departments.
This means that different people will be allowed to see different sub-trees of the organization. The
authorization table may look like the following:
In this case, Carol is allowed to see everything pertaining to the CEO and below; Larry is allowed to see the
Product organization; and James is allowed to see the Engineering organization only.
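As a sketch, such an authorization table could be loaded inline and associated with the ancestor table created by HierarchyBelongsTo; the AccessID field name is an assumption, while BelongsTo matches the parameter used earlier in this tutorial:
Authorization:
LOAD * INLINE [
AccessID, BelongsTo
Carol, CEO
Larry, Product
James, Engineering
];
Each person is then associated with the entire sub-tree they are mapped to, since BelongsTo holds the name of every ancestor of each node.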
6 QVD files
QVD files are one of the recognized types of files that can be used as a data connection.
6.1 Working with QVD files
QVD files can be read in two modes: standard (fast) and optimized (faster). The selected mode is determined
automatically by the Qlik Sense script engine. Optimized mode can be utilized only when all loaded fields are
read without any transformations (formulas acting upon the fields), although renaming of fields is allowed. A
Where clause causing Qlik Sense to unpack the records will also disable the optimized load.
A QVD file holds exactly one data table and consists of three parts:
• An XML header (in UTF-8 char set) describing the fields in the table, the layout of the subsequent information and some other metadata.
• Symbol tables in a byte-stuffed format.
• Actual table data in a bit-stuffed format.
QVD files can be used for many purposes. Four major uses can be easily identified. More than one may apply
in any given situation:
• Incremental load
In many common cases the QVD functionality can be used for facilitating incremental load by exclusively loading new records from a growing database.
6.2 Creating QVD files
QVD files can be created by two different methods:
• Explicit creation and naming using the Store command in the Qlik Sense script. State in the script that a previously-read table, or part thereof, is to be exported to an explicitly-named file at a location of your choice.
• Automatic creation and maintenance via the Buffer prefix on a LOAD or SELECT statement, described later in this chapter.
There is no difference between the resulting QVD files with regard to reading speed.
Store
This script statement creates a QVD or a CSV file.
Syntax:
Store [ *fieldlist from] table into filename [ format-spec ];
The statement will create an explicitly named QVD or CSV file. The statement can only export fields from
one data table. If fields from several tables are to be exported, an explicit join must be made previously in the
script to create the data table that should be exported.
The text values are exported to the CSV file in UTF-8 format. A delimiter can be specified, see LOAD. The
store statement to a CSV file does not support BIFF export.
Examples:
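The original examples are not reproduced here; the following forms are consistent with the syntax above (table and file names are illustrative):
Store mytable into xyz.qvd (qvd);
Store Name, RegNo from mytable into xyz.qvd;
Store Name as a, RegNo as b from mytable into 'lib://Tutorial Files/mytable.csv' (txt);
Store * from mytable into 'lib://Tutorial Files/mytable.qvd';
Do the following: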
1. In the app, Advanced Scripting tutorial, open the data load editor.
2. In the list of script sections, select the Product section.
The script should look like this:
CrossTable(Month, Sales)
LOAD
Product,
"Jan 2014",
"Feb 2014",
"Mar 2014",
"Apr 2014",
"May 2014",
"Jun 2014"
FROM 'lib://Tutorial Files/Product.xlsx'
(ooxml, embedded labels, table is Product);
3. Add a new line at the end of the script. For this tutorial we will take the last example above, modified for the Product script:
store * from Product into 'lib://Tutorial Files/Product.qvd';
4. Click Load data.
5. Click the select data icon on the Tutorial Files data connection to view the available files. The Product.qvd file should now be in the list of files.
This data file is the result of the Crosstable script and is a three-column table, one column for each category
(Product, Month, Sales). This data file could now be used to replace the entire Product script section.
Buffer
QVD files can be created and maintained automatically via the Buffer prefix. This prefix can be used on most
LOAD and SELECT statements in the script. It indicates that QVD files are used to cache/buffer the result of
the statement.
Syntax:
Buffer [(option [ , option])] ( loadstatement | selectstatement )
where option is incremental or stale [after] amount [(days | hours)]
If no option is used, the QVD buffer created by the first execution of the script will be used indefinitely.
Example 1:
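The original example is not shown here; a representative plain buffer, assuming a database connection has been made:
Buffer SELECT * from MyTable;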
incremental
The incremental option enables the ability to read only part of an underlying file. The previous size of the file is stored in the XML header of the QVD file. This is particularly useful with log files. All records loaded at a
previous occasion are read from the QVD file whereas the following new records are read from the original
source and finally an updated QVD file is created.
Example 2:
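A representative incremental buffer load; the log file name is illustrative:
Buffer (incremental) LOAD * FROM 'lib://Tutorial Files/MyLog.log' (txt);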
Note that the incremental option can only be used with LOAD statements and text files and that
incremental load cannot be used where old data is changed or deleted!
stale after
Amount is a number specifying the time period. Decimals may be used. The unit is assumed to be days if
omitted.
The stale after option is typically used with database sources where there is no simple timestamp on the
original data. A stale after clause simply states a time period from the creation time of the QVD buffer after
which it will no longer be considered valid. Before that time the QVD buffer will be used as source for data
and after that the original data source will be used. The QVD buffer file will then automatically be updated
and a new period starts.
Example 3:
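A representative buffer with a validity period; the source is illustrative:
Buffer (stale after 7 days) SELECT * from MyTable;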
QVD buffers will normally be removed when they are no longer referenced anywhere throughout a complete script execution in the app that created them, or when that app no longer exists. The Store statement should be used if you wish to retain the contents of the buffer as a QVD or CSV file.
6.3 Reading data from QVD files
QVD files can be read into Qlik Sense in the following ways:
• Loading a QVD file as an explicit data source. QVD files can be referenced by a LOAD statement in the Qlik Sense script just like any other type of text file (csv, fix, dif, biff, and so on).
Example:
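For instance, the Product.qvd file created earlier in this tutorial could be read back like this:
LOAD * FROM 'lib://Tutorial Files/Product.qvd' (qvd);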
• Automatic loading of buffered QVD files. When using the buffer prefix on LOAD or SELECT statements, no explicit statements for reading are necessary. Qlik Sense will determine the extent to which it will use data from the QVD file as opposed to acquiring data using the original LOAD or SELECT statement.
• Accessing QVD files from the script. A number of script functions (all beginning with Qvd) can be used for retrieving various information on the data found in the XML header of a QVD file.