Qlik Associative Big Data Index Setup Configuration and Deployment
This new capability provides a governed, performant associative layer which can be deployed within
sources such as Hadoop-based data lakes, without the need to load the data into memory. QABDI
enables fast and engaging data discovery on massive data volumes with full access to all the
details of the underlying data.
This paper provides a technical overview of the steps involved in deploying a QABDI
infrastructure in AWS EKS, indexing a sample dataset and building an on-demand solution in Qlik
Sense.
Scenario
The deployment supports a sample customer use case with a requirement to derive value from a
large volume of data analyzed with Qlik Sense. The customer wants to allow a group of users to
access all the data in a governed environment without impacting the source with SQL queries. A
combination of approaches is used, including deploying QABDI in conjunction with on-demand app
generation (ODAG). The source data used in this paper is a combination of open source travel data
from the Flights, New York Taxi, Chicago Taxi and New York City Bike websites.
All data is in Parquet file format; the expectation is that the data has been “prepared” prior to the
indexing procedure, potentially by Qlik Data Catalyst. The following environment is used to analyze
this data:
• ODAG detail app using QABDI as the source instead of the database
The following high-level process enables the deployment of QABDI in an AWS EKS cluster:
Workstation
The cluster is deployed using a workstation with the following prerequisites in place and the
following software installed on an Ubuntu 18.04 instance. (Note: kubectl currently has an issue with
the Windows deployment terminal, so Linux is recommended.)
• The AWS command line interface (AWS CLI) configured to access the instance
Qlik Sense February 2019 release or above, with the following settings, is also required:
[Settings 7]
EnableBDIClient=1
BDIAsyncRequests=1
BDIStrictSynchronisation=0
The chart recommendation feature, which can generate complex expressions from drag-and-drops
that are not yet supported by QABDI, can be disabled in the C:\Program
Files\Qlik\Sense\CapabilityService\capabilities.json file with:
{
"flag": "DISABLE_AUTO_CHART",
"enabled": true
}
A per-application SET statement is also required to disable the insight advisor:
SET DISABLE_INSIGHTS = 1;
AWS EKS Installation
Instantiation of the EKS cluster can be achieved using the open source utility eksctl:
$ curl -sL "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
$ sudo mv /tmp/eksctl /usr/local/bin
aws-iam-authenticator
A tool that uses AWS IAM credentials to authenticate to a Kubernetes cluster is required;
aws-iam-authenticator provides this.
https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html
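A sketch of a typical installation, assuming the binary is downloaded from the link in the AWS documentation above (the download URL itself is not reproduced here):
$ curl -o aws-iam-authenticator <download URL from the AWS documentation above>
$ chmod +x ./aws-iam-authenticator
$ sudo mv ./aws-iam-authenticator /usr/local/bin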
The cluster is initialized with a name, region, tags, nodes and node type. For additional configuration
options see the eksctl documentation:
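As an illustration, a cluster similar to the one used here could be created as follows; the cluster name, tag, node count and node type are illustrative assumptions, while the region matches the eu-west-1 endpoints shown later in this paper:
$ eksctl create cluster \
    --name qabdi-cluster \
    --region eu-west-1 \
    --tags environment=qabdi \
    --nodes 3 \
    --node-type m5.4xlarge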
If eksctl fails, check CloudFormation for error messages. Confirm that aws-iam-authenticator has
worked by checking your current Kubernetes context:
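For example, the current context and the cluster nodes can be checked with:
$ kubectl config current-context
$ kubectl get nodes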
Configuring EFS
The index is created within an EFS file system, which is created as follows:
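A sketch of the creation step using the AWS CLI (the creation token is an illustrative value):
$ aws efs create-file-system --creation-token qabdi-efs --region eu-west-1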
This returns a FileSystemId; the VpcId and SubnetIds are additionally required.
Mount points are created for the EFS storage and connected to the VPC that the EKS cluster is in.
The VpcId, SubnetIds and SecurityGroupIds are required; the VpcId can be found as follows:
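One way to retrieve it is from the cluster description (the cluster name is a placeholder):
$ aws eks describe-cluster --name <cluster-name> \
    --query 'cluster.resourcesVpcConfig.vpcId' --output text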
Sample output:
vpc-0ad865ba8700c957d
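The SubnetIds of the cluster can be retrieved in the same way:
$ aws eks describe-cluster --name <cluster-name> \
    --query 'cluster.resourcesVpcConfig.subnetIds' --output text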
Sample output:
subnet-019b8cc9e2a5c7ef0
subnet-0aba1710d318a9db4
subnet-039dfe928f362c575
Create a mount point for each SubnetId with the VpcId and SecurityGroupIds:
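A sketch of the call for a single subnet, repeated per SubnetId (the IDs are placeholders):
$ aws efs create-mount-target \
    --file-system-id <FileSystemId> \
    --subnet-id <SubnetId> \
    --security-groups <SecurityGroupId> \
    --region eu-west-1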
The efs provisioner allows you to mount EFS storage as PersistentVolumes in Kubernetes.
Install the EFS provisioner (name = aws-efs with storageclass = efs) with Helm by setting
the FileSystemId and region. Be sure Path is set to / or the pod will fail to create.
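A sketch of the install, assuming Helm 2 and the stable/efs-provisioner chart with its stable repository already configured (the FileSystemId is a placeholder; the value keys follow that chart's documentation):
$ helm install stable/efs-provisioner --name aws-efs \
    --set efsProvisioner.efsFileSystemId=<FileSystemId> \
    --set efsProvisioner.awsRegion=eu-west-1 \
    --set efsProvisioner.path=/ \
    --set efsProvisioner.storageClass.name=efs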
For more configuration options consult the official EFS provisioner chart.
Deploying QABDI in the AWS EKS Cluster
Key and Repository setup
Helm packages Kubernetes applications together as a "chart". These charts are tarballs that are
stored externally.
Bintray is used as the chart repository; the Helm repository containing the QABDI charts is added as
follows:
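A sketch of adding the repository (the repository URL is a placeholder; use the URL supplied with your QABDI distribution). The bt_qlik name is referenced in the install step below:
$ helm repo add bt_qlik <QABDI chart repository URL on Bintray>
$ helm repo update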
Deployment of QABDI
Installation of the QABDI chart with the default values is achieved by providing a release-name,
repository (bt_qlik), license acceptance and license key as follows:
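A sketch of the install, assuming Helm 2; the chart name (qabdi) and the licenseKey value name are assumptions, while acceptLicense=true, the bt_qlik repository and the qlik-mn release name are taken from this paper:
$ helm install bt_qlik/qabdi --name qlik-mn \
    --set acceptLicense=true \
    --set licenseKey=<license key>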
The release-name is a string that can be used to differentiate Helm deployments (as you could
theoretically deploy QABDI more than once on a cluster). If one is not provided, Helm will generate
one.
By setting acceptLicense=true you agree to the Qlik User License Agreement (QULA), which is
required to start the qsl_processor_tool and the indexer_tool. You don't need to do it while running
helm install but it is the easiest way. Another way to accept the license agreement is to log in to the
bastion and type export ACCEPT_QULA=true. When running helm install the QULA text will be
printed to the console.
The default values.yaml can be overridden with additional yaml files and, optionally, --set flags:
The number of each QABDI service is configured in additional yaml files. To install the required
configuration, i.e. three indexers, one indexingmanager, one qslexecutor, one qslmanager, two
qslworkers and three symbolservers, change the replicaCounts:
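A minimal sketch of such an override file, assuming the chart exposes a replicaCount per service under keys matching the service names above (the key names are assumptions):
indexer:
  replicaCount: 3
indexingmanager:
  replicaCount: 1
qslexecutor:
  replicaCount: 1
qslmanager:
  replicaCount: 1
qslworker:
  replicaCount: 2
symbolserver:
  replicaCount: 3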
## Image configuration.
image:
repository: qlik-docker-qabdi.bintray.io/bdiproduct
To access the index from Sense via the QSL manager, a LoadBalancer can be used with the default
port 55000, using the provided qsl-manager-loadbalancer.yaml:
## BDI values
qslmanager:
service:
type: LoadBalancer
In addition, a series of yaml files are provided with this paper for varying data volumes.
Checking the pods running per node and their status will show output similar to:
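For example, using a standard kubectl listing:
$ kubectl get pods -o wide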
Retrieve the external IP of the QSL manager, which will be used as the Host when creating a new
QABDI connection in Qlik Sense:
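For example, list the services and read the EXTERNAL-IP of the qslmanager LoadBalancer entry:
$ kubectl get svc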
Note the host for the QABDI connection in the above example will be:
a3b0757b1decb11e88b8f0a1f2ce7daa-397101174.eu-west-1.elb.amazonaws.com
Mount the EFS Drive into the Pods
The deployed pods require a shared mount to access the source parquet files and to act as a
repository for the index output.
Note: the mount command requires root access. To enable this, “privileged” must be set to
“true”, allowing the Docker container root access within the pods:
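A sketch of the kind of override involved; the exact key path depends on the QABDI chart and is an assumption here:
## values override (assumed key names)
securityContext:
  privileged: true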
Mounting the drives is a two-stage process. In this case a shell script (exec_in_all_pods.sh, shared
with this paper) is used to execute the commands in all pods rather than in each one individually.
First the shared folder is created:
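A sketch of the first stage, assuming exec_in_all_pods.sh takes the command to run as its argument and that the shared folder is /home/efs (an assumption based on the log path used later in this paper):
$ ./exec_in_all_pods.sh "sudo mkdir -p /home/efs"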
The EFS drive is mounted with reference to the Filesystem id and region:
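A sketch of the second stage, using the standard NFSv4 mount options for an EFS endpoint (the FileSystemId and mount point are placeholders/assumptions):
$ ./exec_in_all_pods.sh "sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 <FileSystemId>.efs.eu-west-1.amazonaws.com:/ /home/efs"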
The source parquet files are organized on the EFS storage and the required configuration files are
updated to specify the source, output and associations. All configuration files are stored in
/home/ubuntu/dist/runtime/config/ in the pods.
field_mappings_file.json
This file is required to specify the associations between the files (called A2A in the indexing
process).
{
  "field_mappings": [
    {
      "column1": "Flights.Flight_Year",
      "column2": "Link.link_flight_year"
    },
    {
      "column1": "Taxi_Bike_Trips.pickup_year",
      "column2": "Link.link_pickup_year"
    }
  ]
}
indexing_setting.json
The purpose of this file is to indicate where the source data, output index and association mapping
file are stored, along with the model name (alltrips).
{
  "output_root_folder": "/home/output",
  "symbol_output_folder": "",
  "index_output_folder": "",
  "symbol_positions_output_folder": "",
  "symbol_server_async_threads": 1,
  "create_column_index_threads": 1,
  "dataset_name": "alltrips",
  "source_data_path": "/home/data/alltrips_source",
  "field_mappings_file": "/home/ubuntu/dist/runtime/config/field_mappings_file.json",
  "logging_settings_file": ""
}
"output_root_folder" - output folder for the index files, logs and config files
"dataset_name" - used as the model name in Qlik Sense
"source_data_path" - the location of the NYC sample parquet files
"field_mappings_file" - name of the associations json file.
All the indexing startup scripts and subsequent tasks can be executed from any one of the available
pods; for this deployment the bastion is used, by logging on to it as follows:
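For example, assuming the bastion pod name reported by kubectl get pods:
$ kubectl exec -it <bastion-pod-name> -- /bin/bash
Once on the bastion, the indexing environment is started: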
$ ./dist/runtime/scripts/indexer/start_indexing_env.sh
Once the indexing services have started, the service_manager.sh script can be used to query, start,
or stop indexing:
$ ./home/ubuntu/dist/runtime/scripts/indexer/service_manager.sh
source_data_path: /home/data/alltrips_source
Register ip found in cluster configuration: qlik-bdi-indexingmanager
Running in interactive Mode:
Valid Options:
h) help
1) list
2) stop
3) start
a) stop all services
q) quit
The task_manager.sh script is used to invoke the indexing process via a series of steps:
$ ./dist/runtime/scripts/indexer/task_manager.sh
Step 1 scans the data and creates a series of json files in the output folder
/home/data/alltrips_output/indexing/output/config/indexer containing the data types and references
to the source parquet files:
1) **** scan data for schema generation ****
[19-02-01T08:30:45:208]-[ss_srv-info]-[000504] /home/data/alltrips_source/flights.table/fares.parquet/1_0_9.parquet is a single parquet file
[19-02-01T08:31:33:681]-[ss_srv-info]-[000504] /home/data/alltrips_source/flights.table/fares.parquet/1_1_4.parquet is a single parquet file
[19-02-01T08:32:21:881]-[ss_srv-info]-[000504] Create symbol table for table 'flights' takes 8890 seconds
[19-02-01T08:32:21:881]-[ss_srv-info]-[000504] Apply compaction...
[19-02-01T08:33:27:193]-[ss_srv-info]-[000504] Apply compaction for table 'flights' in dataset 'alltrips' takes 650 seconds
[19-02-01T08:33:27:193]-[ss_srv-info]-[000504] Create symbol table for table 'flights' in dataset 'alltrips'... DONE
Choose Action: l
[19-02-01 08:36:19:457]-[console-info]-[000400] Connect to icd-mn-indexingmanager:55020
[19-02-01 08:36:19:461]-[console-info]-[000400] Symbol Table creation progress at: 100%
[19-02-01 08:36:19:461]-[console-info]-[000400] UnmappedColumn Index creation progress
at: 100%
[19-02-01 08:36:19:462]-[console-info]-[000400] Column Index creation progress at: 100%
[19-02-01 08:36:19:463]-[console-info]-[000400] A2A creation progress at: 100%
QSL services are required to process the selections made from the Sense client and to process any
extractions made in the load script from the index into memory. The QSL services depend
on several other services, including the Indexing Registry, Persistence Manager and all Symbol services.
All executables to start the QSL services are in the runtime/scripts/qsl_processor folder.
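The QSL environment is started from that folder (the same commands appear in the cheat sheet later in this paper):
$ cd /home/ubuntu/dist/runtime/scripts/qsl_processor
$ ./start_qsl_env.sh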
Once the following message appears in the console, the process has started:
• Taxi and Bike Details.qvf - Detail application with modified ODAG script
Qlik Sense February 2019 release and above are the only versions that can use the QABDI
functionality, and the following flags need to be set:
[Settings 7]
EnableBDIClient=1
BDIAsyncRequests=1
BDIStrictSynchronisation=0
The chart recommendation feature, which can generate complex expressions from drag-and-drops
that are not yet supported by QABDI, can be disabled in the capabilities.json file with:
{
"flag": "DISABLE_AUTO_CHART",
"enabled": true
}
A per application set statement is also required to disable the insight advisor:
SET DISABLE_INSIGHTS = 1;
One of the modes of interacting with the index is “live” mode, which effectively allows Qlik Sense to
have a minimal memory footprint by loading only the metadata into memory.
A connection is configured with the following parameters: create a new connection, select BDI from
the list and enter the criteria:
Currently QABDI does not support search or the insight advisor, so the search index and insights are
disabled:
SET CreateSearchIndexOnReload=0;
SET DISABLE_INSIGHTS = 1;
As part of the product a QABDI connector has been developed which allows users to extract data
from the index into an in-memory application, using an autogenerated script through the Data Load
Editor (DLE).
Opening the GUI displays all the available entities in the alltrips model for selection:
Inserting the script from the connector GUI produces the following:
A default limit for the number of rows of data (MaxRowsPerTable) is set to 10,000 and can be
changed for higher volumes. This is controlled by an initial count before extraction, with the
script exiting if the limit is breached:
[rowcount@Flights]:
QSL Select count(*) as nRows0 from [alltrips].[Flights] at STATE $(bdiHandle);
let nFilteredRows = Peek('nRows0', 0, 'rowcount@Flights');
The load scripts for on-demand template apps contain connection and data load commands whose
parameters are set by a special variable, odb_setHandle, that the on-demand app service uses for
linkage. The odb_setHandle variable is used specifically for QABDI linkage and captures all the
selection states from the selection app:
Flights:
QSL SELECT
[ItinID],
[SeqNum],
[Coupons],
[Flight_Year],
[Flight_Quarter],
[Origin],
[OriginCountry],
[OriginState],
[Dest],
[DestCountry],
[DestState],
[TkCarrier],
[Passengers],
[FareClass],
[Distance]
FROM [alltrips].[Flights]
AT STATE $(odb_setHandle);
At run time the $(odb_setHandle) variable expands to a concrete set handle, for example:
AT STATE [alltrips].[h6790741d_5ed5_496c_9a5b_fd47b4c165e2]
Converting existing SQL-generating ODAG apps to use the index as the source replaces the
WHERE_CLAUSE variable creation with a selection-state clause using QSL syntax. For example, we can
create a specific QSL SET statement which will create the “set handle” containing a reference to the
selected columns and the data to filter on.
The QSL syntax will apply the set handle to the underlying model via an AT STATE statement with
the following syntax:
LOAD <column1,column2>;
QSL SELECT <column1,column2>
FROM <modelname.table>
AT STATE hPassedSelection;
The hPassedSelection set handle is dynamically populated by the ODAG process; in the example
below a set handle is created which consists of:
The following changes are applied to an on-demand detail app which executes SQL, in comparison
to the syntax required for QABDI.
The SELECTION_STATE generation is modified to create syntax in the selection-state format and to
cater for multiple selection-state criteria; all instances of WHERE_PART are replaced with
SELECTION_STATE.
The WHERE and IN clauses are replaced with QSL syntax; the main change is substituting the model
name [alltrips], with a “.1” suffix to indicate current selections, into the statement.
To cater for multiple selection states and the required format, some string replacement is needed:
Changing quoting char for SELECTION_STATE in CALL BuildValueList
For each of the bound fields, modification of the quoting options in the CALL BuildValueList
statements is required to cater for the QSL syntax, by changing the quote character ASCII code to 0:
The script loops through all field bindings and calls the modified subroutine
(ExtendSelectionState).
The set handle statement, which references the SELECTION_STATE variable, is applied to the fact
table to filter the data:
And finally, the set handle is applied to the QSL SELECT statement containing the fields, model
name and required table:
FROM [alltrips].[Flights]
AT STATE hPassedSelection;
Troubleshooting and Cheat Sheet
Start the indexing services and check that they are running:
$ cd /home/ubuntu/dist/runtime/scripts/indexer
$ ./start_indexing_env.sh
$ cd /home/ubuntu/dist/runtime/scripts/qsl_processor
$ ./start_qsl_env.sh
$ ./stop_qsl_env.sh
Error Checking
If the source data is not in the location specified in the indexing_setting.json file, or the format is not
as described, an error will be thrown in the indexing console. The running indexer and QSL processes
can be checked with:
$ ps aux|grep qsl
$ ps -ef | grep -E '[i]ndexer_tool|[q]sl_processor_tool'
Note that if the Indexing/QSL processors stop responding, crash or simply disappear, the best
approach is to start by killing all of the running processes. When restarting, start the indexing
cluster, skip the indexing tasks, and then start the QSL processor tool.
Currently you can kill the QSL and Indexer processes in two ways.
Recommended approach:
$ cd /home/ubuntu/dist/runtime/scripts/qsl_processor
$ ./stop_qsl_env.sh
$ cd /home/ubuntu/dist/runtime/scripts/indexer
$ ./service_manager.sh
$ Enter option: a (stop all services)
Kill command: connect to each QSL and Indexing instance, and type:
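A sketch of this approach, using the process names from the ps commands above (pkill usage here is an assumption; connect to each instance as shown earlier and adjust to the processes actually listed):
$ pkill -f qsl_processor_tool
$ pkill -f indexer_tool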
QABDI logs are stored in your indexing output folder. Based on the configuration defined for your
QABDI environment, it should be in the following location: /home/efs/alltrips_output/logs
tail -f Mgr_xxx-qslmanager_55000.qlog
Check the configuration files updated during the indexing process, stored in
output/config/indexer/:
registry_service.json
persistence_manager.json
indexing_manager_service.json
To destroy the release and remove all pods, volumes and associated data:
helm del --purge qlik-mn
About Qlik
Qlik is on a mission to create a data-literate world, where everyone can use data to solve their most
challenging problems. Only Qlik’s end-to-end data management and analytics platform brings together
all of an organization’s data from any source, enabling people at any skill level to use their curiosity to
uncover new insights. Companies use Qlik products to see more deeply into customer behavior,
reinvent business processes, discover new revenue streams, and balance risk and reward. Qlik does
business in more than 100 countries and serves over 48,000 customers around the world.
qlik.com
© 2018 QlikTech International AB. All rights reserved. Qlik®, Qlik Sense®, QlikView®, QlikTech®, Qlik Cloud®, Qlik DataMarket®, Qlik Analytics Platform®, Qlik NPrinting®, Qlik
Connectors®, Qlik GeoAnalytics®, Qlik Core®, Associative Difference®, Lead with Data™, Qlik Data Catalyst™, Qlik Associative Big Data Index™ and the QlikTech logos are trademarks of
QlikTech International AB that have been registered in one or more countries. Other marks and logos mentioned herein are trademarks or registered trademarks of their respective owners.
BIGDATAWP092618_MD