
Running Databricks Migrations Code Analyzer

Detailed steps for Running Analyzer
1. PREREQUISITES
2. EXTRACT THE METADATA
3. RUN THE ANALYZER
   Informatica PowerCenter
   Informatica Cloud
   All SQL code and DBT SQL code (Redshift, Snowflake, Teradata, Oracle, SQL Server, Synapse, Greenplum, Netezza and .sql files)
   SSIS
4. INTERPRETING THE RESULTS
Appendix
   Exporting XMLs out of ETL tools
      DataStage
      Informatica PowerCenter XML export
      For Informatica Cloud (IICS)
      For SQL-Based Systems (Snowflake, Teradata, Netezza, Oracle, Synapse, SQL Server, Greenplum, Vertica, Presto, any DB, etc.)
      Azure Synapse (Dedicated)
      Azure Synapse (Serverless)
      SSIS
   How is complexity calculated in the analyzer?
      SQL Code Analysis
      Informatica Code Analysis
      DataStage Analysis
      Talend Analysis
      SSIS Code Analysis
      Alteryx Code Analysis
      BODS Code Analysis
      SAS Code Analysis
      Pentaho Code Analysis
   Splitter Instructions
Detailed steps for Running Analyzer
Contact Databricks PS if you need help running it.
The analyzer can be executed on Windows, macOS, and Linux.

1. PREREQUISITES

IMPORTANT NOTE:
Running the Code Analyzer (and SQL Splitter) requires the prerequisites below, so make sure to work through this section before continuing.

If you have downloaded the Mac version, please pay special attention to section C below.

A) Analyzer Package Download and Integrity Checks

1) Download the Code Analyzer package using the SAS URL shared with you by email.

2) The Analyzer package is a zip file which contains the Code Analyzer and SQL Splitter binaries (both Linux and Windows versions).

3) If you would like to verify the integrity of the downloaded zip file, please let your Databricks representative know and we can provide the necessary information.

B) Share Folder for Sending Back the Generated Code Analysis Report(s)

1. Running the Code Analyzer (steps are in the subsequent sections) generates analysis reports in spreadsheet format (xlsx). Do not share these reports with the Databricks team by email.

2. Instead, use a share folder (with proper access control) to upload the generated report(s) and give access to your Databricks technical point of contact.
   a. You will receive a shared-folder URL in the same email as the Code Analyzer download link.
   b. If your organization has an existing secured file-sharing process, use it to share the generated report files. Otherwise, you can use the same shared folder (from step a) to upload the xlsx files.
C) Mac version requires an extra step to allow running the analyzer (and additional steps if your Mac has an M3 or newer chip)

1. After downloading and unzipping the analyzer/splitter zip file, navigate in the Mac terminal to the location of the unzipped analyzer contents (you may want to place the analyzer in a folder outside of "Downloads").
2. Perform an "ls -l" and you will notice that the unzipped "analyzer" file is not executable. Change this by running "chmod +x analyzer".
3. Re-run "ls -l" to verify you now see something like this:

-rwxr-xr-x@ 1 user.name staff 12345678 Jan 30 12:50 analyzer

4. Run "./analyzer", which will bring up a pop-up window saying the tool developer is not verified (this is expected). DO NOT move it to trash; click Cancel.
5. Navigate to System Settings -> Privacy and Security, scroll down to the entry for the blocked analyzer, and click "Open Anyway".
6. Another window might open asking you to confirm that you want to open it; click Open.
7. Re-run "./analyzer" to verify that you can now run the analyzer tool from the command line. If it works, it will respond with the usage output showing which flags/arguments are available/required by the tool.
8. If you have an M3, M3 Pro, or M3 Max, you also need to install Rosetta (see https://support.arduino.cc/hc/en-us/articles/7765785712156-Error-bad-CPU-type-in-executable-on-macOS).
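For reference, the Terminal side of these steps looks roughly like this (a minimal sketch; the path is a placeholder, and installing Rosetta via softwareupdate is one common approach, as described in the linked article):

cd ~/path/to/unzipped/analyzer    # placeholder: wherever you unzipped the package
chmod +x analyzer                 # make the binary executable
ls -l analyzer                    # should now show -rwxr-xr-x permissions
./analyzer                        # first run triggers the "developer not verified" prompt

# Apple Silicon Macs that need it (e.g. the M3 family) can install Rosetta 2 from the command line:
softwareupdate --install-rosetta --agree-to-license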

2. EXTRACT THE METADATA

All the major ETL platforms provide some kind of export of their code repositories.
Typically this is done in XML or JSON formats that can be used to restore the
environment. Here is a short guide for how to export from the various environments:
DataStage
Informatica PowerCenter XML export
Informatica Cloud (IICS)
SQL-Based Systems
Azure Synapse (Dedicated)
Azure Synapse (Serverless)
Talend
SSIS

3. RUN THE ANALYZER

The analyzer can be executed on Windows, macOS, and Linux. Before running the analyzer
you might need to move all config files into the same directory as the tool itself. Config files
may be packaged in the zip download in a 'config' directory; simply copy those files into the
same directory as the 'analyzer' executable file.

Please confirm that general_sql_specs.json is in the same directory as the analyzer
executable file; it should be included in the download already. If other config files are
needed, please ask your Databricks representative to help acquire any other specific
config files needed for your source system.
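A minimal sketch of that preparation step (assuming the package was unzipped into a folder called analyzer-package; the folder name is illustrative):

cd analyzer-package
cp config/*.json .                     # copy packaged config files next to the analyzer executable
ls general_sql_specs.json analyzer     # verify both now sit in the same directory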
Sample commands to run the analyzer:

DataStage
analyzer -t DATASTAGE -d "<folder with ds xml files>" -r <path to xlsx report file>

Informatica PowerCenter
analyzer -t INFA -d "<folder with xml files>" -r <path to xlsx report file>

Informatica Cloud
analyzer -u ic2dws.json -t INFACLOUD -d "<folder with zip files>" -r <path to xlsx report file>

All SQL code and DBT SQL code
(e.g. Redshift, Snowflake, Teradata, Oracle, SQL Server, Synapse, Greenplum, Netezza, and .sql files)

Important Note
Analyzing SQL code requires the following steps for accurate analysis. Make sure to follow the instructions below.

● Typically, DDL statements (such as create table, create view, create procedure and create function) are extracted for analysis using various utilities/commands and are kept in a few large files. These files need to be split before running the analyzer.

● The DML statements, queries, and data load scripts, on the other hand, are maintained as part of the application code and are kept in a large number of small files. These files should not be split before running them through the analyzer.

● Follow the steps below to get your code ready for the Analyzer:

1) Keep the SQL DDL files and the rest of the SQL code in separate root folders. For example:
   ● sql_ddl → DDL files
   ● sql_other → all other SQL files

2) Run the sqlsplit program with sql_ddl as input. You need an empty folder for the output created by sqlsplit (let's call it analyzer_input/sql_ddl).

3) Also copy sql_other to analyzer_input/sql_other.

4) Now run the analyzer using analyzer_input as the input folder.

5) File extensions to be processed by the Analyzer: by default the analyzer looks for files with the .sql extension only, so it is important to tell the analyzer about ALL file extension types in your SQL code base (for example bteq, in the case of Teradata BTEQ scripts) using the -E input flag. The value for this parameter is a comma-separated list, and there is no need to include the "." (period).

Adjust this extension list (ksh,sh,bteq,sql) in the analyzer command below:

mkdir analyzer_input
mkdir analyzer_input/sql_ddl
cp -R sql_other analyzer_input/sql_other

sqlsplit -d sql_ddl -o analyzer_input/sql_ddl

analyzer -t SQL -E ksh,sh,bteq,sql -d analyzer_input -r analyzer_report_v1.xlsx -u general_sql_specs.json

SSIS

analyzer.exe -d "<folder with dtsx/ssis exported files>" -t SSIS -u ssis2dws.json -r analyzer_report_v1.xlsx

Note:
If a path contains spaces (say in a folder name), wrap it in double quotes,
e.g. "C:\Users\xyz\Downloads\analyzer-package\SQL Server"
(double quotes because the "SQL Server" directory name has a space); otherwise you will get an error.

4. INTERPRETING THE RESULTS


You can review how the analyzer calculates complexity for each source system in the Appendix section "How is complexity calculated in the analyzer?".
Appendix

Exporting XMLs out of ETL tools

DataStage
The easiest way to export metadata is through the GUI, one folder at a time. To do so, right-click on the folder to export and select the "Export" option. Please ensure that the XML format is specified for the export and that all the jobs within the folder are selected (they are by default).

Informatica PowerCenter XML export

Overview
To run the Analyzer or converters on Informatica XMLs, the XML files first need to be
extracted from the PowerCenter repository. Typically, it is easier to deal with the
conversion at a relatively granular level, so extracting the artifacts at the workflow level
is advisable.
● Metadata Extraction
To extract the metadata from the PowerCenter repository, use the following commands:

● Connect to the repository
pmrep connect <list of credentials>

● Get the list of folders
pmrep listobjects -o FOLDER

● For each folder, get the list of workflows
pmrep listobjects -o WORKFLOW -f <your folder name>

● Workflow extraction
Create a batch script with the following command template for each workflow in each folder (Excel can be used to generate the commands); a sketch of such a script follows below.
pmrep objectexport -n workflow_name -o WORKFLOW -f folder_name -b -r -m -s -u path-to-output-file

(Or do it manually by exporting the entire folder and saving it as XML.)
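A minimal sketch of such an export script (shell syntax; the connection flags, the workflows.txt list file, and the output paths are illustrative placeholders, not values from this document):

# Connect once to the repository (repository, domain, user, and password are placeholders)
pmrep connect -r MyRepo -d MyDomain -n repo_user -x repo_password

# workflows.txt contains one "folder_name workflow_name" pair per line
while read -r folder workflow; do
  pmrep objectexport -n "$workflow" -o WORKFLOW -f "$folder" \
    -b -r -m -s -u "exports/${folder}_${workflow}.xml"
done < workflows.txt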

For Informatica Cloud (IICS):

The following comes from this article: How to read metadata in Informatica Cloud (IICS)? - ThinkETL.

● Select all the Mapping Configuration tasks you want to read the metadata from and export them as a single file.
● Exporting a Mapping task also fetches the associated mapping.
● Make sure you select the check box to include all dependent assets.
● Next, click on My Import/Export Logs in the left pane, go to the Export tab, find the name under which you exported the code, and click Download.
● All the tasks and their dependencies are downloaded as a single zip file. In this example the file name will be IICS_Demo_Export.zip.

Talend
To export all jobs in bulk, right-click on Job Designs and select "Export Items". In the popup, select "Include All Dependencies".
Also see this link on the topic: Talend export and import a job - Stack Overflow.
Note: while Talend jobs can be exported as a single zip file, please unzip the file(s) before running the analyzer or any converter utilities. Both the analyzer and the converters look for .item and .properties files in non-zipped folders.

For SQL-Based Systems (Snowflake, Teradata, Netezza, Oracle, Synapse, SQL Server, Greenplum, Vertica, Presto, any DB, etc.)

(Get the code/scripts/stored procedures from the database into a folder.)

Typically, client environments make use of source code repositories such as Git, SVN, Perforce, and others. It is preferable to get the code from such a repository, potentially a combination of production branches and dev/qa branches, whichever makes sense. This is the preferred method of getting the code, as it is stored in its original form, unobstructed by any database-injected code snippets.
The same is true for general shell scripts and shell script wrappers with embedded SQL code.

If such a repository is not available, SQL-based objects such as procedures, UDFs, macros, and table and view DDLs can be extracted using native code export utilities. SQL scripts and BTEQ code that live outside of the database on a file system can be taken as is.
For example, in the case of Snowflake, you can use the statement below to extract DDLs. It extracts the definitions of schemas, tables, views, functions, stored procedures, tasks, etc. in that database. Please repeat the step for all production databases.
You will have to use the analyzer splitter option to split the DDLs automatically (see the Splitter Instructions section for how to use it).

select get_ddl('database','<database name>');
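One way to run that statement and capture clean output into a file is via the snowsql CLI (a sketch; the connection name, database name, and output file are placeholders):

snowsql -c my_connection \
  -q "select get_ddl('database', 'MY_PROD_DB');" \
  -o output_file=my_prod_db_ddl.sql \
  -o header=false -o timing=false -o friendly=false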

Please note that some SQL exporter utilities may create files with a single long line, with all
the statements appended on the same line. This would not be an acceptable import into
the analyzer.

Also, note that every database object (table/view/procedure/function/macro, etc.) should be exported into its own individual file. If that is not possible and the only way to export database code is into one large file, then the SQL Splitter utility that we provide should be run to split the large combined files into smaller individual files.

(Ask the Teradata/Oracle DBA to export the table DDLs, view DDLs, packages, stored procedures, functions, etc. to a folder, and then run the analyzer on it.)

Azure Synapse (Dedicated)


To extract metadata such as table, view, and stored procedure DDL, you can use Microsoft SQL Server Management Studio.

● The preferred way to export the DDLs is one file per database object, by selecting the "One script file per object" option in the Set Scripting Options step of the Generate Scripts wizard. In that step, select all required object types.
● If you already have all the DDL statements in a single file, the Analyzer package comes with a SQL splitter program which you can use to split one large file containing all DDL statements into individual files. This needs to be run before the analyzer command. See the "Run the Analyzer" section for SQL code.
Azure Synapse (Serverless)
To extract metadata such as table, view, and stored procedure DDL, you can use Microsoft SQL Server Management Studio.

For a serverless database, the "Generate Scripts" context-menu option is not available at the database level in the studio (as of version 19.1), so you need to use the "Object Explorer Details" view and select the required objects to export the corresponding DDL to a file:

● Switch to the Object Explorer Details view
● Export the external table DDLs
● Export the view DDLs

SSIS

● You'll need to export the DTSX packages. For details on how to obtain them, see: Save and Run Package (SQL Server Import and Export Wizard) - SQL Server Integration Services (SSIS) | Microsoft Learn.

ODI

● Exporting jobs in ODI is detailed in the Oracle documentation chapter "Exporting and Importing".

Alteryx

● The Analyzer needs the .yxmd files. These can be obtained by selecting File > Export to download your workflow to your local machine in .yxmd format.

SAP Business Objects Data Services

● Instructions for export can be found in the corresponding SAP Help Portal articles.
How is complexity calculated in the analyzer?

SQL Code Analysis


At the beginning of script analysis, mark the script with a complexity level of LOW.

If any of the following conditions are true, then mark the script as MEDIUM complexity:

1. At least one loop
2. Conventional Statement count greater than 10
3. Simple Statement count greater than 1000
4. Number of pivot statements between 1 and 3
5. Number of XML SQL statements between 1 and 3

If any of the following conditions are true, then mark the script as COMPLEX complexity:

1. Number of loops greater than 5
2. Conventional Statement count greater than 30
3. Simple Statement count greater than 2000
4. Number of pivot statements greater than 3
5. Number of XML SQL statements greater than 3

If any of the following conditions are true, then mark the script as VERY COMPLEX complexity:

1. Number of loops greater than 8
2. Conventional Statement count greater than 50
3. Simple Statement count greater than 5000
4. Number of pivot statements greater than 5
5. Number of XML SQL statements greater than 5

Simple Statement count is determined by regex patterns in the analyzer config file.

Conventional Statement count is determined by the formula below:

Conventional Statement count = Total Statement count - Simple Statement count

If the analyzer encounters a SQL procedure or function body inside a SQL file, it will categorize the script as "ETL".

Teradata MLOAD and FLOAD scripts follow the same rules as above.
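For illustration only, the escalation above can be read as a sequence of threshold checks, roughly as in this sketch (not the analyzer's actual implementation; the counts would come from the analyzer's own parsing):

# Hypothetical sketch of the SQL complexity rules. Argument order:
# loops, conventional statements, simple statements, pivot statements, XML SQL statements
classify_sql_script() {
  local loops=$1 conventional=$2 simple=$3 pivots=$4 xml=$5
  local level="LOW"
  if (( loops >= 1 || conventional > 10 || simple > 1000 || pivots >= 1 || xml >= 1 )); then
    level="MEDIUM"
  fi
  if (( loops > 5 || conventional > 30 || simple > 2000 || pivots > 3 || xml > 3 )); then
    level="COMPLEX"
  fi
  if (( loops > 8 || conventional > 50 || simple > 5000 || pivots > 5 || xml > 5 )); then
    level="VERY COMPLEX"
  fi
  echo "$level"
}

# Example: 2 loops, 12 conventional statements, 300 simple statements, no pivots or XML SQL
classify_sql_script 2 12 300 0 0    # prints MEDIUM
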
Informatica Code Analysis
At the beginning of mapping analysis, mark the mapping with a complexity level of LOW.

If any of the following conditions are true, then mark the mapping as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Number of sources > 1
3. Number of joins >= 1
4. Number of lookups between 4 and 6
5. Number of targets > 1
6. Overall function call count >= 10
7. Number of components (transformations) >= 10

If any of the following conditions are true, then mark the mapping as COMPLEX complexity:

1. Three MEDIUM breaks from the list above (i.e. at least three of the MEDIUM conditions are met)
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of mapping components >= 20
4. Overall function call count >= 20
5. Complex or Unstructured nodes are being used (e.g. Normalizer)
6. Number of lookups between 7 and 14

If any of the following conditions are true, then mark the mapping as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of lookups > 15
4. Number of mapping components >= 50
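As a rough illustration of the "breaks" counting (a hedged sketch, not the analyzer's actual code; the example counts are made up): a mapping that trips three or more of the MEDIUM rules is promoted to COMPLEX.

# Hypothetical example mapping: 3 sources, 2 joins, 5 lookups, 2 targets,
# 4 function calls overall, 8 components, 1 expression with 5+ function calls
sources=3; joins=2; lookups=5; targets=2; func_calls=4; components=8; big_exprs=1

medium_breaks=0
(( big_exprs >= 2 && big_exprs <= 4 )) && (( medium_breaks += 1 ))
(( sources > 1 ))                      && (( medium_breaks += 1 ))
(( joins >= 1 ))                       && (( medium_breaks += 1 ))
(( lookups >= 4 && lookups <= 6 ))     && (( medium_breaks += 1 ))
(( targets > 1 ))                      && (( medium_breaks += 1 ))
(( func_calls >= 10 ))                 && (( medium_breaks += 1 ))
(( components >= 10 ))                 && (( medium_breaks += 1 ))

level="LOW"
(( medium_breaks >= 1 )) && level="MEDIUM"
(( medium_breaks >= 3 )) && level="COMPLEX"
echo "$level"    # prints COMPLEX: sources, joins, lookups, and targets all break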

DataStage Analysis
At the beginning of job analysis, mark the job with a complexity level of LOW.

If any of the following conditions are true, then mark the job as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Number of sources > 1
3. Number of joins >= 1
4. Number of lookups between 4 and 6
5. Number of targets > 1
6. Overall function call count >= 10

If any of the following conditions are true, then mark the job as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of job components >= 20
4. Overall function call count >= 20
5. Complex or Unstructured nodes are being used (ChangeCapture, etc.)
6. Number of lookups between 7 and 14

If any of the following conditions are true, then mark the job as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of lookups > 15
4. Number of job components >= 50

Talend Analysis
At the beginning of job analysis, mark the job with a complexity level of LOW.

If any of the following conditions are true, then mark the job as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Number of sources > 1
3. Number of joins >= 1
4. Number of job components >= 10
5. Number of targets > 1
6. Overall function call count >= 10

If any of the following conditions are true, then mark the job as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of job components >= 20
4. Overall function call count >= 20
5. Complex or Unstructured nodes are being used (ChangeCapture, etc.)

If any of the following conditions are true, then mark the job as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of job components >= 50

SSIS Code Analysis
At the beginning of package analysis, mark the package with a complexity level of LOW.

If any of the following conditions are true, then mark the package as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Number of sources > 1
3. Number of targets > 1
4. Overall function call count >= 10
5. Number of package components >= 10

If any of the following conditions are true, then mark the package as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of package components >= 20
4. Overall function call count >= 20

If any of the following conditions are true, then mark the package as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of package components >= 50

Alteryx Code Analysis


At the beginning of package analysis, mark the package with a complexity level of LOW.

If any of the following conditions are true, then mark the package as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Overall function call count >= 10
3. Number of package components >= 10

If any of the following conditions are true, then mark the package as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of package components >= 20
4. Overall function call count >= 20

If any of the following conditions are true, then mark the package as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of package components >= 50

BODS Code Analysis


At the beginning of job analysis, mark the job with a complexity level of LOW.

If any of the following conditions are true, then mark the job as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Overall function call count >= 10
3. Number of job components >= 10

If any of the following conditions are true, then mark the job as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of job components >= 20
4. Overall function call count >= 20

If any of the following conditions are true, then mark the job as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of job components >= 50

SAS Code Analysis


At the beginning of SAS script analysis, mark the script with a complexity level of LOW.

If any of the following conditions are true, then mark the script as MEDIUM complexity:

1. Macro definition count > 3
2. Data block count > 5
3. Number of statements inside macros and data blocks > 50
4. Conditional statement count > 10
5. 'DO' loop count > 3
6. Count of SQL Procs categorized as MEDIUM > 0
7. SQL Proc count > 10

If any of the following conditions are true, then mark the script as COMPLEX:

1. Macro definition count > 7
2. Data block count > 15
3. Number of statements inside macros and data blocks > 100
4. Conditional statement count > 20
5. 'DO' loop count > 10
6. Count of SQL Procs categorized as COMPLEX > 0
7. SQL Proc count > 20

If any of the following conditions are true, then mark the script as VERY COMPLEX:

1. Macro definition count > 15
2. Data block count > 25
3. Number of statements inside macros and data blocks > 150
4. Conditional statement count > 50
5. 'DO' loop count > 20
6. Count of SQL Procs categorized as VERY COMPLEX > 0
7. SQL Proc count > 40

Pentaho Code Analysis


At the beginning of mapping analysis, mark the mapping with a complexity level of LOW.

If any of the following conditions are true, then mark the mapping as MEDIUM complexity:

1. Number of expressions with 5+ function calls between 2 and 4
2. Number of sources > 1
3. Number of joins >= 1
4. Number of lookups between 4 and 6
5. Number of targets > 1
6. Overall function call count >= 10
7. Number of components (transformations) >= 10

If any of the following conditions are true, then mark the mapping as COMPLEX complexity:

1. Three MEDIUM breaks from the list above
2. Number of expressions with 5+ function calls between 5 and 7
3. Number of mapping components >= 20
4. Overall function call count >= 20
5. Complex or Unstructured nodes are being used (e.g. Normalizer)
6. Number of lookups between 7 and 14

If any of the following conditions are true, then mark the mapping as VERY COMPLEX complexity:

1. Three COMPLEX breaks from the list above
2. Number of expressions with 5+ function calls > 7
3. Number of lookups > 15
4. Number of mapping components >= 50

Splitter Instructions:
Purpose - Splits large SQL files with multiple objects into individual .sql files

sqlsplit
-h this message
######## OPTIONS ########

-i input file OR comma-separated list of files
   OR
-d input folder

-o output folder

[-s plug in newline after semicolon]
[-E extensions. Default is sql]
[-t trim lines from both sides]
[-b remove square brackets]
[-P do not add package variables to procedures and functions]
[-G custom object separator pattern]
[-v verbose mode]
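As a usage example (the input file name is illustrative; the flags are from the option list above):

# Split one large exported DDL file into individual .sql files,
# adding a newline after each semicolon (-s) and trimming lines (-t)
sqlsplit -i all_ddl_export.sql -o analyzer_input/sql_ddl -s -t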
