Overview of Big SQL 3.0 Functionality
Maria Schwenger
IBM
Big Data is derived from multiple sources. It involves not just traditional relational data, but also many kinds
of unstructured data sources that are growing at a significant rate. For instance, machine-generated data multiplies
quickly and contains rich, diverse content that needs to be discovered. Human-generated data from social media, as
another example, is more textual, but the valuable insights are often buried under many possible meanings. Extracting
meaningful insights from such data sources quickly and efficiently is challenging.[6]
Many people believe that Hadoop technology is the best answer to those challenges. Hadoop is an open
source platform, which many vendors extend in search of robust additional functionality.[4] IBM InfoSphere®
BigInsights™ (BigInsights) is IBM's approach to enriching Hadoop with enterprise-class capabilities. The latest release
of BigInsights, 3.0, provides a single platform to manage all data types and to obtain insights that integrate the usual
silos of data. With such integration, BigInsights is a viable platform for the storage, analysis, and visualization of a
wide variety of data sources.[5] Several new features of BigInsights 3.0 concentrate on integration and workload
optimizations that bring improvements in storing, processing, and querying data. Big SQL 3.0 is one of those
capabilities. (See Figure 1)
Big SQL 3.0 achieves its capabilities by leveraging IBM's leading relational SQL expertise. It introduces
an IBM SQL compiler and runtime engine built on IBM's experience with shared-nothing, low-latency, parallel-processing
database architectures. Big SQL 3.0 has the advantage of deploying directly to the Hadoop cluster and accessing data
natively, while still allowing the data to remain in its original state on the Hadoop Distributed File System (HDFS).[4]
It provides the ability to access custom data storage formats through a Java I/O module, and it provides a logical view
of the data in which users rely on the Hive database catalog (HCatalog) for table definitions, metadata, location,
storage format, and encoding of input files.
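For example, because the data can remain in place on HDFS, an existing directory of files can be exposed to SQL without being moved. The following is a minimal, hypothetical sketch (the table name, columns, and path are placeholders) of a Big SQL external table over a comma-delimited data set:

    -- Expose an existing HDFS directory as a query-able table; the files are
    -- read in place rather than copied into the warehouse directory.
    CREATE EXTERNAL HADOOP TABLE weblogs (
        log_ts   TIMESTAMP,
        user_id  INT,
        url      VARCHAR(200)
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/weblogs';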
There are four main Big SQL 3.0 functionality pillars built and run on the Hadoop cluster: rich SQL
language support, performance, application portability and integration, and enterprise capabilities.[1]
Let's briefly review these four pillars, grouped by functional area.
As the name suggests, the first focus of Big SQL 3.0 is the rich SQL support that leverages IBM's SQL
capability layer to provide comprehensive ANSI SQL coverage. It stretches over a wide variety of standard relational
SQL functionality, providing support for: a variety of data types (tinyint, smallint, int, bigint, boolean, float,
double, string, varchar, and others); standard APIs and common client drivers (Java Database Connectivity (JDBC)
and Open Database Connectivity (ODBC)); and procedural language constructs such as those in DB2 SQL PL or user-defined
functions in Java or C++. It has built-in statistical, analytical, and business intelligence functions that allow
existing SQL queries to be reused with very few modifications. It is designed for large-scale analytic aggregation and
supports the main OLAP functions, such as LEAD, LAG, RANK, DENSE_RANK, CORRELATION, STDDEV, REGR_AVGX,
and others. Big SQL 3.0 also comes equipped with an extensive library of more than 250 built-in functions and 25
aggregates.
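As a brief, hedged illustration of this OLAP support (the table and column names below are hypothetical), the query ranks products by revenue within each region and compares each month's revenue with the previous month's:

    -- RANK and LAG window functions over hypothetical sales data.
    SELECT region,
           product,
           revenue,
           RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS revenue_rank,
           LAG(revenue) OVER (PARTITION BY region, product
                              ORDER BY sales_month) AS prior_month_revenue
    FROM sales;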
Big SQL 3.0 is a simple but comprehensive way to apply SQL to existing data processed by Hadoop. It
provides a logical view of the data that uses the Hive catalog (HCatalog) for table definitions, metadata, location,
storage format, and encoding of input files. When a table is created, its definition is stored in HCatalog through the
Hive APIs, and its data is placed in the HDFS warehouse directory within the designated schema. Table definitions in
the Hive metastore can be both read and written, and they are shared across the Hadoop ecosystem: they can be accessed
by other components (such as Pig, Sqoop, etc.) throughout the Hadoop cluster, as the sketch below illustrates.[2] To
facilitate query execution, HCatalog stores some of its metadata locally on each node. Based on this, when running in
parallel processing mode, the server connects to each specific node that contains the partitioned query-able data and
generates a result set from the data stored on that node. Query execution also stores intermediate data in the Big SQL
3.0 engine memory for any future or additional processing. (See Figure 2)
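To make the shared-catalog point concrete, the hypothetical sketch below (all names are placeholders) defines a table through Big SQL; because the definition lands in the Hive metastore, other metastore-aware components can then see and query the same table:

    -- Defined through Big SQL, but catalogued in HCatalog/the Hive metastore,
    -- so tools such as Hive or Pig can subsequently reference the same table.
    CREATE HADOOP TABLE reviews (
        product_id INT,
        rating     INT,
        review     STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE;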
Although the rich SQL support is the essence of Big SQL 3.0, there are other very important features
that complement and raise the value of the SQL functionality.
The second functionality pillar is application portability and integration within BigInsights Big SQL 3.0.
This feature introduces portability that allows data sharing between the Hadoop ecosystem and other SQL querying
products, as well as the use of the SQL language instead of MapReduce/Java syntax. In some ways, this
integration is the ultimate driving force behind the Big SQL 3.0 feature. Big SQL 3.0 integrates fully with third-party
SQL application tools that leverage common client drivers such as JDBC and ODBC. The benefit is that existing queries
can be run with little to no modification. Big SQL 3.0 tables can be accessed using the DB2 data server client as well
as the Big SQL JDBC driver. This is the same client that is used to access DB2 on Linux, UNIX, and Windows, which makes
it possible to easily access Big SQL tables from many popular APIs, including Perl, Python, and Ruby. The Big SQL node
and database are cataloged using the same commands as for a DB2 LUW database, and the CONNECT statement and almost all
SQL statements are consistent with DB2 LUW. For example, a query that accesses a table in Big SQL on a Hadoop cluster
can be run directly from the DB2 client command window. Also available is advanced query federation over query-able
archives, which allows data to be combined; the data does not need to live in isolation on a single system. This
additional integration capability allows a distributed request to be sent to multiple data sources with a single SQL
statement, as in the sketch below. These data sources include different relational database management systems such as
DB2, Teradata, Oracle®, Netezza®, and others.
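As a hedged sketch of what such a federated request can look like (the server, nickname, and table names are hypothetical, and the exact setup DDL varies by data source), a single statement can join a Hadoop-resident table with a table living in a remote relational system:

    -- Assuming a federated server definition for a remote Oracle system has been
    -- registered, a nickname makes the remote table addressable as if it were local:
    --   CREATE NICKNAME ora_customers FOR orasrv.sales.customers;

    -- One SQL statement then combines HDFS-resident and remote data:
    SELECT c.customer_name,
           SUM(o.amount) AS total_amount
    FROM   orders o           -- Big SQL table stored on HDFS
    JOIN   ora_customers c    -- nickname over the remote Oracle table
           ON o.customer_id = c.customer_id
    GROUP BY c.customer_name;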
Big SQL 3.0's third functionality pillar is performance, which also facilitates resource monitoring and software
management. The performance tooling is built from the ground up for low-latency, optimized query execution. Its
operations run in memory, and its daemons live directly on the Hadoop cluster.[1] As a result, there is less overhead
and reduced latency when retrieving data, and there is no need to initiate MapReduce jobs on the cluster where the
data is located.
Big SQL 3.0 query optimization takes the following forms: query rewrites, query plan optimization, predicate
pushdown, and statistics-driven optimization. Query rewriting restructures a query into a form that executes with
minimal resources. Query plan optimization includes many subquery optimizations; for instance, one of the main cases
is a join over data that is spread throughout the Hadoop cluster. Other rewrites are shared aggregation and distinct
elimination, which leverage key definitions for a cohesive rewrite. Predicate pushdown moves filters based on static
values down into the data read, so that less data is scanned. For example, Big SQL 3.0 distinct elimination rewrites a
query to remove the DISTINCT operation when the key values are already unique.[1] Finally, statistics-driven
optimization feeds column statistics to the query optimizer to determine an ideal query execution plan. These
transformations are based upon years of IBM's relational database experience and go well beyond simple statistics and
heuristics.
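As a small, hypothetical illustration of distinct elimination (assuming the id column is declared unique), the optimizer can treat the two statements below as equivalent and skip the costly duplicate-removal step:

    -- With a unique key on id, every (id, status) row is already unique.
    SELECT DISTINCT id, status FROM orders;  -- as written
    SELECT id, status FROM orders;           -- as executed after the rewrite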
The enterprise-level capabilities are the fourth pillar of Big SQL 3.0. These capabilities include
functionality such as monitoring, management, and security of the SQL engine running over the Hadoop system.
Resource, event, memory, and workload management enhance the Hadoop functionality for data stored in
BigInsights. The security capabilities include secure user authentication, authorization, auditing capabilities, and
fine-grained row/column access control. Another key component is the scheduler, which excludes any stored data that is
not relevant to a SQL query (known as partition elimination). It also ensures that data processing among the nodes is
done cohesively, as close to the data as possible.
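As a sketch of what the fine-grained access control can look like, the hypothetical example below uses DB2-style row and column access control (RCAC) syntax, which Big SQL's DB2-based security model is aligned with; the table, role, and column names are placeholders:

    -- Row permission: analysts see only rows for their own region (illustrative).
    CREATE PERMISSION region_filter ON sales
        FOR ROWS WHERE VERIFY_ROLE_FOR_USER(SESSION_USER, 'ANALYST') = 1
                       AND region = 'EAST'
        ENFORCED FOR ALL ACCESS
        ENABLE;

    -- Column mask: reveal only the last four digits of a card number to
    -- non-managers (illustrative).
    CREATE MASK ccn_mask ON sales FOR COLUMN credit_card_num
        RETURN CASE WHEN VERIFY_ROLE_FOR_USER(SESSION_USER, 'MANAGER') = 1
                    THEN credit_card_num
                    ELSE 'XXXX-XXXX-XXXX-' || SUBSTR(credit_card_num, 16, 4)
               END
        ENABLE;

    -- The controls take effect once activated on the table:
    ALTER TABLE sales ACTIVATE ROW ACCESS CONTROL ACTIVATE COLUMN ACCESS CONTROL;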
Fig. 2. Hadoop components interaction with Big SQL 3.0
This section describes how to get hands-on experience with Big SQL 3.0. IBM has provided an easy way
for you to get familiar with Big SQL 3.0: a cloud environment that allows you to get hands-on experience with this
component of BigInsights.
NOTE: On Windows platforms, the URLs can be accessed using Mozilla Firefox. On Mac platforms, use Safari.
SETUP
1. Go to http://bigsql.imdemocloud.com/ and submit the request form to gain access to the Big SQL 3.0 Technology
Preview.[7]
2. The IM Demo Cloud Team will email you instructions on how to set up your account on the cloud. Once you have
set up your account, go to http://my.imdemocloud.com and log in with your credentials.
3. After you log in, verify that you have access to the Big SQL 3.0 Technology Preview project. Under "Systems"
you will find information for the BigInsights web console and the hostname of the Big SQL 3.0 environment.
4. Launch the BigInsights web console by clicking "BigInsights Web Console" or by going to the URL
http://hostname:8080, and use your IM Demo Cloud credentials to log in. Then verify that the BigInsights
services for Big SQL 3.0 are up and running by clicking "Cluster Status".
BigInsights supports a command-line interface for Big SQL 3.0 called the Java SQL shell (JSqsh, pronounced
"J-skwish"). JSqsh is an open source project for querying JDBC databases through a shell-like environment.[3] To issue
Big SQL 3.0 commands from JSqsh, launch a terminal window (for instance, PuTTY).
5) During the initial launch, begin the JSqsh setup wizard by hitting Enter.
6) The wizard displays default values for the connection properties and prompts you to change them. Change each
variable, one at a time, by entering the variable's number and specifying its designated new value.
o For instance, to change the value of the password variable, type 5 (that variable's number) and hit
Enter. The wizard will prompt you for a new password; type the new value and hit Enter again.
o Enter s to save your connection information and specify bigsql as the connection name.
7) Once the Big SQL 3.0 connection is created, connect to the bigsql engine from the JSqsh terminal.
8) Once you are connected, verify that the JSqsh command prompt starts with 1>.
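At this point a quick smoke test is useful. The hedged example below assumes the DB2-compatible SYSCAT catalog views that Big SQL exposes (the same views used for the metadata query in the USAGE section):

    1> select tabschema, tabname from syscat.tables fetch first 5 rows only;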
USAGE
Now that you have created the bigsql connection, you can issue Big SQL 3.0 queries to become familiar with the JSqsh
shell.
1) Type \show tables -e | more to display information for all available tables.
2) To view metadata for a table, use the \describe command with the table name in uppercase:
\describe USER.DEMO
3) Another way to query the system for metadata is to run the following SQL command:
o select tabschema, colname, colno, typename, length from syscat.columns where tabschema = USER
and tabname = 'DEMO';
4) Use the Hadoop put command to copy your data file into HDFS: hadoop fs -put /local/file/system/f.txt
/hadoop/file/system/.
5) Create a Hadoop table, specifying the table schema, the delimiter value, and the table format (see the sketch
after these steps).
6) Next, load the data from HDFS into the newly created table:
load hadoop using file url '/hadoop/file/system/f.txt' with SOURCE PROPERTIES ('field.delimiter'='\t')
INTO TABLE sample OVERWRITE;
7) Run a SELECT command to verify that the data was loaded into the table.
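As a minimal end-to-end sketch of steps 5 through 7 (the table name, column names, and file path below are hypothetical placeholders, and the DDL shows one common delimited-table form rather than the only possibility):

    -- Step 5: create a tab-delimited Hadoop table (schema is illustrative).
    CREATE HADOOP TABLE sample (
        id    INT,
        name  VARCHAR(50)
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE;

    -- Step 6: load the tab-delimited file from HDFS into the table.
    LOAD HADOOP USING FILE URL '/hadoop/file/system/f.txt'
        WITH SOURCE PROPERTIES ('field.delimiter'='\t')
        INTO TABLE sample OVERWRITE;

    -- Step 7: verify the load.
    SELECT * FROM sample FETCH FIRST 10 ROWS ONLY;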
Now you are ready to create your own custom tables and test more complex queries.
Note: Within JSqsh, a query can be split across multiple lines; it will only execute once it ends with a
semicolon. To review the history of commands recently executed in JSqsh, type \history and hit Enter.
Conclusion
As we outlined above, BigInsights Big SQL 3.0 introduces rich SQL support in combination with
features that provide enterprise-level performance, management, and monitoring. IBM also provides an easy way for
you to explore this functionality. Give it a try and let us know what you think!
References
[1] Gary, Scott C., and Ozcan, Fatma. (April 2014). SQL-on-Hadoop without compromise: How Big SQL 3.0 from IBM
represents an important leap forward for speed, portability and robust functionality in SQL-on-Hadoop solutions.
IBM Software Group. IBM.com. Retrieved from:
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=SA&subtype=WH&htmlfid=SWW14019USEN#loaded
[2] IBM InfoSphere BigInsights Version 3.0: IBM® Big SQL. IBM Knowledge Center. IBM.com. Retrieved from:
http://www-01.ibm.com/support/knowledgecenter/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_sql_access.html
[3] Saracco, Cindy. (July 2014). Big SQL on Hadoop Part 2: The Big SQL Command Line Interface. IBM
developerWorks. IBM.com. Retrieved from: https://developer.ibm.com/hadoop/docs/853/
[4] White, Tom. (2012). Hadoop: The Definitive Guide, 3rd Edition. Sebastopol, CA: O'Reilly Media Inc.
[5] InfoSphere BigInsights: Bringing the power of Hadoop to the Enterprise. IBM.com. Retrieved from:
www.ibm.com/software/data/infosphere/biginsights
[6] IBM Big Data and Information Management (now known as IBM Watson Foundations). IBM.com. Retrieved
from: http://www-01.ibm.com/software/data/bigdata/
[7] Katsnelson, Leon. (2013). Welcome to the BigInsights Quick Start on the IM Demo Cloud [Video]. United
States: YouTube.com.