Dee Lab Manual1 2024
INDEX
3. Load Data into the following tables from CSV file, TEXT file and Google Drive
9. Using two different files, merge the files and then load into a table using Python in Colab with appropriate Python libraries
10. Using data from two different files, merge the files, remove duplicate rows, replace NULL values with #### and then load into a table using Python in Colab with appropriate Python libraries
11. Overview on MySQL
12. Features of MySQL
Lab 1 - Set Up a Simple Data Engineering Development Infrastructure in MySQL Open Source
Aim:
To set up a simple data engineering development infrastructure in MySQL open source.
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
# Assumes a SELECT statement has already been executed on vAR_cursor.
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['index', 'course_id', 'course_name', 'course_original_id', 'is_demo',
             'created_by', 'created_datetime', 'updated_by', 'updated_datetime'])
vAR_df.head()
Output:
Cursor opened
vAR_df2 = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['EMPID', 'EMPFIRSTNAME', 'EMPLASTNAME', 'ADDRESS', 'CITY'])
vAR_df2.head()
Output:
Lab 2 - Create the following tables with appropriate columns
Aim:
To create the following tables with appropriate columns.
Customer table
Product table
Sales order table
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
Customer Table:
1. Customer_ID
2. Customer_Name
3. Customer_Address
4. Locality
5. City
6. State
7. Country
8. Postal_Code
9. EMail_Address
10. Phone_Number
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Customer_ID', 'Customer_Name', 'Customer_Address', 'Locality', 'City', 'State',
             'Country', 'Postal_Code', 'Email_Address', 'Phone_Number'])
vAR_df.head()
Output:
Product Table:
1. Product_ID
2. Customer_ID
3. Product_Family
4. Product_Group
5. Product
6. SKU
7. Unit_Price
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Product_ID', 'Customer_ID', 'Product_Family', 'Product_Group', 'Product',
             'SKU', 'Unit_Price'])
vAR_df.head()
Output:
Sales Order Table:
1. Sales_Order_ID
2. Customer_ID
3. Product_ID
4. Sales_Order_Date
5. Quantity
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Sales_Order_ID', 'Customer_ID', 'Product_ID', 'Sales_Order_Date', 'Quantity'])
vAR_df.head()
Output:
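The three tables listed above could be created along the following lines. This is only a sketch: an in-memory SQLite database stands in for the MySQL server so that it runs without a connection, the table name DSAI_PRODUCT is a hypothetical name following the DSAI_ prefix used elsewhere in this manual, and every column type and constraint is an assumption rather than the schema used in the lab.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Column names come from the lists in this lab; all types and constraints are assumptions.
cur.executescript("""
CREATE TABLE DSAI_CUSTOMER (
    Customer_ID      INTEGER PRIMARY KEY,
    Customer_Name    VARCHAR(100) NOT NULL,
    Customer_Address VARCHAR(255),
    Locality         VARCHAR(100),
    City             VARCHAR(100),
    State            VARCHAR(100),
    Country          VARCHAR(100),
    Postal_Code      VARCHAR(20),
    EMail_Address    VARCHAR(255),
    Phone_Number     VARCHAR(30)
);
CREATE TABLE DSAI_PRODUCT (
    Product_ID     INTEGER PRIMARY KEY,
    Customer_ID    INTEGER REFERENCES DSAI_CUSTOMER(Customer_ID),
    Product_Family VARCHAR(100),
    Product_Group  VARCHAR(100),
    Product        VARCHAR(100),
    SKU            VARCHAR(50),
    Unit_Price     DECIMAL(10, 2)
);
CREATE TABLE DSAI_SALES_ORDER (
    Sales_Order_ID   INTEGER PRIMARY KEY,
    Customer_ID      INTEGER REFERENCES DSAI_CUSTOMER(Customer_ID),
    Product_ID       INTEGER REFERENCES DSAI_PRODUCT(Product_ID),
    Sales_Order_Date DATE,
    Quantity         INTEGER
);
""")


def column_names(table):
    """Return the column names of a table in declaration order."""
    return [row[1] for row in cur.execute(f"PRAGMA table_info({table})")]


print(column_names("DSAI_SALES_ORDER"))
```

Against MySQL the same CREATE TABLE statements would be sent one at a time through vAR_cursor.execute(), and SHOW COLUMNS would replace the SQLite-specific PRAGMA.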
Lab 3 - Load data into the following tables from CSV file, TEXT file and Google Drive
Aim:
To load data into the following tables from a CSV file, a TEXT file and Google Drive
• Customer Table
• Product Table
• Sales Order Table
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
Output:
Mounted at /content/drive
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Customer_ID', 'Customer_Name', 'Customer_Address', 'Locality', 'City', 'State',
             'Country', 'Postal_Code', 'Email_Address', 'Phone_Number'])
vAR_df.head()
Output:
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Product_ID', 'Customer_ID', 'Product_Family', 'Product_Group', 'Product',
             'SKU', 'Unit Price$'])
vAR_df.head()
Output:
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Sales_Order_ID', 'Customer_ID', 'Product_ID', 'Sales_Order_Date', 'Quantity'])
vAR_df.head()
Output:
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Customer_ID', 'Customer_Name', 'Customer_Address', 'Locality', 'City', 'State',
             'Country', 'Postal_Code', 'Email_Address', 'Phone_Number'])
vAR_df.head()
Output:
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Product_ID', 'Customer_ID', 'Product_Family', 'Product_Group', 'Product',
             'SKU', 'Unit Price$'])
vAR_df.head()
Output:
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Sales_Order_ID', 'Customer_ID', 'Product_ID', 'Sales_Order_Date', 'Quantity'])
vAR_df.head()
Output:
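Loading a delimited file into a table generally comes down to parsing the rows and passing them to a parameterized INSERT. The sketch below illustrates that pattern under stated assumptions: the CSV content is an inline string standing in for a file on disk or under the mounted /content/drive path, only three of the customer columns are used, the sample rows are invented, and in-memory SQLite stands in for MySQL.

```python
import csv
import io
import sqlite3

# Assumption: inline text stands in for a CSV file on disk or on a mounted Google Drive path.
csv_text = """Customer_ID,Customer_Name,City
1,ASHA RAO,Chennai
2,JOHN DOE,Mumbai
"""

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE DSAI_CUSTOMER (Customer_ID INTEGER PRIMARY KEY, Customer_Name TEXT, City TEXT)"
)

# Parse each CSV row into a tuple that matches the column order of the table.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(int(r["Customer_ID"]), r["Customer_Name"], r["City"]) for r in reader]

# A parameterized INSERT keeps the data separate from the SQL text.
cur.executemany("INSERT INTO DSAI_CUSTOMER VALUES (?, ?, ?)", rows)
conn.commit()

loaded = cur.execute("SELECT COUNT(*) FROM DSAI_CUSTOMER").fetchone()[0]
print(loaded, "rows loaded")
```

With mysql.connector the placeholder is %s rather than ?, and a tab-separated TEXT file can be read the same way by passing delimiter="\t" to the csv reader.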
Lab 4 - Validate data in the loaded tables
o Check that the CustomerID, ProductID and Sales OrderID are unique and not null.
o Check that the data type of Customer Names and Product Names is CHAR.
o Check that the Customer Email has the format name@domain (for example, user@example.com).
o Check whether there are any null values.
o Check whether there are any duplicate values.
Aim:
To validate data in the loaded tables
• Customer Table
• Product Table
• Sales Order Table
o Check that the CustomerID, ProductID and Sales OrderID are unique and not null.
o Check that the data type of Customer Names and Product Names is CHAR.
o Check that the Customer Email has the format name@domain (for example, user@example.com).
o Check whether there are any null values.
o Check whether there are any duplicate values.
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
Customer table:
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
## Check if the Data type of Customer Names, Product Names are CHAR
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Customer_ID IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Customer_Name IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Locality IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE City IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE State IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Country IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Postal_Code IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE EMail_Address IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_CUSTOMER WHERE Phone_Number IS NULL")
#vAR_cursor.fetchall()
#*****************************************************************
Output:
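A null check has to use IS NULL, because in SQL a comparison written as = NULL is never true and therefore returns no rows. The sketch below shows the difference and counts nulls per column. It assumes in-memory SQLite in place of MySQL, a cut-down DSAI_CUSTOMER table with three columns, and invented sample rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE DSAI_CUSTOMER (Customer_ID INTEGER, Customer_Name TEXT, City TEXT)")
cur.executemany(
    "INSERT INTO DSAI_CUSTOMER VALUES (?, ?, ?)",
    [(1, "Asha Rao", "Chennai"), (2, None, "Mumbai"), (3, "John Doe", None)],
)

# "= NULL" evaluates to unknown for every row, so nothing is returned.
equals_null = cur.execute(
    "SELECT * FROM DSAI_CUSTOMER WHERE Customer_Name = NULL"
).fetchall()

# "IS NULL" is the correct test for missing values.
is_null = cur.execute(
    "SELECT * FROM DSAI_CUSTOMER WHERE Customer_Name IS NULL"
).fetchall()

# Count missing values column by column (column names are fixed here, not user input).
null_counts = {}
for column in ("Customer_ID", "Customer_Name", "City"):
    query = f"SELECT COUNT(*) FROM DSAI_CUSTOMER WHERE {column} IS NULL"
    null_counts[column] = cur.execute(query).fetchone()[0]

print(equals_null, is_null, null_counts)
```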
Product table:
#vAR_cursor.fetchall()
#*****************************************************************
### Check if the Product_ID is Unique
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
## Check if the Data type of Customer Names, Product Names are CHAR
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
Output:
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_SALES_ORDER WHERE Product_ID IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_SALES_ORDER WHERE Sales_Order_Date IS NULL")
#vAR_Query = vAR_cursor.execute("SELECT * FROM DSAI_SALES_ORDER WHERE Quantity IS NULL")
#vAR_cursor.fetchall()
#*****************************************************************
Output:
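The uniqueness, duplicate and email-format checks named in the aim could be sketched as follows. The table is a cut-down DSAI_CUSTOMER held in in-memory SQLite, the sample rows are invented, and the regular expression is a deliberately simple assumption (one @, no spaces, a dot in the domain) rather than a full email validator.

```python
import re
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE DSAI_CUSTOMER (Customer_ID INTEGER, Customer_Name TEXT, EMail_Address TEXT)"
)
cur.executemany(
    "INSERT INTO DSAI_CUSTOMER VALUES (?, ?, ?)",
    [
        (1, "Asha Rao", "asha@example.com"),
        (2, "John Doe", "john.doe(at)example.com"),
        (2, "John Doe", "john.doe(at)example.com"),  # deliberate duplicate row
    ],
)

# Uniqueness check: any ID that occurs more than once is reported with its count.
duplicate_ids = cur.execute(
    "SELECT Customer_ID, COUNT(*) FROM DSAI_CUSTOMER "
    "GROUP BY Customer_ID HAVING COUNT(*) > 1"
).fetchall()

# Format check: a simple pattern, assumed sufficient for this exercise.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
emails = [email for (email,) in cur.execute("SELECT EMail_Address FROM DSAI_CUSTOMER")]
bad_emails = sorted({email for email in emails if not EMAIL_PATTERN.match(email)})

print(duplicate_ids, bad_emails)
```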
Lab-5 Write a Select query with the following conditions using python
• Use INNER JOIN to merge the Customer table with the Product table on CustomerID
• Use LEFT JOIN to merge above result with the Sales order table on ProductID
• Use RIGHT JOIN to merge the result of first conditions with the Sales order table on
ProductID
Aim:
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
from sqlalchemy import create_engine
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
vAR_df = vAR_pd.DataFrame(
    vAR_cursor.fetchall(),
    columns=['Customer_ID', 'Customer_Name', 'Customer_Address', 'Locality', 'City', 'State',
             'Country', 'Postal_Code', 'EMail_Address', 'Phone_Number', 'Product_ID',
             'Customer_ID1', 'Product_Family', 'Product_Group', 'Product', 'SKU', 'Unit_Price$',
             'Sales_Order_ID', 'Customer_ID2', 'Product_ID1', 'Sales_Order_Date', 'Quantity'])
vAR_df1 = vAR_df[['Customer_ID', 'Customer_Name', 'Customer_Address', 'Locality', 'City', 'State',
                  'Country', 'Postal_Code', 'EMail_Address', 'Phone_Number', 'Product_ID',
                  'Product_Family', 'Product_Group', 'Product', 'SKU', 'Unit_Price$',
                  'Sales_Order_ID', 'Sales_Order_Date', 'Quantity']]
vAR_engine = create_engine('mysql+mysqldb://dssaiai_struct_u:~z=wL1jg~Q4$@66.42.60.177:3306/dssaiai_lms_structure', echo=False)
vAR_df1.head()
Output:
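The three joins described in this lab could look roughly like the sketch below. It assumes in-memory SQLite in place of MySQL, cut-down versions of the three tables (DSAI_PRODUCT is a hypothetical table name) and invented sample rows. Because older SQLite versions lack RIGHT JOIN, the third query expresses it as the equivalent LEFT JOIN with the table order swapped. On MySQL, RIGHT JOIN can be written directly.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE DSAI_CUSTOMER (Customer_ID INTEGER, Customer_Name TEXT);
CREATE TABLE DSAI_PRODUCT (Product_ID INTEGER, Customer_ID INTEGER, Product TEXT);
CREATE TABLE DSAI_SALES_ORDER (Sales_Order_ID INTEGER, Product_ID INTEGER, Quantity INTEGER);
INSERT INTO DSAI_CUSTOMER VALUES (1, 'Asha Rao'), (2, 'John Doe');
INSERT INTO DSAI_PRODUCT VALUES (10, 1, 'Tablet'), (11, 1, 'Laptop');
INSERT INTO DSAI_SALES_ORDER VALUES (100, 10, 3), (101, 99, 1);
""")

# INNER JOIN: only customers that have at least one product.
inner_rows = cur.execute("""
    SELECT c.Customer_Name, p.Product
    FROM DSAI_CUSTOMER c
    INNER JOIN DSAI_PRODUCT p ON c.Customer_ID = p.Customer_ID
    ORDER BY p.Product_ID
""").fetchall()

# LEFT JOIN: keep every customer/product pair, with NULL where no order exists.
left_rows = cur.execute("""
    SELECT c.Customer_Name, p.Product, s.Quantity
    FROM DSAI_CUSTOMER c
    INNER JOIN DSAI_PRODUCT p ON c.Customer_ID = p.Customer_ID
    LEFT JOIN DSAI_SALES_ORDER s ON p.Product_ID = s.Product_ID
    ORDER BY p.Product_ID
""").fetchall()

# RIGHT JOIN equivalent: keep every sales order, with NULL where no product matches.
right_rows = cur.execute("""
    SELECT s.Sales_Order_ID, cp.Customer_Name
    FROM DSAI_SALES_ORDER s
    LEFT JOIN (
        SELECT c.Customer_Name, p.Product_ID
        FROM DSAI_CUSTOMER c
        INNER JOIN DSAI_PRODUCT p ON c.Customer_ID = p.Customer_ID
    ) cp ON s.Product_ID = cp.Product_ID
    ORDER BY s.Sales_Order_ID
""").fetchall()

print(inner_rows, left_rows, right_rows)
```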
Lab-6 Write a Select query with the following conditions using python
• Use INNER JOIN to merge the Customer table with the Product table on CustomerID
• Use LEFT JOIN to merge above result with the Sales order table on ProductID
• Use RIGHT JOIN to merge the result of first conditions with the Sales order table on
ProductID
Aim:
To write a select query with the following conditions using python
• Use INNER JOIN to merge the Customer table with the Product table on CustomerID
• Use LEFT JOIN to merge above result with the Sales order table on ProductID
• Use RIGHT JOIN to merge the result of first conditions with the Sales order table on
ProductID
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
#*****************************************************************
# Update Unit Price = x where SKU = x
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
vAR_cursor.fetchall()
Output:
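The comment "Update Unit Price = x where SKU = x" in this lab suggests an UPDATE keyed on SKU. A minimal sketch of such a statement with bound parameters is shown below. It assumes in-memory SQLite in place of MySQL, a hypothetical cut-down DSAI_PRODUCT table, and invented SKU values and prices.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE DSAI_PRODUCT (Product_ID INTEGER PRIMARY KEY, SKU TEXT, Unit_Price REAL)"
)
cur.executemany(
    "INSERT INTO DSAI_PRODUCT VALUES (?, ?, ?)",
    [(10, "SKU-001", 250.0), (11, "SKU-002", 900.0)],
)

# Bind the new price and the SKU as parameters instead of formatting them into the SQL.
cur.execute("UPDATE DSAI_PRODUCT SET Unit_Price = ? WHERE SKU = ?", (275.0, "SKU-001"))
rows_changed = cur.rowcount  # number of rows the UPDATE touched
conn.commit()

updated = cur.execute(
    "SELECT SKU, Unit_Price FROM DSAI_PRODUCT ORDER BY Product_ID"
).fetchall()
print(rows_changed, updated)
```

An UPDATE returns no result set, so the effect is confirmed with rowcount and a follow-up SELECT rather than with fetchall() on the UPDATE itself.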
Lab-7 Perform Data Transformation with some logic using python
• Add the First Name and the Last Name as Full Name
• Change the Customer's Full Name from upper case to proper case
• Change the Sales Order Date format to DDMMYYYY
• Sort the products in descending order of revenue
Aim:
To perform data transformation with some logic using python.
• Add the First Name and the Last Name as Full Name
• Change the Customer's Full Name from upper case to proper case
• Change the Sales Order Date format to DDMMYYYY
• Sort the products in descending order of revenue
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
from sqlalchemy import create_engine
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
#*****************************************************************
# Add the First name and the Last Name as Full Name
#vAR_cursor.fetchall()
#*****************************************************************
# Change the Customers Full Name with Upper Case to Proper Case
#vAR_cursor.fetchall()
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
vAR_cursor.fetchall()
#*****************************************************************
Output:
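The four transformations in the aim can be illustrated in plain Python. The sketch below works on invented in-memory records rather than on the lab tables. The field names First_Name and Last_Name are assumptions, and revenue is assumed to mean unit price multiplied by quantity, summed per product.

```python
from datetime import date

# Invented sample records; field names are assumptions for illustration.
customers = [
    {"First_Name": "ASHA", "Last_Name": "RAO"},
    {"First_Name": "JOHN", "Last_Name": "DOE"},
]

# Full name = first name + last name, converted from upper case to proper case.
full_names = [f"{c['First_Name']} {c['Last_Name']}".title() for c in customers]

# Reformat a sales order date as DDMMYYYY.
order_date = date(2024, 3, 7)
order_date_ddmmyyyy = order_date.strftime("%d%m%Y")

# (product, unit price, quantity) order lines; revenue is summed per product.
order_lines = [
    ("Tablet", 250.0, 3),
    ("Laptop", 900.0, 2),
    ("Mouse", 20.0, 10),
    ("Tablet", 250.0, 1),
]
revenue = {}
for product, unit_price, quantity in order_lines:
    revenue[product] = revenue.get(product, 0.0) + unit_price * quantity

# Sort products from highest to lowest revenue.
by_revenue = sorted(revenue.items(), key=lambda item: item[1], reverse=True)

print(full_names, order_date_ddmmyyyy, by_revenue)
```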
Lab-8 Perform data validation with some logic using python
Aim:
To perform data validation with some logic using python.
• Validate for Customer Full Name without double spaces
• Validate for Sales Order Data with incorrect date format (should be MMYYDD)
• Validate for incorrect Product UOMs
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
from sqlalchemy import create_engine
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
#*****************************************************************
#vAR_cursor.fetchall()
#*****************************************************************
# Validate for Sales Order Data with incorrect date format (Should be YYMMDD)
#vAR_cursor.fetchall()
#*****************************************************************
# DSAI_DATA_MODEL")
vAR_cursor.fetchall()
#*****************************************************************
Output:
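The three validations in the aim could be written as small predicate functions, as in the sketch below. It makes several assumptions: the date format follows the YYMMDD form given in the code comment of this lab, the set of allowed units of measure is invented, and the sample values are illustrative only.

```python
import re
from datetime import datetime

# Assumption: an invented whitelist of valid units of measure (UOMs).
ALLOWED_UOMS = {"EA", "KG", "L", "BOX"}


def has_double_space(name):
    """Return True when the name contains two or more consecutive spaces."""
    return re.search(r" {2,}", name) is not None


def is_valid_yymmdd(text):
    """Return True when text is exactly six digits forming a real YYMMDD date."""
    if re.fullmatch(r"\d{6}", text) is None:
        return False
    try:
        datetime.strptime(text, "%y%m%d")
    except ValueError:
        return False
    return True


def invalid_uoms(values):
    """Return the values that are not in the allowed UOM set, in input order."""
    return [value for value in values if value not in ALLOWED_UOMS]


print(has_double_space("Asha  Rao"), is_valid_yymmdd("240307"), invalid_uoms(["EA", "PCS"]))
```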
Lab-9 Using two different files, merge the files and then load into a table using
Python in Colab with appropriate Python libraries
Aim:
To create a Python program that takes two different files, merges them and then loads the
result into a table, using Colab and appropriate Python libraries.
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
from sqlalchemy import create_engine
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
#*****************************************************************
# Create Two Tables from two Different file and load them
#vAR_cursor.fetchall()
#*****************************************************************
# Mount Data Files into Colab Notebook
Output:
Mounted at /content/drive
#vAR_cursor.fetchall()
vAR_cursor.fetchall()
Output:
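Merging two files on a shared key and loading the result could be sketched as follows. The two inline strings stand in for files under the mounted drive, the column names and rows are invented, the target table DSAI_MERGED_PRODUCT is a hypothetical name, and in-memory SQLite stands in for MySQL. The merge keeps only keys present in both files, which is an inner join.

```python
import csv
import io
import sqlite3

# Assumption: inline text stands in for two files on the mounted drive.
products_csv = "Product_ID,Product\n10,Tablet\n11,Laptop\n"
prices_csv = "Product_ID,Unit_Price\n10,250\n11,900\n12,20\n"

# Index the second file by key so each row of the first file can look up its match.
prices = {
    row["Product_ID"]: float(row["Unit_Price"])
    for row in csv.DictReader(io.StringIO(prices_csv))
}

# Inner merge on Product_ID: rows without a match in the other file are dropped.
merged = [
    (int(row["Product_ID"]), row["Product"], prices[row["Product_ID"]])
    for row in csv.DictReader(io.StringIO(products_csv))
    if row["Product_ID"] in prices
]

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE DSAI_MERGED_PRODUCT "
    "(Product_ID INTEGER PRIMARY KEY, Product TEXT, Unit_Price REAL)"
)
cur.executemany("INSERT INTO DSAI_MERGED_PRODUCT VALUES (?, ?, ?)", merged)
conn.commit()

stored = cur.execute("SELECT * FROM DSAI_MERGED_PRODUCT ORDER BY Product_ID").fetchall()
print(stored)
```

In a notebook that already uses pandas, the same merge is usually written with DataFrame.merge on the key column.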
Lab-10 Using Data from Two different Files, merge the Files, remove duplicate rows
and replace NULL values with #### and then load into table using Python In Colab
and use appropriate python libraries.
Aim:
To create a python program using data from two different files, merge the files, remove
duplicate rows and replace NULL values with #### and then load into table using python in
colab and use appropriate python libraries.
Output:
import sys
import mysql.connector
from mysql.connector import Error
import pandas as vAR_pd
from sqlalchemy import create_engine
vAR_conn = ""
vAR_conn = mysql.connector.connect(host="66.42.60.177",database =
"dssaiai_lms_structure",user = "dssaiai_struct_u",password = "~z=wL1jg~Q4$",port=3306)
vAR_cursor = vAR_conn.cursor()
print('Cursor Opened')
Output:
Cursor opened
#*****************************************************************
# Create Two Tables from two Different file and load them
#vAR_cursor.fetchall()
#*****************************************************************
Output:
Mounted at /content/drive
#vAR_cursor.fetchall()
#vAR_cursor.fetchall()
#vAR_cursor.fetchall()
vAR_cursor.fetchall()
Output:
[]
vAR_cursor.fetchall()
vAR_Query = vAR_cursor.execute("UPDATE Product_Prices SET Price = '####' WHERE Product_Name = 'Tablet'")
vAR_cursor.fetchall()
Output:
[]
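The full pipeline of this lab (combine two files, drop duplicate rows, replace missing values with #### and load the result) could be sketched as below. The table and column names Product_Prices, Product_Name and Price are taken from the UPDATE statement above. The file contents are invented inline strings, Price is stored as text so that it can hold ####, and in-memory SQLite stands in for MySQL.

```python
import csv
import io
import sqlite3

# Assumption: inline text stands in for two files on the mounted drive.
file_a = "Product_Name,Price\nTablet,250\nLaptop,900\n"
file_b = "Product_Name,Price\nLaptop,900\nMouse,\n"

# Step 1: combine the rows of both files.
rows = []
for text in (file_a, file_b):
    for record in csv.DictReader(io.StringIO(text)):
        rows.append((record["Product_Name"], record["Price"]))

# Step 2: remove duplicate rows while keeping first-seen order.
unique_rows = list(dict.fromkeys(rows))

# Step 3: replace missing values (empty or None) with the marker ####.
cleaned = [
    tuple(value if value not in (None, "") else "####" for value in row)
    for row in unique_rows
]

# Step 4: load into the table. Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Product_Prices (Product_Name TEXT, Price TEXT)")
cur.executemany("INSERT INTO Product_Prices VALUES (?, ?)", cleaned)
conn.commit()

stored = cur.execute("SELECT Product_Name, Price FROM Product_Prices ORDER BY rowid").fetchall()
print(stored)
```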
Lab 11 – Overview on MYSQL
A database is a separate application that stores a collection of data. Each database has one or
more distinct APIs for creating, accessing, managing, searching and replicating the data it
holds.
Other kinds of data stores can also be used, such as files on the file system or large hash tables
in memory, but fetching and writing data is not as fast or easy with those types of systems.
Nowadays, we use relational database management systems (RDBMS) to store and manage
huge volumes of data. Such a database is called relational because all the data is stored in
different tables, and relations between them are established using primary keys and other keys
known as foreign keys.
A Relational Database Management System (RDBMS) is software that −
• Enables you to implement a database with tables, columns and indexes.
• Guarantees the referential integrity between rows of various tables.
• Updates the indexes automatically.
• Interprets an SQL query and combines information from various tables.
RDBMS Terminology
Before we proceed to explain the MySQL database system, let us revise a few definitions
related to databases.
• Database − A database is a collection of tables with related data.
• Table − A table is a matrix with data. A table in a database looks like a simple spreadsheet.
• Column − One column (data element) contains data of one and the same kind, for example
the column postcode.
• Row − A row (= tuple, entry or record) is a group of related data, for example the data of
one subscription.
• Redundancy − Storing data twice, redundantly, to make the system faster.
• Primary Key − A primary key is unique. A key value cannot occur twice in one table, so a
key value identifies at most one row.
• Foreign Key − A foreign key is the linking pin between two tables.
• Compound Key − A compound key (composite key) is a key that consists of multiple
columns, because one column is not sufficiently unique.
• Index − An index in a database resembles an index at the back of a book.
• Referential Integrity − Referential integrity makes sure that a foreign key value always
points to an existing row.
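The primary key, foreign key and referential integrity definitions above can be seen in action in a short sketch. It uses Python's built-in sqlite3 module in place of MySQL, and the table names, column names and rows are invented for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE sales_order ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha Rao')")
conn.execute("INSERT INTO sales_order VALUES (100, 1)")


def try_insert(sql):
    """Run an INSERT and report 'ok' or the name of the integrity error raised."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return type(exc).__name__


# Primary key: the same key value cannot occur twice in one table.
duplicate_key = try_insert("INSERT INTO customer VALUES (1, 'Duplicate')")

# Referential integrity: a foreign key must point to an existing row.
orphan_order = try_insert("INSERT INTO sales_order VALUES (101, 99)")

# A valid foreign key value is accepted.
valid_order = try_insert("INSERT INTO sales_order VALUES (102, 1)")

print(duplicate_key, orphan_order, valid_order)
```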
MySQL is one of the most recognizable technologies in the modern big data ecosystem. It is
often called the most popular database and is in widespread, effective use across industries,
so anyone involved with enterprise data or general IT should aim for at least a basic
familiarity with MySQL.
With MySQL, even those new to relational systems can immediately build fast, powerful, and
secure data storage systems. MySQL’s programmatic syntax and interfaces are also perfect
gateways into the wide world of other popular query languages and structured data stores.
What is MySQL?
MySQL is a relational database management system (RDBMS) developed by Oracle that is
based on structured query language (SQL).
A database is a structured collection of data. It may be anything from a simple shopping list to
a picture gallery or a place to hold the vast amounts of information in a corporate network. In
particular, a relational database is a digital store collecting data and organizing it according to
the relational model. In this model, tables consist of rows and columns, and relationships
between data elements all follow a strict logical structure. An RDBMS is simply the set of
software tools used to actually implement, manage, and query such a database.
MySQL is integral to many of the most popular software stacks for building and maintaining
everything from customer-facing web applications to powerful, data-driven B2B services. Its
open-source nature, stability, and rich feature set, paired with ongoing development and
support from Oracle, have meant that internet-critical organizations such as Facebook, Flickr,
Twitter, Wikipedia, and YouTube all employ MySQL backends.
Lab 12 – Features of MySQL
MySQL Database
MySQL is a fast, easy-to-use RDBMS used by many small and big businesses. MySQL was
originally developed, marketed and supported by MySQL AB, a Swedish company, and is now
developed by Oracle. MySQL has become popular for many good reasons −
• MySQL is released under an open-source license, so you have nothing to pay to use it.
• MySQL is a very powerful program in its own right. It handles a large subset of the
functionality of the most expensive and powerful database packages.
• MySQL uses a standard form of the well-known SQL data language.
• MySQL works on many operating systems and with many languages including PHP, PERL,
C, C++, JAVA, etc.
• MySQL works very quickly and works well even with large data sets.
• MySQL is very friendly to PHP, the most appreciated language for web development.
• MySQL supports large databases, up to 50 million rows or more in a table. The default file
size limit for a table is 4GB, but you can increase this (if your operating system can handle
it) to a theoretical limit of 8 million terabytes (TB).
• MySQL is customizable. The open-source GPL license allows programmers to modify the
MySQL software to fit their own specific environments.
Because MySQL enjoys such widespread use in many industries, business users from new
webmasters to experienced managers should strive to understand its main characteristics.
Deciding whether to use this technology, and communicating about it effectively, starts with a
review of MySQL’s basic availability, structure, philosophy, and usability.
Established Oracle and third-party migration tools further allow MySQL to move data to and
from a vast set of general storage systems, whether these are designed to be on-premises or
cloud-based. MySQL can be deployed in virtualized environments, distributed or centralized,
and even exists as portable standalone libraries for learning purposes, testing, or small
applications.
MySQL’s wide compatibility with all these other systems and software makes it a particularly
practical choice of RDBMS in most situations.
This allows RDBMSs to better optimize actions like data retrieval, updating information, or
more complex actions like aggregations. A logical model is defined over all of the contents of
the database, describing for example the values allowed in individual columns, characteristics
of tables and views, or how indices from two tables are related.
Relational models have remained popular for several reasons. They empower users with
intuitive, declarative programming languages — essentially telling the database what result is
wanted in language akin to, or at least comprehensible as, written English, instead of
meticulously coding up each step of the procedure leading to that result. This moves a lot of
the work into the RDBMS and SQL engines, better enforcing logical rules and saving valuable
resources and manpower.
MySQL is open-source
Any individual or enterprise may freely use, modify, publish, and expand on Oracle’s open-
source MySQL code base. The software is released under the GNU General Public License
(GPL).
For MySQL code needing to be integrated or included in a commercial application (or if open-
source software is not a priority), enterprises can purchase a commercially licensed version
from Oracle.
Again, these options provide organizations with additional flexibility if deciding to work with
MySQL. The public and community-based nature of open-source releases enriches MySQL’s
documentation and online support culture, while also ensuring that sustained or newly-
developed capabilities never stray too far from current user needs.
In fact, MySQL makes many concessions to supporting the widest possible variety of data
structures, from the standard but rich logical, numeric, alphanumeric, date, and time types, to
more advanced JSON or geospatial data. Beyond mere data types and an expansive built-in
feature set, the MySQL ecosystem also includes a variety of tools, easing everything from
server management to reporting and data analysis.
Regardless of the RDBMS’s overarching architecture, users can invariably find a MySQL
feature allowing them to model and codify data how they wish. MySQL remains one of the
most straightforward database technologies to learn and use.
Lab – 13 SQL Vs. MySQL
The relational model was first delineated in a 1970 paper by Edgar F. Codd. One of the first
commercial programming languages related to the model, SQL, was developed shortly after at
IBM. For some time, SQL was the most widely used database language, adopted as an ANSI
standard in 1986 and in ISO a year later.
DQL: The data query language (DQL) is the most familiar and is used to run queries on
databases and extract information from stored data. For example, selecting and returning the
maximum value in a column.
DDL: A data definition language (DDL) is used to codify a database’s particular structures and
schemas. Creating a table or defining data types is an example.
DCL: A data control language (DCL) defines access, authorizations, and permissions for users
and processes accessing the database, including granting administrator privileges, or restricting
users to read-only privileges only.
DML: And finally, a data manipulation language (DML) is used to make modifications on
existing components of a database, like inserting records, updating values in cells, or deleting
data.
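The four sublanguages described above can each be matched to a concrete statement. The sketch below runs DDL, DML and DQL statements through Python's built-in sqlite3 module, which stands in for MySQL here. SQLite has no user accounts, so the DCL statement is shown only as an unexecuted string in MySQL syntax. All table, database and user names are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define structure (create a table and its data types).
cur.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, unit_price REAL)")

# DML: modify contents (insert records, update values, delete data).
cur.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("SKU-001", 250.0), ("SKU-002", 900.0)],
)
cur.execute("UPDATE product SET unit_price = 275.0 WHERE sku = 'SKU-001'")
cur.execute("DELETE FROM product WHERE sku = 'SKU-002'")

# DQL: query stored data (here, the maximum value in a column).
max_price = cur.execute("SELECT MAX(unit_price) FROM product").fetchone()[0]

# DCL: control access. Not executed, because SQLite has no users or privileges.
# On MySQL, a read-only grant could look like this (names are hypothetical).
dcl_example = "GRANT SELECT ON shop.* TO 'report_user'@'localhost'"

print(max_price, dcl_example)
```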
Swedish company MySQL AB first released MySQL in 1995. Like much of the database
software which followed the initial rise of relational systems, MySQL is simply an extension
of the original SQL standard, adding more features, support, procedural programming, control-
flow mechanisms, and more.
Though typically installed on individual machines, MySQL now includes deep support for
distributed applications and inclusion in most cloud data platforms.
Relative to many data storage and processing solutions on the market today, MySQL is an older
technology, but it shows no signs of flagging in either popularity or utility. In fact, MySQL has
enjoyed a recent resurgence over even more specialized modern storage systems, due to its
speed, reliability, ease of use, and wide compatibility.
Lab – 14 Working MySQL on Linux/UNIX
The recommended way to install MySQL on a Linux system is via RPM. MySQL AB makes
the following RPMs available for download on its website −
• MySQL − The MySQL database server manages the databases and tables, controls user
access and processes the SQL queries.
• MySQL-client − MySQL client programs, which make it possible to connect to and interact
with the server.
• MySQL-devel − Libraries and header files that come in handy when compiling other
programs that use MySQL.
• MySQL-shared − Shared libraries for the MySQL client.
• MySQL-bench − Benchmark and performance testing tools for the MySQL database server.
The MySQL RPMs listed here are all built on a SuSE Linux system, but they will usually work
on other Linux variants with no difficulty.
Now, you will need to adhere to the steps given below, to proceed with the installation −
• Log in to the system as the root user.
• Switch to the directory containing the RPMs.
• Install the MySQL database server by executing the following command. Remember to
replace the filename in italics with the file name of your RPM.
Lab – 15 Working MySQL on Windows
The default installation on any version of Windows is now much easier than it used to be, as
MySQL now comes neatly packaged with an installer. Simply download the installer package,
unzip it anywhere and run the setup.exe file.
The default installer setup.exe will walk you through the trivial process and by default will
install everything under C:\mysql.
Test the server by firing it up from the command prompt the first time. Go to the location of
the mysqld server which is probably C:\mysql\bin, and type −
mysqld.exe --console
NOTE − If you are on NT, then you will have to use mysqld-nt.exe instead of mysqld.exe
If all went well, you will see some messages about startup and InnoDB. If not, you may have
a permissions issue. Make sure that the directory that holds your data is accessible to whatever
user (probably MySQL) the database processes run under.
MySQL will not add itself to the start menu, and there is no particularly nice GUI way to stop
the server either. Therefore, if you tend to start the server by double clicking the mysqld
executable, you should remember to halt the process by hand by using mysqladmin, Task List,
Task Manager, or other Windows-specific means.
Verifying MySQL Installation
After MySQL has been successfully installed, the base tables have been initialized and the
server has been started, you can verify that everything is working as it should via some
simple tests.
Use the mysqladmin Utility to Obtain Server Status
Use the mysqladmin binary to check the server version. This binary is available in /usr/bin
on Linux and in C:\mysql\bin on Windows.
[root@host]# mysqladmin --version
It will produce output like the following, depending on your installation −
mysqladmin Ver 8.23 Distrib 5.0.9-0, for redhat-linux-gnu on i386
If you do not get such a message, then there may be some problem in your installation and you
would need some help to fix it.
Execute simple SQL commands using the MySQL Client
You can connect to your MySQL server through the MySQL client and by using
the mysql command. At this moment, you do not need to give any password as by default it
will be set as blank.
You can just use following command −
[root@host]# mysql
You should be rewarded with a mysql> prompt. Now, you are connected to the MySQL server
and you can execute all the SQL commands at the mysql> prompt as follows −
mysql> SHOW DATABASES;
+----------+
| Database |
+----------+
| mysql    |
| test     |
+----------+
2 rows in set (0.13 sec)
Post-installation Steps
MySQL ships with a blank password for the root MySQL user. As soon as you have
successfully installed the database and the client, you need to set a root password as given in
the following code block −
[root@host]# mysqladmin -u root password "new_password";
Now to make a connection to your MySQL server, you would have to use the following
command −
[root@host]# mysql -u root -p
Enter password:*******
UNIX users will also want to put the MySQL directory in their PATH, so they won't have to
keep typing out the full path every time they want to use the command-line client.
For bash, it would be something like −
export PATH=$PATH:/usr/bin:/usr/sbin
Running MySQL at Boot Time
If you want to run the MySQL server at boot time, then make sure you have the following entry
in the /etc/rc.local file.
/etc/init.d/mysqld start Also,you shou