How To Run SQL Queries Against The Database?

The document provides instructions on how to run SQL queries against a database to retrieve and analyze data. It explains that the SELECT statement is used to retrieve data from tables. Examples are given to show how to select specific columns, apply filters, sorting, and joins to retrieve subsets of genes meeting certain criteria. The examples demonstrate how to extract stage-specific gene clusters from a database table containing read counts for different stages.

Uploaded by

Jonathan Robinson

How to run SQL Queries against the database?

Like most databases, ours uses a special language called SQL (Structured Query Language)
to execute queries.

SQL supports many operations, but the most common one is retrieving data. The SELECT
statement is used for this purpose.

How does SELECT work?

It is quite intuitive once you know the basics.

Before writing a query, it helps to know:

• The name of the table that you want to query.
• The structure of the table (fields, data types…)

Let’s start with an example:

• We want to retrieve all the data in the table all_info. The list of available tables
is shown in the drop‐down menu Table Info:

• Select the option "all_info", and the description of the table will appear:

… and, after scrolling down:

This basically means that the table “all_info” has 13 fields (columns). Twelve of them are
described below; the remaining field, total_reads, is used in the queries later in this guide:

cluster_id       unique cluster identifier (ranges from 1 .. 29608)
cluster_name     long name of the cluster (rA_c02_1 ‐ re‐assembly number 02 cluster 1)
cluster_seq      DNA sequence of the cluster
cluster_length   length of the sequence (in nucleotides)
P_reads          number of reads in POLYP stage
S_reads          number of reads in STROBILA stage
E_reads          number of reads in EPHYRA stage
pep_seq          predicted peptide sequence (or No Prediction tag)
pep_length       length of the peptide (in amino acids)
score            peptide prediction score (given by EstScan)
start_nuc        position of the first nucleotide of the ORF
stop_nuc         position of the last nucleotide of the ORF

The table also indicates the data type of each field: numeric (int or smallint) or text
(varchar, mediumtext).

Now we are going to construct and run our first query, which retrieves all the data from a
given table.

Just type the following in the query field:

SELECT *
FROM all_info

This means: retrieve (SELECT) all the fields (*) FROM the table named all_info.

NOTE: SQL keywords are case insensitive, so “SELECT” is the same as “select” or “SElecT”.
The same applies to column names. Table names can be case sensitive in MySQL, depending
on the server configuration, so it is safest to type them exactly as listed.

• Press the button to run the query. A table with the results should appear (if not,
check that you have written the query correctly):

IMPORTANT: The results are limited to 1000 rows by default. Depending on the query, the
results can be very large and consume a lot of memory. Type "0" in the limit field to remove
the limit.
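The first query can also be tried outside the web interface. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server and a toy all_info table with three invented rows and only four of the real columns. The LIMIT clause is shown as the usual SQL way to cap the number of rows; whether the interface's limit field uses it internally is an assumption.

```python
import sqlite3

# Toy stand-in for the real database: in-memory SQLite, invented rows,
# and only four of the real all_info columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_info (cluster_id INTEGER, P_reads INTEGER, "
    "S_reads INTEGER, E_reads INTEGER)"
)
conn.executemany(
    "INSERT INTO all_info VALUES (?, ?, ?, ?)",
    [(1, 10, 200, 5), (2, 0, 3, 40), (3, 7, 7, 7)],
)

# Retrieve every column of every row.
all_rows = conn.execute("SELECT * FROM all_info").fetchall()

# Keywords are case insensitive, so this returns the same rows.
same_rows = conn.execute("sElEcT * fRoM all_info").fetchall()

# LIMIT caps the number of rows returned.
first_two = conn.execute("SELECT * FROM all_info LIMIT 2").fetchall()

print(all_rows)
print(len(first_two))
```

Note that the table name is typed in the same case in both queries; only the keywords change.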

Saving results as CSV

The results can be exported to a file with the button. This saves the results, with the
filters applied to the table, in a CSV file that you can download to your computer and open
in, for example, MS Excel.
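For readers who query a database from a script rather than through the interface, a comparable CSV export can be sketched as follows. This is only an illustration, assuming Python's built-in sqlite3 and csv modules and a toy table with invented values; it does not describe how the export button itself is implemented.

```python
import csv
import io
import sqlite3

# Toy table with invented values (stand-in for the real all_info).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_info (cluster_id INTEGER, S_reads INTEGER)")
conn.executemany("INSERT INTO all_info VALUES (?, ?)", [(1, 200), (2, 3)])

cursor = conn.execute(
    "SELECT cluster_id, S_reads FROM all_info ORDER BY cluster_id"
)

# Write a header row taken from the cursor metadata, then the data rows.
# An in-memory buffer is used here; to write a real file, replace it with
# open("results.csv", "w", newline="") (hypothetical file name).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])
writer.writerows(cursor)

csv_text = buffer.getvalue()
print(csv_text)
```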

Hiding Columns

Question: How can I retrieve data only from certain columns, for example cluster_id,
P_reads, S_reads, E_reads and total_reads?

Answer: Replace the * with the field names, separated by commas (with no comma after the
last name).

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info

Ordering Data

Question: How do I sort the data in ascending or descending order?
Answer: There are two possibilities, with a slight difference between them:

• Click on the column’s header. This sorts the values of that column in ascending or
descending order, but it only affects the rows currently displayed. Here we sorted all the
rows according to the expression level in strobila (see small arrowhead near "S_reads").

• Use the ORDER BY clause in your query, with ASC or DESC to indicate the order.
This sorts all the matching rows in the database table, not just the rows displayed.

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
ORDER BY S_reads DESC
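The effect of ORDER BY can be checked on a small example. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server and three invented rows; in the toy data total_reads is simply set to the sum of the three stage counts, which is an assumption about the real column.

```python
import sqlite3

# Toy stand-in for all_info with invented values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_info (cluster_id INTEGER, P_reads INTEGER, "
    "S_reads INTEGER, E_reads INTEGER, total_reads INTEGER)"
)
conn.executemany(
    "INSERT INTO all_info VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 200, 5, 215), (2, 0, 3, 40, 43), (3, 7, 1500, 7, 1514)],
)

# Select two columns and sort by S_reads, highest first.
query = """
SELECT cluster_id, S_reads
FROM all_info
ORDER BY S_reads DESC
"""
ordered = conn.execute(query).fetchall()
print(ordered)  # [(3, 1500), (1, 200), (2, 3)]
```

Because the sorting happens in the database, it is applied before any row limit, unlike clicking on a column header.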

Filtering Data

Question: How do I filter the data?
Answer: There are two possibilities:

• Use the filter fields in the column header (text or numeric data). Once again, this
only affects the rows displayed. If your result is larger than the specified limit
(1000 by default), some matching rows may not be shown in the results table.

For numeric values the following operators are available:

equal to: = N
greater than: > N
less than: < N
less than or equal to: <= N
greater than or equal to: >= N
range of values: N1 .. N2

• Use the WHERE clause followed by the condition of your filter (this allows more complex
searches). This is the recommended way:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE S_reads>1000 and E_reads<1000

• Example with arithmetic and logical operations:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE (S_reads+1) / (P_reads+1) >= 500

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE (S_reads+1) / (P_reads+1) >= 500 and total_reads>2000

• Adding an ORDER BY:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE (S_reads+1) / (P_reads+1) >= 500 and total_reads>2000
ORDER BY E_reads DESC
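The ratio filter above can be traced on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server and four invented rows. Adding 1 to both counts keeps the ratio defined when P_reads is 0. One adaptation is needed for the stand-in: SQLite truncates integer division, whereas the / operator in MySQL returns a fractional result, so the sketch writes 1.0 instead of 1 in the numerator.

```python
import sqlite3

# Toy stand-in for all_info with invented values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_info (cluster_id INTEGER, P_reads INTEGER, "
    "S_reads INTEGER, E_reads INTEGER, total_reads INTEGER)"
)
conn.executemany(
    "INSERT INTO all_info VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 600, 10, 610),      # ratio 601, but total_reads too low
        (2, 1, 3000, 50, 3051),    # ratio 1500.5, passes both conditions
        (3, 100, 2500, 20, 2620),  # ratio about 24.8, fails the ratio test
        (4, 0, 2400, 900, 3300),   # P_reads is 0, yet the ratio is defined
    ],
)

# The thresholds are passed as parameters instead of being typed into the SQL.
query = """
SELECT cluster_id, E_reads
FROM all_info
WHERE (S_reads + 1.0) / (P_reads + 1) >= ? AND total_reads > ?
ORDER BY E_reads DESC
"""
enriched = conn.execute(query, (500, 2000)).fetchall()
print(enriched)  # [(4, 900), (2, 50)]
```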

Selecting sub‐sets of genes

Question: How can I retrieve the list of strobila‐specific genes? For example, we want to see
only the clusters where at least 80% of all reads originate from the strobila stage.
Moreover, we want to retrieve only the clusters where the total number of reads is >=10.

Answer: We have to add conditions to our previous query. The 80% threshold means that
dividing the number of reads in strobila (S_reads) by the total number of reads
(total_reads) should give a value >= 0.8. The total number of reads (total_reads) should be
>=10. So we need a WHERE clause with two conditions:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE S_reads / total_reads >= 0.8 and total_reads >=10

As a result we will get a list of 345 strobila‐specific clusters (see Fuchs et al. Fig. 2B):

Now let us get the list of the polyp‐specific genes. We need to change just one field
name (S_reads becomes P_reads), and our new query will be:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE P_reads / total_reads >= 0.8 and total_reads >=10

As a result we will get a list of 336 polyp‐specific clusters (see Fuchs et al. Fig. 2B):

Getting the list of the ephyra‐specific genes is easy now. Here is the corresponding query:

SELECT cluster_id, P_reads, S_reads, E_reads, total_reads
FROM all_info
WHERE E_reads / total_reads >= 0.8 and total_reads >=10
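Since the three queries differ only in the stage column, they can be generated from one template. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server; the helper stage_specific, the STAGE_COLUMNS mapping and the five rows are hypothetical. The factor 1.0 forces fractional division in SQLite; in MySQL the / operator already returns a fractional result.

```python
import sqlite3

# Hypothetical mapping from stage name to the read-count column.
STAGE_COLUMNS = {"polyp": "P_reads", "strobila": "S_reads", "ephyra": "E_reads"}


def stage_specific(conn, stage, min_fraction=0.8, min_total=10):
    """Return ids of clusters whose reads come mostly from one stage."""
    # The column name comes from a fixed whitelist, never from raw user
    # input; the numeric thresholds are passed as query parameters.
    column = STAGE_COLUMNS[stage]
    query = (
        f"SELECT cluster_id FROM all_info "
        f"WHERE 1.0 * {column} / total_reads >= ? AND total_reads >= ? "
        f"ORDER BY cluster_id"
    )
    return [row[0] for row in conn.execute(query, (min_fraction, min_total))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_info (cluster_id INTEGER, P_reads INTEGER, "
    "S_reads INTEGER, E_reads INTEGER, total_reads INTEGER)"
)
conn.executemany(
    "INSERT INTO all_info VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 9, 0, 10),      # 90% strobila
        (2, 45, 5, 5, 55),     # about 82% polyp
        (3, 0, 1, 8, 9),       # mostly ephyra, but fewer than 10 reads
        (4, 2, 3, 95, 100),    # 95% ephyra
        (5, 30, 30, 40, 100),  # no dominant stage
    ],
)

for stage in STAGE_COLUMNS:
    print(stage, stage_specific(conn, stage))
```

Cluster 3 shows why the second condition matters: with only 9 reads in total, a high fraction is not considered reliable and the cluster is left out.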

Working with the microarray data

The advantage of a relational database (like MySQL) is that it allows one to work with large
data sets and gives great flexibility in "asking" questions of any level of complexity. It is
easy to link different data types together, for example sequence data with the corresponding
expression values, peptide predictions, images and so on. To analyse the data, one needs to
describe the "question" as a set of mathematical and logical operators (much as in R,
MATLAB and similar programs).

In the following examples we will use the table "array_normalized".

This table contains mean signal values from independent experiments
((replicate_1+replicate_2+replicate_3)/3). Mean signal values across the stages (polyp, 14
days 5‐Aza‐Cytidine, 14 days control, ..., Ephyra) have been normalized to the
expression level of elongation factor‐1‐alpha (EF1α). The values in the table have not been
log‐transformed. Log2 or Log10 transformation is important for presenting data as a heat
map, but it is not necessary for comparing expression levels.

The table "array_normalized" contains 12 fields (columns):

id_entry       unique entry identifier (primary key)
id_oligo       unique oligonucleotide name
P_signal       expression in POLYP (24h at 10°C)
AZA14_signal   expression in POLYP (14 days at 10°C, incubated in 5‐Aza‐cytidine)
CON14_signal   expression in POLYP (14 days at 10°C, DMSO control)
AZA16_signal   expression in POLYP (16 days at 10°C, incubated in 5‐Aza‐cytidine)
CON16_signal   expression in POLYP (16 days at 10°C, DMSO control)
ES_signal      expression in STROBILA with 1 segment
LS_signal      expression in STROBILA with 5 segments
E_signal       expression in EPHYRA (freshly detached)
cl_name        long name of the cluster (1‐RA_1 ‐ cluster 1, rc_8‐RA_8 ‐ cluster 8)
cl_id          unique cluster identifier (ranges from 1 .. 29608)

1) To view all the values from the table type:

SELECT * FROM array_normalized

A table with the results should appear (if not, please check that the query has been written
correctly):

IMPORTANT: The results are limited to 1000 rows by default. Depending on the query, the
results can be very large and consume a lot of memory. Type "0" in the limit field to remove
the limit.

2) To find all the genes whose expression in early strobila is at least 100 times stronger
than in the polyp, type:
SELECT * FROM array_normalized
WHERE ES_signal / P_signal >= 100
ORDER by cl_id ASC

IMPORTANT: Results will be sorted according to the cluster identifiers in ascending order
(ORDER by cl_id ASC).

You can also sort the results by clicking on the column's headers.
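One detail of the fold-change filter is worth checking on a small example: when P_signal is 0, the division returns NULL (this holds both in MySQL and in the SQLite stand-in used here), so such rows never satisfy the condition and are silently left out. This is a minimal sketch, assuming Python's built-in sqlite3 module, invented signal values, and REAL as the type of the signal columns, which the description above does not state.

```python
import sqlite3

# Toy stand-in for array_normalized with invented values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE array_normalized (cl_id INTEGER, P_signal REAL, ES_signal REAL)"
)
conn.executemany(
    "INSERT INTO array_normalized VALUES (?, ?, ?)",
    [
        (1, 2.0, 500.0),   # 250-fold, kept
        (2, 0.0, 800.0),   # division by zero gives NULL, row is dropped
        (3, 50.0, 100.0),  # 2-fold, filtered out
        (4, 1.5, 150.0),   # exactly 100-fold, kept
    ],
)

fold_query = """
SELECT cl_id, ES_signal / P_signal
FROM array_normalized
WHERE ES_signal / P_signal >= 100
ORDER BY cl_id ASC
"""
induced = conn.execute(fold_query).fetchall()

# Rows that the ratio filter can never return: expressed in early
# strobila but with a zero signal in the polyp.
zero_polyp = conn.execute(
    "SELECT cl_id FROM array_normalized WHERE P_signal = 0 AND ES_signal > 0"
).fetchall()

print(induced)     # [(1, 250.0), (4, 100.0)]
print(zero_polyp)  # [(2,)]
```

If such clusters are of interest, they can be retrieved with a separate condition like the second query.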

3) To identify genes that are up‐regulated during temperature induction and might
function as strobilation inducers, one needs a somewhat more complex query with several
conditions (here we describe the hypothetical model in Fig. 4A):

SELECT * FROM array_normalized
where
(P_signal+AZA14_signal+CON14_signal+AZA16_signal+CON16_signal+ES_signal+LS_signal+E_signal>=100)
and P_signal<50
and CON14_signal>AZA14_signal
and CON16_signal>AZA16_signal
and ES_signal / P_signal>=5
and LS_signal / P_signal>=10
and LS_signal >= 1000
order by cl_id ASC

Here is the short explanation of the query:

1) We want to select genes whose expression is not extremely weak: the cumulative
expression across all stages must be >= 100:

(P_signal+AZA14_signal+CON14_signal+AZA16_signal+CON16_signal+ES_signal+LS_signal+E_signal>=100)

2) and the expression in the polyp stage must be weak:

and P_signal<50

3) now we check that the genes are 5‐Aza‐cytidine sensitive and that the expression
increases at cold temperature:

and CON14_signal>AZA14_signal
and CON16_signal>AZA16_signal
and ES_signal / P_signal>=5
and LS_signal / P_signal>=10

4) expression in late strobila must be relatively high:

and LS_signal >= 1000

5) ordering according to the cluster IDs (gene identifiers):

order by cl_id ASC

As a result we get a list of potential strobilation inducers (27 clusters), represented by
the heat map in Fig. 4B in Fuchs et al. The expression dynamics of these genes follow the
model shown in Fig. 4A.
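A query with this many conditions is easier to maintain when each condition is kept as a separate item and joined with AND. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server, REAL as the type of the signal columns, and four invented rows chosen so that each of the last three fails a different condition.

```python
import sqlite3

SIGNALS = [
    "P_signal", "AZA14_signal", "CON14_signal", "AZA16_signal",
    "CON16_signal", "ES_signal", "LS_signal", "E_signal",
]

# One entry per condition, in the same order as in the explanation above.
conditions = [
    " + ".join(SIGNALS) + " >= 100",  # not extremely weak overall
    "P_signal < 50",                  # weak in the polyp
    "CON14_signal > AZA14_signal",    # 5-Aza-cytidine sensitive (14 days)
    "CON16_signal > AZA16_signal",    # 5-Aza-cytidine sensitive (16 days)
    "ES_signal / P_signal >= 5",      # rises in early strobila
    "LS_signal / P_signal >= 10",     # rises further in late strobila
    "LS_signal >= 1000",              # high in late strobila
]
query = (
    "SELECT cl_id FROM array_normalized WHERE "
    + " AND ".join(conditions)
    + " ORDER BY cl_id ASC"
)

conn = sqlite3.connect(":memory:")
columns = ", ".join(f"{name} REAL" for name in SIGNALS)
conn.execute(f"CREATE TABLE array_normalized (cl_id INTEGER, {columns})")
rows = [
    # cl_id, P, AZA14, CON14, AZA16, CON16, ES, LS, E (invented values)
    (1, 10, 20, 60, 30, 90, 200, 1500, 400),  # passes every condition
    (2, 80, 20, 60, 30, 90, 900, 1500, 400),  # fails P_signal < 50
    (3, 10, 70, 60, 30, 90, 200, 1500, 400),  # fails CON14 > AZA14
    (4, 10, 20, 60, 30, 90, 200, 800, 400),   # fails LS_signal >= 1000
]
conn.executemany(
    "INSERT INTO array_normalized VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)

candidates = [row[0] for row in conn.execute(query)]
print(candidates)  # [1]
```

Removing or commenting out a single entry of the list shows how much each condition contributes to narrowing the result.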

(See screenshot in the next page)

Thus, by using a simple set of commands one can extract a lot of information with nearly
unlimited flexibility. It is also possible to combine data from several tables (JOIN statement).
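As an illustration of a JOIN, the sketch below combines read counts from all_info with signal values from array_normalized. It assumes Python's built-in sqlite3 module as a stand-in for the MySQL server and invented rows, and it assumes that cl_id in array_normalized refers to the same cluster identifiers as cluster_id in all_info (both are described above as ranging from 1 .. 29608); the oligonucleotide names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_info (cluster_id INTEGER, S_reads INTEGER)")
conn.execute(
    "CREATE TABLE array_normalized (id_oligo TEXT, cl_id INTEGER, LS_signal REAL)"
)
conn.executemany(
    "INSERT INTO all_info VALUES (?, ?)", [(1, 200), (2, 3), (3, 1500)]
)
conn.executemany(
    "INSERT INTO array_normalized VALUES (?, ?, ?)",
    [("oligo_A", 1, 1200.0), ("oligo_B", 3, 50.0), ("oligo_C", 4, 10.0)],
)

# Pair each cluster with its microarray entry and keep only clusters
# with more than 100 reads in strobila.
query = """
SELECT i.cluster_id, i.S_reads, a.id_oligo, a.LS_signal
FROM all_info AS i
JOIN array_normalized AS a ON a.cl_id = i.cluster_id
WHERE i.S_reads > 100
ORDER BY i.cluster_id
"""
joined = conn.execute(query).fetchall()
print(joined)  # [(1, 200, 'oligo_A', 1200.0), (3, 1500, 'oligo_B', 50.0)]
```

Rows without a partner in the other table (cluster 2 has no passing read count, oligo_C has no matching cluster) do not appear in the result of this inner join.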

More information about databases and SQL is available at, for example:
http://www.w3schools.com/sql/

One can also consult the MySQL Reference Manual at:
http://dev.mysql.com/doc/refman/5.5/en/index.html
