Top Most Common SQL Coding Errors in Data Science

Uploaded by

vlearning365

Let's look at some common SQL coding errors that data science beginners
usually make. We will look at the concepts and hands-on practice using
real examples.

Fail fast, make mistakes, and learn quickly, but never repeat the same error. As a data
science beginner, you are probably excited to start working with SQL. However, there are
some common SQL coding errors that we have seen many beginners make on our
platform.

We will look at some of these mistakes along with an example from the StrataScratch
platform where available so that you never repeat the same mistakes again.

At StrataScratch, we have over 1000 coding questions on SQL and Python, with
thousands of users solving these questions monthly. With such a strong community of
users, many different approaches are available to solve each question. We have
analyzed some of the solutions that our users posted and identified patterns in the
common coding errors they contain.

This article will be helpful for people starting in the field of data science and who have
begun coding recently. This will cover things you should avoid when writing your SQL
query and some of the best practices to follow. Before moving into the topic right away,
let’s quickly see what the order of execution is for a generic SQL query.

Order of Execution of SQL Query

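It is critical to understand the order of execution of your SQL query. This will help you
write better and more efficient queries while avoiding syntactic or semantic errors.
Whatever order the clauses are written in, the database logically processes them in the
order below, and a clause can generally only use what the earlier steps have already
produced. For example, WHERE cannot filter on an aggregate, because grouping
happens later.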
1. Get the data (This can be FROM one table or from 2 tables when JOIN executes)
2. Filter the data (the WHERE clause is executed when the data is available)
3. Grouping (When using aggregation, GROUP BY is executed after filtering the
data)
4. HAVING clause
5. SELECT statements
6. ORDER BY
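
The steps above can be sketched with a small runnable example. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in database, and the orders table and its rows are hypothetical, invented for this sketch. The numbered SQL comments map each clause to its step in the list.

```python
import sqlite3

# Hypothetical sample data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 5), ("ann", 20), ("ann", 30), ("bob", 50), ("cat", 15), ("cat", 25)],
)

query = """
SELECT customer,                 -- 5. pick the output columns
       COUNT(*) AS big_orders
FROM orders                      -- 1. get the data
WHERE amount > 10                -- 2. filter rows before grouping
GROUP BY customer                -- 3. group the remaining rows
HAVING COUNT(*) >= 2             -- 4. filter the groups
ORDER BY customer                -- 6. sort the result
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Ann's order of 5 is removed by WHERE before anything is counted, so she is left with two orders, while bob's single order survives WHERE but his group is dropped by HAVING.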

Once you understand the above order of execution, you can avoid making some typical
SQL errors in your code. Now let’s focus on some of the most common SQL coding
errors beginner data science folks commit.

Most Common SQL Coding Errors

SQL Coding Error #1: Use of Reserved Keywords
Reserved keywords are words that have a special meaning in the relational engine, so
they can't be freely used as names for tables, columns, or aliases. For example, ORDER
is a keyword used to sort the data using the ORDER BY clause, and MAX is the name of
the function that computes the maximum value. When used as identifiers in a query,
such words throw an error if not handled correctly.
Example from StrataScratch:

Most Active Users On Messenger

Meta/Facebook Messenger stores the number of messages between users in
a table named 'fb_messages'. In this table 'user1' is the sender,
'user2' is the receiver, and 'msg_count' is the number of messages
exchanged between them. Find the top 10 most active users on
Meta/Facebook Messenger by counting their total number of messages
sent and received. Your solution should output usernames and the count
of the total messages they sent or received

Link: https://platform.stratascratch.com/coding/10295-most-active-users-on-messenger

Dataset:

If you look at the dataset closely, you can see there are id, date, user1, user2, and
msg_count columns. While working on this question, we observed that a lot of
beginners make a similar coding error of using the SQL keyword “user” in the query.
Code With Error

SELECT id,
user1 user
FROM fb_messages;

In the above example, the candidate is trying to select the id and user1 fields from
the table, using user as an alias for user1. USER is a reserved keyword in SQL and can’t
be used as an alias like that. To avoid this error, we can use the below query.

Code Without Error

SELECT id,
user1 AS user
FROM fb_messages;

In the above query, we have used the AS keyword to give an alias to the column
user1. Thus, we need to use AS and can’t have any shortcuts when using SQL
keywords.

Run the query to see if you get the same output.

Reserved keywords should ideally be avoided in your queries altogether. So, instead of
using user as the alias of the column, we can change it to username, as in the below
query.

SELECT id,
user1 AS username
FROM fb_messages;

Thus, if you do want to use a reserved keyword as an alias, you must write AS
explicitly; the shortcut of omitting AS won’t work.

SQL Coding Error #2: Column as Reserved Keyword

This is a problem similar to the previous one. This SQL coding error occurs when a column in
the table is named after a reserved keyword.

Suppose you have a table named it_problems. It’s a list of IT problems categorized as
internal or external.

id   problem               int
1    Forgotten password    Internal
2    Slow performance      Internal
3    Slow performance      External
4    Application crashes   External
5    Printer problems      Internal
6    USB problems          Internal

Code With Error

If you wanted to count the number of problems by the problem type (Internal/External), you
would write this code.

SELECT int,
COUNT(id) AS problem_count
FROM it_problems
GROUP BY int;

In MySQL, this would return an error because INT is a reserved keyword there. However,
the query runs without a problem in PostgreSQL and returns this output.

int        problem_count
External   2
Internal   4

Code Without Error

Apart from renaming the column in the database, there are two ways of avoiding this
problem.

You could enclose the column name in backticks, the following way.

SELECT `int`,
COUNT(id) AS problem_count
FROM it_problems
GROUP BY `int`;

Or you could name the table before the reserved keyword column name.

SELECT it_problems.int,
COUNT(id) AS problem_count
FROM it_problems
GROUP BY it_problems.int;

Both ways would work in MySQL.
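
The quoting idea can be tried out without a MySQL server. The sketch below is a hedged stand-in, not the MySQL behavior itself: it uses Python's built-in sqlite3 module, and because INT is not reserved in SQLite, the keyword column is renamed to group (an assumption made only for this illustration). SQLite quotes identifiers with double quotes, as standard SQL does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the keyword column is called "group" here (not "int"),
# because GROUP is reserved in SQLite while INT is not.
conn.execute('CREATE TABLE it_problems (id INTEGER, problem TEXT, "group" TEXT)')
conn.executemany(
    "INSERT INTO it_problems VALUES (?, ?, ?)",
    [
        (1, "Forgotten password", "Internal"),
        (2, "Slow performance", "Internal"),
        (3, "Slow performance", "External"),
        (4, "Application crashes", "External"),
        (5, "Printer problems", "Internal"),
        (6, "USB problems", "Internal"),
    ],
)


def run(sql):
    """Return the rows, or the error message if the statement is rejected."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


# Unquoted keyword: the parser rejects the statement.
unquoted = run("SELECT group, COUNT(id) FROM it_problems GROUP BY group")

# Quoted keyword: treated as an ordinary column name.
quoted = run(
    'SELECT "group", COUNT(id) AS problem_count '
    'FROM it_problems GROUP BY "group" ORDER BY "group"'
)

print(unquoted)
print(quoted)
```

The first call reports a syntax error, while the quoted version returns the counts per category.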

Each database has different reserved keywords; you can find them in the documentation.

- PostgreSQL https://www.postgresql.org/docs/current/sql-keywords-appendix.html
- MySQL https://dev.mysql.com/doc/refman/8.0/en/keywords.html
- MS SQL Server
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/reserved-keywords-transact-sql?view=sql-server-ver16
SQL Coding Error #3: Data De-Duplication
From the large number of solutions we have on our platform, we identified that one of the
most common SQL coding errors is forgetting the DISTINCT keyword in SQL queries.

Some questions ask you to output a user or a product based on a certain condition,
where a user/product might appear in multiple rows. Many users do not use the
DISTINCT keyword in the query, which results in duplicated users/products in the output.
An example is shown below:

Finding User Purchases

Write a query that'll identify returning active users. A returning
active user is a user that has made a second purchase within 7 days of
any other of their purchases. Output a list of user_ids of these
returning active users.

Link: https://platform.stratascratch.com/coding/10322-finding-user-purchases

Dataset:

From the above question, let’s imagine you need to find the user IDs that either buy milk
or bread or both and output the user IDs in ascending order.
Code With Error (Semantic - Duplication in the Data)

SELECT user_id
FROM amazon_transactions
WHERE item IN ('milk','bread')
ORDER BY user_id;

From the above query, the output of the code will have repeated user IDs since one
user might have bought both milk and bread. Thus, in order to de-duplicate the data, we
need to use the DISTINCT keyword, as in the below example.

Code Without Error

SELECT DISTINCT user_id
FROM amazon_transactions
WHERE item IN ('milk','bread')
ORDER BY user_id;

Run the query to see if you get the same output.

When writing your SQL queries, it’s critical to think about the question and understand
whether the results need to be de-duplicated or not. If you think yes, then use the
DISTINCT clause to avoid any duplicates in your output.
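
To see the difference concretely, here is a minimal sketch that runs both queries from this section. It assumes Python's built-in sqlite3 module as the database, and the rows in amazon_transactions are hypothetical, made up for this illustration (the real table has more columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amazon_transactions (user_id INTEGER, item TEXT)")
# Hypothetical rows: user 101 bought both items, user 104 bought milk twice.
conn.executemany(
    "INSERT INTO amazon_transactions VALUES (?, ?)",
    [(101, "milk"), (101, "bread"), (102, "bread"),
     (103, "eggs"), (104, "milk"), (104, "milk")],
)

with_duplicates = [row[0] for row in conn.execute(
    "SELECT user_id FROM amazon_transactions "
    "WHERE item IN ('milk','bread') ORDER BY user_id"
)]

deduplicated = [row[0] for row in conn.execute(
    "SELECT DISTINCT user_id FROM amazon_transactions "
    "WHERE item IN ('milk','bread') ORDER BY user_id"
)]

print(with_duplicates)  # one entry per matching row
print(deduplicated)     # one entry per user
```

Without DISTINCT, a user appears once per matching purchase; with it, each qualifying user appears exactly once.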
SQL Coding Error #4: Wrong Understanding of the DISTINCT Clause
In the above section, we saw the importance of using the DISTINCT keyword in cases
where we don’t want any duplicates. DISTINCT always works on the whole row of the
result, i.e., on the combination of all the columns listed in the SELECT statement.

By analyzing the solutions on the StrataScratch platform, we realized that there is a
common misconception about using the DISTINCT keyword. Data science beginners
usually think that they can apply DISTINCT to only one specific column and not to the
other columns from the SELECT clause.

Let’s demonstrate this on this question by Airbnb.

Find matching hosts and guests in a way that they are both of the same
gender and nationality
Find matching hosts and guests pairs in a way that they are both of
the same gender and nationality. Output the host id and the guest id
of matched pair.

Link:
https://platform.stratascratch.com/coding/10078-find-matching-hosts-and-guests-in-a-way-that-they-are-both-of-the-same-gender-and-nationality/discussion
Dataset:
airbnb_hosts

airbnb_guests

The question asks us to find the host and guest pairs where both are of the same
gender and nationality.
Below is an example of an incorrect query.

Code With Error

SELECT h.host_id,
DISTINCT g.guest_id
FROM airbnb_hosts h
INNER JOIN airbnb_guests g ON h.nationality = g.nationality
AND h.gender = g.gender;

In the above example, the code will result in an error. The DISTINCT keyword must come
right after SELECT, before the list of columns. Also, DISTINCT cannot be applied to only
one column; it automatically applies to all the columns listed in the SELECT
statement.

Code Without Error

SELECT DISTINCT h.host_id,
g.guest_id
FROM airbnb_hosts h
INNER JOIN airbnb_guests g ON h.nationality = g.nationality
AND h.gender = g.gender;

In the above example, the code will successfully run. The DISTINCT keyword is used
right after SELECT. This doesn’t mean that DISTINCT is only applied to the first
column (host_id); it applies to all the columns in the SELECT statement (host_id and
guest_id in the above example).
Thus, the result of the above query will give the unique combinations of the columns
host_id and guest_id. If you want one column to have only unique values and the other
columns to have all values, you need two different queries.
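
The row-level behavior of DISTINCT can be checked with a small sketch. It assumes Python's built-in sqlite3 module as the database, and the host and guest rows are hypothetical (host 1 is listed twice on purpose, to create duplicate pairs); only the columns needed for the join are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airbnb_hosts (host_id INTEGER, nationality TEXT, gender TEXT)")
conn.execute("CREATE TABLE airbnb_guests (guest_id INTEGER, nationality TEXT, gender TEXT)")
# Hypothetical rows, invented for this sketch.
conn.executemany("INSERT INTO airbnb_hosts VALUES (?, ?, ?)",
                 [(1, "USA", "M"), (1, "USA", "M"), (2, "Mali", "F")])
conn.executemany("INSERT INTO airbnb_guests VALUES (?, ?, ?)",
                 [(10, "USA", "M"), (11, "USA", "M"), (12, "Mali", "F")])

JOIN_PART = """
FROM airbnb_hosts h
INNER JOIN airbnb_guests g ON h.nationality = g.nationality
AND h.gender = g.gender
"""
ORDER_PART = "ORDER BY h.host_id, g.guest_id"

all_pairs = conn.execute(
    "SELECT h.host_id, g.guest_id" + JOIN_PART + ORDER_PART).fetchall()
unique_pairs = conn.execute(
    "SELECT DISTINCT h.host_id, g.guest_id" + JOIN_PART + ORDER_PART).fetchall()

# DISTINCT in the middle of the column list is a syntax error.
try:
    conn.execute("SELECT h.host_id, DISTINCT g.guest_id" + JOIN_PART)
    misplaced_error = None
except sqlite3.OperationalError as exc:
    misplaced_error = str(exc)

print(len(all_pairs), unique_pairs, misplaced_error)
```

Note that host 1 still appears twice in unique_pairs, once per matching guest: DISTINCT removed the repeated (host, guest) combinations, not the repeated host IDs.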

SQL Coding Error #5: Incorrect Use of LIMIT in Questions Where
RANK() or DENSE_RANK() Should Ideally Be Used

Another SQL coding error that data science beginners make is using LIMIT in ranking
questions. This will sometimes give the correct answer, but the solution will be wrong
in edge cases, such as ties. LIMIT simply cuts the output off after a fixed number of
rows, which is handy when checking the sample data in a table. For example, if we have
an employee table, we can do LIMIT 10 on that table to see the first ten rows. The
RANK() function, on the other hand, ranks the data based on a specific condition, so
rows with equal values share the same rank.

We’ll show this in an example.

Ranking Hosts By Beds

Rank each host based on the number of beds they have listed. The host
with the most beds should be ranked 1 and the host with the least
number of beds should be ranked last. Hosts that have the same number
of beds should have the same rank but there should be no gaps between
ranking values. A host can also own multiple properties. Output the
host ID, number of beds, and rank from highest rank to lowest.

Link: https://platform.stratascratch.com/coding/10161-ranking-hosts-by-beds

Dataset:

Now let’s change this question slightly to understand this common SQL coding error. So
the new question would be: List the top 5 host IDs based on the number of beds. If
there are multiple hosts with the same number of beds, then display all host IDs.

Below is the common mistake of using LIMIT in such questions.

Code With Error (Incorrect Solution)

SELECT host_id,
SUM(n_beds) AS number_of_beds
FROM airbnb_apartments
GROUP BY host_id
ORDER BY number_of_beds desc
LIMIT 5;
From the output, you can see the top 5 host IDs based on the number of beds. But, in
reality, there are more hosts with 4 beds, and in the solution, we can see only one of
them, at the 5th position. Thus, we need to use DENSE_RANK() to rank all the hosts and
then filter based on the rank of each host ID.

Code Without Error (Using DENSE_RANK)

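SELECT *
FROM
(
SELECT
host_id,
SUM(n_beds) AS number_of_beds,
-- rnk instead of rank: RANK is a reserved keyword in some databases (e.g., MySQL 8)
DENSE_RANK() OVER(ORDER BY SUM(n_beds) DESC) AS rnk
FROM airbnb_apartments
GROUP BY host_id
) A
WHERE rnk <= 5
-- sort in the outer query; an ORDER BY inside the subquery is not guaranteed to be kept
ORDER BY number_of_beds DESC;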
From the output of the above query, we get a total of 7 rows because there are 3 hosts
with rank 5 in the dataset. Thus, using LIMIT would give us wrong results, so it
should be used very carefully in such questions.

RANK() and DENSE_RANK() are window functions. It might be a good idea to
make yourself familiar with them in our ultimate guide to SQL window functions.
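
The effect of ties on LIMIT versus DENSE_RANK() can be reproduced on a tiny dataset. The sketch below makes several assumptions: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer), the rows in airbnb_apartments are hypothetical, and it asks for the top 3 instead of the top 5 to keep the data small. The aggregation is done in a CTE first, which is a variation on the query above rather than the same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airbnb_apartments (host_id INTEGER, n_beds INTEGER)")
# Hypothetical rows: hosts 3 and 4 tie with 6 beds in total.
conn.executemany(
    "INSERT INTO airbnb_apartments VALUES (?, ?)",
    [(1, 6), (1, 4), (2, 8), (3, 6), (4, 3), (4, 3), (5, 4)],
)

# LIMIT keeps exactly three rows, so one of the tied hosts is cut off.
limit_rows = conn.execute("""
    SELECT host_id, SUM(n_beds) AS number_of_beds
    FROM airbnb_apartments
    GROUP BY host_id
    ORDER BY number_of_beds DESC, host_id
    LIMIT 3
""").fetchall()

# DENSE_RANK gives tied hosts the same rank, so both are kept.
ranked_rows = conn.execute("""
    WITH totals AS (
        SELECT host_id, SUM(n_beds) AS number_of_beds
        FROM airbnb_apartments
        GROUP BY host_id
    ),
    ranked AS (
        SELECT host_id,
               number_of_beds,
               DENSE_RANK() OVER (ORDER BY number_of_beds DESC) AS rnk
        FROM totals
    )
    SELECT host_id, number_of_beds, rnk
    FROM ranked
    WHERE rnk <= 3
    ORDER BY rnk, host_id
""").fetchall()

print(limit_rows)
print(ranked_rows)
```

The LIMIT version silently drops host 4, while the DENSE_RANK() version returns four rows for the top three ranks.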

SQL Coding Error #6: WHERE vs HAVING

Oftentimes, beginners confuse the WHERE clause with the HAVING clause and
do not understand which one to use in which situation. If we look at the order of
execution at the start of this article, the WHERE clause is the first thing the SQL query
executes after getting the data, before any grouping takes place. HAVING, in contrast,
is executed after GROUP BY.

The WHERE clause is used to filter specific rows, while the HAVING clause is used to
filter specific groups. The HAVING clause is used when you need to filter based on a
certain aggregation in the data.
Now let’s look at an example where many users try to filter based on the aggregation
using the WHERE clause, but instead, they should be using a HAVING clause.

Positive Ad Channels
Find the advertising channel with the smallest maximum yearly spending
that still brings in more than 1500 customers each year.

Link: https://platform.stratascratch.com/coding/10013-positive-ad-channels

Dataset:

The query below tries to find the distinct advertising channels where more than 1,500
customers were acquired through that channel every year.

Code With Error

SELECT DISTINCT advertising_channel
FROM uber_advertising
WHERE MIN(customers_acquired) > 1500;
The above code will result in an error. In the WHERE condition, the user tries to find the
minimum value of the customers_acquired field using a MIN() function. This is an
aggregate function and can’t be used in the WHERE condition. If we need to implement
a condition based on aggregation, then HAVING must be used. Below is the correct
code for such scenarios.

Code Without Error

SELECT advertising_channel
FROM uber_advertising
GROUP BY advertising_channel
HAVING MIN(customers_acquired) > 1500;

The above code will successfully run and show the following output.

In this case, we have used the aggregate function MIN() in the HAVING clause instead
of the WHERE clause. Thus, it's crucial to read the question carefully and deduce
whether the condition applies to individual rows (WHERE) or to a whole group, such as
every year of a channel (HAVING). With practice, data science beginners will get
familiar with the WHERE and HAVING clauses and when to use which.
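
Both queries from this section can be run side by side in a minimal sketch. It assumes Python's built-in sqlite3 module as the database (the exact error message differs between databases), and the rows in uber_advertising are hypothetical, invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uber_advertising "
    "(year INTEGER, advertising_channel TEXT, customers_acquired INTEGER)"
)
# Hypothetical rows: radio dips below 1500 in one year.
conn.executemany(
    "INSERT INTO uber_advertising VALUES (?, ?, ?)",
    [(2018, "tv", 1600), (2019, "tv", 1800),
     (2018, "radio", 1400), (2019, "radio", 2000),
     (2018, "billboard", 1700), (2019, "billboard", 1550)],
)

# An aggregate in WHERE is rejected: rows are filtered before groups exist.
try:
    conn.execute(
        "SELECT DISTINCT advertising_channel FROM uber_advertising "
        "WHERE MIN(customers_acquired) > 1500"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# The same condition in HAVING is evaluated once per channel.
channels = [row[0] for row in conn.execute("""
    SELECT advertising_channel
    FROM uber_advertising
    GROUP BY advertising_channel
    HAVING MIN(customers_acquired) > 1500
    ORDER BY advertising_channel
""")]

print(where_error)
print(channels)
```

Radio is excluded because its weakest year falls below the threshold, which is exactly the per-group condition that WHERE cannot express.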

You can find more on this topic in our database interview questions article here →
https://www.stratascratch.com/blog/database-interview-questions/.
SQL Coding Error #7: Float Division
This is another common SQL coding error beginners make when computing a division
between two integer values. Let’s take an example.

Consider you have three columns in the table sales_table – date, sales, and
orders.

date         sales    orders

2023-02-01   4,597    49
2023-02-02   5,418    41
2023-02-03   6,974    12
2023-02-04   12,574   28
2023-02-05   41,897   58
2023-02-06   6,987    56

You need to calculate a derived column sales_per_order. The columns sales and
orders are integers, but sales_per_order should be of floating type.

Code With Error (Semantic)

SELECT date,
sales,
orders,
sales/orders AS sales_per_order
FROM sales_table;

The above query will run and give an output, but in databases that perform integer
division on integer columns (PostgreSQL, for example), the new column will hold an
integer rather than a floating value.

date         sales    orders   sales_per_order

2023-02-01   4,597    49       93
2023-02-02   5,418    41       132
2023-02-03   6,974    12       581
2023-02-04   12,574   28       449
2023-02-05   41,897   58       722
2023-02-06   6,987    56       124
This is not what we wanted, and thus, we are calling it a semantic error. To get a
floating type column, at least one of the two operands needs to be of a float type. Below
is the correct query, where we convert one column into a floating type.

Code Without Error

SELECT date,
sales,
orders,
CAST(sales AS FLOAT)/orders AS sales_per_order
FROM sales_table;

From the above query, we will get the correct result for the derived column. Here, we
first converted the sales column into a float using the CAST() function and then divided
it by the orders column. Even if only one operand is a float, the operation's output will
be of a float type. The values below are shown rounded to two decimals.

date         sales    orders   sales_per_order

2023-02-01   4,597    49       93.82
2023-02-02   5,418    41       132.15
2023-02-03   6,974    12       581.17
2023-02-04   12,574   28       449.07
2023-02-05   41,897   58       722.36
2023-02-06   6,987    56       124.77

Thus, by just transforming at least one column to a float type, the result of the division is
going to be a floating number. Another way to change the integer column to a float is by
multiplying the value by 1.0, which is similar to casting.
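
The three variants (plain division, CAST, and multiplying by 1.0) can be compared in one query. This is a minimal sketch that assumes Python's built-in sqlite3 module as the database, which, like PostgreSQL, truncates integer division; it loads only the first two rows of sales_table from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (date TEXT, sales INTEGER, orders INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?)",
    [("2023-02-01", 4597, 49), ("2023-02-02", 5418, 41)],
)

rows = conn.execute("""
    SELECT sales / orders                 AS int_division,
           CAST(sales AS FLOAT) / orders  AS cast_division,
           sales * 1.0 / orders           AS times_one_division
    FROM sales_table
    ORDER BY date
""").fetchall()

for int_division, cast_division, times_one_division in rows:
    # Round only for display; the stored values keep full precision.
    print(int_division, round(cast_division, 2), round(times_one_division, 2))
```

For the first row, the plain division yields 93, while both the CAST and the 1.0 variants yield roughly 93.82.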

SQL Coding Error #8: Entities Need to Be Associated With Two
Specific Instances From a Category

It might be difficult to understand what the problem here is, but we’ll explain. It’s a
common problem where you need to output entities that have a combination of
two (or more) specific values from the same column. For example, the user needs to be
both an Android and an iPhone user.
Many users try to solve this problem by using the AND logical operator in the WHERE
clause.

Let’s look at an example where we want to find the user IDs that have at least one
‘Refinance’ and one ‘InSchool’ submission. In other words, they need to have
submissions of both categories.

Submission Types
Write a query that returns the user ID of all users that have created
at least one ‘Refinance’ submission and at least one ‘InSchool’
submission.

Link: https://platform.stratascratch.com/coding/2002-submission-types/discussion

Dataset:
Code With Error

SELECT user_id
FROM loans
WHERE type = 'Refinance' AND type = 'InSchool';

This code won’t return an error per se, but the output will be empty. Why? The WHERE
clause evaluates one row at a time and doesn’t have the context of other rows. The AND
operator says the type has to be ‘Refinance’ and ‘InSchool’ at the same time, but this is
never true, because a single row has only one value in the type column.

Code Without Error

One way to write the correct code is to use the INTERSECT operator. It will combine two
SELECT statements and return their intersections as output, i.e., the rows common to both
SELECT statements.

SELECT user_id
FROM loans
WHERE type = 'Refinance'
INTERSECT
SELECT user_id
FROM loans
WHERE type = 'InSchool';

The other approach could be to write two CTEs: one that returns the user IDs with a
‘Refinance’ submission, the other those with an ‘InSchool’ submission.

Then write the SELECT statement where you JOIN the two CTEs and return the distinct
user IDs to remove duplicates.

WITH refinance AS (
SELECT user_id
FROM loans
WHERE type = 'Refinance'
),
inschool AS (
SELECT user_id
FROM loans
WHERE type = 'InSchool'
)
SELECT DISTINCT r.user_id
FROM refinance r
JOIN inschool i
ON r.user_id = i.user_id;

Both queries will return the user whose ID is 108.
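
All three queries from this section can be compared in a short sketch. It assumes Python's built-in sqlite3 module as the database, and the rows in loans are hypothetical, chosen so that only user 108 has both submission types; the real table has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (user_id INTEGER, type TEXT)")
# Hypothetical rows: only user 108 has both submission types.
conn.executemany(
    "INSERT INTO loans VALUES (?, ?)",
    [(100, "Refinance"), (101, "InSchool"),
     (108, "Refinance"), (108, "InSchool"), (108, "InSchool")],
)

# AND on a single column: no row can hold two values at once.
and_rows = conn.execute(
    "SELECT user_id FROM loans WHERE type = 'Refinance' AND type = 'InSchool'"
).fetchall()

# INTERSECT: users present in both result sets.
intersect_rows = conn.execute("""
    SELECT user_id FROM loans WHERE type = 'Refinance'
    INTERSECT
    SELECT user_id FROM loans WHERE type = 'InSchool'
""").fetchall()

# Two CTEs joined on user_id; DISTINCT removes repeats from the join.
join_rows = conn.execute("""
    WITH refinance AS (
        SELECT user_id FROM loans WHERE type = 'Refinance'
    ),
    inschool AS (
        SELECT user_id FROM loans WHERE type = 'InSchool'
    )
    SELECT DISTINCT r.user_id
    FROM refinance r
    JOIN inschool i
    ON r.user_id = i.user_id
""").fetchall()

print(and_rows, intersect_rows, join_rows)
```

The AND version comes back empty, while INTERSECT and the joined CTEs agree on the single user with both types.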

Summary
In this article, we covered the top SQL coding errors that Data Science beginners make
by analyzing the solutions submitted to our platform. This will help the readers
understand what these mistakes are, how to avoid them in the future, or what the
workarounds can be. Also, we discussed the order of execution of SQL queries to help
beginners understand what part executes first and what part executes last. We hope
this article will help you in your journey to become a data scientist.

Don’t be overwhelmed with the topics that we discussed today. Remember, Rome
wasn't built in one day, so stick with StrataScratch, and slowly and steadily you will get
to your desired position. All the best.
