SQL DISTINCT Query

SQL DISTINCT is a keyword used in a SELECT statement to return only unique rows. It removes duplicate rows from the query result; the data stored in the table is not changed. When one column is selected, the result contains each value of that column once; when several columns are selected, each combination of values appears once. In this article, we will learn about the DISTINCT query in SQL, its performance, and its use cases.

What is DISTINCT in SQL?

DISTINCT is a SQL keyword used to fetch only the unique values in a column, or the unique combinations of values across multiple columns. It removes duplicates from the result set, not from the table itself, so each value (or combination) is returned once. DISTINCT is written immediately after SELECT and applies to all the columns listed in that SELECT statement.

Syntax:

SELECT DISTINCT column1, column2, ...
FROM table_name;

Example:

CREATE TABLE employees (
    id INT,
    name VARCHAR(100),
    department VARCHAR(50),
    job_title VARCHAR(50)
);
INSERT INTO employees (id, name, department, job_title) VALUES
(1, 'Aarush', 'HR', 'Manager'),
(2, 'Binni', 'IT', 'Developer'),
(3, 'Gaurav', 'Marketing', 'Analyst');
SELECT DISTINCT department FROM employees;

Output:

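department
HR
IT
Marketing

Explanation: Here, SELECT DISTINCT returns each department value once. In this small sample no department is repeated, so all three departments are returned. If two employees shared a department, that department would still appear only once.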
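The sample table above contains no repeated department, so the effect of DISTINCT is easier to see with one more row. The sketch below is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for a database server (the article's SQL is otherwise unchanged) and a hypothetical fourth employee, 'Dev', who also works in IT.

```python
import sqlite3

# In-memory SQLite database, used here only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INT, name VARCHAR(100), "
    "department VARCHAR(50), job_title VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Aarush", "HR", "Manager"),
        (2, "Binni", "IT", "Developer"),
        (3, "Gaurav", "Marketing", "Analyst"),
        (4, "Dev", "IT", "Tester"),  # hypothetical row that repeats 'IT'
    ],
)

# Without DISTINCT every row contributes a value, so 'IT' shows up twice.
all_departments = [r[0] for r in conn.execute(
    "SELECT department FROM employees ORDER BY department")]
# With DISTINCT each department is returned once.
unique_departments = [r[0] for r in conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department")]

print(all_departments)     # ['HR', 'IT', 'IT', 'Marketing']
print(unique_departments)  # ['HR', 'IT', 'Marketing']
```

The ORDER BY is there only to make the printed order predictable; it is not required by DISTINCT.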

Why Do We Need To Use DISTINCT?

The DISTINCT keyword in SQL is used to remove duplicate rows from a query result. This keeps the output clean and makes it easier to read, since each value appears only once. DISTINCT is especially useful when preparing reports or reviewing logs, where repeated entries would clutter the output.

When to Use DISTINCT?

Use the DISTINCT keyword when you are querying a large dataset in which duplicates are inevitable and you need each value only once. DISTINCT hides those duplicates in the result without deleting anything from the table. It is also useful when you need to create reports that list all the unique values.

How to Use SELECT DISTINCT in SQL Queries?

SELECT DISTINCT can be applied to a single column or to multiple columns to get the unique records from the database. DISTINCT always applies to the whole list of selected columns, that is, to the entire result row; it cannot be applied to just one column of a multi-column SELECT. It may affect performance when working on a large dataset, as the database has to compare every returned row to remove the duplicates.

Let’s create a dataset to get unique values from a single column and multiple columns. 

Example:

CREATE TABLE orders (
    order_id INT,
    customer_name VARCHAR(50),
    product_name VARCHAR(50)
);
INSERT INTO orders (order_id, customer_name, product_name) VALUES
(1, 'Karan', 'Laptop'),
(2, 'Yuva', 'Phone'),
(3, 'Karan', 'Laptop'),
(4, 'Yuva', 'Tablet'),
(5, 'Karan', 'Phone');

SQL SELECT DISTINCT for a Single Column

You can apply the DISTINCT keyword in SQL to a particular column to get only the unique values from the table. 

Syntax:

SELECT DISTINCT column_name
FROM table_name;

Example:

SELECT DISTINCT customer_name FROM orders; 

Output:

customer_name
Karan
Yuva

Explanation: Here, the DISTINCT keyword was applied to a single column (customer_name) of the orders table. Even though Karan appears in three rows and Yuva in two, DISTINCT removed the duplicates and returned each name only once.

SELECT DISTINCT for Multiple Columns: How It Works?

You can list several columns after SELECT DISTINCT. The query then returns each unique combination of values across those columns, rather than the unique values of each column separately.

Syntax:

SELECT DISTINCT column1, column2
FROM table_name;

Example:

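-- Drop the earlier orders table so this example can be run on its own
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (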
    order_id INT,
    customer_name VARCHAR(50),
    product_name VARCHAR(50)
);
INSERT INTO orders (order_id, customer_name, product_name) VALUES
(1, 'Karan', 'Laptop'),
(2, 'Yuva', 'Phone'),
(3, 'Karan', 'Laptop'),
(4, 'Yuva', 'Tablet'),
(5, 'Karan', 'Charger');
SELECT DISTINCT customer_name, product_name FROM orders;

Output:

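customer_name | product_name
Karan | Laptop
Yuva | Phone
Yuva | Tablet
Karan | Charger

Explanation: Each customer name appears more than once because it is paired with different products. SELECT DISTINCT compares both columns together, so only the repeated pair (Karan, Laptop) is collapsed into a single row, and four unique combinations are returned.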
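The difference between one column and several columns can be checked side by side. The following is a small sketch, assuming Python's built-in sqlite3 module in place of a database server; the table and rows are the ones from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT, customer_name VARCHAR(50), "
    "product_name VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "Karan", "Laptop"),
        (2, "Yuva", "Phone"),
        (3, "Karan", "Laptop"),
        (4, "Yuva", "Tablet"),
        (5, "Karan", "Charger"),
    ],
)

# One column: uniqueness is judged on customer_name alone.
names = conn.execute(
    "SELECT DISTINCT customer_name FROM orders ORDER BY customer_name"
).fetchall()

# Two columns: uniqueness is judged on the (customer_name, product_name) pair.
pairs = conn.execute(
    "SELECT DISTINCT customer_name, product_name FROM orders "
    "ORDER BY customer_name, product_name"
).fetchall()

print(names)  # [('Karan',), ('Yuva',)]
print(pairs)  # [('Karan', 'Charger'), ('Karan', 'Laptop'), ('Yuva', 'Phone'), ('Yuva', 'Tablet')]
```

Five rows go in, two names come out of the first query, and four pairs come out of the second, because only (Karan, Laptop) occurs twice.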

SQL DISTINCT with ORDER BY, WHERE, and LIMIT

DISTINCT can be combined with other SQL clauses, such as ORDER BY, WHERE, and LIMIT.

Let’s create a dataset to perform DISTINCT with other SQL Clauses. 

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    city VARCHAR(50),
    states VARCHAR(50)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
SELECT * FROM customers;

Output:

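customer_id | customer_name | city | states
1 | Johar | Mumbai | Maharashtra
2 | Babu | Lucknow | UP
3 | Tinku | Punjab | Chandigarh
4 | Chahar | Lucknow | UP

This is how the table looks before applying the other SQL clauses.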

1. DISTINCT with ORDER BY in SQL

DISTINCT removes the duplicates from the result, and ORDER BY then sorts the remaining rows by the specified column.

Example:

SELECT DISTINCT city FROM customers
ORDER BY city ASC;

Output:

city
Lucknow
Mumbai
Punjab

Explanation: DISTINCT first removed the duplicate Lucknow, and then ORDER BY arranged the city names in ascending order.
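As a quick check of the behaviour described above, the sketch below runs the same query in both sort directions. It assumes Python's built-in sqlite3 module as a convenient stand-in for a database server; the table and data are the customers rows from this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(50), city VARCHAR(50), states VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)

# Duplicates are removed, then the unique cities are sorted.
ascending = [r[0] for r in conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city ASC")]
descending = [r[0] for r in conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city DESC")]

print(ascending)   # ['Lucknow', 'Mumbai', 'Punjab']
print(descending)  # ['Punjab', 'Mumbai', 'Lucknow']
```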

2. DISTINCT with WHERE Clause in SQL

The WHERE clause first filters the rows based on a specific condition, and DISTINCT is then applied to the filtered rows to remove duplicates.

Example:

SELECT DISTINCT city
FROM customers
WHERE states = 'UP';

Output:

city
Lucknow

Explanation: Here, the WHERE clause kept only the rows whose states value is 'UP', and DISTINCT then returned each city from those rows once. Both matching customers live in Lucknow, so a single row is returned.
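To see that the filter runs before the duplicates are removed, the same WHERE condition can be run with and without DISTINCT. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of a database server, using the customers data from this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(50), city VARCHAR(50), states VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)

# WHERE alone: one row per matching customer, so Lucknow is listed twice.
filtered = [r[0] for r in conn.execute(
    "SELECT city FROM customers WHERE states = 'UP'")]
# WHERE plus DISTINCT: the filtered rows are reduced to unique cities.
filtered_unique = [r[0] for r in conn.execute(
    "SELECT DISTINCT city FROM customers WHERE states = 'UP'")]

print(filtered)         # ['Lucknow', 'Lucknow']
print(filtered_unique)  # ['Lucknow']
```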

3. DISTINCT with LIMIT in SQL 

DISTINCT eliminates the duplicates, and the LIMIT clause then caps the number of rows returned.

Example:

SELECT DISTINCT states
FROM customers
LIMIT 4;

Output:

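states
Maharashtra
UP
Chandigarh

Explanation: Here, SELECT DISTINCT first removes the duplicate state names from the customers table. The LIMIT 4 clause then allows at most 4 rows in the output. Since the table contains only three distinct states, all three are returned; a smaller value such as LIMIT 2 would cut the result down to two rows.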
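The effect of the limit is easier to see when it is smaller than the number of unique values. The sketch below assumes Python's built-in sqlite3 module in place of a database server, and adds an ORDER BY (not part of the original query) so that the rows kept by LIMIT are predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(50), city VARCHAR(50), states VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)

# LIMIT 4 is larger than the three distinct states, so nothing is cut off.
limit_four = [r[0] for r in conn.execute(
    "SELECT DISTINCT states FROM customers ORDER BY states LIMIT 4")]
# LIMIT 2 keeps only the first two of the sorted distinct states.
limit_two = [r[0] for r in conn.execute(
    "SELECT DISTINCT states FROM customers ORDER BY states LIMIT 2")]

print(limit_four)  # ['Chandigarh', 'Maharashtra', 'UP']
print(limit_two)   # ['Chandigarh', 'Maharashtra']
```

Without an ORDER BY, which rows survive a LIMIT is up to the database, so pairing the two clauses is a common design choice when the result has to be repeatable.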

DISTINCT with Aggregate Functions in SQL

Many aggregate functions in SQL can be used with DISTINCT, so that each unique value is included in the calculation only once.

1. Using COUNT() with DISTINCT in SQL

COUNT(DISTINCT column) counts the number of unique non-NULL values in a column, while COUNT(*) counts every row in the table.

Example:

SELECT COUNT(*) AS total_customers FROM customers;
SELECT COUNT(DISTINCT city) AS unique_cities FROM customers;

Output:

total_customers
4

unique_cities
3

Explanation: The first query uses COUNT(*) to count every row, so it returns 4 customers. In the second query, DISTINCT reduces the city column to its unique values (Mumbai, Lucknow, and Punjab), and COUNT then counts them, returning 3.
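The two counts can be reproduced with a short script. This is a sketch under the assumption that Python's built-in sqlite3 module is an acceptable stand-in for a database server; the table and data are the customers rows used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(50), city VARCHAR(50), states VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)

# COUNT(*) counts rows; COUNT(DISTINCT city) counts unique city values.
total_customers = conn.execute(
    "SELECT COUNT(*) FROM customers").fetchone()[0]
unique_cities = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM customers").fetchone()[0]

print(total_customers)  # 4
print(unique_cities)    # 3
```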

2. Using SUM() with DISTINCT in SQL

The SUM() function adds up every value in a column. If we apply DISTINCT to it, SUM() adds each unique value only once.

Example:

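-- Drop the earlier customers table so this example can be run on its own
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (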
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(20),
    city VARCHAR(20),
    states VARCHAR(20)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 3 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT SUM(DISTINCT total_orders) AS total_order_sum_distinct
FROM customers;

Output: 

total_order_sum_distinct
14

Explanation: Here, a plain SUM(total_orders) would add all four values in the total_orders column, including the repeated 3, and return 17 (5 + 3 + 3 + 6). SUM(DISTINCT total_orders) adds each distinct value just once, so it returns 14 (5 + 3 + 6).
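The arithmetic can be verified by running both forms of SUM() on the same values. The sketch below assumes Python's built-in sqlite3 module in place of a database server, and creates the total_orders column directly in a reduced two-column table (a simplification of the ALTER TABLE and UPDATE steps above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table: only the columns needed for the sums.
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, total_orders INT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, 5), (2, 3), (3, 3), (4, 6)],  # same values as the example above
)

plain_sum, distinct_sum = conn.execute(
    "SELECT SUM(total_orders), SUM(DISTINCT total_orders) FROM customers"
).fetchone()

print(plain_sum)     # 17  (5 + 3 + 3 + 6)
print(distinct_sum)  # 14  (5 + 3 + 6, the repeated 3 is added once)
```

Note that SUM(DISTINCT ...) treats equal numbers as duplicates even when they belong to different customers, so it is only appropriate when the repeated values really should be counted once.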

3. Using AVG() with DISTINCT in SQL

AVG() returns the average of the values in a column. With DISTINCT, the duplicate values are removed first, and the average is then calculated over the remaining unique values.

Example:

-- Skip the ALTER TABLE line if total_orders was already added in the SUM() example
ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 4 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT AVG(DISTINCT total_orders) AS avg_distinct_orders
FROM customers;

Output:

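avg_distinct_orders
4.5

Explanation: Here, the four values (5, 3, 4, and 6) are all different, so AVG(DISTINCT total_orders) returns the same result as a plain AVG(total_orders): (5 + 3 + 4 + 6) / 4 = 4.5. The two differ only when values repeat. With the values from the SUM() example (5, 3, 3, and 6), AVG() returns 17 / 4 = 4.25, while AVG(DISTINCT) returns 14 / 3, which is about 4.67.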
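Whether AVG(DISTINCT ...) differs from AVG() depends on whether any value repeats. The sketch below compares the two on the values used in this example (5, 3, 4, 6) and on the values from the SUM() example (5, 3, 3, 6). It assumes Python's built-in sqlite3 module in place of a database server; the helper `averages` is a hypothetical name introduced only for this illustration.

```python
import sqlite3

def averages(values):
    """Return (AVG, AVG(DISTINCT)) of total_orders for the given values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INT PRIMARY KEY, total_orders INT)"
    )
    # customer_id is simply 1, 2, 3, ... in insertion order.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        list(enumerate(values, start=1)),
    )
    row = conn.execute(
        "SELECT AVG(total_orders), AVG(DISTINCT total_orders) FROM customers"
    ).fetchone()
    conn.close()
    return row

no_repeats = averages([5, 3, 4, 6])   # every value is unique
with_repeat = averages([5, 3, 3, 6])  # 3 occurs twice

print(no_repeats)                               # (4.5, 4.5)
print(tuple(round(v, 2) for v in with_repeat))  # (4.25, 4.67)
```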

Creating SQL Views with DISTINCT in MySQL

A view is useful when you need to fetch the same unique values frequently: the DISTINCT query is saved under a name and can then be queried like a table. The example below is written for MySQL.

Example:

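-- Drop the earlier customers table so this example can be run on its own
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (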
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(20),
    city VARCHAR(20),
    states VARCHAR(20)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
CREATE VIEW distinct_cities AS
SELECT DISTINCT city FROM customers; 
ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 3 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT * FROM distinct_cities;

Output:

city
Mumbai
Lucknow
Punjab

Explanation: The view distinct_cities stores the SELECT DISTINCT query, so selecting from it returns each city name once. Lucknow appears twice in the customers table but only once in the view.
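An ordinary view stores the query, not its result, so it reflects rows added to the table later. The sketch below illustrates this, assuming Python's built-in sqlite3 module in place of a MySQL server and a hypothetical fifth customer, 'Meera' from 'Delhi', added only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(20), city VARCHAR(20), states VARCHAR(20))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)
conn.execute(
    "CREATE VIEW distinct_cities AS SELECT DISTINCT city FROM customers"
)

before = sorted(r[0] for r in conn.execute("SELECT city FROM distinct_cities"))

# Hypothetical new customer; the view is not touched.
conn.execute("INSERT INTO customers VALUES (5, 'Meera', 'Delhi', 'Delhi')")

# The view's query runs again, so the new city is included.
after = sorted(r[0] for r in conn.execute("SELECT city FROM distinct_cities"))

print(before)  # ['Lucknow', 'Mumbai', 'Punjab']
print(after)   # ['Delhi', 'Lucknow', 'Mumbai', 'Punjab']
```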

Best Practices for Using SELECT DISTINCT in SQL

  1. Use DISTINCT only when it is necessary. Before adding it, check whether the query can actually return duplicates; if it cannot, DISTINCT only adds extra work.
  2. In many situations, DISTINCT can be used with JOINs to remove duplicate rows that occur due to one-to-many or many-to-many relationships.
  3. If the columns used with DISTINCT are indexed, the database can often find the unique values faster.
  4. Using LIMIT with DISTINCT can reduce the processing time, as LIMIT reduces the number of rows that have to be returned.
  5. If you want to use an aggregate function, prefer using GROUP BY over DISTINCT.
  6. Avoid using SELECT DISTINCT * unless necessary. Always specify the columns to optimize query performance.

SQL DISTINCT vs GROUP BY: Key Differences

GROUP BY | DISTINCT
It groups the rows that have matching values. | It filters out all duplicates from the selected columns.
It is used together with aggregate functions. | It is used for getting unique values.
It groups data for further calculations. | It removes duplicate rows by comparing values across the selected columns and returns the unique rows.

GROUP BY syntax:

SELECT column1, COUNT(*)
FROM table_name
GROUP BY column1;

DISTINCT syntax:

SELECT DISTINCT column1, column2
FROM table_name;
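The comparison above can be tried on the customers table from earlier sections. The sketch assumes Python's built-in sqlite3 module in place of a database server. It shows that GROUP BY without an aggregate returns the same unique values as DISTINCT, and that GROUP BY can additionally compute a value per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INT PRIMARY KEY, "
    "customer_name VARCHAR(50), city VARCHAR(50), states VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Johar", "Mumbai", "Maharashtra"),
        (2, "Babu", "Lucknow", "UP"),
        (3, "Tinku", "Punjab", "Chandigarh"),
        (4, "Chahar", "Lucknow", "UP"),
    ],
)

# Unique cities via DISTINCT and via GROUP BY: same result.
with_distinct = conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city").fetchall()
with_group_by = conn.execute(
    "SELECT city FROM customers GROUP BY city ORDER BY city").fetchall()

# GROUP BY can also aggregate within each group, which DISTINCT cannot.
customers_per_city = conn.execute(
    "SELECT city, COUNT(*) FROM customers GROUP BY city ORDER BY city"
).fetchall()

print(with_distinct == with_group_by)  # True
print(customers_per_city)  # [('Lucknow', 2), ('Mumbai', 1), ('Punjab', 1)]
```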

SELECT DISTINCT in Real-World SQL Use Cases

Below are two real-world examples of SQL DISTINCT queries.

Case 1: List of customers located at different locations

Example:

CREATE TABLE ecommerce_customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    email VARCHAR(100),
    states VARCHAR(50)
);
INSERT INTO ecommerce_customers (customer_id, customer_name, email, states) VALUES
(1, 'Ayaan', '[email protected]', 'USA'),
(2, 'Baskar', '[email protected]', 'Canada'),
(3, 'Charith', '[email protected]', 'USA'),
(4, 'Praveen', '[email protected]', 'UK'),
(5, 'Daku', '[email protected]', 'Canada');
SELECT DISTINCT states FROM ecommerce_customers;

Output:

states
USA
Canada
UK

Explanation: Here, DISTINCT returned each unique value in the states column exactly once, giving the list of locations where customers are based.

Case 2: Listing the courses that Intellipaat is offering to students.

Example:

CREATE TABLE course_enrollments (
    student_id INT,
    course_id INT,
    course_name VARCHAR(100)
);
INSERT INTO course_enrollments (student_id, course_id, course_name) VALUES
(101, 1, 'Python Basics'),
(102, 1, 'Python Basics'),
(103, 2, 'Data Science'),
(101, 2, 'Data Science'),
(104, 3, 'Web Development');
SELECT DISTINCT course_name FROM course_enrollments;

Output:

course_name
Python Basics
Data Science
Web Development

Explanation: Here, DISTINCT returned each course name once, even though several students are enrolled in the same course.

Conclusion

SQL DISTINCT is a keyword used to remove duplicates from a query result. It ensures that every row in the result is unique with respect to the selected columns. This is very helpful when you need to filter data for analysis and reporting. DISTINCT should be used only when it is necessary, as it may reduce query performance. In this blog, you have learned about the DISTINCT clause in SQL, when to use it, how to use it, and its performance.

Frequently Asked Questions
Q1. What does SELECT DISTINCT do?

It returns unique (non-duplicate) rows based on the selected columns.

Q2. Does DISTINCT apply to all columns in the query?

Yes, it considers all selected columns together for uniqueness.

Q3. Is SELECT DISTINCT the same as GROUP BY?

No, DISTINCT removes duplicates, while GROUP BY is used for aggregation.

Q4. Can DISTINCT be used with ORDER BY or LIMIT?

Yes, it works fine with both to sort or limit the unique results.

Q5. How to use SELECT DISTINCT in SQL?

Use SELECT DISTINCT column_name FROM table_name to retrieve only unique values from a column in SQL.

About the Author

Data Engineer, Tata Steel Nederland

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.