SQL Notes - 3

SQL Class Notes

To swap gender values in a table, you can use a CASE expression in an UPDATE statement.

Here's an example SQL that swaps the gender values, assuming your gender column
has values like 'Male' and 'Female':

SQL Query:

WITH UpdatedGenders AS (
    SELECT
        id, -- Assuming you have an ID or unique key to identify each row
        gender,
        CASE
            WHEN gender = 'Male' THEN 'Female'
            WHEN gender = 'Female' THEN 'Male'
            ELSE gender -- If there are other values, leave them unchanged
        END AS new_gender
    FROM your_table
)
UPDATE your_table
SET gender = u.new_gender
FROM UpdatedGenders u
WHERE your_table.id = u.id;

Explanation:

1. CTE (UpdatedGenders): This common table expression (CTE) creates a temporary
result set that assigns the swapped gender based on the current value of the
gender column.

2. CASE expression: It swaps 'Male' with 'Female' and vice versa. Any other gender
values remain unchanged.

3. UPDATE: The update query then sets the new gender by joining each row to the CTE
on id. The UPDATE ... FROM form is supported by PostgreSQL and SQL Server, among
others; MySQL uses a different multi-table UPDATE syntax.

Make sure to replace your_table and id with your actual table name and unique
identifier.
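
Since the swap only needs the CASE expression, the CTE can be dropped. Here is a minimal sketch of that single-statement form. The table name people and its sample rows are made up for illustration; they are not part of the original example.

```sql
-- Hypothetical demo table; people, id and gender are placeholder names.
CREATE TABLE people (
    id     INTEGER PRIMARY KEY,
    gender VARCHAR(10)
);

INSERT INTO people (id, gender) VALUES (1, 'Male');
INSERT INTO people (id, gender) VALUES (2, 'Female');
INSERT INTO people (id, gender) VALUES (3, 'Other');

-- Swap the two values in one pass. Each CASE branch reads the row's
-- value from before the update, so a row is never flipped twice.
UPDATE people
SET gender = CASE
                 WHEN gender = 'Male'   THEN 'Female'
                 WHEN gender = 'Female' THEN 'Male'
                 ELSE gender
             END
WHERE gender IN ('Male', 'Female');  -- skip rows that would not change

-- Check the result: row 1 is now 'Female', row 2 'Male', row 3 still 'Other'.
SELECT id, gender FROM people ORDER BY id;
```

This form uses only standard UPDATE syntax, so it should not depend on UPDATE ... FROM support. The WHERE clause is optional; it only avoids rewriting rows whose value stays the same.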

You said:

Write a SQL query to find the median value without using the built-in function

ChatGPT said:

To calculate the median in SQL without using an inbuilt median function, you can do
this by:

1. Sorting the data.

2. Handling odd and even record counts separately.

3. For odd count: Returning the middle value.

4. For even count: Returning the average of the two middle values.

Here is an SQL query using Common Table Expressions (CTEs) to find the median:

SQL Query:

WITH RankedValues AS (
    SELECT
        your_column,
        ROW_NUMBER() OVER (ORDER BY your_column) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM your_table
)
SELECT AVG(your_column) AS median_value
FROM RankedValues
WHERE row_num IN ( (total_count + 1) / 2, (total_count + 2) / 2 );

Explanation:

1. RankedValues CTE:

o We use the ROW_NUMBER() window function to assign a sequential rank (row_num)
to each value in your_column, sorted in ascending order.

o COUNT(*) OVER () gives the total number of records in the table (total_count).

2. Main Query:

o We keep only the middle row or rows and average them.

o For an odd count, (total_count + 1) / 2 and (total_count + 2) / 2 both give the
same middle row under integer division, so AVG() returns that single value.

o For an even count, integer division makes (total_count + 1) / 2 the lower middle
row and (total_count + 2) / 2 the upper middle row, so AVG() returns the average
of the two middle values.

Notes:

• Replace your_column and your_table with the actual column and table names.

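• This approach works for numeric columns. AVG is not defined for date types in most
databases (PostgreSQL and SQL Server reject it), so a date column would need to be
converted to a number first.

• The row arithmetic relies on integer division, as in PostgreSQL, SQL Server, and
SQLite. In MySQL, / returns a decimal, so use DIV instead.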
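
Two details can quietly skew the result: NULLs are numbered and counted like any other row, and some engines (SQL Server, for example) return an integer average for an integer column. The sketch below guards against both. The measurements table and its reading column are hypothetical names, and the query still assumes integer division.

```sql
-- Hypothetical demo table with one NULL reading.
CREATE TABLE measurements (reading INTEGER);

INSERT INTO measurements (reading) VALUES (1);
INSERT INTO measurements (reading) VALUES (2);
INSERT INTO measurements (reading) VALUES (NULL);

WITH RankedValues AS (
    SELECT
        reading,
        ROW_NUMBER() OVER (ORDER BY reading) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM measurements
    WHERE reading IS NOT NULL   -- keep NULLs out of the ranking and the count
)
SELECT AVG(reading * 1.0) AS median_value   -- * 1.0 forces a non-integer average
FROM RankedValues
WHERE row_num IN ( (total_count + 1) / 2, (total_count + 2) / 2 );
```

With the NULL filtered out, the two remaining readings are 1 and 2, so the median is 1.5. Without the filter, the NULL row would occupy a row number and shift which row counts as the middle.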


You said:

Instead of using the IN function, can you write it a different way, say using "between"
or "greater or lesser than" functions?

ChatGPT said:

Certainly! The same result can be achieved with a range condition. Because the two
candidate row numbers are either equal or consecutive, BETWEEN selects exactly the
same rows as IN. Here's the revised query using the BETWEEN operator:

SQL Query:

WITH RankedValues AS (
    SELECT
        your_column,
        ROW_NUMBER() OVER (ORDER BY your_column) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM your_table
)
SELECT AVG(your_column) AS median_value
FROM RankedValues
WHERE row_num BETWEEN (total_count + 1) / 2 AND (total_count + 2) / 2;

Explanation:

1. RankedValues CTE:

o This part remains the same, where each value gets a row number
(row_num) and the total number of rows (total_count).

2. Main Query:

o The BETWEEN clause ensures that we retrieve either one value (for an odd
number of rows) or two values (for an even number of rows) to calculate
the median.

o For an odd count, both expressions (total_count + 1) / 2 and
(total_count + 2) / 2 evaluate to the same row number, so it returns that row.

o For an even count, it returns two middle rows, and AVG() calculates the
average of those two values.

This approach avoids using IN and instead makes use of the BETWEEN clause, while
still correctly handling both odd and even cases for median calculation.
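
The question also mentioned "greater or lesser than". Here is a sketch of that form, followed by a variant that avoids division entirely, which may help in dialects where / does not truncate. The scores table and its values are invented for the demo.

```sql
-- Hypothetical demo table with an even number of rows.
CREATE TABLE scores (score INTEGER);

INSERT INTO scores (score) VALUES (10), (20), (30), (40);

-- Form 1: explicit comparison operators (assumes integer division).
WITH RankedValues AS (
    SELECT
        score,
        ROW_NUMBER() OVER (ORDER BY score) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM scores
)
SELECT AVG(score * 1.0) AS median_value
FROM RankedValues
WHERE row_num >= (total_count + 1) / 2
  AND row_num <= (total_count + 2) / 2;

-- Form 2: no division at all. Doubling row_num keeps everything in
-- whole numbers: for 4 rows, 2 * row_num must lie in 4..6 (rows 2 and 3);
-- for 5 rows, it must lie in 5..7 (row 3 only).
WITH RankedValues AS (
    SELECT
        score,
        ROW_NUMBER() OVER (ORDER BY score) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM scores
)
SELECT AVG(score * 1.0) AS median_value
FROM RankedValues
WHERE 2 * row_num >= total_count
  AND 2 * row_num <= total_count + 2;
```

Both queries pick rows 2 and 3 of the four sample rows and average 20 and 30.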


You said:

Can you explain with sample values

ChatGPT said:

Sure! Let's go through an example using sample values to see how the SQL query works
for both an odd and an even number of rows.

Sample Data (Odd number of rows):

Let’s assume we have the following values in your_column:

your_column
------------
10
20
30
40
50

Steps to find the median:

1. RankedValues CTE: The query first assigns a row number (row_num) to each
value in the sorted order and calculates the total count of rows.

your_column | row_num | total_count
------------|---------|-------------
10          | 1       | 5
20          | 2       | 5
30          | 3       | 5
40          | 4       | 5
50          | 5       | 5

2. Finding the Median:

o Since there are 5 rows (odd number), the median is the value in the
middle row.

o The middle row is calculated as:

middle row = (5 + 1) / 2 = 3

o Using BETWEEN to find rows between (total_count + 1) / 2 and
(total_count + 2) / 2, both will evaluate to row 3, because integer division
truncates (5 + 2) / 2 = 3.5 down to 3.

o The result is:

your_column | row_num
------------|---------
30          | 3

o Therefore, the median value is 30.

Sample Data (Even number of rows):

Now, let’s assume we have the following values:

your_column
------------
10
20
30
40

Steps to find the median:

1. RankedValues CTE: The query assigns a row number (row_num) to each value in
sorted order and calculates the total count of rows.

your_column | row_num | total_count
------------|---------|-------------
10          | 1       | 4
20          | 2       | 4
30          | 3       | 4
40          | 4       | 4

2. Finding the Median:

o Since there are 4 rows (even number), the median is the average of the
two middle values.
o The two middle rows are calculated with integer division, which discards the
fractional part:

lower middle row = (4 + 1) / 2 = 2 (2.5 truncated to 2)
upper middle row = (4 + 2) / 2 = 3

o Using BETWEEN to find rows between row 2 and row 3:

your_column | row_num
------------|---------
20          | 2
30          | 3

o Therefore, the median value is:

(20 + 30) / 2 = 25
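
To see the bounds for yourself, it can help to select them next to each ranked row. This is a small inspection sketch; the table name sample_even and the column aliases lower_row and upper_row are made up here, and integer division is assumed.

```sql
-- Hypothetical table holding the even-count sample values.
CREATE TABLE sample_even (your_column INTEGER);

INSERT INTO sample_even (your_column) VALUES (10), (20), (30), (40);

-- Show each row's rank alongside the two bounds the median query uses.
SELECT
    your_column,
    row_num,
    total_count,
    (total_count + 1) / 2 AS lower_row,   -- 2 for four rows
    (total_count + 2) / 2 AS upper_row    -- 3 for four rows
FROM (
    SELECT
        your_column,
        ROW_NUMBER() OVER (ORDER BY your_column) AS row_num,
        COUNT(*) OVER () AS total_count
    FROM sample_even
) AS ranked
ORDER BY row_num;
```

Every row shows lower_row = 2 and upper_row = 3, so only the rows numbered 2 and 3 pass the BETWEEN filter. If lower_row shows 2.5 instead, the database is not using integer division and the filter needs adjusting.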

Conclusion:

• For an odd number of rows, the median is the middle value (in this case, 30).

• For an even number of rows, the median is the average of the two middle values (in
this case, 25).

The query never tests explicitly whether the row count is odd or even. Integer
division makes the two bounds coincide for an odd count and differ by one for an
even count, so the same BETWEEN condition handles both cases.
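
Both walkthroughs can be reproduced in one run. The sketch below loads the two sample sets into a single hypothetical table, samples, tagged by a sample_name column, and partitions the ranking by that tag. The partitioning is an addition for the demo and was not part of the queries above; integer division is again assumed.

```sql
-- Hypothetical table holding both sample sets, tagged by name.
CREATE TABLE samples (
    sample_name VARCHAR(10),
    your_column INTEGER
);

INSERT INTO samples (sample_name, your_column) VALUES
    ('odd', 10), ('odd', 20), ('odd', 30), ('odd', 40), ('odd', 50),
    ('even', 10), ('even', 20), ('even', 30), ('even', 40);

WITH RankedValues AS (
    SELECT
        sample_name,
        your_column,
        -- Rank and count within each sample set separately.
        ROW_NUMBER() OVER (PARTITION BY sample_name ORDER BY your_column) AS row_num,
        COUNT(*) OVER (PARTITION BY sample_name) AS total_count
    FROM samples
)
SELECT
    sample_name,
    AVG(your_column * 1.0) AS median_value
FROM RankedValues
WHERE row_num BETWEEN (total_count + 1) / 2 AND (total_count + 2) / 2
GROUP BY sample_name
ORDER BY sample_name;
```

The 'even' set keeps rows 2 and 3 (20 and 30) and averages to 25; the 'odd' set keeps only row 3 and returns 30, matching the two walkthroughs.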
