Notes-FUNCTION in SQL Snowflake
Note: Only the red highlighted functions are kept in your syllabus.
Scalar Functions
A scalar function is a function that returns one value per invocation; in most cases,
you can think of this as returning one value per row. This contrasts with Aggregate
Functions, which return one value per group of rows.
Category                                Description
Included in the Syllabus
Conversion Functions                    Convert expressions from one data type to another data type.
Date & Time Functions                   Manipulate dates, times, and timestamps.
Numeric Functions                       Perform rounding, truncation, exponent, root, logarithmic,
                                        and trigonometric operations on numeric values.
String & Binary Functions               Manipulate and transform string input.
String Functions (Regular Expressions)  Subset of string functions for performing operations on
                                        items that match a regular expression.
Hash Functions                          Hash values to signed 64-bit integers using a deterministic
                                        algorithm.
Metadata Functions                      Retrieve data or metadata about database objects (e.g. tables)
                                        or files (e.g. staged files).
-----------------------------------------------------------------------------------------------
Scalar Functions — functions that take a single row/value as input and return a
single value:
1. Conversion Functions
CAST , ::
Converts a value of one data type into another data type. The semantics of CAST
are the same as the semantics of the corresponding TO_ datatype conversion
functions. If the cast is not possible, an error is raised. For more details, see the
individual TO_ datatype conversion functions.
Syntax
Department of Computer Engineering
CAST( <source_expr> AS <target_data_type> )
<source_expr> :: <target_data_type>
Arguments
source_expr
The expression, of any supported data type, to convert to a different data type.
target_data_type
The data type to which to convert the expression. If the data type supports
additional properties, such as precision and scale (for numbers/decimals),
the properties can be included.
RENAME FIELDS
For structured OBJECTs, specifies that you want to change the OBJECT to
use different key-value pairs. The values in the original object are copied to
the new key-value pairs in the order in which they appear.
ADD FIELDS
For structured OBJECTs, specifies that you want to add key-value pairs to the
OBJECT.
The values for the newly added keys will be set to NULL. If you want to
assign a value to these keys, call the OBJECT_INSERT function instead.
Examples
Convert a string containing a number to a decimal with specified scale (2):
SELECT CAST('1.2345' AS DECIMAL(15,2));
+---------------------------------+
| CAST('1.2345' AS DECIMAL(15,2)) |
|---------------------------------|
| 1.23 |
+---------------------------------+
Convert the same string to a decimal with scale 5, using the :: notation:
SELECT '1.2345'::DECIMAL(15,5);
+-------------------------+
| '1.2345'::DECIMAL(15,5) |
|-------------------------|
| 1.23450 |
+-------------------------+
2. Date & Time Functions
List of Functions
Sub-category          Function              Notes
                      TIMESTAMP_FROM_PARTS
Addition/Subtraction  ADD_MONTHS
                      DATEADD               Accepts relevant date and time parts (see
                                            next section for details).
                      DATEDIFF              Accepts relevant date and time parts (see
                                            next section for details).
                      MONTHS_BETWEEN        (included in the syllabus)
                      TIMEADD               Alias for DATEADD.
                      TIMEDIFF              Alias for DATEDIFF.
                      TIMESTAMPADD          Alias for DATEADD.
                      TIMESTAMPDIFF         Alias for DATEDIFF.
2.1 DATE_PART
Extracts the specified date or time part from a date, time, or timestamp.
Alternatives:
EXTRACT , HOUR / MINUTE / SECOND , YEAR* / DAY* / WEEK* / MONTH / QUARTER
Syntax
DATE_PART( <date_or_time_part> , <date_or_time_expr> )
Returns
The data type of the return value is NUMBER.
Usage Notes
date_or_time_part must be one of the values listed in Supported Date and Time
Parts.
When date_or_time_part is week (or any of its variations), the output is
controlled by the WEEK_START session parameter.
When date_or_time_part is dayofweek or yearofweek (or any of their
variations), the output is controlled by
the WEEK_OF_YEAR_POLICY and WEEK_START session
parameters.
For more details, including examples, see Calendar Weeks and Weekdays.
Examples
This shows a simple example of extracting the year from a timestamp:
SELECT TO_TIMESTAMP('2013-05-08T23:39:20.123-07:00') AS "TIME_STAMP1",
       DATE_PART(YEAR, "TIME_STAMP1") AS "EXTRACTED YEAR";
+-------------------------+----------------+
| TIME_STAMP1 | EXTRACTED YEAR |
|-------------------------+----------------|
| 2013-05-08 23:39:20.123 | 2013 |
+-------------------------+----------------+
SELECT TO_TIMESTAMP('2013-05-08T23:39:20.123-07:00') AS "TIME_STAMP1",
       DATE_PART(EPOCH_SECOND, "TIME_STAMP1") AS "EXTRACTED EPOCH SECOND";
+-------------------------+------------------------+
| TIME_STAMP1 | EXTRACTED EPOCH SECOND |
|-------------------------+------------------------|
| 2013-05-08 23:39:20.123 | 1368056360 |
+-------------------------+------------------------+
SELECT TO_TIMESTAMP('2013-05-08T23:39:20.123-07:00') AS "TIME_STAMP1",
       DATE_PART(EPOCH_MILLISECOND, "TIME_STAMP1") AS "EXTRACTED EPOCH MILLISECOND";
+-------------------------+-----------------------------+
| TIME_STAMP1 | EXTRACTED EPOCH MILLISECOND |
|-------------------------+-----------------------------|
| 2013-05-08 23:39:20.123 | 1368056360123 |
+-------------------------+-----------------------------+
2.2 EXTRACT
Extracts the specified date or time part from a date, time, or timestamp.
Alternative for DATE_PART.
Syntax
EXTRACT( <date_or_time_part> FROM <date_or_time_expr> )
Usage Notes
date_or_time_part must be one of the values listed in Supported Date and Time
Parts.
For additional usage notes, see Returns for DATE_PART.
Examples
SELECT EXTRACT(YEAR FROM TO_TIMESTAMP('2013-05-08T23:39:20.123-07:00')) AS v
FROM (values(1)) v1;
------+
V |
------+
2013 |
------+
2.3 HOUR / MINUTE / SECOND
Extracts the corresponding time part from a time or timestamp value.
These functions are alternatives to using the DATE_PART (or EXTRACT) function
with the equivalent time part (see Supported Date and Time Parts).
Syntax
HOUR( <time_or_timestamp_expr> )
MINUTE( <time_or_timestamp_expr> )
SECOND( <time_or_timestamp_expr> )
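Usage Notes
+---------------+-------------------------------------------+-----------------+
| Function Name | Time Part Extracted from Time / Timestamp | Possible Values |
|---------------+-------------------------------------------+-----------------|
| HOUR          | Hour of the specified day                 | 0 to 23         |
| MINUTE        | Minute of the specified hour              | 0 to 59         |
| SECOND        | Second of the specified minute            | 0 to 59         |
+---------------+-------------------------------------------+-----------------+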
Examples
This demonstrates the HOUR, MINUTE, and SECOND functions:
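The original query is not reproduced in these notes. A minimal sketch, assuming an illustrative timestamp literal and made-up column aliases, could look like this:

```sql
-- The timestamp literal and the aliases are assumptions chosen for illustration.
SELECT
    t,
    HOUR(t)   AS hour_part,    -- 12
    MINUTE(t) AS minute_part,  -- 34
    SECOND(t) AS second_part   -- 56
FROM (SELECT '2019-01-01 12:34:56'::TIMESTAMP AS t) AS src;
```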
2.4 LAST_DAY
Returns the last day of the specified date part for a date or timestamp. Commonly
used to return the last day of the month for a date or timestamp.
Syntax
LAST_DAY( <date_or_time_expr> [ , <date_part> ] )
Usage Notes
date_or_time_expr (Required) must be a date or timestamp expression.
date_part (Optional) is the date part for which the last day is returned. Possible
values are year, quarter, month, or week (or any of their supported variations). For
details, see Supported Date and Time Parts.
Examples
Return the last day of the month for the specified date (from a timestamp):
Return the last day of the year for the specified date (from a timestamp):
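The queries for these two cases are missing from the notes. As a hedged sketch (the timestamp literal and aliases are assumptions), both can be shown in one statement:

```sql
-- Illustrative input; any date or timestamp expression can be used.
SELECT
    ts,
    LAST_DAY(ts)         AS last_day_of_month,  -- 2015-05-31 (date_part defaults to month)
    LAST_DAY(ts, 'year') AS last_day_of_year    -- 2015-12-31
FROM (SELECT '2015-05-08 23:39:20'::TIMESTAMP AS ts) AS src;
```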
2.5 MONTHNAME
Extracts the three-letter month name from the specified date or timestamp.
Syntax
MONTHNAME( <date_or_timestamp_expr> )
Examples
SELECT MONTHNAME(TO_DATE('2015-05-01')) AS MONTH;
-------+
MONTH |
-------+
May |
-------+
SELECT MONTHNAME(TO_TIMESTAMP('2015-04-03 10:00')) AS MONTH;
-------+
MONTH |
-------+
Apr |
-------+
2.6 NEXT_DAY
Returns the date of the first specified DOW (day of week) that occurs after the
input date.
Syntax
NEXT_DAY( <date_or_time_expr> , <dow_string> )
Arguments
date_or_time_expr
Specifies the input date; can be a date or timestamp.
dow_string
Specifies the day of week used to calculate the date for the next day. The
value can be a string literal or an expression that returns a string. The string
must start with the first two characters (case-insensitive) of the day name:
su (Sunday)
mo (Monday)
tu (Tuesday)
we (Wednesday)
th (Thursday)
fr (Friday)
sa (Saturday)
Any leading spaces and trailing characters, including spaces, in the string are
ignored.
Usage Notes
The return value is always a date regardless of whether date_or_time_expr is a
date or timestamp.
Examples
Return the date of the next Friday that occurs after the current date:
+--------------+-------------+
| Today's Date | Next Friday |
|--------------+-------------|
| 2018-06-12 | 2018-06-15 |
+--------------+-------------+
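The query that produced this output is not included in the notes. A sketch that reproduces it, assuming a fixed date in place of the current date so the result is repeatable, might be:

```sql
-- '2018-06-12' (a Tuesday) stands in for CURRENT_DATE(); this is an assumption
-- made so that the result matches the output shown above.
SELECT
    d                     AS "Today's Date",
    NEXT_DAY(d, 'Friday') AS "Next Friday"   -- 2018-06-15
FROM (SELECT '2018-06-12'::DATE AS d) AS src;
```

Replacing the literal with CURRENT_DATE() gives the next Friday relative to the day the query runs.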
2.7 PREVIOUS_DAY
Returns the date of the first specified DOW (day of week) that occurs before the
input date.
Syntax
PREVIOUS_DAY( <date_or_time_expr> , <dow_string> )
Arguments
date_or_time_expr
Specifies the input date; can be a date or timestamp.
dow_string
Specifies the day of week used to calculate the date for the previous day. The
value can be a string literal or an expression that returns a string. The string
must start with the first two characters (case-insensitive) of the day name:
su (Sunday)
mo (Monday)
tu (Tuesday)
we (Wednesday)
th (Thursday)
fr (Friday)
sa (Saturday)
Any leading spaces and trailing characters, including spaces, in the string
are ignored.
Usage Notes
The return value is always a date regardless of whether date_or_time_expr is a
date or timestamp.
Examples
Return the date of the previous Friday that occurred before the current date:
+--------------+-----------------+
| Today's Date | Previous Friday |
|--------------+-----------------|
| 2018-06-12 | 2018-06-08 |
+--------------+-----------------+
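The query behind this output is also missing. A comparable sketch, again assuming a fixed date instead of the current date, could be:

```sql
-- '2018-06-12' stands in for CURRENT_DATE(); an assumption for repeatability.
SELECT
    d                         AS "Today's Date",
    PREVIOUS_DAY(d, 'Friday') AS "Previous Friday"   -- 2018-06-08
FROM (SELECT '2018-06-12'::DATE AS d) AS src;
```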
2.8 MONTHS_BETWEEN
Returns the number of months between two DATE or TIMESTAMP values.
Examples
This example shows differences in whole months. The first pair of dates have the
same day of the month (the 15th). The second pair of dates are both the last days in
their respective months (February 28th and March 31st).
SELECT
MONTHS_BETWEEN('2019-03-15'::DATE,
'2019-02-15'::DATE) AS MonthsBetween1,
MONTHS_BETWEEN('2019-03-31'::DATE,
'2019-02-28'::DATE) AS MonthsBetween2;
+----------------+----------------+
| MONTHSBETWEEN1 | MONTHSBETWEEN2 |
|----------------+----------------|
| 1.000000 | 1.000000 |
+----------------+----------------+
3. Numeric Functions
Numeric functions operate on numeric values and perform operations such as
rounding and exponentiation.
3.1 ABS
Returns the absolute value of a numeric expression.
Syntax
ABS( <num_expr> )
Examples
SELECT column1, abs(column1)
FROM (values (0), (1), (-2), (3.5), (-4.5), (null));
+---------+--------------+
| COLUMN1 | ABS(COLUMN1) |
|---------+--------------|
| 0.0 | 0.0 |
| 1.0 | 1.0 |
| -2.0 | 2.0 |
| 3.5 | 3.5 |
| -4.5 | 4.5 |
| NULL | NULL |
+---------+--------------+
3.2 CEIL
Returns values from input_expr rounded to the nearest equal or larger integer, or to the
nearest equal or larger value with the specified number of places after the decimal
point.
Syntax
CEIL( <input_expr> [, <scale_expr> ] )
Arguments
input_expr
The value or expression to operate on. The data type should be one of the
numeric data types, such as FLOAT or NUMBER.
scale_expr
The number of digits the output should include after the decimal point. The
expression should evaluate to an integer from -38 to +38.
The default scale_expr is zero, meaning that the function removes all digits after
the decimal point.
For information about negative scales, see the Usage Notes below.
Returns
The data type of the returned value is NUMBER(precision, scale).
If the input scale was greater than or equal to zero, then the output scale generally
matches the input scale.
For example:
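The original illustration is missing here. As a small sketch of the rounding behavior described above (the literals and aliases are assumptions), consider:

```sql
-- Values in comments are the mathematical results; display formatting may differ.
SELECT
    CEIL(135.135)     AS c_default,   -- 136
    CEIL(-975.975)    AS c_negative,  -- -975 (rounds toward the larger value)
    CEIL(135.135, 1)  AS c_scale_1,   -- 135.2
    CEIL(135.135, -2) AS c_scale_m2;  -- 200 (negative scale rounds left of the decimal point)
```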
3.3 FLOOR
Returns values from input_expr rounded to the nearest equal or smaller integer, or to
the nearest equal or smaller value with the specified number of places after the
decimal point.
Syntax
FLOOR( <input_expr> [, <scale_expr> ] )
Arguments
input_expr
The value or expression to operate on. The data type should be one of the
numeric data types, such as FLOAT or NUMBER.
scale_expr
The number of digits the output should include after the decimal point. The
expression should evaluate to an integer from -38 to +38.
The default scale_expr is zero, meaning that the function removes all digits after
the decimal point.
For information about negative scales, see the Usage Notes below.
Returns
The data type of the returned value is NUMBER(precision, scale).
If the input scale was greater than or equal to zero, then the output scale generally
matches the input scale.
For example:
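The original illustration is missing here as well. A matching sketch for FLOOR, with assumed literals and aliases, could be:

```sql
-- Values in comments are the mathematical results; display formatting may differ.
SELECT
    FLOOR(135.135)     AS f_default,   -- 135
    FLOOR(-975.975)    AS f_negative,  -- -976 (rounds toward the smaller value)
    FLOOR(135.135, 1)  AS f_scale_1,   -- 135.1
    FLOOR(135.135, -2) AS f_scale_m2;  -- 100
```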
3.4 MOD
Returns the remainder of input expr1 divided by input expr2.
Syntax
MOD( <expr1> , <expr2> )
Arguments
expr1
A numeric expression.
expr2
A numeric expression.
Returns
Returns either an integer or a fixed-point decimal number.
Usage Notes
Both expr1 and expr2 must be numeric expressions. They are not required to be
integers.
Examples
The following example shows usage of the MOD() function on both integer and
non-integer values:
Output:
+------+------+
| MOD1 | MOD2 |
+------+------+
| 1 | 0.9 |
+------+------+
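The statement that produced this output is not shown. One query consistent with it, offered as an assumption rather than the original, is:

```sql
-- 3 = 1*2 + 1, and 4.5 = 3*1.2 + 0.9
SELECT MOD(3, 2)     AS mod1,   -- 1
       MOD(4.5, 1.2) AS mod2;   -- 0.9
```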
3.5 ROUND
Returns rounded values for input_expr.
Syntax
ROUND( <input_expr> [ , <scale_expr> [ , <rounding_mode> ] ] )
ROUND( EXPR => <input_expr> ,
SCALE => <scale_expr>
[ , ROUNDING_MODE => <rounding_mode> ] )
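Arguments
Required:
input_expr
The value or expression to operate on. The data type should be one of the
numeric data types, such as FLOAT or NUMBER.
If you specify the EXPR => named argument, you must also specify
the SCALE => named argument.
Optional:
scale_expr
The number of digits the output should include after the decimal point. The
default is zero.
rounding_mode
The rounding mode to use: 'HALF_AWAY_FROM_ZERO' (the default) or 'HALF_TO_EVEN'.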
Examples
This first example shows a simple use of ROUND, with the default number of decimal
places (0):
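The example query is missing from the notes. A minimal sketch with assumed literals might be:

```sql
-- The first two calls use the default scale of 0; the third adds an explicit scale.
SELECT
    ROUND(135.135)    AS r_default,   -- 135
    ROUND(-975.975)   AS r_negative,  -- -976
    ROUND(135.135, 1) AS r_scale_1;   -- 135.1
```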
3.6 SIGN
Returns the sign of its argument: -1 if the argument is negative, 0 if it is zero,
and 1 if it is positive.
Syntax
SIGN( <expr> )
Examples
SELECT SIGN(5), SIGN(-1.35e-10), SIGN(0);
---------+-----------------+---------+
 SIGN(5) | SIGN(-1.35E-10) | SIGN(0) |
---------+-----------------+---------+
 1       | -1              | 0       |
---------+-----------------+---------+
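3.7 TRUNCATE , TRUNC
Rounds input_expr toward zero, either to an integer or to the specified number of
places after the decimal point.
Note
TRUNC is overloaded; it can also be used as a date/time function to truncate dates,
times, and timestamps to a specified part.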
Syntax
TRUNCATE( <input_expr> [ , <scale_expr> ] )
Arguments
input_expr
The value or expression to operate on. The data type should be one of the
numeric data types, such as FLOAT or NUMBER.
scale_expr
The number of digits the output should include after the decimal point. The
expression should evaluate to an integer from -38 to +38.
The default scale_expr is zero, meaning that the function removes all digits after
the decimal point.
For information about negative scales, see the Usage Notes below.
Returns
The data type of the returned value is NUMBER(precision, scale).
If the input scale was greater than or equal to zero, then the output scale generally
matches the input scale.
For example:
Examples
The following examples demonstrate the TRUNC function.
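The example queries themselves are not included in the notes. A hedged sketch with assumed literals and aliases could be:

```sql
-- TRUNC drops digits without rounding, so negative inputs move toward zero.
SELECT
    TRUNC(135.135)     AS t_default,   -- 135
    TRUNC(-975.975)    AS t_negative,  -- -975
    TRUNC(135.135, 1)  AS t_scale_1,   -- 135.1
    TRUNC(135.135, -2) AS t_scale_m2;  -- 100
```

Comparing the second column with FLOOR(-975.975), which gives -976, shows the difference between truncating and rounding down.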
4. String Functions (Regular Expressions)
4.1 REGEXP
Returns true if the subject matches the specified pattern. Both inputs must be text
expressions.
REGEXP is similar to the LIKE function, but with POSIX extended regular
expressions instead of SQL LIKE pattern syntax. It supports more complex matching
conditions than LIKE.
Syntax
<subject> REGEXP <pattern>
Arguments
Required:
subject
Subject to match.
pattern
Pattern to match.
Returns
The data type of the returned value is BOOLEAN.
Examples
The example below shows how to use REGEXP with a simple wildcard expression:
SELECT v
FROM strings
WHERE v REGEXP 'San* [fF].*'
ORDER BY v;
+---------------+
| V             |
|---------------|
| San Francisco |
+---------------+
The backslash character \ is the escape character in regular expressions, and specifies
special characters or groups of characters. For example, \s is the regular expression
for whitespace.
The Snowflake string parser, which parses literal strings, also treats backslash as an
escape character. For example, a backslash is used as part of the sequence of
characters that specifies a tab character. Thus to create a string that contains a single
backslash, you must specify two backslashes. For example, compare the string in the
input statement below with the corresponding string in the output:
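The statement referred to here is missing from the notes. A minimal sketch, using an assumed sample string, shows the effect:

```sql
-- Two backslashes in the single-quoted literal produce one backslash in the value.
SELECT 'Contains embedded single \\backslash' AS v;
-- The returned value reads: Contains embedded single \backslash
```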
This example shows how to search for strings that start with “San”, where “San” is
a complete word (e.g. not part of “Santa”). \b is the escape sequence for a word
boundary.
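The query is not shown in the notes. As a sketch, assuming a small inline data set in place of the strings table, it could be written as:

```sql
-- The CTE stands in for the strings table; its rows are illustrative assumptions.
WITH strings AS (
    SELECT column1 AS v
    FROM VALUES ('San Francisco'), ('San Jose'), ('Santa Clara'), ('Sacramento')
)
SELECT v
FROM strings
WHERE v REGEXP '\\bSan\\b.*'   -- each regex backslash is doubled for the string parser
ORDER BY v;
-- Matches 'San Francisco' and 'San Jose', but not 'Santa Clara' or 'Sacramento'.
```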
This example shows how to search for a blank followed by a backslash. Note that
the single backslash to search for is represented by four backslashes below; for
REGEXP to look for a literal backslash, that backslash must be escaped, so you need
two backslashes. The string parser requires that each of those backslashes be
escaped, so the expression contains four backslashes to represent the one backslash
that the expression is searching for:
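The query itself is missing. A sketch along the lines described, with assumed inline data, might be:

```sql
-- The CTE stands in for the strings table; its rows are illustrative assumptions.
WITH strings AS (
    SELECT column1 AS v
    FROM VALUES ('San Francisco'), ('Contains embedded single \\backslash')
)
SELECT v
FROM strings
WHERE v REGEXP '.*\\s\\\\.*'   -- \\s = whitespace, \\\\ = one literal backslash
ORDER BY v;
-- Matches only the row that contains a blank followed by a backslash.
```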
The following example is the same as the preceding example, except that it uses $$ as
a string delimiter to tell the string parser that the string is a literal and that
backslashes should not be interpreted as escape sequences. (The backslashes are still
interpreted as escape sequences by REGEXP.)
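The dollar-quoted variant is also not shown. Under the same assumed data, it could look like this:

```sql
-- The CTE stands in for the strings table; its rows are illustrative assumptions.
WITH strings AS (
    SELECT column1 AS v
    FROM VALUES ('San Francisco'), ('Contains embedded single \\backslash')
)
SELECT v
FROM strings
WHERE v REGEXP $$.*\s\\.*$$    -- no doubling for the string parser inside $$...$$
ORDER BY v;
```

Only the regular-expression level of escaping remains, so the pattern is shorter and easier to read.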
4.2 REGEXP_LIKE
Returns true if the subject matches the specified pattern. Both inputs must be text
expressions.
REGEXP_LIKE is similar to the LIKE function, but with POSIX extended regular
expressions instead of SQL LIKE pattern syntax. It supports more complex matching
conditions than LIKE.
Syntax
REGEXP_LIKE( <subject> , <pattern> [ , <parameters> ] )
Arguments
Required:
subject
Subject to match.
pattern
Pattern to match.
Optional:
parameters
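String of one or more characters that specifies the parameters used for
searching for matches. Supported values:
c (case-sensitive), i (case-insensitive), m (multi-line mode), e (extract
submatches), s (the wildcard character . also matches a newline)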
Returns
The data type of the returned value is BOOLEAN.
Usage Notes
The function implicitly anchors a pattern at both ends (i.e. '' automatically
becomes '^$', and 'ABC' automatically becomes '^ABC$'). To match any string
starting with ABC, the pattern would be 'ABC.*'.
The backslash character (\) is the escape character. For more information,
see Specifying Regular Expressions in Single-Quoted String Constants.
For more usage notes, see the General Usage Notes for regular expression
functions.
Collation Details
Arguments with collation specifications are currently not supported.
Examples
Create a table with names of cities:
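The statements for this example are missing from the notes. A sketch, assuming a table named cities with a single city column and a few sample rows, could be:

```sql
-- Table name, column name, and rows are assumptions chosen for illustration.
CREATE OR REPLACE TABLE cities (city VARCHAR(20));
INSERT INTO cities VALUES
    ('Sacramento'),
    ('San Francisco'),
    ('San Jose'),
    (NULL);

-- Case-insensitive match ('i'); the pattern is implicitly anchored at both ends.
SELECT city
FROM cities
WHERE REGEXP_LIKE(city, 'san.*', 'i')
ORDER BY city;
-- Returns 'San Francisco' and 'San Jose'.
```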
5. Aggregate Functions
Aggregate functions operate on values across rows to perform mathematical
calculations such as sum, average, counting, minimum/maximum values, standard
deviation, and estimation, as well as some non-mathematical operations.
An aggregate function takes multiple rows (actually, zero, one, or more rows) as
input and produces a single output. In contrast, scalar functions take one row as input
and produce one row (one value) as output.
An aggregate function always returns exactly one row, even when the input
contains zero rows. Typically, if the input contained zero rows, the output is NULL.
However, an aggregate function could return 0, an empty string, or some other value
when passed zero rows.
General Aggregation
AVG
COUNT
COUNT_IF
MAX
MAX_BY
MEDIAN
MIN
MIN_BY
SUM
5.1 AVG
Returns the average of non-NULL records. If all records inside a group are NULL,
the function returns NULL.
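Syntax
Aggregate function
AVG( [ DISTINCT ] <expr1> )
Window function
AVG( [ DISTINCT ] <expr1> ) OVER ( [ PARTITION BY <expr2> ] [ ORDER BY <expr3> [ ASC | DESC ] [ <window_frame> ] ] )
Arguments
expr1
An expression that evaluates to a numeric data type (INTEGER, FLOAT, DECIMAL, etc.).
expr2
The optional expression to partition by.
expr3
The optional expression to order by within each partition.
Examples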
SELECT *
FROM avg_example
ORDER BY int_col, d;
+---------+----------+------+------+
| INT_COL | D | S1 | S2 |
|---------+----------+------+------|
| 1 | 1.10000 | 1.1 | one |
| 1 | 10.00000 | 10 | ten |
| 2 | 2.40000 | 2.4 | two |
| 2 | NULL | NULL | NULL |
| 3 | NULL | NULL | NULL |
| NULL | 9.90000 | 9.9 | nine |
+---------+----------+------+------+
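The setup statements and the plain aggregate query are not included in the notes. The following sketch assumes column types and inserts rows matching the table shown above:

```sql
-- Column types are assumptions; the rows mirror the table displayed above.
CREATE OR REPLACE TABLE avg_example (
    int_col INT,
    d       DECIMAL(10,5),
    s1      VARCHAR(10),
    s2      VARCHAR(10)
);
INSERT INTO avg_example VALUES
    (1,    1.1,  '1.1', 'one'),
    (1,    10,   '10',  'ten'),
    (2,    2.4,  '2.4', 'two'),
    (2,    NULL, NULL,  NULL),
    (3,    NULL, NULL,  NULL),
    (NULL, 9.9,  '9.9', 'nine');

-- NULLs are skipped: AVG(int_col) = 9 / 5 = 1.8 and AVG(d) = 23.4 / 4 = 5.85
SELECT AVG(int_col), AVG(d)
FROM avg_example;
```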
Calculate the average of int_col within each partition by using AVG as a window
function:
SELECT
int_col,
AVG(int_col) OVER(PARTITION BY int_col)
FROM avg_example
ORDER BY int_col;
+---------+-----------------------------------------+
| INT_COL | AVG(INT_COL) OVER(PARTITION BY INT_COL) |
|---------+-----------------------------------------|
|       1 |                                   1.000 |
|       1 |                                   1.000 |
|       2 |                                   2.000 |
|       2 |                                   2.000 |
|       3 |                                   3.000 |
|    NULL |                                    NULL |
+---------+-----------------------------------------+
5.2 COUNT
Returns either the number of non-NULL records for the specified columns, or the
total number of records.
Syntax
Aggregate function
COUNT( * )
Examples
This is an example of using COUNT with NULL values. The query also includes
some COUNT(DISTINCT) operations:
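The query is missing from the notes. A self-contained sketch, assuming a small inline data set named basic_example, could be:

```sql
-- The CTE and its rows are illustrative assumptions.
WITH basic_example AS (
    SELECT column1 AS i_col, column2 AS j_col
    FROM VALUES (11, 101), (11, 102), (11, NULL), (12, 101), (NULL, 101), (NULL, 102)
)
SELECT
    COUNT(*)              AS all_rows,        -- 6
    COUNT(i_col)          AS i_non_null,      -- 4
    COUNT(DISTINCT i_col) AS i_distinct,      -- 2
    COUNT(j_col)          AS j_non_null,      -- 5
    COUNT(DISTINCT j_col) AS j_distinct       -- 2
FROM basic_example;
```

COUNT(*) counts every row, while COUNT on a column skips rows where that column is NULL.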
5.3 COUNT_IF
Returns the number of records that satisfy a condition or NULL if no records satisfy
the condition.
Syntax
Aggregate function
COUNT_IF( <condition> )
Window function
COUNT_IF( <condition> )
OVER ( [ PARTITION BY <expr1> ] [ ORDER BY <expr2> [ ASC | DESC ] [ <window_frame> ] ] )
Arguments
condition
An expression that evaluates to a BOOLEAN value (TRUE, FALSE, or NULL).
expr1
The column to partition on, if you want the result to be split into multiple
windows.
expr2
The column to order each window on. Note that this is separate from any
ORDER BY clause to order the final result set.
Returns
If the function does not return NULL, the data type of the returned value is
NUMBER.
Usage Notes
When this function is called as a window function:
If an ORDER BY sub-clause is used inside the OVER() clause, then a
window frame must be used. If no window frame is specified, then the
default is a cumulative window frame:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Examples
The examples in this section demonstrate how to use the COUNT_IF function.
The following example passes in TRUE for the condition, which returns the count of
all rows in the table:
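The query is not reproduced in the notes. A sketch with an assumed inline data set named basic_example could be:

```sql
-- The CTE and its rows are illustrative assumptions.
WITH basic_example AS (
    SELECT column1 AS i_col, column2 AS j_col
    FROM VALUES (11, 101), (11, 102), (11, NULL), (12, 101), (NULL, 101), (NULL, 102)
)
SELECT COUNT_IF(TRUE) AS all_rows   -- 6, because the condition holds for every row
FROM basic_example;
```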
The following example returns the number of rows where the value in J_COL is
greater than the value in I_COL:
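This query is missing too. Under the same assumed data, it might be written as:

```sql
-- The CTE and its rows are illustrative assumptions.
WITH basic_example AS (
    SELECT column1 AS i_col, column2 AS j_col
    FROM VALUES (11, 101), (11, 102), (11, NULL), (12, 101), (NULL, 101), (NULL, 102)
)
SELECT COUNT_IF(j_col > i_col) AS j_greater   -- 3
FROM basic_example;
```

Rows where either column is NULL make the comparison NULL rather than TRUE, so they are not counted.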
5.4 MAX
Returns the maximum value for the records within expr. NULL values are ignored
unless all the records are NULL, in which case a NULL value is returned.
Syntax
Aggregate function
MAX( <expr> )
Examples
The following examples demonstrate how to use the MAX function.
SELECT k, d
FROM sample_table
ORDER BY k, d;
+------+------+
|    K |    D |
|------+------|
|    1 |    1 |
|    1 |    3 |
|    1 |    5 |
|    2 |    2 |
|    2 | NULL |
|    3 | NULL |
| NULL |    1 |
| NULL |    7 |
+------+------+
Use the MAX function to retrieve the largest value in the column named d:
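The query is not included in the notes. A self-contained sketch, with the rows of sample_table repeated inline as an assumption, could be:

```sql
-- The CTE mirrors the sample_table rows shown above.
WITH sample_table AS (
    SELECT column1 AS k, column2 AS d
    FROM VALUES (1, 1), (1, 5), (1, 3), (2, 2), (2, NULL), (3, NULL), (NULL, 7), (NULL, 1)
)
SELECT MAX(d)   -- 7
FROM sample_table;
```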
Combine the GROUP BY clause with the MAX function to retrieve the largest values in each
group (where each group is based on the value of column k):
SELECT k, MAX(d)
FROM sample_table
GROUP BY k
ORDER BY k;
+------+--------+
|    K | MAX(D) |
|------+--------|
|    1 |      5 |
|    2 |      2 |
|    3 |   NULL |
| NULL |      7 |
+------+--------+
5.5 MAX_BY
Finds the row(s) containing the maximum value for a column and returns the value
of another column in that row.
If multiple rows contain the specified maximum value, the function is
non-deterministic.
To return values for multiple rows, specify the optional
maximum_number_of_values_to_return argument. With this argument:
The function returns an ARRAY containing the values of a column for the
rows with the highest values of a specified column.
The values in the ARRAY are sorted by their corresponding values in the
column containing the maximum values.
If multiple rows contain these highest values, the function is
non-deterministic.
Syntax
MAX_BY( <col_to_return>, <col_containing_maximum> [ ,
<maximum_number_of_values_to_return> ] )
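Arguments
Required:
col_to_return
Column containing the value to return.
col_containing_maximum
Column containing the maximum value.
Optional:
maximum_number_of_values_to_return
Constant integer specifying the maximum number of values to return.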
Returns
If maximum_number_of_values_to_return is not specified, the function returns a
value of the same type as col_to_return.
If maximum_number_of_values_to_return is specified, the function returns an
ARRAY containing values of the same type as col_to_return. The values in the
ARRAY are sorted by their corresponding col_containing_maximum values.
Usage Notes
The function ignores NULL values in col_containing_maximum.
If all values in col_containing_maximum are NULL, the function returns
NULL (regardless of whether the
optional maximum_number_of_values_to_return argument is specified).
Examples
The following examples demonstrate how to use the MAX_BY function.
To run these examples, execute the following statements to set up the table and data
for the examples:
The following example returns the ID of the employee with the highest salary:
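Neither the setup statements nor the query survive in the notes. The sketch below assumes an employees table whose rows were chosen to agree with the MIN_BY output shown later in these notes:

```sql
-- Table definition and rows are assumptions, not the original setup.
CREATE OR REPLACE TABLE employees (employee_id NUMBER, department_id NUMBER, salary NUMBER);
INSERT INTO employees VALUES
    (1001, 10, 10000),
    (1020, 10, 9000),
    (1030, 10, 8000),
    (900,  20, 15000),
    (2000, 20, NULL),
    (2010, 20, 15000),
    (2020, 20, 8000);

-- Two rows share the top salary (15000), so either 900 or 2010 may be returned.
SELECT MAX_BY(employee_id, salary)
FROM employees;
```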
Because more than one row contains the maximum value for
the salary column, the function is non-deterministic and might return the
employee ID for a different row in subsequent executions.
The function ignores the NULL value in the salary column when determining
the rows with the maximum values.
The following example returns an ARRAY containing the IDs of the employees with
the three highest salaries:
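The query is missing here. A sketch follows, with the assumed employee rows repeated inline so that the snippet runs on its own:

```sql
-- The CTE and its rows are illustrative assumptions.
WITH employees AS (
    SELECT column1 AS employee_id, column2 AS department_id, column3 AS salary
    FROM VALUES (1001, 10, 10000), (1020, 10, 9000), (1030, 10, 8000),
                (900, 20, 15000), (2000, 20, NULL), (2010, 20, 15000), (2020, 20, 8000)
)
SELECT MAX_BY(employee_id, salary, 3)
FROM employees;
-- Returns an ARRAY with 900 and 2010 (salary 15000, in either order), then 1001.
```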
As shown in the example, the values in the ARRAY are sorted by their
corresponding values in the salary column. So, MAX_BY returns the IDs of
employees sorted by their salary in descending order.
If more than one of these rows contain the same value in the salary column, the order
of the returned values for that salary is non-deterministic.
5.6 MEDIAN
Determines the median of a set of values.
Syntax
Aggregate function
MEDIAN( <expr> )
Window function
MEDIAN( <expr> ) OVER ( [ PARTITION BY <expr2> ] )
Argument
expr
The expression must evaluate to a numeric data type
(INTEGER, FLOAT, DECIMAL, or equivalent).
Returns
Returns a FLOAT or DECIMAL (fixed-point) number, depending upon the input.
Usage Notes
If the number of non-NULL values is an odd number greater than or equal to
1, this returns the median (“center”) value of the non-NULL values.
If the number of non-NULL values is an even number, this returns a value
equal to the average of the two center values. For example, if the values are
1, 3, 5, and 20, then this returns 4 (the average of 3 and 5).
If all values are NULL, this returns NULL.
If the number of non-NULL values is 0, this returns NULL.
DISTINCT is not supported for this function.
When used as a window function:
This function does not support:
ORDER BY sub-clause in the OVER() clause.
Window frames.
Examples
This shows how to use the function.
Get the MEDIAN value for column v. The function returns NULL because there are
no rows.
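The statements are not shown in the notes. A minimal sketch, assuming an empty table named aggr, could be:

```sql
-- Table name and column types are assumptions; the table is left empty on purpose.
CREATE OR REPLACE TEMPORARY TABLE aggr (k INT, v DECIMAL(10,2));

SELECT MEDIAN(v)   -- NULL, because there are no rows
FROM aggr;
```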
Get the MEDIAN value for each group. Note that because the number of values in
group k = 2 is an even number, the returned value for that group is the mid-point
between the two middle numbers.
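The grouped query is missing as well. A self-contained sketch with assumed inline data might be:

```sql
-- The CTE and its rows are illustrative assumptions.
WITH aggr AS (
    SELECT column1 AS k, column2 AS v
    FROM VALUES (1, 10), (1, 20), (1, 21), (2, 10), (2, 20), (2, 25), (2, 30), (3, NULL)
)
SELECT k, MEDIAN(v)
FROM aggr
GROUP BY k
ORDER BY k;
-- k = 1 -> 20 (middle of 10, 20, 21)
-- k = 2 -> 22.5 (average of 20 and 25)
-- k = 3 -> NULL (no non-NULL values)
```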
5.7 MIN
Returns the minimum value for the records within expr. NULL values are ignored
unless all the records are NULL, in which case a NULL value is returned.
Syntax
Aggregate function
MIN( <expr> )
Window function
MIN( <expr> ) [ OVER ( [ PARTITION BY <expr1> ] [ ORDER BY <expr2> [ <window_frame> ] ] ) ]
Returns
The data type of the returned value is the same as the data type of the input values.
Usage Notes
For compatibility with other systems, you can specify the DISTINCT
keyword as an argument for the function, but it does not have any effect.
If the function is called as a window function, the window can include an
optional window_frame. The window_frame (either cumulative or sliding) specifies
the subset of rows within the window for which the minimum value is
returned. If no window_frame is specified, the default is the following cumulative
window frame (in accordance with the ANSI standard for window functions):
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Collation Details
The comparisons follow the collation based on the input arguments’ collations
and precedences.
The collation of the result is the same as the collation of the input.
Examples
The following examples demonstrate how to use the MIN function.
SELECT k, d
FROM sample_table
ORDER BY k, d;
+------+------+
|    K |    D |
|------+------|
|    1 |    1 |
|    1 |    3 |
|    1 |    5 |
|    2 |    2 |
|    2 | NULL |
|    3 | NULL |
| NULL |    1 |
| NULL |    7 |
+------+------+
Use the MIN function to retrieve the smallest value in the column named d:
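The query is not included in the notes. A self-contained sketch, with the rows of sample_table repeated inline as an assumption, could be:

```sql
-- The CTE mirrors the sample_table rows shown above.
WITH sample_table AS (
    SELECT column1 AS k, column2 AS d
    FROM VALUES (1, 1), (1, 5), (1, 3), (2, 2), (2, NULL), (3, NULL), (NULL, 7), (NULL, 1)
)
SELECT MIN(d)   -- 1
FROM sample_table;
```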
Combine the GROUP BY clause with the MIN function to retrieve the smallest
values in each group (where each group is based on the value of column k):
SELECT k, MIN(d)
FROM sample_table
GROUP BY k
ORDER BY k;
+------+--------+
|    K | MIN(D) |
|------+--------|
|    1 |      1 |
|    2 |      2 |
|    3 |   NULL |
| NULL |      1 |
+------+--------+
Use a PARTITION BY clause to break the data into groups based on the value of k.
This is similar to, but not identical to, using GROUP BY. In particular, GROUP BY
produces one output row per group, while PARTITION BY produces one output
row per input row.
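The query for this case is missing. A sketch under the same assumed data could be:

```sql
-- The CTE mirrors the sample_table rows shown earlier.
WITH sample_table AS (
    SELECT column1 AS k, column2 AS d
    FROM VALUES (1, 1), (1, 5), (1, 3), (2, 2), (2, NULL), (3, NULL), (NULL, 7), (NULL, 1)
)
SELECT k, d, MIN(d) OVER (PARTITION BY k) AS min_in_partition
FROM sample_table
ORDER BY k, d;
-- Eight rows come back (one per input row); every row carries its partition's minimum.
```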
Use a windowing ORDER BY clause to create a sliding window two rows wide, and
output the lowest value within that window. (Remember that ORDER BY in the
windowing clause is separate from ORDER BY at the statement level.) This example
uses a single partition, so there is no PARTITION BY clause in the OVER() clause.
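The sliding-window query is also not shown. One way to express a two-row window, offered as an assumption about the original, is:

```sql
-- The CTE mirrors the sample_table rows shown earlier.
WITH sample_table AS (
    SELECT column1 AS k, column2 AS d
    FROM VALUES (1, 1), (1, 5), (1, 3), (2, 2), (2, NULL), (3, NULL), (NULL, 7), (NULL, 1)
)
SELECT k, d,
       MIN(d) OVER (ORDER BY k, d ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS min_of_two
FROM sample_table
ORDER BY k, d;
```

The frame ROWS BETWEEN 1 PRECEDING AND CURRENT ROW covers the current row and the one before it, so each output value is the smaller of those two d values.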
5.8 MIN_BY
Finds the row(s) containing the minimum value for a column and returns the value
of another column in that row.
If multiple rows contain the specified minimum value, the function is
non-deterministic.
To return values for multiple rows, specify the optional
maximum_number_of_values_to_return argument. With this argument:
The function returns an ARRAY containing the values of a column for the
rows with the lowest values of a specified column.
The values in the ARRAY are sorted by their corresponding values in the
column containing the minimum values.
If multiple rows contain these lowest values, the function is non-deterministic.
Syntax
MIN_BY( <col_to_return>, <col_containing_minimum> [ ,
<maximum_number_of_values_to_return> ] )
Arguments
Required:
col_to_return
Column containing the value to return.
col_containing_minimum
Column containing the minimum value.
Optional:
maximum_number_of_values_to_return
Constant integer specifying the maximum number of values to return.
Returns
If maximum_number_of_values_to_return is not specified, the function returns a
value of the same type as col_to_return.
If maximum_number_of_values_to_return is specified, the function returns an
ARRAY containing values of the same type as col_to_return. The values in
the ARRAY are sorted by their corresponding col_containing_minimum values.
Usage Notes
The function ignores NULL values in col_containing_minimum.
If all values in col_containing_minimum are NULL, the function returns NULL
(regardless of whether the
optional maximum_number_of_values_to_return argument is specified).
Examples
The following examples demonstrate how to use the MIN_BY function.
To run these examples, execute the following statements to set up the table and data
for the examples:
The following example returns the ID of the employee with the lowest salary:
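Neither the setup statements nor the query survive in the notes. The sketch below assumes an employees table whose rows agree with the array output shown later in this section:

```sql
-- Table definition and rows are assumptions, not the original setup.
CREATE OR REPLACE TABLE employees (employee_id NUMBER, department_id NUMBER, salary NUMBER);
INSERT INTO employees VALUES
    (1001, 10, 10000),
    (1020, 10, 9000),
    (1030, 10, 8000),
    (900,  20, 15000),
    (2000, 20, NULL),
    (2010, 20, 15000),
    (2020, 20, 8000);

-- Two rows share the lowest salary (8000), so either 1030 or 2020 may be returned.
SELECT MIN_BY(employee_id, salary)
FROM employees;
```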
Because more than one row contains the minimum value for
the salary column, the function is non-deterministic and might return the
employee ID for a different row in subsequent executions.
The function ignores the NULL value in the salary column when determining
the rows with the minimum values.
The following example returns an ARRAY containing the IDs of the employees with
the three lowest salaries:
+--------------------------------+
| MIN_BY(EMPLOYEE_ID, SALARY, 3) |
|--------------------------------|
| [                              |
|   1030,                        |
|   2020,                        |
|   1020                         |
| ]                              |
+--------------------------------+
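The query that produced this output is not shown. A sketch consistent with it, using assumed employee rows repeated inline so that it runs on its own, is:

```sql
-- The CTE and its rows are illustrative assumptions chosen to match the output above.
WITH employees AS (
    SELECT column1 AS employee_id, column2 AS department_id, column3 AS salary
    FROM VALUES (1001, 10, 10000), (1020, 10, 9000), (1030, 10, 8000),
                (900, 20, 15000), (2000, 20, NULL), (2010, 20, 15000), (2020, 20, 8000)
)
SELECT MIN_BY(employee_id, salary, 3)
FROM employees;
-- 1030 and 2020 both earn 8000, so their relative order may vary; 1020 (9000) comes last.
```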
As shown in the example, the values in the ARRAY are sorted by their
corresponding values in the salary column. So, MIN_BY returns the IDs of
employees sorted by their salary in ascending order.
If more than one of these rows contain the same value in the salary column, the order
of the returned values for that salary is non-deterministic.
5.9 SUM
Returns the sum of non-NULL records for expr. You can use the DISTINCT keyword
to compute the sum of unique non-null values. If all records inside a group are
NULL, the function returns NULL.
Syntax
Aggregate function
SUM( [ DISTINCT ] <expr1> )
Window function
SUM( [ DISTINCT ] <expr1> ) OVER ( [ PARTITION BY <expr2> ] [ ORDER BY <expr3> [ ASC | DESC ] [ <window_frame> ] ] )
For details about window_frame syntax, see Window Frame Syntax and Usage.
Arguments
expr1
This is an expression that evaluates to a numeric data type (INTEGER,
FLOAT, DECIMAL, etc.).
expr2
This is the optional expression to partition by.
expr3
This is the optional expression to order by within each partition. (This does
not control the order of the entire query output.)
Usage Notes
Numeric values are summed into an equivalent or larger data type.
When passed a VARCHAR expression, this function implicitly casts the input
to floating point values. If the cast cannot be performed, an error is returned.
When this function is called as a window function (i.e. with an OVER clause):
If the OVER clause contains an ORDER BY subclause, then:
A window frame is required. If no window frame is specified
explicitly, then the ORDER BY implies a cumulative window
frame:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Examples
CREATE OR REPLACE TABLE sum_example(k INT, d DECIMAL(10,5),
s1 VARCHAR(10), s2 VARCHAR(10));
+------+----------+------+------+
| K    | D        | S1   | S2   |
|------+----------+------+------|
| 1 | 1.10000 | 1.1 | one |
| 1 | 10.00000 | 10.0 | ten |
| 2 | 2.20000 | 2.2 | two |
| 2 | NULL | NULL | null |
| 3 | NULL | NULL | null |
| NULL | 9.00000 | 9.9 | nine |
+------+----------+------+------+
+----------+---------+
| SUM(D) | SUM(S1) |
|----------+---------|
| 22.30000 | 23.2 |
+----------+---------+
+------+----------+---------+
| K | SUM(D) | SUM(S1) |
|------+----------+---------|
| 1 | 11.10000 | 11.1 |
| 2 | 2.20000 | 2.2 |
| 3 | NULL | NULL |
| NULL | 9.00000 | 9.9 |
+------+----------+---------+
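The INSERT and SELECT statements behind the three result tables above are missing from the notes. The following sketch infers the rows from the first table; the lowercase 'null' entries in s2 are assumed to be strings rather than SQL NULLs:

```sql
-- Recreates the table so the snippet runs on its own; rows are inferred from the output above.
CREATE OR REPLACE TABLE sum_example (k INT, d DECIMAL(10,5), s1 VARCHAR(10), s2 VARCHAR(10));
INSERT INTO sum_example VALUES
    (1,    1.1,  '1.1',  'one'),
    (1,    10,   '10.0', 'ten'),
    (2,    2.2,  '2.2',  'two'),
    (2,    NULL, NULL,   'null'),
    (3,    NULL, NULL,   'null'),
    (NULL, 9,    '9.9',  'nine');

-- First table: the raw rows.
SELECT * FROM sum_example ORDER BY k;

-- Second table: s1 is a VARCHAR column and is implicitly cast to floating point.
-- SUM(d) = 1.1 + 10 + 2.2 + 9 = 22.3 and SUM(s1) = 1.1 + 10.0 + 2.2 + 9.9 = 23.2
SELECT SUM(d), SUM(s1) FROM sum_example;

-- Third table: one sum per group; the group with only NULL values returns NULL.
SELECT k, SUM(d), SUM(s1) FROM sum_example GROUP BY k ORDER BY k;
```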