SELECT — Fetches the specified rows and columns from the database.
Select-statement [{set-operator} Select-statement ] ...
Select-statement:
SELECT [ TOP integer-value ]
{ * | [ ALL | DISTINCT ] { column-name | selection-expression } [AS alias] [,...] }
FROM { table-reference } [ join-clause ]...
[WHERE [NOT] boolean-expression [ {AND | OR} [NOT] boolean-expression]...]
[clause...]
table-reference:
{ table-name [AS alias] | view-name [AS alias] | sub-query AS alias }
sub-query:
(Select-statement)
join-clause:
,table-reference
[INNER | {LEFT | RIGHT} [OUTER]] JOIN [{table-reference}] [join-condition]
join-condition:
ON conditional-expression
USING (column-reference [,...])
clause:
ORDER BY { column-name | alias } [ ASC | DESC ] [,...]
GROUP BY { column-name | alias } [,...]
HAVING boolean-expression
LIMIT integer-value [OFFSET row-count]
set-operator:
UNION [ALL]
INTERSECT [ALL]
EXCEPT
The SELECT statement retrieves the specified rows and columns from the database, filtered and sorted by any clauses that are included in the statement. In its simplest form, the SELECT statement retrieves the values associated with individual columns. However, the selection expression can also be a function such as COUNT or SUM.
The following features and limitations are important to note when using the SELECT statement with VoltDB:
See Appendix C, SQL Functions for a full list of the SQL functions that VoltDB supports.
VoltDB supports the following operators in expressions: addition (+), subtraction (-), multiplication (*), division (/), and string concatenation (||).
TOP n is a synonym for LIMIT n.
The WHERE expression supports the boolean operators: equals (=), not equals (!= or <>), greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=), LIKE, IS NULL, IS DISTINCT FROM, IS NOT DISTINCT FROM, AND, OR, and NOT. Note, however, that although OR is supported syntactically, VoltDB does not optimize OR operations, so using OR may reduce the performance of your queries.
The boolean expression LIKE provides text pattern matching in a VARCHAR column. The syntax of the LIKE expression is {string-expression} LIKE '{pattern}' where the pattern can contain text and wildcards, including the underscore (_) for matching a single character and the percent sign (%) for matching zero or more characters. The string comparison is case sensitive.
Where an index exists on the column being scanned and the pattern starts with a text prefix (rather than starting with a wildcard), VoltDB will attempt to use the index to maximize performance. For example, a query limiting the results to rows from the EMPLOYEE table where the primary index, the JOB_CODE column, begins with the characters "Temp" looks like this:
SELECT * from EMPLOYEE where JOB_CODE like 'Temp%';
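The wildcard rules described above can be modeled outside the database. The following Java sketch is illustrative only: the class LikePattern and its matches method are hypothetical names, not part of VoltDB, and the sketch ignores escape characters. It translates a LIKE-style pattern into a regular expression to show how the underscore and percent sign behave and that the comparison is case sensitive.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of VoltDB: models the LIKE wildcard rules.
public class LikePattern {
    // Returns true if value matches the LIKE-style pattern, where
    // '_' matches exactly one character and '%' matches zero or more.
    public static boolean matches(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL)
                      .matcher(value)
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Temporary", "Temp%")); // prefix plus any suffix
        System.out.println(matches("Temp", "Temp%"));      // '%' may match nothing
        System.out.println(matches("temporary", "Temp%")); // differs only in case
        System.out.println(matches("Temps", "Temp_"));     // '_' needs one character
    }
}
```

In this model only the third call fails to match, because the lowercase "t" does not equal the "T" in the pattern.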
The boolean expression IN determines if a given value is found within a list of alternatives. For example, in the following code fragment the IN expression looks to see if a record is part of Hispaniola by evaluating whether the column COUNTRY is equal to either "Dominican Republic" or "Haiti":
WHERE Country IN ('Dominican Republic', 'Haiti')
Note that the list of alternatives must be enclosed in parentheses. The result of an IN expression is equivalent to a sequence of equality conditions separated by OR. So the preceding code fragment produces the same boolean result as:
WHERE Country='Dominican Republic' OR Country='Haiti'
The advantages are that the IN syntax provides more compact and readable code and can provide improved performance by using an index on the initial expression where available.
The boolean expression BETWEEN determines if a value falls within a given range. The evaluation is inclusive of the end points. In this way BETWEEN is a convenient alias for two boolean expressions determining if a value is greater than or equal to (>=) the starting value and less than or equal to (<=) the end value. For example, the following two WHERE clauses are equivalent:
WHERE salary BETWEEN ? AND ?
WHERE salary >= ? AND salary <= ?
The boolean expressions IS DISTINCT FROM and IS NOT DISTINCT FROM are similar to the equals ("=") and not equals ("<>") operators respectively, except when evaluating null operands. If either or both operands are null, the equals and not equals operators return a boolean null value, which a WHERE clause treats the same as false. IS DISTINCT FROM and IS NOT DISTINCT FROM, by contrast, consider null a valid operand. So if only one operand is null, IS DISTINCT FROM returns true and IS NOT DISTINCT FROM returns false. If both operands are null, IS DISTINCT FROM returns false and IS NOT DISTINCT FROM returns true.
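The difference in null handling can be sketched in Java, using a Java null to stand in for a SQL null. This is a hypothetical model for illustration (the class NullComparison and its methods are not VoltDB code): sqlEquals returns a null Boolean when the result is unknown, while the two DISTINCT methods always return a definite true or false.

```java
import java.util.Objects;

// Hypothetical model, not VoltDB code: a Java null stands in for SQL NULL.
public class NullComparison {
    // Models "a = b": the result is unknown (null) if either operand is null.
    public static Boolean sqlEquals(Object a, Object b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // Models "a IS DISTINCT FROM b": null is treated as an ordinary value.
    public static boolean isDistinctFrom(Object a, Object b) {
        return !Objects.equals(a, b);
    }

    // Models "a IS NOT DISTINCT FROM b".
    public static boolean isNotDistinctFrom(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sqlEquals(null, 5));            // unknown, printed as null
        System.out.println(isDistinctFrom(null, 5));       // one null operand
        System.out.println(isDistinctFrom(null, null));    // both operands null
        System.out.println(isNotDistinctFrom(null, null)); // both operands null
    }
}
```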
When using placeholders in SQL statements involving the IN list expression, you can either do replacement of individual values within the list or replace the list as a whole. For example, consider the following statements:
SELECT * from EMPLOYEE where STATUS IN (?, ?, ?);
SELECT * from EMPLOYEE where STATUS IN ?;
In the first statement, there are three parameters that replace individual values in the IN list, allowing you to specify exactly three selection values. In the second statement the placeholder replaces the entire list, including the parentheses. In this case the parameter to the procedure call must be an array and allows you to change not only the values of the alternatives but the number of criteria considered.
The following Java code fragment demonstrates how these two queries can be used in a stored procedure, resulting in equivalent SQL statements being executed:
String arg1 = "Salary";
String arg2 = "Hourly";
String arg3 = "Parttime";
voltQueueSQL( query1, arg1, arg2, arg3);

String listargs[] = new String[3];
listargs[0] = arg1;
listargs[1] = arg2;
listargs[2] = arg3;
voltQueueSQL( query2, (Object) listargs);
Note that when passing arrays as parameters in Java, it is a good practice to explicitly cast them as an object to avoid the array being implicitly expanded into individual call parameters.
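The effect of the cast can be seen in a small standalone Java sketch. It assumes that the queueing method accepts a variable-length Object... parameter list; countParams is a hypothetical stand-in that only reports how many parameters it received and is not a VoltDB method.

```java
// Hypothetical stand-in for a varargs method; it is not a VoltDB API.
public class VarargsDemo {
    // Reports how many separate parameters the call supplied.
    public static int countParams(Object... params) {
        return params.length;
    }

    public static void main(String[] args) {
        String[] listargs = { "Salary", "Hourly", "Parttime" };

        // Without the cast, the String[] is used as the varargs array itself,
        // so each element becomes a separate parameter (javac may warn here).
        System.out.println(countParams(listargs));

        // With the cast, the whole array is passed as a single parameter.
        System.out.println(countParams((Object) listargs));
    }
}
```

The first call reports three parameters and the second reports one, which is why the cast is needed when a single placeholder is meant to receive the entire list.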
VoltDB supports the use of CASE-WHEN-THEN-ELSE-END for conditional operations. For example, the following SELECT expression uses a CASE statement to return different values based on the contents of the price column:
SELECT Prod_name,
    CASE WHEN price > 100.00
        THEN 'Expensive'
        ELSE 'Cheap'
    END
    FROM products ORDER BY Prod_name;
For more complex conditional operations with multiple alternatives, use of the DECODE() function is recommended.
VoltDB supports both inner and outer joins.
The SELECT statement supports subqueries as a table reference in the FROM clause. Subqueries must be enclosed in parentheses and must be assigned a table alias. Note that subqueries are only supported in the SELECT statement; they cannot be used in data manipulation statements such as UPDATE or DELETE.
You can only join two or more partitioned tables if those tables are partitioned on the same value and joined on equality of the partitioning column. Joining two partitioned tables on non-partitioned columns or on a range of values is not supported. However, there are no limitations on joining to replicated tables.
Extremely large result sets (greater than 50 megabytes in size) are not supported. If you execute a SELECT statement that generates a result set of more than 50 megabytes, VoltDB will return an error.
The SELECT statement can include subqueries. Subqueries are separate SELECT statements, enclosed in parentheses, where the results of the subquery are used as values, expressions, or arguments within the surrounding SELECT statement.
Subqueries, like any SELECT statement, are extremely flexible and can return a wide array of information. A subquery might return:
A single row with a single column — this is sometimes known as a scalar subquery and represents a single value
A single row with multiple columns — this is also known as a row value expression
Multiple rows with one or more columns
In general, VoltDB supports subqueries in the FROM clause, in the selection expression, and in boolean expressions in the WHERE clause or in CASE-WHEN-THEN-ELSE-END operations. However, different types of subqueries are allowed in different situations, depending on the type of data returned.
In the FROM clause, the SELECT statement supports all types of subquery as a table reference. The subquery must be enclosed in parentheses and must be assigned a table alias.
In the selection expression, scalar subqueries can be used in place of a single column reference.
In the WHERE clause and CASE operations, both scalar and non-scalar subqueries can be used as part of boolean expressions. Scalar subqueries can be used in place of any single-valued expression. Non-scalar subqueries can be used in the following situations:
Row value comparisons — Boolean expressions that compare one row value expression to another can use subqueries that resolve to one row with multiple columns. For example:
select * from t1 where (a,c) > (select a, c from t2 where b=t1.b);
IN and EXISTS — Subqueries that return multiple rows can be used as an argument to the IN or EXISTS predicate to determine if a value (or set of values) exists within the rows returned by the subquery. For example:
select * from t1 where a in (select a from t2);
select * from t1 where (a,c) in (select a, c from t2 where b=t1.b);
select * from t1 where c > 3 and exists (select a, b from t2 where a=t1.a);
ANY and ALL — Multi-row subqueries can also be used as the target of an ANY or ALL comparison, using either a scalar or row expression comparison. For example:
select * from t1 where a > ALL (select a from t2);
select * from t1 where (a,c) = ANY (select a, c from t2 where b=t1.b);
Note that subqueries are only supported in the SELECT statement; they cannot be used in data manipulation statements such as UPDATE or DELETE, in CREATE VIEW statements, or in index definitions. Also, VoltDB does not support subqueries in the HAVING, ORDER BY, or GROUP BY clauses.
For the initial release of subqueries in selection and boolean expressions, only replicated tables can be used in the subquery. Both replicated and partitioned tables can be used in subqueries in place of table references in the FROM clause.
VoltDB also supports the set operations UNION, INTERSECT, and EXCEPT. These keywords let you perform set operations on two or more SELECT statements. UNION combines the result sets of the two SELECT statements, INTERSECT includes only those rows that appear in both SELECT statement result sets, and EXCEPT includes only those rows that appear in the first result set but not in the second.
Normally, UNION and INTERSECT provide a set including unique rows. That is, if a row appears in both SELECT results, it only appears once in the combined result set. However, if you include the ALL modifier, all matching rows are included. For example, UNION ALL will result in single entries for the rows that appear in only one of the SELECT results, but two copies of any rows that appear in both.
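The behavior of the set operators can be modeled with Java collections. The sketch below is a hypothetical illustration, not VoltDB code: each string stands for one row, and the class and method names are invented for the example. It assumes the standard SQL reading of EXCEPT, in which rows from the first result are kept only if they are absent from the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not VoltDB code: each string stands for one row.
public class SetOperations {
    // UNION: rows from both results, duplicates removed.
    public static List<String> union(List<String> a, List<String> b) {
        Set<String> rows = new LinkedHashSet<>(a);
        rows.addAll(b);
        return new ArrayList<>(rows);
    }

    // UNION ALL: rows from both results, duplicates kept.
    public static List<String> unionAll(List<String> a, List<String> b) {
        List<String> rows = new ArrayList<>(a);
        rows.addAll(b);
        return rows;
    }

    // INTERSECT: unique rows that appear in both results.
    public static List<String> intersect(List<String> a, List<String> b) {
        Set<String> rows = new LinkedHashSet<>(a);
        rows.retainAll(b);
        return new ArrayList<>(rows);
    }

    // EXCEPT: unique rows of the first result that are absent from the second.
    public static List<String> except(List<String> a, List<String> b) {
        Set<String> rows = new LinkedHashSet<>(a);
        rows.removeAll(b);
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("Smith", "Jones");
        List<String> second = Arrays.asList("Jones", "Lee");
        System.out.println(union(first, second));     // Jones appears once
        System.out.println(unionAll(first, second));  // Jones appears twice
        System.out.println(intersect(first, second)); // only Jones
        System.out.println(except(first, second));    // only Smith
    }
}
```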
The UNION, INTERSECT, and EXCEPT operations obey the same rules that apply to joins:
You cannot perform set operations on SELECT statements that reference the same table.
All tables in the SELECT statements must either be replicated tables or partitioned tables partitioned on the same column value, using equality of the partitioning column in the WHERE clause.
The following example retrieves all of the columns from the EMPLOYEE table where the last name is "Smith":
SELECT * FROM employee WHERE lastname = 'Smith';
The following example retrieves selected columns for two tables at once, joined by the employee_id using an implicit inner join and sorted by last name:
SELECT lastname, firstname, salary
    FROM employee AS e, compensation AS c
    WHERE e.employee_id = c.employee_id
    ORDER BY lastname DESC;
The following example includes both a simple SQL query, defined as a stored procedure in the schema, and a client application that calls the procedure repeatedly. This combination uses the LIMIT and OFFSET clauses to "page" through a large table, 500 rows at a time.
When retrieving very large volumes of data, it is a good idea to use LIMIT and OFFSET to constrain the amount of data in each transaction. However, to perform LIMIT OFFSET queries effectively, the database must include a tree index that encompasses all of the columns of the ORDER BY clause (in this example, the lastname and firstname columns).
Schema:
CREATE PROCEDURE EmpByLimit AS
    SELECT lastname, firstname FROM employee
    WHERE company = ?
    ORDER BY lastname ASC, firstname ASC
    LIMIT 500 OFFSET ?;

PARTITION PROCEDURE EmpByLimit
    ON TABLE Employee COLUMN Company;
Java Client Application:
long offset = 0;
String company = "ACME Explosives";
boolean alldone = false;
while ( ! alldone ) {
    VoltTable results[] = client.callProcedure("EmpByLimit",
                                 company,offset).getResults();
    if (results[0].getRowCount() < 1) {
        // No more records.
        alldone = true;
    } else {
        // do something with the results.
    }
    offset += 500;
}