Experiment 9 New
SELECT
Querying data from tables is done using a SELECT statement:
select_statement::= SELECT [ JSON | DISTINCT ] ( select_clause | '*' )
FROM `table_name`
[ WHERE `where_clause` ]
[ GROUP BY `group_by_clause` ]
[ ORDER BY `ordering_clause` ]
[ PER PARTITION LIMIT (`integer` | `bind_marker`) ]
[ LIMIT (`integer` | `bind_marker`) ]
[ ALLOW FILTERING ]
select_clause::= `selector` [ AS `identifier` ] ( ',' `selector` [ AS `identifier` ] )*
selector::= `column_name`
| `term`
| CAST '(' `selector` AS `cql_type` ')'
| `function_name` '(' [ `selector` ( ',' `selector` )* ] ')'
| COUNT '(' '*' ')'
where_clause::= `relation` ( AND `relation` )*
relation::= column_name operator term
| '(' column_name ( ',' column_name )* ')' operator tuple_literal
| TOKEN '(' column_name ( ',' column_name )* ')' operator term
operator::= '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS KEY
group_by_clause::= column_name ( ',' column_name )*
ordering_clause::= column_name [ ASC | DESC ] ( ',' column_name [ ASC | DESC ] )*
For example:
SELECT name, occupation FROM users WHERE userid IN (199, 200, 207);
SELECT JSON name, occupation FROM users WHERE userid = 199;
SELECT name AS user_name, occupation AS user_occupation FROM users;
The SELECT statement reads one or more columns for one or more
rows in a table. It returns a result set of the rows matching the
request, where each row contains the values for the selection
corresponding to the query.
Additionally, functions, including aggregations, can be applied to
the result.
A SELECT statement contains at least a selection clause and the
name of the table on which the selection is executed. CQL
does not execute joins or sub-queries, and a select statement only
applies to a single table. A select statement can also have a where
clause that further narrows the query results. Additional
clauses can order or limit the results. Lastly, queries that require
filtering across the full cluster can be run by appending ALLOW FILTERING.
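As a rough illustration of selectors other than plain column names, the statements below apply an aggregate and a CAST to the users table from the examples above. This is a sketch that assumes userid is an int column; the aliases user_count and userid_text are made up for illustration.

```cql
// Aggregate selector: counts every row in the table (reads all partitions).
SELECT COUNT(*) AS user_count FROM users;

// CAST selector: returns userid converted to text, next to a regular column.
// Assumes userid is an int column.
SELECT CAST(userid AS text) AS userid_text, name FROM users WHERE userid = 199;
```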
Selection clause
Selectors
Aliases
Every top-level selector can also be aliased (using AS). If so, the
name of the corresponding column in the result set will be that of
the alias. For instance:
// Without alias
SELECT intAsBlob(4) FROM t;
// intAsBlob(4)
// --------------
// 0x00000004
// With alias
SELECT intAsBlob(4) AS four FROM t;
// four
// ------------
// 0x00000004
Currently, aliases aren’t recognized in the WHERE or ORDER BY clauses of the statement. You must
use the original column name instead.
The WHERE clause
A query that restricts posted_at without also restricting blog_title is
not allowed, as it does not select a contiguous set
of rows (and we suppose no secondary indexes are set):
// Needs a blog_title to be set to select ranges of posted_at
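A query of the kind this comment describes might look like the following sketch. It assumes a posts table whose primary key is (userid, blog_title, posted_at), as the later examples suggest; the table definition itself is not shown in this section.

```cql
// Assumed schema: PRIMARY KEY (userid, blog_title, posted_at).
// blog_title is skipped, so the posted_at range does not map to a contiguous
// slice of the partition; the query is rejected unless ALLOW FILTERING is added.
SELECT * FROM posts
WHERE userid = 'john doe'
AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';
```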
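A relation can also apply the TOKEN function to the partition key, in
which case rows are selected by the token of the key rather than by its
value. Ordering partitioners compare token values by bytes (so even if
the partition key is of type int, token(-1) > token(0) in particular).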
For example:
SELECT * FROM posts
WHERE token(userid) > token('tom') AND token(userid) < token('bob');
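Clustering columns can also be grouped in a single relation using the tuple form from the grammar above. A minimal sketch, again assuming posts has the primary key (userid, blog_title, posted_at):

```cql
// Compares the (blog_title, posted_at) pair as a whole, in clustering order.
// Assumed schema: PRIMARY KEY (userid, blog_title, posted_at).
SELECT * FROM posts
WHERE userid = 'john doe'
AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01');
```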
A query with the tuple relation (blog_title, posted_at) > ('John''s Blog', '2012-01-01')
will return all rows that sort after the one having
“John’s Blog” as blog_title and '2012-01-01' for posted_at in the
clustering order. In particular, rows having a posted_at <= '2012-01-01'
will be returned, as long as their blog_title > 'John''s Blog'.
That would not be the case for this example:
SELECT * FROM posts
WHERE userid = 'john doe'
AND blog_title > 'John''s Blog'
AND posted_at > '2012-01-01';
Grouping results
The GROUP BY option can condense all selected rows that share the
same values for a set of columns into a single row.
Using the GROUP BY option, rows can be grouped at the partition
key or clustering column level. Consequently, the GROUP BY option
only accepts primary key columns in defined order as arguments.
If a primary key column is restricted by an equality restriction, it
does not need to be included in the GROUP BY clause.
Aggregate functions will produce a separate value for each group.
If no GROUP BY clause is specified, aggregate functions will
produce a single value for all the rows.
If a column is selected without an aggregate function in a
statement with a GROUP BY, the first value encountered in each group
will be returned.
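For instance, grouping might be used to count posts per blog. This is a sketch that assumes the posts table has the primary key (userid, blog_title, posted_at); the alias post_count is illustrative.

```cql
// One output row per (userid, blog_title) group; COUNT is computed per group.
SELECT userid, blog_title, COUNT(*) AS post_count
FROM posts
GROUP BY userid, blog_title;

// userid is fixed by an equality restriction, so it can be left out of GROUP BY.
SELECT blog_title, COUNT(*) AS post_count
FROM posts
WHERE userid = 'john doe'
GROUP BY blog_title;
```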
Ordering results
The ORDER BY clause selects the order of the returned results. The
argument is a list of column names and each column’s order
(ASC for ascending and DESC for descending). The possible orderings
are limited by the clustering order defined on the table:
- if the table has been defined without any specific CLUSTERING
  ORDER, then the order is as defined by the clustering columns
  or the reverse;
- otherwise, the order is defined by the CLUSTERING ORDER option
  and the reversed one.
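The two permitted orderings can be sketched as follows, assuming a posts table with clustering columns (blog_title, posted_at) and no explicit CLUSTERING ORDER, so that both columns default to ascending:

```cql
// Matches the default clustering order: blog_title ASC, posted_at ASC.
SELECT * FROM posts WHERE userid = 'john doe'
ORDER BY blog_title ASC, posted_at ASC;

// The fully reversed order is also accepted.
SELECT * FROM posts WHERE userid = 'john doe'
ORDER BY blog_title DESC, posted_at DESC;

// A mixed ordering (blog_title ASC, posted_at DESC) is neither the clustering
// order nor its reverse, so it would be rejected for this assumed table.
```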
Limiting results
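The grammar at the top of this section lists two limiting clauses: LIMIT caps the total number of rows a query returns, and PER PARTITION LIMIT caps the number of rows returned from each partition. A short sketch, assuming a posts table partitioned by userid:

```cql
// At most 10 rows in total.
SELECT * FROM posts LIMIT 10;

// At most 2 rows from each userid partition, and at most 10 rows overall.
SELECT * FROM posts PER PARTITION LIMIT 2 LIMIT 10;
```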
Allowing filtering
By default, CQL only allows select queries that don’t involve a full
scan of all partitions. If all partitions are scanned, returning
the results may incur significant latency, proportional to
the amount of data in the table. The ALLOW FILTERING option
explicitly permits such a full scan, so the performance of the query
can be unpredictable.
For example, consider the following table of user profiles with
birth year and country of residence. The birth year has a
secondary index defined.
CREATE TABLE users (
username text PRIMARY KEY,
firstname text,
lastname text,
birth_year int,
country text
);
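To make the scenario concrete, the index and two queries over this table might look as follows. This is a sketch; the literal values 1981 and 'FR' are illustrative.

```cql
// Secondary index on birth_year, as described above.
CREATE INDEX ON users (birth_year);

// Served by the index, so no ALLOW FILTERING is needed.
SELECT * FROM users WHERE birth_year = 1981;

// country is not indexed, so rows found through the index must then be
// filtered; the query is rejected unless ALLOW FILTERING is appended.
SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW FILTERING;
```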