Pypika Documentation
Release 0.18.4
Timothy Heys
1 Abstract
2 Contents
2.1 Installation
2.2 Tutorial
2.3 Advanced Query Features
2.4 Window Frames
2.5 Extending PyPika
2.6 API Reference
4 License
CHAPTER 1
Abstract
What is PyPika?
PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a simple interface for
building SQL queries without limiting the flexibility of handwritten SQL. Designed with data analysis in mind, PyPika
leverages the builder design pattern to construct queries to avoid messy string formatting and concatenation. It is also
easily extended to take full advantage of specific features of SQL database vendors.
CHAPTER 2
Contents
2.1 Installation
PyPika supports Python 2.7 and 3.3+. It may also work on PyPy, Cython, and Jython, but it is not tested on
these implementations.
To install PyPika, run the following command:
pip install pypika
2.2 Tutorial
The entry point for building queries is pypika.Query. In order to select columns from a table, the table must first
be added to the query. For simple queries with only one table, tables and columns can be referenced using strings. For
more sophisticated queries, a pypika.Table must be used.
str(q)
q.get_sql()
Using pypika.Table
customers = Table('customers')
q = Query.from_(customers).select(customers.id, customers.fname, customers.lname, customers.phone)
Arithmetic
Arithmetic expressions can also be constructed using pypika. Operators such as +, -, *, and / are implemented by
pypika.Field which can be used simply with a pypika.Table or directly.
q = Query.from_('account').select(
Field('revenue') - Field('cost')
)
Using pypika.Table
accounts = Table('accounts')
q = Query.from_(accounts).select(
accounts.revenue - accounts.cost
)
q = Query.from_(accounts).select(
(accounts.revenue - accounts.cost).as_('profit')
)
table = Table('table')
q = Query.from_(table).select(
table.foo + table.bar,
table.foo - table.bar,
table.foo * table.bar,
table.foo / table.bar,
(table.foo+table.bar) / table.fiz,
)
Filtering
customers = Table('customers')
q = Query.from_(customers).select(
customers.id, customers.fname, customers.lname, customers.phone
).where(
customers.lname == 'Mustermann'
)
Query methods such as select, where, groupby, and orderby can be called multiple times. Multiple calls to the where
method add further conditions, which are combined with AND.
customers = Table('customers')
q = Query.from_(customers).select(
customers.id, customers.fname, customers.lname, customers.phone
).where(
customers.fname == 'Max'
).where(
customers.lname == 'Mustermann'
)
customers = Table('customers')
q = Query.from_(customers).select(
customers.id,customers.fname
).where(
customers.age[18:65] & customers.status.isin(['new', 'active'])
)
SELECT id,fname FROM customers WHERE age BETWEEN 18 AND 65 AND status IN ('new', 'active')
Complex filter criteria can be built using the boolean operators &, |, and ^.
AND
customers = Table('customers')
q = Query.from_(customers).select(
customers.id, customers.fname, customers.lname, customers.phone
).where(
(customers.age >= 18) & (customers.lname == 'Mustermann')
)
OR
customers = Table('customers')
q = Query.from_(customers).select(
customers.id, customers.fname, customers.lname, customers.phone
).where(
(customers.age >= 18) | (customers.lname == 'Mustermann')
)
XOR
customers = Table('customers')
q = Query.from_(customers).select(
customers.id, customers.fname, customers.lname, customers.phone
).where(
(customers.age >= 18) ^ customers.is_registered
)
Grouping allows for aggregated results and works similarly to SELECT clauses.
from pypika import functions as fn
customers = Table('customers')
q = Query.from_(customers).where(
customers.age >= 18
).groupby(
customers.id
).select(
customers.id, fn.Sum(customers.revenue)
)
After adding a GROUP BY clause to a query, the HAVING clause becomes available. The method Query.having()
takes a Criterion parameter similar to the method Query.where().
from pypika import functions as fn
payments = Table('payments')
Tables and subqueries can be joined to any query using the Query.join() method. Joins can be performed with
either a USING or an ON clause. The USING clause can be used when both tables/subqueries contain the same field, and
the ON clause can be used with a criterion. To perform a join, ...join() can be chained, but it must then be followed
immediately by ...on(<criterion>) or ...using(*field).
As a shortcut, the Query.join().on_field() function is provided for joining the (first) table in the FROM
clause with the joined table when the field name(s) are the same in both tables.
Unions
Both UNION and UNION ALL are supported. UNION DISTINCT is synonymous with UNION, so PyPika
does not provide a separate function for it. Unions require that the queries select the same number of columns, so
casting a unioned query to a string raises a UnionException if the column counts are mismatched.
To create a union query, use either the Query.union() method or + operator with two query instances. For a union
all, use Query.union_all() or the * operator.
Using pypika.Interval, queries can be constructed with date arithmetic. Any combination of intervals can be
used except for weeks and quarters, which must each be used on their own; any other values are ignored when one of
them is given.
fruits = Table('fruits')
Tuples
Tuples are supported through the class pypika.Tuple and also through the native Python tuple wherever possible.
Tuples can be used with pypika.Criterion in WHERE clauses for pairwise comparisons.
q = Query.from_(self.table_abc) \
.select(self.table_abc.foo, self.table_abc.bar) \
.where(Tuple(self.table_abc.foo, self.table_abc.bar) == Tuple(1, 2))
Using pypika.Tuple on both sides of the comparison is redundant, since PyPika also accepts native Python tuples.
q = Query.from_(self.table_abc) \
.select(self.table_abc.foo, self.table_abc.bar) \
.where(Tuple(self.table_abc.foo, self.table_abc.bar) == (1, 2))
Query.from_(self.table_abc) \
.select(self.table_abc.foo, self.table_abc.bar) \
.where(Tuple(self.table_abc.foo, self.table_abc.bar).isin([(1, 1), (2, 2), (3, 3)]))
String Functions
There are several string operations and function wrappers included in PyPika. Function wrappers can be found in the
pypika.functions package. In addition, LIKE and REGEX queries are supported as well.
customers = Table('customers')
q = Query.from_(customers).select(
customers.id,
customers.fname,
customers.lname,
).where(
customers = Table('customers')
q = Query.from_(customers).select(
customers.id,
customers.fname,
customers.lname,
).where(
customers.lname.regex(r'^[abc][a-zA-Z]+$')
)
customers = Table('customers')
q = Query.from_(customers).select(
customers.id,
fn.Concat(customers.fname, ' ', customers.lname).as_('full_name'),
)
Case Statements
Case statements allow for a number of conditions to be checked sequentially and return a value for the first condition
met, or otherwise a default value. The Case object can be used to chain conditions together along with their output
using the when method and to set the default value using else_.
customers = Table('customers')
q = Query.from_(customers).select(
customers.id,
Case()
.when(customers.fname == "Tom", "It was Tom")
.when(customers.fname == "John", "It was John")
.else_("It was someone else.").as_('who_was_it')
)
SELECT "id",CASE WHEN "fname"='Tom' THEN 'It was Tom' WHEN "fname"='John' THEN 'It was John' ELSE 'It was someone else.' END "who_was_it" FROM "customers"
Data can be inserted into tables either by providing the values in the query or by selecting them through another query.
By default, data can be inserted by providing values for all columns in the order that they are defined in the table.
customers = Table('customers')
Multiple rows of data can be inserted either by chaining the insert function or passing multiple tuples as args.
customers = Table('customers')
customers = Table('customers')
customers = Table('customers')
q = Query.into(customers)\
.insert(1, 'Jane', 'Doe', 'jane@example.com')\
.on_duplicate_key_update(customers.email, Values(customers.email))
.on_duplicate_key_update works similarly to .set for updating rows. In addition, it provides the Values
wrapper, which updates a column to the value specified in the INSERT clause.
To specify the columns and the order, use the columns function.
customers = Table('customers')
Inserting data with a query works the same as querying data with the additional call to the into method in the builder
chain.
q = Query.into(customers_backup).from_(customers).select('*')
The syntax for joining tables is the same as when selecting data
.from_(customers)
.join(orders).on(orders.customer_id == customers.id)
.select(orders.id, customers.fname, customers.lname)
customers = Table('customers')
The syntax for joining tables is the same as when selecting data
Query.update(customers) \
.join(profiles).on(profiles.customer_id == customers.id) \
.set(customers.lname, profiles.lname)
UPDATE "customers"
JOIN "profiles" ON "profiles"."customer_id"="customers"."id"
SET "customers"."lname"="profiles"."lname"
This section covers the range of functions that are not widely standardized across all SQL databases or meet special
needs. PyPika intends to support as many features across different platforms as possible. If there are any features
specific to a certain platform that PyPika does not support, please create a GitHub issue requesting that it be added.
There can sometimes be differences between how database vendors implement SQL in their platform, for example
which quote characters are used. To ensure that the correct SQL standard is used for your platform, the platform-
specific Query classes can be used.
from pypika import MySQLQuery, MSSQLQuery, PostgreSQLQuery, OracleQuery, VerticaQuery
You can use these query classes as a drop in replacement for the default Query class shown in the other examples.
Again, if you encounter any issues specific to a platform, please create a GitHub issue on this repository.
The ROLLUP modifier allows for aggregating to higher levels than the given groups, called super-aggregates.
from pypika import Rollup, functions as fn
products = Table('products')
query = Query.from_(products) \
.select(products.id, products.category, fn.Sum(products.price)) \
.rollup(products.id, products.category)
The package pypika.analytic contains analytic function wrappers. These can be used in SELECT clauses when
building queries for databases that support them. Different functions have different arguments but all require some
sort of partitioning.
The NTILE function requires a constant integer argument while the RANK function takes no arguments.
from pypika import Query, Table, analytics as an, functions as fn
total_sales = fn.Sum(store_sales_fact.sales_quantity).as_('TOTAL_SALES')
calendar_month_name = date_dimension.calendar_month_name.as_('MONTH')
ntile = an.NTile(4).orderby(total_sales).as_('NTILE')
query = Query.from_(store_sales_fact) \
FIRST_VALUE and LAST_VALUE both expect a single argument. They also support an additional IGNORE NULLS
clause.
t_month = Table('t_month')
first_month = an.FirstValue(t_month.month) \
.over(t_month.season) \
.orderby(t_month.id)
last_month = an.LastValue(t_month.month) \
.over(t_month.season) \
.orderby(t_month.id) \
.ignore_nulls()
query = Query.from_(t_month) \
.select(first_month, last_month)
customer_dimension = Table('customer_dimension')
median_income = an.Median(customer_dimension.annual_income) \
.over(customer_dimension.customer_state).as_('MEDIAN')
avg_income = an.Avg(customer_dimension.annual_income) \
.over(customer_dimension.customer_state).as_('AVG')
stddev_income = an.StdDev(customer_dimension.annual_income) \
.over(customer_dimension.customer_state).as_('STDDEV')
query = Query.from_(customer_dimension) \
.select(median_income, avg_income, stddev_income) \
Functions which use window aggregation expose the methods rows() and range(), which define the window frame.
Both take one or two parameters that specify the offset boundaries. A boundary can be the current row, given as
an.CURRENT_ROW, or a value preceding or following the current row, given as an.Preceding(constant_value) or
an.Following(constant_value). To leave the frame unbounded before or after the current row, omit the
constant_value parameter, as in an.Preceding() or an.Following().
FIRST_VALUE and LAST_VALUE also support window frames.
t_transactions = Table('t_customers')
rolling_7_sum = an.Sum(t_transactions.total) \
.over(t_transactions.item_id) \
.orderby(t_transactions.day) \
.rows(an.Preceding(7), an.CURRENT_ROW)
query = Query.from_(t_transactions) \
.select(rolling_7_sum)
PyPika can be extended with features that it does not provide out of the box.
Adding functions can be achieved by extending pypika.Function.
WRITEME
pypika.enums module
get_sql(**kwargs)
class pypika.enums.SqlTypeLength(name, length)
get_sql(**kwargs)
class pypika.enums.SqlTypes
pypika.functions module
pypika.queries module
validate(_from, _joins)
class pypika.queries.JoinOn(item, how, criteria)
Bases: pypika.queries.Join
get_sql(**kwargs)
validate(_from, _joins)
class pypika.queries.JoinUsing(item, how, fields)
Bases: pypika.queries.Join
get_sql(**kwargs)
validate(_from, _joins)
class pypika.queries.Joiner(query, item, how, type_label)
Bases: object
cross()
Return cross join
on(criterion)
on_field(*fields)
using(*fields)
class pypika.queries.Query
Bases: object
Query is the primary class and entry point in pypika. It is used to build queries iteratively using the builder
design pattern.
This class is immutable.
classmethod from_(table)
Query builder entry point. Initializes query building and sets the table to select from. When using this
function, the query becomes a SELECT query.
Parameters table – Type: Table or str
An instance of a Table object or a string table name.
Returns QueryBuilder
classmethod into(table)
Query builder entry point. Initializes query building and sets the table to insert into. When using this
function, the query becomes an INSERT query.
Parameters table – Type: Table or str
An instance of a Table object or a string table name.
Returns QueryBuilder
classmethod select(*terms)
Query builder entry point. Initializes query building without a table and selects fields. Useful when testing
SQL functions.
Parameters terms – Type: list[expression]
A list of terms to select. These can be any type of int, float, str, bool, or Term. They cannot
be a Field unless the function Query.from_ is called first.
Returns QueryBuilder
classmethod update(table)
Query builder entry point. Initializes query building and sets the table to update. When using this function,
the query becomes an UPDATE query.
Parameters table – Type: Table or str
An instance of a Table object or a string table name.
Returns QueryBuilder
classmethod with_(table, name)
class pypika.queries.QueryBuilder(quote_char='"', dialect=None, wrap_union_queries=True,
wrapper_cls=<class 'pypika.terms.ValueWrapper'>)
Bases: pypika.queries.Selectable, pypika.terms.Term
Query Builder is the main class in pypika which stores the state of a query and offers functions which allow the
state to be branched immutably.
columns(*args, **kwargs)
delete(*args, **kwargs)
distinct(*args, **kwargs)
do_join(join)
fields()
from_(*args, **kwargs)
get_sql(with_alias=False, subquery=False, **kwargs)
groupby(*args, **kwargs)
having(*args, **kwargs)
ignore(*args, **kwargs)
insert(*args, **kwargs)
into(*args, **kwargs)
join(*args, **kwargs)
limit(*args, **kwargs)
offset(*args, **kwargs)
orderby(*args, **kwargs)
prewhere(*args, **kwargs)
rollup(*args, **kwargs)
select(*args, **kwargs)
set(*args, **kwargs)
union(*args, **kwargs)
union_all(*args, **kwargs)
update(*args, **kwargs)
where(*args, **kwargs)
with_(*args, **kwargs)
with_totals(*args, **kwargs)
get_sql(quote_char=None, **kwargs)
class pypika.queries.Selectable(alias)
Bases: object
field(name)
star
class pypika.queries.Table(name, schema=None, alias=None)
Bases: pypika.queries.Selectable
get_sql(quote_char=None, **kwargs)
pypika.queries.make_tables(*names, **kwargs)
pypika.terms module
fields()
for_(*args, **kwargs)
get_sql(with_alias=False, **kwargs)
is_aggregate
tables_
class pypika.terms.BetweenCriterion(term, start, end, alias=None)
Bases: pypika.terms.Criterion
fields()
for_(*args, **kwargs)
get_sql(**kwargs)
tables_
class pypika.terms.Case(alias=None)
Bases: pypika.terms.Term
else_(*args, **kwargs)
fields()
get_sql(with_alias=False, **kwargs)
is_aggregate
tables_
when(*args, **kwargs)
class pypika.terms.ComplexCriterion(comparator, left, right, alias=None)
Bases: pypika.terms.BasicCriterion
fields()
get_sql(subcriterion=False, **kwargs)
needs_brackets(term)
class pypika.terms.ContainsCriterion(term, container, alias=None)
Bases: pypika.terms.Criterion
fields()
get_sql(**kwargs)
negate()
tables_
class pypika.terms.Criterion(alias=None)
Bases: pypika.terms.Term
fields()
get_sql()
class pypika.terms.Field(name, alias=None, table=None)
Bases: pypika.terms.Criterion
fields()
for_(*args, **kwargs)
fields()
for_(*args, **kwargs)
get_sql(**kwargs)
tables_
class pypika.terms.NullValue(alias=None)
Bases: pypika.terms.Term
fields()
get_sql(quote_char=None, **kwargs)
class pypika.terms.Pow(term, exponent, alias=None)
Bases: pypika.terms.Function
class pypika.terms.Psuedocolumn(name)
Bases: pypika.terms.Term
Represents a pseudocolumn
to_sql(**kwargs)
class pypika.terms.Rollup(*terms)
Bases: pypika.terms.Function
class pypika.terms.Star(table=None)
Bases: pypika.terms.Field
get_sql(with_alias=False, with_namespace=False, quote_char=None, **kwargs)
class pypika.terms.Term(alias=None)
Bases: object
as_(*args, **kwargs)
between(lower, upper)
bin_regex(pattern)
eq(other)
fields()
for_(table)
Replaces the tables of this term for the table parameter provided. The base implementation returns self
because not all terms have a table property.
Parameters table – The table to replace with.
Returns Self.
get_sql()
gt(other)
gte(other)
ilike(expr)
is_aggregate = False
isin(arg)
isnull()
like(expr)
lt(other)
lte(other)
ne(other)
negate()
not_ilike(expr)
not_like(expr)
notin(arg)
notnull()
regex(pattern)
tables_
wrap_constant(val)
Used for wrapping raw inputs such as numbers in Criterions and Operator.
For example, the expression F('abc') + 1 stores the integer part in a ValueWrapper object.
Parameters val – Any value.
Returns Raw string, number, or decimal values will be returned in a ValueWrapper. Fields and
other parts of the querybuilder will be returned as inputted.
class pypika.terms.Tuple(*values)
Bases: pypika.terms.Term
fields()
get_sql(**kwargs)
class pypika.terms.ValueWrapper(value, alias=None)
Bases: pypika.terms.Term
fields()
get_sql(quote_char=None, **kwargs)
is_aggregate = None
class pypika.terms.Values(field)
Bases: pypika.terms.Term
get_sql(quote_char=None, **kwargs)
class pypika.terms.WindowFrameAnalyticFunction(name, *args, **kwargs)
Bases: pypika.terms.AnalyticFunction
class Edge(value=None)
get_frame_sql()
get_partition_sql(**kwargs)
range(*args, **kwargs)
rows(*args, **kwargs)
pypika.utils module
exception pypika.utils.CaseException
Bases: exceptions.Exception
exception pypika.utils.DialectNotSupported
Bases: exceptions.Exception
exception pypika.utils.GroupingException
Bases: exceptions.Exception
exception pypika.utils.JoinException
Bases: exceptions.Exception
exception pypika.utils.QueryException
Bases: exceptions.Exception
exception pypika.utils.RollupException
Bases: exceptions.Exception
exception pypika.utils.UnionException
Bases: exceptions.Exception
pypika.utils.alias_sql(sql, alias, quote_char=None)
pypika.utils.builder(func)
Decorator for wrapper “builder” functions. These are functions on the Query class or other classes used for
building queries which mutate the query and return self. To make the build functions immutable, this decorator
is used which will deepcopy the current instance. This decorator will return the return value of the inner function
or the new copy of the instance. The inner function does not need to return self.
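The copy-then-mutate behaviour described above can be illustrated with a small standard-library sketch. This is a hypothetical re-implementation for illustration only, not PyPika's source, and TinyQuery is an invented stand-in class.

```python
import copy
import functools


def builder(func):
    """Apply func to a deep copy of the instance and return that copy."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        self_copy = copy.deepcopy(self)
        result = func(self_copy, *args, **kwargs)
        # Return the inner function's value if it produced one,
        # otherwise return the modified copy.
        return self_copy if result is None else result
    return wrapper


class TinyQuery:
    """Toy stand-in for a query builder (hypothetical)."""

    def __init__(self):
        self.columns = []

    @builder
    def select(self, *names):
        # Mutates the copy made by the decorator, never the original.
        self.columns.extend(names)


base = TinyQuery()
q1 = base.select('id')
q2 = q1.select('fname')
print(base.columns, q1.columns, q2.columns)
```

With this design each builder method can be written as a simple in-place mutation, while callers still see every call return a fresh, independent object.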
pypika.utils.format_quotes(value, quote_char)
pypika.utils.ignore_copy(func)
Decorator for wrapping the __getattr__ function for classes that are copied via deepcopy. This prevents infinite
recursion caused by deepcopy looking for magic functions in the class. Any class implementing __getattr__ that
is meant to be deepcopy’d should use this decorator.
deepcopy is used by pypika in builder functions (decorated by @builder) to make the results immutable. Any
data model type class (stored in the Query instance) is copied.
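The recursion problem and its guard can be shown with a standard-library sketch. This is a simplified, hypothetical version that refuses every double-underscore name, which may be broader than what PyPika actually checks; TinyTable is an invented stand-in class.

```python
import copy
import functools


def ignore_copy(func):
    """Make a __getattr__ implementation refuse special (dunder) names."""
    @functools.wraps(func)
    def wrapper(self, name):
        # deepcopy probes for names such as __deepcopy__ and __setstate__,
        # sometimes on an instance whose __dict__ is still empty.
        if name.startswith('__') and name.endswith('__'):
            raise AttributeError(name)
        return func(self, name)
    return wrapper


class TinyTable:
    """Toy table whose unknown attributes are treated as column names."""

    def __init__(self, name):
        self.name = name

    @ignore_copy
    def __getattr__(self, column):
        # Reading self.name here would recurse forever on a half-built
        # copy if the guard above did not reject the probing names first.
        return '{}.{}'.format(self.name, column)


table = TinyTable('customers')
clone = copy.deepcopy(table)
print(clone.fname)
```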
pypika.utils.resolve_is_aggregate(values)
Resolves the is_aggregate flag for an expression that contains multiple terms. This works like a voter system,
each term votes True or False or abstains with None.
Parameters values – A list of booleans (or None) for each term in the expression
Returns If every value is True or None and at least one is True, True is returned. If all values are None, None is
returned. Otherwise, False is returned.
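The voting rule can be written out as a short sketch. This is an illustrative implementation of the behaviour described above, not necessarily the library's own code.

```python
def resolve_is_aggregate(values):
    """Combine per-term votes: True, False, or None for an abstention."""
    votes = [value for value in values if value is not None]
    if not votes:
        # Every term abstained.
        return None
    # Aggregate only if every term that voted said True.
    return all(votes)


all_true = resolve_is_aggregate([True, None, True])
all_abstain = resolve_is_aggregate([None, None])
mixed = resolve_is_aggregate([True, False, None])
print(all_true, all_abstain, mixed)
```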
Module contents
PyPika is divided into a couple of modules, primarily the queries and terms modules.
pypika.queries
This is where the Query class can be found which is the core class in PyPika. Also, other top level classes such as
Table can be found here. Query is a container that holds all of the Term types together and also serializes the
builder to a string.
pypika.terms
This module contains the classes which represent individual parts of queries that extend the Term base class.
pypika.functions
pypika.enums
Enumerated values are kept in this package which are used as options for Queries and Terms.
pypika.utils
This contains all of the utility classes such as exceptions and decorators.
CHAPTER 3
• genindex
• modindex
License