Module 14: Columnar Storage and Vectorized Execution
only require a few columns. For example, in this query, we only need the
timestamp
and wind speed columns. With row storage, we need to read seven pages
to
temperature values for records 1-40, and so on. In our weather dataset
query, we need only two columns,
timestamp and temperature. With this columnar
are stored together. But with column storage, each column is stored
best supported by row storage is accessing full rows, and the access pattern
dataset looks like this. Temperature, humidity, and wind speed all vary
over time. With row storage, we write entire records sequentially
read all the columns, even though only the timestamp and temperature
columns are required. Since all the columns are read, this approach incurs
selective filter, running the same query using columnar storage format is
30 times faster than with
row storage format. This is mainly because we read four times fewer
pages from
of a database system.
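As a rough illustration of the page-count argument above, here is a small Python sketch; the page size, record count, and four-column schema are my own assumptions for illustration, not the lecture's exact numbers:

```python
# Hedged sketch: compare how many fixed-size "pages" a two-column query
# touches under row vs. columnar layout. The figures (4 fields per record,
# 40 values per page, 280 records) are illustrative assumptions.

FIELDS = ["timestamp", "temperature", "humidity", "wind_speed"]

def pages_row_storage(num_records, num_fields, values_per_page):
    # Row storage interleaves all fields, so every page holds full records
    # and a query must scan them all, even if it needs only two columns.
    total_values = num_records * num_fields
    return -(-total_values // values_per_page)  # ceiling division

def pages_columnar(num_records, needed_fields, values_per_page):
    # Columnar storage keeps each column contiguous; we read only the
    # pages of the columns the query actually references.
    per_column = -(-num_records // values_per_page)
    return per_column * len(needed_fields)

row_pages = pages_row_storage(280, len(FIELDS), 40)
col_pages = pages_columnar(280, ["timestamp", "temperature"], 40)
print(row_pages, col_pages)  # 28 vs. 14: half the I/O for a 2-of-4 column query
```

With more columns in the table (or a more selective projection), the gap widens further, which is the effect the benchmark numbers above are measuring.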
reduces storage size because the deltas are typically much smaller
numbers
data types in each row, making it difficult to find repeating patterns that
compression algorithms
each column contains a single data type and is likely to have high
data redundancy. For example, product IDs often repeat a lot across
IDs or country names. The first compression algorithm we will see is Delta
encoding, which is useful for numeric data where values change
necessary for each value. It is suitable for columns with small ranges like
in Celsius: 35, 40, 45, and 50, which fit in a six-bit range. Bit packing then
computes the total bits required and determines the number of bytes needed to store
required number of bits, checking if each bit is set in the packed data using
bitwise operations on each bit in the value.
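The delta-encoding and bit-packing steps described above can be sketched as follows; the helper names (delta_encode, bit_pack, and so on) are mine, and the six-bit temperature example follows the lecture's values:

```python
# Hedged sketch of delta encoding followed by bit packing for a numeric
# column. Function names and the packing layout are illustrative assumptions.

def delta_encode(values):
    # Store the first value, then only the (typically small) differences.
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

def bit_pack(values, bits):
    # Pack each non-negative value into `bits` bits of one integer buffer.
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * bits)
    return packed

def bit_unpack(packed, bits, count):
    mask = (1 << bits) - 1
    return [(packed >> (i * bits)) & mask for i in range(count)]

temps = [35, 40, 45, 50]      # six bits suffice for the raw values
deltas = delta_encode(temps)  # [35, 5, 5, 5] -> deltas need even fewer bits
packed = bit_pack(temps, 6)
assert bit_unpack(packed, 6, 4) == temps
assert delta_decode(deltas) == temps
print(deltas)
```

After delta encoding, the differences can themselves be bit packed with a smaller width than the raw values, which is where the storage savings compound.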
categorical values like product IDs in sales data. It assigns shorter codes
to more frequently occurring values
only occurs ten times, so it gets a longer code of 111. Let's look at the
called build_codes, which assigns binary codes to each character based on
from zero through 255. During compression, we loop through all the
product
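A textbook version of the Huffman construction discussed above might look like this in Python; the build_codes name comes from the lecture, but the implementation details (heap-based merging, the example product IDs and their frequencies) are my own assumptions:

```python
# Hedged illustration of Huffman coding for a categorical column; a standard
# textbook implementation, not necessarily the lecture's exact routine.
import heapq
from collections import Counter

def build_codes(column):
    # Count frequencies, then repeatedly merge the two rarest subtrees,
    # prefixing "0" to one side's codes and "1" to the other's.
    freq = Counter(column)
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate single-symbol column
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)  # tiebreaker so dicts are never compared
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

column = ["P1"] * 50 + ["P2"] * 30 + ["P3"] * 10  # P1 is most frequent
codes = build_codes(column)
# The most frequent ID gets the shortest code, so the encoded column shrinks.
assert len(codes["P1"]) < len(codes["P3"])
encoded_bits = sum(len(codes[v]) for v in column)
print(encoded_bits)
```

Here the 90 values encode in 130 bits (P1 gets a one-bit code, P2 and P3 two-bit codes), versus 180 bits for a fixed two-bit dictionary code.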
techniques. The query result, total sales, remains consistent across all
the query takes 7.79 milliseconds as raw product IDs require full string
comparisons. With Huffman encoding,
number of bits. For example, four bits is enough for temperature data
memory utilization.
tuple independently. First, each tuple will trigger function calls between
compressed data stored in formats like bit-packed columns. The core SIMD
operations involve
vcgeq_s32, compare greater than or equal, and vcleq_s32, compare less than
or equal. We combine these
are greater than 20, and vcleq_s32 to see which
four elements at a time. This parallel approach maps well to tasks like
filtering
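The compare-and-combine pattern above can be emulated lane by lane in plain Python; this is only an illustration of the mask logic, since real code would use the SIMD intrinsics themselves:

```python
# Pure-Python emulation (an assumption for illustration, not real SIMD) of
# comparing four lanes at once and AND-ing the two compare masks.

def vcge(vec, threshold):
    # Per-lane "greater than or equal" compare -> 0/1 mask per lane.
    return [1 if x >= threshold else 0 for x in vec]

def vcle(vec, threshold):
    # Per-lane "less than or equal" compare.
    return [1 if x <= threshold else 0 for x in vec]

def vand(mask_a, mask_b):
    # A lane passes the filter only if both compares set its mask bit.
    return [a & b for a, b in zip(mask_a, mask_b)]

values = [18, 21, 32, 40]
mask = vand(vcge(values, 20), vcle(values, 32))
print(mask)  # [0, 1, 1, 0]: only 21 and 32 satisfy 20 <= x <= 32
```

One pair of compare instructions plus one AND filters four values, instead of four separate branchy scalar comparisons.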
process large datasets. If you have N data points, scalar processing would
push through the pipeline, the less time the CPU wastes waiting on these
usage and more cache misses. Overall, columnar plus SIMD leads to
efficient use
keeping the pipeline busy. SIMD concepts actually trace back to the early
parallel operation across pixel arrays. In the early 2000s, SIMD became a
permanent fixture
to gaming physics. From 2010 onwards, we have seen even more powerful
SIMD
and compression. Systems like Apache Arrow are designed with columnar
data storage format in mind and exemplify how software can exploit
SIMD for a sensor data set. By splitting out the timestamps and
temperatures into
for vector loads. Aligning the arrays with SIMD-friendly boundaries also
helps in preventing
data using SIMD. Our task is to compute the average temperature within a
specified timestamp range. In a scalar implementation, we would check
two separate registers. These two functions load four 32-bit integers
loaded timestamps against our desired range. The first instruction checks
if each element is greater than
only the elements that satisfy both conditions. This mask will guide which
temperature values
we will add into our sum. Once we have the mask, we apply it on the
any leftover elements if our total count isn't a multiple of four in a scalar
loop to ensure
vectorized portion with a simple scalar tail case is a common pattern for
data type, integer or float, and the vector width. For example, vld1q_s32
represents loading a vector
of four 32-bit signed integers, and vaddq_f32 signifies a floating-point addition
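Putting the pieces together, here is a scalar Python emulation of the masked-average pattern walked through above, with a four-lane "vectorized" loop and a scalar tail; the comments note what a real NEON implementation would use, and the data values are my own example:

```python
# Hedged sketch: compute the average temperature within a timestamp range,
# processing four lanes at a time with a mask, then a scalar tail. Real code
# would use intrinsics like vld1q_s32 / vcgeq_s32 / vaddq_f32.

def avg_temp_in_range(timestamps, temps, lo, hi):
    total, count, i = 0.0, 0, 0
    n = len(timestamps)
    while i + 4 <= n:                        # vectorized portion, 4 lanes
        ts = timestamps[i:i + 4]             # ~ vld1q_s32: load 4 timestamps
        mask = [1 if lo <= t <= hi else 0    # ~ vcgeq_s32 AND vcleq_s32
                for t in ts]
        # Masked accumulate: lanes outside the range contribute zero.
        total += sum(m * temps[i + j] for j, m in enumerate(mask))
        count += sum(mask)
        i += 4
    while i < n:                             # scalar tail for leftovers
        if lo <= timestamps[i] <= hi:
            total += temps[i]
            count += 1
        i += 1
    return total / count if count else None

ts = [10, 20, 30, 40, 50, 60, 70]
temps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(avg_temp_in_range(ts, temps, 20, 60))  # averages 2.0..6.0 -> 4.0
```

Note how the tail loop handles the three leftover elements that do not fill a full four-lane vector, mirroring the scalar tail case described above.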
query processing.
operate from the ground up, you have gained a good eye for detail and a
strong
ideas from this course? Database systems are awesome; they are at the
heart of solving real-world problems
Google search or ChatGPT query abstracts away the complexities of vast
of these disciplines.