PostgreSQL For IoT
[email protected] http://intrbiz.com
Hello!
● I’m Chris
○ IT jack of all trades, studied Electronic Engineering
● Been using PostgreSQL for about 15 years
● Very much into Open Source
○ Started Bergamot Monitoring - open distributed monitoring
● Worked on various PostgreSQL systems
○ Connected TV Set top boxes
○ Smart energy meter analytics
○ IoT Kanban Board
○ IoT CHP Engines
○ Mixes of OLTP and OLAP workloads
○ Scaled PostgreSQL in various ways for various situations
IoT
One size fits all?
Time series databases
Why PostgreSQL?
● PostgreSQL makes it easy to combine your time series data with other data
○ You know: a join!
● What is the average energy consumption in Shropshire?
● What is the average energy consumption for 4 bed houses during the summer?
● What are the average, min and max energy consumption for 4 bed houses in Shropshire during summer, per half hourly period?
● What is the average energy consumption for houses within x miles of my house?
"Where you must go; where the path of the One ends."
● ESP-32
○ Dual core 32-bit @ up to 240 MHz
○ 520 KiB SRAM (data and instruction)
○ Typically 4MiB SPI Flash ROM
○ WiFi, TCP/IP stack
○ Runs FreeRTOS
"Where you must go; where the path of the One ends."
● Some devices can be pretty powerful, with good RAM and storage
● Smart Home Hub
○ Single Core 1GHz ARM Cortex-A8
○ 512 MiB RAM
○ 4 GiB Flash eMMC Storage
○ WiFi + Ethernet
○ Zigbee
○ Runs Linux
"Where you must go; where the path of the One ends."
● Industrial Control
○ Single Core 200MHz ARM7
○ 128 MiB RAM
○ >8GB SD Card
○ Ethernet
○ Lots of CAN
○ Runs an RTOS, hard real time
○ Doing other very important things
Collecting Data
Collecting Data - Device ←→ Platform
Storing Data
Storing Data - Range Types
Storing Data - Metadata
Storing Data - Rolling On Up
Tuple header, bytes per field:
t_xmin  t_xmax  t_cid / t_xvac  t_ctid  t_infomask2  t_infomask  t_hoff
4       4       4 (shared)      6       2            2           1
= 23 bytes, padded to 24 bytes
Row data, bytes per column: 16 + 8 + 4 + 4 = 32 bytes
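The byte counts above can be turned into a per-row overhead figure. The following is a minimal arithmetic sketch: the 24-byte header and 32-byte payload come from the slide, while the class name, the method name, and the idea of several readings sharing one row are assumptions made for illustration. It ignores the 4-byte line pointer, page headers, and any extra cost of storing several readings in one row.

```java
final class TupleOverhead {
    static final int HEADER_BYTES = 24;   // tuple header size from the slide
    static final int READING_BYTES = 32;  // 16 + 8 + 4 + 4 from the slide

    // Fraction of a stored row taken up by the tuple header when
    // readingsPerRow readings share a single row (hypothetical roll-up;
    // assumes the payload simply scales with the number of readings).
    static double headerFraction(int readingsPerRow) {
        if (readingsPerRow < 1) {
            throw new IllegalArgumentException("readingsPerRow must be >= 1");
        }
        double payload = (double) READING_BYTES * readingsPerRow;
        return HEADER_BYTES / (HEADER_BYTES + payload);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 48}) {
            System.out.printf("%d reading(s) per row: %.1f%% header%n",
                    n, 100 * headerFraction(n));
        }
    }
}
```

With one reading per row the header accounts for 24 of every 56 bytes, roughly 43%, which is why rolling readings up can be worth considering.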
Loading Data
Loading Data - Batching
● Load in batches
● Don’t use autocommit
● Batching ramps up fast:
○ Autocommit: 300 /s
○ Batch of 10: 2.2k /s
○ Batch of 50: 5.5k /s
○ Batch of 100: 6k /s
○ Batch of 300: 8k /s
● Batching gives a ~20x performance gain
Loading Data - Batching
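// Group the whole batch into one transaction instead of one commit per row
connection.setAutoCommit(false);
try {
    try (PreparedStatement stmt = connection.prepareStatement("INSERT INTO ....")) {
        for (T record : batch) {
            stmt.setString(1, record.getId().toString());
            stmt.setTimestamp(2, record.getTimestamp());
            stmt.setFloat(3, record.getTemperature());
            stmt.addBatch();     // queue the row on the client
        }
        stmt.executeBatch();     // send all queued rows together
    }
    connection.commit();
} catch (SQLException e) {
    connection.rollback();       // discard the partial batch
    throw e;                     // surface the failure rather than swallowing it
} finally {
    connection.setAutoCommit(true);
}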
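The loop above assumes the records have already been grouped into a batch. The sketch below shows one way to cut a larger list into fixed-size batches; the class and method names are hypothetical, and the batch size of 300 is simply the largest size quoted in the figures earlier.

```java
import java.util.ArrayList;
import java.util.List;

final class Batches {
    // Split records into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < records.size(); from += batchSize) {
            int to = Math.min(from + batchSize, records.size());
            // Copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(records.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> readings = new ArrayList<>();
        for (int i = 0; i < 650; i++) {
            readings.add(i);
        }
        for (List<Integer> batch : split(readings, 300)) {
            // In a real loader each batch would be one transaction
            System.out.println("would insert and commit " + batch.size() + " rows");
        }
    }
}
```

Each returned batch would then be inserted and committed as a single transaction.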
Loading Data - Comparing Loading Methods
Loading Data - Copy Performance
Loading Data - ON CONFLICT
● Use ON CONFLICT
● Your data will be crap
○ Duplicate PKs
○ Out of order
● Nothing worse than having your batch abort
○ Need to deal with savepoints, application buffers
○ Gets rather complex
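As a rough illustration of the point above, a batch insert can skip duplicate rows instead of aborting. This sketch makes several assumptions that are not on the slide: a unique constraint on (device_id, read_at), the column list of iot.alhex_reading, the PostgreSQL JDBC driver (which accepts a UUID through setObject), and the hypothetical ReadingLoader and Reading classes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;
import java.util.UUID;

final class ReadingLoader {
    // Assumes a unique constraint or primary key on (device_id, read_at)
    static final String INSERT_SQL =
          "INSERT INTO iot.alhex_reading (device_id, read_at, temperature, light) "
        + "VALUES (?, ?, ?, ?) "
        + "ON CONFLICT (device_id, read_at) DO NOTHING";

    // Hypothetical value class for a single reading
    static final class Reading {
        final UUID deviceId;
        final Timestamp readAt;
        final float temperature;
        final float light;

        Reading(UUID deviceId, Timestamp readAt, float temperature, float light) {
            this.deviceId = deviceId;
            this.readAt = readAt;
            this.temperature = temperature;
            this.light = light;
        }
    }

    // Insert one batch in one transaction; duplicate keys are skipped by
    // the ON CONFLICT clause rather than aborting the whole batch.
    static void load(Connection connection, List<Reading> batch) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try (PreparedStatement stmt = connection.prepareStatement(INSERT_SQL)) {
            for (Reading r : batch) {
                stmt.setObject(1, r.deviceId);
                stmt.setTimestamp(2, r.readAt);
                stmt.setFloat(3, r.temperature);
                stmt.setFloat(4, r.light);
                stmt.addBatch();
            }
            stmt.executeBatch();
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

DO NOTHING keeps the first copy of a reading; whether a late duplicate should instead overwrite the stored row (DO UPDATE) depends on the data and is left open here.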
Loading Data - Unlogged
● UNLOGGED tables will ramp up faster than LOGGED tables with respect to batch sizes
● Little improvement over optimized batch loading
Loading Data - Parallel
Loading Data - Never Sleeping
Loading Data - When Things Go Wrong
● Devices should skew times and back off when things go wrong
○ Can be very easy to trigger congestive collapse
■ Only needs a minor trigger
○ Remember this is about when devices communicate, rather than when they sample
● Your devices should still do sensible things without your platform
● Your data loading system should throttle inserts
○ You don’t want devices taking your DB out, and most of the platform with it
○ It’s probably better to drop data or buffer more than fall flat on your face
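One common way for a device to "skew times and back off" is exponential backoff with random jitter. The sketch below is an assumption about how that might look, not the approach used in the talk; the class name, the one second base delay and the sixty second cap are all illustrative.

```java
import java.util.Random;

final class Backoff {
    // Delay before retry number `attempt` (0-based). The ceiling doubles on
    // each attempt starting from baseMillis, is capped at maxMillis, and the
    // actual delay is a random value between 0 and that ceiling so that many
    // devices failing together do not all retry at the same instant.
    static long delayMillis(int attempt, long baseMillis, long maxMillis, Random random) {
        if (attempt < 0 || baseMillis < 1 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long ceiling = baseMillis;
        for (int i = 0; i < attempt && ceiling < maxMillis; i++) {
            // Double without overflowing, never exceeding the cap
            ceiling = (ceiling <= maxMillis / 2) ? ceiling * 2 : maxMillis;
        }
        return (long) (random.nextDouble() * ceiling);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + ": wait "
                    + delayMillis(attempt, 1_000, 60_000, random) + " ms");
        }
    }
}
```

The random component spreads retries out over time, which is what stops a minor trigger from synchronising every device into one spike.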
Managing Data
Managing Data - Partitioning
[Figure: one partition per day: MONDAY, TUESDAY, WEDNESDAY, THURSDAY]
Managing Data - Partitioning
CREATE TABLE iot.alhex_reading_201910
PARTITION OF iot.alhex_reading
FOR VALUES FROM ('2019-10-01') TO ('2019-11-01');
...
CREATE TABLE iot.alhex_reading_202002
PARTITION OF iot.alhex_reading
FOR VALUES FROM ('2020-02-01') TO ('2020-03-01');
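Monthly partitions like these have to exist before data for that month arrives, so the statements are often generated rather than typed. The following sketch builds the same DDL text for a given month; the class and method names are hypothetical, and the parent table name is assumed to be a trusted identifier because it is concatenated directly into the statement.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

final class MonthlyPartitions {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("uuuuMM");

    // Build a CREATE TABLE ... PARTITION OF statement covering one calendar
    // month. parentTable must be a trusted identifier, never user input.
    static String createStatement(String parentTable, YearMonth month) {
        String from = month.atDay(1).toString();              // inclusive lower bound
        String to = month.plusMonths(1).atDay(1).toString();  // exclusive upper bound
        return "CREATE TABLE " + parentTable + "_" + month.format(SUFFIX)
             + " PARTITION OF " + parentTable
             + " FOR VALUES FROM ('" + from + "') TO ('" + to + "');";
    }

    public static void main(String[] args) {
        YearMonth month = YearMonth.of(2019, 10);
        for (int i = 0; i < 5; i++) {
            System.out.println(createStatement("iot.alhex_reading", month.plusMonths(i)));
        }
    }
}
```

Running main prints the five statements from 2019-10 through 2020-02, matching the range on the slide.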
Managing Data - Partition Loading Performance
Managing Data - Partition Retention
COPY iot.alhex_reading_201910
TO 'archive/alhex_reading_201910';
Managing Data - Tablespaces
Managing Data - BRIN
CREATE TABLE iot.alhex_reading_history (
device_id UUID NOT NULL,
read_at TIMESTAMP NOT NULL,
temperature REAL,
light REAL
);
Managing Data - BRIN
-- Relation size: 1321 MB, 23,000,000 rows
SELECT * FROM iot.alhex_reading_history
WHERE device_id = 'a3e06bcf-429d-43ff-9e46-55aee2ddd86a'
AND read_at >= '2019-10-17 07:10:31'
AND read_at <= '2019-10-18 07:10:31';
-- Seq Scan: 1239 ms No Index
-- BRIN: 148 ms 80 kB Index
-- BTREE: 0.73 ms 891 MB Index
Processing Data
Processing Data - Putting Stuff Together
SELECT date_trunc('month', r.day) AS month,
       avg(r.kwh), min(r.kwh), max(r.kwh)
FROM reading r
JOIN meter m ON (m.id = r.meter_id)
JOIN postcode p ON st_dwithin(m.location, p.location, 2000)
WHERE p.postcode = 'SY2 6ND'
GROUP BY 1;
Processing Data - Putting Stuff Together
SELECT avg(r.kwh), min(r.kwh),
       max(r.kwh), count(*)
FROM reading_monthly r
JOIN meter m ON (m.id = r.meter_id)
JOIN property p ON (m.property_id = p.id)
WHERE p.bedrooms = 4
AND r.month BETWEEN '2019-01-01' AND '2019-03-01';
Processing Data - Presenting Data
SELECT r.device_id, t.time, array_agg(r.read_at),
avg(r.temperature), avg(r.light)
FROM generate_series(
'2019-10-06 00:00:00'::TIMESTAMP,
'2019-10-07 00:00:00'::TIMESTAMP, '10 minutes') t(time)
JOIN iot.alhex_reading r
ON (r.device_id = '26170b53-ae8f-464e-8ca6-2faeff8a4d01'::UUID
AND r.read_at >= t.time
AND r.read_at < (t.time + '10 minutes'))
GROUP BY 1, 2
ORDER BY t.time;
Processing Data - Window Functions
Processing Data - Counters
SELECT
day,
energy,
energy - coalesce(lag(energy)
OVER (ORDER BY day), 0) AS consumed
FROM iot.meter_reading
ORDER BY day;
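For readers less familiar with lag(), the same calculation can be written as a plain loop. This is only an illustration of what the window function computes over rows already ordered by day; the class and method names are hypothetical, and whole watt hours stored as long values are assumed.

```java
final class CounterDeltas {
    // Convert an ever-increasing meter counter into per-period consumption.
    // energy must be ordered by day. The first element is compared against 0,
    // mirroring coalesce(lag(energy) OVER (ORDER BY day), 0) in the query.
    static long[] consumed(long[] energy) {
        long[] out = new long[energy.length];
        long previous = 0;
        for (int i = 0; i < energy.length; i++) {
            out[i] = energy[i] - previous;
            previous = energy[i];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] counter = {100, 130, 190};
        for (long c : consumed(counter)) {
            System.out.println(c);
        }
    }
}
```

As in the SQL, the first row reports the full counter value as consumed, which may or may not be what a report wants.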
Processing Data - Rolling Along
WITH consumption AS (
… from previous slide …
)
SELECT *, sum(consumed) OVER
(PARTITION BY date_trunc('week', day))
AS weekly_total
FROM consumption;
Processing Data - Moving On Up
Processing Data - Mind The Gap!
Processing Data - Mind The Gap
WITH days AS (
SELECT t.day::DATE
FROM generate_series('2017-01-01'::DATE, '2017-01-15'::DATE, '1 day') t(day)
), data AS (
SELECT *
FROM iot.meter_reading
WHERE day >= '2017-01-01'::DATE AND day <= '2017-01-15'::DATE
)
SELECT day,
    coalesce(energy_import_wh,
        (((next_read - last_read) / (next_read_time - last_read_time))
            * (day - last_read_time)) + last_read) AS energy_import_wh_interpolated
FROM (
SELECT t.day, d.energy_import_wh,
last(d.day) OVER lookback AS last_read_time,
last(d.day) OVER lookforward AS next_read_time,
last(d.energy_import_wh) OVER lookback AS last_read,
last(d.energy_import_wh) OVER lookforward AS next_read
FROM days t
LEFT JOIN data d ON (t.day = d.day)
WINDOW
lookback AS (ORDER BY t.day),
lookforward AS (ORDER BY t.day DESC)
) q ORDER BY q.day
Processing Data - Mind The Gap
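-- STRICT: NULL inputs are skipped, so the state keeps the last non-NULL value
CREATE FUNCTION last_agg(anyelement, anyelement)
RETURNS anyelement LANGUAGE SQL IMMUTABLE STRICT AS $$
    SELECT $2;
$$;
-- The queries call last(...), so last_agg has to be wrapped in an aggregate
CREATE AGGREGATE last(anyelement) (SFUNC = last_agg, STYPE = anyelement);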
Processing Data - Mind The Gap
WITH days AS (
SELECT t.day::DATE
FROM generate_series('2017-01-01'::DATE,
'2017-01-15'::DATE, '1 day') t(day)
), data AS (
SELECT *
FROM iot.meter_reading
WHERE day >= '2017-01-01'::DATE
AND day <= '2017-01-15'::DATE
)
Processing Data - Mind The Gap
SELECT t.day, d.energy,
last(d.day) OVER lookback AS last_read_time,
last(d.day) OVER lookforward AS next_read_time,
last(d.energy) OVER lookback AS last_read,
last(d.energy) OVER lookforward AS next_read
FROM days t
LEFT JOIN data d ON (t.day = d.day)
WINDOW
lookback AS (ORDER BY t.day),
lookforward AS (ORDER BY t.day DESC)
Processing Data - Mind The Gap
SELECT day,
coalesce(energy,
(((next_read - last_read)
/ (next_read_time - last_read_time))
* (day - last_read_time))
+ last_read) AS energy_interpolated
FROM (
… from previous slide …
) q
ORDER BY day
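The coalesce expression above is ordinary linear interpolation between the readings on either side of a gap. The sketch below restates the formula outside SQL; the class and method names are hypothetical, days are represented as epoch day numbers, and returning the last reading when both neighbours fall on the same day is an added guard that the SQL does not have.

```java
import java.time.LocalDate;

final class GapFill {
    // Mirrors: ((next_read - last_read) / (next_read_time - last_read_time))
    //          * (day - last_read_time) + last_read
    static double interpolate(double lastRead, double nextRead,
                              long lastDay, long nextDay, long day) {
        if (nextDay == lastDay) {
            return lastRead;  // guard against division by zero (not in the SQL)
        }
        double slope = (nextRead - lastRead) / (nextDay - lastDay);
        return slope * (day - lastDay) + lastRead;
    }

    public static void main(String[] args) {
        long lastDay = LocalDate.of(2017, 1, 2).toEpochDay();
        long nextDay = LocalDate.of(2017, 1, 5).toEpochDay();
        // Estimate the two missing days between the known readings
        for (long day = lastDay + 1; day < nextDay; day++) {
            System.out.println(LocalDate.ofEpochDay(day) + " "
                    + interpolate(100.0, 160.0, lastDay, nextDay, day));
        }
    }
}
```

A straight line between neighbours is a reasonable estimate for a cumulative counter; it says nothing about how consumption was actually distributed inside the gap.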
Extensions - TimescaleDB
So Long And Thanks For All The Fish
● Questions?