For this month’s PGSQL Phriday blogging challenge, Tomasz Gintowt asks whether people would rather use partitioning or sharding to solve business problems.
Here at Adjust we use both. We also have quite a few databases of all sizes. Let’s look at some examples.
We operate four primary database systems, dubbed “Backends”, “Shards”, “Bagger” and “Tracker”. The Backend systems function as intermediate storage, holding data for anything between hours and weeks. Shards are designed for long-term data retention, keeping records indefinitely until a customer leaves us. Bagger also stores data temporarily, but only for up to about a month; it’s our logging and log retrieval system. And lastly, Tracker is a large, specialized database which mainly stores Tracker IDs. We run quite a few more databases, but for the purpose of this blog post these four systems are enough to show that different business needs require different architectures.
Let’s have a look at each system, what it does, what business problems it solves and what is the best approach, sharding or partitioning.
Shards function as our long-term storage systems, named for the way data is distributed (“sharded”) across multiple systems. Each Shard holds approximately 14 TB of data, give or take. This alone makes it clear that it is impractical to consolidate all data from all Shards in one single system. And we have quite a few Shards running. Occasionally we have to add new Shards to the fleet, when the disk space on the existing ones becomes scarce. We call this process “Resharding”, and it involves redistributing the data across all existing and new Shards.
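To illustrate why resharding is an involved operation, here is a minimal sketch of plain hash-mod placement. All names (`shard_for`, `plan_reshard`) are hypothetical and not Adjust’s actual code; the point is only that changing the shard count re-maps a large fraction of keys, which then have to be physically moved.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (mod N)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def plan_reshard(keys, old_count, new_count):
    """Return {key: (old_shard, new_shard)} for keys that must move
    when the fleet grows from old_count to new_count shards."""
    return {
        key: (shard_for(key, old_count), shard_for(key, new_count))
        for key in keys
        if shard_for(key, old_count) != shard_for(key, new_count)
    }

# With mod-N hashing, most keys land on a different shard when N changes,
# which is one reason resharding is a planned, carefully executed process.
moves = plan_reshard([f"customer-{i}" for i in range(1000)], 4, 5)
```

Schemes like consistent hashing reduce the number of keys that move, at the cost of a more complex placement function.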
In addition to mere sharding, a number of the tables on each Shard are also partitioned. The data volume varies between partitions: some are empty, others hold up to around 35 GB, and anything in between.
Given the sharded nature of this data, it’s self-evident that the application needs to identify which database to query. The services in front of the databases are aware of the sharded structure, and either query a subset of Shards, or query all servers.
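The two access patterns mentioned above can be sketched as follows. This is an illustrative simplification, not our actual service code: single-shard lookups route by key, while fleet-wide queries scatter to every shard and gather the results.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash-mod placement (same scheme as a hypothetical sharder)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def route(key: str, shards: list) -> str:
    """Pick the single shard that holds this key."""
    return shards[shard_for(key, len(shards))]

def fan_out(shards: list, run_query) -> list:
    """Query every shard and merge the results (scatter-gather)."""
    results = []
    for shard in shards:
        results.extend(run_query(shard))
    return results
```

In practice the fan-out would run in parallel and handle per-shard failures; the sketch only shows the shape of the decision the service layer makes.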
The Backend system stores data which is still in flight, meaning it can be updated hours, days or even some weeks later. An individual Backend is around 15 TB in size.
Due to the frequency of updates, a substantial number of tables are partitioned. Data is also distributed across many Backend databases, but most applications typically interact with just one Backend at a time. There is no need to query all of them in parallel.
For audit and logging purposes, we have a system dubbed “Bagger”. It serves as a temporary vault for a copy of all incoming web requests, for a short period of time: a couple of days, up to a month. This system allows us to go back in time and retrieve the original data which entered our pipelines. That proves invaluable for the support team when discrepancies emerge between what our pipelines report and what customers report, and it further aids in debugging pipelines when both the “before” and “after” data is known.
Bagger servers - there are several dozen of them - are characterized by their high write-load, where each server runs multiple PostgreSQL instances, each instance hitting a different I/O channel. They are usually kept “at capacity”, that is: old data is removed when the disk usage hits a certain threshold. The data in Bagger is partitioned hourly, allowing us to drop old data in a fine-grained pattern.
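The “at capacity” retention described above can be sketched as a small pruning loop. The names and numbers here are purely illustrative (not our actual tooling): hourly partitions are named by timestamp, and the oldest ones are dropped until disk usage falls back under a threshold.

```python
from datetime import datetime, timedelta

def partition_name(ts: datetime) -> str:
    """Hypothetical naming scheme for an hourly partition."""
    return ts.strftime("requests_y%Ym%md%dh%H")

def partitions_to_drop(oldest: datetime, newest: datetime,
                       usage_pct: float, threshold_pct: float,
                       pct_per_partition: float) -> list:
    """Return hourly partition names to drop, oldest first,
    until estimated disk usage is back under the threshold."""
    drops = []
    current = oldest
    while usage_pct > threshold_pct and current < newest:
        drops.append(partition_name(current))
        usage_pct -= pct_per_partition  # rough estimate of space reclaimed
        current += timedelta(hours=1)
    return drops
```

Dropping a whole partition is a cheap metadata operation in PostgreSQL, which is exactly why hourly partitioning makes this fine-grained retention practical compared to bulk `DELETE`s.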
The Tracker database is a single system with multiple replicas, dedicated to keeping track (pun intended) of Tracker IDs. The database weighs in at approximately 30 TB, and several of its tables are further divided into multiple partitions, depending on what kind of data each partition holds.
We are in the process of rewriting the architecture, as the system has grown too large over time and maintenance has become a burden. Setting up new replicas, or migrating this system to larger servers, takes quite some time. And the many data updates on this system make constant XID wraparound prevention a necessity.
Given the sheer volume and number of our databases, we have no alternative but to resort to sharding and keep data on multiple (often many) servers. It is virtually impossible to hold that amount of data on a single server, and we face this problem with many of the database systems we operate.
On each sharded system we additionally use partitioning to split large tables into multiple partitions, which speeds up both read and write operations.
For us, the decision between “sharding or partitioning” isn’t a question of one over the other: the two complement each other.