Question 1

What is Citus?

Accepted Answer

Citus is a PostgreSQL extension that transparently shards tables across multiple Postgres nodes, turning Postgres into a distributed SQL database. You create 'distributed tables' partitioned by a key (tenant_id, customer_id, etc.), and Citus routes queries to the right shards and parallelizes joins and aggregations across the cluster. It is AGPL-licensed, maintained by Microsoft (which acquired Citus Data in 2019), and available as a managed service under the name Azure Cosmos DB for PostgreSQL. Single-node Citus is also widely used for its columnar storage on a single machine.
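To make the routing idea concrete, here is a minimal Python sketch of hash-based shard routing on a distribution key. It is a toy model under stated assumptions, not Citus's implementation: the shard count, the crc32 stand-in hash, and the names Shard, build_shards, and route are all hypothetical and chosen only for illustration.

```python
import zlib
from dataclasses import dataclass

HASH_SPACE = 2**32  # toy 32-bit hash space (assumption for illustration)


@dataclass(frozen=True)
class Shard:
    shard_id: int
    min_hash: int  # inclusive lower bound of the hash range
    max_hash: int  # inclusive upper bound of the hash range


def build_shards(shard_count: int) -> list[Shard]:
    """Split the hash space into contiguous, roughly equal ranges."""
    step = HASH_SPACE // shard_count
    shards = []
    for i in range(shard_count):
        lo = i * step
        # The last shard absorbs any remainder so the ranges cover the space.
        hi = HASH_SPACE - 1 if i == shard_count - 1 else lo + step - 1
        shards.append(Shard(i, lo, hi))
    return shards


def route(distribution_key: str, shards: list[Shard]) -> Shard:
    """Return the shard whose hash range contains the key's hash."""
    # crc32 is a stand-in hash for this sketch, not the hash Citus uses.
    h = zlib.crc32(distribution_key.encode("utf-8"))
    for shard in shards:
        if shard.min_hash <= h <= shard.max_hash:
            return shard
    raise ValueError("hash fell outside every shard range")


shards = build_shards(4)
for tenant in ["tenant-1", "tenant-2", "tenant-3"]:
    print(tenant, "->", "shard", route(tenant, shards).shard_id)
```

Because every row with the same key hashes into the same range, a query that filters on that key can be sent to a single shard, while a query without the key has to fan out to all of them.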

Question 2

What's the difference between Citus and Hydra?

Accepted Answer

Citus focuses on distributed query execution across nodes (sharding, parallel joins, horizontal scaling). Hydra focuses on columnar storage on a single node: its columnar engine compresses tables and accelerates analytical scans, but doesn't shard. The two are related rather than opposed: Hydra's columnar engine started as a fork of Citus columnar, the columnar storage engine that ships with Citus. For a single big machine doing analytics, Hydra is simpler. For multi-node scale-out, use Citus.

Question 3

Can Postgres replace a data warehouse?

Accepted Answer

For teams under ~1 TB of analytical data, yes — Postgres with Citus columnar, Hydra, or pg_duckdb can match Snowflake and BigQuery on query speed at a fraction of the cost. For teams managing 10+ TB with hundreds of concurrent analyst queries, a purpose-built warehouse still wins on workload management, separation of compute from storage, and operational features. The sweet spot for Postgres-as-warehouse is the analytics needs of a typical product company that doesn't yet have a dedicated data platform team.

Question 4

Does pg_duckdb work on managed Postgres?

Accepted Answer

Limited support currently. pg_duckdb is relatively new (released 2024) and has to be loaded through shared_preload_libraries, so most managed providers haven't enabled it yet; check the current extension allow-lists for AWS RDS, Supabase, and Neon. Self-hosted Postgres and Postgres-compatible analytics platforms like MotherDuck and Tembo are the main supported paths today. For managed providers without pg_duckdb, alternatives are foreign data wrappers (parquet_fdw, pg_lakehouse) or running DuckDB separately and reading your Postgres tables through DuckDB's postgres extension (postgres_fdw only connects one Postgres server to another, so it cannot reach DuckDB).

Question 5

When should I use a columnar extension vs vanilla Postgres?

Accepted Answer

Add a columnar extension when your queries scan large fractions of a table for aggregations: sums, averages, and group-bys over millions to billions of rows. Columnar storage compresses better (5-10x typical) and reads only the columns the query needs, and both effects make it dramatically faster than row storage for these patterns. Stay with row storage for transactional workloads where you read whole rows by primary key or run narrow filtered queries; columnar formats are slower for those patterns and don't support UPDATE/DELETE as efficiently.
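The "reads only the columns the query needs" point can be illustrated with a small Python model. This is a sketch under stated assumptions, not a benchmark: the four-column table, the fixed 8 bytes per value, and the function names are hypothetical, and compression is ignored entirely.

```python
# Hypothetical table layout: 4 fixed-width columns, 8 bytes per value.
COLUMNS = ["order_id", "customer_id", "quantity", "amount_cents"]
BYTES_PER_VALUE = 8


def make_rows(n):
    """Generate n synthetic rows as tuples in COLUMNS order."""
    return [(i, i % 100, i % 5 + 1, (i % 50) * 100) for i in range(n)]


def to_columns(rows):
    """Pivot row tuples into one list per column (a toy column store)."""
    return {name: [r[i] for r in rows] for i, name in enumerate(COLUMNS)}


def sum_row_store(rows, column):
    """Sum one column from the row layout; every full row is read."""
    idx = COLUMNS.index(column)
    bytes_touched = len(rows) * len(COLUMNS) * BYTES_PER_VALUE
    return sum(r[idx] for r in rows), bytes_touched


def sum_column_store(columns, column):
    """Sum one column from the column layout; only that column is read."""
    values = columns[column]
    bytes_touched = len(values) * BYTES_PER_VALUE
    return sum(values), bytes_touched


rows = make_rows(1_000_000)
columns = to_columns(rows)

row_total, row_bytes = sum_row_store(rows, "amount_cents")
col_total, col_bytes = sum_column_store(columns, "amount_cents")

print("same answer:", row_total == col_total)
print("row store bytes touched:   ", row_bytes)
print("column store bytes touched:", col_bytes)
```

In this model the aggregate touches one quarter of the bytes in the column layout because the table has four equally wide columns; wider tables widen the gap, and compression (left out here) would widen it further. A primary-key lookup that needs the whole row shows the opposite trade-off, since the column layout has to gather one value from every column.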
