In the world of databases, developers and analysts often face a tension:
- PostgreSQL shines at transactional workloads (OLTP): rapid inserts, updates, and complex multi-user business applications.
- Columnar engines like DuckDB excel at analytics (OLAP): large scans, aggregations, data exploration, and reporting.
What if you could combine the strengths of both?
Enter pg_duckdb — a PostgreSQL extension that merges DuckDB’s columnar, vectorized analytics engine directly into PostgreSQL. It’s not just “another extension” — it fundamentally expands what PostgreSQL can do without moving data out of your database.
What Is pg_duckdb?
pg_duckdb is an open-source PostgreSQL extension that embeds the DuckDB analytics engine within PostgreSQL itself. Instead of exporting your data to another analytics database, you can run advanced analytical queries inside PostgreSQL using DuckDB’s high-performance execution.
In practical terms, this means:
- Your OLTP data stays in PostgreSQL
- You get DuckDB-style performance for analytics
- You don’t need heavy ETL pipelines or separate data warehouses
Why It Matters
Traditionally, PostgreSQL stores data in a row-oriented format. That’s optimized for transactions — fetching one row at a time, updating frequently, and ensuring consistency. But analytical workloads — like scanning millions of rows, aggregating, or summarizing — benefit from columnar storage, where only relevant columns are read into memory. Columnar systems also compress data better and execute operations in vectorized batches.
DuckDB shines because of:
- Columnar storage — efficient for analytical workloads
- Vectorized execution — high CPU throughput
- Minimal dependencies — lightweight and fast for embedded use cases
pg_duckdb brings these abilities into PostgreSQL. The result? Analytical SQL runs much faster — especially on large tables — without moving data out of your production database.
How pg_duckdb Actually Works
Unlike traditional PostgreSQL extensions that add helper functions or indexing structures, pg_duckdb embeds an entire analytics engine inside the PostgreSQL server process. When enabled, certain queries — especially large analytical scans — are automatically routed through DuckDB’s engine instead of PostgreSQL’s planner and executor.
Here’s the magic:
- DuckDB’s engine lives inside PostgreSQL — not as an external service.
- You can query PostgreSQL tables directly — with no data export.
- pg_duckdb decides when analytics queries should be executed by DuckDB.
- You get faster scans, aggregations, joins, and heavy SQL transformations.
From the official GitHub README:
pg_duckdb “integrates DuckDB’s columnar-vectorized analytics engine into PostgreSQL, enabling high-performance analytics and data-intensive applications.”
Querying PostgreSQL Data With DuckDB
Here’s the beauty: you do not need to change your existing SQL. You write standard SELECT statements, and if pg_duckdb is configured (e.g., with duckdb.force_execution = true), DuckDB’s engine will run the query.
For example:
SET duckdb.force_execution = true;
SELECT
date_trunc('month', created_at) AS month,
COUNT(*) AS user_count,
SUM(amount) AS revenue
FROM
orders
GROUP BY
month;
Behind the scenes, DuckDB executes this using columnar internals, while PostgreSQL handles access to the tables. No CSV exports, no ETL — just SQL.
Real-World Use Cases
Here’s where pg_duckdb shines:
Aggregations over millions of rows become much faster than classic PostgreSQL plans.
Query Parquet files or Iceberg tables directly — no ETL required.
Run OLTP and OLAP from the same database instance.
Final stage reporting, dashboards, time-based aggregations.
Not a Replacement for PostgreSQL
It’s important to set expectations.
- pg_duckdb accelerates analytical queries
- It does not turn PostgreSQL into a full columnar database
- It’s not replacing PostgreSQL indexes, WAL, or core transactional internals
The idea is to combine strengths, not replace one with another.
Setting Up pg_duckdb
Step 1 — Clone the Repository
git clone https://github.com/duckdb/pg_duckdb.git
cd pg_duckdb
Step 2 — Build and Install
make
make install
Ensure pg_config from your PostgreSQL installation is available in your PATH.
Step 3 — Enable in PostgreSQL
CREATE EXTENSION pg_duckdb;
To force DuckDB execution for compatible analytical queries:
SET duckdb.force_execution = true;
pg_duckdb is a game changer for analytics inside PostgreSQL. It’s not a buzzword — it’s a practical extension backed by DuckDB’s battle-tested vectorized engine. You keep your transactional database, and gain analytical power without building separate data pipelines.
If your workflows include:
- Dashboarding
- Reporting
- Large aggregations
- Data lake queries
Then pg_duckdb is absolutely worth experimenting with.
As analytics workloads continue to grow, having a tool that bridges OLTP and OLAP within a single PostgreSQL playground is powerful. And best of all — you do it with straight SQL.