# PostgreSQL Source
Prerequisites, permissions, and full configuration reference for the Nanosync PostgreSQL source connector.
Nanosync replicates from PostgreSQL using logical replication with the built-in pgoutput plugin — no third-party extensions (wal2json, decoderbufs) are required. Each pipeline gets its own replication slot and publication, so multiple pipelines on the same database do not interfere.
## Prerequisites

- PostgreSQL 10 or later
- `wal_level = logical` in `postgresql.conf`
- At least one free replication slot (`max_replication_slots`)
- The replication user must have the `REPLICATION` login attribute and `SELECT` on every replicated table
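A quick way to check these prerequisites from a `psql` session (a sketch; the role name `nanosync` matches the setup below, adjust to your environment):

```sql
-- Server version and logical replication settings
SHOW server_version;
SHOW wal_level;               -- must be "logical"
SHOW max_replication_slots;

-- Count slots already in use to confirm there is headroom
SELECT count(*) AS slots_in_use FROM pg_replication_slots;

-- Confirm the replication user's attributes
SELECT rolname, rolreplication, rolcanlogin
FROM pg_roles
WHERE rolname = 'nanosync';
```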
## Setup

1. **Enable logical replication**

   Add the following to `postgresql.conf` and restart PostgreSQL:

   ```ini
   wal_level = logical
   max_replication_slots = 10   # one slot per pipeline; ensure headroom
   max_wal_senders = 10
   ```

   Verify:

   ```sql
   SHOW wal_level;  -- must return "logical"
   ```

2. **Create the replication user**

   ```sql
   CREATE ROLE nanosync LOGIN REPLICATION PASSWORD 'secret';
   ```

   Grant `SELECT` on specific tables:

   ```sql
   GRANT SELECT ON TABLE public.orders, public.order_items TO nanosync;
   ```

   Or grant access to an entire schema (including future tables):

   ```sql
   GRANT SELECT ON ALL TABLES IN SCHEMA public TO nanosync;
   ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO nanosync;
   ```

   The `REPLICATION` attribute allows the user to open replication connections. It does not grant superuser access.

3. **Replication slot and publication**

   Nanosync creates both automatically on first connect using names derived from the pipeline name (`nanosync_slot_<pipeline>` and `nanosync_pub_<pipeline>`). To create them manually:

   ```sql
   SELECT pg_create_logical_replication_slot('nanosync_slot_orders', 'pgoutput');
   CREATE PUBLICATION nanosync_pub_orders FOR TABLE public.orders, public.order_items;
   -- or for all tables:
   CREATE PUBLICATION nanosync_pub_orders FOR ALL TABLES;
   ```

   Replication slots retain WAL segments until the consumer acknowledges them. If Nanosync is stopped for an extended period, WAL can accumulate and fill the disk. Monitor `pg_replication_slots.wal_status` and set `slot_lag_alert_mb` to receive warnings before this happens.

4. **(Optional) Set `REPLICA IDENTITY` for DELETE support**

   By default, PostgreSQL only includes the primary key in `DELETE` events. To include the full row:

   ```sql
   ALTER TABLE public.orders REPLICA IDENTITY FULL;
   ```

   Tables without a primary key must have `REPLICA IDENTITY FULL` set for `DELETE` events to be captured.
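To audit which tables still use the default identity, one option is to query the catalog directly (a sketch; `relreplident` codes are `d` = default/primary key, `f` = full, `n` = nothing, `i` = index):

```sql
-- List replica identity settings for ordinary tables in schema "public"
SELECT c.relname,
       c.relreplident
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r';
```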
5. **(Optional) Failover slots — PostgreSQL 17+**

   Failover slots are synchronised to physical standbys so the slot position survives a primary promotion without data loss. Enable with `failover_slots: "true"` in the pipeline properties (see below).
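On PostgreSQL 17+, you can verify the slot's failover state from `pg_replication_slots`; a sketch:

```sql
-- On the primary: "failover" should be true for the Nanosync slot
SELECT slot_name, failover FROM pg_replication_slots;

-- On a standby: "synced" becomes true once the slot has been synchronised
SELECT slot_name, synced FROM pg_replication_slots;
```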
## Connection configuration

```yaml
connections:
  - name: prod-postgres
    type: postgres
    dsn: "postgres://nanosync:${env:PG_PASSWORD}@db.prod:5432/mydb?sslmode=require"
```

DSN format:

```
postgres://[user]:[password]@[host]:[port]/[database]?[options]
```

| DSN option | Example | Description |
|---|---|---|
| `sslmode` | `require` | TLS mode: `disable`, `require`, `verify-ca`, `verify-full` |
| `sslrootcert` | `/etc/ssl/ca.crt` | CA certificate path (needed for `verify-ca` / `verify-full`) |
| `connect_timeout` | `10` | Connection timeout in seconds |
| `application_name` | `nanosync` | Shown in `pg_stat_activity` |
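Before wiring the DSN into a pipeline, it can help to confirm it works with `psql` (a sketch using the example DSN above; host and credentials are placeholders):

```shell
# Verify the DSN connects and the server is configured for logical replication
psql "postgres://nanosync:$PG_PASSWORD@db.prod:5432/mydb?sslmode=require" \
  -c "SHOW wal_level;"

# Verify a replication connection is permitted for this user
# (replication=database opens a logical walsender session)
psql "postgres://nanosync:$PG_PASSWORD@db.prod:5432/mydb?sslmode=require&replication=database" \
  -c "IDENTIFY_SYSTEM"
```

If the second command is rejected, check the `pg_hba.conf` rules for the `nanosync` role.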
## Pipeline configuration

```yaml
pipelines:
  - name: orders-pipeline
    source:
      connection: prod-postgres
      tables:
        - public.orders
        - public.order_items
      properties:
        replication_slot: nanosync_slot_orders   # created automatically if absent
        publication: nanosync_pub_orders         # created automatically if absent
        snapshot_workers: "4"
        chunk_size: "10000"
```
## Source properties

### Connection

| Property | Default | Description |
|---|---|---|
| `replication_slot` | `nanosync_slot_<pipeline>` | Logical replication slot name. Created with `pgoutput` if it does not exist. |
| `publication` | `nanosync_pub_<pipeline>` | Publication name. Created for the configured table set if it does not exist. |
### Snapshot

| Property | Default | Description |
|---|---|---|
| `snapshot_workers` | `4` | Goroutines used to stream CTID partitions concurrently during the initial backfill. Each worker opens an additional database connection. Clamped to `1` on PostgreSQL < 14. |
| `chunk_size` | `10000` | Rows fetched per keyset-pagination page during backfill. |
| `partition_strategy` | `ctid` | How the snapshot divides tables into parallel chunks. Options: `ctid`, `ntile`, `minmax`. |
| `partition_column` | — | Column for the `ntile` or `minmax` strategies. Required when `partition_strategy` is not `ctid`. |
| `ctid_partition_blocks` | `1000` | Heap blocks (~8 MB) per CTID partition. Only used with the `ctid` strategy. |
Partition strategy details:

| Strategy | Best for |
|---|---|
| `ctid` (default) | Most tables. Divides by physical page range — fast and no table stats required. |
| `ntile` | High-bloat tables. Balances by row count using a window function — equal rows per worker even with many dead tuples. |
| `minmax` | Uniformly distributed numeric or timestamp columns. Divides by value range — requires `partition_column`. |
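As an illustration of how an `ntile`-style split can be computed (a sketch, not Nanosync's exact query; assumes `partition_column: id` and 4 workers):

```sql
-- Derive 4 row-balanced key ranges over public.orders(id)
SELECT bucket, min(id) AS lo, max(id) AS hi
FROM (
  SELECT id, ntile(4) OVER (ORDER BY id) AS bucket
  FROM public.orders
) t
GROUP BY bucket
ORDER BY bucket;
```

Each `(lo, hi)` pair becomes one worker's range, so workers receive roughly equal row counts even when the table has many dead tuples.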
### CDC / streaming

| Property | Default | Description |
|---|---|---|
| `wal_heartbeat_enabled` | `true` | Emit a periodic WAL message to advance `confirmed_flush_lsn` on idle sources. Prevents WAL accumulation when there are no changes. |
| `wal_heartbeat_interval` | `10s` | Period between WAL heartbeat messages. |
| `wal_sender_timeout` | `120s` | Set on each replication connection. Controls how long the server waits for a standby status update before closing the connection. |
| `keepalive_idle_interval` | `10s` | Period between standby status updates when no new WAL events have arrived. Lower values keep slot lag smaller at the cost of network traffic. |
| `slot_lag_alert_mb` | `5000` | Log a warning when the replication slot lag exceeds this many MB. Set to `0` to disable. |
| `failover_slots` | `false` | Create the slot with `failover = true` (PostgreSQL 17+). Synchronises the slot to physical standbys so it survives a primary promotion. |
| `batch_boundary_signaling` | `false` | Buffer events within each source transaction and deliver them atomically on COMMIT. Allows the sink to commit once per source transaction instead of once per event. |
| `max_reconnect_attempts` | `5` | Maximum consecutive reconnect retries before the CDC goroutine gives up and the pipeline errors. |
### Tuning

| Property | Default | Description |
|---|---|---|
| `cdc_channel_buffer_size` | `131072` | Capacity of the internal event channel. Larger values reduce the chance of the receive loop blocking and triggering a WAL sender timeout. |
| `normalize_buffer_hours` | `24` | Maximum hours of unprocessed batches allowed in the normalizer queue before the pipeline scheduler pauses to let it catch up. |
## How replication works

1. **Snapshot** — On first start, Nanosync performs a consistent snapshot of each table using `snapshot_workers` parallel goroutines. Each worker reads rows in `chunk_size` pages via keyset pagination, partitioned by the chosen `partition_strategy`.
2. **CDC** — After the snapshot, Nanosync subscribes to the replication slot and streams WAL events using `pgoutput`. Each event is decoded into an Apache Arrow record batch with operation type (INSERT/UPDATE/DELETE), the `before` row, and the `after` row.
3. **Checkpoint** — The acknowledged LSN is persisted to the state store after each batch is committed by the sink. On restart, replication resumes from the last checkpoint with no data loss.
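A keyset-pagination page during the snapshot looks roughly like this (a sketch, not Nanosync's generated SQL; assumes a key column `id` and `chunk_size: 10000`):

```sql
-- First page of a worker's partition: start below the partition's minimum key
SELECT * FROM public.orders
WHERE id > 0
ORDER BY id
LIMIT 10000;

-- Each subsequent page restarts from the highest id the previous page returned,
-- so the scan never uses OFFSET and stays fast on large tables
SELECT * FROM public.orders
WHERE id > :last_id_seen
ORDER BY id
LIMIT 10000;
```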
## Supported operations

| Operation | Supported | Notes |
|---|---|---|
| `INSERT` | Yes | — |
| `UPDATE` | Yes | Includes before and after row images |
| `DELETE` | Yes | Full row image requires a primary key or `REPLICA IDENTITY FULL` |
| `TRUNCATE` | Yes | Propagated as a delete-all event to the sink |
| DDL changes | Detection only | Schema drift is detected and the pipeline pauses for review; DDL is not replayed automatically |
## Monitoring

```shell
nanosync metrics pipeline orders-pipeline
```

Monitor the replication slot directly in PostgreSQL:

```sql
SELECT
  slot_name,
  active,
  confirmed_flush_lsn,
  pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
  ) AS lag
FROM pg_replication_slots
WHERE slot_name = 'nanosync_slot_orders';
```
Key Prometheus metrics:

| Metric | Description |
|---|---|
| `ns_pipeline_replication_lag_seconds` | End-to-end source-to-sink latency |
| `ns_cdc_events_total` | CDC events received from the slot |
| `ns_snapshot_rows_total` | Rows written during the initial snapshot |
## Limitations

- Tables without a primary key must have `REPLICA IDENTITY FULL` set for `UPDATE` and `DELETE` events to include the full row. Without it, `DELETE` events only contain the system columns.
- `TRUNCATE` is propagated as a delete-all event. Not all sinks handle this gracefully — verify your sink’s behavior before relying on `TRUNCATE` replication.
- DDL changes (schema evolution) are detected but not replayed automatically. The pipeline pauses at `pending_schema_approval` on breaking changes. New columns are added automatically in `widen` mode.
- Replication slots retain WAL on disk until acknowledged. If Nanosync is stopped for an extended period, WAL accumulates and can fill the source disk. Monitor `pg_replication_slots.wal_status` in production.
- Failover slots (PostgreSQL 17+) are required for zero-data-loss primary promotions. On PostgreSQL 16 and earlier, a primary promotion invalidates the replication slot — a fresh snapshot is required after failover.
- Cloud SQL / AlloyDB: `postgresql.conf` is managed by Google. Set `wal_level = logical` and `max_replication_slots` via the Cloud SQL instance flags in the GCP console or with `gcloud sql instances patch`. The `REPLICATION` role attribute is granted the same way as self-hosted PostgreSQL.
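For Cloud SQL, a sketch of setting the flags with `gcloud` (the instance name is a placeholder; on Cloud SQL, `cloudsql.logical_decoding=on` is what enables `wal_level = logical`):

```shell
# Caution: --database-flags replaces the instance's entire flag list,
# so include any flags the instance already has
gcloud sql instances patch my-instance \
  --database-flags=cloudsql.logical_decoding=on,max_replication_slots=10,max_wal_senders=10
```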
## Troubleshooting

### `ERROR: logical replication slot "..." is active for PID ...`

Another process is consuming the slot. Check with:

```sql
SELECT active_pid, slot_name, active FROM pg_replication_slots;
```
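If the competing consumer turns out to be a stale or orphaned session rather than another intentional pipeline, it can be terminated so Nanosync can reacquire the slot; a sketch:

```sql
-- Terminate whatever backend currently holds the slot
SELECT pg_terminate_backend(active_pid)
FROM pg_replication_slots
WHERE slot_name = 'nanosync_slot_orders'
  AND active;
```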
### `ERROR: replication slot "..." does not exist`

The slot was dropped externally. Stop the pipeline, drop its checkpoint in Nanosync (`nanosync delete pipeline <name>`), and restart to trigger a fresh snapshot.

### WAL accumulating / disk pressure

The slot is not being consumed. Check `pg_replication_slots.wal_status`. If it shows `lost`, the slot has been invalidated and must be recreated (full re-snapshot required). To prevent this, keep `slot_lag_alert_mb` at a level appropriate for your disk capacity.

### Pipeline stalls after a Postgres primary promotion

Without `failover_slots: "true"`, the replication slot does not exist on the new primary. Enable failover slots (PG 17+) or configure auto-restart with a fresh snapshot after failover.