# Configuration Reference

Full YAML schema for Nanosync connections, pipelines, rate limits, and schema mapping.
Nanosync is configured via a single YAML file (default: `nanosync.yaml`). The YAML is applied to the embedded store on startup. After that, pipelines and connections can be managed via the API, CLI, or UI; no further file editing is required.
## Top-level structure

```yaml
connections:
  - ...   # named connection definitions
pipelines:
  - ...   # pipeline definitions
```
## Connections

Named connections allow you to reuse credentials across multiple pipelines.

```yaml
connections:
  - name: prod-postgres   # unique name referenced by pipelines
    type: postgres
    dsn: "postgres://user:${env:PG_PASSWORD}@db.prod:5432/mydb?sslmode=require"
  - name: prod-bigquery
    type: bigquery
    properties:
      project_id: my-gcp-project
      dataset_id: replication
```
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Unique identifier referenced in pipeline `connection:` fields |
| `type` | string | yes | Connector type. Active sources: `postgres`, `sqlserver`, `kafka`, `local`, `stdin`. Active sinks: `bigquery`, `alloydb`, `cloudsql`, `kafka`, `local`, `stdout`. See Overview for coming-soon connectors. |
| `dsn` | string | no | Connection string (used by database connectors) |
| `properties` | map | no | Key-value connector properties (connector-specific) |
An inline `dsn` or `properties` on a pipeline source or sink overrides the named connection on conflict, without modifying the connection definition.
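For instance, a pipeline can reuse the `prod-postgres` credentials while pointing at a different database by overriding `dsn` inline (the pipeline name and database below are illustrative):

```yaml
pipelines:
  - name: orders-staging-copy     # illustrative pipeline
    source:
      connection: prod-postgres   # type and defaults come from the named connection
      # the inline dsn wins over the connection's dsn for this pipeline only:
      dsn: "postgres://user:${env:PG_PASSWORD}@db.staging:5432/mydb?sslmode=require"
      tables:
        - public.orders
```

The `prod-postgres` connection definition itself is left unchanged.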
## Environment variable expansion

Any value in the YAML can reference an environment variable with `${env:VAR_NAME}`. Expansion happens at startup, before the config is applied.

```yaml
dsn: "postgres://user:${env:PG_PASSWORD}@host:5432/db"
properties:
  api_key: "${env:BQ_API_KEY}"
```
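The expansion semantics can be sketched as a simple one-pass substitution. This is an illustrative re-implementation, not Nanosync's actual code; the variable name and regex are assumptions based on the `${env:VAR_NAME}` syntax above:

```python
import os
import re

# Matches ${env:VAR_NAME} where VAR_NAME is a typical environment variable name.
_ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_env(value: str) -> str:
    """Replace each ${env:VAR_NAME} with its value from the environment."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["PG_PASSWORD"] = "s3cret"
print(expand_env("postgres://user:${env:PG_PASSWORD}@host:5432/db"))
# → postgres://user:s3cret@host:5432/db
```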
## Pipelines

```yaml
pipelines:
  - name: orders-to-bigquery      # unique pipeline name
    source:
      connection: prod-postgres   # reference a named connection ...
      # or inline:
      # type: postgres
      # dsn: "postgres://..."
      tables:
        - public.orders
        - public.order_items
      properties:
        replication_slot: nanosync_slot
        chunk_size: "10000"
        snapshot_workers: "4"
    sink:
      connection: prod-bigquery   # reference a named connection ...
      # or inline:
      # type: bigquery
      properties:
        project_id: my-project
        dataset_id: replication
        table_id: orders
    rate_limit:
      max_events_per_second: 10000   # 0 = unlimited
      max_bytes_per_second: 104857600
    schema_mapping:
      conflict: widen   # widen | fail | approve
```
### Pipeline fields

| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Unique pipeline identifier |
| `source` | object | yes | Source connector config |
| `sink` | object | yes | Sink connector config |
| `rate_limit` | object | no | Throughput limits |
| `schema_mapping` | object | no | Schema drift handling |
### Source fields

| Field | Type | Description |
|---|---|---|
| `connection` | string | Name of a named connection |
| `type` | string | Connector type (required if no `connection`) |
| `dsn` | string | Connection string (overrides the named connection's) |
| `tables` | []string | Tables to replicate, in `schema.table` format |
| `properties` | map | Connector-specific options (see connector docs) |
### Sink fields

| Field | Type | Description |
|---|---|---|
| `connection` | string | Name of a named connection |
| `type` | string | Connector type (required if no `connection`) |
| `properties` | map | Connector-specific options (see connector docs) |
## Rate limiting

```yaml
rate_limit:
  max_events_per_second: 10000      # integer, 0 = unlimited
  max_bytes_per_second: 104857600   # integer bytes, 0 = unlimited
```

Rate limits apply per pipeline. They are enforced on the source read side via backpressure.
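The backpressure arithmetic can be sketched as follows. This is an illustrative model, not Nanosync's implementation: before emitting a batch, the source read loop waits long enough that sustained throughput stays under both limits, with `0` disabling a limit:

```python
def throttle_delay(events: int, nbytes: int,
                   max_eps: int, max_bps: int) -> float:
    """Seconds the source read loop must wait before emitting a batch of
    `events` events totalling `nbytes` bytes (0 = unlimited)."""
    delay_events = events / max_eps if max_eps > 0 else 0.0
    delay_bytes = nbytes / max_bps if max_bps > 0 else 0.0
    # The stricter of the two budgets governs the wait.
    return max(delay_events, delay_bytes)

# A 10,000-event, 100 MiB batch at the limits above sustains exactly 1 s/batch:
print(throttle_delay(10000, 104857600, 10000, 104857600))  # → 1.0
```

Note that whichever limit is hit first dominates: a small batch of very wide rows is throttled by `max_bytes_per_second` even when the event budget has headroom.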
## Schema mapping

Controls what happens when a type-mapping conflict is detected between source and sink schemas.

```yaml
schema_mapping:
  conflict: widen   # widen | fail | approve
```

| Mode | Behaviour |
|---|---|
| `widen` (default) | Auto-cast to the nearest compatible type and log a warning. Replication continues. |
| `fail` | Stop the pipeline immediately if any column has no direct type mapping. |
| `approve` | Pause the pipeline in the `pending_schema_approval` state and wait for human review. |
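To illustrate what "widen" means, here is a toy sketch of widening two conflicting column types by walking up a type lattice. The lattice below is a hypothetical example for illustration only, not Nanosync's actual mapping table:

```python
# Hypothetical widening order: each type can represent every type before it.
WIDEN_ORDER = ["int32", "int64", "float64", "string"]

def widen(src_type: str, dst_type: str) -> str:
    """Return the nearest type able to represent both sides of a conflict."""
    return max(src_type, dst_type, key=WIDEN_ORDER.index)

print(widen("int32", "int64"))    # → int64
print(widen("int64", "float64"))  # → float64
```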
When using `approve` mode:

```shell
nanosync schema review <pipeline>    # inspect the proposed type mapping
nanosync schema approve <pipeline>   # accept and resume
```
## File format sinks

When using the `local`, `s3`, `gcs`, or `iceberg` sink types, configure the output format:

```yaml
sink:
  type: local
  properties:
    base_path: /data/replication
    file_format: parquet   # parquet | csv | jsonl | avro
```

| Format | Extension | Notes |
|---|---|---|
| `parquet` | `.parquet` | Default. Columnar, best compression, schema-aware. |
| `csv` | `.csv` | Plain text, no schema embedded. |
| `jsonl` | `.jsonl` | One JSON object per line. |
| `avro` | `.avro` | Schema embedded in each file. |
## SQL Server transaction log mode

Set `cdc_mode: tlog` to read directly from the SQL Server transaction log via `sys.fn_dblog`, without requiring CDC setup on the source.

```yaml
source:
  type: sqlserver
  dsn: "sqlserver://user:pass@host:1433?database=mydb"
  tables: [dbo.orders]
  properties:
    cdc_mode: tlog                 # "cdc" (default) | "tlog"
    log_batch_size: "10000"
    poll_interval: "200ms"
    max_xact_memory: "268435456"   # 256 MiB cap per transaction
```

Requires: the database must use the FULL or BULK_LOGGED recovery model. Only the VIEW DATABASE STATE privilege is needed (no CDC setup required).
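You can verify both prerequisites from SQL Server before enabling `tlog` mode. These are standard T-SQL statements; the `replicator` user name is an illustrative placeholder:

```sql
-- Check the recovery model (must be FULL or BULK_LOGGED):
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = DB_NAME();

-- Grant the only privilege the connector needs, run in the target database:
GRANT VIEW DATABASE STATE TO replicator;
```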
## Config reload without restart

Send SIGHUP to the running server to reload and apply the config file:

```shell
kill -HUP $(pgrep nanosync)
```

Or use:

```shell
nanosync apply --file nanosync.yaml
nanosync apply --file nanosync.yaml --dry-run   # preview changes only
```

`apply` is idempotent: it upserts all connections and pipelines, leaving unchanged resources untouched.
## Full annotated example

```yaml
connections:
  - name: prod-postgres
    type: postgres
    dsn: "postgres://replicator:${env:PG_PASSWORD}@db.prod:5432/orders?sslmode=require"
  - name: warehouse
    type: bigquery
    properties:
      project_id: acme-data
      dataset_id: replication

pipelines:
  - name: orders-to-warehouse
    source:
      connection: prod-postgres
      tables:
        - public.orders
        - public.order_items
        - public.products
      properties:
        replication_slot: nanosync_slot
        chunk_size: "5000"
        snapshot_workers: "8"
    sink:
      connection: warehouse
      properties:
        table_id: orders_cdc
    rate_limit:
      max_events_per_second: 50000
      max_bytes_per_second: 524288000   # 500 MiB/s
    schema_mapping:
      conflict: widen
```