Highly available leaderless SQLite cluster powered by embedded NATS JetStream server.

Overview

1. Installation

1.1 Install from source

git clone https://github.com/litesql/ha.git
cd ha
go install

1.2 Install using Docker

docker run --name ha \
-e HA_MEMORY=true \
-p 5432:5432 -p 8080:8080 -p 4222:4222 \
ghcr.io/litesql/ha:latest

  • Set up a volume at /data to persist the NATS stream state.

Cluster example

docker compose up
  • Services:
Instance | HTTP | Pg Wire | NATS
node1    | 8080 | 5432    | 4222
node2    | 8081 | 5433    | 4223
node3    | 8082 | 5434    | 4224

1.3 Install using Helm

  • Add the chart repository:
helm repo add litesql https://litesql.github.io/helm-charts
  • Update the chart repository:
helm repo update
  • Deploy ha to Kubernetes:
helm install ha litesql/ha

2. Usage

  • Start the first ha node (use the -m flag to keep the database in memory)
ha -n node1 -m
  • Start another ha node
ha -n node2 -m --port 8081 --pg-port 5433 --nats-port 0 --replication-url nats://localhost:4222
  • Create a table
curl -d '[
  {
    "sql": "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"     
  }
]' \
http://localhost:8080
  • Insert some data using an HTTP client
curl -d '[
  {
    "sql": "INSERT INTO users(name) VALUES(:name)", 
    "params": {"name": "HA User"} 
  }
]' \
http://localhost:8080
  • Or use a PostgreSQL client
PGPASSWORD="ha" psql -h localhost -U ha
INSERT INTO users(name) VALUES('HA user from PostgreSQL Wire Protocol');
SELECT * FROM users;
  • Connect to the second node and check that the data has been replicated
PGPASSWORD="ha" psql -h localhost -U ha -p 5433
SELECT * FROM users;

2.1 Loading an existing database into memory

ha -m mydatabase.db

2.2 Using a database on disk

ha "file:mydatabase.db?_journal=WAL&_busy_timeout=500"

2.3 Loading the database from the latest snapshot

ha --from-latest-snapsot

2.4 Loading multiple databases

ha *.db

3. Local Replicas

3.1 Local Read/Write Replicas

  • Use the go-ha database/sql driver to create embedded read/write replicas
  • Use with Go (see the sketch below)
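
A minimal sketch of what an embedded read/write replica could look like from Go. The import path, the registered driver name ("ha") and the DSN parameters below are assumptions for illustration only, not go-ha's documented API:

package main

import (
    "database/sql"
    "log"

    _ "github.com/litesql/go-ha" // assumed import path; registers the driver
)

func main() {
    // Assumed DSN: a local SQLite file plus the NATS URL used for replication.
    db, err := sql.Open("ha", "file:local.db?replication_url=nats://localhost:4222")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Reads are served from the local replica; writes are replicated through NATS JetStream.
    if _, err := db.Exec(`INSERT INTO users(name) VALUES(?)`, "embedded replica user"); err != nil {
        log.Fatal(err)
    }

    var count int
    if err := db.QueryRow(`SELECT COUNT(*) FROM users`).Scan(&count); err != nil {
        log.Fatal(err)
    }
    log.Printf("users: %d", count)
}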

3.2 Local Read Replicas

  • Use the ha-sync SQLite extension to create local embedded replicas from a remote HA database.
  • Use with any programming language

4. PostgreSQL Wire Protocol

  • You can use any PostgreSQL driver to connect to ha (see the sketch below).
  • The SQLite parser engine processes the commands.
  • PostgreSQL functions (and visual editors like pgAdmin, DBeaver, etc.) are not supported.
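
Since ha speaks the PostgreSQL wire protocol, a standard driver such as lib/pq works with Go's database/sql. A sketch assuming the default port and credentials from the configuration table (section 7); the dbname value and sslmode=disable are assumptions (TLS is off unless --pg-cert/--pg-key are set):

package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/lib/pq" // any PostgreSQL driver should work
)

func main() {
    // Defaults from section 7: user "ha", password "ha", port 5432.
    db, err := sql.Open("postgres",
        "host=localhost port=5432 user=ha password=ha dbname=ha sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Keep the SQL SQLite-compatible: the commands are processed by the SQLite engine.
    rows, err := db.Query("SELECT id, name FROM users")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    for rows.Next() {
        var id int64
        var name string
        if err := rows.Scan(&id, &name); err != nil {
            log.Fatal(err)
        }
        fmt.Println(id, name)
    }
    if err := rows.Err(); err != nil {
        log.Fatal(err)
    }
}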

5. HTTP API

5.1 Using bind parameters

curl -d '{
    "sql": "INSERT INTO users(name) VALUES(:name)", 
    "params": {"name": "HA user"}
}' \
http://localhost:8080

Response:

{
  "columns": [
    "rows_affected",
    "last_insert_id"
  ],
  "rows": [
    [
      1,
      3
    ]
  ]
}
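
The same endpoint can be called from any HTTP client. A sketch in Go that mirrors the request and response shapes shown above; the struct names are illustrative, and the application/json content type is an assumption (the curl examples do not set one explicitly):

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

// Statement mirrors the request body: a SQL string plus optional named parameters.
type Statement struct {
    SQL    string         `json:"sql"`
    Params map[string]any `json:"params,omitempty"`
}

// Result mirrors the response shown above (columns + rows).
type Result struct {
    Columns []string `json:"columns"`
    Rows    [][]any  `json:"rows"`
}

func main() {
    body, err := json.Marshal(Statement{
        SQL:    "INSERT INTO users(name) VALUES(:name)",
        Params: map[string]any{"name": "HA user"},
    })
    if err != nil {
        log.Fatal(err)
    }

    resp, err := http.Post("http://localhost:8080", "application/json", bytes.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    var res Result
    if err := json.NewDecoder(resp.Body).Decode(&res); err != nil {
        log.Fatal(err)
    }
    fmt.Println(res.Columns, res.Rows) // e.g. [rows_affected last_insert_id] [[1 3]]
}

Sending a JSON array of statements instead (as in section 5.2) executes them in one transaction and returns a "results" array rather than a single result object.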

5.2 Multiple commands (one transaction)

curl -d '[
{
    "sql": "INSERT INTO users(name) VALUES(:name)", 
    "params": {"name": "new HA user"}
},
{
    "sql": "DELETE FROM users WHERE name = :name", 
    "params": {"name": "new HA user"}
},
{
    "sql": "SELECT * FROM users"
}
]' \
http://localhost:8080

Response:

{
  "results": [
    {
      "columns": [
        "rows_affected",
        "last_insert_id"
      ],
      "rows": [
        [
          1,
          2
        ]
      ]
    },
    {
      "columns": [
        "rows_affected",
        "last_insert_id"
      ],
      "rows": [
        [
          1,
          2
        ]
      ]
    },
    {
      "columns": [
        "id",
        "name"
      ],
      "rows": [
        [
          1,
          "HA user"
        ]        
      ]
    }
  ]
}

5.3 Backup database

curl -O -J http://localhost:8080/download

5.4 Take a snapshot and save it to the NATS Object Store

curl -X POST http://localhost:8080/snapshot

5.5 Get the latest snapshot from the NATS Object Store

curl -O -J http://localhost:8080/snapshot

5.6 List the status of all replications

curl http://localhost:8080/replications

5.7 Get replication status

curl http://localhost:8080/replications/{name}

5.8 Remove replication (consumer)

curl -X DELETE http://localhost:8080/replications/{name}

6. Replication

  • You can write to any server
  • Uses an embedded or external NATS JetStream cluster
  • NATS JetStream guarantees “at-least-once” message delivery
  • All DML (INSERT, UPDATE, DELETE) operations are idempotent
  • Last writer wins
  • DDL commands are replicated (since v0.0.7)

6.1 CDC message format

{
  "node": "ha_node_name",
  "changes": [
    {
      "database": "main",
      "table": "users",
      "columns": [
        "id",
        "name"
      ],
      "operation": "INSERT",      
      "new_rowid": 2,
      "new_values": [
        2,
        "new HA user"
      ]
    },
    {
      "database": "main",
      "table": "users",
      "columns": [
        "id",
        "name"
      ],
      "operation": "DELETE",
      "old_rowid": 2,
      "old_values": [
        2,
        "new HA user"
      ]
    }
  ],
  "timestamp_ns": 1758574275504509677
}
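
For consumers that want to react to these messages, the structs below mirror the JSON fields shown above. The type names are illustrative only, and the literal payload stands in for a message read from the replication stream (ha_replication by default):

package main

import (
    "encoding/json"
    "fmt"
    "log"
)

// CDCMessage mirrors the replication payload shown above.
type CDCMessage struct {
    Node        string   `json:"node"`
    Changes     []Change `json:"changes"`
    TimestampNS int64    `json:"timestamp_ns"`
}

// Change describes a single replicated row operation.
type Change struct {
    Database  string   `json:"database"`
    Table     string   `json:"table"`
    Columns   []string `json:"columns"`
    Operation string   `json:"operation"` // INSERT, UPDATE or DELETE
    NewRowID  int64    `json:"new_rowid,omitempty"`
    NewValues []any    `json:"new_values,omitempty"`
    OldRowID  int64    `json:"old_rowid,omitempty"`
    OldValues []any    `json:"old_values,omitempty"`
}

func main() {
    // In practice the payload comes from a JetStream consumer on the replication
    // stream; a literal is used here for illustration.
    payload := []byte(`{"node":"node1","changes":[{"database":"main","table":"users",
        "columns":["id","name"],"operation":"INSERT","new_rowid":2,"new_values":[2,"new HA user"]}],
        "timestamp_ns":1758574275504509677}`)

    var msg CDCMessage
    if err := json.Unmarshal(payload, &msg); err != nil {
        log.Fatal(err)
    }
    for _, c := range msg.Changes {
        fmt.Printf("%s on %s.%s: %v\n", c.Operation, c.Database, c.Table, c.NewValues)
    }
}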

6.2 Replication limitations

  • Tables WITHOUT ROWID are not replicated
  • Replication is not triggered when conflicting rows are deleted because of an ON CONFLICT REPLACE clause.
  • DDL commands rely on “automatic idempotency” (CREATE ... IF NOT EXISTS and DROP ... IF EXISTS), but this is difficult to achieve for ALTER TABLE commands.
  • Writing to any node in the cluster improves availability, but it can lead to consistency issues in certain edge cases. If your application values Consistency more than Availability, it’s better to route all write operations through a single cluster node.

6.3 Conflict Resolution

In the event of conflicting writes, the following conflict resolution strategy is applied:

  1. Last Writer Wins: The most recent write operation is retained. This ensures that the latest data is propagated across the cluster.

  2. Idempotent Operations: All DML (INSERT, UPDATE, DELETE) and DDL (CREATE, DROP) operations are converted to idempotent operations on replica nodes, meaning that applying the same operation multiple times will yield the same result. This helps maintain consistency during replication.

  3. Custom Conflict Resolution: You can implement a custom conflict resolution strategy by using the --interceptor flag to provide a Go script. This script allows you to define how conflicts are resolved based on your application’s specific requirements.

See example here.

7. Configuration

Flag | Environment Variable | Default | Description
-n, --name | HA_NAME | $HOSTNAME | Node name
-p, --port | HA_PORT | 8080 | HTTP API TCP port
-m, --memory | HA_MEMORY | false | Store database in memory
-i, --interceptor | HA_INTERCEPTOR | | Path to a Go script for customizing replication behavior
--log-level | HA_LOG_LEVEL | info | Log level (info, warn, error or debug)
--from-latest-snapsot | HA_FROM_LATEST_SNAPSHOT | false | Use the latest database snapshot from the NATS JetStream Object Store (if available at startup)
--snapshot-interval | HA_SNAPSHOT_INTERVAL | 0s | Interval between database snapshots saved to the NATS JetStream Object Store (0 to disable)
--disable-ddl-sync | HA_DISABLE_DDL_SYNC | false | Disable the DDL commands publisher
--nats-logs | HA_NATS_LOGS | false | Enable embedded NATS server logging
--nats-port | HA_NATS_PORT | 4222 | Embedded NATS server port (0 to disable)
--nats-store-dir | HA_NATS_STORE_DIR | /tmp/nats | Embedded NATS server store directory
--nats-user | HA_NATS_USER | | Embedded NATS server user
--nats-pass | HA_NATS_PASS | | Embedded NATS server password
--nats-config | HA_NATS_CONFIG | | Path to embedded NATS server config file (overrides other NATS settings)
--pg-port | HA_PG_PORT | 5432 | PostgreSQL Wire Protocol server port
--pg-user | HA_PG_USER | ha | PostgreSQL auth user
--pg-pass | HA_PG_PASS | ha | PostgreSQL auth password
--pg-cert | HA_PG_CERT | | Path to PostgreSQL TLS certificate file
--pg-key | HA_PG_KEY | | Path to PostgreSQL TLS key file
--concurrent-queries | HA_CONCURRENT_QUERIES | 50 | Number of concurrent queries (DB pool max)
--extensions | HA_EXTENSIONS | | Comma-separated list of SQLite extension paths to load
--async-replication | HA_ASYNC_REPLICATION | false | Enable asynchronous replication message publishing
--async-replication-store-dir | HA_ASYNC_REPLICATION_STORE_DIR | | Directory for storing outbox messages used in asynchronous replication
--replicas | HA_REPLICAS | 1 | Number of replicas to keep for the stream and object store in clustered JetStream
--replication-timeout | HA_REPLICATION_TIMEOUT | 15s | Replication publisher timeout
--replication-stream | HA_REPLICATION_STREAM | ha_replication | Replication stream name
--replication-max-age | HA_REPLICATION_MAX_AGE | 24h | Replication stream max age
--replication-url | HA_REPLICATION_URL | | Replication NATS URL (defaults to the embedded NATS server)
--replication-policy | HA_REPLICATION_POLICY | all | Replication subscriber delivery policy (all, last, new, by_start_sequence=X, by_start_time=X)
--row-identify | HA_ROW_IDENTIFY | rowid | Strategy used to identify rows during replication (rowid or full)
--version | HA_VERSION | false | Print version information and exit
-c, --config | | | Config file path (optional)