
Highly available SQLite cluster powered by an embedded NATS JetStream server.
- Connect using the HTTP API, gRPC API, database/sql Go driver, JDBC driver, or the MySQL and PostgreSQL Wire Protocols
- Create live local read/write replicas with the go-ha database/sql driver
- Use the ha-sync SQLite extension to create live local read replicas
- Change Data Capture (CDC)
- Execute Cross-Database queries (without ATTACH DATABASE)
- Open Source
Overview
- 1. Installation
- 2. Usage
- 3. Local Replicas
- 4. HA Client, PostgreSQL and MySQL Wire Protocol
- 5. HTTP API
- 6. Replication
- 7. Cross-shard Queries
- 8. Transaction Operations
- 9. Configuration
1. Installation
- Download from the releases page.
1.1 Install from source
go install github.com/litesql/ha@latest
1.2 Install using Docker
docker run --name ha \
-e HA_MEMORY=true \
-p 5432:5432 -p 8080:8080 -p 4222:4222 \
ghcr.io/litesql/ha:latest
- Set up a volume at /data to store the NATS streams state.
Cluster example
- Docker compose cluster example
docker compose up
- Services:
| Instance | HTTP | Pg Wire | NATS | MySQL Wire |
|---|---|---|---|---|
| node1 | 8080 | 5432 | 4222 | 3306 |
| node2 | 8081 | 5433 | 4223 | 3307 |
| node3 | 8082 | 5434 | 4224 | 3308 |
1.3 Install using Helm
- Add litesql helm charts repository to Helm:
helm repo add litesql https://litesql.github.io/helm-charts
- Update the chart repository:
helm repo update
- Deploy ha to kubernetes:
helm install ha litesql/ha
- Visit the litesql helm charts repository to customize the installation.
2. Usage
- Start the first ha node (use the -m flag for in-memory storage)
ha -n node1 -m -pg-port 5432
- Start another ha node
ha -n node2 -m --port 8081 --pg-port 5433 --nats-port 0 --replication-url nats://localhost:4222
- Create a table
curl -d '[
{
"sql": "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
}
]' \
http://localhost:8080
- Insert some data using an HTTP client
curl -d '[
{
"sql": "INSERT INTO users(name) VALUES(:name)",
"params": {"name": "HA User"}
}
]' \
http://localhost:8080
- Or use a PostgreSQL client
PGPASSWORD="ha" psql -h localhost -U ha
INSERT INTO users(name) VALUES('HA user from PostgreSQL Wire Protocol');
SELECT * FROM users;
- Connect to another server and check the values
PGPASSWORD="ha" psql -h localhost -U ha -p 5433
SELECT * FROM users;
2.1 Loading an existing database into memory
ha -m mydatabase.db
2.2 Using a database on disk
ha "file:mydatabase.db"
2.3 Load database from latest snapshot
ha --from-latest-snapsot
2.4 Loading multiple databases
ha *.db
3. Local Replicas
3.1 Local Read/Write Replicas
- Use the go-ha database/sql driver to create embedded read/write replicas
- Use with go
3.2 Local Read Replicas
- Use the ha-sync SQLite extension to create local embedded replicas from a remote HA database.
- Use with any programming language
4. HA Client, PostgreSQL and MySQL Wire Protocol
- You can use any ha, PostgreSQL, or MySQL driver to connect to ha.
- The SQLite parser engine processes the commands.
- MySQL and PostgreSQL functions are not supported.
- DBeaver support: use the JDBC HA driver to manage the database.
4.1 Using HA Client
ha -r http://localhost:8080
You can send any SQL statement to the HA database, plus the following special commands:
| Command | Description |
|---|---|
| SHOW DATABASES; | List all databases |
| CREATE DATABASE | Create a new database |
| DROP DATABASE | Drop a database |
| SET DATABASE TO | Send commands to a specific database |
| UNSET DATABASE; | Use the default database |
| EXIT; | Quit the client (or Ctrl+D) |
5. HTTP API
Access OpenAPI documentation: http://localhost:8080/openapi.yaml
5.1 Using bind parameters
curl -d '{
"sql": "INSERT INTO users(name) VALUES(:name)",
"params": {"name": "HA user"}
}' \
http://localhost:8080/query
{
"columns": [
"rows_affected",
"last_insert_id"
],
"rows": [
[
1,
3
]
]
}
5.2 Multiple commands (one transaction)
curl -d '[
{
"sql": "INSERT INTO users(name) VALUES(:name)",
"params": {"name": "new HA user"}
},
{
"sql": "DELETE FROM users WHERE name = :name",
"params": {"name": "new HA user"}
},
{
"sql": "SELECT * FROM users"
}
]' \
http://localhost:8080/query
{
"results": [
{
"columns": [
"rows_affected",
"last_insert_id"
],
"rows": [
[
1,
2
]
]
},
{
"columns": [
"rows_affected",
"last_insert_id"
],
"rows": [
[
1,
2
]
]
},
{
"columns": [
"id",
"name"
],
"rows": [
[
1,
"HA user"
]
]
}
]
}
5.3 Backup database
curl -O -J http://localhost:8080/download
5.4 Take a snapshot and save on NATS Object Store
curl -X POST http://localhost:8080/snapshot
5.5 Get latest snapshot from NATS Object Store
curl -O -J http://localhost:8080/snapshot
5.6 List all replications status
curl http://localhost:8080/replications
5.7 Get replication status
curl http://localhost:8080/replications/{name}
5.8 Remove replication (consumer)
curl -X DELETE http://localhost:8080/replications/{name}
6. Replication
- You can write to any server (in the default leaderless mode)
- Uses embedded or external NATS JetStream cluster
- NATS JetStream guarantees “at-least-once” message delivery
- All DML (INSERT, UPDATE, DELETE) operations are idempotent
- Last writer wins.
- DDL commands are replicated (since v0.0.7)
6.1 CDC message format
{
"node": "ha_node_name",
"changes": [
{
"database": "main",
"table": "users",
"columns": [
"id",
"name"
],
"operation": "INSERT",
"new_rowid": 2,
"new_values": [
2,
"new HA user"
]
},
{
"database": "main",
"table": "users",
"columns": [
"id",
"name"
],
"operation": "DELETE",
"old_rowid": 2,
"old_values": [
2,
"new HA user"
]
}
],
"timestamp_ns": 1758574275504509677
}
6.2 Replication limitations
- Tables WITHOUT ROWID are not replicated
- The replication is not invoked when conflicting rows are deleted because of an ON CONFLICT REPLACE clause.
- Uses “automatic idempotency” for DDL commands (CREATE IF NOT EXISTS and DROP IF EXISTS), but this is difficult to do with ALTER TABLE commands.
- Writing to any node in the cluster (the default ha behaviour) improves availability, but it can lead to consistency issues in certain edge cases. If your application values consistency more than availability, it is better to route all write operations through a single cluster node using the leader options --leader-static or --leader-addr.
6.3 Conflict Resolution
In the event of conflicting writes, the following conflict resolution strategy is applied:
- Last Writer Wins: The most recent write operation is retained. This ensures that the latest data is propagated across the cluster.
- Idempotent Operations: All DML (INSERT, UPDATE, DELETE) and DDL (CREATE, DROP) operations are converted to idempotent operations on replica nodes, meaning that applying the same operation multiple times yields the same result. This helps maintain consistency during replication.
- Custom Conflict Resolution: You can implement a custom conflict resolution strategy by using the --interceptor flag to provide a Go script. This script allows you to define how conflicts are resolved based on your application's specific requirements. See example here.
7. Cross-shard Queries
HA supports cross-database queries across multiple SQLite databases hosted on the same node. This lets you query data from all matching shard databases in a single request.
Use the SQL optimizer hint /*+ db=<regex> */ to select which shard databases should participate in the query. For example:
SELECT id, name /*+ db=.* */ FROM users;
In this example, db=.* runs the query against all available database IDs. To target a subset, use a narrower regular expression, e.g. db=users_.*.
- Discover database IDs with:
SHOW DATABASES;
- Limit results to specific shards by adjusting the regex.
8. Transaction Operations
HA provides advanced transaction history and undo capabilities, allowing you to retrieve and revert changes made to the database.
Transaction History
Use the HISTORY command to retrieve all transactions from a specific sequence point to the present:
HISTORY 100;
This retrieves all transactions starting from sequence number 100. You can also use time durations:
HISTORY '5m';
This retrieves transactions from the last 5 minutes.
Undo Operations
HA supports several undo commands to revert transactions:
- UNDO n: Revert all transactions from stream sequence n through the present, effectively rolling back changes made during this period.
- UNDOE n: Roll back all modifications on entities that were affected by transactions occurring from stream sequence n to the present, while preserving other changes.
- UNDOT n: Revert all transactions on entities that were impacted by the transaction at stream sequence n through the present. Only transactions that modified the same entities are affected.
Parameters:
- n: Can be specified as a numeric stream sequence number or as a relative time duration (e.g., '5m', '1h', '30s')
Examples:
UNDO 150; -- Revert all transactions from sequence 150 onwards
UNDO '10m'; -- Revert all transactions from the last 10 minutes
UNDOE 200; -- Roll back entity modifications from sequence 200
UNDOT 250; -- Revert transactions affecting entities from sequence 250
Note: These operations work on committed transactions and can be used for point-in-time recovery or correcting erroneous changes.
Limitations
DDL commands (such as CREATE, ALTER, and DROP) are not affected by undo operations, as they are designed to preserve database schema integrity and avoid potential inconsistencies.
9. Configuration
| Flag | Environment Variable | Default | Description |
|---|---|---|---|
| -n, --name | HA_NAME | $HOSTNAME | Node name |
| -p, --port | HA_PORT | 8080 | HTTP API TCP port |
| --token | HA_TOKEN | | API auth token |
| -m, --memory | HA_MEMORY | false | Store database in memory |
| --create-db-dir | HA_CREATE_DB_DIR | | Path to a directory where new databases are created |
| -i, --interceptor | HA_INTERCEPTOR | | Path to a Go script for customizing replication behavior |
| -r, --remote | HA_REMOTE | | Address of a remote HA server to connect to and interact with (e.g. to run queries) instead of starting a server |
| --log-level | HA_LOG_LEVEL | info | Log level (info, warn, error or debug) |
| --from-latest-snapsot | HA_FROM_LATEST_SNAPSHOT | false | Use the latest database snapshot from the NATS JetStream Object Store (if available at startup) |
| --snapshot-interval | HA_SNAPSHOT_INTERVAL | 0s | Interval at which to create database snapshots in the NATS JetStream Object Store (0 to disable) |
| --disable-ddl-sync | HA_DISABLE_DDL_SYNC | false | Disable the DDL commands publisher |
| --grpc-insecure | HA_GRPC_INSECURE | false | Use insecure gRPC connection (plaintext, no TLS) for leader messages |
| --leader-addr | HA_LEADER_ADDR | | Address to use when this node becomes the leader (uses the gRPC server). This enables leader election |
| --leader-static | HA_LEADER_STATIC | | Address of a static leader. This disables leader election |
| --mysql-port | HA_MYSQL_PORT | | MySQL Wire Protocol server port |
| --mysql-user | HA_MYSQL_USER | ha | MySQL auth user |
| --mysql-pass | HA_MYSQL_PASS | | MySQL auth password |
| --nats-logs | HA_NATS_LOGS | false | Enable embedded NATS server logging |
| --nats-port | HA_NATS_PORT | 4222 | Embedded NATS server port (0 to disable) |
| --nats-store-dir | HA_NATS_STORE_DIR | /tmp/nats | Embedded NATS server store directory |
| --nats-user | HA_NATS_USER | | Embedded NATS server user |
| --nats-pass | HA_NATS_PASS | | Embedded NATS server password |
| --nats-config | HA_NATS_CONFIG | | Path to embedded NATS server config file (overrides other NATS configuration) |
| --pg-port | HA_PG_PORT | | PostgreSQL Wire Protocol server port |
| --pg-user | HA_PG_USER | ha | PostgreSQL auth user |
| --pg-pass | HA_PG_PASS | ha | PostgreSQL auth password |
| --pg-cert | HA_PG_CERT | | Path to PostgreSQL TLS certificate file |
| --pg-key | HA_PG_KEY | | Path to PostgreSQL TLS key file |
| --concurrent-queries | HA_CONCURRENT_QUERIES | 50 | Number of concurrent queries (DB pool max) |
| --extensions | HA_EXTENSIONS | | Comma-separated list of SQLite extension paths to load |
| --async-replication | HA_ASYNC_REPLICATION | false | Enable asynchronous replication message publishing |
| --async-replication-store-dir | HA_ASYNC_REPLICATION_STORE_DIR | | Directory path for storing outbox messages used in asynchronous replication |
| --replicas | HA_REPLICAS | 1 | Number of replicas to keep for the stream and object store in clustered JetStream |
| --replication-timeout | HA_REPLICATION_TIMEOUT | 15s | Replication publisher timeout |
| --replication-stream | HA_REPLICATION_STREAM | ha_replication | Replication stream name |
| --replication-max-age | HA_REPLICATION_MAX_AGE | 24h | Replication stream max age |
| --replication-url | HA_REPLICATION_URL | | Replication NATS URL (defaults to the embedded NATS server) |
| --replication-policy | HA_REPLICATION_POLICY | all | Replication subscriber delivery policy (all, last, new, by_start_sequence=X, by_start_time=X) |
| --row-identify | HA_ROW_IDENTIFY | pk | Strategy used to identify rows during replication. Options: pk, rowid or full |
| --version | HA_VERSION | false | Print version information and exit |
| -c, --config | | | Config file (optional) |