dqlitepy Architecture

This document describes the overall architecture of dqlitepy, including how Python bindings interact with the Go shim and the underlying dqlite C library.

Overview

dqlitepy uses a multi-layer architecture that bridges Python applications to the dqlite distributed SQLite engine through a Go-based, C-compatible shim layer.

Component Architecture

1. Python Layer Components

Core API (dqlitepy/node.py, dqlitepy/client.py)

The core API provides high-level Python interfaces for:

  • Node Management: Creating, starting, stopping dqlite nodes
  • Cluster Management: Adding/removing nodes, querying cluster state
  • Configuration: Setting node options (timeouts, compression, etc.)

Key Features:

  • Thread-safe operations using threading.RLock
  • Context manager support for automatic cleanup
  • Graceful error handling with custom exception hierarchy
  • Automatic node ID generation
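
A minimal usage sketch of the node lifecycle; the constructor arguments and attribute names here are assumptions for illustration, not the verified dqlitepy API:

from dqlitepy import Node

# Explicit lifecycle (argument names are illustrative):
node = Node(data_dir="/var/lib/dqlitepy/node1", address="127.0.0.1:9001")
node.start()
try:
    print(node.id)  # node IDs are generated automatically if not supplied
finally:
    node.stop()

# Or rely on the documented context-manager support for cleanup:
with Node(data_dir="/var/lib/dqlitepy/node1", address="127.0.0.1:9001") as node:
    node.start()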

DB-API 2.0 Interface (dqlitepy/dbapi.py)

A PEP 249-compliant database interface providing standard Python database connectivity.

Features:

  • Parameter binding with ? placeholders
  • Transaction support (commit/rollback)
  • Multiple fetch methods
  • Cursor iteration support
  • BLOB and Unicode handling
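
A typical session might look like the following sketch; the connect() arguments are assumptions, while cursor(), execute(), the ? paramstyle, and the fetch methods are standard DB-API 2.0:

from dqlitepy import dbapi

conn = dbapi.connect("127.0.0.1:9001")  # illustrative arguments
try:
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users (name) VALUES (?)", ("alice",))  # ? placeholders
    conn.commit()                     # transaction support

    cur.execute("SELECT id, name FROM users")
    print(cur.fetchone())             # one of the multiple fetch methods
    for row in cur:                   # cursor iteration
        print(row)
finally:
    conn.close()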

FFI Layer (dqlitepy/_ffi.py)

The FFI (Foreign Function Interface) layer uses CFFI to load and interact with the Go shim:

Responsibilities:

  • Library discovery and loading
  • Platform-specific shared library handling
  • C type definitions and function signatures
  • Error code translation
  • Thread-safe library initialization
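
A condensed sketch of the CFFI pattern described above. The function names come from the export table below, but the exact C signatures declared here are assumptions:

import threading
from cffi import FFI

ffi = FFI()

# Declare the subset of shim functions we call; signatures are assumed
# for illustration.
ffi.cdef("""
    const char *dqlitepy_version(void);
    unsigned long long dqlitepy_generate_node_id(const char *address);
""")

_lib = None
_lock = threading.Lock()

def load(path="libdqlitepy.so"):
    # Thread-safe, lazy, one-time load of the shared library.
    global _lib
    with _lock:
        if _lib is None:
            _lib = ffi.dlopen(path)
    return _lib

lib = load()
print(ffi.string(lib.dqlitepy_version()).decode())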

2. Go Shim Layer

The Go shim (go/shim/main_with_client.go) provides a C-compatible bridge between Python and go-dqlite:

Key Exports:

Category             Functions
--------             ---------
Node Lifecycle       dqlitepy_node_create, dqlitepy_node_start, dqlitepy_node_stop, dqlitepy_node_destroy
Node Configuration   dqlitepy_node_set_bind_address, dqlitepy_node_set_auto_recovery, dqlitepy_node_set_busy_timeout
Client Operations    dqlitepy_client_create, dqlitepy_client_add, dqlitepy_client_remove, dqlitepy_client_leader
Cluster Management   dqlitepy_client_cluster, dqlitepy_client_close
Utility              dqlitepy_version, dqlitepy_generate_node_id, dqlitepy_last_error

Memory Management:

// Handle tracking for cleanup
var (
    handleMu      sync.Mutex
    nodeHandles   = make(map[dqlitepy_handle]*app.App)
    clientHandles = make(map[dqlitepy_handle]*client.Client)
    nextHandle    = dqlitepy_handle(1)
)
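
The opaque integer handles exist because cgo's pointer-passing rules prevent C (and therefore Python) callers from holding Go pointers; Python keeps only the handle value, and each call re-resolves it to the underlying *app.App or *client.Client under handleMu.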

3. C Library Layer

The vendored C libraries (dqlite and its dependencies) provide the core distributed database functionality.

Data Flow

Node Creation and Startup
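
At a high level (inferred from the shim exports above): Node.start() acquires the node's lock, calls through the FFI layer into dqlitepy_node_create and dqlitepy_node_start, which construct and start a go-dqlite app.App on top of the dqlite C engine, and stores the returned handle for later calls.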

Cluster Formation
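
At a high level (inferred from the exports above): a client connects to an existing member via dqlitepy_client_create, locates the current leader with dqlitepy_client_leader, and registers the new node's ID and address with dqlitepy_client_add; the membership change is then committed through Raft.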

Query Execution (DB-API)
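
At a high level (inferred from the layering above): cursor.execute() binds any ? parameters, sends the statement over the dqlite wire protocol to the connected node, and, for writes, waits for replication through the Raft leader before commit() or rollback() is acknowledged.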

Thread Safety

dqlitepy implements thread safety at multiple levels:

Python Layer:

  • Each Node has a threading.RLock for state mutations
  • Protects _started, _handle, _finalizer attributes
  • Ensures atomic start/stop operations
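
A sketch of this locking pattern, using the attribute names from the list above; the method bodies and the FFI helpers are illustrative, not the verified implementation:

import threading

class Node:
    def __init__(self):
        self._lock = threading.RLock()  # reentrant, so helpers may re-acquire
        self._started = False
        self._handle = None
        self._finalizer = None

    def start(self):
        with self._lock:
            if self._started:
                return                     # atomic, idempotent start
            self._handle = _node_create()  # hypothetical FFI helper
            self._started = True

    def stop(self):
        with self._lock:
            if not self._started:
                return
            _node_stop(self._handle)       # hypothetical FFI helper
            self._started = False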

Go Layer:

  • Global mutex for handle map operations
  • Per-handle locks for concurrent access
  • Go's runtime manages goroutine synchronization

Raft Layer:

  • All cluster operations go through Raft leader
  • Leader serializes all state changes
  • Provides linearizable consistency

Error Handling

Exception Hierarchy:

Exception
├── DqliteError (base for all dqlite errors)
│   ├── NodeError (node operations)
│   ├── ClientError (client operations)
│   │   ├── ClientClosedError
│   │   ├── ClientConnectionError
│   │   └── ClusterConfigurationError
│   └── DatabaseError (DB-API 2.0 base)
│       ├── DataError
│       ├── IntegrityError
│       ├── NotSupportedError
│       └── OperationalError
└── Warning
    └── ResourceWarning (cleanup issues)
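
Callers can catch at whatever granularity they need. A sketch using the class names above (the import path and method names are assumptions):

from dqlitepy import DqliteError, ClientConnectionError

def add_member(client, node_id, address):
    try:
        client.add(node_id, address)   # illustrative client call
    except ClientConnectionError:
        # Transient network failure: caller may retry against another member.
        raise
    except DqliteError as exc:
        # Catch-all for any other dqlitepy failure.
        raise RuntimeError(f"cluster change failed: {exc}") from exc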

Performance Characteristics

Memory Usage

Component            Memory
---------            ------
Python Node object   ~1 KB
Go app.App           ~50-100 KB
dqlite node          ~10-50 MB (depends on database size)
Per-connection       ~100 KB

Latency Profile

  • Read operations: 0.5-2 ms (no Raft consensus required)
  • Write operations: 2-10 ms (requires Raft consensus)
  • Leader election: 100-500 ms (during failures)

Scalability

Cluster Size

  • Recommended: 3-5 nodes
  • Maximum tested: 7 nodes
  • Optimal for fault tolerance: 3 nodes (tolerates 1 failure); in general, a cluster of n voting nodes tolerates floor((n-1)/2) failures

Database Size

  • SQLite limits: 281 TB theoretical, 140 TB tested
  • Practical limit: Depends on disk and memory
  • Snapshot transfer: Affects new node join time

Connection Pooling

dqlitepy nodes can handle multiple concurrent connections, so a simple pool on top of the DB-API layer works; a minimal sketch follows (the connect() call and its arguments are assumptions, not the verified dqlitepy API).
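
import queue
from dqlitepy import dbapi  # connect() arguments below are assumptions

class ConnectionPool:
    def __init__(self, address, size=4):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(dbapi.connect(address))

    def acquire(self):
        return self._pool.get()   # blocks if the pool is exhausted

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool("127.0.0.1:9001")
conn = pool.acquire()
try:
    cur = conn.cursor()
    cur.execute("SELECT 1")
    print(cur.fetchone())
finally:
    pool.release(conn)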

Security Considerations

Network Security

  • No built-in encryption: dqlite wire protocol is unencrypted
  • Recommendation: Use an encrypted tunnel (stunnel, WireGuard) or a private network
  • Authentication: None built-in, rely on network isolation

File System

  • Data directory: Should have restricted permissions (mode 0700)
  • SQLite files: WAL mode requires proper file locking
  • Snapshots: Contain full database, protect with encryption at rest
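
For example, creating the data directory with mode 0700 from Python:

import os

data_dir = "/var/lib/dqlitepy/node1"   # illustrative path
os.makedirs(data_dir, mode=0o700, exist_ok=True)
os.chmod(data_dir, 0o700)  # makedirs mode is subject to the process umask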

Memory Safety

  • Go runtime: Memory safe, garbage collected
  • C libraries: Potential for memory bugs (use vendored, tested versions)
  • CFFI: Type-safe bindings, validated at runtime

Known Limitations

Upstream Issues

  1. Segfault in stop(): stopping a node via dqlitepy_node_stop() can segfault due to a bug in the underlying dqlite C library

    • Workaround: The automatic finalizer is disabled; callers must invoke stop() explicitly
    • Impact: Nodes are not stopped automatically during cleanup
    • Status: Tracked with upstream maintainers
  2. BLOB Serialization: JSON serialization converts bytes to strings

    • Workaround: Tests handle both bytes and string types
    • Impact: May require manual conversion in application code (see the sketch below)
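
A defensive conversion sketch for values that may round-trip as str:

def as_bytes(value):
    # Values fetched as BLOBs may arrive as str after JSON
    # serialization (see the issue above). Assumes the payload is
    # valid UTF-8; binary-unsafe data needs another scheme (e.g.
    # base64) at the application level.
    if isinstance(value, str):
        return value.encode("utf-8")
    return value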

Current Limitations

  • No SSL/TLS: Wire protocol is unencrypted
  • No authentication: Relies on network security
  • Single database per node: Can't host multiple databases in one node
  • Synchronous API: No async/await support (yet)

Future Enhancements
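
Likely directions, given the limitations above, include async/await support and built-in transport encryption.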
