Roadmap

Planned direction for CSharpDB, organized by timeframe and priority. Reflects the current v3.8.0 state.

Need the full source guide? The original long-form markdown version is preserved as Roadmap Source Reference.

Near-Term Completed

Recently completed improvements to query performance, storage behavior, provider/tooling compatibility, maintenance workflows, and developer ergonomics.

Source-Generated Collections

Done

No-reflection, trim-safe typed collection API via CSharpDB.Generators with GetGeneratedCollectionAsync<T>(), GeneratedCollection<T>, generated field metadata, binary direct payloads for supported shapes, and NativeAOT-friendly model registration.

Collection Write-Path Performance

Done

Separated collection write probes from the read-side B-tree routing-cache, reused traversal scratch during insert/replace, and buffered catalog mutation bookkeeping inside explicit transactions.

Covered Composite Index Fast-Path

Done

Recovered covered composite-index lookup optimization for queries that can be answered entirely from the index without touching the base table.

Durable-Write Batching

Done

Configurable durable commit batch window to coalesce WAL fsync calls across concurrent transactions for higher write throughput.

DISTINCT & Composite Indexes

Done

SELECT DISTINCT deduplicates query output, and multi-column (composite) indexes provide broader query coverage.

Index Range Scans

Done

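Indexes serve range predicates (<, >, <=, >=, BETWEEN), not just equality lookups. Ordered range-scan pushdown is currently limited to single-column INTEGER index paths (see Current Limitations).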
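To make the idea concrete, here is a minimal Python sketch of a range scan over an ordered index. It is not CSharpDB code: the sorted list stands in for the leaf level of a B+tree, and the names index_range_scan and age_index are hypothetical.

```python
from bisect import bisect_left, bisect_right

def index_range_scan(index, low, high):
    """Return the (key, row_id) entries with low <= key <= high.

    `index` is a list of (key, row_id) tuples sorted by key. It stands in
    for the leaf level of an ordered index.
    """
    # Keys are extracted here for clarity only; a real B+tree would descend
    # to the first qualifying leaf instead of materializing a key list.
    keys = [key for key, _ in index]
    start = bisect_left(keys, low)    # first entry with key >= low
    stop = bisect_right(keys, high)   # one past the last entry with key <= high
    return index[start:stop]

# Hypothetical index on an INTEGER column: (key, row_id) pairs in key order.
age_index = [(18, 7), (21, 3), (21, 9), (30, 1), (42, 5), (57, 2)]

# Equivalent in spirit to: WHERE age BETWEEN 21 AND 42
matches = index_range_scan(age_index, 21, 42)
```

Only the entries between the two binary-search positions are touched, which is what separates a range scan from a full table scan with a filter.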

Prepared Statement Cache

Done

Cache parsed ASTs and query plans to avoid re-parsing identical SQL statements.
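The following Python sketch illustrates the general technique of caching prepared statements by SQL text. The eviction policy (LRU), the StatementCache class, and fake_prepare are assumptions made for illustration; the roadmap does not describe how CSharpDB bounds or keys its cache.

```python
from collections import OrderedDict

class StatementCache:
    """Tiny LRU cache keyed by SQL text (hypothetical, for illustration)."""

    def __init__(self, capacity, prepare):
        self._capacity = capacity
        self._prepare = prepare          # the expensive parse + plan step
        self._entries = OrderedDict()    # SQL text -> prepared statement
        self.hits = 0
        self.misses = 0

    def get(self, sql):
        if sql in self._entries:
            self.hits += 1
            self._entries.move_to_end(sql)      # mark as most recently used
            return self._entries[sql]
        self.misses += 1
        prepared = self._prepare(sql)
        self._entries[sql] = prepared
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # evict the least recently used
        return prepared

def fake_prepare(sql):
    # Stand-in for tokenizing, parsing, and planning a statement.
    return ("plan", tuple(sql.split()))

cache = StatementCache(capacity=2, prepare=fake_prepare)
cache.get("SELECT * FROM users WHERE id = @id")   # miss: parsed and planned
cache.get("SELECT * FROM users WHERE id = @id")   # hit: served from the cache
```

Because the key is the statement text, parameterized SQL benefits most: the same text is reused while only the parameter values change.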

In-Memory Database Mode

Done

Open a database fully in memory, load from disk, and save committed snapshots back to disk.

Collection Path Indexes

Done

Nested scalar, array-element, nested array-object, Guid, temporal, and ordered text path indexes.

B+Tree Delete Rebalancing

Done

Merge underflowed pages on delete to reclaim space via borrow/merge with interior collapse.
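As a rough illustration of the borrow/merge decision, the Python sketch below treats leaf pages as sorted key lists. The function name, the min_keys threshold, and the choice of the right sibling are assumptions; the actual page layout and sibling selection in CSharpDB are not described here.

```python
def rebalance_leaf(leaf, sibling, min_keys):
    """Repair an underflowed leaf using its right sibling.

    Returns one of:
      ("ok", leaf, sibling)      the leaf still holds enough keys
      ("borrow", leaf, sibling)  one key moved over from the sibling
      ("merge", merged, None)    both leaves folded into a single page

    Both inputs are sorted key lists, and every key in `leaf` is smaller
    than every key in `sibling`.
    """
    if len(leaf) >= min_keys:
        return ("ok", leaf, sibling)
    if len(sibling) > min_keys:
        # Borrow: take the sibling's smallest key. In a real tree the
        # parent's separator key would be updated to the sibling's new
        # first key.
        return ("borrow", leaf + sibling[:1], sibling[1:])
    # Merge: the sibling cannot spare a key, so combine both pages and free
    # one. The parent loses a separator and may underflow in turn.
    return ("merge", leaf + sibling, None)
```

When merges propagate upward and an interior page is left with a single child, that page can be removed as well, which is the interior collapse the item refers to.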

Database Administration

Done

Maintenance report, REINDEX, VACUUM/compact, fragmentation analysis, and database size report.

Dedicated gRPC Daemon

Done

CSharpDB.Daemon host with full gRPC coverage for SQL, schema, procedures, collections, and maintenance.

Background WAL Checkpointing

Done

Incremental/sliced auto-checkpointing to move work off the triggering commit path.

Hybrid Storage Mode

Done

Lazy-resident durable storage with on-demand page loading and gRPC tunable file-cache.

Table & Index Statistics

Done

ANALYZE command with persisted row counts, column NDV/min/max, and initial stats-guided index selection.
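The statistics named above can be sketched in a few lines of Python. This is an illustration of what row count, NDV (number of distinct values), min, and max mean, plus a textbook uniform-distribution estimate; the function names are hypothetical and the estimate is not claimed to be the formula CSharpDB uses.

```python
def analyze_column(values):
    """Collect simple per-column statistics, ignoring NULLs for NDV/min/max."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "non_null_count": len(non_null),
        "ndv": len(set(non_null)),                       # distinct values
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }

def estimate_equality_rows(stats):
    """Estimate rows matching `col = constant`, assuming a uniform spread."""
    if stats["ndv"] == 0:
        return 0.0
    return stats["non_null_count"] / stats["ndv"]

stats = analyze_column([3, 1, 3, None, 7])
```

A planner can compare such estimates across candidate indexes and prefer the one expected to return the fewest rows.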

Client Backup & Restore

Done

BackupAsync / RestoreAsync as first-class operations across direct, HTTP, gRPC, CLI, and Admin.

Native Table Archives & External Tables

Done

Native .csdbtable snapshots with fast Admin Import / Export, download or server-path destinations, CREATE EXTERNAL TABLE, sys.external_tables, read-only scans/joins, and embedded primary-key lookup indexes.

Older DB Foreign-Key Retrofit Migration

Done

Validate/apply maintenance workflow that rewrites existing child tables with persisted FK metadata across direct, HTTP, gRPC, CLI, and Admin.

Admin Reports Designer

Done

Visual banded-report designer with grouping, sorting, expressions, aggregate functions, page settings, and printable preview.

Mid-Term In Progress

SQL feature parity, provider/tooling compatibility, and ecosystem expansion.

User-Defined Functions and Commands

Done

Done for the trusted in-process model: host-registered C# scalar functions, common SQL/Admin built-ins, trusted commands, Admin Forms/Reports/pipeline hooks, declarative form action sequences, and local Admin Forms C# code modules. Untrusted sandboxed UDF execution is intentionally out of scope.

Writable External Tables

Planned

Opt-in writable external table registrations over mutable .csdbx files, backed by CSharpDB B+tree storage and limited to INSERT, UPDATE, and DELETE in v1 while .csdbtable archives remain read-only.

Window Functions

Planned

ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG() for analytical queries.
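Since these functions are planned rather than shipped, the sketch below only illustrates their standard SQL semantics in plain Python over a single partition ordered by score descending. The helper names rank_rows, lag, and lead are hypothetical.

```python
def rank_rows(scores):
    """Return (score, row_number, rank, dense_rank) tuples, highest score first."""
    ordered = sorted(scores, reverse=True)
    out = []
    rank = 0
    dense_rank = 0
    previous = object()   # sentinel that never equals a real score
    for row_number, score in enumerate(ordered, start=1):
        if score != previous:
            rank = row_number   # RANK jumps past tied rows
            dense_rank += 1     # DENSE_RANK advances by one per distinct value
            previous = score
        out.append((score, row_number, rank, dense_rank))
    return out

def lag(values, offset=1, default=None):
    """Value from `offset` rows earlier, or `default` at the start."""
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]

def lead(values, offset=1, default=None):
    """Value from `offset` rows later, or `default` at the end."""
    n = len(values)
    return [values[i + offset] if i + offset < n else default
            for i in range(n)]
```

For the scores 90, 80, 90, 70, the two tied rows share rank 1; RANK then continues at 3 while DENSE_RANK continues at 2.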

DEFAULT & CHECK Constraints

Planned

Default expressions in column definitions and arbitrary expression-based constraints per column or table.

Foreign Key Constraints

Done

v1 support for single-column, column-level REFERENCES with optional ON DELETE CASCADE, plus metadata/tooling surfaces.

Remote Host Consolidation

Done

CSharpDB.Daemon now hosts the existing REST/HTTP /api surface and gRPC from one long-running process backed by the same warm daemon-hosted client. Standalone CSharpDB.Api remains supported for REST-only hosting.

Remote API-Key Protection

Done

Opt-in API-key mode protects REST /api/* and daemon gRPC calls with constant-time key comparison while keeping default no-auth behavior for compatibility.
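Constant-time comparison matters because an ordinary string comparison can return early at the first mismatching character, which leaks timing information about the key. The Python sketch below shows the technique with the standard library's hmac.compare_digest; the is_authorized function and its None-means-no-auth convention are assumptions for illustration, not the CSharpDB implementation.

```python
import hmac

def is_authorized(presented_key, expected_key):
    """Check an API key without leaking where the first mismatch occurs."""
    if expected_key is None:
        return True    # API-key mode is off: default no-auth behavior
    if presented_key is None:
        return False   # a key is required but none was sent
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(
        presented_key.encode("utf-8"),
        expected_key.encode("utf-8"),
    )
```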

Remote Host Security Hardening

Planned

Authorization, protected admin endpoint scopes, JWT/RBAC options, and TLS/mTLS deployment helpers for remote HTTP and gRPC access.

Daemon Service Packaging

Done

CSharpDB.Daemon can be packaged as a persistent background service across systemd, Windows Service, and launchd.

Cross-Platform Distribution

In Progress

Self-contained daemon archives and install scripts ship for Windows, Linux, and macOS; dotnet tool, Docker, Homebrew, and winget distribution remain future work.

ADO.NET GetSchema

Done

DbConnection.GetSchema() now exposes standard metadata collections for tooling and ORM schema discovery.

Collation Support

Done

BINARY, NOCASE, NOCASE_AI, and ICU:<locale> collation now work across SQL and collection indexes; dedicated ordered SQL text index optimization remains future work.

Subqueries & Set Operations

Done

Scalar subqueries, IN/EXISTS (including correlated), UNION, INTERSECT, EXCEPT across SELECT results.
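The three set operations all return distinct rows, which is the main difference from UNION ALL. A small Python sketch of those semantics, with rows modeled as tuples, is shown below. SQL does not guarantee output order; first-seen order is kept here only to make the result deterministic, and the function names are hypothetical.

```python
def union(a, b):
    """Distinct rows that appear in either input."""
    return list(dict.fromkeys(a + b))

def intersect(a, b):
    """Distinct rows that appear in both inputs."""
    in_b = set(b)
    return [row for row in dict.fromkeys(a) if row in in_b]

def except_(a, b):
    """Distinct rows of the first input that are absent from the second."""
    in_b = set(b)
    return [row for row in dict.fromkeys(a) if row not in in_b]

left = [(1,), (2,), (2,), (3,)]
right = [(2,), (4,)]
```

Here union(left, right) collapses the duplicated (2,) row, whereas a UNION ALL would have kept all six input rows.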

Visual Query Designer

Done

Admin query builder with source canvas, join editing, design grid, SQL preview, and saved layouts.

Long-Term Future

Advanced features and fundamental architecture enhancements, including long-range items that have since shipped.

Full-Text Search

Done

Inverted index support with tokenization, stemming, and relevance ranking.

Source-Generated Collections

Done

Current phase is complete: opt-in generated models provide GetGeneratedCollectionAsync<T>, generated descriptors/index bindings, binary direct payloads for supported shapes, JSON fallback for unsupported shapes, and trim/NativeAOT smoke coverage.

Generated Collection Package Ergonomics

Planned

Streamline NuGet/analyzer packaging, templates, onboarding docs, and project setup for the opt-in generated collection path.

Broader Generated Model Coverage

Planned

Expand generator support beyond the current scalar, scalar collection, nested scalar, and nested collection-scalar shapes.

SQL Batched Row Transport

Done

Internal row-batch transport serves as the batch-first SQL execution foundation across batch-capable result boundaries, scans, joins, and generic aggregates.

External Table Index Coverage

Planned

Follow writable .csdbx storage with broader external-table indexes, planner costing, and multi-column lookup/range support beyond the current archive primary-key point-lookup path.

Page-Level Compression

Planned

Deep engine/page compression remains planned; application-level payload compression is available as a sample/SDK pattern without changing the storage format.

At-Rest Encryption

Research

Encrypt database and WAL files with passphrase-based key management and explicit plaintext/encrypted migration/export paths; implementation must meet the database-encryption plan entry criteria before shipping.

Cost-Based Query Optimizer

Done

Current phase is complete: ANALYZE-driven stats-guided costing uses internal histograms, heavy hitters, composite-prefix summaries, skew-aware estimates, correlation-aware filters/joins, non-unique lookup costing, hash build-side choice, and bounded DP join reordering.

Adaptive Query Re-Optimization

Done

Current phase is complete: opt-in adaptive join execution can switch eligible index nested-loop joins to hash joins and flip inner hash build sides at safe pre-emission boundaries.

Public Planner Histogram Inspection

Done

Stable SQL-first diagnostics expose sys.planner_histograms, sys.planner_heavy_hitters, sys.planner_index_prefix_stats, and EXPLAIN ESTIMATE FOR <query>.

Async I/O Batching

Done

Current phase is complete: WAL frame-chunk writes, chunked checkpoint page copies, shared snapshot/export batching, reusable B-tree copy utilities, and the close-out audit cover the main storage and maintenance write paths.

Low-Latency Durable Writes

Done

Advisory planner-stat persistence can stay deferred without weakening committed-row durability, and sys.table_stats.row_count_is_exact makes exact versus estimated row-count semantics explicit.

Group Commit / Deferred WAL Flush

Done

Opt-in UseDurableCommitBatchWindow(...) batches durable WAL flushes across contending in-process transactions. It is an expert, measure-first knob rather than default behavior.
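The trade-off can be seen in a small simulation. The Python sketch below groups commit arrival times into batches that share one flush; it assumes a batch opens at the first commit and closes a fixed window later, which may differ from how UseDurableCommitBatchWindow(...) actually schedules flushes. The function name plan_flushes is hypothetical.

```python
def plan_flushes(commit_times_ms, window_ms):
    """Group commit arrival times into batches that share a single flush.

    A batch opens when its first commit arrives and closes `window_ms`
    later. Every commit that arrives in that interval rides the same flush.
    """
    batches = []
    for arrival in sorted(commit_times_ms):
        if batches and arrival <= batches[-1][0] + window_ms:
            batches[-1].append(arrival)   # joins the open batch
        else:
            batches.append([arrival])     # opens a new batch and a new flush
    return batches

arrivals = [0, 1, 2, 10, 11, 30]
batched = plan_flushes(arrivals, window_ms=5)     # 3 flushes for 6 commits
unbatched = plan_flushes(arrivals, window_ms=0)   # 1 flush per commit
```

Fewer flushes means higher throughput under contention, but the first commit in each batch waits up to the full window, which is why the setting is worth measuring before enabling.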

Initial Multi-Writer Support

Done

Explicit WriteTransaction conflict-detected retry flow, shared auto-commit non-insert isolation, and opt-in ConcurrentWriteTransactions for shared implicit inserts.

Broader Multi-Writer Optimization

Done

Opt-in concurrent write transactions now reserve shared row-id ranges and rebase hot right-edge insert pages against pending WAL images for improved insert fan-in.
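Reserving row-id ranges lets each writer assign ids locally instead of contending on a shared counter for every insert. The Python sketch below shows only that idea; the RowIdAllocator class is hypothetical, and it does not model the WAL rebasing part of the feature.

```python
import threading

class RowIdAllocator:
    """Hands out disjoint, consecutive row-id ranges to concurrent writers."""

    def __init__(self, next_row_id=1):
        self._next = next_row_id
        self._lock = threading.Lock()

    def reserve(self, count):
        """Atomically reserve `count` consecutive row ids."""
        with self._lock:
            start = self._next
            self._next += count   # the lock is held only for this bump
        return range(start, start + count)

allocator = RowIdAllocator()
writer_a = allocator.reserve(3)   # this writer may use ids 1, 2, 3
writer_b = allocator.reserve(2)   # this writer may use ids 4, 5
```

Each writer takes the lock once per range rather than once per row, so contention drops roughly in proportion to the range size.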

API-Level Sharding

Research

Route API/daemon requests across multiple warm CSharpDB database files so independent tenants or shard keys can use separate WAL and commit paths, with v1 focused on single-shard writes and point reads.
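Because this item is still in research, the following Python sketch is only one plausible routing scheme: a stable hash of the shard key selects one of several database files. The shard_for function, the file names, and the choice of SHA-256 are all assumptions for illustration.

```python
import hashlib

def shard_for(shard_key, shard_files):
    """Map a shard key to one database file, stably across processes."""
    # A cryptographic digest is used because Python's built-in hash() for
    # strings is randomized per process and would route inconsistently.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shard_files)
    return shard_files[index]

# Hypothetical shard files, each with its own WAL and commit path.
shards = ["shard-0.db", "shard-1.db", "shard-2.db", "shard-3.db"]
target = shard_for("tenant-42", shards)
```

Simple modulo routing remaps most keys when the shard count changes, so a production design would likely need a fixed shard map or consistent hashing.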

Replication & Change Feed

Research

Retained commit-log change feeds and reactive query subscriptions for read replicas, live Admin views, and event-driven applications.

Current Limitations

Known simplifications in the current implementation:

Area	Limitation
Functions and automation	CSharpDB's UDF/command model is trusted and in-process by design. Current supported surfaces include host-registered scalar functions, common built-ins, trusted commands, form/report/pipeline hooks, declarative action sequences, and local Admin Forms C# modules; untrusted sandboxed execution is intentionally out of scope
Query	Scalar/IN/EXISTS subqueries are supported, including correlated cases in WHERE, non-aggregate projection, and UPDATE/DELETE expressions; correlated subqueries are not yet supported in JOIN ON, GROUP BY, HAVING, ORDER BY, or aggregate projections
Query	UNION, INTERSECT, and EXCEPT are supported; UNION ALL is not implemented yet
Query	No window functions
Schema	No SQL DEFAULT column values or CHECK constraints yet. Foreign keys are currently v1 only: single-column, column-level REFERENCES with optional ON DELETE CASCADE; table-level/composite/deferred foreign keys and ON UPDATE actions are not implemented
Indexes	Equality lookups support current INTEGER/TEXT indexes, but ordered range-scan pushdown is still limited to single-column INTEGER index paths
RowId	Legacy table schemas without persisted high-water metadata may pay a one-time key scan on first insert
Collections	`FindByIndexAsync` supports declared field-equality lookups; `FindByPathAsync` and `FindByPathRangeAsync` support path-based queries on indexed paths; `FindAsync` remains a full scan for unindexed predicates. Generated collections require registered descriptors for existing collection indexes; unsupported generated model shapes warn and use the source-generated JSON fallback instead of binary direct payloads
External Tables	Native `.csdbtable` archives can be registered and queried as read-only external tables. Writable external tables are planned as an opt-in `.csdbx` format; current archives remain read-only, and broader external indexes, range seeks, and deeper planner costing remain planned
Networking	`CSharpDB.Daemon` now hosts both REST and gRPC from one process; named pipes remain reserved but are not implemented end to end today
Security	Remote REST and daemon gRPC support opt-in API-key authentication, defaulting to `None` for compatibility. JWT, RBAC, mTLS helpers, TLS-specific configuration, and at-rest encryption are not implemented
Admin Forms	The Forms designer/runtime supports the core generated-form and data-entry path plus trusted command-backed automation, including lifecycle events, command buttons, selected-control events, conditional UI rules, domain formula helpers, declarative action sequences, and local C# code modules. It still needs Access-parity work for responsive runtime rendering, complete inferred validation, richer form modes, additional events, advanced filtering/sorting, report/query/import/export actions, macro loops/on-error/temp vars, and broader controls
Admin Reports	The Reports designer/runtime supports the core banded preview path plus trusted command-backed preview lifecycle events, but still needs Access-parity work for bounded saved-query previews, full report output/export, parameters, richer grouping and totals semantics, conditional formatting, subreports, and broader controls
Text / Multilingual	Text is stored as UTF-8 and supports all Unicode languages; default semantics remain ordinal, but opt-in `BINARY`, `NOCASE`, `NOCASE_AI`, and `ICU:<locale>` collation are implemented for SQL and collection indexes. Dedicated ordered SQL text index optimization remains planned
Concurrency	Physical WAL commit path is still serialized at the storage boundary. Initial multi-writer support is shipped, but observed gains depend on conflict shape and whether shared auto-commit INSERT is left on the default serialized path
Storage	No page-level compression; the compression SDK sample stores compressed payloads as ordinary application-managed `BLOB` values
Storage	No at-rest encryption for database/WAL files; on-disk storage is plaintext only
Storage	Memory-mapped reads are opt-in and currently apply only to clean main-file pages; WAL-backed reads still rely on the WAL/cache path
Storage	By default, durable auto-commit single-row writes still pay a physical WAL flush per commit; opt-in `UseDurableCommitBatchWindow(...)` can trade some commit latency for higher throughput
Query	Phase-2 cost-based planning is in place: `ANALYZE`, `sys.table_stats`, `sys.column_stats`, public planner-stat diagnostics, histogram/heavy-hitter/prefix estimates, and bounded small-chain join reordering now feed join/access-path costing. Opt-in adaptive join re-optimization can react to stale-stat or parameter-sensitive join cardinality misses, while broader runtime actuals, `EXPLAIN ANALYZE`, and full mid-plan reordering remain future work
Query	Internal row-batch transport is now the default scan-heavy execution foundation across batch-capable scans, joins, aggregates, and result boundaries; remaining work is broader kernel specialization and optional SIMD-style tuning rather than missing core batch coverage

Completed Milestones

Major features already implemented and shipped:

✓ Single-file database with 4 KB page-oriented storage

✓ B+tree-backed tables and secondary indexes

✓ Write-Ahead Log with crash recovery and auto-checkpoint

✓ Concurrent snapshot-isolated readers via WAL-based MVCC

✓ Full SQL pipeline: tokenizer, parser, planner, operators

✓ JOINs (INNER, LEFT, RIGHT, CROSS), aggregates, GROUP BY, HAVING, CTEs

✓ UNION, INTERSECT, EXCEPT set operations

✓ Scalar/IN/EXISTS subqueries (incl. correlated) in filters, projections, and UPDATE/DELETE

✓ Scalar TEXT(expr) for filter-friendly text coercion

✓ Composite (multi-column) indexes

✓ Ordered integer index range scans in the fast lookup path

✓ ANALYZE with persisted table/column stats and stale-aware refresh

✓ Phase-2 cost-based query planning: statistics-guided access paths, join method/reordering, histogram/cardinality estimation

✓ Public planner diagnostics with EXPLAIN ESTIMATE and sys.planner_* catalogs

✓ Opt-in adaptive join re-optimization for eligible stale-stat and parameter-sensitive joins

✓ SELECT DISTINCT and DISTINCT aggregates

✓ SQL statement and SELECT plan caching

✓ First-class IDENTITY / AUTOINCREMENT support for INTEGER PRIMARY KEY columns

✓ Persisted table NextRowId high-water mark with compatibility fallback

✓ Batch-first SQL row-batch execution across scans, joins, aggregates, and result boundaries

✓ Views and triggers (BEFORE/AFTER on INSERT/UPDATE/DELETE)

✓ Foreign key constraints: single-column REFERENCES with optional ON DELETE CASCADE

✓ Older-database foreign-key retrofit migration across direct, HTTP, gRPC, CLI, and Admin

✓ ADO.NET provider with connection pooling and GetSchema metadata collections

✓ In-memory database mode with explicit load/save APIs

✓ Shared/private in-memory ADO.NET connections with named shared-memory hosts

✓ Document Collection API with typed Put/Get/Delete/Scan/Find

✓ Collection secondary field indexes via EnsureIndexAsync / FindByIndexAsync

✓ Binary direct-payload collection storage with direct hydration and field/path extraction

✓ Collection path indexes: nested scalar, array-element, nested array-object, Guid, temporal, ordered text

✓ Collection path query APIs: FindByPathAsync and FindByPathRangeAsync

✓ Source-generated typed collection fast path with trim-safe NativeAOT-friendly access

✓ Full-text search with tokenization, stemming, and relevance ranking

✓ Hybrid storage mode with lazy-resident durable storage and gRPC tunable file-cache

✓ Client-wide BackupAsync / RestoreAsync across direct, HTTP, gRPC, CLI, and Admin

✓ Native .csdbtable table archives with Admin Import / Export and read-only external table registration

✓ ReplaceAsync for index stores

✓ Maintenance report, REINDEX, and VACUUM flows across client, CLI, API, and Admin UI

✓ Dedicated gRPC daemon host

✓ Remote host consolidation in CSharpDB.Daemon, with REST /api and gRPC sharing one warm hosted database client

✓ Opt-in API-key protection for REST /api/* and daemon gRPC calls

✓ Daemon service packaging with self-contained archives and service install assets

✓ Storage tuning presets, bounded WAL read caching, memory-mapped reads, and sliced background checkpointing

✓ SQL executor/read-path fast paths for compact projections, broader join/index coverage, and correlated subquery filters

✓ REST API with 34+ endpoints and OpenAPI/Scalar documentation

✓ Blazor Server admin dashboard with Forms and Reports designers

✓ Trusted C# callbacks, commands, Admin automation hooks, and local Admin Forms C# code modules

✓ Interactive CLI with meta-commands and file execution

✓ Package-driven ETL pipelines with validation, dry-run, execute/resume, and Admin visual designer

✓ VS Code extension with schema explorer

✓ MCP server for AI assistant integration

✓ NativeAOT C library for cross-language FFI

✓ B+tree delete rebalancing with underflow handling

✓ Reusable snapshot reader sessions for higher concurrent-read throughput

✓ Comprehensive benchmark suite (micro, macro, stress, scaling, in-memory, shared-memory)

✓ Collection write-path performance recovery with separated read/write B-tree routing

✓ Covered composite-index fast-path optimization

✓ Durable-write commit batching for higher concurrent write throughput