Full-Text Search

Built-in full-text search over SQL tables. Create an index on one or more TEXT columns, then run term-intersection queries — all backed by B+trees with zero external dependencies.

Creating an Index

Use EnsureFullTextIndexAsync to create a full-text index on one or more TEXT columns. If the index already exists, the call is a no-op. Existing rows are backfilled automatically in the same transaction.

await using var db = await Database.OpenAsync("myapp.db");

// Create a table with text content
await db.ExecuteAsync(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)");

// Insert some documents
await db.ExecuteAsync(
    "INSERT INTO docs VALUES (1, 'Quick Start', 'Learn how to build fast apps')");
await db.ExecuteAsync(
    "INSERT INTO docs VALUES (2, 'Storage Engine', 'B+tree paging and WAL internals')");

// Create a full-text index across title and body columns
await db.EnsureFullTextIndexAsync(
    "fts_docs",     // index name
    "docs",         // table name
    ["title", "body"]  // TEXT columns to index
);

Searching

Call SearchAsync with a query string. The query is tokenized with the same rules as the index, and all terms must match (AND semantics). Results are returned as a list of FullTextSearchHit values containing the RowId and a relevance Score.

// Single-term search
var hits = await db.SearchAsync("fts_docs", "storage");

// Multi-term search (AND — all terms must match)
var hits = await db.SearchAsync("fts_docs", "fast apps");

// Use results to fetch full rows
foreach (var hit in hits)
{
    var row = await db.QuerySingleAsync(
        $"SELECT * FROM docs WHERE id = {hit.RowId}");
    Console.WriteLine($"[{hit.Score:F2}] {row["title"]}");
}

Automatic Maintenance

Once created, the full-text index is automatically kept in sync with DML operations on the source table. No manual re-indexing is needed.

// INSERT — new row is automatically indexed
await db.ExecuteAsync(
    "INSERT INTO docs VALUES (3, 'Pipelines', 'ETL with CSV and JSON connectors')");
var hits = await db.SearchAsync("fts_docs", "pipelines");
// hits contains row 3

// UPDATE — old tokens removed, new tokens added
await db.ExecuteAsync(
    "UPDATE docs SET body = 'ETL runtime with validation' WHERE id = 3");
// "connectors" no longer matches row 3, "validation" now does

// DELETE — tokens are removed from the index
await db.ExecuteAsync("DELETE FROM docs WHERE id = 3");
// row 3 no longer appears in any search results

Index Options

Customize tokenization behavior by passing a FullTextIndexOptions object when creating the index.

await db.EnsureFullTextIndexAsync(
    "fts_docs", "docs", ["title", "body"],
    new FullTextIndexOptions
    {
        Normalization    = NormalizationForm.FormKC,  // Unicode normalization (default: NFKC)
        LowercaseInvariant = true,                   // case-insensitive matching (default)
        StorePositions     = true,                   // store token positions (default)
    });
OptionDefaultDescription
NormalizationFormKCUnicode normalization form applied before tokenization
LowercaseInvarianttrueConvert tokens to lowercase for case-insensitive search
StorePositionstrueStore token positions in the postings list

Tokenization

The tokenizer splits text on non-word boundaries. Word characters are Unicode letters, digits, and apostrophes. Text is normalized and lowercased according to the index options before tokenization.

// "Quick brown fox" → ["quick", "brown", "fox"]
// "Runner's high"   → ["runner's", "high"]
// "CSharpDB v2.4"   → ["csharpdb", "v2", "4"]

How It Works

Each full-text index is backed by four internal B+tree indexes managed by the catalog service:

Internal IndexPurpose
__metaIndex-level metadata (corpus statistics, state, version)
__termsTerm frequency and document frequency statistics per term
__postingsVarint-encoded postings lists with delta-encoded DocIds and positions
__docstatsPer-document statistics (document length)
No External Dependencies: Full-text search is implemented entirely within CSharpDB's existing B+tree and WAL infrastructure. All index data is stored inside the same database file and participates in transactions, snapshots, and checkpointing.

Dropping an Index

Use standard SQL DROP INDEX to remove a full-text index. All four internal indexes are cascade-deleted automatically.

await db.ExecuteAsync("DROP INDEX fts_docs");