Full-Text Search
Built-in full-text search over SQL tables. Create an index on one or more TEXT columns, then run term-intersection queries — all backed by B+trees with zero external dependencies.
Creating an Index
Use EnsureFullTextIndexAsync to create a full-text index on one or more TEXT columns. If the index already exists, the call is a no-op. Existing rows are backfilled automatically in the same transaction.
await using var db = await Database.OpenAsync("myapp.db");
// Create a table with text content
await db.ExecuteAsync(
"CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)");
// Insert some documents
await db.ExecuteAsync(
"INSERT INTO docs VALUES (1, 'Quick Start', 'Learn how to build fast apps')");
await db.ExecuteAsync(
"INSERT INTO docs VALUES (2, 'Storage Engine', 'B+tree paging and WAL internals')");
// Create a full-text index across title and body columns
await db.EnsureFullTextIndexAsync(
"fts_docs", // index name
"docs", // table name
["title", "body"] // TEXT columns to index
);
Searching
Call SearchAsync with a query string. The query is tokenized with the same rules as the index, and all terms must match (AND semantics). Results are returned as a list of FullTextSearchHit values containing the RowId and a relevance Score.
// Single-term search
var hits = await db.SearchAsync("fts_docs", "storage");
// Multi-term search (AND — all terms must match)
var hits = await db.SearchAsync("fts_docs", "fast apps");
// Use results to fetch full rows
foreach (var hit in hits)
{
var row = await db.QuerySingleAsync(
$"SELECT * FROM docs WHERE id = {hit.RowId}");
Console.WriteLine($"[{hit.Score:F2}] {row["title"]}");
}
Automatic Maintenance
Once created, the full-text index is automatically kept in sync with DML operations on the source table. No manual re-indexing is needed.
// INSERT — new row is automatically indexed
await db.ExecuteAsync(
"INSERT INTO docs VALUES (3, 'Pipelines', 'ETL with CSV and JSON connectors')");
var hits = await db.SearchAsync("fts_docs", "pipelines");
// hits contains row 3
// UPDATE — old tokens removed, new tokens added
await db.ExecuteAsync(
"UPDATE docs SET body = 'ETL runtime with validation' WHERE id = 3");
// "connectors" no longer matches row 3, "validation" now does
// DELETE — tokens are removed from the index
await db.ExecuteAsync("DELETE FROM docs WHERE id = 3");
// row 3 no longer appears in any search results
Index Options
Customize tokenization behavior by passing a FullTextIndexOptions object when creating the index.
await db.EnsureFullTextIndexAsync(
"fts_docs", "docs", ["title", "body"],
new FullTextIndexOptions
{
Normalization = NormalizationForm.FormKC, // Unicode normalization (default: NFKC)
LowercaseInvariant = true, // case-insensitive matching (default)
StorePositions = true, // store token positions (default)
});
| Option | Default | Description |
|---|---|---|
Normalization | FormKC | Unicode normalization form applied before tokenization |
LowercaseInvariant | true | Convert tokens to lowercase for case-insensitive search |
StorePositions | true | Store token positions in the postings list |
Tokenization
The tokenizer splits text on non-word boundaries. Word characters are Unicode letters, digits, and apostrophes. Text is normalized and lowercased according to the index options before tokenization.
// "Quick brown fox" → ["quick", "brown", "fox"]
// "Runner's high" → ["runner's", "high"]
// "CSharpDB v2.4" → ["csharpdb", "v2", "4"]
How It Works
Each full-text index is backed by four internal B+tree indexes managed by the catalog service:
| Internal Index | Purpose |
|---|---|
__meta | Index-level metadata (corpus statistics, state, version) |
__terms | Term frequency and document frequency statistics per term |
__postings | Varint-encoded postings lists with delta-encoded DocIds and positions |
__docstats | Per-document statistics (document length) |
Dropping an Index
Use standard SQL DROP INDEX to remove a full-text index. All four internal indexes are cascade-deleted automatically.
await db.ExecuteAsync("DROP INDEX fts_docs");