Skip to content

feat: task-relevant code summaries with Turso vector search#3185

Open
oldschoola wants to merge 6 commits into
can1357:mainfrom
oldschoola:taskrelevant_context
Open

feat: task-relevant code summaries with Turso vector search#3185
oldschoola wants to merge 6 commits into
can1357:mainfrom
oldschoola:taskrelevant_context

Conversation

@oldschoola

@oldschoola oldschoola commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds a native (non-MCP) code summaries feature inspired by devalade/codemap: agent-written file-level summaries persisted in Turso/libSQL with native vector search, retrieved as minimal task-relevant context via hybrid FTS5 + vector retrieval with reciprocal rank fusion and budget packing.

Key features

  • Agent-written summaries: set_file_summary, get_file_summary, get_task_context, delete_file_summary tools
  • Turso/libSQL storage with F32_BLOB vector columns + libsql_vector_idx DiskANN index + vector_top_k() ANN queries
  • Hybrid retrieval: FTS5 lexical + vector semantic → reciprocal rank fusion (k=60) → budget packer (ceil(chars/4) + 20 token formula)
  • Automatic Turso provisioning: auto-creates a Turso DB when codemap.turso.autoProvision is enabled and TURSO_API_TOKEN is available, persists credentials via settings.set()
  • Staleness tracking: Bun.hash content hash comparison — stale summaries flagged on file changes
  • Lazy embeddings: generated on retrieval (not write) via a decoupled MnemopiEmbedClient instance — keeps set_file_summary fast
  • First-turn auto-injection: task-relevant summaries injected into the system prompt on the first turn of each session
  • Path traversal guard: toStoredPath rejects file paths that resolve outside the project cwd

Design decisions (verified through adversarial workflowz design panel)

  • Distinct feature module (codemap.* settings) — NOT in memory.backend enum. Composes with any memory backend including "off" (the default)
  • Independent injection seam: runs BEFORE the memory-backend block in #buildSystemPromptForAgentStart, gated only on codemap.enabled — fixes the dead-seam issue where offBackend has no beforeAgentStartPrompt hook
  • Singular PK + ROWID: required by libsql_vector_idx (composite PK without ROWID is not supported)
  • FTS5 virtual table (not experimental USING fts index): stable, proven in codebase (history-storage.ts, mnemopi/schema.ts), works in @libsql/client without experimental flags
  • Pluggable LanguageAdapter interface: TsAdapter ships in v1 (LSP-based), Go/Python/Rust adapters are future work
  • Schema dimension parameterized: buildSchemaSql(dimensions) adapts to embedding model (768d for en, 1024d for multilingual)

New module: packages/coding-agent/src/task-context/ (13 files)

File Responsibility
schema.ts SQL DDL (parameterized dimensions, FTS5, vector index)
config.ts Settings → typed config loader
db.ts libsql client factory + post-sync maintenance
staleness.ts Bun.hash content hash + staleness detection
store.ts CRUD + FTS + vector search data layer
adapter.ts Pluggable language adapter + TsAdapter
retrieve.ts Hybrid retrieval pipeline (RRF + budget packer)
embed.ts Decoupled embedding client (lazy, on retrieval)
turso.ts Auto-provisioning + connection resolution
tools.ts 4 AgentTool classes with createIf gating + path traversal guard
prompt.ts System-prompt injection helpers
state.ts Per-session state via Symbol key
index.ts Barrel re-exports + resolveCodemap/shutdownCodemap + injectCodemapTaskContext

Integration edits (10 existing files)

settings-schema.ts, settings-defs.ts, builtin-names.ts, tools/index.ts, system-prompt.ts, system-prompt.md, agent-session.ts, sdk.ts, hindsight/content.ts, package.json

Settings

codemap.enabled: false           # master toggle (off by default)
codemap.autoInject: true         # first-turn auto-injection
codemap.tokenBudget: 8000       # token budget for retrieval
codemap.maxSummaryChars: 1000   # hard write-side char cap
codemap.embedding.variant: en   # en (768d) | multilingual (1024d)
codemap.turso.autoProvision: false  # opt-in auto Turso DB creation

Testing

Test results: 140 tests pass, 0 fail across 10 files

Test file Tests Coverage
staleness.test.ts 8 Content hash transitions, missing files, staleness flags
retrieve.test.ts 20 Keyword extraction, token splitting, budget packer, RRF fusion (real exported functions)
integration.test.ts 31 Schema init, CRUD, FTS5 search, vector search, embedding backfill, full getTaskContext pipeline
token-usage.test.ts 13 Token formula verification, budget bounds, truncation, token efficiency vs full file reads
config.test.ts 17 Defaults, override precedence, dbPath fallback, variant→dimensions/model mapping, env var fallback, floor/clamp guards
prompt.test.ts 9 Empty result→empty string, stale/missing tags, truncation meta, multi-file ordering
state.test.ts 9 get/set roundtrip, markFirstTurnInjected, undefined-session guard
adapter.test.ts 16 Extension routing, no-adapter null, LSP SymbolKind mapping, sync stubs, error handling
tools.test.ts 8 createIf gating (all 4 tools), path traversal rejection, in-bounds acceptance with real DB
injection.test.ts 9 Guard chain, once-per-session, memory.backend="off" composition, error isolation, block content

Token usage verification

The token formula ceil(summary_text.length / 4) + 20 is verified across edge cases:

  • Empty string: 20 tokens (per-file overhead only)
  • Typical 1-3 sentence summary (~100 chars): ~46 tokens
  • Max-capped summary (1000 chars): 270 tokens — well under the 8000 default budget
  • Budget packer correctly truncates when results exceed tokenBudget or maxFiles
  • Always includes at least 1 file even if it alone exceeds the budget
  • Default 8000-token budget accommodates 20+ typical summaries

Bug fixes found through adversarial review (3-skeptic workflowz panel)

Bug Severity Fix
searchVector referenced v.distance but vector_top_k only returns id Blocking Compute distance via vector_distance_cos() in SELECT
Schema hardcoded F32_BLOB(768) but multilingual variant produces 1024d vectors Blocking Made schema parameterized via buildSchemaSql(dimensions)
Race condition in tools.ts — concurrent calls could double-open DB client Blocking Added in-flight promise guard + composite cache key
Vector retrieval never ran — queryEmbedding was never passed to getTaskContext Blocking Wired embedText into GetTaskContextTool.execute and injectCodemapTaskContext
Path traversal — toStoredPath accepted ../../etc/passwd without boundary check Blocking Added traversal guard; moved toStoredPath before getClient in all tool execute() methods
codemap.turso.autoProvision defaulted to true (opt-out) instead of false (opt-in) per design spec Safety Changed default to false — auto-provisioning fires network calls + persists credentials
FTS5 query used implicit AND — task queries need OR Minor Changed to explicit OR joining
retrieve.test.ts tested a copy of functions, not real code Minor Exported pure functions, updated test to import real implementations
Unused fmtOps function in benchmark.ts Lint Removed

Benchmarks

Local libSQL file mode, 768d vectors (bge-base-en-v1.5), AMD Ryzen 5 7600X:

Read latency (the hot path — retrieval)

Operation 100 rows 500 rows 1000 rows
Schema init 0.9ms
FTS5 single keyword 0.22ms 0.43ms 0.42ms
FTS5 multi-keyword 0.43ms 0.75ms 0.69ms
Vector search (k=20) 6.6ms 14.4ms
getTaskContext (FTS only) 0.73ms 1.4ms 1.1ms
getTaskContext (hybrid FTS+vector) 8.1ms 17.4ms

All read operations are well under the codemap design target of P95 < 200ms.

Write performance

Operation 100 rows 500 rows 1000 rows
upsertSummary (no embedding) 2.9ms/op 8.7ms/op 5.5ms/op
updateEmbedding (768d vector) 15.8ms/op 24.1ms/op

Write throughput is acceptable for interactive agent use (one summary per file read).

Key findings

  • FTS5 search stays sub-millisecond even at 1000 summaries — external-content index scales well
  • Vector search scales linearly but stays under 15ms at 500 summaries with embeddings
  • Full hybrid pipeline under 20ms at 500 summaries with both FTS + vector active
  • Schema init negligible: < 1ms
  • Token-efficient: 20+ typical summaries (270 tokens each) fit within the 8000 default budget

Verification

  • bun check passes across all 16 packages (0 type errors, 0 lint errors)
  • bun test — 140 tests pass, 0 fail across 10 files

Dependency

  • Adds @libsql/client@^0.17.4 (lazy-loaded via await import() only when codemap is enabled, matching the fastembed-runtime.ts optional-peer pattern)

Design document

Full design with adversarial review history: TASK_CONTEXT_DESIGN.md

Add a native (non-MCP) code summaries feature inspired by devalade/codemap:
agent-written file-level summaries persisted in Turso/libSQL with native
vector search, retrieved as minimal task-relevant context via hybrid
FTS5 + vector_top_k retrieval with reciprocal rank fusion and budget
packing.

Key design decisions (verified through adversarial workflowz design panel):
- Distinct feature module (codemap.* settings), NOT in memory.backend enum
- Composes with any memory backend including off (the default)
- Turso/libSQL-only storage with F32_BLOB vector columns + libsql_vector_idx
- FTS5 virtual table + triggers for lexical search (stable, not experimental)
- Hybrid retrieval: FTS5 + vector_top_k → reciprocal rank fusion (k=60)
- Budget packer with codemap's documented token formula: ceil(chars/4) + 20
- Singular PK + ROWID (required by libsql_vector_idx, not composite PK)
- Bun.hash for content staleness (per AGENTS.md convention)
- Lazy embedding on retrieval (not write) via decoupled MnemopiEmbedClient
- Automatic Turso DB provisioning with settings.set() persist-back
- Independent first-turn injection seam (not via memory backend hook)
- Pluggable LanguageAdapter interface (TsAdapter ships in v1 via LSP)

New module: packages/coding-agent/src/task-context/ (13 files, ~1400 lines)
- schema.ts: SQL DDL (summaries table, FTS5, vector index)
- config.ts: settings → typed config loader
- db.ts: libsql client factory + post-sync maintenance
- staleness.ts: Bun.hash content hash + staleness detection
- store.ts: CRUD + FTS + vector search data layer
- adapter.ts: pluggable language adapter + TsAdapter
- retrieve.ts: hybrid retrieval pipeline with RRF + budget packer
- embed.ts: decoupled embedding client (lazy, on retrieval)
- turso.ts: auto-provisioning + connection resolution
- tools.ts: 4 AgentTool classes with createIf gating
- prompt.ts: system-prompt injection helpers
- state.ts: per-session state via Symbol key
- index.ts: barrel re-exports + resolveCodemap/shutdownCodemap lifecycle

Integration edits (10 existing files):
- settings-schema.ts: 14 codemap.* settings
- settings-defs.ts: codemapActive condition
- builtin-names.ts: 4 tool names
- tools/index.ts: BUILTIN_TOOLS registration + isToolAllowed gating
- system-prompt.ts: codemapEnabled option + hasCodemap threading
- system-prompt.md: {{#if hasCodemap}} advertisement block
- agent-session.ts: #injectCodemapTaskContext + shutdownCodemap in dispose
- sdk.ts: resolveCodemap startup call + codemapEnabled option
- hindsight/content.ts: stripMemoryTags for <codemap> blocks
- package.json: @libsql/client dependency

Tests: 19 passing (staleness transitions, budget packer token math, RRF fusion)
Verification: bun check passes across all 16 packages (0 type errors, 0 lint errors)
@github-actions github-actions Bot added the vouched Passed the vouch gate label Jun 21, 2026
@roboomp roboomp added agent Agent runtime planning and orchestration feat prompting Prompt templates and prompt assembly review:p3 tool Tool behavior and integrations triaged labels Jun 21, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e86e901326

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +173 to +175
queryEmbedding && queryEmbedding.length > 0
? searchVector(client, projectLabel, queryEmbedding, seedLimit).catch(() => [] as RankedSummary[])
: Promise.resolve([] as RankedSummary[]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wire embeddings before vector retrieval

The user-facing callers added in this commit (GetTaskContextTool.execute and the first-turn injection) call getTaskContext without opts.queryEmbedding, and the new embedding helpers/backfill functions are not invoked anywhere, so this branch always resolves the vector side to an empty list. In codemap-enabled sessions this makes the advertised semantic/vector retrieval path unreachable; summaries are only found when the lexical FTS query happens to match the task terms exactly.

Useful? React with 👍 / 👎.

Comment on lines +50 to +51
const created = (await createResp.json()) as { Hostname: string };
const syncUrl = `libsql://${created.Hostname}`;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Read Turso hostname from the nested response

When auto-provisioning runs, Turso's Create Database API returns the hostname under database.Hostname (see https://docs.turso.tech/api-reference/databases/create), not as a top-level Hostname. This cast therefore makes created.Hostname undefined and persists libsql://undefined as codemap.turso.syncUrl, so users with TURSO_API_TOKEN/org configured get an invalid remote database configuration instead of the newly created DB.

Useful? React with 👍 / 👎.

Comment on lines +2816 to +2818
void (async () => {
try {
await resolveCodemap(session, settings);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Await codemap setup before first-turn injection

Starting resolveCodemap fire-and-forget means the first user prompt can reach #injectCodemapTaskContext before setCodemapSessionState has run; that path sees no state and returns no injected summaries. This is especially likely when Turso provisioning or initial sync is involved, so codemap.autoInject does not reliably inject task-relevant summaries on the first turn as the feature promises.

Useful? React with 👍 / 👎.

@roboomp roboomp left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: this is a large new codemap/Turso feature, but the core vector/lazy-embedding and first-turn injection contracts are not wired end-to-end.
Blocking findings: retrieval is FTS-only, multilingual embeddings conflict with the 768d schema, codemap startup races first-turn injection, and the tool DB client leaks outside session state. One convention issue: new dynamic import.
I also could not check open duplicate PRs because gh is unavailable in this environment; git log origin/main --grep only showed unrelated shared task-context UI fixes. Thanks for the detailed design write-up.

Comment on lines +196 to +202
async execute(_id: string, params: GetTaskContextParams): Promise<AgentToolResult> {
const { client, config } = await getClient(this.session);
const projectLabel = resolveProjectLabel(this.session.cwd);
const opts: { maxFiles?: number; tokenBudget?: number } = {};
if (params.max_files !== undefined) opts.maxFiles = params.max_files;
if (params.token_budget !== undefined) opts.tokenBudget = params.token_budget;
const result = await getTaskContext(client, config, params.task, projectLabel, this.session.cwd, opts);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: get_task_context never supplies queryEmbedding, and the same is true for the first-turn path in agent-session.ts:4972. getTaskContext() only calls searchVector() when opts.queryEmbedding is present, while embedText, embedBatch, getUnembeddedSummaries, and updateEmbedding are unused. Result: the advertised hybrid/vector retrieval and lazy embedding backfill never run; enabled codemap is FTS-only.

Comment on lines +28 to +34
CREATE TABLE IF NOT EXISTS summaries (
id INTEGER PRIMARY KEY AUTOINCREMENT,
project_label TEXT NOT NULL,
file_path TEXT NOT NULL,
summary_text TEXT NOT NULL,
content_hash TEXT NOT NULL DEFAULT '',
embedding F32_BLOB(768),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: this schema hard-codes embedding F32_BLOB(768), but codemap.embedding.variant = "multilingual" sets dimensions = 1024 and selects intfloat/multilingual-e5-large in config.ts. Any future embedding write for that supported setting will try to store a 1024d vector in a 768d column/index, so the documented multilingual mode cannot work with this table.

Comment on lines +2811 to +2816
// Initialize codemap (code summaries) if enabled. Distinct from the memory
// backend — runs independently of memory.backend. Opens the Turso/libSQL DB,
// runs auto-provisioning if configured, and stores session state. Non-blocking
// so the session starts without waiting for DB init; the first-turn injection
// in #buildSystemPromptForAgentStart handles a not-yet-ready state gracefully.
void (async () => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: first-turn auto-injection races this fire-and-forget initialization. #injectCodemapTaskContext() returns null when getCodemapSessionState(this) is still unset, and the first model call can build the prompt immediately after session creation. In that common path the advertised first-turn injection is skipped instead of waiting for codemap readiness.

Comment on lines +14 to +16
// --- Shared per-session DB client cache -------------------------------------

let cachedClient: Client | null = null;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: this module-level client cache is not session-scoped and is not closed by shutdownCodemap(), which only closes the client stored on the AgentSession symbol. A tool call opens a second libSQL client here; ending the session leaves it alive, and simultaneous sessions with the same dbPath share mutable DB client state outside the per-session lifecycle the new state.ts is meant to enforce.

Comment on lines +17 to +22
export async function openCodemapDb(config: CodemapConfig): Promise<Client> {
// Dynamic import: @libsql/client loads a native NAPI binding (libsql) that
// must NOT load at CLI startup when codemap is disabled. Matches the
// loadFastembedOnce pattern in mnemopi/src/core/fastembed-runtime.ts:59-77
// — optional native peers are lazy-loaded via `await import()`.
const { createClient } = await import("@libsql/client");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should-fix: repo conventions explicitly ban inline/dynamic imports (await import()); new imports must be top-level. If @libsql/client must stay cold until codemap is enabled, please add a small top-level loader module/approved lazy boundary instead of embedding the dynamic import here.

Fixes found through integration testing:
- searchVector: vector_top_k() returns only 'id', not 'distance'. Compute
  distance separately via vector_distance_cos(s.embedding, vector32(?))
  instead of referencing non-existent v.distance column
- buildFtsQuery: change FTS5 query from implicit AND to explicit OR
  ('term1'* OR 'term2'*). Task queries describe intent, not exact content —
  AND matching returned empty for multi-word queries where no single summary
  contained all terms
- Stopwords: add common query words (how, does, what, when, where, why, who,
  can, use, using, work, works) that add noise to FTS queries
- Add comprehensive integration tests: schema init, CRUD, FTS5 search, vector
  search, embedding backfill, full getTaskContext pipeline with staleness

50 tests pass, 0 fail. bun check passes across all 16 packages.
… real functions

Bugs found by adversarial review (3-skeptic workflowz panel):
- Vector dimension mismatch: schema hardcoded F32_BLOB(768) but multilingual
  variant produces 1024d vectors. Made schema parameterized via
  buildSchemaSql(dimensions) and pass config.embedding.dimensions to initSchema
- Race condition in tools.ts getClient: concurrent calls could double-open the
  DB client. Added in-flight promise guard so concurrent callers await the
  same open promise. Also expanded cache key to include syncUrl + authToken
- FTS docstring mismatch: said AND, code does OR. Fixed docstring to match
- retrieve.test.ts tested a copy of functions, not the real code. Exported
  extractKeywords, splitTokens, reciprocalRankFusion, tokenCost, packBudget
  from retrieve.ts and updated test to import real implementations

59 tests pass, 0 fail. bun check passes across all 16 packages.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 04997c4e2a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

if (!injected) return baseWithCodemap;

const previousBaseSystemPrompt = this.#baseSystemPrompt;
const previousBaseSystemPrompt = baseWithCodemap;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve codemap injection when memory also injects

When codemap auto-inject returns a block and the selected memory backend (hindsight or mnemopi) also returns beforeAgentStartPrompt content, previousBaseSystemPrompt now includes the extra codemap block but refreshBaseSystemPrompt() rebuilds only the raw base prompt. The length comparison below therefore always treats the prompt as changed and returns this.#baseSystemPrompt, dropping both the codemap summaries and the memory recall for that first turn.

Useful? React with 👍 / 👎.

- `omp://`: harness docs; AVOID unless the user asks about the harness itself.
{{#if hasCodemap}}
## Code Summaries (codemap)
File-level code summaries are available for this repo. Before reading unfamiliar files, call `get_task_context` with your task to retrieve relevant summaries (packed within a token budget). After reading a non-trivial file or making load-bearing changes, call `set_file_summary` to record a short note (purpose, key symbols, gotchas, invariants). Summaries are anchored to file content via `Bun.hash` — if a file changes, its summary is flagged `stale` and should be refreshed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Gate codemap guidance on active tools

When tools.discoveryMode is all, the codemap tools are marked discoverable and the initial-tool filter removes non-essential discoverable built-ins unless they were explicitly requested. This block is gated only on codemap.enabled, so it can tell the model to call get_task_context and set_file_summary even though those tool schemas are absent from the active tool list; gate the guidance on the active tool names or force these tools active whenever this guidance is rendered.

Useful? React with 👍 / 👎.

Token usage tests (13 new, 72 total passing):
- Verify codemap token formula (ceil(chars/4)+20) across edge cases
- Budget packer respects token budget, always includes >=1 file
- Truncation when results exceed budget or maxFiles
- Token efficiency: 20+ typical summaries fit within 8000 budget
- Empty result has zero token cost
- Single file result has exact token cost matching formula

Benchmark results (local libSQL file mode, 768d vectors):
- Schema init: 0.9ms
- FTS5 search: 0.2-0.7ms (flat at 1000 summaries)
- Vector search (vector_top_k): 6.6-14.4ms (scales linearly)
- Full getTaskContext pipeline: 0.7-17.4ms (under 20ms at 500 summaries)
- All well below codemap's P95 < 200ms design target

Bug fixes from adversarial review:
- searchVector: vector_top_k returns only 'id', compute distance via
  vector_distance_cos() instead of non-existent v.distance column
- Vector dimension mismatch: schema now parameterized via buildSchemaSql(dimensions)
- Race condition in tools.ts: in-flight promise guard prevents double-open
- FTS5 query changed from AND to OR for task-intent matching
- Stopwords expanded with common query words
- retrieve.ts pure functions exported for direct testing

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc84601ea2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +44 to +45
if (clientPromise) {
return clientPromise;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Key the in-flight codemap client by configuration

When two sessions with different codemap.dbPath/Turso settings call a codemap tool while the first openCodemapDb is still in flight, this returns the first session's promise without comparing cacheKey. The second session then writes or reads its project label through the wrong DB client for that tool call, which can leak or corrupt summaries across concurrently running projects; the in-flight promise needs to be keyed the same way as cachedClient.

Useful? React with 👍 / 👎.

Comment on lines +72 to +73
function resolveProjectLabel(cwd: string): string {
return path.basename(cwd);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Derive codemap scope from the repo root

In sessions started from a subdirectory of the same repository, path.basename(cwd) changes the project_label (for example repo root oh-my-pi vs packages/coding-agent), so summaries written at one cwd are invisible when the agent later runs in another cwd inside the same repo. The comment says this mirrors Hindsight, but Hindsight resolves the primary git root before taking the basename; codemap should use the same repo-root scope and store paths relative to that scope if summaries are meant to follow the repo across sessions.

Useful? React with 👍 / 👎.

Comment on lines +192 to +194
FROM vector_top_k('idx_summaries_embedding', vector32(?), ?) v
JOIN summaries s ON s.rowid = v.id
WHERE s.project_label = ?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid limiting vector candidates before project filtering

With a shared codemap DB containing embeddings for multiple projects, vector_top_k(..., limit) returns the nearest limit rows globally before this project_label filter runs. If another project has limit closer vectors, every candidate is filtered out and the current project gets no vector seeds even though it has relevant embedded summaries just below the global cutoff; overfetch or use a project-scoped index/filter so the limit applies within the requested project.

Useful? React with 👍 / 👎.

Add unit tests for untested codemap modules (config, prompt, state,
adapter, tools createIf gating, toStoredPath traversal guard) and an
integration test for the first-turn injection seam.

- config.test.ts (17 tests): defaults, override precedence, dbPath
  fallback, variant→dimensions/model mapping, floor/clamp guards
- prompt.test.ts (9 tests): empty result, stale/missing tags,
  truncation meta, multi-file ordering
- state.test.ts (9 tests): get/set roundtrip, markFirstTurnInjected,
  undefined-session guard
- adapter.test.ts (16 tests): extension routing, no-adapter null,
  LSP SymbolKind mapping, sync stubs, error handling
- tools.test.ts (8 tests): createIf gating (all 4 tools), path
  traversal rejection, in-bounds path acceptance with real DB
- injection.test.ts (9 tests): guard chain, once-per-session,
  memory.backend='off' composition, error isolation, block content

Fix path traversal vulnerability in toStoredPath — paths escaping cwd
(../../etc/passwd, absolute paths) are now rejected. toStoredPath runs
before getClient in all tool execute() methods (fail-fast on invalid
input).

Fix codemap.turso.autoProvision default: true → false (opt-in) to match
design spec. Auto-provisioning fires network calls to Turso's API,
creates cloud databases, and persists credentials — must be opt-in.

Extract injectCodemapTaskContext from AgentSession private method into
a testable standalone function in the task-context module.

Remove unused fmtOps from benchmark.ts.

Total: 140 tests pass (68 new), 0 fail. bun check passes across all
16 packages.
GetTaskContextTool.execute and injectCodemapTaskContext never passed
queryEmbedding to getTaskContext, so the vector search branch was
always skipped — making codemap FTS-only despite advertising hybrid
retrieval. Both call sites now embed the task query via embedText
(lazy — returns null when the embedding model is unavailable, falling
back to FTS-only gracefully).

Addresses blocking review finding from @roboomp and @codex on PR can1357#3185.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77b17c1bf4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +24 to +26
const client = createClient({
url: `file:${config.dbPath}`,
...(hasTursoSync ? { syncUrl: config.turso.syncUrl, authToken: config.turso.authToken } : {}),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Create the default codemap directory before opening

When codemap.dbPath is left empty, loadCodemapConfig points at <memories>/codemap/codemap.db, but neither that loader nor openCodemapDb creates the new codemap parent directory before this client is opened. On a fresh install with codemap enabled, schema initialization will try to create a database under a missing parent and the feature is disabled before any summary can be stored; create path.dirname(config.dbPath) before constructing the libSQL client.

Useful? React with 👍 / 👎.


export function extractKeywords(task: string): string[] {
// Tokenize on non-alphanumeric, lowercase, keep >= 3 chars, drop stopwords
const tokens = task.toLowerCase().match(/[a-z0-9]+/g) ?? [];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Split symbol names before lowercasing

For code-symbol tasks such as buildSystemPrompt, this lowercasing happens before the only camel-case splitter runs, so splitTokens(extractKeywords(task)) later sees just buildsystemprompt and emits that fused token. The resulting FTS query misses summaries or paths tokenized as build, system, and prompt, which makes get_task_context fail on common symbol-name prompts unless vector search happens to rescue it.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent Agent runtime planning and orchestration feat prompting Prompt templates and prompt assembly review:p3 tool Tool behavior and integrations triaged vouched Passed the vouch gate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants