Attocode Roadmap

v0.2.6 -- Language Support, Search Quality & Architecture (Released 2026-03-24)

~~Language-specific symbol extraction~~ -- DONE: 11 new tree-sitter configs (Erlang, Clojure, Perl, Crystal, Dart, OCaml, F#, Julia, Nim, R, Objective-C); total 36 languages supported
~~Architecture analysis fallback~~ -- DONE: directory-based module detection when dependency graph is sparse; 22 repos improved from 2/5 to 4/5
~~Search ranking improvements~~ -- DONE: graduated symbol boosting, multi-term coverage, path relevance, non-code penalty; MRR +23%, NDCG +16%
~~Lazy embedding initialization~~ -- DONE: embedding model loads on first semantic_search() call, not on construction; bootstrap latency 20s → 0.8s
~~3-way benchmark expansion~~ -- DONE: 19 → 49 repos benchmarked across 30+ languages
~~Foreign-key (FK) constraint fix~~ -- DONE: ast_service.py now saves files before the symbols that reference them
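The lazy embedding initialization item above can be pictured with a minimal sketch. It is not the Attocode implementation: `LazyEmbedder`, `fake_loader`, and `load_count` are hypothetical names used only for illustration, and the "model" is a toy function. The point is that construction stores a loader and the expensive load runs only on the first `semantic_search()` call.

```python
from typing import Callable, List, Optional

# Type of a loaded "model": maps text to an embedding vector.
EmbedFn = Callable[[str], List[float]]


class LazyEmbedder:
    """Defers an expensive model load until the first search call (illustrative only)."""

    def __init__(self, loader: Callable[[], EmbedFn]) -> None:
        # Construction only stores the loader; nothing heavy runs here.
        self._loader = loader
        self._model: Optional[EmbedFn] = None
        self.load_count = 0  # how many times the loader actually ran

    def _ensure_loaded(self) -> EmbedFn:
        # Load once, on first use, then reuse the cached model.
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model

    def semantic_search(self, query: str) -> List[float]:
        # Hypothetical stand-in: returns only the query embedding.
        return self._ensure_loaded()(query)


def fake_loader() -> EmbedFn:
    # Placeholder for a real model load; returns a toy embedding function.
    return lambda text: [float(len(text))]


embedder = LazyEmbedder(fake_loader)   # cheap: no model loaded yet
vector = embedder.semantic_search("parse config")  # first call triggers the load
```

This sketch is not thread-safe; a version shared across threads would likely need a lock around the first load.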

~~Embedding-based semantic search~~ -- DONE: hybrid vector + BM25 with RRF; two-stage retrieval
~~Persistent index across instances~~ -- DONE: SQLite-backed IndexStore with incremental updates
~~Progressive hydration~~ -- DONE: adaptive tier-based indexing (small/medium/large/huge); skeleton init <2s for any repo; background hydration thread; on-demand gap filling
~~3-way benchmark (20 repos)~~ -- DONE: grep vs ast-grep vs code-intel across 12 languages; code-intel 4.7/5 quality
~~New MCP tools~~ -- DONE: hydration_status tool; indexing_depth param on bootstrap; mode param on semantic_search
Ground truth expansion -- add YAML files for 15+ more benchmark repos (currently 5)
Go-specific search improvements -- Go MRR 0.200 lags Python 0.725; index package docs, use module paths
ast-grep integration -- optional structural pattern searches alongside tree-sitter parsing
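As a rough illustration of the "hybrid vector + BM25 with RRF" item above, the sketch below fuses two ranked result lists with reciprocal rank fusion. The function name, the file names, and the constant `k = 60` (a commonly used default for RRF) are assumptions for illustration; the roadmap does not state how Attocode configures its fusion step.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; items missing from a list contribute nothing for it.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector index and a BM25 index.
vector_hits = ["a.py", "b.py", "c.py"]
bm25_hits = ["b.py", "d.py", "a.py"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Because RRF uses only ranks, it needs no score normalization between the vector and BM25 retrievers, which is one reason it is a common choice for hybrid search.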

Cross-repo search in org -- aggregate embeddings across repositories, org-scoped vector queries
Better git integration -- commit graph exploration, blame-weighted hotspots, PR-aware analysis
Cross-service analysis -- detect API contracts (OpenAPI, gRPC), map service-to-service calls
More tests -- integration tests for all 40 MCP tools, Playwright E2E, target 60% coverage
Full MCP feature parity -- ensure all tools work in all modes (local, remote, service)
Better offline mode:
Offline embedding fallback (auto-switch to local model like all-MiniLM-L6-v2)
Pre-computed analysis bundles (.attocode-bundle export for air-gapped use)
Offline learning sync (queue locally, sync on reconnect)
Git-based offline analysis (blame, history, branches via pygit2 without DB)
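The "queue locally, sync on reconnect" idea in the offline-mode items above could look roughly like the sketch below. It is a planned feature, so every name here (`OfflineSyncQueue`, `record`, `reconnect`, the `send` callback) is hypothetical, and the queue is held in memory purely for brevity; a real version would presumably persist pending events to disk.

```python
from collections import deque
from typing import Callable, Deque, Dict


class OfflineSyncQueue:
    """Buffers events while offline and flushes them in order on reconnect (illustrative)."""

    def __init__(self, send: Callable[[Dict], None]) -> None:
        self._send = send                      # hypothetical upload callback
        self._pending: Deque[Dict] = deque()   # in-memory only in this sketch
        self.online = False

    @property
    def pending(self) -> int:
        return len(self._pending)

    def record(self, event: Dict) -> None:
        # Send immediately when online; otherwise (or on failure) keep the event locally.
        if self.online:
            try:
                self._send(event)
                return
            except ConnectionError:
                self.online = False
        self._pending.append(event)

    def reconnect(self) -> int:
        """Mark the queue online and flush pending events; returns how many were sent."""
        self.online = True
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                # Still unreachable: stop and keep the remaining events queued.
                self.online = False
                break
            self._pending.popleft()
            sent += 1
        return sent


received = []
queue = OfflineSyncQueue(received.append)
queue.record({"id": 1})        # offline: buffered locally
queue.record({"id": 2})
flushed = queue.reconnect()    # back online: both events are sent in order
```

An event is removed from the queue only after its send succeeds, so a connection drop mid-flush leaves the unsent events in place for the next attempt.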

Swarm mode update to loops -- migrate from DAG-based to loop-based execution architecture
Swarm using code intel -- bootstrap orientation, impact analysis for task scoping, cross-refs for merge conflicts, learning system for per-repo patterns