Advanced Analysis

This page covers three complementary tools for deep codebase analysis: dead code detection, code distillation, and graph DSL queries. All three are available as MCP tools and through the HTTP API.

Dead Code Detection

The dead_code tool identifies unreferenced symbols, files, and modules using the cross-reference index. Results are ranked by a confidence score that reflects how likely it is that the code is truly unused.

Analysis Levels

Level	What it finds	Use case
`symbol`	Functions, classes, methods with zero external references	Cleaning up unused internal code
`file`	Files with zero importers (not entry points)	Finding abandoned modules
`module`	Directories with no imports from outside the directory	Identifying dead packages
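
To make the `file` level concrete, here is a minimal sketch of the idea, not the tool's implementation. It assumes the import graph is available as a plain dict mapping each file to the set of files it imports; the function name `dead_files` and the graph layout are hypothetical.

```python
def dead_files(imports: dict[str, set[str]], entry_points: set[str]) -> list[str]:
    """Return files that no other file imports, skipping entry points.

    `imports` maps every known file to the set of files it imports
    (hypothetical layout; files without imports map to an empty set).
    """
    # Every file that appears on the right-hand side of some import edge.
    imported = {dep for src, deps in imports.items() for dep in deps if dep != src}
    return sorted(f for f in imports if f not in imported and f not in entry_points)


graph = {
    "src/main.py": {"src/api/app.py"},
    "src/api/app.py": {"src/api/routes.py"},
    "src/api/routes.py": set(),
    "src/legacy/export.py": {"src/legacy/fmt.py"},
    "src/legacy/fmt.py": set(),
}
unused = dead_files(graph, entry_points={"src/main.py"})
```

In this toy graph only `src/legacy/export.py` is reported: `src/main.py` has no importers either, but it is excluded as an entry point.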

Entry Point Heuristics

Dead code detection automatically excludes known entry points to avoid false positives:

Auto-detected entry-point files:

main.py, __main__.py, __init__.py, setup.py, manage.py, conftest.py
Files starting with test_ or ending with _test.py

Auto-detected entry-point symbols:

Functions named main, setup, teardown
Symbols with framework decorators containing: route, endpoint, fixture, command, click
Symbols defined in __init__.py (treated as re-exports)

You can also specify additional entry points manually via the entry_points parameter.
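
The heuristics above can be approximated in a few lines. This is a sketch under stated assumptions rather than the tool's actual logic: the function names are hypothetical, decorator matching is assumed to be a plain substring test, and manually supplied entry points are assumed to be file paths.

```python
from pathlib import PurePosixPath

ENTRY_FILES = {"main.py", "__main__.py", "__init__.py", "setup.py", "manage.py", "conftest.py"}
ENTRY_FUNCTIONS = {"main", "setup", "teardown"}
DECORATOR_HINTS = ("route", "endpoint", "fixture", "command", "click")


def is_entry_point_file(path: str, extra: frozenset[str] = frozenset()) -> bool:
    """True if the file name matches a listed entry point or is in `extra`."""
    name = PurePosixPath(path).name
    return (
        path in extra  # assumption: manual entry points are given as paths
        or name in ENTRY_FILES
        or name.startswith("test_")
        or name.endswith("_test.py")
    )


def is_entry_point_symbol(name: str, path: str, decorators: tuple[str, ...] = ()) -> bool:
    """True if the symbol matches one of the listed symbol heuristics."""
    if name in ENTRY_FUNCTIONS:
        return True
    # Assumption: a decorator counts if its text contains any hint as a substring.
    if any(hint in dec for dec in decorators for hint in DECORATOR_HINTS):
        return True
    # Symbols defined in __init__.py are treated as re-exports.
    return PurePosixPath(path).name == "__init__.py"
```

For example, a handler decorated with `router.get` would be treated as an entry point because the decorator text contains `route`.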

Confidence Scoring

Each result includes a confidence score from 0.0 to 1.0:

Factor	Effect
Private name (leading `_`)	+0.1 (base 0.8 vs 0.7)
Defined in `__init__.py`	-0.3
Public name (no `_` prefix)	-0.2
File not modified in 180+ days	+0.1

A higher confidence score means the code is safer to remove. Use the min_confidence parameter to drop lower-scoring results.
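
As a rough illustration of how such factors could combine, the sketch below starts from the base values in the table (0.8 for private names, 0.7 for public ones) and applies the `__init__.py` and staleness adjustments. The function names are hypothetical, and the table's separate "public name" row is deliberately not modeled because the text does not say how it combines with the base value.

```python
def confidence(name: str, in_init: bool, days_since_modified: int) -> float:
    """Illustrative score only; the real tool may combine factors differently."""
    score = 0.8 if name.startswith("_") else 0.7  # base values from the table
    if in_init:
        score -= 0.3  # defined in __init__.py, likely a re-export
    if days_since_modified >= 180:
        score += 0.1  # file not modified in 180+ days
    return round(min(1.0, max(0.0, score)), 2)


def filter_results(results: list[tuple[str, float]], min_confidence: float, top_n: int):
    """Keep results at or above min_confidence, highest score first."""
    kept = [r for r in results if r[1] >= min_confidence]
    return sorted(kept, key=lambda r: r[1], reverse=True)[:top_n]
```

Under these assumptions a private function in a recently edited file scores 0.8, and a public symbol defined in `__init__.py` scores 0.4.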

Examples

MCP tool:

dead_code(level="symbol", scope="src/api/", min_confidence=0.6, top_n=20)

Sample output (symbol level):

Dead symbols detected in src/api/ (5 results):

   1. [0.80] function _legacy_auth_check
      src/api/security.py:142
      Reason: function never called or imported; private symbol

   2. [0.70] class DeprecatedHandler
      src/api/handlers.py:88
      Reason: class never instantiated or referenced

   3. [0.70] method RequestParser._parse_legacy
      src/api/parsers.py:201
      Reason: method with no external callers

Sample output (file level):

Dead files detected (3 results):

   1. [0.80] src/utils/_old_helpers.py
      12 symbols, 0 dependencies
      Reason: no files import this module; also has no imports (orphan)

   2. [0.70] src/api/v1_compat.py
      5 symbols, 3 dependencies
      Reason: no files import this module

Sample output (module level):

Dead modules detected (2 results):

   1. [0.70] src/legacy_exporters/
      8 files, 34 symbols
      Reason: no external imports into this directory (8 files)

   2. [0.40] src/tools/deprecated/
      3 files, 12 symbols
      Reason: no external imports into this directory (3 files); contains entry-point files (lower confidence)

Code Distillation

The distill tool compresses codebase information at varying fidelity levels, optimized for different context window budgets.

Distillation Levels

Level	Compression	What's included	Best for
`full`	~0%	Complete repo map with symbols	Full understanding, large context
`signatures`	~70%	Public API surface: function signatures, class definitions, type hints, first-line docstrings	API overview, code review
`structure`	~90%+	File tree + import graph adjacency list	Quick orientation, small context

File Selection

When files is specified, the tool includes those files plus their neighbors in the dependency graph, up to depth hops away (maximum 3). When files is omitted, the tool auto-selects the most important files by PageRank score that fit within the token budget.
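
Both selection modes can be sketched as follows. This is an illustration under assumptions, not the tool's code: `expand_neighbors` and `select_by_rank` are hypothetical names, the neighbor map is assumed to already contain both import directions, and the budget fill is assumed to be a simple greedy pass that skips files that do not fit.

```python
from collections import deque

MAX_DEPTH = 3  # hop limit stated above


def expand_neighbors(seeds: list[str], neighbors: dict[str, list[str]], depth: int) -> set[str]:
    """Breadth-first expansion from the seed files, capped at MAX_DEPTH hops."""
    depth = min(depth, MAX_DEPTH)
    hops = {f: 0 for f in seeds}  # file -> hop distance from the nearest seed
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if hops[current] == depth:
            continue  # do not expand past the hop limit
        for nxt in neighbors.get(current, ()):
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return set(hops)


def select_by_rank(rank: dict[str, float], token_cost: dict[str, int], max_tokens: int) -> list[str]:
    """Greedy fill: take files in descending rank while they fit the budget."""
    chosen, used = [], 0
    for path in sorted(rank, key=rank.get, reverse=True):
        if used + token_cost[path] <= max_tokens:
            chosen.append(path)
            used += token_cost[path]
    return chosen
```

A greedy pass keeps the sketch short; a real implementation might stop at the first file that does not fit, or trim file contents instead of skipping whole files.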

Examples

Signatures level (public API surface):

distill(files=["src/auth/middleware.py"], level="signatures", depth=1, max_tokens=2000)

Output:

# src/auth/middleware.py
class AuthMiddleware(BaseHTTPMiddleware):
    """JWT authentication middleware."""
    def __init__(self, app: ASGIApp, secret: str, algorithms: list[str] = ["HS256"]): ...
    async def dispatch(self, request: Request, call_next: Callable) -> Response: ...
        """Validate JWT token and inject user into request state."""

async def verify_token(token: str, secret: str) -> dict: ...
    """Verify and decode a JWT token."""

def extract_bearer(authorization: str) -> str | None: ...
    """Extract bearer token from Authorization header."""

# src/auth/tokens.py
def create_access_token(user_id: str, expires_delta: timedelta = ...) -> str: ...
    """Create a signed JWT access token."""

def create_refresh_token(user_id: str) -> str: ...
    """Create a signed JWT refresh token."""

(4 files, level=signatures, ~380 tokens)

Structure level (maximum compression):

distill(level="structure", max_tokens=1000)

Output:

# File tree
  src/api/__init__.py
  src/api/app.py
  src/api/routes/admin.py
  src/api/routes/auth.py
  src/api/routes/users.py
  src/auth/middleware.py
  src/auth/tokens.py
  src/config.py
  src/db/session.py
  src/models/user.py

# Import graph (file -> imports)
  src/api/app.py -> src/api/routes/admin.py, src/api/routes/auth.py, src/auth/middleware.py, src/config.py
  src/api/routes/auth.py -> src/auth/tokens.py, src/models/user.py, src/db/session.py
  src/auth/middleware.py -> src/auth/tokens.py, src/config.py, src/db/session.py

(10 files, level=structure, ~120 tokens)

Full level (delegates to repo_map):

distill(level="full", max_tokens=8000)

Graph DSL

The graph_dsl tool provides a Cypher-inspired query language for traversing the dependency graph with depth constraints, filters, and multi-hop chains.

Syntax Reference

MATCH <pattern> [WHERE <conditions>] RETURN <variables>

Nodes

Variable: target, caller, a, b --- matches any file, binds to a variable name
Literal path: "src/api/app.py" --- matches a specific file
Glob pattern: "src/api/*.py" --- matches files by glob
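
The three node forms can be told apart mechanically. The sketch below is one possible reading, with a hypothetical `resolve_node` helper: it assumes quoted terms are paths or globs and bare identifiers are variables, and it uses Python's `fnmatch`, whose `*` also matches `/`, which may differ from the tool's glob rules.

```python
from fnmatch import fnmatchcase


def resolve_node(term: str, all_files: list[str]) -> list[str]:
    """Return the candidate files for one node term of a MATCH pattern."""
    if not (term.startswith('"') and term.endswith('"')):
        return list(all_files)  # bare identifier: a variable, matches any file
    pattern = term[1:-1]
    if any(ch in pattern for ch in "*?["):
        # Glob pattern. Note: fnmatch lets "*" cross directory separators.
        return [f for f in all_files if fnmatchcase(f, pattern)]
    return [f for f in all_files if f == pattern]  # literal path


files = ["src/api/app.py", "src/api/__init__.py", "src/config.py"]
api_files = resolve_node('"src/api/*.py"', files)
```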

Edges

Outbound: node -[EDGE_TYPE]-> node
Inbound: node <-[EDGE_TYPE]- node
Exact depth: -[IMPORTS*3]-> (exactly 3 hops)
Depth range: -[IMPORTS*1..3]-> (1 to 3 hops)
Default depth: -[IMPORTS]-> (exactly 1 hop)
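
Depth constraints can be pictured as a hop-by-hop expansion of a frontier. The sketch below is an assumption about the semantics, not the tool's implementation: a file matches if some path of the allowed length reaches it, and an inbound edge is evaluated by walking the reversed graph. The names `traverse` and `reverse` are hypothetical.

```python
def traverse(start: str, edges: dict[str, set[str]], min_hops: int = 1, max_hops: int = 1) -> set[str]:
    """Files reachable from `start` by a path of min_hops..max_hops edges."""
    frontier, matched = {start}, set()
    for hop in range(1, max_hops + 1):
        # Advance every file in the frontier by exactly one edge.
        frontier = {nxt for node in frontier for nxt in edges.get(node, ())}
        if not frontier:
            break
        if hop >= min_hops:
            matched |= frontier
    return matched


def reverse(edges: dict[str, set[str]]) -> dict[str, set[str]]:
    """Flip every edge so inbound patterns can reuse `traverse`."""
    flipped: dict[str, set[str]] = {}
    for src, dsts in edges.items():
        for dst in dsts:
            flipped.setdefault(dst, set()).add(src)
    return flipped


imports = {"a.py": {"b.py"}, "b.py": {"c.py"}, "c.py": {"d.py"}}
one_hop = traverse("a.py", imports)            # -[IMPORTS]->
exactly_3 = traverse("a.py", imports, 3, 3)    # -[IMPORTS*3]->
up_to_3 = traverse("a.py", imports, 1, 3)      # -[IMPORTS*1..3]->
importers = traverse("c.py", reverse(imports)) # <-[IMPORTS]-
```

Because the loop is bounded by `max_hops`, it terminates even when the import graph contains cycles.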

Edge Types

Edge type	Meaning
`IMPORTS`	File A imports file B
`IMPORTED_BY`	File A is imported by file B

WHERE Clause

Filter results by file properties. Multiple conditions are joined with AND.

WHERE variable.field operator value

Available fields:

Field	Type	Description
`language`	string	File language (e.g. `"python"`, `"typescript"`)
`line_count`	int	Number of lines
`importance`	float	PageRank importance score
`path`	string	Relative file path
`fan_in`	int	Number of files that import this file
`fan_out`	int	Number of files this file imports
`is_test`	bool	Whether the file is a test file
`is_config`	bool	Whether the file is a config file

Available operators: =, !=, >, <, >=, <=, LIKE
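
Evaluating a WHERE clause amounts to applying each operator to a file's properties and requiring every condition to hold. The sketch below assumes conditions have already been parsed into (field, operator, value) tuples and that LIKE uses `%` as a wildcard for any run of characters, as the `"src/db%"` query further down suggests. The helper names are hypothetical.

```python
import operator
import re


def like(value: str, pattern: str) -> bool:
    """SQL-style LIKE where '%' matches any run of characters (assumed semantics)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None


OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "LIKE": like,
}


def matches(file_props: dict, conditions: list[tuple[str, str, object]]) -> bool:
    """True only if every (field, operator, value) condition holds (AND)."""
    return all(OPERATORS[op](file_props[field], value) for field, op, value in conditions)


props = {"path": "src/db/session.py", "language": "python", "line_count": 620}
ok = matches(props, [("language", "=", "python"), ("line_count", ">", 500)])
```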

RETURN Clause

RETURN target --- return file paths bound to target
RETURN target.language, target.line_count --- return specific fields
RETURN COUNT --- return count of matches

Example Queries

1. Find all files imported by a specific file (1 hop):

MATCH "src/api/app.py" -[IMPORTS]-> target RETURN target

2. Find transitive dependencies up to 3 hops:

MATCH "src/api/app.py" -[IMPORTS*1..3]-> target RETURN target

3. Find all files that import a module:

MATCH "src/auth/middleware.py" <-[IMPORTED_BY]- caller RETURN caller

4. Find Python files that import a module:

MATCH "src/config.py" <-[IMPORTED_BY]- caller WHERE caller.language = "python" RETURN caller

5. Find large files in the import chain:

MATCH "src/core/loop.py" -[IMPORTS*1..2]-> dep WHERE dep.line_count > 500 RETURN dep, dep.line_count

6. Multi-hop chain --- find files 2 hops downstream:

MATCH a -[IMPORTS]-> b -[IMPORTS]-> c RETURN a, c

7. Find high fan-in files (popular dependencies):

MATCH source -[IMPORTS]-> target WHERE target.fan_in > 10 RETURN target, target.fan_in

8. Find test files that depend on a module:

MATCH "src/auth/middleware.py" <-[IMPORTED_BY]- test WHERE test.is_test = true RETURN test

9. Count dependencies:

MATCH "src/api/app.py" -[IMPORTS*1..3]-> dep RETURN COUNT

10. Glob pattern matching:

MATCH "src/api/*.py" -[IMPORTS]-> dep WHERE dep.path LIKE "src/db%" RETURN dep

Sample output:

Graph DSL results (6 matches):
    1. target=src/auth/tokens.py
    2. target=src/config.py
    3. target=src/db/session.py
    4. target=src/models/user.py
    5. target=src/api/routes/auth.py
    6. target=src/api/routes/admin.py