Semantic Intelligence for
Large-Scale Engineering.

Context+ is an MCP server designed for developers who demand 99% accuracy. By combining Tree-sitter AST parsing and spectral clustering, Context+ turns a massive codebase into a searchable, hierarchical graph.


“Context+ is the best thing that has happened to my agent.” Give it the semantic understanding it deserves. Add Context+ to your IDE by pasting the following JSON into your MCP configuration file.

.mcp.json
```
{
  "mcpServers": {
    "contextplus": {
      "command": "bunx",
      "args": [
        "contextplus"
      ],
      "env": {
        "OLLAMA_EMBED_MODEL": "nomic-embed-text",
        "OLLAMA_CHAT_MODEL": "gemma2:27b",
        "OLLAMA_API_KEY": "YOUR_OLLAMA_API_KEY"
      }
    }
  }
}
```
Terminal
```
bunx contextplus init claude
```
Before using Context+, make sure Ollama is running and install the required models (for example, nomic-embed-text and gemma2:27b). If you use Ollama Cloud, generate an API key from your Ollama account.
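
With a local Ollama install, the model setup can be done from the CLI (the model names below are just the defaults shown above):

```shell
# Pull the default embedding and chat models used by Context+
ollama pull nomic-embed-text
ollama pull gemma2:27b

# Confirm both models are installed
ollama list
```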

Copy the instruction file into your project root to teach your agent fast execute mode, line-numbered symbol retrieval, strict formatting rules, and anti-patterns that keep context lean and precise.

INSTRUCTIONS.md
# Context+ MCP - Agent Instructions

## Purpose

You are equipped with the Context+ MCP server. It gives you structural awareness of the entire codebase without reading every file. Follow this workflow strictly to conserve context and maximize accuracy.

## Architecture

The MCP server is built with TypeScript and communicates over stdio using the Model Context Protocol SDK. It has three layers:

**Core Layer** (`src/core/`):

- `parser.ts` — Multi-language symbol extraction via tree-sitter AST with regex fallback. Supports 14+ languages.
- `tree-sitter.ts` — WASM grammar loader for 43 file extensions using web-tree-sitter 0.20.8.
- `walker.ts` — Gitignore-aware recursive directory traversal with depth and target path control.
- `embeddings.ts` — Ollama vector embedding engine with disk cache, cosine similarity search, and API key support.
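
The similarity search in the embedding engine boils down to cosine similarity over cached vectors. A minimal sketch of the idea (hypothetical helpers, not the actual `embeddings.ts` API):

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate files by similarity to a query embedding, best first
function rank(query: number[], candidates: Map<string, number[]>): [string, number][] {
  return [...candidates.entries()]
    .map(([path, vec]): [string, number] => [path, cosineSimilarity(query, vec)])
    .sort((x, y) => y[1] - x[1]);
}
```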

**Tools Layer** (`src/tools/`):

- `context-tree.ts` — Token-aware structural tree with symbol line ranges and Level 0/1/2 pruning.
- `file-skeleton.ts` — Function signatures with line ranges, without reading full bodies.
- `semantic-search.ts` — Ollama-powered semantic file search with symbol definition lines and 60s cache TTL.
- `semantic-identifiers.ts` — Identifier-level semantic search returning ranked definitions + call chains with line numbers.
- `semantic-navigate.ts` — Browse-by-meaning navigator using spectral clustering and Ollama labeling.
- `blast-radius.ts` — Symbol usage tracer across the entire codebase.
- `static-analysis.ts` — Native linter runner (tsc, eslint, py_compile, cargo check, go vet).
- `propose-commit.ts` — Code gatekeeper validating headers, FEATURE tag, no inline comments, nesting, file length.
- `feature-hub.ts` — Obsidian-style feature hub navigator with bundled skeleton views.

**Core Layer** (continued):

- `hub.ts` — Wikilink parser for `[[path]]` links, cross-link tags, hub discovery, orphan detection.
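
A wikilink extractor along these lines is all the hub format needs (illustrative, not the real `hub.ts`):

```typescript
// Extract [[path]] wikilink targets from hub markdown content
function extractWikilinks(markdown: string): string[] {
  const links: string[] = [];
  const pattern = /\[\[([^\[\]]+)\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    links.push(match[1].trim());
  }
  return links;
}
```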

**Git Layer** (`src/git/`):

- `shadow.ts` — Shadow restore point system for undo without touching git history.

**Entry Point**: `src/index.ts` registers 11 MCP tools and starts the stdio transport. Accepts an optional CLI argument for the target project root directory (defaults to `process.cwd()`).

## Environment Variables

| Variable             | Default            | Description                       |
| -------------------- | ------------------ | --------------------------------- |
| `OLLAMA_EMBED_MODEL` | `nomic-embed-text` | Embedding model name              |
| `OLLAMA_API_KEY`     | (empty)            | Cloud auth (auto-detected by SDK) |
| `OLLAMA_CHAT_MODEL`  | `llama3.2`         | Chat model for cluster labeling   |
| `CONTEXTPLUS_EMBED_BATCH_SIZE` | `8` | Embedding batch per GPU call (hard-capped to 5-10) |
| `CONTEXTPLUS_EMBED_TRACKER` | `true` | Enable realtime embedding updates for changed files/functions |
| `CONTEXTPLUS_EMBED_TRACKER_MAX_FILES` | `8` | Max changed files per tracker tick (hard-capped to 5-10) |
| `CONTEXTPLUS_EMBED_TRACKER_DEBOUNCE_MS` | `700` | Debounce before applying tracker refresh |

Runtime cache: `.mcp_data/` is created at MCP startup and stores reusable embedding vectors for files, identifiers, and call sites. A realtime tracker watches file updates and refreshes changed function/file embeddings incrementally.

## Fast Execute Mode (Mandatory)

Default to execution-first behavior. Use minimal tokens, minimal narration, and maximum tool leverage.

1. Skip long planning prose. Start with lightweight scoping: `get_context_tree` and `get_file_skeleton`.
2. Run independent discovery operations in parallel whenever possible (for example, multiple searches/reads).
3. Prefer structural tools over full-file reads to conserve context.
4. Before modifying or deleting symbols, run `get_blast_radius`.
5. Write changes through `propose_commit` only.
6. Run `run_static_analysis` once after edits, or once per changed module for larger refactors.

### Execution Rules

1. Think less, execute sooner: make the smallest safe change that can be validated quickly.
2. Do not serialize 10 independent commands; batch parallelizable reads/searches.
3. If a command fails, avoid blind retry loops. Diagnose once, pivot strategy, continue.
4. Cap retry attempts for the same failing operation to 1-2 unless new evidence appears.
5. Keep outputs concise: short status updates, no verbose reasoning dumps.

### Token-Efficiency Rules

1. Treat 100 effective tokens as better than 1000 vague tokens.
2. Use high-signal tool calls first (`get_file_skeleton`, `get_context_tree`, `get_blast_radius`).
3. Read full file bodies only when signatures/structure are insufficient.
4. Avoid repeated scans of unchanged areas.
5. Prefer direct edits + deterministic validation over extended speculative analysis.

## Strict Formatting Rules

### File Header (Mandatory)

Every file MUST start with exactly 2 comment lines (about 10 words each) explaining the file:

```
Regex-based symbol extraction engine for multi-language AST parsing
FEATURE: Core parsing layer for structural code analysis
```

Line 1: What the file does.
Line 2: `FEATURE: <name>` — the primary feature it belongs to. Links to hub.

### Zero Comments

No comments anywhere in the file except the 2-line header. No inline comments, no block comments, no TODO markers.

### Code Ordering

Strict order within every file:

1. Imports
2. Enums
3. Interfaces / Types
4. Constants
5. Functions / Classes
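
A skeletal file following this order might look like the following (all names are illustrative):

```typescript
// Summarizes lint issues into a compact severity report string
// FEATURE: Example file demonstrating the mandated code ordering

import { basename } from "node:path";

enum Severity {
  Info,
  Warning,
  Error,
}

interface LintIssue {
  file: string;
  severity: Severity;
}

const MAX_ISSUES = 100;

function summarize(issues: LintIssue[]): string {
  const errors = issues.filter((i) => i.severity === Severity.Error);
  return `${basename(issues[0]?.file ?? "")}: ${errors.length}/${Math.min(issues.length, MAX_ISSUES)} errors`;
}
```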

### Abstraction Thresholds

- **Under 20 lines, used once**: INLINE it. Do not extract into a function.
- **Under 20 lines, used multiple times**: Extract into a reusable function.
- **Over 30 lines**: Extract into its own function or file.
- **Max nesting**: 3-4 levels. Flatten deep nesting.
- **Max file length**: 500-1000 lines. Split larger files.
- **Max files per directory**: 10. Use subdirectories for organization.

### Variable Discipline

- No redundant intermediate variables. Chain calls: `c = g(f(a))` instead of `b = f(a); c = g(b)`.
- Exception: Keep intermediate variables that represent distinct, meaningful states.
- Remove all unused variables, imports, and files before finishing.
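
For instance, with two hypothetical helpers:

```typescript
function normalize(s: string): string {
  return s.trim().toLowerCase();
}

function slugify(s: string): string {
  return s.replace(/\s+/g, "-");
}

// Redundant intermediate variable:
//   const cleaned = normalize(title); const slug = slugify(cleaned);
// Chained instead:
const slug = slugify(normalize("  Context Plus  "));
```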

## Tool Reference

| Tool                   | When to Use                                             |
| ---------------------- | ------------------------------------------------------- |
| `get_context_tree`     | Start of every task. Map files + symbols with line ranges. |
| `semantic_navigate`    | Browse codebase by meaning, not directory structure.    |
| `get_file_skeleton`    | MUST run before full reads. Get signatures + line ranges first. |
| `semantic_code_search` | Find relevant files by concept with symbol definition lines. |
| `semantic_identifier_search` | Find closest functions/classes/variables and ranked call chains with line numbers. |
| `get_blast_radius`     | Before deleting or modifying any symbol.                |
| `run_static_analysis`  | After writing code. Catch dead code deterministically.  |
| `propose_commit`       | The ONLY way to save files. Validates before writing.   |
| `list_restore_points`  | See undo history.                                       |
| `undo_change`          | Revert a bad AI change without touching git.            |
| `get_feature_hub`      | Browse feature graph hubs. Find orphaned files.         |

## Anti-Patterns to Avoid

1. Reading entire files without checking the skeleton first.
2. Deleting functions without checking blast radius.
3. Creating small helper functions that are only used once.
4. Writing inline comments anywhere in the code.
5. Wrapping simple logic in 10 layers of abstraction or nesting.
6. Leaving unused imports or variables after a refactor.
7. Creating more than 10 files in a single directory.
8. Writing files longer than 1000 lines.
9. Running independent commands sequentially when they can be parallelized.
10. Repeating failed terminal commands without changing inputs or approach.

## Priority Reminder

Execute ASAP with the least tokens possible.
Use structural/context tools strategically, then patch and validate.
Avoid over-planning unless the task is ambiguous or high-risk.

Context+ keeps context bloat to a minimum while giving your agent deep semantic understanding of your codebase, from AST parsing and symbol navigation to blast radius analysis and commit validation. Nothing your agent needs is left out of context.

get_context_tree

Get the structural AST tree of a project with file headers plus line-numbered function/class/method symbols. Dynamic token-aware pruning shrinks output automatically.

INPUT
```
{
  target_path?: string,
  depth_limit?: number,
  include_symbols?: boolean,
  max_tokens?: number
}
```
OUTPUT
```
src/
  index.ts — Entry point
    function: getStars() (L170-L181)
    function: Home() (L183-L760)
  utils/
    parser.ts — AST parsing
      function: parseFile() (L22-L84)
      function: walkTree() (L86-L132)
```
get_file_skeleton

Get function signatures, class methods, and type definitions of a file with line ranges, without reading the full body.

INPUT
```
{ file_path: string }
```
OUTPUT
```
[function] L12-L58 export function parseFile(
  filePath: string,
  options?: ParseOptions
): Promise<AST>;

[class] L60-L130 export class Walker;
  [method] L72-L94 walk(node: Node): void;
  [method] L96-L118 getSymbols(): Symbol[];
```
semantic_code_search

Search the codebase by meaning, not exact text. Uses embeddings over file headers/symbols and returns matched definition lines.

INPUT
```
{ query: string, top_k?: number }
```
OUTPUT
```
1. src/auth/jwt.ts (94.0% total)
   Semantic: 91.5% | Keyword: 96.2%
   Definition lines: verifyToken@L20-L58, signToken@L60-L102
2. src/auth/session.ts (87.4% total)
   Definition lines: createSession@L12-L42
```
semantic_identifier_search

Find closest functions/classes/variables by meaning, then return ranked definition and call-chain locations with line numbers. Uses realtime-refreshed identifier embeddings.

INPUT
```
{
  query: string,
  top_k?: number,
  top_calls_per_identifier?: number,
  include_kinds?: string[]
}
```
OUTPUT
```
1. function verifyToken — src/auth/jwt.ts (L20-L58)
   Score: 92.4%
   Calls (3/3):
     1. src/middleware/guard.ts:L33 (88.1%) verifyToken(token)
     2. src/routes/api.ts:L12 (84.7%) const user = verifyToken(raw)
2. variable tokenExpiry — src/auth/config.ts (L8)
```
get_blast_radius

Before modifying code, trace every file and line where a symbol is imported or used. Prevents orphaned references.

INPUT
```
{
  symbol_name: string,
  file_context?: string
}
```
OUTPUT
```
parseFile — 7 usages
  src/index.ts:14  import { parseFile }
  src/tools/tree.ts:8  const ast = parseFile(p)
  src/tools/skeleton.ts:22  parseFile(path)
  test/parser.test.ts:5  import { parseFile }
```
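
Conceptually, a blast radius is a word-boundary scan for the symbol across every file. A toy version of the idea (not the actual implementation):

```typescript
interface Usage {
  file: string;
  line: number;
  text: string;
}

// Find every line where a symbol appears as a whole word, across a set of files
function blastRadius(symbol: string, files: Map<string, string>): Usage[] {
  const pattern = new RegExp(`\\b${symbol}\\b`);
  const usages: Usage[] = [];
  for (const [file, content] of files) {
    content.split("\n").forEach((text, i) => {
      if (pattern.test(text)) usages.push({ file, line: i + 1, text: text.trim() });
    });
  }
  return usages;
}
```

The word boundary matters: it keeps `parseFileName` from counting as a usage of `parseFile`.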
run_static_analysis

Run the native linter or compiler to find unused variables, dead code, and type errors. Supports TypeScript, Python, Rust, Go.

INPUT
```
{ target_path?: string }
```
OUTPUT
```
src/utils.ts:14:5
  error TS2345: Argument of type string
  is not assignable to parameter

src/old.ts:1:1
  warning: file has no exports
```
propose_commit

The only way to write code. Validates against strict rules before saving. Creates a shadow restore point before writing.

INPUT
```
{
  file_path: string,
  new_content: string
}
```
OUTPUT
```
✓ Header comment present
✓ No inline comments
✓ Max nesting depth: 3
✓ File length: 142 lines

Saved src/tools/search.ts
Restore point: rp-1719384000-a3f2
```
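
The gatekeeper can be imagined as a series of cheap string checks before any write. A sketch of two of them, header presence and inline comments (not the real validator):

```typescript
interface Violation {
  rule: string;
  detail: string;
}

// Validate proposed content against header, comment, and length rules
function validateCommit(content: string): Violation[] {
  const lines = content.split("\n");
  const violations: Violation[] = [];
  if (lines.length < 2 || !lines[0].trim() || !lines[1].includes("FEATURE:")) {
    violations.push({ rule: "header", detail: "missing 2-line header with FEATURE tag" });
  }
  lines.slice(2).forEach((line, i) => {
    if (line.trim().startsWith("//")) {
      violations.push({ rule: "comments", detail: `inline comment at line ${i + 3}` });
    }
  });
  if (lines.length > 1000) {
    violations.push({ rule: "length", detail: `${lines.length} lines exceeds 1000` });
  }
  return violations;
}
```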
list_restore_points

List all shadow restore points created by propose_commit. Each captures file state before AI changes.

INPUT
```
{ }
```
OUTPUT
```
rp-1719384000-a3f2 | 2025-06-26
  src/tools/search.ts | refactor search

rp-1719383000-b7c1 | 2025-06-26
  src/index.ts | add new tool
```
undo_change

Restore files to their state before a specific AI change. Uses shadow restore points. Does not affect git.

INPUT
```
{ point_id: string }
```
OUTPUT
```
Restored 1 file(s):
  src/tools/search.ts
```
semantic_navigate

Browse codebase by meaning using spectral clustering. Groups semantically related files into labeled clusters.

INPUT
```
{
  max_depth?: number,
  max_clusters?: number
}
```
OUTPUT
```
Authentication (4 files)
  src/auth/jwt.ts
  src/auth/session.ts
  src/middleware/guard.ts
  src/models/user.ts

Parsing (3 files)
  src/core/parser.ts
  src/core/tree-sitter.ts
  src/core/walker.ts
```
get_feature_hub

Obsidian-style feature hub navigator. Hubs are .md files with [[wikilinks]] that map features to code files.

INPUT
```
{
  hub_path?: string,
  feature_name?: string,
  show_orphans?: boolean
}
```
OUTPUT
```
## auth.md
[[src/auth/jwt.ts]]
  → verifyToken(token: string)
  → signToken(payload: object)

[[src/auth/session.ts]]
  → createSession(userId: string)
  → destroySession(id: string)
```

“Context engineering is the delicate art and science of filling the context window with just the right information for the next step.”

- Andrej Karpathy