TabbyML/Tabby — Prompts
Prompt 1: FIM (Fill-In-the-Middle) Template
Source: MODEL_SPEC.md — tabby.json model specification
Technique: Prompt templating for FIM inference. The model specification defines how context is structured for code completion requests.
{
"prompt_template": "<PRE>{prefix}<SUF>{suffix}<MID>",
"chat_template": "<s>{% for message in messages %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + '</s> ' }}{% endif %}{% endfor %}"
}
The FIM template:
- {prefix}: code before the cursor
- {suffix}: code after the cursor
- The model fills in the <MID> section
This is the core prompting primitive for code completion. Unlike chat-based agents that use free-form prompts, Tabby uses structured FIM templates defined per-model.
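As a minimal sketch of how such a template might be filled, the snippet below substitutes the two placeholders into the prompt_template string from the JSON above. The function name render_fim is hypothetical and this is not Tabby's actual implementation; it only illustrates the substitution step.

```python
import re

# Template string copied from the "prompt_template" field shown above.
FIM_TEMPLATE = "<PRE>{prefix}<SUF>{suffix}<MID>"


def render_fim(template: str, prefix: str, suffix: str) -> str:
    """Fill {prefix} and {suffix} in a FIM template (hypothetical helper).

    A single-pass regex substitution is used instead of str.format so that
    braces inside the source code being completed are never interpreted
    as placeholders.
    """
    values = {"prefix": prefix, "suffix": suffix}
    return re.sub(r"\{(prefix|suffix)\}", lambda m: values[m.group(1)], template)


# Code before and after the cursor; the model is expected to generate
# the text that belongs after the <MID> marker.
prompt = render_fim(FIM_TEMPLATE, "def add(a, b):\n    return ", "\n\nprint(add(1, 2))\n")
print(prompt)
```

Because the substitution is done in one pass, a prefix that happens to contain the literal text {suffix} is left untouched rather than being expanded a second time.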
Prompt 2: RAG Context Injection
Source: Inferred from tabby-index and tabby-crawler architecture
Technique: Retrieval-Augmented Generation for code completion. The repository index provides relevant code snippets as additional context for completions.
When generating a completion, Tabby:
- Takes the current file context (prefix + suffix)
- Queries the repository index for similar/relevant code patterns
- Injects retrieved snippets as additional context
- Generates the completion with expanded context
Tabby documents this retrieved context as "locally relevant snippets (declarations from local LSP, and recently modified code)", a feature added in v0.5.
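The four steps above can be sketched as follows. Everything here is an assumption made for illustration: the in-memory index, the token-overlap (Jaccard) scoring, the function names, and the choice to inject snippets as comment lines ahead of the prefix. The source does not specify how Tabby ranks or formats retrieved snippets.

```python
import re


def tokenize(text: str) -> set:
    """Split text into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))


def retrieve(query: str, index: dict, top_k: int = 2) -> list:
    """Rank indexed snippets by Jaccard overlap with the query (toy retriever)."""
    q = tokenize(query)
    scored = []
    for path, snippet in index.items():
        s = tokenize(snippet)
        union = q | s
        score = len(q & s) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, path, snippet))
    # Highest score first; ties broken by path for deterministic output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(path, snippet) for _, path, snippet in scored[:top_k]]


def build_prefix(prefix: str, snippets: list, comment: str = "# ") -> str:
    """Prepend retrieved snippets to the prefix as commented-out context."""
    lines = []
    for path, snippet in snippets:
        lines.append(f"{comment}Path: {path}")
        lines.extend(comment + line for line in snippet.splitlines())
    lines.append(prefix)
    return "\n".join(lines)


# Hypothetical repository index: file path -> code snippet.
index = {
    "utils/math.py": "def clamp(value, low, high):\n    return max(low, min(value, high))",
    "utils/text.py": "def slugify(title):\n    return title.lower()",
}
prefix = "x = clamp(value, "
expanded = build_prefix(prefix, retrieve(prefix, index, top_k=1))
print(expanded)
```

The expanded prefix would then take the place of {prefix} in the model's FIM template, so the retrieved declaration is visible to the model without changing the template itself.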
Prompt 3: Answer Engine Query Handling
Source: Inferred from v0.13 release notes
Technique: Grounded Q&A with source attribution. The Answer Engine is designed to provide "reliable and precise answers" grounded in indexed content, with citations.
The system likely uses a retrieval step (query → relevant docs/code) followed by generation with explicit "cite your sources" constraints, similar to the OpenHands documentation microagent.
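Since the Answer Engine's actual prompt is not available in the source, the following is only a sketch of what a retrieve-then-generate prompt with citation constraints could look like. The instruction wording, the [n] citation format, and both function names are hypothetical.

```python
import re


def build_grounded_prompt(question: str, sources: list) -> str:
    """Assemble a prompt that numbers each source and asks for citations."""
    numbered = "\n\n".join(
        f"[{i}] {title}\n{body}" for i, (title, body) in enumerate(sources, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite each claim with its source number, e.g. [1].\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def cited_sources(answer: str, n_sources: int) -> list:
    """Return the valid source numbers cited in an answer, sorted and deduplicated."""
    ids = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return sorted(i for i in ids if 1 <= i <= n_sources)


# Hypothetical retrieved documents: (title, body) pairs.
sources = [
    ("README.md", "Tabby is a self-hosted AI coding assistant."),
    ("MODEL_SPEC.md", "Each model directory contains a tabby.json file."),
]
prompt = build_grounded_prompt("Where is the prompt template defined?", sources)
print(prompt)
print(cited_sources("It is defined in tabby.json [2].", len(sources)))
```

Checking the cited numbers against the sources that were actually supplied is one simple way to detect an answer that cites nothing or cites a source that does not exist.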
Prompting Techniques Used
- FIM templates: Structured prefix/suffix/middle format for code completion — not conversational prompting
- Per-model prompt templates: Each model has its own template defined in tabby.json, not a universal format
- RAG injection: Repository context retrieved and injected as additional prompt context
- Grounded generation: Answer Engine requires source attribution
- Chat templates: Jinja2-style templates for instruct/chat models (separate from FIM templates)
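To make the chat template from Prompt 1 easier to read, the sketch below re-expresses its Jinja2 loop as plain Python. It is a hand-written equivalent for illustration, not the rendering code Tabby uses, and the function name render_chat is hypothetical.

```python
def render_chat(messages: list) -> str:
    """Mirror the logic of the chat_template shown in Prompt 1.

    User turns are wrapped in [INST] ... [/INST]; assistant turns are
    followed by the </s> end-of-sequence marker and a space. As in the
    template, messages with any other role are skipped.
    """
    out = "<s>"
    for message in messages:
        if message["role"] == "user":
            out += "[INST] " + message["content"] + " [/INST]"
        elif message["role"] == "assistant":
            out += message["content"] + "</s> "
    return out


conversation = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
    {"role": "user", "content": "Explain FIM"},
]
print(render_chat(conversation))
```

Unlike the FIM template, which has exactly two slots, this format grows with the conversation, which is why the two are kept as separate fields in tabby.json.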