Documentation Syntax Guide
author: claude
This project uses mkdocstrings with the Python handler to render API documentation on ReadTheDocs. Docstrings are written in Google style and use Markdown link syntax for cross-references.
Not Sphinx
This project does not use Sphinx. RST-style roles like `` :class:`Foo` `` or `` :meth:`Bar.baz` `` do not work; they render as literal text. Use the Markdown link syntax described below instead.
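A quick way to catch leftover RST constructs is to scan docstrings before building the docs. The following is a minimal sketch using only the standard library; the helper name `find_rst_leftovers`, the list of role names, and the sample source are assumptions made for illustration, and the patterns are heuristics rather than a complete RST detector. It also flags the `Example::` literal-block marker, which this guide likewise describes as unsupported.

```python
import ast
import re

# Heuristic patterns for reStructuredText leftovers (not exhaustive).
RST_ROLE = re.compile(r":(?:class|meth|func|mod|attr|exc|obj):`[^`]+`")
RST_LITERAL_BLOCK = re.compile(r"^[ \t]*\w[\w ]*::[ \t]*$", re.MULTILINE)


def find_rst_leftovers(source: str) -> list[tuple[str, str]]:
    """Return (object name, offending text) for each RST construct in a docstring."""
    problems = []
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, documented):
            continue
        doc = ast.get_docstring(node) or ""
        name = getattr(node, "name", "<module>")
        for pattern in (RST_ROLE, RST_LITERAL_BLOCK):
            problems += [(name, m.group(0).strip()) for m in pattern.finditer(doc)]
    return problems


# Hypothetical module source containing both kinds of leftovers.
SAMPLE = '''
def evaluate():
    """Run evaluation with a :class:`TaskManager`.

    Example::

        evaluate()
    """
'''

for name, text in find_rst_leftovers(SAMPLE):
    print(f"{name}: {text}")
```

Reading the source with `ast` means the check runs without importing the package, so it could be wired into a pre-commit hook or CI step.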
This guide covers the two places you write documentation:
- Python docstrings: inline in `.py` files (Google style + Markdown cross-refs)
- Markdown pages: `.md` files under `docs/` (standard Markdown + MkDocs extensions)
Cheat Sheet
| What you want | In docstrings | In .md pages (Markdown) |
|---|---|---|
| Inline code | ``code`` | `code` |
| Bold | **bold** | **bold** |
| Italic | *italic* | *italic* |
| Link to function | [evaluate][lm_eval.evaluator.evaluate] | Same syntax, or use ::: directives |
| Link to class | [TaskManager][lm_eval.tasks.TaskManager] | Same syntax |
| Link to method | [TaskManager.load][lm_eval.tasks.TaskManager.load] | Same syntax |
| Relative link (same class) | [load][.load] | N/A |
| Scoped link (same module) | [GenScorer][GenScorer] | N/A |
| Code block (script) | Example: + fenced ``` block | Fenced ``` block |
| Code block (REPL) | Examples: + >>> lines (note plural) | Fenced ``` block |
| Bullet list | * item or - item | - item |
| Numbered list | 1. item | 1. item |
| Heading | N/A (use sections like Args:) | # H1 / ## H2 / ### H3 |
| Admonition/callout | Note: or Warning: sections | !!! note / !!! warning |
Part 1: Python Docstrings (Google Style)
The project's `zensical.toml` sets `docstring_style = "google"`, so mkdocstrings parses docstrings using Google style conventions.
Basic Structure
```python
def simple_evaluate(
    model: str | LM,
    tasks: list[str] | None = None,
    num_fewshot: int | None = None,
) -> EvalResults | None:
    """High-level entry point for evaluation.

    Longer description goes here. Can span multiple paragraphs.
    Blank lines separate paragraphs.

    Args:
        model: Name of model or LM object.
        tasks: List of task names or Task objects.
        num_fewshot: Number of examples in few-shot context.

    Returns:
        Dictionary of results, or None if not on rank 0.

    Raises:
        ValueError: If no tasks are provided.
    """
```
Supported Sections
Google-style sections recognized by mkdocstrings:
| Section | Use for |
|---|---|
| `Args:` | Function/method parameters |
| `Returns:` | Return value description |
| `Raises:` | Exceptions the function may raise |
| `Yields:` | For generator functions |
| `Attributes:` | Class or dataclass attributes |
| `Example:` | Code example (admonition; use fenced block) |
| `Examples:` | REPL examples (auto-parses `>>>` lines) |
| `Note:` | Important notes |
| `Warning:` | Warnings |
| `Todo:` | Future work |
| `See Also:` | Related functions/classes |
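To see at a glance which of these sections a docstring uses, a small scan for header lines is enough. This is a rough sketch, not mkdocstrings' own parser: `KNOWN_SECTIONS` simply mirrors the table above, and the helper name `sections_used` and the sample docstring are invented for the example.

```python
import re

# Section titles copied from the table above (mkdocstrings may accept more).
KNOWN_SECTIONS = {
    "Args", "Returns", "Raises", "Yields", "Attributes", "Example",
    "Examples", "Note", "Warning", "Todo", "See Also",
}

# A header is a capitalised title alone on its line, ending in a single colon.
HEADER = re.compile(r"^[ \t]*([A-Z][A-Za-z ]*):[ \t]*$", re.MULTILINE)


def sections_used(docstring: str) -> list[str]:
    """List the known section headers in the order they appear."""
    return [name for name in HEADER.findall(docstring) if name in KNOWN_SECTIONS]


DOC = """High-level entry point for evaluation.

Args:
    model: Name of model or LM object.

Returns:
    Dictionary of results, or None if not on rank 0.

Raises:
    ValueError: If no tasks are provided.
"""

print(sections_used(DOC))
```

Entries such as `model: ...` are not mistaken for headers because they carry text after the colon.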
Args with Types
Types can be specified in the docstring or in the signature (preferred). When types are already in the signature, don't duplicate them:
```python
# PREFERRED: types in signature, not repeated in docstring
def load(self, names: list[str], *, num_fewshot: int | None = None) -> TaskDict:
    """Load tasks by name.

    Args:
        names: List of task names, glob patterns, or file paths.
        num_fewshot: Override the task's default fewshot count.
    """

# ALSO VALID: types in docstring (useful when the signature isn't visible)
def load(self, names, num_fewshot=None):
    """Load tasks by name.

    Args:
        names (list[str]): List of task names, glob patterns, or file paths.
        num_fewshot (int | None): Override the task's default fewshot count.
    """
```
Attributes Section (for dataclasses / classes)
Inline attribute docs
```python
from dataclasses import dataclass


@dataclass
class Group:
    """A named group of tasks."""

    group: str
    """Display name of the group."""

    group_alias: str | None = None
    """Optional alias for result display."""
```
Part 2: Cross-References (Inside Docstrings)
Inline Code: Double Backticks (``)
In docstrings, inline code uses double backticks:
```python
"""Ignored if ``model`` argument is an LM object."""
"""Each entry follows the ``"metric"`` key plus optional kwargs."""
"""An empty list ``[]`` signals 'no explicit filters'."""
```
Double backticks render as monospace code in the docs.
Cross-References: [text][target]
mkdocstrings uses Markdown link syntax for cross-references, not RST roles. The general form is `[text][target]`, where `text` is the display text and `target` identifies the documented object.
This renders as a clickable hyperlink to the target's documentation.
How Path Resolution Works (preferred order)
The target is a dotted Python import path to the object. With `relative_crossrefs` and `scoped_crossrefs` enabled in `zensical.toml`, you have three forms. Prefer the shortest one that works, since it keeps docstrings readable.
1. Scoped name: bare name, auto-resolved (preferred)
With `scoped_crossrefs = true`, bare names are resolved by searching the current scope (class members → module → parents), for example `[GenScorer][GenScorer]`.
2. Relative path with `.` prefix: when scoped is ambiguous
With `relative_crossrefs = true`, a leading `.` means "relative to the current object", for example `[reduce][.reduce]`.
3. Full path: for cross-module references
Use the full import path when the target is in a different module, for example `[Scorer][lm_eval.scorers.Scorer]`.
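The three forms can appear side by side in ordinary docstrings. The sketch below is illustrative only: the `Scorer` and `GenScorer` classes here are stand-ins with made-up bodies, not the project's real implementations, and the relative link is placed in the class docstring on the assumption that `.` resolves against the object that owns the docstring.

```python
class Scorer:
    """Base scorer (illustrative stand-in, not the project's real class).

    Relative form: per-document scores are combined by [reduce][.reduce].
    """

    def score_doc(self, doc: dict) -> float:
        """Score a single document."""
        return 0.0

    def reduce(self, scores: list[float]) -> float:
        """Scoped form: average the values returned by [score_doc][score_doc]."""
        return sum(scores) / len(scores) if scores else 0.0


class GenScorer(Scorer):
    """Scoped form, same module: a generation scorer built on [Scorer][Scorer].

    Full path, other module: filters follow the
    [FilterStep][lm_eval.config.task.FilterStep] shape.
    """
```

Only the last reference needs the full dotted path, because its target lives in a different module from the docstring.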
Syntax Variants
| Syntax | Displayed as | Notes |
|---|---|---|
| `[GenScorer][GenScorer]` | GenScorer | Scoped lookup (same module) |
| `[reduce][.reduce]` | reduce | Relative to current class |
| `[Scorer][lm_eval.scorers.Scorer]` | Scorer | Full path, explicit display text |
| `[Scorer.reduce][lm_eval.scorers.Scorer.reduce]` | Scorer.reduce | Full path method reference |
| `[lm_eval.scorers.Scorer][]` | lm_eval.scorers.Scorer | Auto-titled (full path shown) |
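When reviewing docstrings, it can help to list every cross-reference and see which form it uses. The following is a heuristic sketch, not mkdocstrings' resolver: the `classify` helper and its labels are assumptions for illustration, and it treats any dotted target as a full path even though a scoped name could in principle contain dots.

```python
import re

# Matches Markdown reference links of the form [text][target]; target may be empty.
CROSSREF = re.compile(r"\[([^\]\[]+)\]\[([^\]\[]*)\]")


def classify(docstring: str) -> list[tuple[str, str, str]]:
    """Return (display text, target, form) for each cross-reference found."""
    found = []
    for text, target in CROSSREF.findall(docstring):
        target = target or text  # [path][] reuses the text as the target
        if target.startswith("."):
            form = "relative"
        elif "." in target:
            form = "full path"
        else:
            form = "scoped"
        found.append((text, target, form))
    return found


SAMPLE = (
    "See [GenScorer][GenScorer], [reduce][.reduce], "
    "[Scorer][lm_eval.scorers.Scorer] and [lm_eval.scorers.Scorer][]."
)

for text, target, form in classify(SAMPLE):
    print(f"{text!r} -> {target!r} ({form})")
```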
When to Use Which Form
| Situation | Recommended form |
|---|---|
| Target is in the same class | Scoped: `[score_doc][score_doc]` or relative: `[score_doc][.score_doc]` |
| Target is in the same module | Scoped: `[GenScorer][GenScorer]` |
| Target is in a different module | Full path: `[FilterStep][lm_eval.config.task.FilterStep]` |
| You want custom display text | Explicit: `[see the scorer][lm_eval.scorers.Scorer]` |
When to Use [links] vs ``backticks``
- Use `[text][path]` for: classes, functions, and methods that exist in your codebase and should be clickable links
- Use ``double backticks`` for: values, strings, variable names, dict keys, code snippets that shouldn't link anywhere
```python
# GOOD: link to a real class, backticks for a string value
"""Each entry follows the [FilterStep][lm_eval.config.task.FilterStep]
shape (``"function"`` key plus optional ``"kwargs"``)."""
```
Important: The Target Must Be Documented
Cross-references only resolve into clickable links if the target object appears somewhere in your built docs (i.e., it's pulled in by a `:::` directive in some `.md` page). If the target isn't documented, the reference renders as plain text instead of a link.
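A rough pre-build check can flag full-path targets that no `:::` directive appears to cover. This is a simplified sketch under stated assumptions: the helper `undocumented_targets` is hypothetical, it only looks at full dotted targets, and it assumes a directive for an object also documents that object's members, which depends on the directive's options.

```python
import re

# "::: dotted.path" at the start of a line in a Markdown page.
DIRECTIVE = re.compile(r"^::: *([\w.]+)", re.MULTILINE)
# The target part of [text][dotted.path]; bare and relative names are skipped.
FULL_REF = re.compile(r"\]\[([A-Za-z_][\w.]*\.[\w.]+)\]")


def undocumented_targets(markdown_pages: list[str], docstrings: list[str]) -> set[str]:
    """Return full-path targets not covered by any ::: directive."""
    documented = {path for page in markdown_pages for path in DIRECTIVE.findall(page)}

    def covered(target: str) -> bool:
        # Covered if the target itself, or one of its parents, has a directive.
        return any(target == d or target.startswith(d + ".") for d in documented)

    targets = {t for doc in docstrings for t in FULL_REF.findall(doc)}
    return {t for t in targets if not covered(t)}


PAGES = ["# Entry Points\n\n::: lm_eval.evaluator.simple_evaluate\n\n::: lm_eval.tasks.TaskManager\n"]
DOCSTRINGS = [
    "Uses [TaskManager.load][lm_eval.tasks.TaskManager.load] and "
    "[FilterStep][lm_eval.config.task.FilterStep]."
]

print(undocumented_targets(PAGES, DOCSTRINGS))
```

The build itself remains the authoritative check; this only narrows down where to look when a link renders as plain text.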
Code Blocks in Docstrings
There are two approaches depending on whether you want to show a script or interactive REPL output.
Script examples: `Example:` + fenced block
Use `Example:` (singular, one colon) with a fenced Markdown code block. This gives you explicit language control and proper syntax highlighting:
````python
class TaskManager:
    """Central entry point for discovering and loading evaluation tasks.

    Example:
        ```python
        tm = TaskManager(include_path="my_tasks/")
        result = tm.load(["mmlu", "hellaswag"])
        result["tasks"]   # {name: Task, ...}
        result["groups"]  # {name: Group, ...}
        ```
    """
````
For non-Python examples, change the language tag:
````python
class ScorerConfig:
    """Configuration for a registered scorer.

    Example:
        ```yaml
        # String shorthand
        scorer: first_token
        ```
    """
````
REPL examples: `Examples:` + `>>>` lines
Use `Examples:` (plural) for interactive console style. mkdocstrings auto-parses `>>>` lines into code blocks and separates prose:
```python
def normalize_metric_cfg(cfg: dict) -> dict:
    """Normalize a metric config dict.

    Examples:
        Basic usage:

        >>> normalize_metric_cfg({"metric": "exact_match", "ignore_case": True})
        {"metric": "exact_match", "kwargs": {"ignore_case": True}}

        Already normalized input is returned as-is:

        >>> normalize_metric_cfg({"metric": "acc", "kwargs": {}})
        {"metric": "acc", "kwargs": {}}
    """
```
`Example::` (double colon) does not work
The `Example::` syntax is from reStructuredText/Sphinx. mkdocstrings does not understand `::` literal blocks; the content will render as collapsed inline text without line breaks. Use one of the two approaches above instead.
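Because the `Examples:` form uses `>>>` lines, the same docstring can also be exercised with the standard-library `doctest` module. The sketch below assumes a hypothetical implementation of `normalize_metric_cfg` written only for this example; whether the project actually runs doctests is not stated in this guide. Note that for doctest to pass, the expected output has to match Python's `repr`, which prints dict keys and strings with single quotes.

```python
import doctest


def normalize_metric_cfg(cfg: dict) -> dict:
    """Normalize a metric config dict (hypothetical implementation for this sketch).

    Examples:
        Basic usage:

        >>> normalize_metric_cfg({"metric": "exact_match", "ignore_case": True})
        {'metric': 'exact_match', 'kwargs': {'ignore_case': True}}

        Already normalized input is returned as-is:

        >>> normalize_metric_cfg({"metric": "acc", "kwargs": {}})
        {'metric': 'acc', 'kwargs': {}}
    """
    if "kwargs" in cfg:
        return cfg
    kwargs = {k: v for k, v in cfg.items() if k != "metric"}
    return {"metric": cfg["metric"], "kwargs": kwargs}


# Parse the >>> examples out of the docstring and run them.
test = doctest.DocTestParser().get_doctest(
    normalize_metric_cfg.__doc__,
    globs={"normalize_metric_cfg": normalize_metric_cfg},
    name="normalize_metric_cfg",
    filename="<sketch>",
    lineno=0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, "failed out of", results.attempted)
```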
Bulleted / Numbered Lists in Docstrings
```python
"""Filter / metric precedence (highest to lowest):

1. Explicit ``cfg["filter"]`` / ``cfg["metric_list"]`` passed to ``from_dict``
2. ``cls.default_filter_cfg`` / ``cls.default_metric_cfg``
3. Hardcoded fallback (``noop`` / *global_metrics*)
"""
```
Or with bullets:
```python
"""
* ``None`` values are dropped.
* Any callable value is serialized with [serialize_callable][lm_eval.config.utils.serialize_callable].
"""
```
Part 3: Markdown Pages (docs/*.md)
The `.md` files under `docs/` use standard Markdown plus extensions from Material for MkDocs and PyMdown Extensions.
mkdocstrings Directives: Pulling in API Docs
The `:::` directive tells mkdocstrings to auto-generate documentation from a Python object:
```markdown
# Entry Points

The main functions for running evaluations programmatically.

::: lm_eval.evaluator.simple_evaluate

::: lm_eval.evaluator.evaluate
```
Each directive also accepts an indented `options:` block. Common options you can set per directive:
| Option | Effect |
|---|---|
| `show_root_heading` | Show the object name as a heading |
| `members` | List specific members to show |
| `show_source` | Show source code link |
| `heading_level` | Override heading level (default: page level) |
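Options are written as an indented YAML block directly under the `:::` line. As a sketch of that layout, the hypothetical helper below builds the directive text from a Python dict; `render_directive` is not part of mkdocstrings, and the option values chosen are illustrative.

```python
from __future__ import annotations


def render_directive(path: str, options: dict | None = None) -> str:
    """Build the Markdown text for a ::: directive with an optional options block."""
    lines = [f"::: {path}"]
    if options:
        lines.append("    options:")
        for key, value in options.items():
            if isinstance(value, bool):
                # YAML booleans are lowercase.
                lines.append(f"      {key}: {str(value).lower()}")
            elif isinstance(value, (list, tuple)):
                lines.append(f"      {key}:")
                lines.extend(f"        - {item}" for item in value)
            else:
                lines.append(f"      {key}: {value}")
    return "\n".join(lines)


print(
    render_directive(
        "lm_eval.tasks.TaskManager",
        {"show_root_heading": True, "members": ["load"], "heading_level": 3},
    )
)
```

The printed text can be pasted into a `.md` page as-is; in practice most pages write the directive by hand, and the helper is only there to make the indentation explicit.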
Links to GitHub Source
Use the Material icon button pattern:
```markdown
[:material-github: Source](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/evaluator.py){ .md-button }
```
- `[:material-github: Source]` — renders a GitHub icon (from Material Design Icons) followed by the link text
- `(https://...)` — the target URL
- `{ .md-button }` — applies the `md-button` CSS class, rendering the link as a styled button
Code Blocks
Standard fenced code blocks with syntax highlighting:
```python
from lm_eval import evaluator
results = evaluator.simple_evaluate(model="hf", ...)
```
```yaml
task: my_task
dataset_path: my_dataset
output_type: multiple_choice
```
With line highlighting and annotations (enabled via pymdownx.highlight):
```python hl_lines="2 3"
from lm_eval import evaluator
results = evaluator.simple_evaluate( # (1)!
model="hf",
model_args="pretrained=gpt2",
)
```
1. This is an annotation explaining the highlighted line.
Tabs
Use `pymdownx.tabbed` for tabbed content:
=== "Python"

    ```python
    from lm_eval import evaluator
    evaluator.simple_evaluate(model="hf", tasks=["mmlu"])
    ```

=== "CLI"

    ```bash
    lm_eval --model hf --tasks mmlu
    ```
Admonitions
Use the `admonition` extension for callout boxes:
!!! note
    Task names are case-sensitive.

!!! warning
    Running with `--confirm_run_unsafe_code` enables arbitrary code execution.

!!! tip "Performance"
    Use `--batch_size auto` to let the harness find the optimal batch size.

!!! example
    ```python
    tm = TaskManager()
    result = tm.load(["hellaswag"])
    ```

Collapsible admonitions (via `pymdownx.details`):
??? note "Click to expand"
    Hidden content here.

???+ note "Expanded by default"
    Visible content here.
Task Lists
Tables
Standard Markdown tables:
| Metric | Type | Aggregation |
|-------------|---------------|-------------|
| `acc` | loglikelihood | `mean` |
| `exact_match`| generation | `mean` |
Mermaid Diagrams
Enabled via pymdownx.superfences:
```mermaid
graph LR
A[TaskConfig YAML] --> B[TaskManager.load]
B --> C[Task objects]
C --> D[Scorer pipeline]
D --> E[Metrics]
```
Footnotes
This uses bootstrap resampling[^1] for standard error estimation.
[^1]: See `bootstrap_iters` parameter in `simple_evaluate`.
Tips
- Run locally to preview: `zensical build` then open `site/index.html`. If cross-refs don't resolve, try `rm -rf .cache site` first.
- Cross-references only resolve if the target object is documented (included via a `:::` directive somewhere)
- Indentation matters for fenced code blocks inside docstrings: use 4 spaces
- Newlines in `Returns:` / `Args:` start new entries: mkdocstrings treats each non-indented line under a section as a separate item. Keep descriptions as one continuous paragraph, or indent continuation lines:

    ```python
    # BAD: the dict line becomes a second return entry
    """
    Returns:
        Aggregated metrics dict for this group:
        {"alias": str, "acc,none": float, ...}
    """

    # GOOD: one paragraph with inline code
    """
    Returns:
        Aggregated metrics dict for this group, e.g.
            ``{"alias": str, "acc,none": float, ...}``
    """
    ```
- The `signature_crossrefs = true` setting in `zensical.toml` means type annotations in signatures automatically become links where possible
- Don't duplicate types: if the function signature has type hints, mkdocstrings renders them automatically; no need to repeat in `Args:`
- The `relative_crossrefs` and `scoped_crossrefs` options in `zensical.toml` enable the shorter `[.sibling]` and `[SameName]` reference forms
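The newline tip above comes down to a simple rule: a line at the section's base indentation starts a new entry, and a more deeply indented line continues the previous one. The sketch below models only that rule; `split_items` is a hypothetical helper, not mkdocstrings' actual parser, and the two inputs are the bodies of a `Returns:` section.

```python
def split_items(section_body: str) -> list[str]:
    """Group lines into items: base-indent lines start an item, deeper ones continue it."""
    lines = [ln for ln in section_body.splitlines() if ln.strip()]
    if not lines:
        return []
    base = min(len(ln) - len(ln.lstrip()) for ln in lines)
    items: list[str] = []
    for ln in lines:
        indent = len(ln) - len(ln.lstrip())
        if indent == base or not items:
            items.append(ln.strip())
        else:
            items[-1] += " " + ln.strip()
    return items


# Both lines at the same indentation: two separate entries.
BAD = """\
    Aggregated metrics dict for this group:
    {"alias": str, "acc,none": float, ...}
"""

# Continuation line indented further: one entry.
GOOD = """\
    Aggregated metrics dict for this group, e.g.
        ``{"alias": str, "acc,none": float, ...}``
"""

print(len(split_items(BAD)), "entries in BAD")
print(len(split_items(GOOD)), "entry in GOOD")
```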