Filters¶

Response post-processing filters. Filters transform raw model outputs before scoring.

Filter ¶

Bases: Protocol[_T]



Post-process model responses for a task before scoring.

Filters transform raw model outputs (instance.resps) into a form suitable for metric computation. They operate on all docs of a task at once, receiving a 2-D structure:

outer (Iterable) — one entry per doc
inner (Sequence) — one entry per repeat of that doc

Multiple filters can be chained via FilterEnsemble. _T is the response element type: Completion (str) for generation tasks, or LLOutput (tuple[float, bool]) for loglikelihood tasks. It defaults to Completion.
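As an illustration of the 2-D structure described above, the following is a minimal sketch of a class that satisfies the protocol structurally. StripWhitespaceFilter and the sample data are hypothetical and not part of lm_eval; the sketch only assumes that any object with a matching apply method is acceptable.

```python
from collections.abc import Iterable, Sequence
from typing import Any


class StripWhitespaceFilter:
    """Hypothetical filter (not part of lm_eval): trims each string completion."""

    def apply(
        self,
        resps: Iterable[Sequence[str]],
        docs: Sequence[dict[str, Any]],
    ) -> Iterable[Sequence[str]]:
        # One inner list per doc, one element per repeat; doc order is preserved.
        # Returning a lazy ``map`` keeps the filter cheap to chain.
        return map(lambda doc_resps: [r.strip() for r in doc_resps], resps)


# Two docs with two repeats each (made-up data).
resps = [["  4 ", "4\n"], [" Paris", "Paris  "]]
docs = [{"question": "2+2?"}, {"question": "Capital of France?"}]

filtered = list(StripWhitespaceFilter().apply(resps, docs))
print(filtered)  # [['4', '4'], ['Paris', 'Paris']]
```

The outer list has one entry per doc and each inner list has one entry per repeat, so the output shape matches the input shape.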

Functions¶

apply ¶

apply(resps: Iterable[Sequence[_T]], docs: Sequence[dict[str, Any]]) -> Iterable[Sequence[_T]]

Transform model responses.

PARAMETER	DESCRIPTION
`resps`	Per-doc response sequences. Outer `Iterable` iterates over docs; inner `Sequence` holds repeats. TYPE: `Iterable[Sequence[_T]]`
`docs`	The source document for each entry (parallel to resps). TYPE: `Sequence[dict[str, Any]]`

RETURNS	DESCRIPTION
`Iterable[Sequence[_T]]`	Transformed responses in the same doc order. May be lazy (`map`) to allow generator chaining between filters.

Source code in lm_eval/api/filter.py

def apply(
    self,
    resps: Iterable[Sequence[_T]],
    docs: Sequence[dict[str, Any]],
) -> Iterable[Sequence[_T]]:
    """Transform model responses.

    Args:
        resps: Per-doc response sequences.  Outer ``Iterable``
            iterates over docs; inner ``Sequence`` holds repeats.
        docs: The source document for each entry (parallel to *resps*).

    Returns:
        Transformed responses **in the same doc order**.  May be
        lazy (``map``) to allow generator chaining between filters.
    """
    ...

FilterEnsemble `dataclass` ¶

FilterEnsemble(name: str, filters: list[Callable[[], Filter]])

A named chain of Filter steps applied sequentially.

Each Scorer owns one FilterEnsemble. When applied, it extracts (resps, doc) pairs from every Instance, threads them through each filter in order (outputs feed into the next filter's inputs), and stores the final result in Instance.filtered_resps[self.name].

Filters in the chain may return lazy iterables (e.g. map); materialisation is deferred until the final zip writes results back.

Attributes¶

name `instance-attribute` ¶

name: str

filters `instance-attribute` ¶

filters: list[Callable[[], Filter]]

Functions¶

apply ¶

apply(instances: Sequence[Instance]) -> None

Source code in lm_eval/api/filter.py

def apply(self, instances: Sequence[Instance]) -> None:
    resps, docs = zip(*((inst.resps, inst.doc) for inst in instances), strict=True)
    # resps, docs = list(resps), list(docs)

    for f in self.filters:
        # apply filters in sequence
        resps = f().apply(resps, docs)

    # Store the final filtered responses in `filtered_resps` on each source instance,
    # keyed by `self.name`; each FilterEnsemble applied in a given run should use a unique name.
    for inst, resp in zip(instances, resps, strict=True):
        inst.filtered_resps[self.name] = resp

Built-in Filters¶

filters ¶

Attributes¶

__all__ `module-attribute` ¶

__all__ = ['build_filter_ensemble', 'custom', 'extraction', 'selection', 'transformation']

Classes¶

Functions¶

build_filter_ensemble ¶

build_filter_ensemble(filter_name: str, components: list[tuple[str, dict[str, str | int | float] | None]]) -> FilterEnsemble

Create a filtering pipeline.

Source code in lm_eval/filters/__init__.py

def build_filter_ensemble(
    filter_name: str,
    components: list[tuple[str, dict[str, str | int | float] | None]],
) -> FilterEnsemble:
    """Create a filtering pipeline."""
    # create filters given its name in the registry, and add each as a pipeline step
    return FilterEnsemble(
        name=filter_name,
        filters=[
            partial(get_filter(func), **(kwargs or {})) for func, kwargs in components
        ],
    )
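The partial-based construction above can be illustrated with a self-contained sketch. HYPOTHETICAL_REGISTRY, build_factories, and the two filter classes are assumptions made for this example; the real lookup goes through get_filter, whose registry contents are not shown in this section.

```python
from functools import partial


class Lowercase:
    """Hypothetical filter taking no configuration."""

    def apply(self, resps, docs):
        return map(lambda rs: [r.lower() for r in rs], resps)


class TakeFirstK:
    """Hypothetical filter configured through a keyword argument."""

    def __init__(self, k: int = 1):
        self.k = k

    def apply(self, resps, docs):
        return map(lambda rs: rs[: self.k], resps)


# Stand-in for the filter registry (assumption for this sketch).
HYPOTHETICAL_REGISTRY = {"lowercase": Lowercase, "take_first_k": TakeFirstK}


def build_factories(components):
    # Mirrors the list comprehension above: bind kwargs now, instantiate later.
    return [
        partial(HYPOTHETICAL_REGISTRY[name], **(kwargs or {}))
        for name, kwargs in components
    ]


factories = build_factories([("lowercase", None), ("take_first_k", {"k": 2})])
first, second = (make() for make in factories)
print(type(first).__name__, second.k)  # Lowercase 2
```

Storing zero-argument factories rather than instances means each call to the ensemble's apply constructs fresh filter objects, so no state carries over between runs. A None kwargs entry is treated as an empty dict.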
