Filters

Response post-processing filters. Filters transform raw model outputs before scoring.

Filter

Bases: Protocol[_T]



Post-process model responses for a task before scoring.

Filters transform raw model outputs (instance.resps) into a form suitable for metric computation. They operate on all docs of a task at once, receiving a 2-D structure:

  • outer (Iterable) — one entry per doc
  • inner (Sequence) — one entry per repeat of that doc

Multiple filters can be chained via FilterEnsemble. _T is the response element type: Completion (str) for generation tasks, LLOutput (tuple[float, bool]) for loglikelihood tasks. It defaults to Completion.
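
As an illustration of the 2-D shape described above, here is a minimal sketch of a custom filter. It assumes that Filter is a structural Protocol, so any class with a matching apply method qualifies. StripWhitespaceFilter and the sample data are hypothetical and not part of lm_eval, and the sketch deliberately avoids importing the library.

```python
from collections.abc import Iterable, Sequence
from typing import Any


class StripWhitespaceFilter:
    """Hypothetical filter: strips whitespace from every completion."""

    def apply(
        self,
        resps: Iterable[Sequence[str]],
        docs: Sequence[dict[str, Any]],
    ) -> Iterable[Sequence[str]]:
        # Outer iteration is per doc, the inner list is per repeat.
        # Returning a map keeps the result lazy so filters can be chained.
        return map(lambda doc_resps: [r.strip() for r in doc_resps], resps)


# Two docs: the first has two repeats, the second has two repeats.
resps = [["  Paris \n", "Paris"], [" 4", "four  "]]
docs = [{"question": "Capital of France?"}, {"question": "2 + 2?"}]

# list() materialises the lazy map; doc order is preserved.
filtered = list(StripWhitespaceFilter().apply(resps, docs))
```

The filter ignores docs here, but the parameter stays in the signature so that filters which do need the source document can be chained with ones that do not.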

Functions

apply

apply(resps: Iterable[Sequence[_T]], docs: Sequence[dict[str, Any]]) -> Iterable[Sequence[_T]]

Transform model responses.

PARAMETER DESCRIPTION
resps

Per-doc response sequences. Outer Iterable iterates over docs; inner Sequence holds repeats.

TYPE: Iterable[Sequence[_T]]

docs

The source document for each entry (parallel to resps).

TYPE: Sequence[dict[str, Any]]

RETURNS DESCRIPTION
Iterable[Sequence[_T]]

Transformed responses in the same doc order. May be lazy (map) to allow generator chaining between filters.

Source code in lm_eval/api/filter.py
def apply(
    self,
    resps: Iterable[Sequence[_T]],
    docs: Sequence[dict[str, Any]],
) -> Iterable[Sequence[_T]]:
    """Transform model responses.

    Args:
        resps: Per-doc response sequences.  Outer ``Iterable``
            iterates over docs; inner ``Sequence`` holds repeats.
        docs: The source document for each entry (parallel to *resps*).

    Returns:
        Transformed responses **in the same doc order**.  May be
        lazy (``map``) to allow generator chaining between filters.
    """
    ...

FilterEnsemble dataclass

FilterEnsemble(name: str, filters: list[Callable[[], Filter]])

A named chain of Filter steps applied sequentially.

Each Scorer owns one FilterEnsemble. When applied, the ensemble extracts (resps, doc) pairs from every Instance, threads them through each filter in order (each filter's output feeds the next filter's input), and stores the final result in Instance.filtered_resps[self.name].

Filters in the chain may return lazy iterables (e.g. map); materialisation is deferred until the final zip writes results back.

Attributes

name instance-attribute

name: str

filters instance-attribute

filters: list[Callable[[], Filter]]

Functions

apply

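apply(instances: Sequence[Instance]) -> None

Apply every filter in order to the instances' responses and store the result in each Instance.filtered_resps[self.name].

Source code in lm_eval/api/filter.py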
def apply(self, instances: Sequence[Instance]) -> None:
    resps, docs = zip(*((inst.resps, inst.doc) for inst in instances), strict=True)
    # resps and docs are parallel tuples here, one entry per instance.

    for f in self.filters:
        # Each entry is a zero-argument factory: build the filter, then apply it
        # to the (possibly lazy) output of the previous step.
        resps = f().apply(resps, docs)

    # Store the final results in filtered_resps of their respective source instances,
    # keyed by `self.name`: each FilterEnsemble applied in a given run should use a unique name.
    for inst, resp in zip(instances, resps, strict=True):
        inst.filtered_resps[self.name] = resp

Built-in Filters

filters

Attributes

__all__ module-attribute

__all__ = ['build_filter_ensemble', 'custom', 'extraction', 'selection', 'transformation']

Classes

Functions

build_filter_ensemble

build_filter_ensemble(filter_name: str, components: list[tuple[str, dict[str, str | int | float] | None]]) -> FilterEnsemble

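Create a filtering pipeline: a FilterEnsemble named filter_name whose steps come from components, a list of (registry name, kwargs) pairs. Each filter class is looked up in the registry and wrapped in a partial, so it is instantiated only when the ensemble is applied. A kwargs value of None is treated as no arguments.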

Source code in lm_eval/filters/__init__.py
def build_filter_ensemble(
    filter_name: str,
    components: list[tuple[str, dict[str, str | int | float] | None]],
) -> FilterEnsemble:
    """Create a filtering pipeline."""
    # look up each filter by name in the registry and add it as a pipeline step
    return FilterEnsemble(
        name=filter_name,
        filters=[
            partial(get_filter(func), **(kwargs or {})) for func, kwargs in components
        ],
    )
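
The partial-based construction above can be illustrated with a small self-contained sketch. REGISTRY, get_filter_sketch, Lowercase, and TakeFirstK are hypothetical stand-ins for the library's registry and filters, whose actual names and arguments are not shown in this section. The sketch only demonstrates how (name, kwargs) pairs become zero-argument factories.

```python
from functools import partial


class Lowercase:
    """Hypothetical filter that takes no arguments."""

    def apply(self, resps, docs):
        return map(lambda rs: [r.lower() for r in rs], resps)


class TakeFirstK:
    """Hypothetical filter that keeps the first k repeats of each doc."""

    def __init__(self, k: int = 1) -> None:
        self.k = k

    def apply(self, resps, docs):
        return map(lambda rs: list(rs[: self.k]), resps)


# Stand-in registry; the real get_filter lookup is not reproduced here.
REGISTRY = {"lowercase": Lowercase, "take_first_k": TakeFirstK}


def get_filter_sketch(name: str):
    return REGISTRY[name]


# None means "no keyword arguments", handled by `kwargs or {}`.
components = [("lowercase", None), ("take_first_k", {"k": 2})]
factories = [
    partial(get_filter_sketch(name), **(kwargs or {})) for name, kwargs in components
]

# Calling a factory with no arguments builds a configured filter instance.
resps = [["A", "B", "C"], ["D"]]
docs = [{}, {}]
for make_filter in factories:
    resps = make_filter().apply(resps, docs)
result = list(resps)
```

Storing factories instead of instances means a fresh filter object is built on every application, so no state carries over between runs.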