LM Base Class¶

Abstract base class for language models. Subclass this to add a new model backend to the evaluation harness.

LM ¶

LM()

Bases: ABC



Abstract base class for language models.

Subclasses take text (strings) as input and yield strings as output. Inputs and outputs should be tokenization-agnostic.
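
As a rough illustration of what a new backend has to provide, the sketch below implements the three abstract methods on a toy model. To keep it self-contained, `LM` and `Request` here are small stand-ins written for this example; in real use the base class would be imported from `lm_eval.api.model`, and requests would be the harness's own `Instance` objects. `ConstantLM` and its scoring rule are purely hypothetical.

```python
import abc
from dataclasses import dataclass


@dataclass
class Request:
    """Stand-in for the harness's Instance; only ``.args`` is used here."""
    args: tuple


class LM(abc.ABC):
    """Stand-in mirroring the three abstract methods of lm_eval.api.model.LM."""

    @abc.abstractmethod
    def loglikelihood(self, requests): ...

    @abc.abstractmethod
    def loglikelihood_rolling(self, requests): ...

    @abc.abstractmethod
    def generate_until(self, requests): ...


class ConstantLM(LM):
    """Toy backend: every character costs -1.0 log-probability."""

    def loglikelihood(self, requests):
        # Each request carries (context, continuation); only the continuation is scored.
        return [(-1.0 * len(cont), False) for _ctx, cont in (r.args for r in requests)]

    def loglikelihood_rolling(self, requests):
        # Each request carries ("", text); the whole text is scored.
        return [(-1.0 * len(text), False) for _empty, text in (r.args for r in requests)]

    def generate_until(self, requests):
        # One generated string per request, in request order.
        return ["ok" for _ in requests]
```

Instantiating the stand-in `LM` directly raises `TypeError`, which is the usual `abc` behaviour for a class with unimplemented abstract methods.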

Source code in lm_eval/api/model.py

def __init__(self) -> None:
    # set rank and world size to a single process, by default.
    self._rank = 0
    self._world_size = 1
    self._device = None
    self.cache_hook: CacheHook = CacheHook(None)

Attributes¶

cache_hook `instance-attribute` ¶

cache_hook: CacheHook = CacheHook(None)

device `property` ¶

device

rank `property` ¶

rank: int

Index of this process. Default: 0 (single-process).

world_size `property` ¶

world_size: int

Total number of processes. Default: 1 (single-process).

tokenizer_name `property` ¶

tokenizer_name: str

Name of the tokenizer or chat template, used to fingerprint request caches.

Required for subclasses that support chat templating.

Functions¶

loglikelihood `abstractmethod` ¶

loglikelihood(requests: Sequence[LLInstance]) -> list[LLOutput]

Compute log-likelihood of generating a continuation from a context.

Downstream tasks should prefer this over other LM calls whenever possible.

PARAMETER	DESCRIPTION
`requests`	List of `Instance` objects. Each `Instance.args` is a `(context, continuation)` tuple. context — the conditioning text (implementations must handle empty string). continuation — the text to score. Word-boundary spaces belong in the continuation (e.g. `context="hello" continuation=" world"`). TYPE: `Sequence[LLInstance]`

RETURNS	DESCRIPTION
`list[LLOutput]`	A list of `(logprob, is_greedy)` tuples: the summed log-probability of the continuation, and whether it would be produced by greedy decoding.
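
To make the return value concrete, here is a minimal sketch of how a `(logprob, is_greedy)` pair could be derived once a backend has per-position log-probabilities. The helper name `score_continuation` and the toy distributions are assumptions for illustration, not part of the harness.

```python
import math


def score_continuation(step_logprobs, continuation_ids):
    """Sum the log-probabilities of the continuation tokens.

    step_logprobs[i] maps token id -> log-probability at continuation position i.
    is_greedy is True only if every continuation token is the argmax at its position.
    """
    total = 0.0
    is_greedy = True
    for dist, token_id in zip(step_logprobs, continuation_ids):
        total += dist[token_id]
        if max(dist, key=dist.get) != token_id:
            is_greedy = False
    return total, is_greedy


# Toy distributions over token ids 7, 8, 9 for a two-token continuation.
steps = [
    {7: math.log(0.5), 8: math.log(0.25), 9: math.log(0.25)},
    {7: math.log(0.1), 8: math.log(0.9)},
]
logprob, is_greedy = score_continuation(steps, [7, 8])
```

Here `[7, 8]` follows the most likely token at both positions, so `is_greedy` is `True`; scoring `[8, 8]` instead would give a lower sum and `is_greedy=False`.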

Source code in lm_eval/api/model.py

@abc.abstractmethod
def loglikelihood(self, requests: Sequence[LLInstance]) -> list[LLOutput]:
    """Compute log-likelihood of generating a continuation from a context.

    Downstream tasks should prefer this over other LM calls whenever possible.

    Args:
        requests: List of ``Instance`` objects. Each ``Instance.args`` is a ``(context, continuation)`` tuple.
            *context* — the conditioning text (implementations must handle empty string).
            *continuation* — the text to score. Word-boundary spaces belong in the
            continuation (e.g. ``context="hello"  continuation=" world"``).

    Returns:
        A list of ``(logprob, is_greedy)`` tuples — (summed log-probability of
        the continuation, whether it would be produced by greedy decoding).
    """
    ...

loglikelihood_rolling `abstractmethod` ¶

loglikelihood_rolling(requests: Sequence[LLInstance]) -> list[LLOutput]

Compute full log-likelihood of a string, with no truncation, for perplexity computation.

- Uses the full max context length of the model.
- Inputs exceeding that length are split into chunks of at most the max context length.
- IMPORTANT: Each document's log-likelihood/perplexity is computed separately, unlike other implementations, which may simply concatenate multiple documents.
- IMPORTANT: The amount of context for each prediction is maximized. Specifically, for inputs broken into multiple chunks, the last input will still have a full-sized context.

Example

Input tokens: [ 0 1 2 3 4 5 6 7 8 9 ]
Prefix: BOS/EOS
Max context length: 4
Resulting input/prediction pairs:

    INPUT:  BOS   0   1   2
    PRED:     0   1   2   3

    INPUT:    3   4   5   6
    PRED:     4   5   6   7

    INPUT:    5   6   7   8
    PRED:             8   9

Observe that:
  1. Each token is predicted exactly once
  2. For the last pair, we provide the full context, but only score the last two tokens
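
The windowing in the example above can be reproduced with a few lines of Python. This is a sketch written for this page, not the harness's own helper; the function name `rolling_windows` is hypothetical, and it assumes the document is already tokenized.

```python
def rolling_windows(tokens, prefix_token, max_len):
    """Return (model_input, scored_targets) pairs for a single document."""
    if not tokens:
        return []
    # First window: the prefix token conditions the first prediction.
    first = min(max_len, len(tokens))
    pairs = [([prefix_token] + tokens[: first - 1], tokens[:first])]
    done = first  # number of tokens scored so far
    while done < len(tokens):
        end = min(done + max_len, len(tokens))
        # Always feed max_len tokens, ending one position before the last target,
        # so even a short final chunk is predicted from a full-sized context.
        model_input = tokens[end - 1 - max_len : end - 1]
        pairs.append((model_input, tokens[done:end]))
        done = end
    return pairs


pairs = rolling_windows(list(range(10)), "BOS", 4)
# pairs[2] == ([5, 6, 7, 8], [8, 9]): full context, only the last two tokens scored
```

Concatenating the scored targets of all pairs yields the original token list, which is the "each token is predicted exactly once" property.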

PARAMETER	DESCRIPTION
`requests`	List of `Instance` objects. Each `Instance.args` is a `(Literal[""], string)` tuple containing the text whose overall log-likelihood is computed. Context is always an empty string to keep the interface consistent with `loglikelihood`. TYPE: `Sequence[LLInstance]`

RETURNS	DESCRIPTION
`list[LLOutput]`	A list of `(logprob, Literal[False])` tuples: the log-probability of the string conditioned on the BOS/EOS token (or `prefix_token_id`). The second element is always False since this method does not compute greedy likelihood.

Source code in lm_eval/api/model.py

@abc.abstractmethod
def loglikelihood_rolling(self, requests: Sequence[LLInstance]) -> list[LLOutput]:
    """Compute full log-likelihood of a string, with no truncation, for perplexity computation.

    - Uses the full max context length of the model.
    - Inputs exceeding that length are chunked, up to the max context length.
    - IMPORTANT: Each document's loglikelihood/perplexity is computed *separately*, unlike other implementations
      which may simply concatenate multiple documents together.
    - IMPORTANT: We maximize the amount of context for each prediction. Specifically, for inputs that we break into
      multiple chunks, the last input will still have a full-sized context.

    Example:
        ```text
        Input tokens: [ 0 1 2 3 4 5 6 7 8 9 ]
        Prefix: BOS/EOS
        Max context length: 4
        Resulting input/prediction pairs:

            INPUT:  BOS   0   1   2
            PRED:     0   1   2   3

            INPUT:    3   4   5   6
            PRED:     4   5   6   7

            INPUT:    5   6   7   8
            PRED:             8   9

        Observe that:
          1. Each token is predicted exactly once
          2. For the last pair, we provide the full context, but only score the last two tokens
        ```

    Args:
        requests: List of ``Instance`` objects. Each ``Instance.args`` is a ``(Literal[""], string)`` tuple containing
            the text whose overall log-likelihood is computed.
            Context is always an empty string to keep the interface consistent with ``loglikelihood``.

    Returns:
        A list of ``(logprob, Literal[False])`` tuples — the log-probability of the string
        conditioned on the BOS/EOS token (or ``prefix_token_id``).
        The second element is always False since this method does not compute greedy likelihood.
    """
    ...

generate_until `abstractmethod` ¶

generate_until(requests: Sequence[GenInstance]) -> list[str]

Generate greedily until a stopping sequence.

PARAMETER	DESCRIPTION
`requests`	List of `Instance` objects. Each `Instance.args` is a `(context, gen_kwargs)` tuple. context: str, the conditioning text. gen_kwargs: dict, generation keyword arguments (e.g. `temperature`, `until`). TYPE: `Sequence[GenInstance]`

RETURNS	DESCRIPTION
`list[str]`	A list of generated continuation strings, one per request.
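
A backend typically has to cut its raw output at the first stop sequence listed under `until`. The sketch below shows one way to do that on plain strings; `truncate_at_stop` is a hypothetical helper, and it assumes the stop string itself is dropped from the returned text.

```python
def truncate_at_stop(text, until):
    """Cut ``text`` at the earliest occurrence of any stop sequence in ``until``."""
    cut = len(text)
    for stop in until:
        if not stop:
            continue  # an empty stop string would match at position 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


gen_kwargs = {"until": ["\n\n", "Q:"], "temperature": 0.0}
answer = truncate_at_stop("Paris.\n\nQ: What is next?", gen_kwargs["until"])
```

Taking the earliest match across all stop sequences matters when several of them appear in the same output.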

Source code in lm_eval/api/model.py

@abc.abstractmethod
def generate_until(self, requests: Sequence[GenInstance]) -> list[str]:
    """Generate greedily until a stopping sequence.

    Args:
        requests: List of ``Instance`` objects. Each ``Instance.args`` is a ``(context, gen_kwargs)`` tuple.
            *context*: str, the conditioning text.
            *gen_kwargs*: dict, generation keyword arguments (e.g. ``temperature``, ``until``).

    Returns:
        A list of generated continuation strings, one per request.
    """
    ...

apply_chat_template ¶

apply_chat_template(chat_history: Sequence[dict[str, str]], add_generation_prompt=True) -> str | list[dict[str, str]]

Transform few-shot chat history into a string prompt for the model.

PARAMETER	DESCRIPTION
`chat_history`	Messages as `[{"role": ..., "content": ...}, ...]` dicts. TYPE: `Sequence[dict[str, str]]`
`add_generation_prompt`	Whether to append an assistant generation prefix (e.g. `<|assistant|>`). Set to False when prefilling an assistant message. DEFAULT: `True`

RETURNS	DESCRIPTION
`str | list[dict[str, str]]`	The formatted prompt string, or a list of message dicts if the model handles templating internally.
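
For a backend that formats prompts itself, an override might look roughly like the function below. The `<|role|>` tag layout is an invented format used only for illustration; a real model would use whatever template it was trained with.

```python
def render_chat(chat_history, add_generation_prompt=True):
    """Render [{"role": ..., "content": ...}, ...] into a single prompt string."""
    parts = [f"<|{message['role']}|>\n{message['content']}\n" for message in chat_history]
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|assistant|>\n")
    return "".join(parts)


history = [
    {"role": "system", "content": "Answer briefly."},
    {"role": "user", "content": "What is 2 + 2?"},
]
prompt = render_chat(history)
```

With `add_generation_prompt=False` the trailing assistant tag is omitted, which is what a caller wants when the history already ends with a partially written assistant message.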

Source code in lm_eval/api/model.py

def apply_chat_template(
    self, chat_history: Sequence[dict[str, str]], add_generation_prompt=True
) -> str | list[dict[str, str]]:
    """Transform few-shot chat history into a string prompt for the model.

    Args:
        chat_history: Messages as ``[{"role": ..., "content": ...}, ...]`` dicts.
        add_generation_prompt: Whether to append an assistant generation prefix
            (e.g. ``<|assistant|>``). Set to False when prefilling an assistant message.

    Returns:
        The formatted prompt string, or a list of message dicts if the model handles templating internally.
    """
    raise NotImplementedError(
        "To use this model with chat templates, please implement the 'apply_chat_template' method for your model type."
    )

create_from_arg_string `classmethod` ¶

create_from_arg_string(arg_string: str, additional_config: dict | None = None) -> Self

Create an LM instance from a comma-separated argument string.

PARAMETER	DESCRIPTION
`arg_string`	Arguments as `"key1=value1,key2=value2"`. TYPE: `str`
`additional_config`	Extra configuration merged into the parsed args. TYPE: `dict | None` DEFAULT: `None`

RETURNS	DESCRIPTION
`Self`	An instance of this LM subclass.
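
The flow from argument string to constructor call can be sketched as follows. `parse_arg_string` is a simplified stand-in for the harness's parser that keeps every value as a string (the real parser may behave differently, for example by converting types), and `ToyModel` is a hypothetical class used only to receive the keyword arguments.

```python
def parse_arg_string(arg_string):
    """Stand-in parser: 'k1=v1,k2=v2' -> {'k1': 'v1', 'k2': 'v2'} (values stay strings)."""
    pairs = (item.split("=", 1) for item in arg_string.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


class ToyModel:
    def __init__(self, pretrained, batch_size="1", device=None):
        self.pretrained = pretrained
        self.batch_size = batch_size
        self.device = device


def create(arg_string, additional_config=None):
    additional_config = {} if additional_config is None else additional_config
    args = parse_arg_string(arg_string)
    # Entries whose value is None are dropped before being passed on.
    extra = {k: v for k, v in additional_config.items() if v is not None}
    return ToyModel(**args, **extra)


model = create("pretrained=my-model,batch_size=8", {"device": "cpu", "max_length": None})
```

Because both dicts are unpacked into a single call, a key that appears in the argument string and in `additional_config` raises `TypeError` rather than silently overriding.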

Source code in lm_eval/api/model.py

@classmethod
def create_from_arg_string(
    cls, arg_string: str, additional_config: dict | None = None
) -> Self:
    """Create an LM instance from a comma-separated argument string.

    Args:
        arg_string: Arguments as ``"key1=value1,key2=value2"``.
        additional_config: Extra configuration merged into the parsed args.

    Returns:
        An instance of this LM subclass.
    """
    additional_config = {} if additional_config is None else additional_config
    args = utils.simple_parse_args_string(arg_string)
    args2 = {k: v for k, v in additional_config.items() if v is not None}
    return cls(**args, **args2)

create_from_arg_obj `classmethod` ¶

create_from_arg_obj(arg_dict: dict[str, Any], additional_config: dict[str, Any] | None = None) -> Self

Create an LM instance from a dictionary of arguments.

PARAMETER	DESCRIPTION
`arg_dict`	Keyword arguments forwarded to the constructor. TYPE: `dict[str, Any]`
`additional_config`	Extra configuration merged into `arg_dict`. TYPE: `dict[str, Any] | None` DEFAULT: `None`

RETURNS	DESCRIPTION
`Self`	An instance of this LM subclass.

Source code in lm_eval/api/model.py

@classmethod
def create_from_arg_obj(
    cls,
    arg_dict: dict[str, Any],
    additional_config: dict[str, Any] | None = None,
) -> Self:
    """Create an LM instance from a dictionary of arguments.

    Args:
        arg_dict: Keyword arguments forwarded to the constructor.
        additional_config: Extra configuration merged into ``arg_dict``.

    Returns:
        An instance of this LM subclass.
    """
    additional_config = (
        {}
        if additional_config is None
        else {k: v for k, v in additional_config.items() if v is not None}
    )

    return cls(**arg_dict, **additional_config)

all_gather ¶

all_gather(tensor)

All-gather a tensor across ranks.

Returns concatenated tensor from all ranks. Default: no-op.

Source code in lm_eval/api/model.py

def all_gather(self, tensor):
    """All-gather a tensor across ranks.

    Returns concatenated tensor from all ranks. Default: no-op.
    """
    return tensor

barrier ¶

barrier() -> None

Synchronization barrier. Default: no-op.

Source code in lm_eval/api/model.py

def barrier(self) -> None:
    """Synchronization barrier. Default: no-op."""
    return

chat_template ¶

chat_template(chat_template: bool | str = False) -> str | None

Return the chat template string for this model.

Override in subclasses to define a specific format. Returns empty string by default (no chat template).

Source code in lm_eval/api/model.py

def chat_template(self, chat_template: bool | str = False) -> str | None:
    """Return the chat template string for this model.

    Override in subclasses to define a specific format. Returns empty string
    by default (no chat template).
    """
    return ""

set_cache_hook ¶

set_cache_hook(cache_hook: CacheHook) -> None

Source code in lm_eval/api/model.py

def set_cache_hook(self, cache_hook: CacheHook) -> None:
    self.cache_hook = cache_hook

TemplateLM provides common tokenization and chat template logic. Most built-in backends extend this rather than LM directly.

TemplateLM ¶

TemplateLM()

Bases: LM



LM subclass that provides shared tokenization and scoring boilerplate.

Handles context/continuation encoding, empty-context logic, and delegates token-level scoring to _loglikelihood_tokens.

Source code in lm_eval/api/model.py

def __init__(self) -> None:
    # set rank and world size to a single process, by default.
    self._rank = 0
    self._world_size = 1
    self._device = None
    self.cache_hook: CacheHook = CacheHook(None)

Attributes¶

tokenizer `class-attribute` `instance-attribute` ¶

tokenizer = None

backend `class-attribute` `instance-attribute` ¶

backend = 'causal'

eot_token_id `abstractmethod` `property` ¶

eot_token_id: int

prefix_token_id `property` ¶

prefix_token_id

Functions¶

tok_encode `abstractmethod` ¶

tok_encode(string: str, add_special_tokens: bool | None = None, **kwargs) -> list[int]

Tokenize a string and return a list of token IDs.

Must handle strings that already contain the BOS token when add_special_tokens is None. Otherwise, uses the flag as given.
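
One possible reading of this contract is sketched below with a toy whitespace tokenizer. Everything here is an assumption for illustration: the literal `<s>` BOS marker, the growing `VOCAB` dict, and the choice to prepend BOS by default when the flag is `None`.

```python
BOS = "<s>"
BOS_ID = 0
VOCAB = {BOS: BOS_ID}  # toy vocabulary that grows as new words are seen


def tok_encode(string, add_special_tokens=None):
    """Toy encoder: whitespace-split words, with optional BOS handling."""
    starts_with_bos = string.startswith(BOS)
    if add_special_tokens is None:
        # Auto mode: avoid a doubled BOS when the text already carries one.
        add_special_tokens = not starts_with_bos
    ids = []
    if add_special_tokens:
        ids.append(BOS_ID)
    if starts_with_bos:
        # The literal BOS in the text is still encoded as the BOS token.
        ids.append(BOS_ID)
        string = string[len(BOS):]
    ids.extend(VOCAB.setdefault(word, len(VOCAB)) for word in string.split())
    return ids
```

In auto mode, `"<s>hello world"` and `"hello world"` encode to the same ids, while an explicit `add_special_tokens=True` is applied as given even if that doubles the BOS token.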

Source code in lm_eval/api/model.py

@abc.abstractmethod
def tok_encode(
    self, string: str, add_special_tokens: bool | None = None, **kwargs
) -> list[int]:
    """Tokenize a string and return a list of token IDs.

    Must handle strings that already contain the BOS token when
    ``add_special_tokens`` is None. Otherwise, uses the flag as given.
    """
    ...

loglikelihood ¶

loglikelihood(requests: Sequence[LLInstance], disable_tqdm: bool = False) -> list[LLOutput]

Compute log-likelihood of continuations given contexts.

Tokenizes each (context, continuation) pair and delegates to _loglikelihood_tokens. Empty contexts use prefix_token_id (typically BOS/EOS) as the conditioning token.

PARAMETER	DESCRIPTION
`requests`	List of `Instance` objects. Each `Instance.args` is a `(context, continuation)` tuple. TYPE: `Sequence[LLInstance]`
`disable_tqdm`	Whether to suppress the progress bar. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`list[LLOutput]`	A list of `(logprob, is_greedy)` tuples, one per request.
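
The empty-context rule is easiest to see on raw token ids. The helper below mirrors that branch in isolation; the name `split_empty_context` is made up for this sketch, and it assumes the encoded continuation is non-empty.

```python
def split_empty_context(continuation_enc, prefix_token_id):
    """Choose (context_enc, continuation_enc) when the context string is empty."""
    if continuation_enc[0] != prefix_token_id:
        # Usual case: condition on the prefix token (typically BOS/EOS).
        return [prefix_token_id], continuation_enc
    # The continuation already starts with the prefix token: reuse it as context.
    return continuation_enc[:1], continuation_enc[1:]


PREFIX = 1
plain = split_empty_context([5, 6, 7], PREFIX)
already_prefixed = split_empty_context([1, 5, 6, 7], PREFIX)
```

Both calls produce the same split, so the prefix token is never scored as part of the continuation and never appears twice.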

Source code in lm_eval/api/model.py

def loglikelihood(
    self, requests: Sequence[LLInstance], disable_tqdm: bool = False
) -> list[LLOutput]:
    """Compute log-likelihood of continuations given contexts.

    Tokenizes each ``(context, continuation)`` pair and delegates to
    ``_loglikelihood_tokens``. Empty contexts use ``prefix_token_id``
    (typically BOS/EOS) as the conditioning token.

    Args:
        requests: List of ``Instance`` objects. Each ``Instance.args`` is a ``(context, continuation)`` tuple.
        disable_tqdm: Whether to suppress the progress bar.

    Returns:
        A list of ``(logprob, is_greedy)`` tuples, one per request.
    """
    new_reqs = []
    for context, continuation in [req.args for req in requests]:
        if context == "":
            continuation_enc = self.tok_encode(
                continuation, add_special_tokens=False
            )
            # BOS or EOS as context: handle when context is empty -> (context + continuation) -> (BOS + continuation)
            context_enc, continuation_enc = (
                ([self.prefix_token_id], continuation_enc)
                if self.prefix_token_id != continuation_enc[0]
                else (continuation_enc[:1], continuation_enc[1:])
            )
            # BOS or EOS as context
        else:
            context_enc, continuation_enc = self._encode_pair(context, continuation)

        new_reqs.append(((context, continuation), context_enc, continuation_enc))

    return self._loglikelihood_tokens(new_reqs, disable_tqdm=disable_tqdm)

loglikelihood_rolling `abstractmethod` ¶

loglikelihood_rolling(requests: Sequence[LLInstance], disable_tqdm: bool = False) -> list[LLOutput]

Source code in lm_eval/api/model.py

@abc.abstractmethod
def loglikelihood_rolling(
    self, requests: Sequence[LLInstance], disable_tqdm: bool = False
) -> list[LLOutput]: ...

generate_until `abstractmethod` ¶

generate_until(requests: Sequence[GenInstance], disable_tqdm: bool = False) -> list[str]

Source code in lm_eval/api/model.py

@abc.abstractmethod
def generate_until(
    self, requests: Sequence[GenInstance], disable_tqdm: bool = False
) -> list[str]: ...

chat_template ¶

chat_template(chat_template: bool | str = False) -> str | None

Select and return the appropriate chat template for this model.

Resolution order (adapted from Transformers apply_chat_template):

* No tokenizer: returns the empty string (template handled by provider).
* Tokenizer has a dict of templates: use the named or "default" entry.
* Tokenizer has a single template: use it, falling back to default_chat_template if unset.

PARAMETER	DESCRIPTION
`chat_template`	`False`/`None` to disable, `True` to auto-select, or a string name to pick a specific template from a dict. TYPE: `bool | str` DEFAULT: `False`

RETURNS	DESCRIPTION
`str | None`	The selected template string, or `None` if disabled.
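
The resolution order can be condensed into a short function for illustration. This is a simplified sketch, not the method itself: it omits the `default_chat_template` fallback and the logging, and it uses `SimpleNamespace` objects as stand-in tokenizers.

```python
from types import SimpleNamespace


def resolve_template(tokenizer, chat_template=False):
    """Simplified template selection following the resolution order above."""
    if tokenizer is None:
        return ""  # templating is left to the provider
    if chat_template is False or chat_template is None:
        return None  # chat templating disabled
    name = chat_template if isinstance(chat_template, str) else None
    template = tokenizer.chat_template
    if isinstance(template, dict):
        key = name if name is not None else "default"
        if key not in template:
            raise ValueError(f"Unknown template {key!r}; available: {sorted(template)}")
        return template[key]
    # Single template: a requested name is ignored.
    return template


single = SimpleNamespace(chat_template="{{ messages }}")
multi = SimpleNamespace(chat_template={"default": "D", "tool_use": "U"})
```

For example, `resolve_template(multi, True)` picks the `"default"` entry, while `resolve_template(multi, "tool_use")` picks the named one.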

Source code in lm_eval/api/model.py

def chat_template(self, chat_template: bool | str = False) -> str | None:
    """Select and return the appropriate chat template for this model.

    Resolution order (adapted from Transformers ``apply_chat_template``):

    * No tokenizer — returns the empty string (template handled by provider).
    * Tokenizer has a dict of templates — use the named or ``"default"`` entry.
    * Tokenizer has a single template — use it, falling back to
      ``default_chat_template`` if unset.

    Args:
        chat_template: ``False``/``None`` to disable, ``True`` to auto-select,
            or a string name to pick a specific template from a dict.

    Returns:
        The selected template string, or ``None`` if disabled.
    """
    if self.tokenizer is None:
        return ""

    if chat_template is False or chat_template is None:
        eval_logger.warning(
            "model.chat_template was called with the chat_template set to False or None. "
            "Therefore no chat template will be applied. Make sure this is an intended behavior."
        )
        return None

    # Convert boolean chat_template to None to ensure compatibility with the adapted logic
    if isinstance(chat_template, bool):
        chat_template = None
    using_default_template = False

    # First, handle the cases when the model has a dict of multiple templates
    try:
        template = (
            self.tokenizer.chat_template or self.tokenizer.default_chat_template
        )
    except AttributeError:
        return None

    if isinstance(template, dict):
        using_default_dict = self.tokenizer.chat_template is None

        if chat_template is not None:
            if chat_template in template:
                selected_template = template[chat_template]
                if using_default_dict:
                    using_default_template = True
            else:
                raise ValueError(
                    f"The specified chat template '{chat_template}' is not available. "
                    f"Available template names are {sorted(template.keys())}."
                )
        else:
            # If user didn't pass a chat template, use the default template from the dict
            if "default" in template:
                selected_template = template["default"]
                using_default_template = True
            else:
                raise ValueError(
                    "This model has multiple chat templates with no default specified! Please either pass a chat "
                    "template or the name of the template you wish to use to the `chat_template` argument. Available "
                    f"template names are {sorted(template.keys())}."
                )

    # Cases when the model has a single template or no template
    else:
        # priority: `chat_template` argument > `tokenizer.chat_template` > `tokenizer.default_chat_template`
        if isinstance(chat_template, str):
            eval_logger.warning(
                "Chat template name provided, but the tokenizer's chat template is not a dictionary. "
                "Using the tokenizer's chat template or the default template instead."
            )
        if self.tokenizer.chat_template is not None:
            selected_template = self.tokenizer.chat_template
        else:
            selected_template = self.tokenizer.default_chat_template
            using_default_template = True

    if using_default_template:
        eval_logger.warning(
            "No chat template is set for this tokenizer, falling back to a default class-level template. This is "
            "very error-prone, because models are often trained with templates different from the class default! "
            "Default chat templates are a legacy feature and will be removed in Transformers v4.43, at which "
            "point any code depending on them will stop working. We recommend setting a valid chat template before "
            "then to ensure that this model continues working without issues."
        )

    return selected_template

CachingLM wraps any LM instance to add response caching.

CachingLM ¶

CachingLM(lm: LM, cache_db: str)

LM wrapper that returns cached results when available, falling back to the underlying model.

PARAMETER	DESCRIPTION
`lm`	The underlying language model to wrap. TYPE: `LM`
`cache_db`	Path to the SQLite cache database. TYPE: `str`
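
The caching behaviour can be approximated with an in-memory dict in place of the SQLite database. In this sketch, `hash_request` is an assumed hashing scheme (JSON plus SHA-256) and `ToyCache` is a hypothetical class; neither is the harness's own code, and the sampling bypass for `generate_until` is left out.

```python
import hashlib
import json


def hash_request(method, args):
    """Assumed cache key: SHA-256 of the method name plus JSON-encoded arguments."""
    payload = json.dumps([method, *args], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyCache:
    """Wrap a batch function and answer repeated requests from a dict."""

    def __init__(self, fn, method="loglikelihood"):
        self.fn = fn
        self.method = method
        self.store = {}

    def __call__(self, requests):
        results = [None] * len(requests)
        missing = []  # positions that still need a model call
        for i, args in enumerate(requests):
            key = hash_request(self.method, args)
            if key in self.store:
                results[i] = self.store[key]
            else:
                missing.append(i)
        if missing:
            fresh = self.fn([requests[i] for i in missing])
            for i, value in zip(missing, fresh):
                results[i] = value
                self.store[hash_request(self.method, requests[i])] = value
        return results
```

Only the uncached requests reach the wrapped function, and results are returned in the original request order regardless of which ones came from the cache.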

Source code in lm_eval/api/model.py

def __init__(self, lm: LM, cache_db: str) -> None:
    """LM wrapper that returns cached results when available, falling back to the underlying model.

    Args:
        lm: The underlying language model to wrap.
        cache_db: Path to the SQLite cache database.
    """
    from sqlitedict import SqliteDict

    self.lm: LM = lm
    self.cache_db: str = cache_db
    if os.path.dirname(cache_db):
        os.makedirs(os.path.dirname(cache_db), exist_ok=True)
    self.dbdict = SqliteDict(cache_db, autocommit=True)

    # add hook to lm
    lm.set_cache_hook(self.get_cache_hook())

Attributes¶

lm `instance-attribute` ¶

lm: LM = lm

cache_db `instance-attribute` ¶

cache_db: str = cache_db

dbdict `instance-attribute` ¶

dbdict = SqliteDict(cache_db, autocommit=True)

Functions¶

__getattr__ ¶

__getattr__(attr: str) -> Any

Source code in lm_eval/api/model.py

def __getattr__(self, attr: str) -> Any:
    lm_attr = getattr(self.lm, attr)
    if attr not in ["loglikelihood", "loglikelihood_rolling", "generate_until"]:
        eval_logger.debug("Passing through attribute '%s' to underlying LM", attr)
        return lm_attr

    def _fn(requests: list[Instance]) -> list[Instance]:
        res = []
        remaining_reqs = []
        warned = False
        # figure out which ones are cached and which ones are new
        eval_logger.info(
            "Loading '%s' responses from cache '%s' where possible...",
            attr,
            self.cache_db,
        )
        for req in tqdm(requests, desc="Checking cached requests"):
            hsh = hash_args(attr, req.args)
            if attr == "generate_until" and req.args[1].get("do_sample", False):
                # when we are doing non-greedy generation, don't use the cache
                # (else every "randomly sampled" generation would be identical for repeats > 1).
                if not warned:
                    eval_logger.warning(
                        "Arguments to lm.generate_until() '%s' include non-deterministic sampling. Caching will not be performed for such requests.",
                        req.args[1],
                    )
                    warned = True
                res.append(None)
                remaining_reqs.append(req)
            elif hsh in self.dbdict:
                ob = self.dbdict[hsh]

                assert ob is not None

                res.append(ob)
            else:
                res.append(None)
                remaining_reqs.append(req)
        eval_logger.info(
            "Cached requests: %d, Requests remaining: %d",
            len(requests) - len(remaining_reqs),
            len(remaining_reqs),
        )
        # actually run the LM on the requests that do not have cached results
        rem_res = getattr(self.lm, attr)(remaining_reqs) if remaining_reqs else []

        # stick the new ones back into the list and also cache any of the new ones
        resptr = 0
        for req, r in zip(remaining_reqs, rem_res, strict=True):
            while res[resptr] is not None:
                resptr += 1

            res[resptr] = r

            # caching
            hsh = hash_args(attr, req.args)
            self.dbdict[hsh] = r
        self.dbdict.commit()

        return res

    return _fn

get_cache_hook ¶

get_cache_hook() -> CacheHook

Source code in lm_eval/api/model.py

def get_cache_hook(self) -> CacheHook:
    return CacheHook(self)
