MessageTokenLimiter

MessageTokenLimiter(
    max_tokens_per_message: int | None = None,
    max_tokens: int | None = None,
    min_tokens: int | None = None,
    model: str = 'gpt-3.5-turbo-0613',
    filter_dict: dict[str, Any] | None = None,
    exclude_filter: bool = True
)

Truncates messages to meet token limits for efficient processing and response generation.
This transformation applies two levels of truncation to the conversation history:
1. Truncates each individual message to the maximum number of tokens specified by max_tokens_per_message.
2. Truncates the overall conversation history to the maximum number of tokens specified by max_tokens.
NOTE: Tokens are counted using the encoder for the specified model. Different models may yield different token counts for the same text.
NOTE: For multimodal LLMs, the token count may be inaccurate because it does not account for non-text input (e.g., images).
The truncation process follows these steps in order:
1. The minimum tokens threshold (min_tokens, 0 by default) is checked. If the total token count across all messages is below this threshold, the messages are returned as is; otherwise, the following steps are applied.
2. Messages are processed in reverse order (newest to oldest).
3. Individual messages are truncated based on max_tokens_per_message. For multimodal messages containing both text and other types of content, only the text content is truncated.
4. The overall conversation history is truncated based on the max_tokens limit. Once the accumulated token count would exceed this limit, the current message is truncated so the total fits within the limit, and any remaining (older) messages are discarded.
5. The truncated conversation history is reconstructed by prepending the messages to a new list to preserve the original message order.
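The steps above can be sketched in a few lines. This is an illustration only: a whitespace split stands in for the model's token encoder (the real class counts tokens with the encoder for the configured model), and `truncate_tokens`/`limit_messages` are hypothetical helpers, not the class's actual internals.

```python
def truncate_tokens(text: str, limit: int) -> str:
    """Keep at most `limit` whitespace-delimited tokens of `text`."""
    return " ".join(text.split()[:limit])

def limit_messages(messages, max_tokens_per_message, max_tokens, min_tokens=0):
    # Step 1: below the min_tokens threshold, return messages unchanged.
    total_in = sum(len(m["content"].split()) for m in messages)
    if total_in < min_tokens:
        return messages
    truncated, total = [], 0
    # Step 2: walk newest to oldest.
    for msg in reversed(messages):
        # Step 3: per-message cap.
        content = truncate_tokens(msg["content"], max_tokens_per_message)
        n = len(content.split())
        remaining = max_tokens - total
        # Step 4: trim the current message to fit the overall cap,
        # then discard any remaining (older) messages.
        if n >= remaining:
            if remaining > 0:
                truncated.insert(0, {**msg, "content": truncate_tokens(content, remaining)})
            break
        total += n
        # Step 5: prepend to preserve the original message order.
        truncated.insert(0, {**msg, "content": content})
    return truncated
```

For example, with `max_tokens_per_message=3` and `max_tokens=4`, the newest message keeps its first three tokens and the older one is cut down to the single token of budget that remains.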

Parameters:

max_tokens_per_message
    Type: int | None
    Default: None

max_tokens
    Type: int | None
    Default: None

min_tokens
    Type: int | None
    Default: None

model
    Type: str
    Default: 'gpt-3.5-turbo-0613'

filter_dict
    Type: dict[str, typing.Any] | None
    Default: None

exclude_filter
    Type: bool
    Default: True
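The filter parameters are not described above. The sketch below assumes the common transform semantics, where filter_dict maps message fields to lists of allowed values and, with exclude_filter=True, matching messages are left untouched (with exclude_filter=False, only matching messages are transformed). `matches` and `should_truncate` are hypothetical names used for illustration.

```python
def matches(message: dict, filter_dict: dict) -> bool:
    """True if every key in filter_dict appears in the message with
    one of the allowed values, e.g. filter_dict={"role": ["system"]}."""
    return all(message.get(k) in v for k, v in filter_dict.items())

def should_truncate(message: dict, filter_dict, exclude_filter: bool = True) -> bool:
    """Decide whether a message is subject to truncation (assumed semantics)."""
    if filter_dict is None:
        return True  # no filter: every message is transformed
    # exclude_filter=True  -> transform messages that do NOT match
    # exclude_filter=False -> transform only messages that DO match
    return matches(message, filter_dict) != exclude_filter
```

Under this reading, the defaults (`filter_dict=None`, `exclude_filter=True`) mean every message in the history is eligible for truncation.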

Instance Methods

apply_transform

apply_transform(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]

Applies token truncation to the conversation history.

Parameters:

messages
    The list of messages representing the conversation history.
    Type: list[dict[str, typing.Any]]

Returns:

list[dict[str, typing.Any]]
    A new list containing the truncated messages, up to the specified token limits.
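For multimodal messages (step 3 of the truncation process), only the text content is truncated. A minimal sketch, assuming content is either a plain string or a list of typed parts as in the OpenAI chat format (`{"type": "text", ...}`, `{"type": "image_url", ...}`); `truncate_message_content` is a hypothetical helper, not the class's actual internals.

```python
def truncate_tokens(text: str, limit: int) -> str:
    """Stand-in tokenizer: keep at most `limit` whitespace tokens."""
    return " ".join(text.split()[:limit])

def truncate_message_content(content, limit: int):
    """Truncate string content directly; in a list of parts, truncate
    only the text parts and pass everything else through unchanged."""
    if isinstance(content, str):
        return truncate_tokens(content, limit)
    out = []
    for part in content:
        if isinstance(part, dict) and part.get("type") == "text":
            part = {**part, "text": truncate_tokens(part["text"], limit)}
        out.append(part)  # image and other non-text parts are untouched
    return out
```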

get_logs

get_logs(
    self,
    pre_transform_messages: list[dict[str, Any]],
    post_transform_messages: list[dict[str, Any]]
) -> tuple[str, bool]
Generates a log message summarizing the effect of the transformation, along with a flag indicating whether any messages were changed.

Parameters:

pre_transform_messages
    Type: list[dict[str, typing.Any]]

post_transform_messages
    Type: list[dict[str, typing.Any]]
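A sketch of what a get_logs-style summary could look like, assuming the log simply reports token counts before and after the transformation. The exact wording of the real log message is not specified here, and the whitespace token count is again a stand-in for the model's encoder.

```python
def count_tokens(messages) -> int:
    """Rough token count over all message contents (stand-in tokenizer)."""
    return sum(len(str(m.get("content", "")).split()) for m in messages)

def get_logs(pre_transform_messages, post_transform_messages):
    """Return a (log_message, had_effect) tuple, mirroring tuple[str, bool]."""
    before = count_tokens(pre_transform_messages)
    after = count_tokens(post_transform_messages)
    if before != after:
        return (f"Truncated {before - after} tokens. "
                f"Number of tokens reduced from {before} to {after}", True)
    return "No tokens were truncated.", False
```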