LangChain.ChatModels.ChatAnthropic (LangChain v0.4.0-rc.1)
Module for interacting with Anthropic models.
Parses and validates inputs for making requests to Anthropic's messages API.
Converts responses into more specialized LangChain data structures.
Callbacks
See the set of available callbacks: LangChain.Chains.ChainCallbacks
Rate Limit API Response Headers
Anthropic returns rate limit information in the response headers. Those can be accessed using an LLM callback like this:
handler = %{
  on_llm_ratelimit_info: fn _chain, headers ->
    IO.inspect(headers)
  end
}

%{llm: ChatAnthropic.new!(%{model: "..."})}
|> LLMChain.new!()
# ... add messages ...
|> LLMChain.add_callback(handler)
|> LLMChain.run()
When a response is received, something similar to the following is printed to the console.
%{
  "anthropic-ratelimit-requests-limit" => ["50"],
  "anthropic-ratelimit-requests-remaining" => ["49"],
  "anthropic-ratelimit-requests-reset" => ["2024-06-08T04:28:30Z"],
  "anthropic-ratelimit-tokens-limit" => ["50000"],
  "anthropic-ratelimit-tokens-remaining" => ["50000"],
  "anthropic-ratelimit-tokens-reset" => ["2024-06-08T04:28:30Z"],
  "request-id" => ["req_1234"]
}
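The header values arrive as strings wrapped in single-element lists, so they need a small amount of parsing before they can be compared. The following is a minimal sketch of that step; `RateLimitSketch`, `int_header/2`, and `nearly_exhausted?/2` are hypothetical names that are not part of LangChain, and the sketch assumes the headers map has the shape shown above.

```elixir
defmodule RateLimitSketch do
  # Hypothetical helper, not part of LangChain. It expects the headers map
  # in the shape shown above: string keys mapped to single-element lists.

  @doc "Returns the integer value of a header, or nil when missing or non-numeric."
  def int_header(headers, name) do
    with [value | _] <- Map.get(headers, name),
         {int, ""} <- Integer.parse(value) do
      int
    else
      _ -> nil
    end
  end

  @doc "True when the remaining request count is at or below the threshold."
  def nearly_exhausted?(headers, threshold \\ 5) do
    case int_header(headers, "anthropic-ratelimit-requests-remaining") do
      nil -> false
      remaining -> remaining <= threshold
    end
  end
end
```

A helper like this could be called from inside the `on_llm_ratelimit_info` callback to log a warning or slow down request submission.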
Token Usage
Anthropic returns token usage information as part of the response body. The LangChain.TokenUsage is added to the metadata of the LangChain.Message and LangChain.MessageDelta structs that are processed under the :usage key.
%LangChain.MessageDelta{
  content: [],
  status: :incomplete,
  index: nil,
  role: :assistant,
  tool_calls: nil,
  metadata: %{
    usage: %LangChain.TokenUsage{
      input: 55,
      output: 4,
      raw: %{
        "cache_creation_input_tokens" => 0,
        "cache_read_input_tokens" => 0,
        "input_tokens" => 55,
        "output_tokens" => 4
      }
    }
  }
}
The TokenUsage data is accumulated for MessageDelta structs and the final usage information will be on the LangChain.Message.
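Reading the usage back out is a matter of pattern matching on the metadata. The sketch below assumes the `:usage` entry exposes `:input` and `:output` fields as in the struct shown above; `UsageSketch` and `token_counts/1` are hypothetical names, not part of LangChain.

```elixir
defmodule UsageSketch do
  # Hypothetical helper, not part of LangChain. Map patterns also match
  # structs, so this accepts a LangChain.Message or a plain map.

  @doc "Returns {input, output} token counts, or nil when no usage is attached."
  def token_counts(%{metadata: %{usage: %{input: input, output: output}}}),
    do: {input, output}

  def token_counts(_message), do: nil
end
```

Returning `nil` for messages without usage keeps the helper safe to call on any message, including ones whose metadata is `nil`.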
Tool Choice
Anthropic supports forcing a tool to be used. This is configured through the tool_choice option, which takes a plain Elixir map.
By default, the LLM will choose a tool call if a tool is available and it determines one is needed. That's the "auto" mode.
Example
Force the LLM's response to make a tool call of the "get_weather" function.
ChatAnthropic.new(%{
  model: "...",
  tool_choice: %{"type" => "tool", "name" => "get_weather"}
})
AWS Bedrock Support
Anthropic Claude is supported in AWS Bedrock.
To configure ChatAnthropic for use on AWS Bedrock:
- Request Model Access to get access to the Anthropic models you intend to use.
- Using your AWS Console, create an Access Key for your application.
- Set the key values in your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY ENVs.
- Get the Model ID for the model you intend to use. See Base Models.
- Refer to LangChain.Utils.BedrockConfig for setting up the Bedrock authentication credentials for your environment.
- Set up your ChatAnthropic similar to the following:
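alias LangChain.ChatModels.ChatAnthropic
# BedrockConfig is referenced below, so it needs an alias as well.
alias LangChain.Utils.BedrockConfig

ChatAnthropic.new!(%{
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
  bedrock: BedrockConfig.from_application_env!()
})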
Thinking
Models like Claude 3.7 Sonnet introduced a hybrid approach which allows for "thinking" and reasoning. See the Anthropic thinking documentation for up-to-date instructions on the usage.
For instance, enabling thinking may require the temperature to be set to 1, and other settings like top_p may not be allowed.
The model supports a :thinking attribute where the data is a map that matches the structure in the Anthropic documentation. It is passed along as-is.
Example:
# Enable thinking and budget 2,000 tokens for the thinking space.
model = ChatAnthropic.new!(%{
  model: "claude-3-7-sonnet-latest",
  thinking: %{type: "enabled", budget_tokens: 2000}
})

# Disable thinking
model = ChatAnthropic.new!(%{
  model: "claude-3-7-sonnet-latest",
  thinking: %{type: "disabled"}
})
As of the documentation for Claude 3.7 Sonnet, the minimum budget for thinking is 1024 tokens.
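Because the :thinking map is passed along as-is, a budget below the minimum would only be rejected by the API. A local pre-flight check can catch that earlier. The following is a minimal sketch; `ThinkingSketch` and `valid?/1` are hypothetical names, not part of LangChain, and the 1024 figure is the one quoted above for Claude 3.7 Sonnet, which may differ for other models.

```elixir
defmodule ThinkingSketch do
  # Hypothetical pre-flight check, not part of LangChain. The 1024 minimum
  # is the figure quoted for Claude 3.7 Sonnet and may differ elsewhere.
  @min_budget 1024

  @doc "True when the thinking map is disabled or has a large enough budget."
  def valid?(%{type: "disabled"}), do: true

  def valid?(%{type: "enabled", budget_tokens: budget}) when is_integer(budget),
    do: budget >= @min_budget

  def valid?(_other), do: false
end
```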
Prompt Caching
Anthropic supports prompt caching to reduce costs and latency for frequently repeated content. Prompt caching works by caching large blocks of content that are likely to be reused across multiple requests.
Prompt caching is configured through the cache_control option in ContentPart options. It can be applied to system messages, regular user messages, tool results, and tool definitions.
Anthropic limits a conversation to a maximum of 4 cache_control blocks and will refuse to service requests with more.
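Since a request over the limit is refused, it can help to count the cache_control blocks before sending. The sketch below is one way to do that for message content only (it does not look at tool definitions). `CacheBudgetSketch` and its functions are hypothetical names, not part of LangChain, and the code assumes each message has a `:content` field holding a string or a list of parts, with each part keeping a keyword list under `:options`.

```elixir
defmodule CacheBudgetSketch do
  # Hypothetical helper, not part of LangChain. It assumes each message has a
  # :content field that is either a string or a list of parts, and that each
  # part keeps its options in a keyword list under :options.
  @max_cache_blocks 4

  @doc "Counts content parts whose options carry a truthy :cache_control."
  def count_cache_blocks(messages) do
    messages
    |> Enum.flat_map(fn
      %{content: parts} when is_list(parts) -> parts
      _other -> []
    end)
    |> Enum.count(fn part ->
      opts = Map.get(part, :options) || []
      Keyword.get(opts, :cache_control) not in [nil, false]
    end)
  end

  @doc "True when the messages stay within the 4-block limit."
  def within_limit?(messages), do: count_cache_blocks(messages) <= @max_cache_blocks
end
```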
Basic Usage
Setting cache_control: true is a shortcut for the default ephemeral cache control:
# System message with caching
Message.new_system!([
  ContentPart.text!("You are an AI assistant analyzing literary works."),
  ContentPart.text!("<large document content>", cache_control: true)
])

# User message with caching
Message.new_user!([
  ContentPart.text!("Please analyze this document:"),
  ContentPart.text!("<large document content>", cache_control: true)
])
Advanced Cache Control
For more explicit control over caching parameters, you can provide a map instead of true:
ContentPart.text!("content", cache_control: %{"type" => "ephemeral", "ttl" => "1h"})
When cache_control: true is used, it automatically expands to %{"type" => "ephemeral"} in the API request.
If you need specific cache control settings like TTL, provide them explicitly as a map; the exact values are preserved and sent to the API.
The default TTL is "5m" (5 minutes). A TTL of "1h" (1 hour) is also supported, depending on your account.
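The expansion rule described above can be sketched as a small function. This is illustrative only and is not the library's actual implementation; `CacheControlSketch` and `expand/1` are hypothetical names.

```elixir
defmodule CacheControlSketch do
  # Illustrative only; this mirrors the behaviour described above and is not
  # the library's actual implementation.

  # `true` is shorthand for the default ephemeral setting.
  def expand(true), do: %{"type" => "ephemeral"}

  # An explicit map is sent through unchanged, preserving values such as "ttl".
  def expand(%{} = settings), do: settings

  # Anything else (nil, false) means no cache control is attached.
  def expand(_other), do: nil
end
```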
Supported Content Types
Prompt caching can be applied to:
- Text content in system messages
- Text content in user messages
- Tool results in the content field when returning a list of ContentPart structs.
- Tool definitions in the options field when creating a Function struct.
For more information, see the Anthropic prompt caching documentation.
Types
@type t() :: %LangChain.ChatModels.ChatAnthropic{
  api_key: term(),
  api_version: term(),
  bedrock: term(),
  beta_headers: term(),
  callbacks: term(),
  endpoint: term(),
  max_tokens: term(),
  model: term(),
  receive_timeout: term(),
  stream: term(),
  temperature: term(),
  thinking: term(),
  tool_choice: term(),
  top_k: term(),
  top_p: term(),
  verbose_api: term()
}
Functions
Calls the Anthropic API passing the ChatAnthropic struct with configuration, plus either a simple message or the list of messages to act as the prompt.
Optionally pass in a callback function that can be executed as data is received from the API.
NOTE: This function can be used directly, but the primary interface should be through LangChain.Chains.LLMChain. The ChatAnthropic module is more focused on translating the LangChain data structures to and from the Anthropic API.
Another benefit of using LangChain.Chains.LLMChain is that it combines the storage of messages, adding functions, adding custom context that should be passed to functions, and automatically applying LangChain.MessageDelta structs as they are received, then converting those to the full LangChain.Message once fully complete.
@spec content_part_for_api(LangChain.Message.ContentPart.t()) :: map() | nil | no_return()
Converts a ContentPart to the format expected by the Anthropic API.
Handles different content types:
- :text - Converts to a text content part, optionally with cache control settings
- :thinking - Converts to a thinking content part with required signature
- :unsupported - Handles custom content types specified in options
- :image - Converts to an image content part with base64 data and media type
- :image_url - Raises an error as Anthropic doesn't support image URLs
Options
For :text type:
- :cache_control - When provided, adds cache control settings to the content
For :thinking type:
- :signature - Required signature for thinking content
For :unsupported type:
- :type - Required string specifying the custom content type
For :image type:
- :media - Required media type (:png, :jpg, :jpeg, :gif, :webp, or a string)
Returns nil for unsupported content without required options.
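The :media option accepts either an atom or a string, which suggests the atoms are shorthand for MIME types. The sketch below shows one plausible mapping. It is an assumption about the behaviour, not the library's code: `MediaTypeSketch` and `to_mime/1` are hypothetical names, and the choice to pass strings through untouched is a guess.

```elixir
defmodule MediaTypeSketch do
  # Hypothetical mapping, not the library's code. It assumes atoms map to the
  # usual MIME types and that strings are passed through untouched.
  def to_mime(:png), do: "image/png"
  def to_mime(type) when type in [:jpg, :jpeg], do: "image/jpeg"
  def to_mime(:gif), do: "image/gif"
  def to_mime(:webp), do: "image/webp"
  def to_mime(type) when is_binary(type), do: type
end
```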
Converts a list of ContentParts to the format expected by the Anthropic API.
@spec for_api(
  LangChain.Message.t()
  | LangChain.Message.ContentPart.t()
  | LangChain.Function.t()
) :: %{required(String.t()) => any()} | no_return()
Convert a LangChain structure to the expected map of data for the Anthropic API.
@spec for_api(t(), message :: [map()], LangChain.ChatModels.ChatModel.tools()) ::
  %{required(atom()) => any()}
Return the params formatted for an API request.
@spec function_for_api(LangChain.Function.t()) :: map() | no_return()
Convert a Function to the format expected by the Anthropic API.
Converts a Message to the format expected by the Anthropic API.
@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}
Setup a ChatAnthropic client configuration.
Setup a ChatAnthropic client configuration and return it or raise an error if invalid.
After all the messages have been converted using for_api/1, this combines multiple sequential tool response messages. The Anthropic API is very strict about messages alternating in a user, assistant, user, assistant sequence.
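To illustrate the idea, the sketch below merges neighbouring messages that share a role, which is what restores the alternating sequence when several tool results (each sent with the "user" role) follow one assistant turn. It is illustrative only and is not the library's implementation; `CombineSketch` and `combine_sequential/1` are hypothetical names, and the messages are assumed to be already-converted maps with a "role" and a list under "content".

```elixir
defmodule CombineSketch do
  # Illustrative only, not the library's implementation. Messages are assumed
  # to be already-converted maps with "role" and a list under "content".

  @doc "Merges neighbouring messages that share a role by concatenating content."
  def combine_sequential(messages) do
    messages
    |> Enum.reduce([], fn
      # Same role as the previously kept message: append to its content.
      %{"role" => role, "content" => content},
      [%{"role" => role, "content" => prev} = last | rest] ->
        [%{last | "content" => prev ++ content} | rest]

      # Different role (or first message): keep it as a new entry.
      message, acc ->
        [message | acc]
    end)
    |> Enum.reverse()
  end
end
```

The accumulator is built in reverse so each step only inspects the head of the list, and a single `Enum.reverse/1` at the end restores the original order.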
Restores the model from the config.
Generate a config map that can later restore the model's configuration.