# Mistral 3 Small (24B AWQ)

Mistral 3 Small is a language model that delivers capabilities comparable to larger models while being compact. It’s ideal for conversational agents, function calling, fine-tuning, and local inference with sensitive data.

## Model details

| Category | Details |
|---|---|
| Model Name | Mistral 3 Small |
| Version | 24B AWQ |
| Model Category | Large Language Model (LLM) |
| Size | 24B parameters |
| HuggingFace Model | `casperhansen/mistral-small-24b-instruct-2501-awq` |
| OpenAI Compatible Endpoint | Chat Completions |
| License | Apache 2.0 |

## Capabilities

| Feature | Details |
|---|---|
| Tool Calling | |
| Azion Long-term Support (LTS) | |
| Context Length | 32k tokens |
| Supports LoRA | |
| Input Data | Text |

## Usage

### Basic chat completion

This is an example of a basic chat completion request using this model:

```js
const modelResponse = await Azion.AI.run("casperhansen-mistral-small-24b-instruct-2501-awq", {
  "stream": true,
  "max_tokens": 1024,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Name the European capitals"
    }
  ]
})
```
| Property | Type | Description |
|---|---|---|
| `stream` | boolean | Indicates whether the response should be streamed. |
| `max_tokens` | number | The maximum number of tokens in the response. |
| `messages` | array | An array of message objects containing the role and content of each message. |
| `messages[].role` | string | The role of the message sender. |
| `messages[].content` | string | The content of the message. |

Response example:

```json
{
  "id": "chatcmpl-e27716424abf4b3f891ff4850470cb09",
  "object": "chat.completion",
  "created": 1746821581,
  "model": "casperhansen-mistral-small-24b-instruct-2501-awq",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": "Sure! Here is a list of some European capitals...",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 527,
    "completion_tokens": 518,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
```
| Property | Type | Description |
|---|---|---|
| `id` | string | Unique identifier for the response. |
| `object` | string | The type of object returned in the response. |
| `created` | number | Timestamp of when the response was created. |
| `model` | string | The name of the model used for the request. |
| `choices` | array | Array of objects containing the response choices. |
| `usage` | object | Object containing usage metrics for the request. |
| `prompt_logprobs` | number | Log probabilities of the prompt. |
| `choices[].index` | number | Index of the choice in the array. |
| `choices[].message` | object | Object containing the message details. |
| `choices[].message.reasoning_content` | string | The reasoning content of the message. |
| `choices[].message.role` | string | The role of the message sender. |
| `choices[].message.content` | string | The content of the message. |
| `choices[].message.tool_calls` | array | Array of tool call objects. |
| `choices[].logprobs` | number | Log probabilities of the choice. |
| `choices[].finish_reason` | string | The reason the choice finished. |
| `choices[].stop_reason` | string | The reason the choice was stopped. |
| `usage.prompt_tokens` | number | The number of tokens in the input prompt. |
| `usage.total_tokens` | number | The total number of tokens processed. |
| `usage.completion_tokens` | number | The number of tokens in the completion. |
| `usage.prompt_tokens_details` | string | Additional details about the prompt tokens. |
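As an illustrative sketch of consuming this response shape (not part of the Azion SDK), the helper below pulls the assistant text, finish reason, and token usage out of a non-streamed completion object like the one above:

```javascript
// Illustrative helper: extract the assistant reply and usage metrics
// from a non-streamed chat completion response object.
function summarizeCompletion(response) {
  const choice = response.choices?.[0];
  return {
    text: choice?.message?.content ?? "",
    finishReason: choice?.finish_reason ?? null,
    totalTokens: response.usage?.total_tokens ?? 0
  };
}

// Sample data, trimmed from the response example above.
const sampleResponse = {
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "Sure! Here is a list of some European capitals..." },
      finish_reason: "stop"
    }
  ],
  usage: { prompt_tokens: 9, total_tokens: 527, completion_tokens: 518 }
};
```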

### Tool Calling example

This is an example of a tool calling request using this model:

```js
const modelResponse = await Azion.AI.run("casperhansen-mistral-small-24b-instruct-2501-awq", {
  "stream": true,
  "max_tokens": 1024,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant with access to tools."
    },
    {
      "role": "user",
      "content": "What is the weather in London?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state"
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ]
})
```
| Property | Type | Description |
|---|---|---|
| `stream` | boolean | Indicates whether to stream the response. |
| `max_tokens` | number | The maximum number of tokens to generate in the response. |
| `messages` | array | List of messages in the conversation. |
| `messages[].role` | string | The role of the message sender. |
| `messages[].content` | string | The content of the message. |
| `tools` | array | A list of tools or functions the model can call. |
| `tools[].type` | string | The type of the tool. |
| `tools[].function` | object | Metadata about the function being defined. |
| `tools[].function.name` | string | Name of the function. |
| `tools[].function.description` | string | Description of what the function does. |
| `tools[].function.parameters` | object | JSON Schema describing the function's parameters. |

Response example:

```json
{
  "id": "chatcmpl-88affc4730cf4219a06d2b15aad9ad44",
  "object": "chat.completion",
  "created": 1746821866,
  "model": "casperhansen-mistral-small-24b-instruct-2501-awq",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": null,
        "tool_calls": [
          {
            "id": "chatcmpl-tool-fd3311e75aed4cbfbeb7244ced77379f",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"London\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 293,
    "total_tokens": 313,
    "completion_tokens": 20,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
```
| Property | Type | Description |
|---|---|---|
| `id` | string | Unique identifier for the response. |
| `object` | string | The type of object returned in the response. |
| `created` | number | Timestamp of when the response was created. |
| `model` | string | The name of the model used for the request. |
| `choices` | array | Array of objects containing the response choices. |
| `usage` | object | Object containing usage metrics for the request. |
| `choices[].index` | number | Index of the choice in the array. |
| `choices[].message` | object | Object containing the message details. |
| `choices[].message.role` | string | The role of the message sender. |
| `choices[].message.reasoning_content` | string | The reasoning content of the message. |
| `choices[].message.content` | string | The content of the message. |
| `choices[].message.tool_calls` | array | Array of tool call objects. |
| `choices[].message.tool_calls[].id` | string | Unique identifier for the tool call. |
| `choices[].message.tool_calls[].type` | string | The type of tool call. |
| `choices[].message.tool_calls[].function` | object | Object containing the function details. |
| `choices[].message.tool_calls[].function.name` | string | The name of the function. |
| `choices[].message.tool_calls[].function.arguments` | string | The arguments passed to the function. |
| `usage.prompt_tokens` | number | The number of tokens in the input prompt. |
| `usage.total_tokens` | number | The total number of tokens processed. |
| `usage.completion_tokens` | number | The number of tokens in the completion. |
| `usage.prompt_tokens_details` | string | Additional details about the prompt tokens. |
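When `finish_reason` is `tool_calls`, the caller executes the requested function locally and sends the result back as a `role: "tool"` message carrying the matching `tool_call_id`. The sketch below (illustrative; `localTools` and its `get_weather` implementation are assumptions, not part of any SDK) turns the `tool_calls` array from the response above into follow-up messages ready to append to `messages` for the next request:

```javascript
// Hypothetical local implementations keyed by tool name.
const localTools = {
  get_weather: ({ location }) => `Sunny in ${location}`
};

// Convert each tool_call from a response into a role "tool" message:
// parse the JSON-encoded arguments, run the local function, and pair
// the result with the originating tool_call_id.
function buildToolMessages(response) {
  const calls = response.choices[0].message.tool_calls ?? [];
  return calls.map((call) => ({
    role: "tool",
    tool_call_id: call.id,
    content: String(localTools[call.function.name](JSON.parse(call.function.arguments)))
  }));
}

// Sample data, trimmed from the tool-calling response example above.
const toolCallResponse = {
  choices: [{
    message: {
      role: "assistant",
      content: null,
      tool_calls: [{
        id: "chatcmpl-tool-fd3311e75aed4cbfbeb7244ced77379f",
        type: "function",
        function: { name: "get_weather", arguments: "{\"location\": \"London\"}" }
      }]
    },
    finish_reason: "tool_calls"
  }]
};
```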

## JSON schema

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": [
    "messages"
  ],
  "properties": {
    "messages": {
      "type": "array",
      "items": {
        "$ref": "#/components/schemas/Message"
      }
    },
    "temperature": {
      "type": "number",
      "minimum": 0,
      "maximum": 2
    },
    "top_p": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "default": 1
    },
    "n": {
      "type": "integer",
      "minimum": 1,
      "default": 1
    },
    "stream": {
      "type": "boolean",
      "default": false
    },
    "max_tokens": {
      "type": "integer",
      "minimum": 1
    },
    "presence_penalty": {
      "type": "number",
      "minimum": -2,
      "maximum": 2,
      "default": 0
    },
    "frequency_penalty": {
      "type": "number",
      "minimum": -2,
      "maximum": 2,
      "default": 0
    },
    "tools": {
      "type": "array",
      "items": {
        "$ref": "#/components/schemas/ToolDefinition"
      }
    }
  },
  "components": {
    "schemas": {
      "Message": {
        "oneOf": [
          { "$ref": "#/components/schemas/SystemMessage" },
          { "$ref": "#/components/schemas/UserMessage" },
          { "$ref": "#/components/schemas/AssistantMessage" },
          { "$ref": "#/components/schemas/ToolMessage" }
        ]
      },
      "SystemMessage": {
        "type": "object",
        "required": ["role", "content"],
        "properties": {
          "role": {
            "type": "string",
            "enum": ["system"]
          },
          "content": {
            "$ref": "#/components/schemas/TextContent"
          }
        }
      },
      "UserMessage": {
        "type": "object",
        "required": ["role", "content"],
        "properties": {
          "role": {
            "type": "string",
            "enum": ["user"]
          },
          "content": {
            "oneOf": [
              { "type": "string" },
              {
                "type": "array",
                "items": {
                  "oneOf": [
                    { "$ref": "#/components/schemas/TextContentItem" }
                  ]
                }
              }
            ]
          }
        }
      },
      "AssistantMessage": {
        "oneOf": [
          { "$ref": "#/components/schemas/AssistantMessageWithoutToolCalls" },
          { "$ref": "#/components/schemas/AssistantMessageWithToolCalls" }
        ]
      },
      "ToolMessage": {
        "type": "object",
        "required": ["role", "content", "tool_call_id"],
        "properties": {
          "role": {
            "enum": ["tool"]
          },
          "content": {
            "type": "string"
          },
          "tool_call_id": {
            "type": "string"
          }
        }
      },
      "AssistantMessageWithoutToolCalls": {
        "type": "object",
        "required": ["role", "content"],
        "properties": {
          "role": {
            "type": "string",
            "enum": ["assistant"]
          },
          "content": {
            "$ref": "#/components/schemas/TextContent"
          }
        },
        "not": {
          "required": ["tool_calls"]
        }
      },
      "AssistantMessageWithToolCalls": {
        "type": "object",
        "required": ["role", "tool_calls"],
        "properties": {
          "role": {
            "type": "string",
            "enum": ["assistant"]
          },
          "tool_calls": {
            "type": "array",
            "items": {
              "$ref": "#/components/schemas/ToolCalls"
            }
          }
        }
      },
      "TextContent": {
        "oneOf": [
          { "type": "string" },
          {
            "type": "array",
            "items": {
              "$ref": "#/components/schemas/TextContentItem"
            }
          }
        ],
        "description": "Text content that can be provided either as a simple string or as an array of TextContentItem objects"
      },
      "TextContentItem": {
        "type": "object",
        "required": ["type", "text"],
        "properties": {
          "type": {
            "type": "string",
            "enum": ["text"]
          },
          "text": {
            "type": "string"
          }
        }
      },
      "ToolCalls": {
        "type": "object",
        "required": ["function", "id", "type"],
        "properties": {
          "function": {
            "type": "object",
            "required": ["name", "arguments"],
            "properties": {
              "name": {
                "type": "string"
              },
              "arguments": {
                "type": "string"
              }
            }
          },
          "id": {
            "type": "string"
          },
          "type": {
            "enum": ["function"]
          }
        },
        "description": "The name and arguments of a function that should be called, as generated by the model."
      },
      "ToolDefinition": {
        "type": "object",
        "required": ["type", "function"],
        "properties": {
          "type": {
            "type": "string",
            "enum": ["function"]
          },
          "function": {
            "type": "object",
            "required": ["name"],
            "properties": {
              "name": {
                "type": "string"
              },
              "description": {
                "type": "string"
              },
              "parameters": {
                "type": "object",
                "additionalProperties": true
              },
              "strict": {
                "type": "boolean",
                "default": false
              }
            }
          }
        },
        "description": "Definition of a tool that can be used by the model"
      }
    }
  }
}
```
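As a quick sanity check before sending a request, the snippet below mirrors a few of the schema's constraints in plain JavaScript (required `messages`, the four role values, and the `temperature`/`top_p` ranges). It is an illustrative sketch, not a full JSON Schema validator:

```javascript
// Returns a list of constraint violations (empty when the request body
// passes these basic checks drawn from the schema above).
function checkRequest(body) {
  const errors = [];
  if (!Array.isArray(body.messages)) {
    errors.push("messages is required and must be an array");
  } else {
    const roles = new Set(["system", "user", "assistant", "tool"]);
    body.messages.forEach((m, i) => {
      if (!roles.has(m.role)) errors.push(`messages[${i}].role is invalid`);
    });
  }
  if (body.temperature !== undefined && (body.temperature < 0 || body.temperature > 2)) {
    errors.push("temperature must be between 0 and 2");
  }
  if (body.top_p !== undefined && (body.top_p < 0 || body.top_p > 1)) {
    errors.push("top_p must be between 0 and 1");
  }
  return errors;
}
```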