Qwen3 Embedding 4B

Qwen3 Embedding 4B is a 4B-parameter multilingual embedding model (36 layers, 32K-token context) that outputs 2560-dimensional vectors for text and code retrieval, classification, clustering, and bitext mining. It supports instruction-conditioned embeddings, so the same input can be embedded differently depending on the task, and it is designed for cross-lingual use.

Model details

Category | Details
Model Name | Qwen/Qwen3-Embedding-4B
Version | Original
Model Category | Embedding
Size | 4B parameters
HuggingFace Model | Qwen/Qwen3-Embedding-4B
OpenAI Compatible Endpoint | Embeddings
License | Apache 2.0

Capabilities

Feature | Status
Context Length | 32k tokens
Input Data | Text
Output Dimensions | 256, 512, 1024, 2048, 4096

Usage

Embedding

const modelResponse = await Azion.AI.run("Qwen/Qwen3-Embedding-4B", {
  "input": "The food was delicious and the waiter...",
  "encoding_format": "float"
});

Response example:

{
  "id": "embd-84a83438abff420e9c785c1659ae8ad6",
  "object": "list",
  "created": 1746821207,
  "model": "Qwen/Qwen3-Embedding-4B",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.01,...,0.005]
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "total_tokens": 11,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
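
Each vector is found in data[i].embedding, with one entry per input. As a minimal sketch of consuming such a response, the helpers below order the entries by index and compare two vectors. The names extractEmbeddings and cosineSimilarity are illustrative and not part of the Azion API, and the sample response is made up, with 3-dimensional vectors for brevity.

```javascript
// Hypothetical helper: return the embedding vectors in input order.
function extractEmbeddings(response) {
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up response with tiny vectors, shaped like the example above.
const sampleResponse = {
  object: "list",
  data: [
    { index: 1, object: "embedding", embedding: [0, 1, 0] },
    { index: 0, object: "embedding", embedding: [1, 0, 0] },
  ],
};

const [first, second] = extractEmbeddings(sampleResponse);
console.log(cosineSimilarity(first, second)); // 0 (orthogonal vectors)
console.log(cosineSimilarity(first, first));  // 1 (identical vectors)
```

Cosine similarity is a common choice for comparing embeddings in retrieval and clustering; other distance measures may suit a given application better.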

To get vectors of a different size, set the dimensions parameter to one of the supported output dimensions:

const modelResponse = await Azion.AI.run("Qwen/Qwen3-Embedding-4B", {
  "input": "The food was delicious and the waiter...",
  "encoding_format": "float",
  "dimensions": 256
});
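
Because dimensions accepts only the values listed under Capabilities, it can help to check the value before sending the request. The following is a minimal sketch, assuming a hypothetical helper named buildEmbeddingRequest that is not part of the Azion API; the allowed values are copied from this page.

```javascript
// Output sizes listed on this page for the dimensions parameter.
const SUPPORTED_DIMENSIONS = [256, 512, 1024, 2048, 4096];

// Hypothetical helper: build a request body, rejecting unsupported sizes.
function buildEmbeddingRequest(input, dimensions) {
  const body = { input, encoding_format: "float" };
  if (dimensions !== undefined) {
    if (!SUPPORTED_DIMENSIONS.includes(dimensions)) {
      throw new RangeError(
        `dimensions must be one of ${SUPPORTED_DIMENSIONS.join(", ")}`
      );
    }
    body.dimensions = dimensions;
  }
  return body;
}

const requestBody = buildEmbeddingRequest(
  "The food was delicious and the waiter...",
  256
);
console.log(requestBody.dimensions); // 256
```

The resulting object can be passed as the second argument to Azion.AI.run in place of the inline literal.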

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": [
    "input"
  ],
  "properties": {
    "encoding_format": {
      "type": "string",
      "enum": [
        "float",
        "base64"
      ]
    },
    "dimensions": {
      "enum": [
        256,
        512,
        1024,
        2048,
        4096
      ]
    },
    "input": {
      "oneOf": [
        {
          "type": "string"
        },
        {
          "type": "array",
          "items": {
            "oneOf": [
              {
                "type": "string"
              },
              {
                "type": "integer"
              },
              {
                "type": "array",
                "items": {
                  "type": "integer"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
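
Read as code, the input rules in the schema say that a request may carry a single string, or an array whose items are strings, integers, or arrays of integers. The integer forms are presumably token IDs, although this page does not say so. The following is a minimal hand-written check of those rules, assuming a hypothetical function name isValidInput and no schema library.

```javascript
// True when arr is an array whose elements are all integers.
function isIntegerArray(arr) {
  return Array.isArray(arr) && arr.every((n) => Number.isInteger(n));
}

// Hypothetical check mirroring the "input" rules of the schema above.
function isValidInput(input) {
  if (typeof input === "string") return true;
  if (!Array.isArray(input)) return false;
  return input.every(
    (item) =>
      typeof item === "string" ||
      Number.isInteger(item) ||
      isIntegerArray(item)
  );
}

console.log(isValidInput("hello"));             // true
console.log(isValidInput(["a", "b"]));          // true
console.log(isValidInput([[1, 2, 3], [4, 5]])); // true
console.log(isValidInput({ text: "hello" }));   // false
```

A full JSON Schema validator would also enforce the required, encoding_format, and dimensions rules; this sketch covers only the shape of input.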