Embedding API Usage Guide
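The Turing platform's embedding models convert text into high-dimensional vector representations for tasks such as semantic search, clustering, and recommendation.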
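Notes:
- Multiple embedding models are supported, including text-embedding-ada-002.
- A single request accepts at most 8192 tokens.
- The dimension of the returned vector depends on the selected model.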
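The 8192-token limit means long documents have to be split before they are embedded. The following is a minimal sketch of such a splitter, not platform code: `split_for_embedding` and `approx_token_count` are hypothetical helper names, and counting whitespace-separated words is only a crude stand-in for the model's real tokenizer, which this guide does not specify. Pass your own `count_tokens` function if you have an accurate one.

```python
from typing import Callable, List


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words.

    Assumption: real token counts are usually higher than word counts,
    so leave headroom below the documented limit when using this.
    """
    return len(text.split())


def split_for_embedding(
    text: str,
    max_tokens: int = 8192,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily pack words into chunks whose estimated size stays within max_tokens."""
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this word would exceed the budget: close the current chunk.
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    # A tiny budget makes the chunking visible.
    print(split_for_embedding("a b c d e", max_tokens=2))
```

Each returned chunk can then be sent in its own request. The sketch re-joins the running chunk for every word, which is simple but not fast, so a production version would track the count incrementally.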
Models and dimensions
| Model | Default dimension | Supported dimensions |
|---|---|---|
| turing/text-embedding-ada-002 | 1536 | Not adjustable |
| turing/text-embedding-3-small | 1536 | 512, 1536 |
| turing/text-embedding-3-large | 3072 | 256, 1024, 3072 |
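The table can be turned into a client-side check so that an unsupported model and dimension pair fails before a request is sent. This is a sketch under assumptions: `MODEL_DIMENSIONS` and `build_embedding_payload` are hypothetical names, the keys use the model names exactly as the table spells them (the request examples in this guide use unprefixed names, so adjust the keys to whichever form your deployment accepts), and rejecting any `dimensions` value for a non-adjustable model is a conservative choice rather than documented server behavior.

```python
from typing import Dict, List, Optional, Tuple, Union

# Copied from the table: model -> (default dimension, allowed `dimensions` values).
# An empty tuple means the model does not support adjustment.
MODEL_DIMENSIONS: Dict[str, Tuple[int, Tuple[int, ...]]] = {
    "turing/text-embedding-ada-002": (1536, ()),
    "turing/text-embedding-3-small": (1536, (512, 1536)),
    "turing/text-embedding-3-large": (3072, (256, 1024, 3072)),
}


def build_embedding_payload(
    model: str,
    text: Union[str, List[str]],
    dimensions: Optional[int] = None,
) -> dict:
    """Build the JSON body for an embeddings request, validating `dimensions`."""
    if model not in MODEL_DIMENSIONS:
        raise ValueError(f"Unknown model: {model}")
    _default_dim, allowed = MODEL_DIMENSIONS[model]
    payload: dict = {"model": model, "input": text}
    if dimensions is not None:
        if dimensions not in allowed:
            raise ValueError(
                f"{model} does not accept dimensions={dimensions}; allowed: {allowed}"
            )
        payload["dimensions"] = dimensions
    return payload


if __name__ == "__main__":
    print(build_embedding_payload("turing/text-embedding-3-small", "hello", 512))
```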
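Custom dimensions
For models that support dimension adjustment, you can set the dimension of the output vector with the dimensions parameter. Smaller dimensions can reduce cost and improve performance, but may lower embedding quality.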
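One concrete part of that trade-off is storage on the client side. The arithmetic below is a rough estimate, assuming each vector component is stored as a 4-byte float32 and ignoring any index overhead; `index_size_bytes` is a hypothetical helper, and how you store vectors is your own choice rather than something the API dictates.

```python
def index_size_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for num_vectors embeddings (float32 by default, no index overhead)."""
    return num_vectors * dimensions * bytes_per_value


if __name__ == "__main__":
    # Compare the dimensions listed for text-embedding-3-large.
    for dims in (3072, 1024, 256):
        size_gib = index_size_bytes(1_000_000, dims) / 2**30
        print(f"{dims:>4} dims: {size_gib:.2f} GiB per million vectors")
```

Storage scales linearly with the dimension, so going from 3072 to 256 dimensions cuts the raw footprint by a factor of 12. Whether the quality loss is acceptable has to be measured on your own data.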
SDK
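import os

from openai import OpenAI

# Point the OpenAI SDK at the Turing platform, using the same environment
# variables as the cURL examples.
client = OpenAI(
    base_url=os.environ["TURING_BASE_URL"],
    api_key=os.environ["TURING_API_KEY"],
)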
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="The quick brown fox jumps over the lazy dog",
)
print(response.data[0].embedding)
print(f"Embedding dimension: {len(response.data[0].embedding)}")
# Embed several texts in one request
texts = [
    "Hello world",
    "Python programming",
    "Machine learning",
]
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input=texts,
)
for i, embedding in enumerate(response.data):
    print(f"Text {i+1} embedding dimension: {len(embedding.embedding)}")
# Use a custom dimension (text-embedding-3-small and text-embedding-3-large only)
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Custom dimension example",
    dimensions=512,  # request 512-dimensional vectors
)
print(f"Custom embedding dimension: {len(response.data[0].embedding)}")
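Once vectors have been retrieved, the semantic search use case mentioned in the introduction usually comes down to ranking candidates by cosine similarity to a query vector. The sketch below illustrates that step in pure Python. The function names are hypothetical, and the 3-dimensional vectors are made up for illustration; they are not real model output.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    documents: Sequence[Tuple[str, Sequence[float]]],
) -> List[Tuple[str, float]]:
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in documents]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings returned by the API.
    query = [1.0, 0.0, 0.0]
    docs = [
        ("doc_a", [0.9, 0.1, 0.0]),
        ("doc_b", [0.0, 1.0, 0.0]),
        ("doc_c", [0.5, 0.5, 0.0]),
    ]
    for label, score in rank_by_similarity(query, docs):
        print(f"{label}: {score:.3f}")
```

In practice each document vector would be `item.embedding` from a batch response and the query vector would come from a separate request with the same model and dimension. For large collections a vector index is a better fit than this linear scan.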
CURL
curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
Batch processing example
curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": [
      "Hello world",
      "Python programming",
      "Machine learning"
    ]
  }'
Custom dimensions example
curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog",
    "dimensions": 512
  }'
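For environments without the SDK, the same HTTP request can be assembled with the Python standard library. This is a sketch that mirrors the cURL calls above: `build_embeddings_request` is a hypothetical helper, and the URL in the demo is a placeholder rather than a real endpoint. The function only builds the request and does not send it.

```python
import json
import urllib.request
from typing import List, Optional, Union


def build_embeddings_request(
    base_url: str,
    api_key: str,
    model: str,
    text: Union[str, List[str]],
    dimensions: Optional[int] = None,
) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL examples (not sent here)."""
    body: dict = {"model": model, "input": text}
    if dimensions is not None:
        body["dimensions"] = dimensions
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embeddings",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; in real use read TURING_BASE_URL and TURING_API_KEY
    # from the environment, as the cURL examples do.
    req = build_embeddings_request(
        "https://example.invalid/v1",
        "placeholder-key",
        "text-embedding-3-small",
        "The quick brown fox jumps over the lazy dog",
        dimensions=512,
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. The response body is then JSON that can be read with `json.load`.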