Embedding API Guide

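The Turing platform's embedding models convert text into high-dimensional vector representations for tasks such as semantic search, clustering, and recommendation.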

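Notes:

  • Multiple embedding models are supported, including text-embedding-ada-002.
  • A single request accepts at most 8192 tokens.
  • The dimension of the returned vectors depends on the selected model.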
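
The 8192-token request limit matters mostly when sending many texts at once. The sketch below shows one way to split a list of inputs into request-sized groups. The names `approx_token_count` and `batch_by_token_budget` are hypothetical, and the whitespace word count is only a stand-in for the model's real tokenizer, so actual token counts will differ.

```python
from typing import Callable, Iterable, List

MAX_TOKENS_PER_REQUEST = 8192  # limit stated in the notes above


def approx_token_count(text: str) -> int:
    """Crude stand-in tokenizer: counts whitespace-separated words.

    Assumption: real token counts differ; swap in the model's tokenizer.
    """
    return len(text.split())


def batch_by_token_budget(
    texts: Iterable[str],
    count_tokens: Callable[[str], int] = approx_token_count,
    max_tokens: int = MAX_TOKENS_PER_REQUEST,
) -> List[List[str]]:
    """Group texts so each group's total token count stays within max_tokens."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in texts:
        n = count_tokens(text)
        if n > max_tokens:
            raise ValueError(f"single input has {n} tokens, over the {max_tokens} limit")
        # Start a new batch when adding this text would exceed the budget.
        if current and used + n > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        batches.append(current)
    return batches
```

Each returned group can then be passed as the `input` list of one embeddings request. An input that exceeds the limit on its own is rejected here rather than silently truncated.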

Models and dimensions

Model                            Default dimensions    Adjustable dimensions
turing/text-embedding-ada-002    1536                  Not adjustable
turing/text-embedding-3-small    1536                  512, 1536
turing/text-embedding-3-large    3072                  256, 1024, 3072
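
The table can be encoded as a small lookup so that an unsupported `dimensions` value is caught before a request is sent. This is a minimal sketch: `DEFAULT_DIMENSIONS`, `ALLOWED_DIMENSIONS`, and `resolve_dimensions` are hypothetical names, and it assumes the values listed in the table are the only ones accepted, which the table does not state explicitly.

```python
from typing import Dict, Optional, Tuple

# Values copied from the table above.
DEFAULT_DIMENSIONS: Dict[str, int] = {
    "turing/text-embedding-ada-002": 1536,
    "turing/text-embedding-3-small": 1536,
    "turing/text-embedding-3-large": 3072,
}

# An empty tuple means the model does not support the `dimensions` parameter.
ALLOWED_DIMENSIONS: Dict[str, Tuple[int, ...]] = {
    "turing/text-embedding-ada-002": (),
    "turing/text-embedding-3-small": (512, 1536),
    "turing/text-embedding-3-large": (256, 1024, 3072),
}


def resolve_dimensions(model: str, requested: Optional[int] = None) -> int:
    """Return the vector size to expect, or raise if the request is invalid."""
    if model not in DEFAULT_DIMENSIONS:
        raise KeyError(f"unknown model: {model}")
    if requested is None:
        return DEFAULT_DIMENSIONS[model]
    allowed = ALLOWED_DIMENSIONS[model]
    if not allowed:
        raise ValueError(f"{model} does not support custom dimensions")
    if requested not in allowed:
        raise ValueError(f"{model} accepts dimensions {allowed}, got {requested}")
    return requested
```

For example, `resolve_dimensions("turing/text-embedding-3-small", 512)` returns 512, while passing any `requested` value for the ada-002 model raises `ValueError`.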

Custom dimensions

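For models that support dimension adjustment, you can set the dimensions parameter to specify the size of the output vector. Smaller dimensions can reduce cost and improve performance, but may lower embedding quality.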
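
One concrete part of that trade-off is storage. The arithmetic below estimates the raw size of a vector index at two dimension settings. It assumes vectors are stored as 32-bit floats with no index overhead, and `index_size_bytes` is a hypothetical helper name, so treat the numbers as a rough lower bound rather than a measurement.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def index_size_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw storage needed for num_vectors vectors of the given dimension."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


full = index_size_bytes(1_000_000, 1536)
small = index_size_bytes(1_000_000, 512)

print(f"1536 dims: {full / 1e9:.2f} GB")   # 6.14 GB
print(f"512 dims:  {small / 1e9:.2f} GB")  # 2.05 GB
print(f"ratio: {full // small}x")          # 3x
```

Similarity computations scale with the dimension as well, so the same factor applies roughly to per-comparison work.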

SDK

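import os

from openai import OpenAI

# Assumption: the client targets the Turing platform through the same
# environment variables that the curl examples use.
client = OpenAI(
    base_url=os.environ["TURING_BASE_URL"],
    api_key=os.environ["TURING_API_KEY"],
)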

response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="The quick brown fox jumps over the lazy dog",
)

print(response.data[0].embedding)
print(f"Embedding dimension: {len(response.data[0].embedding)}")

# Embed several texts in one request
texts = [
    "Hello world",
    "Python programming",
    "Machine learning",
]

response = client.embeddings.create(
    model="text-embedding-ada-002",
    input=texts,
)

# response.data holds one item per input text
for i, item in enumerate(response.data):
    print(f"Text {i+1} embedding dimension: {len(item.embedding)}")

# Use a custom dimension (only for text-embedding-3-small and text-embedding-3-large)
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Custom dimension example",
    dimensions=512,  # request 512-dimensional vectors
)

print(f"Custom embedding dimension: {len(response.data[0].embedding)}")
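
For the semantic search use case mentioned in the introduction, the returned vectors are typically compared by cosine similarity. The following is a minimal, dependency-free sketch that works on plain lists of floats such as `response.data[i].embedding`. The function names `cosine_similarity` and `rank_by_similarity` are hypothetical, and cosine similarity is a common choice rather than something this guide prescribes.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the query and every candidate should come from the same model and the same `dimensions` setting, since vectors of different sizes cannot be compared.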

CURL

curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Batch example

curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": [
      "Hello world",
      "Python programming",
      "Machine learning"
    ]
  }'

Custom dimensions example

curl "$TURING_BASE_URL/embeddings" \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog",
    "dimensions": 512
  }'