Embedding Vectors
Obtain vector representations of given inputs that can be easily used by machine learning models and algorithms.
What are embedding vectors?
Embedding vectors are lists of floating-point numbers that represent text strings. The distance between two vectors measures how related the strings are: a small distance indicates high relatedness, while a large distance indicates low relatedness.
Embeddings are commonly used for:
- Search - Results are ranked by their relevance to the query string
- Clustering - Grouping text strings by similarity
- Recommendation - Recommending items with relevant text strings
- Anomaly Detection - Identifying outliers with lower relevance
- Classification - Classifying text strings by their most similar labels
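All of these use cases reduce to comparing vector distances. As a concrete illustration (a sketch using made-up 3-dimensional vectors rather than real model output, which has thousands of dimensions):

```python
import numpy as np

# Toy vectors standing in for real embeddings; "cat" and "kitten"
# point in similar directions, "car" does not
cat = np.array([0.8, 0.6, 0.0])
kitten = np.array([0.7, 0.7, 0.1])
car = np.array([0.0, 0.1, 0.99])

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(cat, kitten))  # high: the strings are related
print(cosine_similarity(cat, car))     # low: the strings are unrelated
```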
Quick Start
Install the SDK
```shell
pip install openai
```

Basic Example
Please make sure to replace $MODELVERSE_API_KEY with your own API key.
**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="text-embedding-3-large"
)

print(response.data[0].embedding)
```

**curl**
```shell
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-3-large"
  }'
```

The response contains the embedding vector (a list of floating-point numbers) along with some additional metadata. You can extract the embedding vector, store it in a vector database, and use it for many different use cases.
API Reference
POST https://api.umodelverse.ai/v1/embeddings
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | string or array | Yes | Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings. Inputs must not exceed 8192 tokens and cannot be empty strings. The total token sum of all inputs in a single request can be up to 300,000. |
| model | string | Yes | The model ID to use, such as text-embedding-3-large. |
| dimensions | integer | No | The number of dimensions the output embedding vector should have. Supported only in text-embedding-3 and later model versions. |
| encoding_format | string | No | The format to return the embedding vector. Can be float or base64. Default: float |
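When encoding_format is base64, the embedding arrives as a base64 string of packed floats rather than a JSON array, which is more compact on the wire. A sketch of decoding it (assuming float32 packing, which is what OpenAI-compatible APIs typically use; verify against your actual responses):

```python
import base64
import numpy as np

def decode_base64_embedding(b64: str) -> np.ndarray:
    """Decode a base64-encoded embedding into a float32 vector."""
    raw = base64.b64decode(b64)
    return np.frombuffer(raw, dtype=np.float32)

# Round-trip demo with a toy vector in place of a real API response
original = np.array([0.0023, -0.0093, 0.0041], dtype=np.float32)
encoded = base64.b64encode(original.tobytes()).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)  # matches the original vector
```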
Response Example
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ..., -0.0028842222],
      "index": 0
    }
  ],
  "model": "text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

Response Field Descriptions
| Field | Type | Description |
|---|---|---|
| embedding | array | The embedding vector, a list of floating-point numbers. The length of the vector depends on the model. |
| index | integer | Index of the embedding in the list of embeddings. |
| object | string | The type of the object, always “embedding”. |
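To embed several strings in one request, pass an array as input; each item in data then carries an index field that maps it back to its input. The bookkeeping can be sketched offline (the response items below are simulated stand-ins, not fetched from the API):

```python
from types import SimpleNamespace

# Simulated response items, standing in for response.data;
# note the items need not arrive in input order
data = [
    SimpleNamespace(index=1, embedding=[0.2, 0.1]),
    SimpleNamespace(index=0, embedding=[0.5, 0.3]),
    SimpleNamespace(index=2, embedding=[0.4, 0.9]),
]
texts = ["first string", "second string", "third string"]

# Map each embedding back to its input via the index field,
# rather than relying on response order
embeddings = {item.index: item.embedding for item in data}
for i, text in enumerate(texts):
    print(text, "->", embeddings[i])
```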
Embedding Models
| Model | Default Dimensions | Max Input (tokens) | MTEB Evaluation Performance |
|---|---|---|---|
| text-embedding-3-large | 3072 | 8192 | 64.6% |
| text-embedding-ada-002 | 1536 | 8192 | 61.0% |
Reducing Embedding Dimensions
Larger embedding vectors are often more expensive, consuming more compute, memory, and storage. You can shorten the embeddings by passing the dimensions parameter without the embedding losing its ability to represent concepts.
For example, a text-embedding-3-large embedding can be shortened to 256 dimensions while still outperforming a 1536-dimensional text-embedding-ada-002.
**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    dimensions=256  # Specify output dimensions
)

print(response.data[0].embedding)
```

**curl**
```shell
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Testing 123",
    "model": "text-embedding-3-large",
    "dimensions": 256
  }'
```

Manual Dimension Normalization
If you need to manually truncate and normalize the embedding vector:
```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def normalize_l2(x):
    """L2-normalize a 1-D vector or the rows of a 2-D matrix."""
    x = np.array(x)
    if x.ndim == 1:
        norm = np.linalg.norm(x)
        if norm == 0:
            return x
        return x / norm
    else:
        norm = np.linalg.norm(x, 2, axis=1, keepdims=True)
        return np.where(norm == 0, x, x / norm)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    encoding_format="float"
)

# Truncate to the first 256 dimensions, then re-normalize to unit length
cut_dim = response.data[0].embedding[:256]
norm_dim = normalize_l2(cut_dim)
print(norm_dim)
```

Use Cases
1. Text Search
Compute cosine similarity between the query’s embedding vector and each document’s embedding vector, then return the highest-scoring documents.
```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def search_documents(documents, query, n=3):
    """Return the n documents most similar to the query.

    In production, precompute and cache the document embeddings
    instead of re-embedding them on every query.
    """
    query_embedding = get_embedding(query)
    results = []
    for doc in documents:
        doc_embedding = get_embedding(doc)
        similarity = cosine_similarity(query_embedding, doc_embedding)
        results.append((doc, similarity))
    results.sort(key=lambda x: x[1], reverse=True)
    return results[:n]

# Example
documents = [
    "Python is a programming language",
    "Machine learning is fun",
    "The weather is nice today",
]
results = search_documents(documents, "programming")
print(results)
```

2. Embedding-based Q&A
Put the relevant document into the model’s context window for Q&A.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

# Assume relevant articles have been found through embedding search
relevant_article = "The gold medal for curling at the 2022 Winter Olympics was won by..."

query = f"""Answer the question using the following article. If you can't find the answer, write "I don't know."

Article:
\"\"\"
{relevant_article}
\"\"\"

Question: Which athletes won the curling gold medal at the 2022 Winter Olympics?
"""

response = client.chat.completions.create(
    messages=[
        {'role': 'system', 'content': 'You answer questions about the 2022 Winter Olympics.'},
        {'role': 'user', 'content': query},
    ],
    model="gpt-4o",
    temperature=0,
)

print(response.choices[0].message.content)
```

3. Clustering Analysis
Use embedding vectors to perform clustering and grouping of text.
```python
import numpy as np
from sklearn.cluster import KMeans

# Assume embeddings is a list of embedding vectors obtained from the API
embeddings = [...]

matrix = np.vstack(embeddings)

n_clusters = 4
kmeans = KMeans(
    n_clusters=n_clusters,
    init='k-means++',
    random_state=42
)
kmeans.fit(matrix)

# Cluster labels for each text
labels = kmeans.labels_
```

4. Recommendation System
Perform recommendations based on the similarity of embedding vectors.
```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def recommend_similar(items, source_index, n=3):
    """Returns the n most similar items to the source item."""
    embeddings = [get_embedding(item) for item in items]
    source_embedding = embeddings[source_index]
    similarities = []
    for i, emb in enumerate(embeddings):
        if i != source_index:
            # The embeddings are unit-normalized, so the dot product
            # equals the cosine similarity
            sim = np.dot(source_embedding, emb)
            similarities.append((i, items[i], sim))
    similarities.sort(key=lambda x: x[2], reverse=True)
    return similarities[:n]
```

5. Zero-Shot Classification
Classify text without any training data by comparing embeddings.
```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def classify_text(text, labels):
    """Assign the label whose embedding is most similar to the text's."""
    text_embedding = get_embedding(text)
    label_embeddings = [get_embedding(label) for label in labels]
    similarities = [cosine_similarity(text_embedding, le) for le in label_embeddings]
    best_index = np.argmax(similarities)
    return labels[best_index]

# Example
labels = ["positive", "negative", "neutral"]
result = classify_text("This product is amazing!", labels)
print(result)  # Output: positive
```

FAQs
How to calculate the number of tokens in a string?
Use OpenAI’s tokenizer, tiktoken:
```python
import tiktoken

def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

print(num_tokens_from_string("tiktoken is great!"))  # Output: 6
```

Use the cl100k_base encoding for third-generation embedding models like text-embedding-3-large.
How to quickly retrieve K nearest embedding vectors?
For fast searches among many vectors, use a vector database such as:
- AI Database (see the AI Database documentation)
- pgvector (see the PostgreSQL documentation)
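For smaller collections, an exact brute-force search with NumPy is often fast enough before reaching for a vector database. A sketch assuming the embeddings are unit-normalized, so a matrix-vector product yields cosine similarities directly (toy 2-D vectors here in place of real embeddings):

```python
import numpy as np

def top_k_similar(query: np.ndarray, matrix: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the row indices of the k vectors most similar to query.

    Assumes every row of matrix (and query) is L2-normalized, so the
    matrix-vector product gives cosine similarities.
    """
    scores = matrix @ query
    # argpartition selects the top k in O(n); then sort just those k
    top = np.argpartition(scores, -k)[-k:]
    return top[np.argsort(scores[top])[::-1]]

# Toy demo with 2-D unit vectors
matrix = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query = np.array([0.8, 0.6])
print(top_k_similar(query, matrix, k=2))  # indices of the 2 nearest rows
```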
Which distance function should be used?
Cosine similarity is recommended. OpenAI embeddings are normalized to length 1, meaning:
- Cosine similarity can be computed using only the dot product, making it faster
- Cosine similarity and Euclidean distance will yield the same ranking
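Both properties are easy to check numerically (toy unit vectors here, standing in for real normalized embeddings):

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

a = normalize(np.array([3.0, 4.0]))
b = normalize(np.array([1.0, 2.0]))

# For unit vectors, cosine similarity reduces to the plain dot product...
cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot = np.dot(a, b)
print(np.isclose(cos_sim, dot))  # True

# ...and squared Euclidean distance is a monotone function of it:
# ||a - b||^2 = 2 - 2 * (a . b), so both yield the same ranking
dist_sq = np.linalg.norm(a - b) ** 2
print(np.isclose(dist_sq, 2 - 2 * dot))  # True
```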