
Vector Embeddings

Obtain vector representations of given inputs that can be easily used by machine learning models and algorithms.

What are embedding vectors?

Embedding vectors are lists of floating-point numbers used to measure the relevance of text strings. The distance between two vectors measures their relevance: a small distance indicates high relevance, while a large distance indicates low relevance.

Embeddings are commonly used for:

  • Search - Results are ranked by their relevance to the query string
  • Clustering - Grouping text strings by similarity
  • Recommendation - Recommending items with relevant text strings
  • Anomaly Detection - Identifying outliers with lower relevance
  • Classification - Classifying text strings by their most similar labels
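The distance-as-relevance idea can be seen in a tiny sketch using hand-made 3-dimensional vectors (real model vectors have thousands of dimensions; these values are purely illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: near 1.0 means high relevance, near 0.0 means low."""
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy "embeddings" (not real model output)
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
car = [0.0, 0.2, 0.9]

print(cosine_similarity(cat, kitten))  # close to 1: high relevance
print(cosine_similarity(cat, car))     # much smaller: low relevance
```

The same comparison underlies all five use cases above: search, clustering, recommendation, anomaly detection, and classification all reduce to comparing vector distances.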

Quick Start

Install the SDK

```bash
pip install openai
```

Basic Example

Make sure to replace `YOUR_MODELVERSE_API_KEY` (or `$MODELVERSE_API_KEY` in the shell examples) with your own API key.

**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="text-embedding-3-large"
)
print(response.data[0].embedding)
```

**curl**

```bash
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-3-large"
  }'
```

The response contains the embedding vector (a list of floating-point numbers) along with some additional metadata. You can extract the embedding vector, store it in a vector database, and use it for many different use cases.

API Reference

POST https://api.umodelverse.ai/v1/embeddings

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `input` | string or array | Yes | Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings. Inputs must not exceed 8192 tokens and cannot be empty strings. The total token sum of all inputs in a single request can be up to 300,000. |
| `model` | string | Yes | The model ID to use, such as `text-embedding-3-large`. |
| `dimensions` | integer | No | The number of dimensions the output embedding vector should have. Supported only in text-embedding-3 and later model versions. |
| `encoding_format` | string | No | The format to return the embedding vector. Can be `float` or `base64`. Default: `float` |
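When `encoding_format` is `"base64"`, the returned embedding is assumed here to be the raw float32 values base64-encoded, matching how the official OpenAI SDK handles this format. A sketch of the round trip, using a hand-made vector rather than a real API response:

```python
import base64
import numpy as np

# Hand-made stand-in for an embedding vector (not real model output)
vector = np.array([0.0023064255, -0.009327292, -0.0028842222], dtype=np.float32)

# What the server is assumed to send back: float32 bytes, base64-encoded
encoded = base64.b64encode(vector.tobytes()).decode("ascii")

# Client-side decoding back into a float array
decoded = np.frombuffer(base64.b64decode(encoded), dtype=np.float32)
print(decoded)
```

The `base64` format is more compact on the wire than a JSON array of floats, at the cost of this extra decoding step.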

Response Example

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ..., -0.0028842222],
      "index": 0
    }
  ],
  "model": "text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

Response Field Descriptions

| Field | Type | Description |
| --- | --- | --- |
| `embedding` | array | The embedding vector, a list of floating-point numbers. The length of the vector depends on the model. |
| `index` | integer | Index of the embedding in the list of embeddings. |
| `object` | string | The type of the object, always `"embedding"`. |
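Putting the fields together, a response payload of this shape can be unpacked as plain JSON. The sketch below uses a hard-coded sample mirroring the example above (values illustrative) rather than a live call:

```python
import json

# Sample payload mirroring the response example above
payload = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "embedding": [0.0023064255, -0.009327292, -0.0028842222], "index": 0}
  ],
  "model": "text-embedding-3-large",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
""")

embedding = payload["data"][0]["embedding"]
print(len(embedding))                      # number of dimensions in this sample
print(payload["usage"]["total_tokens"])    # tokens consumed by the request
```

With a batch request, `data` contains one entry per input; the `index` field maps each entry back to its position in the input array.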

Embedding Models

| Model | Default Dimensions | Max Input | MTEB Evaluation Performance |
| --- | --- | --- | --- |
| text-embedding-3-large | 3072 | 8192 | 64.6% |
| text-embedding-ada-002 | 1536 | 8192 | 61.0% |

Reducing Embedding Dimensions

Larger embedding vectors cost more to use and consume more compute, memory, and storage. You can shorten the output by passing the `dimensions` parameter without losing the embedding's conceptual representation properties.

For example, a text-embedding-3-large embedding can be shortened to 256 dimensions while still outperforming a 1536-dimensional text-embedding-ada-002.

**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    dimensions=256  # Specify output dimensions
)
print(response.data[0].embedding)
```

**curl**

```bash
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Testing 123",
    "model": "text-embedding-3-large",
    "dimensions": 256
  }'
```

Manual Dimension Normalization

If you need to manually truncate and normalize the embedding vector:

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def normalize_l2(x):
    """L2-normalize a vector (or each row of a matrix), leaving zero vectors unchanged."""
    x = np.array(x)
    if x.ndim == 1:
        norm = np.linalg.norm(x)
        if norm == 0:
            return x
        return x / norm
    else:
        norm = np.linalg.norm(x, 2, axis=1, keepdims=True)
        return np.where(norm == 0, x, x / norm)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    encoding_format="float"
)

# Truncate to 256 dimensions, then re-normalize to unit length
cut_dim = response.data[0].embedding[:256]
norm_dim = normalize_l2(cut_dim)
print(norm_dim)
```

Use Cases

1. Semantic Search

Compute cosine similarity between the query's embedding and each document's embedding, and return the highest-scoring documents.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def search_documents(documents, query, n=3):
    """Returns the n documents most similar to the query."""
    query_embedding = get_embedding(query)
    results = []
    for doc in documents:
        doc_embedding = get_embedding(doc)
        similarity = cosine_similarity(query_embedding, doc_embedding)
        results.append((doc, similarity))
    results.sort(key=lambda x: x[1], reverse=True)
    return results[:n]

# Example
documents = [
    "Python is a programming language",
    "Machine learning is fun",
    "The weather is nice today"
]
results = search_documents(documents, "programming")
print(results)
```

2. Embedding-based Q&A

Put the relevant document into the model’s context window for Q&A.

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

# Assume a relevant article has been found through embedding search
relevant_article = "The gold medal for curling at the 2022 Winter Olympics was won by..."

query = f"""Answer the question using the following article. If you can't find the answer, write "I don't know."

Article:
\"\"\"
{relevant_article}
\"\"\"

Question: Which athletes won the curling gold medal at the 2022 Winter Olympics?"""

response = client.chat.completions.create(
    messages=[
        {'role': 'system', 'content': 'You answer questions about the 2022 Winter Olympics.'},
        {'role': 'user', 'content': query},
    ],
    model="gpt-4o",
    temperature=0,
)
print(response.choices[0].message.content)
```

3. Clustering Analysis

Use embedding vectors to perform clustering and grouping of text.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assume embeddings is a list of embedding vectors obtained from the API
embeddings = [...]

matrix = np.vstack(embeddings)

n_clusters = 4
kmeans = KMeans(
    n_clusters=n_clusters,
    init='k-means++',
    random_state=42
)
kmeans.fit(matrix)

# Cluster label for each text
labels = kmeans.labels_
```
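To see the clustering step run end to end without API calls, the same KMeans recipe can be applied to synthetic "embeddings" drawn from two well-separated groups (stand-ins for real model output):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Synthetic stand-ins for embedding vectors: two well-separated groups of 10
group_a = rng.normal(loc=0.0, scale=0.05, size=(10, 8))
group_b = rng.normal(loc=1.0, scale=0.05, size=(10, 8))
matrix = np.vstack([group_a, group_b])

kmeans = KMeans(n_clusters=2, init='k-means++', n_init=10, random_state=42)
kmeans.fit(matrix)
labels = kmeans.labels_

print(labels)  # the first 10 points share one label, the last 10 the other
```

With real embeddings, the cluster labels can then be joined back to the original texts to inspect what each cluster has in common.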

4. Recommendation System

Perform recommendations based on the similarity of embedding vectors.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def recommend_similar(items, source_index, n=3):
    """Returns the n most similar items to the source item."""
    embeddings = [get_embedding(item) for item in items]
    source_embedding = embeddings[source_index]

    similarities = []
    for i, emb in enumerate(embeddings):
        if i != source_index:
            sim = np.dot(source_embedding, emb)
            similarities.append((i, items[i], sim))

    similarities.sort(key=lambda x: x[2], reverse=True)
    return similarities[:n]
```
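The ranking logic can be exercised without API calls by substituting hand-made unit vectors for the embeddings (hypothetical values, for illustration only):

```python
import numpy as np

def recommend_from_embeddings(items, embeddings, source_index, n=3):
    """Same ranking as recommend_similar, but on precomputed vectors."""
    source = embeddings[source_index]
    similarities = [
        (i, items[i], float(np.dot(source, emb)))
        for i, emb in enumerate(embeddings)
        if i != source_index
    ]
    similarities.sort(key=lambda x: x[2], reverse=True)
    return similarities[:n]

items = ["red apple", "green apple", "sports car"]
# Hand-made unit vectors standing in for real embeddings
embeddings = [
    np.array([1.0, 0.0]),
    np.array([0.95, 0.312]),   # points nearly the same way as "red apple"
    np.array([0.0, 1.0]),      # orthogonal to "red apple"
]

top = recommend_from_embeddings(items, embeddings, source_index=0, n=2)
print(top[0][1])  # "green apple" ranks first
```

Precomputing and caching embeddings like this is also how a production recommender would work: items are embedded once and stored, not re-embedded on every request.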

5. Zero-Shot Classification

Use embeddings to classify text without any training data, by comparing the text against embeddings of the label names themselves.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def classify_text(text, labels):
    """Assigns the label whose embedding is most similar to the text's embedding."""
    text_embedding = get_embedding(text)
    label_embeddings = [get_embedding(label) for label in labels]
    similarities = [cosine_similarity(text_embedding, le) for le in label_embeddings]
    best_index = np.argmax(similarities)
    return labels[best_index]

# Example
labels = ["positive", "negative", "neutral"]
result = classify_text("This product is amazing!", labels)
print(result)  # Output: positive
```

FAQs

How to calculate the number of tokens in a string?

Use OpenAI’s tokenizer library, tiktoken:

```python
import tiktoken

def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(string))

print(num_tokens_from_string("tiktoken is great!"))  # Output: 6
```

Use the `cl100k_base` encoding for third-generation embedding models such as `text-embedding-3-large`.

How to quickly retrieve K nearest embedding vectors?

For fast searches among many vectors, it's recommended to use a dedicated vector database rather than comparing against every vector in application code.

Which distance function should be used?

Cosine similarity is recommended. OpenAI embeddings are normalized to length 1, meaning:

  • Cosine similarity can be computed using only the dot product, making it faster
  • Cosine similarity and Euclidean distance will yield the same ranking
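Both points can be checked numerically. A quick sketch with randomly generated unit vectors (stand-ins for normalized embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    """Scale a vector to length 1."""
    return v / np.linalg.norm(v)

# Unit-length stand-ins for normalized embeddings
q = unit(rng.normal(size=16))
docs = [unit(rng.normal(size=16)) for _ in range(5)]

dots = [float(np.dot(q, d)) for d in docs]
cosines = [float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d))) for d in docs]
dists = [float(np.linalg.norm(q - d)) for d in docs]

# For unit vectors the dot product IS the cosine similarity...
print(np.allclose(dots, cosines))  # True

# ...and ranking by descending similarity matches ranking by ascending distance,
# since ||q - d||^2 = 2 - 2(q . d) for unit vectors
print(np.argsort(dots)[::-1].tolist() == np.argsort(dists).tolist())  # True
```

This is why skipping the normalization terms is safe for these embeddings: the dot product gives the same scores, and any distance-based ranking comes out identical.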