
Vector Embeddings

Obtain vector representations of given inputs that can be easily used by machine learning models and algorithms.

What are embedding vectors?

Embedding vectors are lists of floating-point numbers used to measure the relevance of text strings. The distance between two vectors measures their relevance: a small distance indicates high relevance, while a large distance indicates low relevance.

Embeddings are commonly used for:

  • Search - Results are ranked by their relevance to the query string
  • Clustering - Grouping text strings by similarity
  • Recommendation - Recommending items with relevant text strings
  • Anomaly Detection - Identifying outliers with lower relevance
  • Classification - Classifying text strings by their most similar labels
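The distance-as-relevance idea can be seen in a tiny sketch using hand-made 3-dimensional vectors (real model vectors have thousands of dimensions; these values are purely illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: near 1.0 means high relevance, near 0.0 means low."""
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy "embeddings" (not real model output)
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
car = [0.0, 0.2, 0.9]

print(cosine_similarity(cat, kitten))  # close to 1: high relevance
print(cosine_similarity(cat, car))     # much smaller: low relevance
```

The same comparison underlies all five use cases above: search, clustering, recommendation, anomaly detection, and classification all reduce to comparing vector distances.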

Quick Start

Install the SDK

```bash
pip install openai
```

Basic Example

Make sure to replace `YOUR_MODELVERSE_API_KEY` (or `$MODELVERSE_API_KEY` in the shell examples) with your own API key.

**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="text-embedding-3-large"
)
print(response.data[0].embedding)
```

**curl**

```bash
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-3-large"
  }'
```

The response contains the embedding vector (a list of floating-point numbers) along with some additional metadata. You can extract the embedding vector, store it in a vector database, and use it for many different use cases.

API Reference

POST https://api.umodelverse.ai/v1/embeddings

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `input` | string or array | Yes | Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings. Inputs must not exceed 8192 tokens and cannot be empty strings. The total token sum of all inputs in a single request can be up to 300,000. |
| `model` | string | Yes | The model ID to use, such as `text-embedding-3-large`. |
| `dimensions` | integer | No | The number of dimensions the output embedding vector should have. Supported only in text-embedding-3 and later model versions. |
| `encoding_format` | string | No | The format to return the embedding vector. Can be `float` or `base64`. Default: `float` |
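When `encoding_format` is `"base64"`, the returned embedding is assumed here to be the raw float32 values base64-encoded, matching how the official OpenAI SDK handles this format. A sketch of the round trip, using a hand-made vector rather than a real API response:

```python
import base64
import numpy as np

# Hand-made stand-in for an embedding vector (not real model output)
vector = np.array([0.0023064255, -0.009327292, -0.0028842222], dtype=np.float32)

# What the server is assumed to send back: float32 bytes, base64-encoded
encoded = base64.b64encode(vector.tobytes()).decode("ascii")

# Client-side decoding back into a float array
decoded = np.frombuffer(base64.b64decode(encoded), dtype=np.float32)
print(decoded)
```

The `base64` format is more compact on the wire than a JSON array of floats, at the cost of this extra decoding step.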

Response Example

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ..., -0.0028842222],
      "index": 0
    }
  ],
  "model": "text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

Response Field Descriptions

| Field | Type | Description |
| --- | --- | --- |
| `embedding` | array | The embedding vector, a list of floating-point numbers. The length of the vector depends on the model. |
| `index` | integer | Index of the embedding in the list of embeddings. |
| `object` | string | The type of the object, always `"embedding"`. |
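Putting the fields together, a response payload of this shape can be unpacked as plain JSON. The sketch below uses a hard-coded sample mirroring the example above (values illustrative) rather than a live call:

```python
import json

# Sample payload mirroring the response example above
payload = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "embedding": [0.0023064255, -0.009327292, -0.0028842222], "index": 0}
  ],
  "model": "text-embedding-3-large",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
""")

embedding = payload["data"][0]["embedding"]
print(len(embedding))                      # number of dimensions in this sample
print(payload["usage"]["total_tokens"])    # tokens consumed by the request
```

With a batch request, `data` contains one entry per input; the `index` field maps each entry back to its position in the input array.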

Embedding Models

| Model | Default Dimensions | Max Input | MTEB Evaluation Performance |
| --- | --- | --- | --- |
| text-embedding-3-large | 3072 | 8192 | 64.6% |
| text-embedding-ada-002 | 1536 | 8192 | 61.0% |

Reducing Embedding Dimensions

Larger embedding vectors cost more to use and consume more compute, memory, and storage. You can shorten the output by passing the `dimensions` parameter without losing the embedding's conceptual representation properties.

For example, a text-embedding-3-large embedding can be shortened to 256 dimensions while still outperforming a 1536-dimensional text-embedding-ada-002.

**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    dimensions=256  # Specify output dimensions
)
print(response.data[0].embedding)
```

**curl**

```bash
curl https://api.umodelverse.ai/v1/embeddings \
  -H "Authorization: Bearer $MODELVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Testing 123",
    "model": "text-embedding-3-large",
    "dimensions": 256
  }'
```

Manual Dimension Normalization

If you need to manually truncate and normalize the embedding vector:

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def normalize_l2(x):
    """L2-normalize a vector (or each row of a matrix), leaving zero vectors unchanged."""
    x = np.array(x)
    if x.ndim == 1:
        norm = np.linalg.norm(x)
        if norm == 0:
            return x
        return x / norm
    else:
        norm = np.linalg.norm(x, 2, axis=1, keepdims=True)
        return np.where(norm == 0, x, x / norm)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Testing 123",
    encoding_format="float"
)

# Truncate to 256 dimensions, then re-normalize to unit length
cut_dim = response.data[0].embedding[:256]
norm_dim = normalize_l2(cut_dim)
print(norm_dim)
```

Use Cases

1. Semantic Search

Compute cosine similarity between the query's embedding and each document's embedding, and return the highest-scoring documents.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def search_documents(documents, query, n=3):
    """Returns the n documents most similar to the query."""
    query_embedding = get_embedding(query)
    results = []
    for doc in documents:
        doc_embedding = get_embedding(doc)
        similarity = cosine_similarity(query_embedding, doc_embedding)
        results.append((doc, similarity))
    results.sort(key=lambda x: x[1], reverse=True)
    return results[:n]

# Example
documents = [
    "Python is a programming language",
    "Machine learning is fun",
    "The weather is nice today"
]
results = search_documents(documents, "programming")
print(results)
```

2. Embedding-based Q&A

Put the relevant document into the model’s context window for Q&A.

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

# Assume a relevant article has been found through embedding search
relevant_article = "The gold medal for curling at the 2022 Winter Olympics was won by..."

query = f"""Answer the question using the following article. If you can't find the answer, write "I don't know."

Article:
\"\"\"
{relevant_article}
\"\"\"

Question: Which athletes won the curling gold medal at the 2022 Winter Olympics?"""

response = client.chat.completions.create(
    messages=[
        {'role': 'system', 'content': 'You answer questions about the 2022 Winter Olympics.'},
        {'role': 'user', 'content': query},
    ],
    model="gpt-4o",
    temperature=0,
)
print(response.choices[0].message.content)
```

3. Clustering Analysis

Use embedding vectors to perform clustering and grouping of text.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assume embeddings is a list of embedding vectors obtained from the API
embeddings = [...]

matrix = np.vstack(embeddings)

n_clusters = 4
kmeans = KMeans(
    n_clusters=n_clusters,
    init='k-means++',
    random_state=42
)
kmeans.fit(matrix)

# Cluster label for each text
labels = kmeans.labels_
```
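To see the clustering step run end to end without API calls, the same KMeans recipe can be applied to synthetic "embeddings" drawn from two well-separated groups (stand-ins for real model output):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Synthetic stand-ins for embedding vectors: two well-separated groups of 10
group_a = rng.normal(loc=0.0, scale=0.05, size=(10, 8))
group_b = rng.normal(loc=1.0, scale=0.05, size=(10, 8))
matrix = np.vstack([group_a, group_b])

kmeans = KMeans(n_clusters=2, init='k-means++', n_init=10, random_state=42)
kmeans.fit(matrix)
labels = kmeans.labels_

print(labels)  # the first 10 points share one label, the last 10 the other
```

With real embeddings, the cluster labels can then be joined back to the original texts to inspect what each cluster has in common.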

4. Recommendation System

Perform recommendations based on the similarity of embedding vectors.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def recommend_similar(items, source_index, n=3):
    """Returns the n most similar items to the source item."""
    embeddings = [get_embedding(item) for item in items]
    source_embedding = embeddings[source_index]

    similarities = []
    for i, emb in enumerate(embeddings):
        if i != source_index:
            sim = np.dot(source_embedding, emb)
            similarities.append((i, items[i], sim))

    similarities.sort(key=lambda x: x[2], reverse=True)
    return similarities[:n]
```
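The ranking logic can be exercised without API calls by substituting hand-made unit vectors for the embeddings (hypothetical values, for illustration only):

```python
import numpy as np

def recommend_from_embeddings(items, embeddings, source_index, n=3):
    """Same ranking as recommend_similar, but on precomputed vectors."""
    source = embeddings[source_index]
    similarities = [
        (i, items[i], float(np.dot(source, emb)))
        for i, emb in enumerate(embeddings)
        if i != source_index
    ]
    similarities.sort(key=lambda x: x[2], reverse=True)
    return similarities[:n]

items = ["red apple", "green apple", "sports car"]
# Hand-made unit vectors standing in for real embeddings
embeddings = [
    np.array([1.0, 0.0]),
    np.array([0.95, 0.312]),   # points nearly the same way as "red apple"
    np.array([0.0, 1.0]),      # orthogonal to "red apple"
]

top = recommend_from_embeddings(items, embeddings, source_index=0, n=2)
print(top[0][1])  # "green apple" ranks first
```

Precomputing and caching embeddings like this is also how a production recommender would work: items are embedded once and stored, not re-embedded on every request.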

5. Zero-Shot Classification

Use embeddings to classify text without any training data, by comparing the text against embeddings of the label names themselves.

```python
from openai import OpenAI
import numpy as np

client = OpenAI(
    api_key="YOUR_MODELVERSE_API_KEY",
    base_url="https://api.umodelverse.ai/v1"
)

def get_embedding(text, model="text-embedding-3-large"):
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def classify_text(text, labels):
    """Assigns the label whose embedding is most similar to the text's embedding."""
    text_embedding = get_embedding(text)
    label_embeddings = [get_embedding(label) for label in labels]
    similarities = [cosine_similarity(text_embedding, le) for le in label_embeddings]
    best_index = np.argmax(similarities)
    return labels[best_index]

# Example
labels = ["positive", "negative", "neutral"]
result = classify_text("This product is amazing!", labels)
print(result)  # Output: positive
```

FAQs

How to calculate the number of tokens in a string?

Use OpenAI’s tokenizer library, tiktoken:

```python
import tiktoken

def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(string))

print(num_tokens_from_string("tiktoken is great!"))  # Output: 6
```

Use the `cl100k_base` encoding for third-generation embedding models such as `text-embedding-3-large`.

How to quickly retrieve K nearest embedding vectors?

For fast searches among many vectors, it's recommended to use a dedicated vector database rather than comparing against every vector in application code.

Which distance function should be used?

Cosine similarity is recommended. OpenAI embeddings are normalized to length 1, meaning:

  • Cosine similarity can be computed using only the dot product, making it faster
  • Cosine similarity and Euclidean distance will yield the same ranking
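Both points can be checked numerically. A quick sketch with randomly generated unit vectors (stand-ins for normalized embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    """Scale a vector to length 1."""
    return v / np.linalg.norm(v)

# Unit-length stand-ins for normalized embeddings
q = unit(rng.normal(size=16))
docs = [unit(rng.normal(size=16)) for _ in range(5)]

dots = [float(np.dot(q, d)) for d in docs]
cosines = [float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d))) for d in docs]
dists = [float(np.linalg.norm(q - d)) for d in docs]

# For unit vectors the dot product IS the cosine similarity...
print(np.allclose(dots, cosines))  # True

# ...and ranking by descending similarity matches ranking by ascending distance,
# since ||q - d||^2 = 2 - 2(q . d) for unit vectors
print(np.argsort(dots)[::-1].tolist() == np.argsort(dists).tolist())  # True
```

This is why skipping the normalization terms is safe for these embeddings: the dot product gives the same scores, and any distance-based ranking comes out identical.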