
OpenAI Chat Completions Documentation

Important Note: Our service is fully compatible with the OpenAI API standard, so we strongly recommend referring to the official OpenAI API documentation for the most comprehensive and up-to-date parameter details and examples. This lets you leverage OpenAI's extensive resources, such as tutorials and SDKs. Below is a simplified interface overview covering the core fields and their usage; for advanced features or recent changes, consult the official documentation first. We have annotated each field's meaning to make this quick-start guide more practical.

Overview

The /v1/chat/completions endpoint generates model responses from a list of dialog messages and supports text, image, and audio inputs, making it suitable for chat, content generation, and similar scenarios. Streaming responses are supported. Request method: POST. Endpoint: https://api.umodelverse.ai/v1/chat/completions (compatible with the OpenAI format).

Authentication: Pass an API key via the Authorization: Bearer {api_key} header. Note: some parameters are specific to certain models (e.g., reasoning_effort for reasoning models). Avoid deprecated parameters such as functions; use tools instead.

Core Fields

Request Parameters

| Field | Type | Required | Default | Meaning and Description |
| --- | --- | --- | --- | --- |
| messages | array | Yes | None | List of dialog messages. Each message comprises a role (system/user/assistant) and content (text/image/audio). Defines the dialog context from which the model generates a response. Example: [{"role": "user", "content": "Hello!"}]. Supports multimodal input. |
| model | string | Yes | None | Model ID, such as gpt-4o. Specifies the model used to generate responses. See /v1/models for the list of available models. |
| frequency_penalty | number | No | 0 | Frequency penalty (-2.0 to 2.0). Reduces repetition of tokens, increasing output diversity. |
| logit_bias | map | No | None | Token bias map. Adjusts the generation probability of specific tokens (e.g., forbidding certain words). |
| logprobs | boolean | No | false | Whether to return token log probabilities. Useful for analyzing model confidence. |
| max_completion_tokens | integer | No | None | Maximum number of completion tokens (including reasoning tokens). Controls response length to prevent overly long outputs. |
| max_tokens | integer | No | None | Maximum number of tokens (deprecated). Similar to max_completion_tokens; used by older models. |
| n | integer | No | 1 | Number of completions to generate. Returns multiple alternative responses, which increases token consumption. |
| presence_penalty | number | No | 0 | Presence penalty (-2.0 to 2.0). Encourages new topics and discourages repetition. |
| response_format | object | No | None | Output format. For example, {"type": "json_schema"} ensures structured JSON output. |
| seed | integer | No | None | Random seed. Makes responses more deterministic (repeated requests tend to yield the same result). |
| stop | string/array | No | None | Stop sequences. Generation stops when one is encountered (e.g., "END"). |
| stream | boolean | No | false | Whether to stream the response. Returns chunks in real time for interactive applications. |
| temperature | number | No | 1 | Sampling temperature (0 to 2). Controls randomness: higher values are more creative, lower values more deterministic. |
| tool_choice | string/object | No | auto (if tools are provided) | Tool selection strategy. For example, auto lets the model decide whether to call a tool. |
| tools | array | No | None | List of available tools. Enables function calling or built-in tools (such as web search). |
| top_p | number | No | 1 | Nucleus sampling (0 to 1). Controls diversity; adjust this or temperature, but not both. |
| user | string | No | None | User identifier. Used for monitoring and abuse detection. |
  • Other fields: metadata (for storing additional information), modalities (output types, such as ["text", "audio"]), etc. See the official documentation for the complete list.
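To make the parameter table concrete, here is a sketch of a request body that combines several of the optional fields. The model ID and parameter values are illustrative choices, not recommendations.

```python
import json

# Hypothetical request body combining several optional parameters from the
# table above; "gpt-4o" is used only as an example model ID.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize nucleus sampling in one sentence."},
    ],
    "temperature": 0.2,            # low temperature -> more deterministic output
    "max_completion_tokens": 128,  # cap the response length
    "stop": ["END"],               # generation halts at this sequence
    "n": 1,                        # a single completion
}

body = json.dumps(payload)  # this string becomes the POST body
```

Any field with a default (n, temperature, and so on) can simply be omitted; only model and messages are required.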

Response Fields

| Field | Type | Meaning and Description |
| --- | --- | --- |
| choices | array | List of completion choices. Each choice contains an index, a message (the response content), and a finish_reason. |
| created | integer | Creation timestamp (Unix seconds) indicating when the response was generated. |
| id | string | Response ID. Uniquely identifies the completion. |
| model | string | The model actually used to generate the response. |
| object | string | Object type: chat.completion. Identifies the response type. |
| service_tier | string | Service tier. If specified in the request, returns the tier actually used. |
| system_fingerprint | string | System fingerprint. Used to monitor backend changes that can influence determinism. |
| usage | object | Usage statistics: prompt_tokens, completion_tokens, and total_tokens, used for billing. |
  • Streaming responses: returned as a sequence of chunks; each chunk's object is chat.completion.chunk and includes a delta (incremental content). The stream ends with [DONE].
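The response fields above can be extracted with plain dictionary access. The following sketch uses a mocked non-streaming response; the values are illustrative, not real API output.

```python
# Mocked non-streaming response shaped like the field table above.
# All values here are made up for illustration.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 8, "total_tokens": 17},
}

# The assistant's reply lives at choices[0].message.content.
answer = response["choices"][0]["message"]["content"]
# total_tokens from usage is the number billed for this call.
tokens_billed = response["usage"]["total_tokens"]
```

Checking finish_reason (e.g., "stop" vs. "length") tells you whether the model finished naturally or was cut off by a token limit.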

Documentation Usage

Basic Process

  1. Build Request: Prepare the messages array, ensuring roles are correct.
  2. Send Request: Use HTTP POST with the API key.
  3. Parse Response: Extract message.content from choices.
  4. Stream Handling: If stream=true, read delta.content from each chunk sequentially.
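Steps 1-3 of the basic process can be sketched without the SDK, using only the standard library. The build_request and chat helper names are our own, and {api_key} is a placeholder as in the examples below.

```python
import json
import urllib.request

API_KEY = "{api_key}"  # placeholder; substitute your real key
URL = "https://api.umodelverse.ai/v1/chat/completions"


def build_request(model, user_text):
    """Step 1: build the messages array with a correct role."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }


def chat(model, user_text):
    """Steps 2-3: POST with the API key, then extract message.content."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(build_request(model, user_text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

For step 4 (streaming), the SDK example further below is the simpler route, since it handles the chunked [DONE]-terminated stream for you.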

Example (Curl, Non-Streaming)

```shell
curl https://api.umodelverse.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {api_key}" \
  -d '{
    "model": "{model_name}",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Example (Python, Streaming)

```python
import openai

client = openai.OpenAI(
    api_key="{api_key}",
    base_url="https://api.umodelverse.ai/v1/",
)

stream = client.chat.completions.create(
    model="{model_name}",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    # Some chunks (e.g., the final usage chunk) may carry no choices or an
    # empty delta, so guard before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

For more examples and advanced usage, please refer directly to the official OpenAI documentation.