
OpenAI Chat Completions Documentation

Important Note: Our service is fully compatible with the OpenAI API standard, so we strongly recommend referring to the official OpenAI API documentation for the most comprehensive and up-to-date parameter details and examples. This lets you leverage OpenAI's extensive resources, such as tutorials and SDKs. Below is a simplified interface overview covering the core fields and their usage; for advanced features or recent changes, consult the official documentation first. We have annotated each field's meaning to make this quick-start guide more practical.

Overview

The /v1/chat/completions endpoint generates model responses from a list of dialog messages and supports text, image, and audio inputs, making it suitable for chat, content generation, and similar scenarios. Streaming responses are supported. Request method: POST. Endpoint: https://api.umodelverse.ai/v1/chat/completions (compatible with the OpenAI format).

Authentication: Pass an API key via the Authorization: Bearer {api_key} header. Note: some parameters are specific to certain models (e.g., reasoning_effort for reasoning models). Avoid deprecated parameters such as functions; use tools instead.

Core Fields

Request Parameters

| Field | Type | Required | Default | Meaning and Description |
| --- | --- | --- | --- | --- |
| messages | array | Yes | None | List of dialog messages. Each message comprises a role (system/user/assistant) and content (text/image/audio). Defines the dialog context from which the model generates a response. Example: [{"role": "user", "content": "Hello!"}]. Supports multimodal input. |
| model | string | Yes | None | Model ID, such as gpt-4o. Specifies the model used to generate responses. See /v1/models for the list of available models. |
| frequency_penalty | number | No | 0 | Frequency penalty (-2.0 to 2.0). Reduces repetition of tokens, increasing output diversity. |
| logit_bias | map | No | None | Token bias map. Adjusts the generation probability of specific tokens (e.g., forbidding certain words). |
| logprobs | boolean | No | false | Whether to return token log probabilities. Useful for analyzing model confidence. |
| max_completion_tokens | integer | No | None | Maximum number of completion tokens (including reasoning tokens). Controls response length to prevent overly long outputs. |
| max_tokens | integer | No | None | Maximum number of tokens (deprecated). Similar to max_completion_tokens; used by older models. |
| n | integer | No | 1 | Number of completions to generate. Returns multiple alternative responses, which increases token consumption. |
| presence_penalty | number | No | 0 | Presence penalty (-2.0 to 2.0). Encourages new topics and discourages repetition. |
| response_format | object | No | None | Output format. For example, {"type": "json_schema"} ensures structured JSON output. |
| seed | integer | No | None | Random seed. Makes responses more deterministic (repeated requests tend to yield the same result). |
| stop | string/array | No | None | Stop sequences. Generation stops when one is encountered (e.g., "END"). |
| stream | boolean | No | false | Whether to stream the response. Returns chunks in real time for interactive applications. |
| temperature | number | No | 1 | Sampling temperature (0 to 2). Controls randomness: higher values are more creative, lower values more deterministic. |
| tool_choice | string/object | No | auto (if tools are provided) | Tool selection strategy. For example, auto lets the model decide whether to call a tool. |
| tools | array | No | None | List of available tools. Enables function calling or built-in tools (such as web search). |
| top_p | number | No | 1 | Nucleus sampling (0 to 1). Controls diversity; adjust this or temperature, but not both. |
| user | string | No | None | User identifier. Used for monitoring and abuse detection. |
  • Other fields: metadata (for storing additional information), modalities (output types, such as ["text", "audio"]), etc. See the official documentation for the complete list.
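To make the parameter table concrete, here is a sketch of a request body that combines several of the optional fields. The model ID and parameter values are illustrative choices, not recommendations.

```python
import json

# Hypothetical request body combining several optional parameters from the
# table above; "gpt-4o" is used only as an example model ID.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize nucleus sampling in one sentence."},
    ],
    "temperature": 0.2,            # low temperature -> more deterministic output
    "max_completion_tokens": 128,  # cap the response length
    "stop": ["END"],               # generation halts at this sequence
    "n": 1,                        # a single completion
}

body = json.dumps(payload)  # this string becomes the POST body
```

Any field with a default (n, temperature, and so on) can simply be omitted; only model and messages are required.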

Response Fields

| Field | Type | Meaning and Description |
| --- | --- | --- |
| choices | array | List of completion choices. Each choice contains an index, a message (the response content), and a finish_reason. |
| created | integer | Creation timestamp (Unix seconds) indicating when the response was generated. |
| id | string | Response ID. Uniquely identifies the completion. |
| model | string | The model actually used to generate the response. |
| object | string | Object type: chat.completion. Identifies the response type. |
| service_tier | string | Service tier. If specified in the request, returns the tier actually used. |
| system_fingerprint | string | System fingerprint. Used to monitor backend changes that can influence determinism. |
| usage | object | Usage statistics: prompt_tokens, completion_tokens, and total_tokens, used for billing. |
  • Streaming responses: returned as a sequence of chunks; each chunk's object is chat.completion.chunk and includes a delta (incremental content). The stream ends with [DONE].
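The response fields above can be extracted with plain dictionary access. The following sketch uses a mocked non-streaming response; the values are illustrative, not real API output.

```python
# Mocked non-streaming response shaped like the field table above.
# All values here are made up for illustration.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 8, "total_tokens": 17},
}

# The assistant's reply lives at choices[0].message.content.
answer = response["choices"][0]["message"]["content"]
# total_tokens from usage is the number billed for this call.
tokens_billed = response["usage"]["total_tokens"]
```

Checking finish_reason (e.g., "stop" vs. "length") tells you whether the model finished naturally or was cut off by a token limit.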

Documentation Usage

Basic Process

  1. Build Request: Prepare the messages array, ensuring roles are correct.
  2. Send Request: Use HTTP POST with the API key.
  3. Parse Response: Extract message.content from choices.
  4. Stream Handling: If stream=true, read delta.content from each chunk sequentially.
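Steps 1-3 of the basic process can be sketched without the SDK, using only the standard library. The build_request and chat helper names are our own, and {api_key} is a placeholder as in the examples below.

```python
import json
import urllib.request

API_KEY = "{api_key}"  # placeholder; substitute your real key
URL = "https://api.umodelverse.ai/v1/chat/completions"


def build_request(model, user_text):
    """Step 1: build the messages array with a correct role."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }


def chat(model, user_text):
    """Steps 2-3: POST with the API key, then extract message.content."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(build_request(model, user_text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

For step 4 (streaming), the SDK example further below is the simpler route, since it handles the chunked [DONE]-terminated stream for you.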

Example (Curl, Non-Streaming)

```shell
curl https://api.umodelverse.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {api_key}" \
  -d '{
    "model": "{model_name}",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Example (Python, Streaming)

```python
import openai

client = openai.OpenAI(
    api_key="{api_key}",
    base_url="https://api.umodelverse.ai/v1/",
)

stream = client.chat.completions.create(
    model="{model_name}",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    # Some chunks (e.g., the final usage chunk) may carry no choices or an
    # empty delta, so guard before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

For more examples and advanced usage, please refer directly to the official OpenAI documentation.