Usage Guide

Table of Contents

  1. Basic Usage
    1. Python Client
    2. curl Commands
  2. Test Client Usage
  3. Response Handling
    1. Synchronous Mode
    2. Asynchronous Mode
  4. Best Practices

Basic Usage

Python Client

from openai import OpenAI

client = OpenAI(
    api_key="dummy_openai_api_key",  # Any string works as the key
    base_url="http://localhost:8080/v1"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

curl Commands

Chat completion request:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Check batch status:

curl http://localhost:8080/v1/batches/batch_123

List all batches:

curl http://localhost:8080/v1/batches

Test Client Usage

The included Python test client, client.py, makes it easy to exercise the API from the command line:

# Send a chat completion request
python client.py --api chat_completions --content "Write a joke"

# Check specific batch status
python client.py --api status_single_batch --batch_id batch_123

# List all batches
python client.py --api status_all_batches

# List only completed batches
python client.py --api status_all_batches --status_filter completed

Response Handling

Synchronous Mode

# Blocks until the response is ready
response = client.chat.completions.create(...)
print(response.choices[0].message.content)

Asynchronous Mode

# Returns immediately with a batch ID
response = client.chat.completions.create(...)
batch_id = response.id

# Check status later
status = client.batches.retrieve(batch_id)
print(f"Status: {status.batch.status}")
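For longer-running batches, a small polling helper keeps the status check from being rewritten at every call site. This is an illustrative sketch, not part of the server or test client: wait_for_batch and its parameters are hypothetical names, and the terminal status strings are assumptions based on the completed filter shown earlier plus the standard OpenAI batch lifecycle.

```python
import time

def wait_for_batch(fetch_status, poll_interval=2.0, timeout=120.0):
    """Poll fetch_status() until the batch reaches a terminal status.

    fetch_status is any zero-argument callable returning the current
    batch status string. Raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        # Assumed terminal statuses; adjust to what your server reports
        if status in ("completed", "failed", "cancelled", "expired"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("batch did not reach a terminal status in time")
```

With the client above it could be used as:

final = wait_for_batch(lambda: client.batches.retrieve(batch_id).batch.status)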

Best Practices

  1. Request Batching
    • Group similar requests together so they can share a batch
    • Pick a batch window size that matches your latency requirements
    • Size batches according to your expected request volume
  2. Error Handling
    try:
        response = client.chat.completions.create(...)
    except Exception as e:
        # Log and handle the failure; in production, catch the specific
        # exception types you expect rather than bare Exception
        print(f"Error: {e}")
    
  3. Monitoring
    • Use the batch monitor tool
    • Track batch statuses
    • Monitor cache hits/misses