After you train a model with Serverless RL, it’s automatically available for inference. This page shows you how to construct the endpoint for a trained model and send inference requests to it. Use this endpoint to integrate your model into your application or evaluation workflows.

To send requests to your trained model, you need a W&B API key and the model’s endpoint.

The model’s endpoint uses the following schema:
wandb-artifact:///[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]
The schema consists of:
  • [ENTITY]: The name of your W&B entity (team)
  • [PROJECT]: The name of the project associated with your model
  • [MODEL-NAME]: The trained model’s name
  • [STEP]: The training step of the model you want to deploy, written as step followed by the step number (for example, step25). This is usually the step where the model performed best in your evaluations.
For example, if your W&B team is named email-specialists, your project is called mail-search, your trained model is named agent-001, and you want to deploy it on step 25, the endpoint looks like this:
wandb-artifact:///email-specialists/mail-search/agent-001:step25
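The schema above can be assembled programmatically. The following is a minimal sketch, not an official helper: the function name build_endpoint is hypothetical, and it assumes the step is always written as step followed by the step number, as in the example above.

```python
def build_endpoint(entity: str, project: str, model_name: str, step: int) -> str:
    """Assemble a model endpoint string from its four components.

    Assumption: the step is written as "step<N>", following the
    example endpoint shown in this page.
    """
    return f"wandb-artifact:///{entity}/{project}/{model_name}:step{step}"


# Reproduce the example endpoint from the text.
endpoint = build_endpoint("email-specialists", "mail-search", "agent-001", 25)
print(endpoint)
```

Building the string in one place helps avoid small mistakes, such as dropping one of the three slashes after wandb-artifact:.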
After you have your endpoint, you can integrate it into your normal inference workflows. The following examples show how to make inference requests to your trained model using a cURL request or the Python OpenAI SDK. Choose the example that matches your environment.

cURL

curl https://api.training.wandb.ai/v1/chat/completions \
    -H "Authorization: Bearer $WANDB_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "wandb-artifact:///[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize our training run."}
        ],
        "temperature": 0.7,
        "top_p": 0.95
    }'

OpenAI SDK

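import os

from openai import OpenAI

# Read the API key from the environment, as the cURL example does,
# instead of hardcoding it in the script.
WANDB_API_KEY = os.environ["WANDB_API_KEY"]
# Replace these placeholders with your own W&B entity (team) and project.
ENTITY = "my-entity"
PROJECT = "my-project"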

client = OpenAI(
    base_url="https://api.training.wandb.ai/v1",
    api_key=WANDB_API_KEY
)

response = client.chat.completions.create(
    model=f"wandb-artifact:///{ENTITY}/{PROJECT}/my-model:step100",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize our training run."},
    ],
    temperature=0.7,
    top_p=0.95,
)

print(response.choices[0].message.content)
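Because each training step has its own endpoint, one way to use this in an evaluation workflow is to send the same prompt to several steps and compare the replies. The sketch below is illustrative only: compare_steps is a hypothetical helper, it assumes every listed step is available for inference, and it expects a client configured like the one in the OpenAI SDK example.

```python
def compare_steps(client, entity, project, model_name, steps, prompt):
    """Send one prompt to several training steps of the same model.

    `client` is expected to expose chat.completions.create, like the
    OpenAI client in the example above. Returns a dict mapping each
    step number to the text of the model's reply.
    """
    replies = {}
    for step in steps:
        # Assumption: steps are written as "step<N>" in the endpoint.
        model = f"wandb-artifact:///{entity}/{project}/{model_name}:step{step}"
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[step] = response.choices[0].message.content
    return replies


# Example usage (requires a configured client and network access):
# replies = compare_steps(client, ENTITY, PROJECT, "my-model", [50, 100],
#                         "Summarize our training run.")
# for step, text in replies.items():
#     print(step, text)
```

Passing the client in as an argument keeps the helper independent of how credentials are loaded and makes it straightforward to substitute a stub client in tests.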