Use your trained models - Weights & Biases Documentation

After you train a model with Serverless RL, it’s automatically available for inference. This page shows you how to construct the endpoint for a trained model and send inference requests to it. Use this endpoint to integrate your model into your application or evaluation workflows. To send requests to your trained model, you need the following:

Your W&B API key
The Serverless RL API’s base URL, https://api.training.wandb.ai/v1/
Your model’s endpoint

The model’s endpoint uses the following schema:

wandb-artifact:///[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]

The schema consists of:

[ENTITY]: Your W&B entity (team) name
[PROJECT]: The name of the project associated with your model
[MODEL-NAME]: The trained model’s name
[STEP]: The training step of the model you want to deploy, written like step25. This is usually the step where the model performed best in your evaluations.

For example, if your W&B team is named email-specialists, your project is called mail-search, your trained model is named agent-001, and you want to deploy the model from step 25, the endpoint looks like this:

wandb-artifact:///email-specialists/mail-search/agent-001:step25
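The schema above can also be assembled in code. The following is a minimal sketch, not part of the Serverless RL API: build_model_endpoint is a hypothetical helper, and it assumes the step label is the word step followed by the step number, as in the example above. The character check is a defensive assumption rather than a documented rule.

```python
def build_model_endpoint(entity: str, project: str, model_name: str, step: int) -> str:
    """Return the endpoint string for one training step of a trained model.

    Assumption: the step label is "step" followed by the integer step
    number, mirroring the documented example (step25).
    """
    for label, value in (("entity", entity), ("project", project), ("model_name", model_name)):
        # Defensive check (an assumption): '/' and ':' act as separators in
        # the schema, so reject them inside individual components.
        if not value or "/" in value or ":" in value:
            raise ValueError(f"{label} must be non-empty and contain no '/' or ':' (got {value!r})")
    if step < 0:
        raise ValueError("step must be non-negative")
    return f"wandb-artifact:///{entity}/{project}/{model_name}:step{step}"


endpoint = build_model_endpoint("email-specialists", "mail-search", "agent-001", 25)
print(endpoint)
```

Building the string in one place keeps the three slashes after wandb-artifact: consistent across every request you send.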

After you have your endpoint, you can integrate it into your normal inference workflows. The following examples show how to make inference requests to your trained model using a cURL request or the Python OpenAI SDK. Choose the example that matches your environment.

cURL

curl https://api.training.wandb.ai/v1/chat/completions \
    -H "Authorization: Bearer $WANDB_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
            "model": "wandb-artifact:///[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]",
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "Summarize our training run."}
            ],
            "temperature": 0.7,
            "top_p": 0.95
        }'

OpenAI SDK

from openai import OpenAI

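# Replace these placeholders with your own values. In practice, prefer loading
# the key from the WANDB_API_KEY environment variable over hard-coding it.
WANDB_API_KEY = "your-wandb-api-key"
ENTITY = "my-entity"
PROJECT = "my-project"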

client = OpenAI(
    base_url="https://api.training.wandb.ai/v1",
    api_key=WANDB_API_KEY
)

response = client.chat.completions.create(
    model=f"wandb-artifact:///{ENTITY}/{PROJECT}/my-model:step100",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize our training run."},
    ],
    temperature=0.7,
    top_p=0.95,
)

print(response.choices[0].message.content)
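If you prefer not to depend on the OpenAI SDK, the same request can be sketched with the Python standard library. This is an illustrative sketch only: build_chat_request is a hypothetical helper, the route and headers are copied from the cURL example, and the response shape in the commented-out lines is assumed to mirror what the SDK example reads. The request is built but not sent, so the sketch runs without network access.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.training.wandb.ai/v1"


def build_chat_request(model: str, messages: list[dict], api_key: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions route."""
    # Extra keyword arguments such as temperature or top_p are merged into the body.
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    model="wandb-artifact:///my-entity/my-project/my-model:step100",
    messages=[{"role": "user", "content": "Summarize our training run."}],
    # Falls back to a placeholder so the sketch still runs without the variable set.
    api_key=os.environ.get("WANDB_API_KEY", "your-wandb-api-key"),
    temperature=0.7,
    top_p=0.95,
)

# To actually send the request, uncomment the lines below.
# Assumption: the JSON response has the same shape the SDK example reads.
# with urllib.request.urlopen(request) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Separating request construction from sending makes it easy to inspect or log the payload before it reaches the API.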
