This guide applies to all W&B deployment types:
  • Multi-tenant Cloud: Team-level BYOB
  • Dedicated Cloud: Instance and team-level BYOB
  • Self-Managed: Instance and team-level BYOB
The bucket provisioning instructions in this guide are the same regardless of your deployment type.

Overview

Bring your own bucket (BYOB) lets you store W&B artifacts and other sensitive data in your own cloud or on-premises infrastructure. For Dedicated Cloud or Multi-tenant Cloud, W&B doesn’t copy the data you store in your bucket to the W&B managed infrastructure. This page is for W&B administrators and platform engineers who need to retain ownership of artifact storage to meet data governance, residency, or compliance requirements.
  • Communication between W&B SDK / CLI / UI and your buckets occurs using pre-signed URLs.
  • W&B uses garbage collection and related processes to remove deleted artifacts and run data from your bucket over time. For artifact deletion, see Delete an artifact. Deleted run data on Dedicated Cloud and Self-Managed deployments also depends on GORILLA_DATA_RETENTION_PERIOD as described in Configure environment variables. W&B doesn’t guarantee cleanup timing. For a single overview of bucket usage and costs, see Manage bucket storage and costs.
  • You can specify a sub-path when you configure a bucket, to ensure that W&B doesn’t store any files in a folder at the root of the bucket. This helps you better conform to your organization’s bucket governance policy.

Data stored in the central database vs buckets

When you use BYOB functionality, W&B stores certain types of data in the W&B central database, and other types in your bucket. Use the following lists to understand which data remains in W&B-managed infrastructure and which data W&B writes to your own storage.

Database

The W&B central database stores the following data:
  • Metadata for users, teams, artifacts, experiments, and projects.
  • Reports.
  • Experiment logs.
  • System metrics.
  • Console logs.

Buckets

Your storage bucket stores the following data:
  • Experiment files and metrics.
  • Artifact files.
  • Media files.
  • Run files.
  • Exported history metrics and system events in Parquet format.

Bucket scopes

You can configure your storage bucket to one of two scopes:
Scope | Description
Instance level | In Dedicated Cloud and Self-Managed, any user with the required permissions within your organization or instance can access files stored in your instance’s storage bucket. Not applicable to Multi-tenant Cloud.
Team level | If you configure a W&B team to use a team level storage bucket, team members can access files stored in it. Team level storage buckets allow greater data access control and data isolation for teams with sensitive data or strict compliance requirements.

Team level storage helps different business units or departments that share an instance use the infrastructure and administrative resources efficiently. It also lets separate project teams manage AI workflows for separate customer engagements. Team level BYOB is available for all deployment types. You configure it when you set up the team.
This design supports different storage topologies, depending on your organization’s needs. For example:
  • The same bucket can serve the instance and one or more teams.
  • Each team can use a separate bucket, some teams can choose to write to the instance bucket, or multiple teams can share a bucket by writing to subpaths.
  • Buckets for different teams can reside in different cloud infrastructure environments or regions, and different storage admin teams can manage them.
For example, suppose you have a team called Kappa in your organization. Your organization (and team Kappa) uses the instance level storage bucket by default. Next, you create a team called Omega. When you create team Omega, you configure a team level storage bucket for that team. Team Kappa can’t access files that team Omega generates. However, team Omega can access files that team Kappa creates. To isolate data for team Kappa, you must also configure a team level storage bucket for them.
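The access rules in this example can be modeled in a few lines. The following Python sketch is illustrative only: the `Team` class, the `can_access` function, and the bucket names are hypothetical, and the model ignores W&B roles and project permissions. It assumes that a team uses its team level bucket if one is configured and the instance level bucket otherwise, and that files in the instance level bucket are reachable by every team in the instance.

```python
from dataclasses import dataclass
from typing import Optional

INSTANCE_BUCKET = "instance-bucket"  # hypothetical bucket name


@dataclass(frozen=True)
class Team:
    name: str
    # None means the team has no team level bucket and uses the instance bucket.
    team_bucket: Optional[str] = None

    @property
    def write_bucket(self) -> str:
        """Bucket this team's files are written to."""
        return self.team_bucket or INSTANCE_BUCKET


def can_access(reader: Team, owner: Team) -> bool:
    """Simplified rule: can `reader` reach files that `owner` generates?"""
    if owner.write_bucket == INSTANCE_BUCKET:
        # Instance level storage is shared across the instance.
        return True
    # Team level storage is reachable only by teams configured with that bucket.
    return reader.team_bucket == owner.write_bucket


kappa = Team("Kappa")                              # uses instance level storage
omega = Team("Omega", team_bucket="omega-bucket")  # uses team level storage

print(can_access(kappa, omega))  # False
print(can_access(omega, kappa))  # True
```

In this model, giving Kappa its own `team_bucket` makes `can_access(omega, kappa)` return `False`, which corresponds to isolating team Kappa's data.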

Availability matrix

Before you begin, confirm that BYOB is available for your deployment type and storage provider. W&B can connect to the following storage providers:
  • CoreWeave AI Object Storage: High-performance, S3-compatible object storage service optimized for AI workloads.
  • Amazon S3: Object storage service offering scalability, data availability, security, and performance.
  • Google Cloud Storage: Managed service for storing unstructured data at scale.
  • Azure Blob Storage: Cloud-based object storage solution for storing massive amounts of unstructured data like text, binary data, images, videos, and logs.
  • S3-compatible storage such as MinIO Enterprise (AIStor) or other enterprise-grade solutions hosted in your cloud or on-premises infrastructure.
The following table shows the availability of BYOB at each scope for each W&B deployment type.
W&B deployment type | Instance level | Team level | Additional information
Dedicated Cloud | Supported | Supported | Instance and team level BYOB are supported for CoreWeave AI Object Storage, Amazon S3, Google Cloud Storage, Microsoft Azure Blob Storage, and S3-compatible storage such as MinIO Enterprise (AIStor) hosted in your cloud or on-premises infrastructure.
Multi-tenant Cloud | Not applicable | Supported (1) | Team level BYOB is supported for CoreWeave AI Object Storage, Amazon S3, and Google Cloud Storage.
Self-Managed | Supported | Supported | Instance and team level BYOB are supported for CoreWeave AI Object Storage, Amazon S3, Google Cloud Storage, Microsoft Azure Blob Storage, and S3-compatible storage such as MinIO Enterprise (AIStor) hosted in your cloud or on-premises infrastructure.
(1) Azure Blob Storage is not supported for team level BYOB on Multi-tenant Cloud.

The following sections guide you through the process of setting up BYOB.

Provision your bucket

After you verify availability, you’re ready to provision your storage bucket, including its access policy and CORS. Provisioning creates the bucket that W&B writes to and grants the W&B platform the permissions it needs to generate pre-signed URLs on your behalf. The following instructions cover CoreWeave AI Object Storage.
Requirements:
  • One of the following W&B deployments:
    • Multi-tenant Cloud
    • Dedicated Cloud v0.73.0 or later
    • Self-Managed v0.73.0 or later, deployed with v0.33.14 or later of the Helm chart
  • A CoreWeave account with AI Object Storage enabled and with permission to create buckets, API access keys, and secret keys.
  • Network connectivity from your W&B instance to CoreWeave network endpoints.
For details, see Create a CoreWeave AI Object Storage bucket in the CoreWeave documentation.
  1. Multi-tenant Cloud: Obtain your organization ID, which is required for your bucket policy.
    1. Log in to the W&B App.
    2. In the left navigation, click Create a new team.
    3. In the drawer that opens, copy the W&B organization ID, which appears above Invite team members.
    4. Leave this page open. You use it to configure W&B.
  2. Dedicated Cloud / Self-Managed: Obtain your customer namespace, which is required for your bucket policy.
    1. In the W&B App, click your user profile icon, then click System Console.
    2. Click the Authentication tab.
    3. At the bottom of the page, copy the value for Customer Namespace. Keep this value for configuring the bucket policy.
    4. You can close the System Console.
  3. In CoreWeave, create the bucket with a name of your choice in your preferred CoreWeave availability zone. Optionally, create a folder for W&B to use as a sub-path for all W&B files. Make a note of the bucket name, availability zone, API access key, secret key, and sub-path.
  4. Set the following cross-origin resource sharing (CORS) policy for the bucket:
    [
      {
        "AllowedHeaders": [
          "*"
        ],
        "AllowedMethods": [
          "GET",
          "HEAD",
          "PUT"
        ],
        "AllowedOrigins": [
          "*"
        ],
        "ExposeHeaders": [
          "ETag"
        ],
        "MaxAgeSeconds": 3000
      }
    ]
    
    CoreWeave storage is S3-compatible. For details about CORS, see Configuring cross-origin resource sharing (CORS) in the AWS documentation.
  5. Configure a bucket policy that grants the required permissions for your W&B deployment to access the bucket and generate pre-signed URLs that AI workloads in your cloud infrastructure or user browsers use to access the bucket. See Bucket Policy Reference in the CoreWeave documentation.
    {
      "Version": "2012-10-17",
      "Statement": [
      {
        "Sid": "AllowWandbUser",
        "Action": [
          "s3:GetObject*",
          "s3:GetEncryptionConfiguration",
          "s3:ListBucket",
          "s3:ListBucketMultipartUploads",
          "s3:ListBucketVersions",
          "s3:AbortMultipartUpload",
          "s3:DeleteObject",
          "s3:PutObject",
          "s3:GetBucketCORS",
          "s3:GetBucketLocation",
          "s3:GetBucketVersioning"
        ],
        "Effect": "Allow",
        "Resource": [
          "arn:aws:s3:::<cw-bucket>/*",
          "arn:aws:s3:::<cw-bucket>"
        ],
        "Principal": {
          "CW": "arn:aws:iam::wandb:static/<wb-cw-principal>"
        },
        "Condition": {
          "StringLike": {
            "wandb:OrgID": [
              "<wb-org-id>"
            ]
          }
        }
      },
      {
        "Sid": "AllowUsersInOrg",
        "Action": "s3:*",
        "Effect": "Allow",
        "Resource": [
          "arn:aws:s3:::<cw-bucket>",
          "arn:aws:s3:::<cw-bucket>/*"
        ],
        "Principal": {
          "CW": "arn:aws:iam::<cw-storage-org-id>:*"
        }
      }]
    }
    
    The clause beginning with "Sid": "AllowUsersInOrg" grants users in your organization direct access to the bucket. If you don’t need this ability, you can omit the clause from your policy.
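  6. In the bucket policy, replace placeholders:
    • <cw-bucket>: your bucket name.
    • <wb-cw-principal>: the W&B principal name for your deployment type, so that the Principal value becomes the full ARN shown in parentheses:
      • Multi-tenant Cloud: wandb-integration-public (arn:aws:iam::wandb:static/wandb-integration-public)
      • Dedicated Cloud or Self-Managed: wandb-integration (arn:aws:iam::wandb:static/wandb-integration)
    • <wb-org-id>: the W&B organization ID (Multi-tenant Cloud) or the customer namespace (Dedicated Cloud or Self-Managed) that you copied earlier.
    • <cw-storage-org-id>: your CoreWeave organization ID. Required only if you keep the AllowUsersInOrg statement.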
  7. Dedicated Cloud: Contact support to complete additional steps.
  8. Self-Managed: Update your W&B deployment to set the environment variable GORILLA_SUPPORTED_FILE_STORES to the exact string cw:// and restart W&B. Otherwise, CoreWeave doesn’t appear as an option when you configure team storage.
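The placeholder substitution in steps 5 and 6 can be scripted so that no placeholder is left behind by accident. The following Python sketch is one possible approach, not an official W&B or CoreWeave tool: `build_bucket_policy` and its parameter names are hypothetical, and the actions, ARNs, and condition key are copied from the policy in this guide.

```python
import json
from typing import Optional

# Principal ARNs as listed in step 6 of this guide.
WANDB_PRINCIPALS = {
    "multi-tenant": "arn:aws:iam::wandb:static/wandb-integration-public",
    "dedicated": "arn:aws:iam::wandb:static/wandb-integration",
    "self-managed": "arn:aws:iam::wandb:static/wandb-integration",
}

# Actions copied from the AllowWandbUser statement in step 5.
WANDB_ACTIONS = [
    "s3:GetObject*",
    "s3:GetEncryptionConfiguration",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:ListBucketVersions",
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:PutObject",
    "s3:GetBucketCORS",
    "s3:GetBucketLocation",
    "s3:GetBucketVersioning",
]


def build_bucket_policy(
    bucket: str,
    deployment: str,
    wb_org_id: str,
    cw_storage_org_id: Optional[str] = None,
) -> dict:
    """Return the bucket policy from this guide with placeholders filled in.

    wb_org_id is the organization ID (Multi-tenant Cloud) or the customer
    namespace (Dedicated Cloud / Self-Managed) obtained in steps 1 and 2.
    Pass cw_storage_org_id only to include the optional AllowUsersInOrg
    statement.
    """
    resources = [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"]
    statements = [
        {
            "Sid": "AllowWandbUser",
            "Action": WANDB_ACTIONS,
            "Effect": "Allow",
            "Resource": resources,
            "Principal": {"CW": WANDB_PRINCIPALS[deployment]},
            "Condition": {"StringLike": {"wandb:OrgID": [wb_org_id]}},
        }
    ]
    if cw_storage_org_id is not None:
        statements.append(
            {
                "Sid": "AllowUsersInOrg",
                "Action": "s3:*",
                "Effect": "Allow",
                "Resource": resources,
                "Principal": {"CW": f"arn:aws:iam::{cw_storage_org_id}:*"},
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Example values are made up for illustration.
    policy = build_bucket_policy("my-wandb-bucket", "multi-tenant", "my-org-id")
    print(json.dumps(policy, indent=2))
```

Omitting `cw_storage_org_id` produces a policy without the AllowUsersInOrg statement, which matches the case where users in your organization don't need direct access to the bucket.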
Next, if you are configuring team level BYOB on Dedicated Cloud or Self-Managed, determine the storage address. Otherwise, continue to Configure W&B.

Determine the storage address

After you provision the bucket, you need a storage address that W&B uses to locate and authenticate to it. The following sections describe the syntax to use to connect a W&B team to a BYOB storage bucket. In the examples, replace placeholder values between angle brackets (<>) with your bucket’s details. Select a tab for detailed instructions.
This section is relevant only for team level BYOB on Dedicated Cloud or Self-Managed. For instance level BYOB or for Multi-tenant Cloud, you’re ready to Configure W&B.
Determine the full bucket path using the following format. Replace placeholders between angle brackets (<>) with the bucket’s values.
Bucket format:
cw://<accessKey>:<secretAccessKey>@cwobject.com/<bucketName>?tls=true
W&B supports the cwobject.com HTTPS endpoint. TLS 1.3 is required. Contact support to express interest in other CoreWeave endpoints.
After you determine the storage address, you’re ready to configure team level BYOB.
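To avoid typos when assembling the address, you can generate it from its parts. The following Python sketch is a minimal example, not part of W&B: `coreweave_storage_address` and `check_bucket_name` are hypothetical helpers. The naming checks follow the rules listed under Troubleshooting, the DNS check is an approximation, and the sketch assumes that the keys contain no characters that need escaping in a URI.

```python
import re
from typing import List

# Reserved prefixes named in the CoreWeave troubleshooting section.
RESERVED_PREFIXES = ("cw-", "vip-")

# Approximation of a DNS-compatible label: lowercase letters, digits, and
# hyphens, starting and ending with a letter or digit.
DNS_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def check_bucket_name(name: str) -> List[str]:
    """Return a list of problems with the bucket name (empty if none found)."""
    problems = []
    if "_" in name:
        problems.append("contains an underscore")
    if name.startswith(RESERVED_PREFIXES):
        problems.append("begins with a reserved prefix (cw- or vip-)")
    if not DNS_LABEL.match(name):
        problems.append("is not a DNS-compatible label")
    return problems


def coreweave_storage_address(access_key: str, secret_key: str, bucket: str) -> str:
    """Build the cw:// address in the bucket format shown in this section."""
    problems = check_bucket_name(bucket)
    if problems:
        raise ValueError(f"bucket name {bucket!r} " + "; ".join(problems))
    return f"cw://{access_key}:{secret_key}@cwobject.com/{bucket}?tls=true"


if __name__ == "__main__":
    # Example credentials are placeholders, not real keys.
    print(coreweave_storage_address("EXAMPLEKEY", "EXAMPLESECRET", "my-wandb-bucket"))
```

The resulting string contains your secret key, so treat it like any other credential and avoid committing it to source control or printing it in shared logs.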

Configure W&B

After you provision your bucket and determine its address, you’re ready to configure BYOB at the instance level or team level. This final step tells W&B to route storage of artifacts, run files, and other large objects to your bucket.
Plan your storage bucket layout carefully. After you configure a storage bucket for W&B, migrating its data to another bucket is complex and requires assistance from W&B. This applies to storage for Dedicated Cloud and Self-Managed, as well as team-level storage for Multi-tenant Cloud. For questions, contact support.

Instance level BYOB

For CoreWeave AI Object Storage at the instance level, contact W&B support instead of following these instructions. Self-service configuration isn’t yet supported.
For Dedicated Cloud, share the bucket details with your W&B team, who configures your Dedicated Cloud instance.
For Self-Managed, you can configure instance level BYOB using the W&B App:
  1. Log in to W&B as a user with the admin role.
  2. Click the user icon at the top, then click System Console.
  3. Navigate to Settings > System Connections.
  4. In the Bucket Storage section, ensure the identity in the Identity field has access to the new bucket.
  5. Select the Provider.
  6. Enter the Bucket Name.
  7. Optional: Enter the Path to use in the new bucket.
  8. Click Save.
After you save, W&B uses the configured bucket as the default storage destination for new artifacts and run files at the instance level.

Team level BYOB

You can configure team level BYOB when you create a team in the W&B App or using the SCIM API (POST Groups with optional storageBucket). You have two options:
  • Use an existing bucket: You must determine the storage location for your bucket first.
  • Create a new bucket (Multi-tenant Cloud only): W&B can automatically create a bucket in your cloud provider when you create the team. W&B supports this for CoreWeave, AWS, and Google Cloud.
Before you begin, note the following:
  • After you create a team, you can’t change its storage.
  • For instance level BYOB, see Instance level BYOB instead.
  • If you plan to configure CoreWeave storage for the team, review the CoreWeave requirements and contact support to verify that your bucket is configured correctly in CoreWeave and to validate your team’s configuration, since you can’t change the storage details after you create the team.
The following steps apply to Dedicated Cloud and Self-Managed.
  1. Dedicated Cloud: You must provide the bucket path to your account team so that they can add it to your instance’s supported file stores before you follow the rest of these steps to use the storage bucket for a team.
  2. Self-Managed: You must add the bucket path to the GORILLA_SUPPORTED_FILE_STORES environment variable and then restart W&B before you follow the rest of these steps to use the storage bucket for a team.
  3. Log in to W&B as a user with the admin role, click the icon at the top left to open the left navigation, then click Create a team to collaborate.
  4. Provide a name for the team.
  5. Set Storage Type to External storage.
    To use the instance level storage for team storage (regardless of whether it’s internal or external), leave Storage Type set to Internal, even if the instance level bucket is configured for BYOB. To use separate external storage for the team, set Storage Type for the team to External and configure the bucket details in the next step.
  6. Click Bucket location.
  7. To use an existing bucket, select it from the list. To add a new bucket, click Add bucket at the bottom, then provide the bucket’s details. Click Cloud provider and select CoreWeave, AWS, Google Cloud, or Azure. If the cloud provider isn’t listed, ensure that you’ve followed the instructions in Provision your bucket to add the bucket path to the supported file stores for your instance. If the storage provider is still not listed, contact support for assistance.
  8. Specify the bucket details.
    • For CoreWeave, provide only the bucket name.
    • For Amazon S3, Google Cloud, or S3-compatible storage, provide the full bucket path you determined earlier.
    • For Azure on Dedicated Cloud or Self-Managed, set Account name to the Azure account and Container name to the Azure Blob Storage container.
    • Optionally, provide additional connection settings:
      • If applicable, set Path to the bucket sub-path.
      • CoreWeave: No additional connection settings required.
      • AWS: Set KMS key ARN to the ARN of your KMS encryption key.
      • Google Cloud: No additional connection settings required.
      • Azure: Specify values for Tenant ID and Managed Identity Client ID. These fields are mandatory unless you configured the connection string with GORILLA_SUPPORTED_FILE_STORES.
  9. Click Create team.
If W&B encounters errors accessing the bucket or detects invalid settings, an error or warning displays at the bottom of the page. Otherwise, W&B creates the team.

Troubleshooting

If W&B reports errors when validating or connecting to your bucket, use the following sections to diagnose the most common causes by storage provider.

CoreWeave

This section helps troubleshoot problems connecting to CoreWeave AI Object Storage.
  • Connection errors
    • Verify that your W&B instance can connect to CoreWeave network endpoints.
    • CoreWeave uses virtual-hosted style paths, where the bucket name is a subdomain at the beginning of the path. For example, cw://bucket-name.cwobject.com is correct, while cw://cwobject.com/bucket-name/ isn’t.
    • Bucket names must not contain underscores (_) or other characters incompatible with DNS rules.
    • Bucket names must be globally unique among CoreWeave locations.
    • Bucket names must not begin with cw- or vip-, which are reserved prefixes.
  • CORS validation failures
    • A CORS policy is required. CoreWeave is S3-compatible. For details about CORS, see Configuring cross-origin resource sharing (CORS) in the AWS documentation.
    • AllowedMethods must include methods GET, PUT, and HEAD.
    • ExposeHeaders must include ETag.
    • The CORS policy’s AllowedOrigins must include W&B front-end domains. The example CORS policies provided on this page include all domains using *.
  • LOTA endpoint issues
    • W&B doesn’t yet support connections to LOTA endpoints. To express interest, contact support.
  • Access key and permission errors
    • Verify that your CoreWeave API access key isn’t expired.
    • Verify that your CoreWeave API access key and secret key have sufficient permissions: GetObject, PutObject, DeleteObject, and ListBucket. The examples on this page meet this requirement. See Create and Manage Access Keys in the CoreWeave documentation.
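
The CORS validation checks listed above can be run locally against a policy before you apply it. The following Python sketch is an assumption-laden approximation, not the validation that W&B performs: `rule_problems` and `cors_allows_wandb` are hypothetical names, the origin `https://wandb.example.com` is made up, and origin wildcards are matched with simple glob rules.

```python
from fnmatch import fnmatchcase
from typing import List

REQUIRED_METHODS = {"GET", "PUT", "HEAD"}
REQUIRED_EXPOSED_HEADER = "ETag"


def rule_problems(rule: dict, origin: str) -> List[str]:
    """List the ways a single CORS rule fails the checks for this origin."""
    problems = []
    missing = REQUIRED_METHODS - set(rule.get("AllowedMethods", []))
    if missing:
        problems.append(f"AllowedMethods is missing {sorted(missing)}")
    if REQUIRED_EXPOSED_HEADER not in rule.get("ExposeHeaders", []):
        problems.append("ExposeHeaders is missing ETag")
    # Glob matching approximates how a wildcard such as * covers an origin.
    patterns = rule.get("AllowedOrigins", [])
    if not any(fnmatchcase(origin, pattern) for pattern in patterns):
        problems.append(f"AllowedOrigins does not cover {origin}")
    return problems


def cors_allows_wandb(rules: List[dict], origin: str) -> bool:
    """True if at least one rule passes every check for the given origin."""
    return any(not rule_problems(rule, origin) for rule in rules)


# The CORS policy from the provisioning steps on this page.
example_policy = [
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET", "HEAD", "PUT"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": ["ETag"],
        "MaxAgeSeconds": 3000,
    }
]

print(cors_allows_wandb(example_policy, "https://wandb.example.com"))  # True
```

If the function returns `False`, calling `rule_problems` on each rule shows which requirement is not met.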

Google Cloud

This section helps troubleshoot problems connecting to Google Cloud Storage.