
Getting started with Rai Shield

Deploy your first AI gateway in minutes. This guide covers installation, configuration, and routing your first request.

Overview

Rai Shield is a high-performance AI gateway built with Rust and WebAssembly. It provides request routing, governance, and content licensing for AI applications, and can be deployed anywhere from edge networks to enterprise infrastructure.

Installation

Using Cargo

The simplest way to install Rai Shield is via Cargo, Rust's package manager:

cargo install rai-shield

From source

For development or custom builds, clone the repository and build with your preferred features:

# Clone the repository
git clone https://git.rai.onl/rai/shield.git
cd shield

# Build with default features
cargo build --release

# Or with specific features
cargo build --release --features "fastly,metrics"

Available feature flags:

  • fastly — Fastly Compute deployment target
  • cloudflare — Cloudflare Workers support
  • metrics — Prometheus metrics endpoint
  • tracing — OpenTelemetry integration

Configuration

Create a configuration file at ~/.config/rai/shield.toml:

# Rai Shield configuration

[server]
host = "0.0.0.0"
port = 8080

[routing]
default_provider = "anthropic"

[[providers]]
name = "anthropic"
endpoint = "https://api.anthropic.com"
api_key_env = "ANTHROPIC_API_KEY"

[[providers]]
name = "openai"
endpoint = "https://api.openai.com"
api_key_env = "OPENAI_API_KEY"

[governance]
rate_limit = 100  # requests per minute
max_tokens = 4096

Your first request

Start the gateway and send a test request:

# Start Rai Shield
rai-shield serve

# In another terminal, send a request
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The gateway routes the request to a provider according to your [routing] and [[providers]] configuration, and applies the [governance] settings, such as rate_limit and max_tokens, automatically.
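
The same request can be sent from application code. The sketch below mirrors the curl call above using only Python's standard library. The helper names build_chat_request and send_chat are hypothetical, and because this guide does not specify the response format, the code assumes only that the gateway replies with a JSON body.

```python
import json
import urllib.request

# URL taken from the curl example; adjust host and port to match [server].
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(prompt, model="claude-3-5-sonnet"):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt, model="claude-3-5-sonnet", timeout=30):
    """Send the request and return the decoded JSON response.

    Needs `rai-shield serve` running locally; otherwise urlopen raises
    urllib.error.URLError. Assumes the response body is JSON.
    """
    request = build_chat_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the gateway running, print(send_chat("Hello!")) shows whatever the selected provider returned.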

Routing strategies

Rai Shield supports several routing strategies out of the box:

Strategy            Description                            Use case
-----------------   ------------------------------------   ------------------------
model-match         Routes based on requested model        Multi-provider setups
cost-optimised      Selects cheapest capable provider      Cost-sensitive workloads
latency-optimised   Routes to fastest provider             Real-time applications
round-robin         Distributes evenly across providers    Load balancing

Which strategy fits best depends on your workload. Start with model-match for predictable behaviour, then try cost-optimised, latency-optimised, or round-robin once you know whether cost, response time, or load distribution matters most.
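
To make the differences between the strategies concrete, here is an illustrative sketch of how each one might pick a provider. It is not Rai Shield's implementation: the Provider fields, the cost and latency figures, and the Router class are all assumptions made up for this example, as is the choice to consider only providers that serve the requested model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # All fields are hypothetical, chosen only to illustrate the strategies.
    name: str
    models: frozenset
    cost_per_1k_tokens: float
    avg_latency_ms: float


class Router:
    """Toy router showing one possible reading of the four strategy names."""

    def __init__(self, providers):
        self.providers = list(providers)
        self._next = 0  # round-robin cursor

    def choose(self, strategy, model):
        # Assumption: every strategy only considers providers serving the model.
        candidates = [p for p in self.providers if model in p.models]
        if not candidates:
            raise LookupError(f"no provider serves model {model!r}")
        if strategy == "model-match":
            return candidates[0]  # first configured provider with the model
        if strategy == "cost-optimised":
            return min(candidates, key=lambda p: p.cost_per_1k_tokens)
        if strategy == "latency-optimised":
            return min(candidates, key=lambda p: p.avg_latency_ms)
        if strategy == "round-robin":
            chosen = candidates[self._next % len(candidates)]
            self._next += 1
            return chosen
        raise ValueError(f"unknown strategy: {strategy!r}")
```

In this sketch model-match is deterministic for a given configuration, which is why it is the predictable starting point, while the two optimised strategies depend on measurements that can change over time.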

Next steps

Now that you have Rai Shield running, explore these areas:

  1. Configure governance policies for rate limiting and content filtering
  2. Set up content licensing with RSL integration
  3. Deploy to production on Fastly, Cloudflare, or bare metal
  4. Enable monitoring with Prometheus and Grafana

Need help? Join our Matrix community or get in touch.