Plurence is currently in Public Beta. Features and pricing may change. Not recommended for production workloads.

Streaming Responses

The Plurence gateway streams responses token by token using Server-Sent Events (SSE), in the same event format the OpenAI API uses.

Enable streaming

Set "stream": true in your request body. The gateway responds with Content-Type: text/event-stream and forwards each token as soon as it arrives from the upstream provider.

curl
curl https://gateway.plurence.com/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5."}]
  }'
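
For clients that cannot use an SDK, the same request can be sketched with the Python standard library. This is a minimal illustration, not an official client: the helper names `build_stream_request` and `iter_event_lines` are hypothetical, and the URL, `x-api-key` header, and body fields are copied from the curl example above. How promptly lines arrive may depend on buffering between the client and the gateway.

```python
import json
import urllib.request
from typing import Iterator

# Endpoint taken from the curl example above.
GATEWAY_URL = "https://gateway.plurence.com/v1/chat/completions"


def build_stream_request(
    api_key: str, prompt: str, model: str = "gpt-4o-mini"
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with streaming enabled."""
    body = {
        "model": model,
        "stream": True,  # asks the gateway for a text/event-stream response
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def iter_event_lines(request: urllib.request.Request) -> Iterator[str]:
    """Send the request and yield each non-empty response line as it is read."""
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            line = raw.decode("utf-8").rstrip("\r\n")
            if line:
                yield line


# Usage (performs a network call, so it is left commented out):
# for line in iter_event_lines(build_stream_request("YOUR_API_KEY", "Count to 5.")):
#     print(line)
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network traffic happens.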

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="plr_live_...",
    base_url="https://gateway.plurence.com/v1",
)

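# Pass stream=True to receive chunks as they arrive.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    stream=True,
    messages=[{"role": "user", "content": "Tell me a short story."}],
)

for chunk in stream:
    # Some chunks carry no text (for example, a role-only first chunk).
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)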

Node / TypeScript (OpenAI SDK)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'plr_live_...',
  baseURL: 'https://gateway.plurence.com/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  stream: true,
  messages: [{ role: 'user', content: 'Hello!' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

SSE event format

Each event follows the OpenAI streaming format:

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}

data: [DONE]
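
If you read the stream without an SDK, the events above can be decoded with a few lines of Python. The sketch below is illustrative only: `iter_content` is a hypothetical helper, and it assumes each event carries a single `data:` line, as in the sample events shown here, rather than implementing the full SSE specification.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragment of each chunk until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and any field other than data:.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:  # a delta may omit content entirely
                yield text


# The sample events from this section, one list item per line.
SAMPLE = [
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}',
    "",
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_content(SAMPLE)))  # prints "Hello!"
```

Checking for `[DONE]` before calling `json.loads` matters because the sentinel is plain text and would otherwise raise a decoding error.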
