# Getting Started

This guide walks through making your first requests after completing [Installation](installation.html).

## Your first request

Send a non-streaming request to any configured provider:

```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": [
      {"role": "user", "content": "Explain the BEAM in one sentence."}
    ]
  }'
```
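The same request can be made from a client language. A minimal Python sketch using only the standard library, assuming the server from [Installation](installation.html) is listening on `localhost:4000` (the `build_request` and `create_response` helper names are illustrative, not part of OpenResponses):

```python
import json
import urllib.request

def build_request(model, text):
    """Build the same JSON body as the curl example above."""
    return {
        "model": model,
        "input": [{"role": "user", "content": text}],
    }

def create_response(body, base_url="http://localhost:4000"):
    """POST the body to /v1/responses and return the decoded JSON."""
    req = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_request("gpt-4o", "Explain the BEAM in one sentence.")
# create_response(body) would return the JSON shown below.
```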

A successful response looks like:

```json
{
  "id": "01950000-0000-0000-0000-000000000000",
  "object": "response",
  "model": "gpt-4o",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "The BEAM is..."}],
      "status": "completed"
    }
  ],
  "usage": {}
}
```
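Most clients only need the generated text. One way to pull it out of the response shape above — a sketch against the fields shown, not an exhaustive parser:

```python
def output_text(response):
    """Concatenate every output_text part from a response's output items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    parts.append(part["text"])
    return "".join(parts)

# The example response from above, trimmed to the fields the helper reads.
response = {
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "The BEAM is..."}],
            "status": "completed",
        }
    ],
}
print(output_text(response))  # The BEAM is...
```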

## Streaming responses

Add `"stream": true` to receive Server-Sent Events as the model generates:

```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "input": [
      {"role": "user", "content": "Write a haiku about Elixir."}
    ]
  }'
```

You'll receive a stream of events:

```
event: response.created
data: {"id":"01950000...","status":"queued",...}

event: response.in_progress
data: {"type":"response.in_progress","sequence_number":0}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Concurrent","sequence_number":3}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":" streams flow","sequence_number":4}

event: response.completed
data: {"id":"01950000...","status":"completed",...}

data: [DONE]
```

See [Streaming](streaming.html) for the full event catalogue and client examples.

## Using tools

Define tools in the request, and the model will call them when appropriate:

```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "input": [
      {"role": "user", "content": "What time is it in Tokyo?"}
    ],
    "tools": [
      {
        "type": "function",
        "name": "get_time",
        "description": "Get the current time in a given timezone",
        "parameters": {
          "type": "object",
          "properties": {
            "timezone": {"type": "string", "description": "IANA timezone name"}
          },
          "required": ["timezone"]
        }
      }
    ]
  }'
```

When the model decides to call `get_time`, OpenResponses emits a `function_call` item in the output. You then submit the result in a follow-up request using `previous_response_id`. See [Tool Dispatch](tool_dispatch.html) for the full flow.
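The round trip might look like the following Python sketch. The `function_call` item fields (`call_id`, `arguments`) and the `function_call_output` input item type follow the OpenAI Responses shape and are assumptions here — [Tool Dispatch](tool_dispatch.html) documents the authoritative flow:

```python
import json
from datetime import datetime
from zoneinfo import ZoneInfo

def get_time(timezone):
    """Local implementation of the get_time tool declared above."""
    return datetime.now(ZoneInfo(timezone)).isoformat()

# A function_call item as it might appear in the response output.
call = {
    "type": "function_call",
    "call_id": "call_123",
    "name": "get_time",
    "arguments": '{"timezone": "Asia/Tokyo"}',
}

# Run the tool locally with the model-supplied arguments.
args = json.loads(call["arguments"])
result = get_time(**args)

# Follow-up request: submit the tool result against the previous response.
follow_up = {
    "model": "claude-opus-4-6",
    "previous_response_id": "resp_abc",
    "input": [
        {"type": "function_call_output", "call_id": call["call_id"], "output": result}
    ],
}
```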

## Multi-turn conversations

Use `previous_response_id` to continue a conversation. OpenResponses automatically reconstructs the full context from the cache:

```bash
# First turn
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": [{"role": "user", "content": "My name is Alice."}]
  }'
# → {"id": "resp_001", ...}

# Second turn — no need to repeat the history
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "previous_response_id": "resp_001",
    "input": [{"role": "user", "content": "What is my name?"}]
  }'
```
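In client code, each turn only needs the new user message plus the id returned by the previous turn. A payload-construction sketch (the `turn` helper is illustrative; `resp_001` stands in for whatever id the first response actually returns):

```python
def turn(model, text, previous_response_id=None):
    """Build the request body for one conversation turn."""
    body = {
        "model": model,
        "input": [{"role": "user", "content": text}],
    }
    if previous_response_id:
        body["previous_response_id"] = previous_response_id
    return body

first = turn("gpt-4o", "My name is Alice.")
# POST `first`, then read the id from the response and thread it through:
second = turn("gpt-4o", "What is my name?", previous_response_id="resp_001")
```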

See [Conversation History](conversation_history.html) for caching behaviour and TTL configuration.

## Choosing a model

The model name determines which provider adapter is used:

| Model prefix | Provider |
|---|---|
| `gpt-*` | OpenAI |
| `claude-*` | Anthropic |
| `gemini-*` | Google Gemini |
| `llama*`, `mistral*`, `phi*`, `qwen*` | Ollama (local) |
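The routing rule in the table can be sketched as a prefix match. This is an illustration of the table, not OpenResponses' actual routing code; note that the Ollama patterns have no trailing hyphen, so bare names like `llama3.1` match:

```python
# Ordered (prefixes, provider) pairs mirroring the table above.
PREFIXES = [
    (("gpt-",), "OpenAI"),
    (("claude-",), "Anthropic"),
    (("gemini-",), "Google Gemini"),
    (("llama", "mistral", "phi", "qwen"), "Ollama"),
]

def provider_for(model):
    """Return the provider for a model name, or None if no prefix matches."""
    for prefixes, provider in PREFIXES:
        if model.startswith(prefixes):
            return provider
    return None

print(provider_for("gpt-4o"))       # OpenAI
print(provider_for("llama3.1:8b"))  # Ollama
```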

See [Providers](providers.html) to add API keys and customise routing.
