
Async Inference (V2)

Submit inference requests asynchronously and poll for results. Ideal for long-running models like video generation, image upscaling, and LLMs.

The V2 async API lets you submit a request, get a request_id immediately, and poll until the result is ready, so no long-lived HTTP connection is needed.

Quick start

1. Submit a request

curl -X POST "https://api.segmind.com/v2/seedream-4.5" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a red rose on a wooden table, studio lighting",
    "aspect_ratio": "1:1",
    "seed": 123
  }'

Response:

{
  "request_id": "2c7f59ea-13f1-402c-9353-915a2b5a2124",
  "status": "QUEUED",
  "poll_url": "https://api.segmind.com/v1/requests/2c7f59ea-...",
  "status_url": "https://api.segmind.com/v2/requests/2c7f59ea-.../status",
  "response_url": "https://api.segmind.com/v2/requests/2c7f59ea-..."
}

| Field | Description |
| --- | --- |
| request_id | Unique identifier for this request |
| status | Always QUEUED on submit |
| poll_url | V1 poll endpoint (backward compatible) |
| status_url | Lightweight status check (no output payload) |
| response_url | Full result endpoint (output + metadata) |
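
For typed client code, the submit response can be wrapped in a small container. The following is a minimal sketch, assuming the five fields documented above are always present; the class name SubmitResponse and its from_json helper are hypothetical and not part of any Segmind SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmitResponse:
    """Fields returned by POST /v2/{model} (hypothetical container)."""

    request_id: str
    status: str
    poll_url: str
    status_url: str
    response_url: str

    @classmethod
    def from_json(cls, body: dict) -> "SubmitResponse":
        # Extra keys in the body are ignored; a missing key raises KeyError.
        return cls(
            request_id=body["request_id"],
            status=body["status"],
            poll_url=body["poll_url"],
            status_url=body["status_url"],
            response_url=body["response_url"],
        )
```

Keeping status_url and response_url from the response avoids rebuilding them by hand later.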

2. Check status (lightweight)

Use status_url for efficient polling: it returns only status and metrics, with no output payload.

curl "https://api.segmind.com/v2/requests/2c7f59ea-.../status" \
  -H "x-api-key: YOUR_API_KEY"

While queued or processing:

{
  "status": "QUEUED",
  "request_id": "2c7f59ea-...",
  "status_url": "https://api.segmind.com/v2/requests/2c7f59ea-.../status",
  "response_url": "https://api.segmind.com/v2/requests/2c7f59ea-...",
  "metrics": {}
}

When done:

{
  "status": "COMPLETED",
  "request_id": "2c7f59ea-...",
  "metrics": { "inference_time": 13.06 }
}
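
The status check above lends itself to a polling loop with a deadline. Below is a minimal sketch, assuming COMPLETED and FAILED are the only terminal statuses; the function name wait_for_terminal_status, its parameters, and the backoff values are illustrative choices, not values recommended by the API.

```python
import time
from typing import Callable

TERMINAL = {"COMPLETED", "FAILED"}


def wait_for_terminal_status(
    fetch_status: Callable[[], dict],
    timeout_s: float = 600.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status() until it reports COMPLETED or FAILED.

    fetch_status is any zero-argument callable that returns the parsed
    JSON body of the status endpoint. Raises TimeoutError when the next
    wait would pass the deadline.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        body = fetch_status()
        # A body without a "status" key (for example a 404 error body)
        # counts as not finished, so it eventually ends in TimeoutError.
        if body.get("status") in TERMINAL:
            return body
        if clock() + delay > deadline:
            raise TimeoutError("request did not finish before the deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

With the requests library, fetch_status could be `lambda: requests.get(status_url, headers={"x-api-key": API_KEY}, timeout=30).json()`. Injecting sleep and clock keeps the loop testable without real waiting.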

3. Fetch the result

Once status is COMPLETED, fetch the full result from response_url.

curl "https://api.segmind.com/v2/requests/2c7f59ea-..." \
  -H "x-api-key: YOUR_API_KEY"

Image result:

{
  "status": "COMPLETED",
  "images": [
    {
      "url": "https://segmind-inference-io.s3.amazonaws.com/e17ba-output.jpg",
      "content_type": "image/jpeg",
      "file_size": "868446"
    }
  ],
  "output": "https://segmind-inference-io.s3.amazonaws.com/e17ba-output.jpg",
  "seed": "123",
  "prompt": "a red rose on a wooden table, studio lighting",
  "timings": { "inference": 13.06 },
  "metrics": { "inference_time": 13.06 }
}

Response formats by modality

The result shape depends on what the model produces.

Image models

{
  "status": "COMPLETED",
  "images": [{ "url": "...", "content_type": "image/jpeg", "file_size": "..." }],
  "output": "https://...",
  "seed": "123",
  "prompt": "...",
  "timings": { "inference": 13.06 },
  "metrics": { "inference_time": 13.06 }
}

Video models

{
  "status": "COMPLETED",
  "video": {
    "url": "...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 5757619
  },
  "output": "https://..."
}

LLM / text models

{
  "status": "COMPLETED",
  "output": "The generated text...",
  "reasoning": null,
  "partial": false,
  "error": null
}

The output field is always present across all modalities for backward compatibility.
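
Client code that handles several models may want one entry point for all three shapes. The sketch below assumes the modality can be inferred from the presence of the images or video key, which the response examples suggest but the API does not state as a contract; summarize_result and its return shape are hypothetical.

```python
def summarize_result(result: dict) -> dict:
    """Reduce a COMPLETED result body to {"kind": ..., "value": ...}."""
    if "images" in result:
        # Image models: collect every image URL.
        return {"kind": "image", "value": [img["url"] for img in result["images"]]}
    if "video" in result:
        # Video models: a single video object with a URL.
        return {"kind": "video", "value": result["video"]["url"]}
    # Fallback: treat the always-present "output" field as text.
    return {"kind": "text", "value": result.get("output")}
```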

Status values

| Status | Description |
| --- | --- |
| QUEUED | Request accepted, waiting for a worker |
| PROCESSING | A worker has picked up the request |
| COMPLETED | Inference finished, result available |
| FAILED | Inference failed (see error field) |
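
The four statuses can be modelled as an enumeration so that typos surface early. This is a sketch only: RequestStatus and is_terminal are hypothetical names, and treating COMPLETED and FAILED as final is an assumption drawn from the descriptions above.

```python
from enum import Enum


class RequestStatus(str, Enum):
    QUEUED = "QUEUED"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"

    @property
    def is_terminal(self) -> bool:
        # Assumed final states: polling can stop once one is reached.
        return self in (RequestStatus.COMPLETED, RequestStatus.FAILED)
```

RequestStatus(body["status"]) raises ValueError for any status string not listed here.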

Result expiry

Request status and results are stored for 1 hour after submission. After that, the status key expires and polling any endpoint will return HTTP 404. Fetch your results within this window.
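
Because the window is counted from submission, a client can track how much time remains. The sketch below assumes the client records its own submission timestamp, since the example responses do not include one; seconds_until_expiry and RESULT_TTL are hypothetical names.

```python
from datetime import datetime, timedelta

RESULT_TTL = timedelta(hours=1)  # retention window stated above


def seconds_until_expiry(submitted_at: datetime, now: datetime) -> float:
    """Seconds left before the result expires; 0.0 once it has expired."""
    remaining = (submitted_at + RESULT_TTL) - now
    return max(remaining.total_seconds(), 0.0)
```

A polling loop can use this value as its overall timeout rather than a fixed number.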

Error handling

Failed requests return HTTP 422 on V2 endpoints:

{
  "status": "FAILED",
  "error": "Prompt is Mandatory and must be string",
  "metrics": {}
}

Not found returns HTTP 404:

{
  "error": "Request 00000000-... not found"
}
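
The two error cases can be folded into a single classification step. This is a minimal sketch under the assumption that the body is valid JSON in every case; classify_response and its labels are hypothetical, and the function reads the status field rather than assuming a particular success code.

```python
def classify_response(http_status: int, body: dict) -> str:
    """Map an HTTP status code and JSON body to a coarse outcome label."""
    if http_status == 404:
        # Unknown request ID, or the 1-hour retention window has passed.
        return "not_found_or_expired"
    if http_status == 422 or body.get("status") == "FAILED":
        return "failed"
    if body.get("status") == "COMPLETED":
        return "completed"
    if body.get("status") in ("QUEUED", "PROCESSING"):
        return "pending"
    return "unexpected"
```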

Endpoints summary

| Endpoint | Method | Description |
| --- | --- | --- |
| /v2/{model} | POST | Submit async request |
| /v2/requests/{id}/status | GET | Lightweight status + metrics |
| /v2/requests/{id} | GET | Full result (when COMPLETED) |
| /v1/requests/{id} | GET | Legacy poll (status + output combined) |
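
When only the request_id has been stored, the URLs can be rebuilt from the patterns above. The helper names below are hypothetical; where the submit response is still available, the status_url and response_url it returns are the safer choice.

```python
BASE = "https://api.segmind.com"


def submit_url(model: str) -> str:
    """POST target for a new async request."""
    return f"{BASE}/v2/{model}"


def status_url(request_id: str) -> str:
    """Lightweight status + metrics endpoint."""
    return f"{BASE}/v2/requests/{request_id}/status"


def result_url(request_id: str) -> str:
    """Full result endpoint, useful once the status is COMPLETED."""
    return f"{BASE}/v2/requests/{request_id}"


def legacy_poll_url(request_id: str) -> str:
    """V1 poll endpoint (status and output combined)."""
    return f"{BASE}/v1/requests/{request_id}"
```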

Full Python example

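import requests
import time

API_KEY = "YOUR_API_KEY"
BASE = "https://api.segmind.com"
HEADERS = {"x-api-key": API_KEY}

# Submit the request; a non-2xx response raises immediately.
resp = requests.post(
    f"{BASE}/v2/seedream-4.5",
    headers=HEADERS,
    json={"prompt": "a beautiful sunset", "aspect_ratio": "16:9"},
    timeout=30,
)
resp.raise_for_status()
request_id = resp.json()["request_id"]
print(f"Submitted: {request_id}")

# Poll the lightweight status endpoint until a terminal status is reached.
# Results expire 1 hour after submission, so give up at that point.
deadline = time.monotonic() + 3600
while True:
    status = requests.get(
        f"{BASE}/v2/requests/{request_id}/status",
        headers=HEADERS,
        timeout=30,
    ).json()

    if status.get("status") in ("COMPLETED", "FAILED"):
        break
    if time.monotonic() > deadline:
        raise TimeoutError(f"Request {request_id} did not finish within 1 hour")
    time.sleep(2)

# Fetch the full result, or report the failure.
if status["status"] == "COMPLETED":
    result = requests.get(
        f"{BASE}/v2/requests/{request_id}",
        headers=HEADERS,
        timeout=30,
    ).json()
    print(f"Image URL: {result['images'][0]['url']}")
    print(f"Inference time: {result['timings']['inference']}s")
else:
    print(f"Request failed: {status.get('error')}")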
