AI Infrastructure Reliability

Is your AI API down? Know before your users do.

IncidentHub monitors OpenAI, Anthropic, Google AI, Mistral, and major cloud providers in real time. Get instant alerts, compare reliability scores, and make smarter infrastructure decisions.

AI & cloud incidents

302

Providers tracked

11

Critical incidents

4

Top cause tag

api

Updated every 5 minutes · Data from provider status pages

How it works

Three steps to stay ahead of AI and cloud outages.

1

Track

We check status pages for OpenAI, Anthropic, Google AI, AWS, and 50+ other providers every 5 minutes.

2

Alert

Get notified via Slack, webhooks, or Google Chat the moment a provider reports an issue.

3

Analyze

Compare AI API reliability scores, review outage history, and decide when to add fallback providers.

AI API Reliability Dashboard

Which LLM provider is most reliable this month? Compare uptime, incident frequency, and mean resolution time across OpenAI, Anthropic, Google AI, and more.
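For teams that want to sanity-check these numbers on their own incident logs, here is a minimal sketch of how uptime, incident frequency, and mean resolution time could be derived for one provider. It is not IncidentHub's scoring method: the function name `reliability_summary`, the `(start, end)` incident format, and the sample data are all assumptions made for illustration, and incidents are assumed not to overlap.

```python
from datetime import datetime, timedelta

def reliability_summary(incidents, window_start, window_end):
    """Summarize one provider's incidents inside a reporting window.

    `incidents` is a hypothetical list of (start, end) datetime pairs.
    Incidents are assumed not to overlap; overlapping ones would be
    double-counted as downtime.
    """
    window = window_end - window_start
    durations = []
    for start, end in incidents:
        # Clip each incident to the reporting window.
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            durations.append(clipped_end - clipped_start)

    downtime = sum(durations, timedelta())
    count = len(durations)
    return {
        "incident_count": count,
        # timedelta / timedelta yields a plain float ratio.
        "uptime_pct": 100.0 * (1 - downtime / window),
        # Mean of the clipped durations; zero when there were no incidents.
        "mean_resolution": downtime / count if count else timedelta(),
    }

# Made-up sample data: a 30-day window with a 90-minute and a 30-minute incident.
window_start = datetime(2024, 1, 1)
window_end = window_start + timedelta(days=30)
sample_incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 11, 30)),
    (datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 2, 30)),
]
summary = reliability_summary(sample_incidents, window_start, window_end)
print(summary)
```

Running the same function over each provider's incidents for the same window gives figures that can be compared side by side.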

View dashboard

Plans for every team size

Start free with 2 tracked AI services. Upgrade as your monitoring needs grow.

Compare plans →

Free

For individual developers exploring AI APIs

Free forever
$0/month

Track a small watchlist of AI and cloud services, test webhook alerts, and browse public reliability data.

2 tracked services

  • 2 AI/cloud service watchlist slots
  • Webhook and Google Chat alerts
  • Public reliability rankings and outage history
  • Blog and annual reports preview
Start free

Pro

For on-call engineers building on AI APIs

Most popular
$29/month

Monitor more AI providers, get reliability comparisons, and route alerts to the channels your team already uses.

10 tracked services

  • 10 AI/cloud service watchlist slots
  • Slack, Google Chat, PagerDuty, and webhook alerts
  • AI reliability comparison dashboard
  • Incident analytics and downtime statistics
Go Pro

Teams

For platform and infrastructure teams

Full platform
$79/month

Shared monitoring across your AI stack, API access for internal tooling, and executive-ready reliability reports.

25 tracked services

  • 25 AI/cloud service watchlist slots
  • Shared alert destinations and team workflows
  • Full API access for dashboards and internal tools
  • Monthly reliability reports for vendor reviews
Start Teams

Enterprise

For organizations with custom SLA and compliance needs

Custom
$299/month

Unlimited monitoring, dedicated support, SSO, SLA guarantees, and custom integrations for your infrastructure team.

Unlimited services

  • Unlimited AI/cloud service tracking
  • Custom SLA and uptime guarantees
  • SSO / SAML integration
  • Dedicated support and onboarding
Contact sales

Outage trackers by provider

Deep-dive into outage history, status timelines, and reliability data for AI and cloud services.

OpenAI Outage History

openai down

OpenAI and ChatGPT incident history, API uptime data, and recovery timelines for teams building on GPT models.

Anthropic / Claude Outages

claude api down

Anthropic Claude API and Claude Chat outage tracking, reliability scores, and incident history.

Google AI / Vertex Outages

google ai outage

Google Cloud and Vertex AI / Gemini incident history, outage timelines, and reliability data.

Mistral AI Outages

mistral ai status

Mistral AI platform status, API outage history, and reliability tracking for teams using Mistral models.

Cohere Outages

cohere api status

Cohere API outage history and reliability data for enterprise RAG and embedding workloads.

Replicate Outages

replicate status

Replicate platform status, open-source model hosting uptime, and incident tracking.

AWS Outage History

aws outage history

AWS outages by year, timeline, and root-cause context for engineers tracking historical downtime.

Cloudflare Outages

cloudflare outage

Live and historical Cloudflare downtime tracking for teams who depend on edge infrastructure.

GitHub Outages

github down

GitHub and GitHub Actions outage tracking for developers who need answers during deploy failures.

AI API Reliability Compared

ai api reliability

Side-by-side reliability comparison of OpenAI, Anthropic, Google AI, Mistral, and other LLM providers.

Biggest Cloud Outages

biggest cloud outages

A ranked list of the longest and most impactful AI and cloud outages in our dataset.

Cloud Downtime Statistics

cloud downtime statistics

Incidents per year, average downtime, and provider-level patterns drawn from our full incident dataset.

Recent Incidents

View all →

From the blog

AI reliability insights, outage analysis, and infrastructure decision-making.

All posts →