All Systems Operational

Astro Hosted: Operational (99.94% uptime over the past 90 days)
  Scheduling and Running DAGs and Tasks: Operational (99.82% uptime over the past 90 days)
  Deployment Access: Operational (99.96% uptime over the past 90 days)
  Deployment Management: Operational (99.98% uptime over the past 90 days)
  Cloud UI: Operational (99.9% uptime over the past 90 days)
  Cloud API: Operational (99.97% uptime over the past 90 days)
  Cloud Image Repository: Operational (100.0% uptime over the past 90 days)
Astro Hybrid: Operational (99.84% uptime over the past 90 days)
  Scheduling and Running DAGs and Tasks: Operational (99.84% uptime over the past 90 days)
  Deployment Access: Operational
  Deployment Management: Operational
  Cloud UI: Operational
  Cloud API: Operational
  Cloud Image Repository: Operational
  Cluster Management: Operational
Cloud IDE: Operational (100.0% uptime over the past 90 days)
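
To put the 90-day uptime percentages above in concrete terms, the following is a minimal sketch that converts an uptime percentage into approximate minutes of downtime. It assumes downtime is simply the complement of uptime over a fixed 90-day window; the status page may weight partial outages differently, so treat the result as a rough estimate. The function name `downtime_minutes` is hypothetical and chosen only for illustration.

```python
def downtime_minutes(uptime_percent: float, window_days: int = 90) -> float:
    """Approximate downtime in minutes implied by an uptime percentage.

    Assumption: downtime = (100 - uptime)% of the whole window. This is a
    simplification and not necessarily how the status page computes uptime.
    """
    total_minutes = window_days * 24 * 60  # 90 days -> 129,600 minutes
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    # Figures taken from the component list above.
    for name, pct in [
        ("Astro Hosted", 99.94),
        ("Scheduling and Running DAGs and Tasks (Hosted)", 99.82),
        ("Astro Hybrid", 99.84),
    ]:
        print(f"{name}: about {downtime_minutes(pct):.0f} minutes of downtime")
```

Under this assumption, 99.82% uptime over 90 days corresponds to roughly 233 minutes of downtime.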
Mar 25, 2025

No incidents reported today.

Mar 24, 2025
Resolved - We have confirmed that no additional clusters were affected beyond those that were initially identified. This incident is fully resolved.
Mar 24, 14:50 UTC
Identified - This issue is specific to clusters with custom networking, in particular route tables that require a carve-out for traffic back to Astro's control plane. The public IPs for the control plane changed, and these custom networking setups needed their carve-outs updated to the new IPs.

We have fixed this for all customers who reported this issue and are checking all clusters to determine if there are any others affected.

Mar 24, 14:23 UTC
Investigating - We are experiencing an issue in a few clusters, causing the Airflow UI and API to become unavailable.
The team is actively investigating the issue.

Mar 24, 10:30 UTC
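
The Mar 24 incident above was attributed to route-table carve-outs that no longer matched the control plane's public IPs. As a rough illustration of how such a mismatch could be detected, the sketch below checks which control plane IPs fall outside a set of carve-out CIDR blocks. The function name `uncovered_ips` is hypothetical, and the addresses are placeholders from the reserved documentation ranges, not Astro's actual IPs.

```python
import ipaddress


def uncovered_ips(control_plane_ips, carve_out_cidrs):
    """Return the control plane IPs not covered by any carve-out CIDR.

    Both arguments are lists of strings. This is an illustrative check only;
    it does not query any real route table or cloud provider API.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in carve_out_cidrs]
    missing = []
    for ip in control_plane_ips:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in networks):
            missing.append(ip)
    return missing


if __name__ == "__main__":
    # Placeholder values (documentation ranges), assumed for the example.
    new_control_plane_ips = ["203.0.113.10", "198.51.100.7"]
    existing_carve_outs = ["203.0.113.0/24"]
    print(uncovered_ips(new_control_plane_ips, existing_carve_outs))
```

Any address this check returns would need a matching route added before traffic to the control plane could flow again.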
Mar 23, 2025
Resolved - We have determined the event that caused this downtime and we are confident that it will not occur again. We will post a public RCA in the coming week.
Mar 23, 18:27 UTC
Monitoring - We have applied a remediation for all of the affected clusters. No clusters are currently experiencing downtime. We are continuing to examine the root cause and will update again when we are confident that the issue will not recur.
Mar 23, 15:44 UTC
Identified - We have identified a problem with scaling behavior that is causing a limited number of clusters to experience downtime. The UI displays the message 'Internal Server Error', which prevents viewing DAGs and the Airflow UI. In some cases this is also affecting task execution. We are currently working on a fix.
Mar 23, 14:51 UTC
Mar 22, 2025
Resolved - This incident has been resolved.
Mar 22, 02:44 UTC
Update - We are continuing to investigate this issue.
Mar 21, 22:05 UTC
Investigating - We are currently investigating this issue. Tasks do not appear to be impacted.
Mar 21, 21:55 UTC
Mar 21, 2025
Mar 20, 2025

No incidents reported.

Mar 19, 2025

No incidents reported.

Mar 18, 2025
Resolved - This incident has been resolved.
Mar 18, 05:24 UTC
Update - We are continuing to monitor for any further issues.
Mar 18, 04:21 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 18, 04:21 UTC
Identified - The issue has been identified and a fix is being implemented.
Mar 18, 04:12 UTC
Investigating - We are currently experiencing an issue impacting the Astro Control Plane Cluster during routine maintenance activities.

Current Impact:

The Astro UI and Astro API are currently unavailable. However, Airflow tasks will continue to run.

Actions Being Taken:

Our engineering team is actively monitoring and working to restore services promptly.

Next Update:

We will provide further status updates as more information becomes available.

We apologize for the inconvenience and thank you for your patience.

Mar 18, 03:43 UTC
Mar 17, 2025

No incidents reported.

Mar 16, 2025

No incidents reported.

Mar 15, 2025

No incidents reported.

Mar 14, 2025
Resolved - This incident has been resolved.
Mar 14, 12:51 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 14, 08:36 UTC
Identified - Our engineering team is actively investigating the root cause of this issue. We are working on implementing a long-term fix to restore full functionality. Further updates will be provided as we make progress.
Mar 14, 03:41 UTC
Investigating - Affected Services: Astro Alerts (DAG success, failure, SLA miss notifications)

Description: We are currently investigating an issue affecting DAG alerts on Astro.

Impact: Customers may experience delays or failures in receiving DAG alerts.

Mar 14, 03:39 UTC
Mar 13, 2025

No incidents reported.

Mar 12, 2025

No incidents reported.

Mar 11, 2025

No incidents reported.