Resolved -
All Astronomer components have returned to a healthy state.
Oct 20, 21:49 UTC
Update -
Following mitigations applied by the AWS team, we have observed pods that were previously stuck in the Pending state gradually being scheduled onto EC2 nodes. This should begin to resolve the issues with task execution. We are actively monitoring the situation.
Oct 20, 18:19 UTC
Monitoring -
We have made a hotfix update to Astro to relieve the Airflow UI slowness in Azure, GCP, and AWS regions other than us-east-1. We are continuing to monitor the impact of the change, and early signs indicate that UI responsiveness is improving.
This update has no effect on the issues unique to deployments in AWS us-east-1.
Oct 20, 16:30 UTC
Update -
We are continuing to investigate this issue.
Oct 20, 16:12 UTC
Update -
The AWS outage is affecting an internal tool, which is causing Airflow UI slowness in clusters running on all clouds (not just AWS). Our development team is working on a fix.
Oct 20, 14:27 UTC
Update -
We’re aware that the Airflow UI has been running very slowly following the recent AWS outage. Our team is actively investigating the issue.
Oct 20, 13:08 UTC
Update -
We are continuing to investigate this issue.
Oct 20, 10:00 UTC
Investigating -
We are aware of an ongoing AWS outage in the us-east-1 (N. Virginia) region that is impacting multiple AWS services and related infrastructure components. Customers with Astro clusters and deployments hosted on AWS may experience degraded performance, failed task executions, or delays in accessing their environments. Our team is actively monitoring the situation and assessing the impact across affected deployments.
For real-time updates from AWS, please refer to their Service Health Dashboard - https://health.aws.amazon.com/health/status
The next update will be provided as more information becomes available.
Oct 20, 10:00 UTC