Monitoring -
We are continuing to monitor the situation as AWS works toward full recovery in the affected Availability Zone. At this time, we are seeing signs of stabilization, and most transient failures should continue to self-resolve.
We will provide further updates once AWS confirms that the issue has been fully resolved.
May 08, 2026 - 07:45 UTC
Investigating -
We are currently observing elevated task failures and latency for some deployments running in the AWS us-east-1 region. This is related to an ongoing AWS incident affecting a single Availability Zone (use1-az4), where EC2 and EBS resources have experienced impairments.
What to expect: You may see intermittent task failures or retries in your deployments. In most cases, these failures are transient and should resolve automatically as AWS continues recovery (see the retry-configuration sketch after this update).
What you should do: No immediate action is required. However, if you notice consistent or prolonged failures, please reach out to Astronomer Support, and we’ll help investigate further.
We are continuing to monitor the situation closely and will share updates as needed.
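The guidance above relies on Airflow's automatic task retries. For readers who want to make a DAG more tolerant of short-lived infrastructure failures like this one, here is a minimal sketch using Airflow's standard retry parameters; it assumes Airflow 2.4+, and the DAG and task names are hypothetical:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Retry settings that let tasks ride out short-lived infrastructure
# failures such as a single-AZ impairment.
default_args = {
    "retries": 3,                         # re-run a failed task up to 3 times
    "retry_delay": timedelta(minutes=5),  # initial wait between attempts
    "retry_exponential_backoff": True,    # widen the gap on each attempt
    "max_retry_delay": timedelta(minutes=30),
}

with DAG(
    dag_id="example_resilient_dag",       # hypothetical DAG id
    start_date=datetime(2026, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
):
    BashOperator(
        task_id="transient_failure_prone_task",  # hypothetical task
        bash_command="echo 'work that may hit transient errors'",
    )
```

With exponential backoff, successive attempts are spaced progressively further apart, which gives recovering infrastructure time to stabilize before the task runs again.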
Resolved -
The incident has been resolved.
May 7, 14:24 UTC
Monitoring -
A fix has been implemented, and we are monitoring the results.
May 7, 13:49 UTC
Update -
A fix has been prepared and is moving through deployment. We will provide another update once the rollout is complete and validation is underway.
May 7, 12:57 UTC
Identified -
We have identified the cause of the delayed delivery for some time-based alerts and are implementing a fix. We will provide another update once the fix has been deployed and validated.
May 7, 11:15 UTC
Investigating -
We are investigating delayed delivery for some time-based alerts. The delay is primarily visible for DAG Timeliness alerts and may also affect Observe-related SLA, Proactive SLA, and Data Quality monitor alerts. DAG Duration and Task Duration alerts do not appear to be affected at this time.
May 7, 10:31 UTC
Resolved -
Our team has determined that this only affects internal access; Astro end users are unaffected.
May 6, 20:05 UTC
Investigating -
Attempting to access the DAGs page in the Airflow UI results in a 403 Forbidden error. This should not affect task execution.
May 6, 19:56 UTC
Resolved -
This incident is now resolved, and all Cost Breakdown data is up to date. The issue that caused the delay has been fixed and should not recur.
May 4, 20:15 UTC
Update -
We are continuing to investigate this issue.
May 4, 10:56 UTC
Investigating -
We’re investigating an issue affecting Cost Breakdown data in the Dashboard. For affected customers, cost breakdown information may appear stale and may not have updated since May 1, 2026. Our team is working to identify the cause and restore current data. We’ll share another update as we have more information.
May 4, 10:56 UTC
Resolved -
We rolled back a change we had made to our authentication system. Any image push or configuration change made to a deployment after the rollback caused that deployment to recover on its own, which is why many users saw their 403 errors resolve after some time.
Apr 28, 00:16 UTC
Update -
We are continuing to investigate this issue.
Apr 27, 22:05 UTC
Investigating -
The issue currently appears to be limited to AWS clusters. We have a workaround we can apply while we investigate.
Apr 27, 22:04 UTC
Resolved -
Azure has confirmed the issue is resolved and services are back to normal. This incident is now closed. For more details, see the Azure incident history: https://azure.status.microsoft/en-us/status/history/
Apr 25, 05:05 UTC
Monitoring -
Azure has fixed the issue, and services are back to normal in East US. We are monitoring to make sure everything stays stable.
Apr 25, 03:56 UTC
Identified -
Azure East US has reported multi-service impact that is affecting Astro deployments in the region. For more information on the Azure outage, please visit: https://azure.status.microsoft/en-us/status
Apr 24, 17:28 UTC