This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.
Final Update: Wednesday, 14 July 2021 03:24 UTC
We've confirmed that all systems are back to normal with no customer impact as of 07/14, 03:10 UTC. Our logs show that the incident started on 07/13, 23:30 UTC, and that during the 3 hours and 40 minutes it took to resolve the issue, a subset of customers using Azure Log Analytics in the West US 2 region experienced intermittent log data latency and incorrect alert activation.
- Root Cause: The failure was due to issues with unhealthy nodes in the backend services.
- Incident Timeline: 3 Hours & 40 minutes - 07/13, 23:30 UTC through 07/14, 03:10 UTC
-Jayadev
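The stated incident duration can be checked against the start and end timestamps with a short calculation. This is only a minimal sketch: the variable names are illustrative, and the year is assumed from the update header.

```python
from datetime import datetime, timezone

# Timestamps from the incident timeline; the year (2021) is assumed
# from the "Final Update" header, since the timeline gives only month/day.
start = datetime(2021, 7, 13, 23, 30, tzinfo=timezone.utc)
end = datetime(2021, 7, 14, 3, 10, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta.
duration = end - start

# Split the total seconds into whole hours and remaining minutes.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} hours {minutes} minutes")
```

The result agrees with the 3 hours and 40 minutes reported in the timeline.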
Update: Wednesday, 14 July 2021 02:24 UTC
The root cause has been isolated to some nodes in the backend service becoming unhealthy, which led to a queue buildup that increased data latency for Azure Log Analytics customers in the West US 2 region. To address this, we restarted the affected nodes and scaled out the clusters to drain the backlog queue. Some customers may experience intermittent log data latency and incorrect alert activation starting from 07/13, 23:30 UTC. We estimate that the issue will be resolved within the next 4 hours.
- Next Update: Before 07/14 06:30 UTC