Experiencing Data Latency Issue in Azure Log Analytics for West US 2 Region – 07/14 – Resolved

Posted by

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Final Update: Wednesday, 14 July 2021 03:24 UTC

We've confirmed that all systems are back to normal with no customer impact as of 07/14, 03:10 UTC. Our logs show the incident started on 07/13, 23:30 UTC and that during the 3 hours and 40 minutes that it took to resolve the issue subset of customers using Azure Log Analytics in West US 2 Region experienced issues with intermittent Log Data Latency and Incorrect Alert Activation .
  • Root Cause: The failure was due to issues with unhealthy nodes in the backend services.
  • Incident Timeline: 3 Hours & 40 minutes - 07/13, 23:30 UTC through 07/14, 03:10 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.

-Jayadev

Update: Wednesday, 14 July 2021 02:24 UTC

Root cause has been isolated to issue related to some of the nodes in the backend service which went unhealthy and that led to queue buildup which was impacting data latency for Azure Log Analytics Customers in West US 2 Region. To address this issue we restarted the affected nodes and scaled out the clusters to drain the backlog queue. Some customers may experience issues with intermittent log data latency and incorrect alert activation starting from 07/13 23:30 UTC and we estimate the issue to be resolved in the next 4 hours. 
  • Next Update: Before 07/14 06:30 UTC
-Jayadev

This articles are republished, there may be more discussion at the original link. But if you found this helpful, you're more than welcome to let us know!

This site uses Akismet to reduce spam. Learn how your comment data is processed.