This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.
Final Update: Monday, 20 January 2020 04:19 UTC
We've confirmed that all systems are back to normal with no customer impact as of 1/19, 19:00 UTC. Our logs show that the incident started on 1/16, 14:00 UTC and that, during the 3 days and 5 hours it took to resolve the issue, a very small percentage of customers experienced alerting failures for the SCOM, ZABBIX, and NAGIOS data types in the following regions: UK South, Australia South East, Japan East, Central India, South East Asia, West Central US, Canada Central, and West US 2.
- Root Cause: The failure was due to a code regression in the most recent deployment.
- Incident Timeline: 3 days, 5 hours - 1/16, 14:00 UTC through 1/19, 19:00 UTC
-Jeff
Update: Monday, 20 January 2020 02:34 UTC
The root cause has been isolated to an issue with the latest deployed update, which was impacting the SCOM, ZABBIX, and NAGIOS alert types. To address this issue, a hotfix is still being deployed. Some customers may experience alerting failures for these alert types.
- Work Around: None
- Next Update: Before 01/20 15:00 UTC
Update: Sunday, 19 January 2020 14:06 UTC
The root cause has been isolated to an issue with the latest deployed update, which was impacting the SCOM, ZABBIX, and NAGIOS alert types. To address this issue, a hotfix is being deployed. Some customers may experience alerting failures for these alert types.
- Work Around: None
- Next Update: Before 01/20 02:30 UTC
Initial Update: Sunday, 19 January 2020 10:55 UTC
We are aware of issues within Azure Monitor and are actively investigating. Some customers may experience alerting failures for the SCOM, ZABBIX, and NAGIOS alert types.
- Work Around: None
- Next Update: Before 01/19 15:00 UTC
-Monish