Datto SaaS Protection - Backup/Export Failures Impacting Subset Of v2 Nodes
Incident Report for Datto
Postmortem

On March 26th, 2024, Datto SaaS Protection and Datto Backupify customers began experiencing service degradation. Starting April 9th, 2024, some end domains began to experience a complete service interruption.

The root cause of this service degradation was identified as a Microsoft 365 Exchange release, which Microsoft confirmed. Microsoft had patched its Exchange servers, and part of the deployment process involved reindexing all Exchange data in a way that affects the APIs that Independent Software Vendors (ISVs) such as Datto use for backup, causing all items to be resynchronized. In this case, the Exchange API that Datto uses to back up data was affected, resulting in a 14x increase in daily backup volume compared to the average daily load. The large increase in requests to the Microsoft API needed to resynchronize each Exchange folder also led to increased Microsoft API throttling.

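To illustrate why a surge of resynchronization requests runs into throttling, the following is a minimal sketch of how a backup client might react when an API starts rejecting requests. It does not describe Datto's implementation or any specific Microsoft API. It assumes throttling is signaled with an HTTP 429 status and an optional Retry-After value, and every name in it (call_with_backoff, make_fake_folder_sync, the response tuple) is hypothetical.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical response shape: (status_code, retry_after_seconds or None, payload).
Response = Tuple[int, Optional[float], str]


def call_with_backoff(
    request: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `request`, retrying while the server reports throttling (429)."""
    for attempt in range(max_attempts):
        status, retry_after, payload = request()
        if status == 200:
            return payload
        if status != 429:
            raise RuntimeError(f"unexpected status {status}")
        if attempt == max_attempts - 1:
            break  # no point waiting after the final attempt
        if retry_after is not None:
            # Assumption: the server tells us how long to wait.
            delay = retry_after
        else:
            # Otherwise back off exponentially, with jitter to spread retries.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(delay)
    raise RuntimeError("still throttled after max_attempts")


def make_fake_folder_sync(throttled_calls: int) -> Callable[[], Response]:
    """Simulated endpoint that throttles the first `throttled_calls` requests."""
    state = {"calls": 0}

    def request() -> Response:
        state["calls"] += 1
        if state["calls"] <= throttled_calls:
            return 429, 2.0, ""
        return 200, None, "folder synced"

    return request


# Record the waits instead of sleeping so the example runs instantly.
waits = []
result = call_with_backoff(make_fake_folder_sync(2), sleep=waits.append)
print(result, waits)  # folder synced [2.0, 2.0]
```

Each throttled request adds waiting time to a single folder sync. When every folder must be resynchronized at once, those waits accumulate across the whole backup run, which is consistent with backups starting but not completing in time.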

Datto was in direct communication with Microsoft from the outset of this problem, and on April 20th, 2024, Microsoft stopped the reindexing process for all Datto customers. Microsoft confirmed that it has found an alternative approach to complete its software deployment and will not resume reindexing Exchange folders moving forward.

Datto expedited the deployment of a significant amount of additional infrastructure and deployed a number of software changes to process this large data volume.

As of May 16th, 2024, the issue is resolved.

Posted Jun 03, 2024 - 17:09 UTC

Resolved
This incident has been resolved.
Posted Jun 03, 2024 - 17:07 UTC
Update
We have completed deploying additional infrastructure to all data centers in the impacted US East and international regions. The vast majority of domains are now current on incremental backups, and the domains that received additional infrastructure in the last 24 hours are expected to catch up by the end of the coming weekend.

The root cause analysis will be provided once investigations are complete.

Thank you for your patience and understanding. We remain committed to delivering exceptional service and continuous improvement.
Posted May 17, 2024 - 20:01 UTC
Monitoring
All changes have been deployed across our v2 platform and we are monitoring the results.
Posted May 16, 2024 - 18:54 UTC
Update
Our R&D team continues to deploy changes to alleviate backup errors on our v2 platform.


ETA for completion by region:


USE1 (use1-bfyii-xxxx) - 5/13/2024
GBE2 (gbe2-bfyii-xxxx) - 5/17/2024
DES1 (des1-bfyii-xxxx) - 5/17/2024

Please note: Timelines are subject to change.

Subscribe to Datto SaaS Protection on the Status Page for up-to-date information. You can monitor the current status of this issue at https://status.datto.com/
Posted May 06, 2024 - 20:42 UTC
Update
We are continuing to work on a fix for this issue.
Posted Apr 30, 2024 - 20:07 UTC
Update
We continue to deploy changes that have alleviated backup errors on the following v2 nodes:

use1-bfyii-1297
use1-bfyii-2475
use1-bfyii-2814
use1-bfyii-2162
use1-bfyii-2158
use1-bfyii-2364
use1-bfyii-2825
use1-bfyii-1597
use1-bfyii-1601
use1-bfyii-2597
use1-bfyii-2678
use1-bfyii-2683
use1-bfyii-2690
use1-bfyii-2735
use1-bfyii-2817
use1-bfyii-2818
use1-bfyii-2819
use1-bfyii-2374
use1-bfyii-1802
use1-bfyii-2476
use1-bfyii-1590
use1-bfyii-2298
use1-bfyii-2303
use1-bfyii-2299
use1-bfyii-2300
use1-bfyii-2301
use1-bfyii-2361
use1-bfyii-2378
use1-bfyii-2379
use1-bfyii-2730
use1-bfyii-674
use1-bfyii-1738
use1-bfyii-693
use1-bfyii-1798
use1-bfyii-2377
use1-bfyii-2368
use1-bfyii-2296
use1-bfyii-1665
use1-bfyii-2292
use1-bfyii-1740
use1-bfyii-1745
use1-bfyii-2483
use1-bfyii-1804
use1-bfyii-1811
use1-bfyii-1849
use1-bfyii-2742
use1-bfyii-1925
use1-bfyii-1949
use1-bfyii-2060
use1-bfyii-2367
use1-bfyii-2066
use1-bfyii-2057
use1-bfyii-2304

A subset of v2 nodes is still impacted, but additional infrastructure is being deployed for these nodes to improve performance.

Subscribe to Datto SaaS Protection on the Status Page for up-to-date information.

You can monitor the current status of this issue at https://status.datto.com/
Posted Apr 26, 2024 - 19:24 UTC
Update
Changes were deployed beginning on April 18th that have alleviated backup errors on the following nodes:

use1-bfyii-2475
use1-bfyii-2814
use1-bfyii-2162
use1-bfyii-2158
use1-bfyii-2364
use1-bfyii-1601
use1-bfyii-2597
use1-bfyii-2678
use1-bfyii-2683
use1-bfyii-2690
use1-bfyii-2817
use1-bfyii-2818
use1-bfyii-2819
use1-bfyii-1802
use1-bfyii-2476
use1-bfyii-1590
use1-bfyii-2298
use1-bfyii-2303
use1-bfyii-2299
use1-bfyii-2300
use1-bfyii-2301
use1-bfyii-2361
use1-bfyii-2378
use1-bfyii-2730
use1-bfyii-674
use1-bfyii-1738
use1-bfyii-2377
use1-bfyii-2368
use1-bfyii-2296
use1-bfyii-1665
use1-bfyii-2292
use1-bfyii-1740
use1-bfyii-1745
use1-bfyii-2483
use1-bfyii-1804
use1-bfyii-1811
use1-bfyii-1849
use1-bfyii-2742
use1-bfyii-1925
use1-bfyii-1949
use1-bfyii-2060
use1-bfyii-2367
use1-bfyii-2066
use1-bfyii-2057

A subset of v2 nodes is still impacted, but additional infrastructure is being deployed for these nodes to improve performance.

Subscribe to Datto SaaS Protection on the Status Page for up-to-date information.

You can monitor the current status of this issue at https://status.datto.com/
Posted Apr 25, 2024 - 19:54 UTC
Update
Changes were deployed beginning on April 18th that have alleviated backup errors on the following nodes:


use1-bfyii-2475
use1-bfyii-2814
use1-bfyii-2158
use1-bfyii-2364
use1-bfyii-1601
use1-bfyii-2597
use1-bfyii-2683
use1-bfyii-2690
use1-bfyii-2817
use1-bfyii-2818
use1-bfyii-1802
use1-bfyii-2476
use1-bfyii-1590
use1-bfyii-2298
use1-bfyii-2303
use1-bfyii-2299
use1-bfyii-2300
use1-bfyii-2301
use1-bfyii-2361
use1-bfyii-2378
use1-bfyii-2730
use1-bfyii-674
use1-bfyii-2377
use1-bfyii-2368
use1-bfyii-2296
use1-bfyii-1665
use1-bfyii-2483
use1-bfyii-1849
use1-bfyii-2742
use1-bfyii-2367


A subset of v2 nodes is still impacted, but additional infrastructure is being deployed for these nodes to improve performance. We expect the deployments for USE nodes to complete by Wednesday, April 24th, and for DES and GBE nodes to complete by Friday, April 26th.
Posted Apr 24, 2024 - 19:01 UTC
Update
Changes were deployed beginning on April 18th that have alleviated backup errors on the following nodes:


use1-bfyii-1601
use1-bfyii-1665
use1-bfyii-1802
use1-bfyii-1849
use1-bfyii-2296
use1-bfyii-2298
use1-bfyii-2300
use1-bfyii-2301
use1-bfyii-2303
use1-bfyii-2367
use1-bfyii-2475
use1-bfyii-2597
use1-bfyii-2742
use1-bfyii-2817
use1-bfyii-2818
use1-bfyii-674


A subset of v2 nodes is still impacted, but additional infrastructure is being deployed for these nodes to improve performance. We expect the deployments for USE nodes to complete by Wednesday, April 24th, and for DES and GBE nodes to complete by Friday, April 26th.
Posted Apr 23, 2024 - 19:47 UTC
Update
Changes were deployed beginning on April 18th that have alleviated backup errors on the following nodes:


use1-bfyii-2475
use1-bfyii-1601
use1-bfyii-1802
use1-bfyii-2300
use1-bfyii-2301
use1-bfyii-2296
use1-bfyii-1665
use1-bfyii-1849
use1-bfyii-2367


A subset of v2 nodes is still impacted, but additional infrastructure is being deployed for these nodes to improve performance. We expect the deployments for USE nodes to complete by Wednesday, April 24th, and for DES and GBE nodes to complete by Friday, April 26th.
Posted Apr 22, 2024 - 22:05 UTC
Update
Changes were deployed beginning on April 18th that have reduced backup errors on the impacted nodes.

v2 nodes are still impacted, but additional infrastructure is being deployed for these nodes to improve performance. We expect the deployments for USE nodes to complete by early next week and for DES and GBE nodes to complete by next Friday.
Posted Apr 19, 2024 - 19:35 UTC
Identified
We are currently aware of a problem where a subset of Datto SaaS Protection backups will start but never complete on certain v2 nodes. This can also impact exports on affected nodes.

Please note: This is not impacting partners on v3 pods.


Our Engineering team has identified the problem and is working towards a resolution.



Subscribe to Datto SaaS Protection on the Status Page for up-to-date information.



You can monitor the current status of this issue at https://status.datto.com/
Posted Apr 09, 2024 - 17:48 UTC
This incident affected: Datto SaaS Protection (SaaS Protection Backups).