On 7-October-2021 at 21:14 UTC, Datto Workplace Partners experienced a service interruption that caused Workplace v10 agents to intermittently lose their connection to the service.
The root cause of this service interruption was identified as database blocking caused by queries issued during the mass automatic upgrade of agents from v10.2 to v10.3.
Based on QA testing, the update was not expected to affect user agent connections, but the load on the US6 production cell was higher than anticipated. This was due to the combination of two factors: a change in how synced files are stored on devices (synced files are now encrypted; see Release Notes: Workplace for Windows and Mac v10.3) and the high average number of synced files on the cell.
Our Engineering team switched to a phased update on the cell, run for briefer periods of time, to avoid overloading the service. This resolved the problem by 00:55 UTC on 8-October-2021.
We plan to implement a feature to roll out agent updates at a slower pace, which should prevent this issue from recurring.
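
For illustration only, a slower-paced rollout of this kind is often built on stable per-agent bucketing, where each agent is assigned to a fixed bucket and updates are enabled for a growing percentage of buckets. The sketch below is a generic example and does not describe the actual Workplace implementation; the function names, the agent ID format, and the wave percentages are all hypothetical.

```python
import hashlib


def rollout_bucket(agent_id: str, buckets: int = 100) -> int:
    """Map an agent ID to a stable bucket in [0, buckets).

    Hashing keeps the assignment deterministic, so an agent stays in
    the same bucket across restarts and repeated checks.
    """
    digest = hashlib.sha256(agent_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_eligible(agent_id: str, percent_enabled: int) -> bool:
    """Return True if the agent falls inside the enabled percentage."""
    return rollout_bucket(agent_id) < percent_enabled


def plan_waves(agent_ids, wave_percents=(5, 25, 50, 100)):
    """Return, for each wave, the agents that become eligible in that wave.

    wave_percents are hypothetical cumulative thresholds; a real rollout
    would choose them from observed database load.
    """
    waves = []
    previous = 0
    for percent in wave_percents:
        newly_eligible = [
            agent for agent in agent_ids
            if previous <= rollout_bucket(agent) < percent
        ]
        waves.append(newly_eligible)
        previous = percent
    return waves
```

With this approach, each wave adds only the agents in the newly enabled buckets, so the upgrade load reaches the database in bounded steps and the rollout can be paused between waves if blocking is observed.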