RMM - Syrah - Reports of some agents rebooting following 10.1 platform release and subsequent agent update.
Incident Report for Datto
Postmortem

Following deployment of the 10.1 update on the Syrah platform, Datto RMM Partners experienced an issue that caused Windows devices with the Datto RMM agent installed to stop unexpectedly.

The root cause of this service interruption was new code, introduced with the release, that recursively kills any child processes spawned by a process the Datto RMM agent starts.

This code targets processes by process ID. Because Windows reuses process IDs, the code inadvertently killed unrelated Windows processes. System processes were among those incorrectly killed, which caused the blue screens.
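
For illustration only, the sketch below simulates one way process ID reuse can mislead a recursive kill. It is not the Datto RMM agent's code and not the planned fix. The `ProcInfo` record, the sample PIDs and timestamps, and both helper functions are hypothetical. It relies on the fact that Windows records a parent process ID at creation time and does not update it when the parent exits, so a long-lived process can appear to be the "child" of a newer process that has been given the same ID. A common guard is to accept a child only if it was created no earlier than its claimed parent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProcInfo:
    """One row of a hypothetical process-table snapshot."""
    pid: int
    ppid: int           # parent PID recorded at creation; never updated afterwards
    create_time: float  # creation timestamp, in seconds


def _children_index(snapshot: List[ProcInfo]) -> Dict[int, List[ProcInfo]]:
    """Group the snapshot by recorded parent PID."""
    index: Dict[int, List[ProcInfo]] = {}
    for proc in snapshot:
        index.setdefault(proc.ppid, []).append(proc)
    return index


def naive_descendants(root_pid: int, snapshot: List[ProcInfo]) -> List[int]:
    """Follow parent-PID links only; this is vulnerable to PID reuse."""
    index = _children_index(snapshot)
    found, stack, seen = [], [root_pid], {root_pid}
    while stack:
        for child in index.get(stack.pop(), []):
            if child.pid not in seen:
                seen.add(child.pid)
                found.append(child.pid)
                stack.append(child.pid)
    return sorted(found)


def safe_descendants(root: ProcInfo, snapshot: List[ProcInfo]) -> List[int]:
    """Follow parent-PID links, but reject any 'child' older than its parent."""
    index = _children_index(snapshot)
    found, stack, seen = [], [root], {root.pid}
    while stack:
        parent = stack.pop()
        for child in index.get(parent.pid, []):
            if child.pid in seen:
                continue
            if child.create_time < parent.create_time:
                # The recorded parent PID belonged to an earlier, now-dead
                # process; the ID has since been reused. Not a real child.
                continue
            seen.add(child.pid)
            found.append(child.pid)
            stack.append(child)
    return sorted(found)


# Hypothetical snapshot: PID 4120 was recently reused for an agent-started job.
ROOT = ProcInfo(pid=4120, ppid=900, create_time=1000.0)
SNAPSHOT = [
    ROOT,
    ProcInfo(pid=4200, ppid=4120, create_time=1001.0),  # genuine child of the job
    ProcInfo(pid=612, ppid=4120, create_time=10.0),     # old process, stale parent PID
    ProcInfo(pid=700, ppid=612, create_time=11.0),      # child of that old process
]

if __name__ == "__main__":
    print("naive:", naive_descendants(ROOT.pid, SNAPSHOT))
    print("safe: ", safe_descendants(ROOT, SNAPSHOT))
```

In this sample data the naive walk selects PIDs 612 and 700, which only look like descendants because of the stale parent ID, while the guarded walk selects only PID 4200. A real implementation would also need to confirm a process's identity again immediately before terminating it, since an ID can be reused between the snapshot and the kill.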

The errant code was reverted to resolve the issue, and new code is being written for a future release that will take this behavior into account.

Posted Nov 17, 2021 - 15:01 UTC

Resolved
This incident has been resolved.
Posted Nov 12, 2021 - 13:16 UTC
Monitoring
The fix has been implemented in the new version of the agent.

We will continue the roll-out of the 10.1 release as scheduled.
Posted Nov 10, 2021 - 11:26 UTC
Update
The root cause has been identified and the Engineering Team is preparing a permanent fix.

We appreciate your patience.
Posted Nov 09, 2021 - 16:51 UTC
Update
Our Engineering team has rolled out an update to the agent that mitigates the behavior experienced earlier today.

Agents will update organically if they are online. Partners can expedite the update process by restarting the agent or rebooting the machine.

The Engineering Team continues to investigate the root cause of the issue to implement a permanent fix before we continue the release of the 10.1 version to the other zones.
Posted Nov 09, 2021 - 12:10 UTC
Update
Our Engineering Team is still working on the agent hotfix to resolve the issue.

Thank you for your continued patience. We apologise for the inconvenience caused.
Posted Nov 09, 2021 - 11:36 UTC
Update
Our engineering team is currently working on a hotfix.

We will update this page once more information becomes available.
Posted Nov 09, 2021 - 09:20 UTC
Identified
Our engineers have identified the cause of the restarts and are currently working on a hotfix.

We will update this page once it is deployed.
Posted Nov 09, 2021 - 08:00 UTC
Update
Our engineers are continuing to investigate the cause of the Syrah-based agent reboots.

We have identified a workaround which Partners may apply to impacted agents to mitigate the issue. Please contact support and we will guide you through it.

Thank you again for your patience.
Posted Nov 09, 2021 - 05:18 UTC
Update
Our engineers are continuing to investigate the cause of the Syrah-based agent reboots.

Thank you for your continued patience.
Posted Nov 09, 2021 - 01:58 UTC
Investigating
Our engineers are urgently investigating a small number of reports of Syrah-based agents restarting and/or bluescreening following the 10.1 platform release and subsequent agent update.

We apologise for any inconvenience caused and will update this status within 30 minutes.
Posted Nov 09, 2021 - 00:54 UTC
This incident affected: Datto RMM (Syrah (APAC)).