On Tuesday, May 14th at 7:17 AM UTC, Datto RMM Partners experienced a service interruption in which devices encountered latency while connecting to the Zinfandel platform.
The root cause of this service interruption was identified as memory saturation in the load balancer service. As a result, the load balancers could not connect to the backend service that handles agent connections to the platform.
Various mitigation steps were taken to reduce the impact of the issue while the R&D team worked on a resolution. The team deployed a permanent fix on May 17th at 5:42 PM UTC by increasing the size of the instances that make up the load balancer service. Alerting on instance memory will be added to our infrastructure so that issues of this kind are detected much earlier and a similar incident can be avoided.