PSA - LON/UK Zones (WW16/WW4) - Service unavailable or long loading time - Under Investigation
Incident Report for Datto
Postmortem

On Thursday, December 31, 2020 at 11:45 UTC, partners in the LON, UK2 and UKLR zones saw an error page instead of being able to log in, or were disconnected after a successful login. The service interruption lasted for 32 minutes.

During troubleshooting, it was discovered that a tenant in the data center where we are colocated was the target of a Distributed Denial of Service (DDoS) attack. Although the data center subscribes to a DDoS mitigation vendor, the targeted tenant was not covered by the mitigation because of a configuration error. Manual intervention by the DDoS mitigation platform was required before service could be restored.

The data center’s DDoS mitigation platform was configured to operate only on /24 IP networks. The targeted tenant was supposed to receive DDoS mitigation, but their devices operated on a /23 network, which spans twice as many addresses as a /24, so not all of their IP addresses were included in the DDoS mitigation platform.
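The gap between a /23 and a /24 can be illustrated with a short Python sketch using the standard `ipaddress` module. The prefixes below are hypothetical private addresses chosen for illustration; the actual networks involved are not stated in this report.

```python
import ipaddress

# Hypothetical prefixes for illustration only; not the real networks.
tenant = ipaddress.ip_network("10.20.0.0/23")     # what the tenant actually used
mitigated = ipaddress.ip_network("10.20.0.0/24")  # what the platform protected

# A /23 holds 512 addresses, a /24 only 256.
print(tenant.num_addresses, mitigated.num_addresses)  # 512 256

# The part of the tenant's network left outside the mitigation.
unprotected = list(tenant.address_exclude(mitigated))
print(unprotected)  # [IPv4Network('10.20.1.0/24')]

# Any host in the upper half of the /23 is therefore not covered.
print(ipaddress.ip_address("10.20.1.15") in mitigated)  # False
```

In this example, half of the tenant's address space receives no mitigation, which matches the kind of partial coverage described above.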

In addition to reconfiguring the DDoS mitigation platform to accept network address spaces other than /24 networks, our data center provider has completed an audit of all IP addresses advertised on their network to ensure they are included in the DDoS mitigation platform.
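An audit of this kind could be sketched roughly as follows. This is not the provider's actual tooling; the function name `audit_coverage` and all prefixes are hypothetical, and the sketch assumes IPv4 prefixes only. It subtracts every protected prefix from every advertised prefix and reports whatever is left over.

```python
import ipaddress

def audit_coverage(advertised, protected):
    """Return the advertised address ranges not covered by any protected prefix.

    Both arguments are iterables of IPv4 prefix strings (an assumption of
    this sketch; mixing IPv4 and IPv6 would raise TypeError).
    """
    remaining = [ipaddress.ip_network(p) for p in advertised]
    for prot in (ipaddress.ip_network(p) for p in protected):
        still_uncovered = []
        for net in remaining:
            if net.subnet_of(prot):
                continue  # fully covered by this protected prefix
            if prot.subnet_of(net):
                # Partially covered: keep only the pieces outside `prot`.
                still_uncovered.extend(net.address_exclude(prot))
            else:
                still_uncovered.append(net)  # disjoint, nothing removed
        remaining = still_uncovered
    # Merge adjacent leftovers into the shortest list of prefixes.
    return list(ipaddress.collapse_addresses(remaining))

# Hypothetical example data.
gaps = audit_coverage(
    advertised=["10.20.0.0/23", "10.30.0.0/24"],
    protected=["10.20.0.0/24", "10.30.0.0/24"],
)
print(gaps)  # [IPv4Network('10.20.1.0/24')]
```

An empty result would indicate that every advertised address falls inside at least one protected prefix.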

Additionally, our data center provider is working with its network provider to deploy payload thresholding that limits core network traffic as a second line of defense. This will help ensure that future attacks are “blackholed” before they impair network availability or performance should the customer-specific mitigation service fail. Work on this task has been completed, and testing will begin on January 29, 2021.

Our data center provider will also revisit priorities for ISP advertisements to ensure appropriate load balancing is in place.

Posted Feb 01, 2021 - 14:51 UTC

Resolved
This incident has been resolved.
Posted Dec 31, 2020 - 14:22 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Dec 31, 2020 - 12:22 UTC
Investigating
Our teams are currently investigating Service unavailable or long loading times for PSA on LON Zones. An update will be posted here within 30 minutes with the status of this investigation.

Thank you for your patience!
Posted Dec 31, 2020 - 12:06 UTC
This incident affected: Autotask PSA (UK (United Kingdom), UK 2 (United Kingdom)).