PSA - LON Zone - Service unavailable or long loading time - Under Investigation
Incident Report for Datto
Postmortem

On September 03, 2020, from 10:15 UTC until 10:38 UTC, partners with PSA databases in the LON datacenter experienced a service interruption that caused an error page to be displayed instead of the normal login page. Users with established connections would have received errors and been disconnected from their sessions.

The root cause of this service interruption was an error in a framework component on one or more servers. This error caused a race condition, which in turn generated Send To Dev Errors (STDEs) and prevented users from logging in.

Engineers troubleshooting the issue restarted the web server application pools to reset all worker processes, after which the site behaviour returned to normal.

Server logs and STDE logs were inspected to determine the root cause, but engineers found no correlation between the log data and the behaviour of the affected server(s). We have put monitors in place to alert us if a similar condition occurs in the future.

Posted Sep 16, 2020 - 14:11 UTC

Resolved
This incident has been resolved.
Posted Sep 03, 2020 - 11:08 UTC
Monitoring
A fix has been implemented and the service is now restored. We are monitoring the platform.
Posted Sep 03, 2020 - 09:42 UTC
Update
Please rest assured that we have our best technical resources currently investigating this issue and we are aiming to restore the service as fast as possible.

We apologise for the inconvenience.
Posted Sep 03, 2020 - 09:35 UTC
Investigating
Our teams are currently investigating connectivity issues for PSA in the LON zone. An update will be posted here with the status of this investigation.

Thank you for your patience!
Posted Sep 03, 2020 - 09:21 UTC
This incident affected: Autotask PSA (UK (United Kingdom)).