Systems Failure
Incident Report for Ontraport
Postmortem

One of our physical servers experienced a low-level networking failure caused by a very rare manifestation of a bug in the kernel's network modules. Unfortunately, this error took down several critical subsystems that our services depend on.

As a result of this failure, we are re-evaluating the distribution and redundancy of the impacted systems so that this rare but high-impact failure mode does not recur.

Posted May 19, 2022 - 12:27 PDT

Resolved
On 2022/05/14 at 22:48 PDT, we experienced a loss of critical hardware due to a low-level networking failure. This failure took several critical subsystems offline for an extended period, leaving many services unavailable.
Posted May 14, 2022 - 23:00 PDT