We want to take this opportunity to address the recent service disruptions.
We understand that any downtime can be frustrating and inconvenient, and we sincerely apologize for any inconvenience you have experienced.
Our partner's team has identified the root cause of these outages, which relates to the hypervisor nodes on the servers. Specifically, they have been dealing with an unexpected kernel-level crash on several of these nodes that is suspected to be related to the drivers for the NICs (network interface cards) that were part of an automatic system update.
Since these issues began, the team has worked diligently to reduce the impact of these system reboots on service availability. The modifications implemented have significantly decreased the downtime window from the initial lengthy durations to a much shorter window; however, we of course understand that it would have been far better had these incidents not occurred in the first place.
We understand the importance of consistent, uninterrupted, reliable service and are dedicated to resolving this issue fully. Those of you who have been long-term clients already know that this type of issue is rare and that we always treat it as our top priority.
We have reason to believe that once the remaining nodes have rebooted into the latest kernel version, the system will stabilize and return to the reliable service you've come to expect from us.
We want to reassure you that our highest priority is the reliability and stability of our services. We appreciate your understanding and patience during this time and apologize again for any inconvenience caused.
The reboot process will continue until every node in the cluster is running the latest kernel; however, due to the scale of the cluster, these reboots may occur at any time.
We will provide further updates once the issue has been entirely remediated.
Thank you for your understanding.

99.769% April 2023

Sun | Mon | Tue | Wed | Thu | Fri | Sat |
---|---|---|---|---|---|---|
26 | 27 | 28 | 29 | 30 | 31 | 1 |
2 | 3 | 4 | 5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 | 13 | 14 | 15 |
16 | 17 | 18 | 19 | 20 | 21 | 22 |
23 | 24 | 25 | 26 | 27 | 28 | 29 |
30 | 1 | 2 | 3 | 4 | 5 | 6 |

99.626% May 2023

Sun | Mon | Tue | Wed | Thu | Fri | Sat |
---|---|---|---|---|---|---|
30 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 1 | 2 | 3 |
4 | 5 | 6 | 7 | 8 | 9 | 10 |

100% June 2023

Sun | Mon | Tue | Wed | Thu | Fri | Sat |
---|---|---|---|---|---|---|
28 | 29 | 30 | 31 | 1 | 2 | 3 |
4 | 5 | 6 | 7 | 8 | 9 | 10 |
11 | 12 | 13 | 14 | 15 | 16 | 17 |
18 | 19 | 20 | 21 | 22 | 23 | 24 |
25 | 26 | 27 | 28 | 29 | 30 | 1 |
2 | 3 | 4 | 5 | 6 | 7 | 8 |