Yesterday afternoon we experienced some problems with the service, initially with the search component and shortly afterwards with the site itself. The problems persisted for approximately 40 minutes. Unfortunately the initial SMS alert from the monitoring service either wasn’t sent or didn’t arrive, so the on-call engineer only received the second alert, half an hour after the first should have been sent. This meant the outage was far longer than we’d normally expect. To try to avoid this kind of problem in the future, we’ve increased the frequency of monitoring alerts, so if one is missed, the subsequent message is sent sooner.
Latest Updates: outage RSS
-
robl
-
robl
We experienced about 5 minutes of outage earlier in the hour due to the Folksy servers coming under increased load. We’re investigating at the moment but the service should be back to normal and we’re keeping a close eye on it.
-
robl
We just experienced a problem with one of the servers in the Folksy Cluster. This may have resulted in some customers experiencing error messages whilst attempting to access the site. The problem persisted for approximately 10 minutes whilst it was diagnosed. We’ve removed the server from the cluster whilst we investigate further. The service is now running normally.
-
robl
We experienced some slow response times and an outage earlier today due to the Folksy database servers being overloaded. As part of our ongoing upgrades to the service, we’ll be migrating onto a new database cluster in a scheduled maintenance period, which should alleviate these problems. The maintenance will begin at midnight tonight (00:00 3rd Nov 09) and last for approximately an hour.
-
robl
Our upstream hosting providers are experiencing problems with one of their system components which is causing a service outage. We’ll update once we know more.
Update (10:09am) : The issue is still being investigated by our hosting provider.
Update (12:53pm) : We’ve added a holding page as the outage is expected to continue for the next few hours.
Update (1:54pm) : Our hosting company hopes to bring the service back online at around 18:00 this evening. We’ll post any further updates as we hear them.
Update (5:52pm) : We’ve brought the service up on alternative servers whilst the hosting company restores our service. The forums and main site are running as normal; however, the blog will be restored tomorrow.
-
robl
We just experienced a problem with the search component which led to other issues on the site, such as an inability to list or edit items. The service has been restarted and should be back to normal. The problem persisted for approximately 15 minutes.
-
robl
We are experiencing some problems with our upstream hosting provider, resulting in an outage to the service. We’re waiting for our hosting provider to rectify the issue. We’ll update here when we know more.
UPDATE : It appears to be a failed SAN disk, resulting in a general performance degradation. The disk is being replaced at the moment.
UPDATE2: The failed disk has been replaced and the storage array is rebuilding. This means the site may be a little slow until this operation completes, but it is now available again.
-
robl
Our search component just failed, meaning users may have experienced problems searching and taking other actions on the site. We’ve added further monitoring and automated restart processes to minimise disruption if this happens again in the future.
-
robl
We just rebooted the search service on Folksy. Customers will have experienced approximately 2 minutes of outage whilst this was in progress.
-
robl
We just experienced a problem with one of the servers in the Folksy cluster. Customers hitting this server will have been served an error page whilst it was rebooting. Approximate outage time was 10 minutes.