Tag Archives: outage

Folksy issues 16:15

Hi — we’re having some performance/availability issues. This is due to a caching service we use being down. We’ve raised an urgent ticket with their support people, and will let you know as soon as it is resolved.
In the meantime, we’re sorry about the inconvenience.
Thanks,
Doug.

UPDATE: This is looking fixed, for now, although we’ll be monitoring, of course. Apologies again to anyone who was affected.


Image issues

Hi — there is currently a major outage at RubyGems, the package manager we use to provision instances of our image service servers.
We’re trying to mitigate it, but at the moment it looks like we’re going to have to ride the issue out and hope that the people at rubygems can fix things quickly.
In the meantime, the vast majority of regularly-accessed images are in the cache and are still serving.
But people won’t be able to upload new images, and some images won’t be displaying if they weren’t already in the cache as of 2 hours or so ago.
We’re really sorry for the inconvenience, and are monitoring the situation with RubyGems so that we can fix things as soon as they have a handle on their problem.
Thanks,
Doug.


Site became unresponsive for around 5 minutes at 13:30

Hi — the site became unresponsive for a short while at 13:30 — this was due to some web workers on our application beginning to time out.
We’re currently investigating why this occurred.
We restarted the web workers in question, and that seems to have stopped the timeout errors appearing, but of course we’ll be monitoring the servers to make sure that everything is as it should be.
Sorry to anyone who got caught by that; we hope our catching and fixing it quickly minimised the inconvenience.
Thanks,
Doug.


Report on search index outage

Hi — as promised, I’m passing along the details from our search index provider about the outage we had this morning.
They said they were having issues with memory leakage, which necessitated a full cluster restart. They are investigating what caused that, and in the meantime have added lots of physical memory to the cluster to prevent another occurrence of the outage.
I’ll let you know when we hear anything else.
Thanks,
Doug.


Search index service is back up

Hi — it looks like the search index service is back up.
We’ll let you know what our provider says as soon as they get back to us. I imagine they’re pretty busy making sure things are all in order at the moment.
In the meantime, we’re sorry once again for the outage and the inconvenience it caused.
Thanks,
Doug.


Search index service down

Hello — our search index service has fallen over, which means that large parts of the site are currently broken. We’ve contacted our search index service provider, and they will be scurrying to fix this as quickly as they can.
In the meantime, we’re looking into constructing an alternative index of our own, in case it takes them a while.
Hopefully our alternative won’t be needed, or even have the time to be completed, because our provider will have resolved things quickly.
We’re really sorry about this, and will keep you posted on progress here.
Thanks,
Doug.


Update on the Folksy issues of the last couple of days

Hi — just an update on where we’re at with the Folksy issues.

Upon investigation, it looks like our database service can no longer handle the number of calls being made to it.

To fix this, we’re looking at increasing the bandwidth of that service, or maybe rebuilding a bigger, better database. We’ll see which looks like the better option as we investigate the ramifications of each.

Until that is completed (should be a day or two’s work), we’ve increased the cache times of some services (like the services that generate the smorgasbord on the front page) to reduce the number of database calls. We’ve also disabled some callbacks, like the ones that highlight items you have loved, to the same end.

Those services we’ve disabled will be getting re-enabled some time tonight, so we should be running with a full service very soon.

Those services whose cache times have been increased will stay at the increased levels until we have the better database solution in place.

Once again, we’re really sorry for any inconvenience that has been caused to everyone whilst we’ve had these problems.

Thanks for being so supportive and patient with us; it really is appreciated.

   Doug.


Folksy outage for 5 minutes at around 12:10

Hi — we had an outage at around 12:10 today that quickly resolved itself.

We’re not currently sure why, but are investigating.

We’ll keep you posted as to what we find.

In the meantime, we’re sorry for any inconvenience.

Thanks,

   Doug.


Folksy erroring for 30 seconds, just now

Hi — sorry, Folksy was throwing an error for 30 seconds there, due to a bad deploy.

We rolled back right away, and are now fixing the cause of the issue.

Sorry to anyone who was inconvenienced.

   Doug.


Folksy disappeared for a minute there, sorry

Hi — Folksy just disappeared for a minute, sorry.

We’ve been pushing a lot of code changes recently, and this one caused a momentary wobble on the servers.

We caught and fixed the error within a minute, so everything should be good now.

We of course apologise to anyone who was inconvenienced by that.

   Doug.
