…and Why I Haven’t Updated in a Few Days
An astute reader might recall the so-called “capacitor plague” from the earlier part of this decade. The general consensus holds that the plague of failing capacitors originated from corporate espionage and the theft of an electrolyte formula that was missing a critical component. Without that component (a stabilizer), charge and discharge cycles, combined with their respective heating and cooling, would eventually generate a buildup of hydrogen gas, triggering a potentially catastrophic failure of the capacitor.
I recall reading about that in 2005, because that was when the influx of failing boards built with these capacitors in previous years began to hit computer repair shops. I was working for TCI during my fall semester of that year, next door to MDC Computers, and I recall that for several months, they were tending to nearly a machine a week suffering from “bad caps.”
When I left to finish my studies, I thought that the faulty capacitor problem would be destined to become a distant memory. In December 2006, I built my existing workstation; it was reasonably inexpensive, and I’ve always had an interest in building and integrating the components of my own volition, but I never realized that a fairly critical component would fail about three years and three months later due to precisely the same reason that had kept the guys at MDC insanely busy for months.
It was Thursday evening, and I was working on a TurboGears project. As is typical of those evenings when I feel endowed with a sense of adventure enough to explore unfamiliar frameworks, I had a few dozen windows open, some music playing, several SSH sessions, and the like. I recall that I was exploring some of TurboGears’ internals and was in the process of setting up a template to test a new idea.
Then my monitor shut itself off.
Over the years, I’ve experienced a few unusual hardware failures, including faulty video cards, but save for a catastrophic event tied to CPU, motherboard, memory, or power supply failures, I’ve never experienced many that would freeze the entire operating system. Everything had frozen; no longer were my speakers playing in the background (I was listening to something in the trance genre) and neither would the machine respond to pings from the file server. It was dead.
Of course, when you’re working on a system that mysteriously freezes, your first inclination is to send a few curses about software problems flying away. I had been booted to Windows that day, and you might imagine that several of my foul words had been directed toward Redmond, Washington! Whatever it was, it was probably a driver fault, and I suspected at the time (ironically, in retrospect) that it was tied to the video card. NVIDIA’s drivers are reasonably stable, though their quality has degraded over time, and I’ve seldom encountered any extraordinary circumstances under which they’ve failed. My thoughts, then, shifted to my aging Sound Blaster Live!; two months prior, it began to exhibit a foul temper when loading games and often generated a blaring, throbbing tone in protest. Creative Labs certainly would never be a company one might accuse of writing outstanding drivers! (In their defense, they have gotten better over the years, but I’ve seen more than my fair share of “IRQL_NOT_LESS_OR_EQUAL” BSoDs implicating Creative’s sblive! driver.)
Nevertheless, a simple software problem can always be solved by popping the reset button, booting back to the operating system at fault, and examining logs for hints that might lead to the apprehension of the bug in question. Once Windows finished starting, I entered my password and waited. As the background appeared and the taskbar was loading, the black screen mysteriously appeared and the system hung.
I sat there for a moment perplexed that the same event which was originally responsible for forcing me into troubleshooting mode had returned to haunt me a second time. Worse, there was no evidence of a driver-related BSoD or other kernel-level OS fault (Windows started normally without so much as a prompt to enter safe mode). I knew what to do: Boot to Gentoo, examine the message log for potential clues that might indict a growing hardware problem, and then take a closer look at the Windows partitions for further evidence of possible kernel dumps.
I made the decision then to boot Gentoo and waited. A few errors popped up during init that indicated my Windows drive couldn’t be mounted, and neither dbus nor hald were able to start. I wasn’t terribly surprised by the latter, but the former event had me puzzled. dbus and hald were dependent upon a library I had rebuilt some days earlier, and I suspected then that they were linked to the earlier version; certainly nothing a revdep-rebuild wouldn’t fix. But the issue with my Windows drive being unmountable bothered me. Had the drive actually died? There wasn’t any indication of an immediate hardware problem in the device, but it most assuredly could not be ruled out.
As I entered my password and waited momentarily for the X session to finally launch, I mulled over the likelihood of a hard disk failure. That had to be it! No matter, I thought, I’d reimage the machine tomorrow and make my determination of a potential storage issue. Yet again, my contemplation was interrupted by a black screen–and a total system freeze.
Double-U Tee Eff, question mark.
I tapped my fingers on the keyboard. This was unusual, very unusual. Two hard drives failing? Possible, yes, but unlikely. Perhaps the SATA controller on the motherboard had gone on the fritz or was in its death throes. Clearly, something was angry. Very, very angry. I had one more operating system to try.
Another tap of the reset button, a few thumps of my thumb against the wrist rest, and Ubuntu was loading. Ahhh, KDE 4! How I had forgotten about you, I thought. Surely this would work. I tapped away my password, hit enter, and waited…
Black screen. Again.
Whatever was happening was far from humorous now. I reached over for a live CD but withdrew my hand only moments before touching the case. That wasn’t going to work if the SATA controller had indeed failed–my DVD drive is also plugged into that same controller! I’d have to dig up an old PATA drive later if I wanted to even consider the possibility of booting to a repair disk. BIOS might have additional clues.
One more press of the reset button revealed a more puzzling ending: nothing. No boot. No beeps. No errors. The system was dead.
For the rest of that night, I tried a variety of tests and made dozens of attempts to resuscitate the machine. Nothing worked. Not until the next morning, after I had tested the system’s power supply and happened to glance over at the video card sitting on my workbench, were the culprits revealed.
Close-up of the most badly blown capacitor
Oblique view of the row of failed capacitors
The board in question was an EVGA 7600 GS (512MiB RAM), and it failed approximately three months after the end of its three-year warranty period. Of course, it isn’t like I’d actually want to replace it; I was due for an upgrade!
The interesting bit about this story is that about four or five months ago, I was startled at the sound of a loud “pop” coming from the direction of my workstation. I assumed it was the violently explosive reaction of a wayward bug that managed to discover a short, yet nothing was immediately indicative of any problem. I even disassembled my workstation and examined the motherboard for capacitor problems, but I never considered that the video card would be at fault. As it turns out, capacitor failures on this board are unsurprising. It’s just interesting how long the card managed to limp along without any overt indication of its plight. (When I questioned my father–an electrical engineer–about this, he explained that capacitors which fail with a burst along their top vent still have some capacitance that diminishes over time and exposure to heat. Eventually, they will fail completely, but unless the failure is close to the contact point, the capacitor can still work.)
It was an interesting find, to say the least!
4 Responses to “Popping like Popcorn: A Tale of Four Capacitors”
Holy cow… This would’ve made an awesome episode of “Unsolved Mysteries” had you not found the damned thing.
Bah… now what am I going to do with the next hour of my life?
I was writing up a project proposal. I think I need a mental release. Maybe something with zombies would suffice…
Well, well, well! The exact same problem as you – exactly the same capacitors have burst – after 4.5 years.
But, another capacitor, at the edge opposite the VGA connectors (there’s 2 there), the one closer to the edge, has also burst. Did that one also burst for you?
I did hear pops – but I didn’t think it was coming from my computer! My god!
Elsewhere on the ‘net I read that the caps are “electrolytic” and not “solid” caps, but they are painted to look like solid caps (no plastic cover, the blue crescents).
I wonder if this is an EVGA problem? And what’s that copper-coily-thing-in-grey among the capacitors – the top of it seems melted, just as in your pics.
Ps. What video card do you have now and did you buy it after good research? How is it performing now?
It certainly seems you got more use out of your card than I did! Mine lasted about 3 years. I replaced it with another EVGA that I only used for two (I didn’t want to purchase that same brand again, but it was on an emergency basis and was the only one the local store had in stock that was NVIDIA-based). My second EVGA card worked quite well, but before my latest build (May of this year, 2011), it was causing me similar grief, possibly due to the PCIE power connector either not fully connecting or due to the power supply connector. I’m not 100% sure which because the problem was so intermittent.
In my case, only 4 capacitors burst. It’s hard to see from the pictures–which admittedly are of really poor quality–but the remaining caps did OK. I think the card just gave up the ghost once all the electrolyte boiled off (you can see a little of the brown stuff around the edges–there’s no solid states on here). :)
On the plus side, most solid state capacitors I’ve seen in the last couple years have all been square, so they’re fairly easy to spot!
The copper coil is an inductor of sorts and is most likely a choke coil. See here for a brief discussion in which someone asks how to replace one. It’s not melted, though. I believe that in the 7600s it’s actually embedded in epoxy that happens to be colored gray.
Currently, for my new build, I’m using a Zotac (GTS 450) card. It was a toss-up between MSI, Gigabyte, Asus, and Zotac. I wasn’t willing to spend a whole lot on the card as I’m not a huge gamer, and the MSI cards at the time had complaints related to noisy cooling (big minus). So far, the Zotac seems to be holding up quite well, but I’ve only had it for the whole of about 3 months. It runs a little cooler than my last EVGA card (9600 GSO) at 65C under heavy load versus almost 71C (!) but seems to do well. I needed it mostly for the dual DVIs, and it was reasonably inexpensive. I may replace it down the road with a more powerful one, but it does quite well and the heat sink seems better designed than the two previous EVGA cards I’ve had. The one on the 7600 was a joke.
I’m not sure if this is exclusively an EVGA problem, and it might just be coincidence with the bad caps that flooded the market for a few years (and still continue to). It does give me pause for thought about purchasing something from a company that doesn’t vet their components…