[cfarm-users] overheating and dead HBA
Luke Yasuda
jing at jing.rocks
Wed Jun 25 09:12:12 CEST 2025
Hello all,
Sad news: cfarm420..430, except cfarm423, are down. Today I had to shut
down almost all servers at home because the room temperature rose to 34
degrees Celsius and became unbearable, and the BMCs kept sending me
alerts. The air conditioning is working very hard but is not sufficient.
After a few hours, I booted some of them back online, but
cfarm420..422,430 are still down because the HBA is probably dead
(overheated?). The drives themselves and the SAS expander are probably
OK.
cfarm424..429 will be back online in a jiffy, once the room temperature
is back to 28 degrees. As for cfarm420..422,430, until I replace the HBA
or find a way to reroute(?) the drives, they will stay down for an
unknown period of time. I apologize for the inconvenience.
Clearly I need a long-term solution to the overheating issue. Apparently
I have too many servers at home. Swapping the air conditioner for a more
powerful one is a no-go because my landlord would not do it.
Fortunately, on a personal note, I plan to buy a home soon. I'll
personally pick out the best air conditioner units, and it will be the
permanent home for those servers. Until the servers are moved to the new
home (no later than the end of this year), there will be downtime, maybe
4 to 8 hours per week.
Downtime - sysadmins' nightmare. This shall be a lesson for those who
want to host servers at home.
--
Luke Yasuda
About me: https://jing.rocks/about/
GPG Fingerprint: 4E09 8D19 00AA 3F72 1899 2614 09B3 316E 13A1 1EFC