December 22, 2016 Hi, I've been using my system for quite some years without real trouble, but now I seem to be in kind of a rough period. Yesterday I got a disk read error, but it slipped by without me noticing. This morning I was moving a whole bunch of data on my Unraid when all of a sudden I got a message that there was not enough space. When I checked the GUI, my biggest data disk had disappeared.

As I've run into trouble with faulty/loose SATA cables in the past, I thought it was just going to be that, so I opened up my system and made sure everything was attached securely. However, when I tried to reboot, the system did not seem to detect my flash drive. I've run into this in the past and sometimes had to reboot several times before it got detected. At the same time my system acts up display-wise. HDMI always seems to output (but I only have a screen downstairs); VGA is about as reliable as my flash drive detection. The moment I attach an HDMI screen, USB flash detection works like it should.

So you'd probably think: just attach an HDMI screen and you're set? I wish... It seems the problem comes from somewhere else. While rebuilding my 'missing' drive, I all of a sudden couldn't connect to the GUI anymore, nor ping my server. When I went upstairs to check it out, I had all kinds of garbage characters filling up my VGA-attached screen.

I have now moved the system downstairs again and attached it to an HDMI screen. It boots like normal, so I decided to run a memory test. I tried it multiple times, but each time after a few minutes it gives me a screen like the attached screenshot. Now I've disconnected the flash drive, and it boots to the point where I just have a blinking cursor. This screen stays in perfect 'blinking' condition.

Could it be that my flash drive is faulty, or do I need to test other stuff first? Thanks a bundle to anyone who can help me along...!
December 22, 2016 Since you didn't attach diagnostics or describe your hardware at all, it's a little difficult, but my first guess is bad RAM. Try removing half of it, and see what happens. If symptoms are the same, switch to the half that you removed.
December 22, 2016 I'm afraid you have a serious hardware issue. That's a really interesting display there! Something is flipping the 2 bit on, which turns spaces into quotes, and adds 2 to many letters. Since it got to Test #4 without reporting a memory error, in an obvious memory corruption scenario, I think it's not a memory issue but a bus or register issue. That means motherboard or CPU. Could also be a heat issue. Overheating chips can be very flaky, but I would have hoped the system had shut down before now. I suppose I should mention bad power as a remotely possible suspect, but this doesn't look like it to me (but I could be wrong). You are going to have to swap out major components, until it all just works, including a Memtest. Personally I think it's the motherboard, and you may want to check for bulging caps on it.
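The bit-flip described above can be illustrated with a minimal sketch. It assumes the on-screen characters are plain ASCII codes and that the fault forces bit 1 (value 2) on in every character; the function name `force_bit_1` is hypothetical and is not part of Memtest or Unraid.

```python
def force_bit_1(text: str) -> str:
    """Simulate a stuck bit: set bit 1 (value 0x02) in every character code.

    Assumption: characters are ASCII, as on a text-mode screen.
    """
    return "".join(chr(ord(ch) | 0x02) for ch in text)


# A space is 0x20; with bit 1 forced on it becomes 0x22, the double quote.
print(repr(force_bit_1(" ")))

# Letters whose bit 1 is clear move up by 2 ('T' -> 'V', 'e' -> 'g');
# letters that already have the bit set ('s') are unchanged.
print(repr(force_bit_1("Test ")))
```

This matches the symptom in the post: only characters whose code has that bit clear are altered, which is why spaces turn into quotes and "many", not all, letters shift by 2.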
December 22, 2016 Author Good point: I don't have a syslog, but I can at least give you some info on my system:

CPU: Core™ i3-2100 (2x 3100 MHz) HR3I01
MB: H67M (Intel® H67 Express) GRIR02 (on-board GPU)
RAM: 4 GB DDR3-1333 (4096 MB) ICIF50
Flash drive: Cruzer 4GB

As I only have one piece of RAM removing half is a bit tough ... Anything else I can try to give you guys a clearer picture?
December 22, 2016 "As I only have one piece of RAM removing half is a bit tough ... Anything else I can try to give you guys a clearer picture?"

I was afraid of that. Unfortunately, I think Rob's analysis is probably correct, in that you are going to have to start replacing parts, aka shotgun troubleshooting. In your shoes, my first purchase would be another stick of RAM; worst case you end up with 8GB of RAM, which never hurts. Bad caps are always a possibility for flaky issues, but your board is a little newer than I would expect to start seeing cap problems. Do you have any PCIe cards? They map into memory space as well, so a badly misbehaving card is also possible.
December 22, 2016 Author The only other card in there is a "Promise Pci Sata 300 TX4 Controller Card (4x SATA) [HDCPROSATA300TX4]". No PCIe though. Caps all seem to be in normal condition (as far as I can tell). Buying another piece of RAM certainly should be possible. Next on the list would then probably be a new MB? Is there anything I need to or can do to hopefully save my data (as I was rebuilding my 'missing' drive during the failure)?
December 22, 2016 I think the best thing you can do for your data is disconnect it! Any access to your data could possibly result in data corruption. I'd pull all drive power cables.
December 22, 2016 "Anything I need to/can do to hopefully save my data (as I was rebuilding my 'missing' - drive during the failure)?"

Until you get things stable through a memtest, I'd disconnect all your data drives completely. No telling what would happen to your data if you wrote to the drives with the system flaky like it is.
December 22, 2016 Author Thanks. I disconnected all power cables to my drives. The weird thing now is that I created another flash drive just to run Memtest. It successfully completed a pass and has been running for 25 minutes straight without any hiccups. Could this at all be flash drive related? Or would you rather suspect a power problem in this case?

Edit: I just started from my Unraid flash drive and Memtest also seems to run without problems. So the only changed factor is that all my drives are no longer connected and have stopped drawing power. Do I need to look for another power supply, or could something else still be causing this?
December 22, 2016 I'd say the power supply has become a strong suspect. But first, re-seat all chips and cards, and make sure they all have good connections. Then reconnect one drive again, and try one Memtest pass. Then add a couple more back and retest, and repeat ...
December 23, 2016 Author Memtest had 5 full passes without any errors or weird characters on screen (with no disks attached, then 1, 2, 3 and all 6). I've now been rebuilding for the last 7 hours and all seems stable. I have no idea what caused this, so I'm still worried it might return at any moment, but I'll take it. Thank you guys for your advice. Fingers crossed... and happy holidays!