6.2.0 RC3 crashing with parity check



Crashing when running parity check on unRaid 6.2.0 RC3.

 

Hello there. My system seems to crash - it goes all the way to powered off - when I run a parity check.

Needless to say, this behavior is troubling. I'm trying to transition from a Synology system to unRaid.

My basic setup is an AMD A8 APU with 16GB of RAM and an SAS9200-8E card connected to a Lenovo SA120 DAS enclosure with 12 disks in it right now. I haven't added the old Synology disks to the array yet, just in case I need to put them back into the Synology box if I lose data  :o

 

Attached are my diagnostics. Please note that I have a gigabit USB LAN adapter attached to the server, and my network switch appears to be wonky and is throwing errors. I've got a new switch in the mail since mine appears not to support LACP/LAG. The system was already crashing like this before I added the additional NIC.

tun-diagnostics-20160821-0734.zip

Link to comment

I'm getting drive temps for all drives at about 39-40C. CPU temp not going above 51C. Fans all running properly both by visually inspecting and reporting from software.

 

I've thought about just trashing this MB/CPU because it's from an old desktop, but I just dropped a bunch of money on the SA120, caddies, and now a new switch. I've kind of exhausted my fun money.

 

Thanks.

Link to comment

OK, maybe there's something to this. The Dynamix temp monitor reports 53C sitting at idle, quickly shoots up to 62C when running parity check. The BIOS reports 63C sitting at idle with the stock cooler running, the top of the 2U case open and all the case fans running.

 

I'm inclined to believe that the sensor drivers Dynamix uses (k10temp, nct6775) are reading roughly 10C low, and that my system is overheating and shutting down when running parity check (because I believe the BIOS more than Dynamix).
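For anyone who wants to compare the driver readings against the BIOS number, here is a minimal Python sketch. It assumes the drivers expose their values through the standard Linux hwmon sysfs files (`/sys/class/hwmon/hwmon*/temp*_input`, in millidegrees C). The function names and the way the offset is applied are my own illustration, not anything Dynamix does internally, and the 10C figure is only my rough estimate from the readings above.

```python
from pathlib import Path


def read_hwmon_temps(base="/sys/class/hwmon"):
    """Return {(chip, label): degrees C} for every temp*_input file under base."""
    temps = {}
    for chip_dir in sorted(Path(base).glob("hwmon*")):
        name_file = chip_dir / "name"
        # Fall back to the directory name if the driver exposes no "name" file.
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for input_file in sorted(chip_dir.glob("temp*_input")):
            try:
                millideg = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor not readable right now; skip it
            prefix = input_file.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip_dir / (prefix + "_label")
            label = label_file.read_text().strip() if label_file.exists() else prefix
            temps[(chip, label)] = millideg / 1000.0
    return temps


def apply_offset(temps, chip, offset_c):
    """Add offset_c to every reading from the named chip; leave the rest alone."""
    return {
        key: (value + offset_c if key[0] == chip else value)
        for key, value in temps.items()
    }
```

Calling `apply_offset(read_hwmon_temps(), "k10temp", 10.0)` would give k10temp readings shifted up by my guessed 10C so they can be eyeballed against the BIOS value. Whether a flat offset is the right correction for this board is an assumption on my part.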

 

I think I'm probably just going to scrap this board.

 

I'm running dual parity, BTW.

Link to comment

You did look at the CPU heatsink very carefully and verified that it does not have ANY dust or dirt between the fins?  Are there any filters on the case?  Clean them also.

 

There is also a possibility that the heatsink has been disturbed and is no longer in proper contact with the CPU.  The AMD boards that I have seen use a locking latch to secure the heatsink to the CPU.  That latch didn't get released for any reason, did it?

Link to comment

OK - here are the answers to everyone's questions (I think).

Motherboard is ASRock FM2A85X

CPU is AMD A8-5500

PSU is EVGA SuperNOVA 650 P2, 80+ PLATINUM 650W - the only things on it are the case fans, motherboard/CPU (APU), 16GB of RAM (Corsair "Ripsaw"), an SSD, and an LSI SAS9200-8E connected to the external Lenovo SA120 DAS.

I removed the heatsink, sprayed and removed dust with compressed CO2, cleaned old CPU grease with the 2-step cleaner that the Arctic Silver people make and re-applied Arctic Silver 5 and re-seated and re-latched the CPU heatsink. I followed their directions to the letter, including "tinting" the heatsink. This is the stock CPU heatsink which is a downdraft cooler. I don't think having the case open would make CPU temps worse since it's downdraft and has unrestricted airflow with the lid off. I pondered getting an aftermarket cooler, but the "vertical" models don't fit in this case with its limited clearance. It's possible that an aftermarket downdraft would improve temps slightly, I guess.

I re-flashed the BIOS with the latest updates and reset to defaults.

Case is a Rosewill 2U rackmount case (man, it's a crappy case, I wish it had never darkened my doorstep) - I have removed all of the air filters to help airflow.

I'm now running Memtest again - now that I've done everything noted above. I found that Memtest froze on test #7 more than once when forcing multi-processor mode (which I hadn't tried before but decided to try since this seems to be a CPU issue). This seems to maybe be a bug in Memtest - http://canardpc.com/forums/threads/84663-Memtest86-is-freezing-while-running-test-7. Not sure what to make of this.

So I now have it running in single processor mode, i.e. I just let it run after selecting Memtest from the boot menu. This single processor mode ran fine for hours and hours previously. I'm letting it run for 8+ hours tonight to see what happens.

 

Thanks everyone for the helpful replies.

Link to comment
