01111000 Posted April 8, 2014

Hey guys. One of my old Samsung 1.5TB drives is starting to give a bunch of errors during parity checks. My motherboard has 6 SATA ports and I use a 2-port card to give me a total of 8 drives. The server currently has 8 drives, so I purchased a 4-port card I saw on sale (from the Good Deals forum) a while back to expand a bit. This is the card: http://www.newegg.com/Product/Product.aspx?Item=16-124-064

So I installed the new card in place of the old 2-port card and the array came online just fine with everything assigned correctly. I added a WD 3TB Red I got on sale to the new card and started preclearing it. During the preclear, another drive, a 2TB Seagate (Samsung RMA), red-balled. I figured that it had just died, so when the 3TB Red finished its preclear, I assigned it to the old slot and the drive was rebuilt.

What's odd is that I decided to run a preclear on the 2TB that red-balled and it seems to be working fine (it's not finished yet, it's around 51%). Ironically, during this new preclear, the new 3TB WD Red drive I just installed has red-balled.

Both of the drives that have red-balled are on the same controller. Is it the controller's fault, and how can I check? How can I troubleshoot this? What logs do you guys need?
dgaschk Posted April 8, 2014

Attach a syslog. What is the model number of your PSU?
01111000 Posted April 9, 2014

The 2TB Seagate that previously red-balled passed the preclear without any issues (everything was 0). I've attached the syslog.

Here is the PSU: http://www.newegg.com/Product/Product.aspx?Item=N82E16817371029

Now that I think about it, it's probably the PSU? There are currently 9 drives in the server.

Other specs:
System: Supermicro - X7SPA-HF
CPU: Intel® Atom CPU D525 @ 1.80GHz
Cache: 48 kB, 1024 kB
Memory: 4096 MB (max. 4 GB)
unRAID 5.0. I was waiting for Tom to drop 5.0.6 before upgrading, but you know how that goes.

Thanks for all the help guys.

Attachment: syslog.txt
dgaschk Posted April 9, 2014

Look in /var/log for additional syslog files. They are named syslog.1, syslog.2, etc.

    ls /var/log
    cp /var/log/syslog.* /boot

The second command will copy all of the syslog files to the flash drive. Attach the additional files.
01111000 Posted April 9, 2014

What the syslogs are mostly filled with is old stuff on the cache drive that was downloaded through sabnzbd. Any idea where I can upload these files? Each one is over a megabyte in size and pastebin's limit is only 500 KB.
trurl Posted April 9, 2014

Do you mean the older syslogs are large because they have a lot of entries for mover? You can probably just delete those lines to make them smaller. Then you can zip the logs and post them directly to the forum. Text compresses very well.
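Deleting the mover lines by hand gets tedious on a megabyte of log, so here is a minimal sketch of doing it with grep. The function name `trim_syslog` and the match pattern are assumptions, not anything unRAID provides: look at a few of the noisy lines first and pick a string they all share.

```shell
# Hypothetical helper: write a copy of a syslog without the noisy lines.
#   $1 = pattern shared by the lines to drop (an assumption; inspect the log first)
#   $2 = source syslog file
#   $3 = destination file for the trimmed copy
trim_syslog() {
    pattern="$1"
    src="$2"
    dst="$3"
    # grep -v keeps every line that does NOT match the pattern.
    # grep exits with status 1 when no lines are left; that is not an error here.
    grep -v -e "$pattern" -- "$src" > "$dst" || [ $? -eq 1 ]
}
```

For example, `trim_syslog 'logger:' /boot/syslog.1 /boot/syslog.1.trimmed` would drop every line containing `logger:`, which is only a guess at how the mover entries are tagged and may also drop unrelated lines. The original file is left untouched, and the trimmed copies on the flash drive can then be zipped on any desktop machine.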
01111000 Posted April 10, 2014

Thanks for all the help so far guys. Syslogs attached. They're cluttered.

Attachment: syslog.2.zip
dgaschk Posted April 10, 2014

Check for BIOS and firmware updates and try a new PSU. See here: http://lime-technology.com/forum/index.php?topic=12219.0
01111000 Posted April 10, 2014

Is there anything in the syslog that indicates an error? I looked but didn't see or understand a cause.

I'll try a new PSU, if that's your recommendation. How would I proceed after I purchase a new PSU though? I would still need to rebuild that drive, correct? Should I re-run a preclear on that 3TB drive before rebuilding as well?

If it is the PSU, what exactly happens? Does it not have enough juice to power the drive up, so that when the system tries to access the drive and can't see or read from it at that moment, it reports the drive as failed?
dgaschk Posted April 10, 2014

If the PSU is the issue, this should no longer appear in the syslog:

    Apr 8 10:37:42 Tower kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
    Apr 8 10:37:42 Tower kernel: ata8.00: failed command: SMART
    Apr 8 10:37:42 Tower kernel: ata8.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
    Apr 8 10:37:42 Tower kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
    Apr 8 10:37:42 Tower kernel: ata8.00: status: { DRDY }
    Apr 8 10:37:42 Tower kernel: ata8: hard resetting link
    Apr 8 10:37:42 Tower kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Apr 8 10:37:47 Tower kernel: ata8.00: qc timeout (cmd 0xec)
    Apr 8 10:37:47 Tower kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    Apr 8 10:37:47 Tower kernel: ata8.00: revalidation failed (errno=-5)
    Apr 8 10:37:47 Tower kernel: ata8: hard resetting link
    Apr 8 10:37:48 Tower kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Apr 8 10:37:48 Tower kernel: ata8.00: configured for UDMA/133
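To check after a PSU swap whether messages like these have stopped, something along the following lines could count them per ATA port. This is only a sketch: `ata_error_summary` is a made-up name, and the list of phrases is taken from the excerpt in this post rather than from a complete catalogue of libata errors.

```shell
# Hypothetical helper: count error-looking libata lines per port.
# Arguments: one or more syslog files (e.g. /var/log/syslog /var/log/syslog.1).
ata_error_summary() {
    # Keep only kernel lines for ataN / ataN.MM that contain one of the
    # error phrases seen in the excerpt (assumed list, not exhaustive).
    grep -hE 'kernel: ata[0-9]+(\.[0-9]+)?: (exception|failed|hard resetting link|qc timeout|revalidation failed)' "$@" \
        | sed 's/.*kernel: \(ata[0-9][0-9]*\).*/\1/' \
        | sort \
        | uniq -c
    # Each output line is "<count> ataN"; no output means no matching lines.
}
```

Running `ata_error_summary /var/log/syslog` before and after the change gives a rough comparison. If one port keeps showing up, the lines logged for that port at boot should name the drive attached to it.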
01111000 Posted April 19, 2014

It's been a week since I changed the PSU (went from a 400W to a 600W). I rebuilt that drive while I precleared another, and everything is running fine again (fingers crossed). Thanks for the help.