rickardk Posted December 4, 2011

The server disabled disk 3 while my movie scraper was updating nfo files (on all disks). Reiserfsck on disk3 returned no errors, and there are no SMART errors. Could it be a controller going bad on me? This happened yesterday, and since then I have had to hard reboot twice because the machine stopped responding. In unMENU's syslog viewer I can see a couple of red areas. Can anyone please tell me what to do?

Syslog attached: syslog-2011-12-04.txt
rickardk (Author) Posted December 4, 2011

Dec 4 14:16:26 UNRAID2 kernel: sd 5:0:0:0: [sdt] Attached SCSI disk (Drive related)
Dec 4 14:16:26 UNRAID2 kernel: ata20: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)
Dec 4 14:16:26 UNRAID2 kernel: ata20.00: qc timeout (cmd 0xec) (Drive related)
Dec 4 14:16:26 UNRAID2 kernel: ata20.00: failed to IDENTIFY (I/O error, err_mask=0x4) (Errors)
Dec 4 14:16:26 UNRAID2 kernel: ata20: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)
Dec 4 14:16:26 UNRAID2 kernel: ata20.00: ATA-8: WDC WD20EADS-00R6B0, 01.00A01, max UDMA/133 (Drive related)
Dec 4 14:16:26 UNRAID2 kernel: ata20.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA (Drive related)
Rajahal Posted December 4, 2011

Hi Rickard,

The 'failed to IDENTIFY' errors you are seeing in your syslog are perfectly normal for any server using more than one Supermicro AOC-SASLP-MV8 card. As the server boots, it loads the drivers for the first card normally. When it detects the second card, it tries to load the same drivers again and throws an error. This is common for any server using multiple SASLP cards, and it can be safely ignored. I also see no other indications of any type of hardware error in your syslog.

Since unRAID disabled disk3, a write to that disk must have failed. I would suggest replacing disk3 as soon as possible and letting unRAID rebuild its data onto a new disk. With disk3 out of your server, you can then use another computer to run further tests on it (such as a long SMART test) to evaluate the disk's health. If the disk is causing the server to lock up, then chances are it needs to be replaced.

If you still have issues with your server not responding after removing disk3, please let us know. However, I suspect this is a simple case of a single drive going bad.
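If you want to double-check a long syslog for this yourself, here is a rough Python sketch. It assumes that a 'failed to IDENTIFY' line is harmless when the same ATA device later reports its model on an `ATA-<n>:` line, as ata20.00 does in the excerpt above. The function name and the regular expressions are my own illustration, not part of unRAID or unMENU.

```python
import re

# Matches kernel lines such as "ata20.00: failed to IDENTIFY (...)".
FAILED = re.compile(r"kernel: (ata\d+\.\d+): failed to IDENTIFY")
# Matches the later identification line, e.g. "ata20.00: ATA-8: WDC ...".
IDENTIFIED = re.compile(r"kernel: (ata\d+\.\d+): ATA-\d+:")


def unresolved_identify_failures(lines):
    """Return devices whose IDENTIFY failure was never followed by a success.

    Assumption (hypothetical heuristic): a device that later prints an
    "ATA-<n>:" model line was identified on a retry.
    """
    pending = set()
    for line in lines:
        failed = FAILED.search(line)
        if failed:
            pending.add(failed.group(1))
            continue
        identified = IDENTIFIED.search(line)
        if identified:
            pending.discard(identified.group(1))
    return sorted(pending)


# Lines taken from the syslog excerpt posted above.
sample = [
    "Dec 4 14:16:26 UNRAID2 kernel: ata20.00: qc timeout (cmd 0xec)",
    "Dec 4 14:16:26 UNRAID2 kernel: ata20.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
    "Dec 4 14:16:26 UNRAID2 kernel: ata20.00: ATA-8: WDC WD20EADS-00R6B0, 01.00A01, max UDMA/133",
]
print(unresolved_identify_failures(sample))
```

To run it against the attached file, pass the open file object instead of `sample`, for example `unresolved_identify_failures(open("syslog-2011-12-04.txt"))`. Any device it lists failed to identify and never came back, which would be worth a closer look.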
rickardk (Author) Posted December 4, 2011

Thanks! I just replaced the disk and the rebuild has started.