Just Me Posted June 15, 2015 Hey! This is my first real problem with unRAID and I'm not sure how to handle it. I'm copying files from one disk share to another so that I can reformat some disks with XFS. I copied all files from disk4 to disk1 without any problems, changed the filesystem on disk4 to XFS, and copied all files from disk5 to disk4. Then I compared all files on disk5 with disk4 using rsync, so I know all files on disk4 are okay, but unRAID shows a read error on disk4. So what should I do now? diagnostics.zip is attached. The syslog is quite long and the error occurred at the end of the log, so better start at the end. The SMART report for disk4 is WDC_WD20EARS-00MVWB0_WD-WMAZA1724425 (sdk). Thanks in advance for your help. P.S. unRAID 6.0 RC 6a. Attachment: nas-diagnostics-20150616-0135.zip
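The post above verifies the copy with rsync. For readers who want the same kind of check without rsync, here is a minimal Python sketch that compares two directory trees by SHA-256 checksum. The function names `file_digest` and `compare_trees` are hypothetical, and the `/mnt/disk5` and `/mnt/disk4` paths in the usage note are assumed mount points, not something taken from the diagnostics.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(src, dst):
    """Return (relative path, reason) for files under src that are missing or differ under dst."""
    src, dst = Path(src), Path(dst)
    problems = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if not dst_file.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append((str(rel), "differs"))
    return problems
```

Calling `compare_trees("/mnt/disk5", "/mnt/disk4")` would return an empty list if every file on the source has an identical copy on the destination. It reads every byte on both disks, so it is slow on large drives, and files that exist only on the destination are not reported.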
dgaschk Posted June 16, 2015 SMART looks ok. Run a parity check.
Just Me Posted June 16, 2015 Thanks for your reply. I'm running a non-correcting parity check now; this will take a while. I'll report back.
enetec Posted June 16, 2015 Have you never fallen to the floor one of your Samsungs?
Just Me Posted June 16, 2015 Pardon? I don't understand what you mean, sorry.
Just Me Posted June 17, 2015 Okay, status report: the non-correcting parity check finished with 0 errors. The webGUI shows 0 errors on all disks. I hate it when issues disappear on their own; now I have to worry about whether it recurs.
dgaschk Posted June 17, 2015 Quoting enetec: "Have you never fallen to the floor one of your Samsungs?" I think he's asking if you dropped one of the Samsungs. Those drives report G-shock and one of them looks like it took a bump but is otherwise ok.
dgaschk Posted June 17, 2015 Increase parity check frequency and hopefully it will reveal itself.
enetec Posted June 17, 2015 Quoting dgaschk: "I think he's asking if you dropped one of the Samsungs. Those drives report G-shock and one of them looks like it took a bump but is otherwise ok." Yes, exactly. A VERY BAD bump... The SMART value for that attribute is reporting 1, which is the worst possible value (i.e. a bump harder than the hard drive's specifications allow...). Anyway, one of your Seagates is also reporting some reallocated sectors... I would check its surface with an HDD Regenerator live CD before putting it back into production...
Just Me Posted June 17, 2015 I have a lot of funny SMART values, but no, I've never dropped this Samsung drive, and I have no clue why the raw value is that high. The other SMART data are fine and the drive has never shown any issues. Regarding the 2 reallocated sectors on one of the Seagates: the SMART report has shown this value for years, even before I used the drive in my unRAID server. I precleared this drive twice before I put it in my array and the value has stayed constant, so I guess this is not an issue. What I'm concerned about is the command timeout. The raw value is 38655361034, value is 100, worst 098, threshold 000; 098 >> 000, so I guess it is okay too.
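The large command timeout raw value above may look less alarming when split into smaller fields. SMART raw values are 48 bits wide, and some vendors are commonly reported to pack several 16-bit counters into one raw value. The sketch below only does that arithmetic; treating this attribute as three packed counters is an assumption, and what each word would count is not stated anywhere in this thread. The function name `split_raw_value` is hypothetical.

```python
def split_raw_value(raw):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    if not 0 <= raw < 1 << 48:
        raise ValueError("raw value does not fit in 48 bits")
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)


print(split_raw_value(38655361034))  # (9, 10, 10)
```

Read this way, 38655361034 is 0x0009000A000A, i.e. three small numbers rather than billions of timeouts. Whether those numbers are worth worrying about depends on the vendor's definition of the fields, which would need to be checked against the drive's documentation.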
Just Me Posted June 24, 2015 So here we are again with errors. This time it is disk3, still with no problems in the SMART data. Syslog and SMART report attached. What could be the reason? A cable? The SATA controller card? I don't know where to start, or even how. Attachment: nas-sylog-smart.zip
dgaschk Posted June 26, 2015 Check for BIOS and SATA card firmware updates. Do the disks share a controller?
Just Me Posted June 26, 2015 Yes, disk3 and disk4 are connected to the mainboard, so they share the onboard controller. No BIOS update is available. There is no new firmware for the cheap 2-port SI3132 controller or the Adaptec 1430SA controller (the last update is from 2010). Last night I ran a parity check: no sync errors, but again disk errors, this time on disk4. There is one thing I don't understand: a parity check is a read-only process, right? So why are there 37 writes on disk4? 37 writes, 37 errors, coincidence? The other data disks show 0 writes (see screenshot, starting with disk2; the first column is reads, the second writes and the third errors).
dgaschk Posted June 26, 2015 The array is still accessible during parity operations. The parity check is read-only. There could be any number of other processes accessing the array at any time.
Just Me Posted June 26, 2015 True, but I ran the parity check at night, when all systems that could access the unRAID server were powered down. So there shouldn't have been any access except from unRAID itself. Anyway, so it is just coincidence that there are 37 writes and 37 errors. Any ideas what I could do to find the reason for the random errors?
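One possible explanation for the matching counts, offered as an assumption and not confirmed anywhere in this thread: unRAID is generally described as handling a failed read by rebuilding that sector from parity plus the other data disks and writing the result back to the disk that returned the error. If that is what happened here, each read error would produce exactly one write. The sketch below illustrates single-parity (XOR) reconstruction in general; the function names are hypothetical and this is not unRAID's code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(parity, surviving_blocks):
    """Rebuild the one missing data block from parity and all other data blocks."""
    return xor_blocks([parity, *surviving_blocks])


# Three data "disks" holding one small block each, plus their parity block.
disk2, disk3, disk4 = b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"
parity = xor_blocks([disk2, disk3, disk4])

# If disk4's block cannot be read, it can be recomputed and written back.
assert reconstruct(parity, [disk2, disk3]) == disk4
```

Under this reading, the writes are a repair attempt, so a successful write-back would also explain why the disk stays enabled after the errors.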
dgaschk Posted June 26, 2015 What is the exact model of the power supply?
Just Me Posted June 27, 2015 It is a 400W Corsair power supply, model CMPSU-400CX.
dgaschk Posted June 27, 2015 The power supply has reached its limit. Get one with a 12V rail rated for more than 40-50 A.
Just Me Posted June 27, 2015 Oh, really? Before I bought this PSU a few years ago I checked the forum (or the wiki?) and it was supposed to support up to 12 drives. Anyway, this could be the reason: I added a new drive recently and I never had any issues before. As a quick fix I think I'll remove an old 1 TB drive from the array. Once I'm back to ten drives the PSU should be fine again. Thank you for your help.
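The 12V rail advice above comes down to simple arithmetic: hard drives draw most of their 12V current while spinning up, and the rail has to cover all drives starting at once plus the rest of the system. A rough sketch of that budget follows. Every number in it is a placeholder assumption: `rail_amps=30`, 2 A of spin-up current per drive and 8 A for the CPU, board and fans are illustrative figures, not values taken from this thread or from any datasheet. The function name is hypothetical.

```python
def twelve_volt_headroom(rail_amps, drive_count,
                         spinup_amps_per_drive=2.0, other_amps=8.0):
    """Return the 12 V amps left over when every drive spins up at the same time.

    All defaults are illustrative assumptions; read the real figures from the
    PSU label and the drive datasheets.
    """
    return rail_amps - (drive_count * spinup_amps_per_drive + other_amps)


for drives in (10, 11, 12):
    print(drives, "drives:", twelve_volt_headroom(30, drives), "A headroom")
```

With these assumed figures the headroom reaches zero at eleven drives, which is why removing a single drive might appear to help. A margin that thin is not much of a fix, so a PSU with a stronger 12V rail remains the safer choice.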