January 5, 2016 I say upgrade, but I really just did a fresh install, chose the same disks as before, and started up the array.
Here is my old array: http://i.imgur.com/2Rk8dto.png
This is my new array: http://i.imgur.com/b4Uf3ir.png
How can I tell what's wrong with disk 6? Is there a log I can check somewhere? It seemed to be fine before (or maybe not?)
EDIT: Found this in the syslog:
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 Sense Key : 0x3 [current] [descriptor]
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 ASC=0x11 ASCQ=0x4
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 00 c1 60 00 00 00 08 00 00
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 49504
Jan 4 17:02:21 Tower kernel: ata7: EH complete
Jan 4 17:02:21 Tower kernel: md: disk6 read error, sector=49440
Jan 4 17:02:21 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Jan 4 17:02:21 Tower kernel: REISERFS (device md6): replayed 708 transactions in 10 seconds
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 0
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 4 17:02:21 Tower kernel: sd 1:0:0:0: [sdh] tag#0 CDB: opcode=0x8a 8a 00 00 00 00 00 00 01 00 d0 00 00 00 08 00 00
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 65744
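For anyone skimming a long syslog for the same kind of trouble, here is a minimal Python sketch that tallies the error lines per device. The sample lines are copied from the excerpt above. The function name `summarize_errors` is made up for illustration, and the "write error" form of the `md:` line is an assumption, since only the read form appears in this thread.

```python
import re
from collections import defaultdict

# Sample lines copied from the syslog excerpt in the post above.
SAMPLE = """\
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 49504
Jan 4 17:02:21 Tower kernel: md: disk6 read error, sector=49440
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 0
Jan 4 17:02:21 Tower kernel: blk_update_request: I/O error, dev sdh, sector 65744
"""

# Block-layer errors name the device (sdh) and the failing sector.
IO_ERROR = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")
# md-driver errors name the array slot (disk6). Only "read error" appears in
# the thread; matching "write error" the same way is an assumption.
MD_ERROR = re.compile(r"md: (disk\d+) (read|write) error, sector=(\d+)")


def summarize_errors(text):
    """Group failing sectors by device and by array slot (hypothetical helper)."""
    block_errors = defaultdict(list)
    md_errors = defaultdict(list)
    for line in text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            block_errors[match.group(1)].append(int(match.group(2)))
            continue
        match = MD_ERROR.search(line)
        if match:
            md_errors[match.group(1)].append((match.group(2), int(match.group(3))))
    return dict(block_errors), dict(md_errors)


block_errors, md_errors = summarize_errors(SAMPLE)
print(block_errors)
print(md_errors)
```

Repeated errors against one device, as with sdh here, suggest that the drive, its cable, or its controller port deserves a closer look. The log alone does not say which of those is at fault.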
January 5, 2016 Author What do I do if I get a red X next to a hard disk? Thanks, I'll take a look.
January 5, 2016 The photo shows that you have 2 write errors. unRAID gurus (which I am not) will tell you that unRAID takes write errors very seriously and will take a drive out of service as soon as it sees one. (My opinion differs a bit here: in my experience with unRAID, I have had drives "fail" that had no S.M.A.R.T. errors or other problems, rebuilt them, and they continued to run for a long time afterwards.) In any case, you should post a S.M.A.R.T. report and a log report from your server.
To obtain a S.M.A.R.T. report through the GUI:
1. Click on the words "Disk 6". You will be taken to a new page showing the details of the drive.
2. Scroll down and click the "Download" button to the right of "Download SMART report".
To obtain a S.M.A.R.T. report through the command line, follow the instructions here: http://lime-technology.com/wiki/index.php/Console#smartctl
To obtain a log report through the GUI:
1. Navigate to "Tools > System Log".
2. Click download in the top right.
To obtain a log report through the command line, follow the instructions here: https://lime-technology.com/wiki/index.php/Viewing_the_System_Log
Once you provide more info, we can help you out more. Thanks
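As a rough illustration of the command-line route, the sketch below wraps `smartctl -a` in Python. The device path `/dev/sdh` is taken from the syslog in the first post and may differ after a reboot or a move to another port. The helper names are hypothetical, and the script assumes `smartctl` is on the PATH and that you have permission to query the drive.

```python
import shutil
import subprocess


def build_smartctl_command(device):
    """Build the argument list; -a asks smartctl for all SMART information."""
    return ["smartctl", "-a", device]


def read_smart_report(device):
    """Return smartctl's text output, or None if smartctl is not installed."""
    if shutil.which("smartctl") is None:
        return None
    # smartctl uses non-zero exit codes to flag drive problems, so a non-zero
    # status is not treated as a failure of the command itself.
    result = subprocess.run(
        build_smartctl_command(device),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    report = read_smart_report("/dev/sdh")  # device name assumed from the syslog
    print(report if report is not None else "smartctl not found")
```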
January 5, 2016 Author Thanks, I'll definitely check that out. I actually didn't realize this before, but it seems as though my earlier array may have had the same issue, as the orb wasn't green like the others. I'm colourblind, so it wasn't as obvious in v5 as it is in v6.
January 5, 2016 Author Attached! I'm starting to think it could be as simple as a loose SATA cable, unless you guys can find anything else in the logs.
tower-smart-20160104-1725.zip
tower-syslog-20160104-1730.zip
January 5, 2016 Your SMART report was unable to run and contains no information. I don't know enough to give you safe advice on what to do next. Hopefully someone else will chime in.
January 5, 2016 Community Expert Instead of posting a separate SMART report for a single drive and a separate syslog, you should always go to Tools - Diagnostics and post the complete diagnostics zip. It includes the syslog, SMART reports for all drives, and a lot of other things that might be useful in diagnosing problems. Who knows, you may have other drives about to give trouble, and that could cause problems while you try to recover from your current one.
While it is true that unRAID and the gurus take write errors very seriously, the gurus don't suggest replacing a drive unnecessarily. It is often something other than a bad drive. However, a drive that has been disabled for write errors must be rebuilt, because it doesn't actually have valid data anymore. The valid data is in the parity array: the failed writes to the disabled disk were used to update parity anyway, so the data that didn't get written can be recovered.
Note that a failed write is not synonymous with a file not getting written. It could be part of a file or, even worse, part of the filesystem that keeps track of files.
January 5, 2016 Author Thanks guys. I moved the server to another room a few months ago, so I'm thinking a cable may have come loose. I have a bunch of swappable bays, so I might try moving the drive to another slot to see if that clears up the issue. I'll also post additional logs if it doesn't help. I'm logging off now but will try this tomorrow.
January 5, 2016 Community Expert Also, any writes to that disk that happened after it was disabled can be recovered too. unRAID will continue to accept writes for a disabled drive because it can recover them. And you can still read the drive's data, even though unRAID will not read from the drive itself until it is rebuilt. It gets the drive's data from the parity array instead.
Just saw your reply while I was typing this. Fixing cables and anything else that might be wrong is important, but it will not re-enable the drive. It must be rebuilt, either to a new drive or to itself. The wiki I linked will tell you everything you need, so please read it.
January 5, 2016 Community Expert And just in case I haven't been clear on this: since your drive has been disabled for some time, there is probably a lot of the drive's data that is not actually on the drive. As soon as a write fails, unRAID disables the drive and does not use it again for reads or writes until it is re-enabled by rebuilding it. Any reads or writes for the drive are handled by parity calculations with all of the other disks.
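To make "handled by parity calculations" concrete, here is a toy Python sketch of single-parity (XOR) emulation. The disk names and byte values are invented for illustration, and real unRAID works on whole drives rather than 4-byte blocks, so treat this as a model of the idea only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy array: one 4-byte block per data disk (hypothetical contents).
disks = {
    "disk1": b"\x10\x20\x30\x40",
    "disk2": b"\x01\x02\x03\x04",
    "disk6": b"\xaa\xbb\xcc\xdd",
}
parity = xor_blocks(list(disks.values()))


def emulated_read(missing, disks, parity):
    """Rebuild the missing disk's block from parity plus every other disk."""
    others = [data for name, data in disks.items() if name != missing]
    return xor_blocks(others + [parity])


def emulated_write(missing, new_data, disks):
    """Return the parity that stores new_data for a disk that is not written."""
    others = [data for name, data in disks.items() if name != missing]
    return xor_blocks(others + [new_data])


# Reading disk6 without touching it gives back its contents.
print(emulated_read("disk6", disks, parity))

# A write to the disabled disk changes parity only; the platter stays stale.
parity = emulated_write("disk6", b"\x00\x11\x22\x33", disks)
print(emulated_read("disk6", disks, parity))
print(disks["disk6"])
```

The last two lines show the point made above: after a write, the emulated contents and the physical contents of the disabled disk no longer match.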
January 5, 2016 Community Expert Looking at your screens, your v5 array had disk6 disabled. Did you upgrade to v6 like that? How did you upgrade? By doing a new config? I ask because your v6 screen shows parity building and also disk6 disabled, so if you did a new config and tried to sync parity before replacing disk6, you could have invalidated your parity.
January 5, 2016 Author
"Looking at your screens, your v5 array had disk6 disabled. Did you upgrade to v6 like that?"
Yes, someone had to actually point out to me that the orb wasn't green in my screenshot. Being severely colour blind, I think it may have been red for some time now. The one thing I had noticed recently was that none of my drives were ever spinning down; they all seemed to be active all the time. Would this be a symptom of reading from parity because of an offline drive?
"How did you upgrade? By doing a new config?"
I did a new config. I have backups of my v5 setup, but chose to set everything up from scratch.
"Because your v6 screen shows parity building and also disk6 disabled, so if you did a new config and tried to sync parity before replacing disk6, you could have invalidated your parity."
Ah shhhhiiii... Feeling a little stupid about not doing anything about this before the upgrade .... On the bright side, the X's in v6 are really easy for me to differentiate.
I've moved the drive to another SATA/power connection, and while it shows a little differently now, the SMART report still seems to fail. I'm attaching a diagnostics log for you guys. I've got work to do now, but it sounds like I've got some reading to do tonight...
tower-diagnostics-20160105-0548.zip
January 5, 2016 Community Expert Wait for some help; maybe someone has an idea of how best to proceed. If, for example, disk6 has been disabled for a month, all writes to that disk were being emulated by parity plus all the other disks. Since you did a new parity sync, I don't think there's a way to get that data back. It may, however, be possible to get all or some of the data from disk6 as it was when it failed the first time.
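The reason a fresh parity sync loses the emulated writes can be sketched with the same XOR model. This is a simplified two-disk toy with invented byte values, not a description of unRAID internals, and it ignores the read errors the failing drive would add on top.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


disk1 = b"\x10\x20"
disk6_on_platter = b"\xaa\xbb"  # stale: what the drive held when it was disabled
disk6_emulated = b"\x00\x11"    # current: includes writes made after disabling

# While disk6 is disabled, parity encodes the current (emulated) contents.
old_parity = xor_blocks(disk1, disk6_emulated)
before_sync = xor_blocks(disk1, old_parity)

# A new config plus parity sync recomputes parity from what is physically
# on every disk, including the stale platter of disk6.
new_parity = xor_blocks(disk1, disk6_on_platter)
after_sync = xor_blocks(disk1, new_parity)

print(before_sync == disk6_emulated)   # emulation held the newer data
print(after_sync == disk6_on_platter)  # after the sync only stale data remains
```

In this model, the best case after the sync is the disk's contents from the moment it was first disabled, which matches the caveat in the post above.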
January 5, 2016 Community Expert Well, the SMART report for disk 6 shows 17 pending sectors. I notice that disk 1 also has 1 pending sector. Pending sectors are never good, as they mean there are sectors that cannot be read reliably, and this can affect whether a rebuild will be completely successful if another drive fails. It is possible that the drives are actually fine and that running a pre-clear cycle will clear the pending sectors. However, you need to get any data off them before that can happen. From what has been said, it is possible that you do not have good parity, in which case disk 6 cannot be rebuilt successfully. Disk 5 has not been marked as failed, but since it has a non-zero value for pending sectors, once disk 6 has been put back into operation you will want to work on getting the pending sector on disk 5 cleared as well.
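For checking pending sectors across several reports, a small Python sketch like the one below can pull the raw value out of the attribute table that `smartctl -A` prints. The sample table is illustrative only: the raw value 17 comes from this post, while the other columns and the second attribute row are made up to show the layout.

```python
# Illustrative attribute table in smartctl's usual column layout.
# Only the raw value 17 is from the thread; everything else is invented.
SAMPLE_REPORT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       17
"""


def pending_sectors(report_text):
    """Return the raw Current_Pending_Sector count, or None if it is absent."""
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows have at least ten columns; RAW_VALUE is the tenth.
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


count = pending_sectors(SAMPLE_REPORT)
print(count)
```

A result of `None` means the attribute was not found, which is different from zero. That would be the outcome for a report that failed to run, like the first one attached in this thread.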
January 5, 2016 Community Expert One thing you could try, with nothing to lose, is doing a new config again, but this time trusting parity. Then stop the array, unassign disk6, and start the array. Disk6 will again be emulated, so try to see if you can read any data from it. The chances of success depend on what this new parity sync changed.
January 5, 2016 Author
"One thing you could try, with nothing to lose, is doing a new config again, but this time trusting parity. Then stop the array, unassign disk6, and start the array. Disk6 will again be emulated, so try to see if you can read any data from it. The chances of success depend on what this new parity sync changed."
So it does show the disk as emulated, but it's also showing no disk content when I try to browse it. Hmmm.
January 5, 2016 Community Expert That's what I was afraid of: parity was damaged by the latest sync. Is it unmountable, like in the earlier v6 screenshot? I don't know if you can successfully run reiserfsck on an emulated disk.
January 5, 2016
"I don't know if you can successfully run reiserfsck on an emulated disk."
Yes, you can. It's no different than running it on a physical one.
January 5, 2016 Author
"That's what I was afraid of: parity was damaged by the latest sync. Is it unmountable, like in the earlier v6 screenshot? I don't know if you can successfully run reiserfsck on an emulated disk."
Yeah, it's still unmountable. However, the red X is gone and I've just got that warning icon on it now. It gives me the option to rebuild; would it be worth trying that at this point? Attached is the current state of the array through the UI.
January 5, 2016 Community Expert Parity should be green. Did you check the trust parity option? You have to do a new config and, before starting the array, check the option to trust parity, right next to the start button. Then stop the array, unassign disk6, and start the array again. Every disk has to be green except disk6.
January 5, 2016 Community Expert At least v6 will notify you in the future before you let things deteriorate like this.
January 5, 2016 Author
"Parity should be green. Did you check the trust parity option? You have to do a new config and, before starting the array, check the option to trust parity, right next to the start button. Then stop the array, unassign disk6, and start the array again. Every disk has to be green except disk6."
Thanks, I installed from scratch again, and parity is happy now. Disk 6 is back to an X, though. I'm starting to think it was a legit disk failure.
EDIT: Just to be clear, the red X was there even when it was still assigned to the array.
January 5, 2016 Author
"At least v6 will notify you in the future before you let things deteriorate like this."
That's great. I was even impressed by the change in icons for colourblind people. We are such a minority that no one else normally cares!