camprman Posted August 19, 2016
I've been very lucky over the years, but my luck has finally run out. I knew I had a drive failing, but being on a fixed income, I had to wait to buy a new drive. The drive has now failed, and without looking or thinking I bought two new 4 TB drives. Right now my parity drive is 3 TB and the drive that has gone down is also 3 TB. From all that I've read, it appears that since I didn't think before I ordered, my only option is to do a swap-disable. Right now the emulated drive is functional as expected, but I just don't have the room to copy everything off before doing the swap. I'm more than a little nervous, and I wonder if there is anything I can do, short of copying, before doing the swap-disable. I suppose you could say I'm a little overwhelmed at the moment.
JorgeB Posted August 19, 2016
And why do you want to copy before the swap? The swap-disable procedure doesn't delete any data.
camprman Posted August 19, 2016
Just my fear of losing files. I'm not afraid that the swap-disable feature will delete data so much as I'm worried about Mr. Murphy.
JorgeB Posted August 19, 2016
Well, in that case your only option is to copy your data (assuming you don't have backups). You can use the extra disk you've got and assign it temporarily to the cache slot (or use the Unassigned Devices plugin).
trurl Posted August 19, 2016
Are you absolutely sure the drive has failed? Go to Tools - Diagnostics and post the complete diagnostics zip.
camprman Posted August 19, 2016
Here you go. The device is sdl.
tower-diagnostics-20160819-1805.zip
John_M Posted August 20, 2016
Aug 13 10:57:16 Tower emhttp: ST3000DM001-1CH166_Z1F303M7 (sdl) 2930266584

I don't see a SMART report for this disk, but there are lots of errors in the syslog such as this:

Aug 17 12:43:24 Tower kernel: blk_update_request: critical medium error, dev sdl, sector 2446590216

so it looks pretty sick.

You also have an overheating CPU. Your syslog has many instances like this:

Aug 18 02:19:18 Tower kernel: CPU5: Core temperature above threshold, cpu clock throttled (total events = 1559278)
Aug 18 02:19:18 Tower kernel: CPU1: Core temperature above threshold, cpu clock throttled (total events = 1559275)
Aug 18 02:19:18 Tower kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 3700436)

You need to check the fan and blow the dust out of the heatsink.
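For anyone who wants to tally these two kinds of messages across a whole syslog rather than eyeballing it, here is a minimal Python sketch. It is not an Unraid tool; the names `summarize` and `SAMPLE_LOG` are made up for illustration, and the patterns are assumed from the exact lines quoted in this thread, so other kernel versions may word the messages differently.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpts quoted in this thread.
SAMPLE_LOG = """\
Aug 17 12:43:24 Tower kernel: blk_update_request: critical medium error, dev sdl, sector 2446590216
Aug 18 02:19:18 Tower kernel: CPU5: Core temperature above threshold, cpu clock throttled (total events = 1559278)
Aug 18 02:19:18 Tower kernel: CPU1: Core temperature above threshold, cpu clock throttled (total events = 1559275)
Aug 18 02:19:18 Tower kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 3700436)
"""

# Patterns assumed from the quoted lines only (hypothetical, not exhaustive).
MEDIUM_ERROR = re.compile(r"critical medium error, dev (\w+), sector (\d+)")
THROTTLE = re.compile(r"(CPU\d+): (?:Core|Package) temperature above threshold")


def summarize(log_text):
    """Count medium-error lines per device and throttle lines per CPU."""
    disk_errors = Counter()
    throttled = Counter()
    for line in log_text.splitlines():
        match = MEDIUM_ERROR.search(line)
        if match:
            disk_errors[match.group(1)] += 1
            continue
        match = THROTTLE.search(line)
        if match:
            throttled[match.group(1)] += 1
    return disk_errors, throttled


if __name__ == "__main__":
    disks, cpus = summarize(SAMPLE_LOG)
    print("medium errors per device:", dict(disks))
    print("throttle messages per CPU:", dict(cpus))
```

To run it against a real log, read the syslog file from the diagnostics zip into a string and pass that to `summarize` in place of `SAMPLE_LOG`. A count like this only shows how often the messages appear; it does not replace a SMART report.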
JorgeB Posted August 20, 2016
As John pointed out, disk10 dropped offline and there's no SMART report. If you reboot, it should come back and you could grab a SMART report, but judging by the type of errors it really looks like a bad disk.

A few more observations:

When disk10 failed there was an error writing to super.dat. You should stop the array to recreate it; otherwise, if there's a power cut or an unexpected server reboot, you'll lose all disk assignments, which is especially bad when there's a disabled disk.

There was one read error on each of disks 8 and 9. Since it happened just after a controller reset, I'd venture a guess that the reset was the cause and the disks are fine; if RobJ sees this, maybe he'll know for sure. SMART for disk8 looks fine. disk9 had some issues in the past, and due to the known high failure rate of these disks I would replace it at the first sign of trouble.

Due to those read errors, both disks 8 and 9 were remounted read-only, so you want to run reiserfsck on both.
camprman Posted August 22, 2016
Thank you johnnie.black. That would have been the absolute worst. I ran reiserfsck on both disk8 and disk9; it found no corruption and no rebuilds were needed. New drives should be here tomorrow.