loady Posted April 18, 2019 (edited)
I use my server remotely 90% of the time. When I am onsite it is usually to attach an unassigned device to back up data I have been working with (thousands of GB). The last few times I visited and did what I needed, I noticed a drive showing as missing when I later connected remotely. I later discovered that a SATA power plug had come loose when sliding the side of the case shut; when plugged back in it was all fine. This has now happened for the third time (yes, I am going to alter the hardware structure). However, in the last instance two plugs got pulled. I plugged them back in and one of the drives was showing as missing. I identified it and added it back into the array, but this time it is seeing it as a new disk and has started a data rebuild, approx. 2 days to complete. I'm guessing I can't stop the train now? Why did the drive not just get added back into the array and carry on as normal?
EDIT: the rebuild estimate has settled at 8 hours... phew. Guess I should let it ride out?
Edited April 18, 2019 by loady
Frank1940 Posted April 18, 2019
Yes, let it ride and see if it completes successfully. By the way, the situation you described is not unusual. By any chance, did someone tie and dress all of the SATA cables (data and power) so that they would 'look pretty'? This is usually a recipe for disaster, as moving one cable the slightest amount will often loosen one or more connectors. If I get inside my server case to do anything, the last thing I do before closing up the case is to check each SATA connector to make sure it is pushed in securely, working from the inside out. And I have quick-change drive enclosures so that the case does not need to be opened for drive changes.
loady (Author) Posted April 19, 2019
14 hours ago, Frank1940 said: "Yes, let it ride and see if it completes successfully. By the way, the situation you described is not unusual. By any chance, did someone tie and dress all of the SATA cables (data and power) so that they would 'look pretty'? This is usually a recipe for disaster, as moving one cable the slightest amount will often loosen one or more connectors. If I get inside my server case to do anything, the last thing I do before closing up the case is to check each SATA connector to make sure it is pushed in securely, working from the inside out. And I have quick-change drive enclosures so that the case does not need to be opened for drive changes."
I plan to do the same and yes, they do look pretty. I let it ride out, but when I came back to check this morning the disk was disabled. From what I can see the data rebuild finished, but it is saying contents emulated.
JorgeB Posted April 19, 2019
4 minutes ago, loady said: "the disk is now disabled, from what I can see the data rebuild finished but it is saying contents emulated"
If it had finished, the disk wouldn't be disabled, unless it got disabled again after the rebuild. Either way, you should post the diagnostics.
loady (Author) Posted April 19, 2019 (edited)
1 minute ago, johnnie.black said: "If it had finished, the disk wouldn't be disabled, unless it got disabled again after the rebuild. Either way, you should post the diagnostics."
Yes... errmm... I have not posted diags for a while. There's a button somewhere for it now?
Edited April 19, 2019 by loady
itimpi Posted April 19, 2019 (edited)
16 minutes ago, loady said: "Yes... errmm... I have not posted diags for a while. There's a button somewhere for it now?"
Tools->Diagnostics to get the diagnostics zip file.
Edited April 19, 2019 by itimpi
loady (Author) Posted April 19, 2019
7 minutes ago, itimpi said: "Tools->Diagnostics to get the diagnostics zip file."
Thanks. LOTS of errors in this one.
Attachment: warptower-diagnostics-20190419-1003.zip
JorgeB Posted April 19, 2019
The disk dropped offline, so there's no SMART report, but it looks more like a connection problem. Assuming SMART is OK, replace the cables and try again.
loady (Author) Posted April 19, 2019
36 minutes ago, johnnie.black said: "The disk dropped offline, so there's no SMART report, but it looks more like a connection problem. Assuming SMART is OK, replace the cables and try again."
OK, I checked all the cables and even swapped the power cable for a spare one. Rebooted and it's still disabled.
Attachment: warptower-diagnostics-20190419-1103.zip
JorgeB Posted April 19, 2019
15 minutes ago, loady said: "and its still disabled"
That's expected, you'll need to rebuild again.
loady (Author) Posted April 19, 2019
1 minute ago, johnnie.black said: "That's expected, you'll need to rebuild again."
Unfortunately, it's not offering a rebuild?
JorgeB Posted April 19, 2019
https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive
loady (Author) Posted April 19, 2019
1 hour ago, johnnie.black said: "https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive"
It's rebuilding again. If something goes wrong, will the syslog catch it? I'm thinking that last time it stopped for some reason.
JorgeB Posted April 19, 2019
1 hour ago, loady said: "so if something goes wrong will syslog catch it ?"
Yes, grab diags before rebooting/shutdown.
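The reason to grab diags before a reboot is that the live syslog does not survive one. As a rough sketch (not something posted in the thread), the current log could also be copied somewhere persistent by hand. The default paths here are assumptions: `/var/log/syslog` for the live log and `/boot/logs` on the flash drive as the destination; `save_syslog` is a made-up helper name.

```shell
#!/bin/sh
# Copy a log file to a persistent directory under a timestamped name.
# Both defaults are assumptions: adjust them to match your own system.
save_syslog() {
    src="${1:-/var/log/syslog}"      # assumed location of the live syslog
    dest_dir="${2:-/boot/logs}"      # assumed persistent folder on the flash drive
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    # Print the destination path only if the copy succeeded.
    cp "$src" "$dest" && printf '%s\n' "$dest"
}
```

Calling `save_syslog` with no arguments uses the assumed defaults. This is only a complement to Tools->Diagnostics, which collects far more than the syslog.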
trurl Posted April 19, 2019
5 hours ago, loady said: "so if something goes wrong"
You should configure Notifications so you get alerted immediately by email or other agent.
loady (Author) Posted April 20, 2019
No joy. I came back this morning and the disk is disabled again. From what I can see in the notifications from the "fix common problems" plugin, it comes back online and is in normal operation, then it is immediately disabled again and starts another data rebuild, which gets cancelled (not by me). Here is the log; I have not yet rebooted the server.
Attachment: warptower-diagnostics-20190420-0835.zip
JorgeB Posted April 20, 2019
Unfortunately the log is spammed with these errors:
Apr 19 18:20:41 warptower nginx: 2019/04/19 18:20:41 [error] 4639#4639: *79539 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 19 18:20:41 warptower nginx: 2019/04/19 18:20:41 [error] 4639#4639: MEMSTORE:00: can't create shared message for channel /disks
Apr 19 18:20:42 warptower nginx: 2019/04/19 18:20:42 [crit] 4639#4639: ngx_slab_alloc() failed: no memory
Apr 19 18:20:42 warptower nginx: 2019/04/19 18:20:42 [error] 4639#4639: shpool alloc failed
Apr 19 18:20:42 warptower nginx: 2019/04/19 18:20:42 [error] 4639#4639: nchan: Out of shared memory while allocating message of size 10171. Increase nchan_max_reserved_memory.
No idea what they mean, but because of them the syslog rotated and the disk errors were lost. The disk did drop offline, though, so reboot and post new diags so we can check SMART.
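For anyone reading a log flooded like this, here is a minimal sketch (not from the thread) of filtering the nginx/nchan lines out so that whatever else is still in the file is easier to spot. It cannot bring back entries that were already rotated away. The function name `filter_nchan_spam` is hypothetical, and the pattern is built only from the five messages quoted above.

```shell
#!/bin/sh
# Read syslog text on stdin and drop the nginx/nchan spam lines,
# leaving everything else (for example kernel or disk messages) on stdout.
filter_nchan_spam() {
    # "|| true" keeps the exit status clean when every line is filtered out.
    grep -v -E ' nginx: .*(nchan|MEMSTORE|ngx_slab_alloc|shpool alloc)' || true
}
```

Example use, assuming the live log is at `/var/log/syslog`: `filter_nchan_spam < /var/log/syslog | tail -n 50`.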
loady (Author) Posted April 20, 2019
Can't get SMART if the disk is disabled?
JorgeB Posted April 20, 2019
You can, just not while the disk is offline, as it is currently. You need to reboot.
loady (Author) Posted April 20, 2019
When I reboot it still says disabled. Is disabled the same as offline?
JorgeB Posted April 20, 2019
Disk disabled and disk offline are not the same thing. Post new diags.
loady (Author) Posted April 20, 2019
38 minutes ago, johnnie.black said: "Disk disabled and disk offline are not the same thing. Post new diags."
OK, turned it off and on, and here are the fresh diags.
Attachment: warptower-diagnostics-20190420-1112.zip
JorgeB Posted April 20, 2019
That disk has SMART disabled by default. On the console type:
smartctl -s on /dev/sdf
Then grab and post new diags.
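As a rough sketch of that step plus a quick look at the result on the console: the `-s on` command is the one from the post above, while the `-H` (overall health) and `-A` (attribute table) calls are an addition not mentioned in the thread. `check_smart` and the `SMARTCTL` override are hypothetical names used only for illustration, and `/dev/sdf` is this particular server's device, which can change between boots.

```shell
#!/bin/sh
# Enable SMART on a drive, then print its health verdict and attribute table.
# SMARTCTL may be pointed at another command for a dry run (an assumption
# made for illustration; by default the real smartctl is used).
SMARTCTL="${SMARTCTL:-smartctl}"

check_smart() {
    dev="$1"
    "$SMARTCTL" -s on "$dev" || return 1   # enable SMART (command from the thread)
    "$SMARTCTL" -H "$dev"                  # overall health self-assessment
    "$SMARTCTL" -A "$dev"                  # vendor-specific attribute table
}
```

Example use: `check_smart /dev/sdf`. The diagnostics zip is still what the helpers on the forum will want to see.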
loady (Author) Posted April 20, 2019
2 minutes ago, johnnie.black said: "That disk has SMART disabled by default. On the console type: smartctl -s on /dev/sdf Then grab and post new diags."
Attachment: warptower-diagnostics-20190420-1122.zip
loady (Author) Posted April 20, 2019
Drive seems to be healthy?