TODDLT Posted February 20, 2019 (edited)

Below is a lot of history of this issue. A summary of the current status can be found here: https://forums.unraid.net/topic/78318-drives-dropping-out-of-array-into-ud-split-from-preclear-results/?do=findComment&comment=725770

History below:

This was my first ever pre-clear failure. I started two drives pre-clearing at once last night, both Toshiba N300s. This morning I have a failure email. I can't get the full log to open from the main page in unRAID, but I have attached the preview snapshot and the 3 reports I could find on the server. It looks like a No Space Left on Device error. I am assuming this is a dead drive, but I wanted to make sure it's not a cable issue or something deserving a retry. The cable seats look good.

Also, what is the "no such file or directory" error below?

Thanks all/anyone.

988YK07LFAXG.resume 3602201014017.sreport 988GK045FAXG.resume

Edited February 27, 2019 by TODDLT
trurl Posted February 20, 2019

9 minutes ago, TODDLT said:
It looks like a No Space Left on Device error.

9 minutes ago, TODDLT said:
Also what is the "no such file or directory" error below?

I am guessing these have the same cause and you have filled up /tmp.

Post diagnostics, which might clear this up as well as let us take a look at the SMART report for that disk if it is responding.
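A quick way to test the "/tmp is full" guess from a console is sketched below. This is only an illustration using standard `df`, `du`, `sort`, and `awk`; the helper name `usage_pct` is made up for this example and is not part of Unraid or the preclear plugin.

```shell
#!/bin/sh
# Print the percentage of space used on the filesystem that holds a path.
# (usage_pct is a hypothetical helper name used only in this sketch.)
usage_pct() {
    # df -P prints stable POSIX columns; field 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

echo "/tmp is $(usage_pct /tmp)% full"

# List the five largest entries under /tmp (sizes in KiB), largest last.
du -sk /tmp/* 2>/dev/null | sort -n | tail -n 5
```

If the percentage is at or near 100, the largest entries listed at the bottom are the first place to look.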
TODDLT (Author) Posted February 20, 2019

4 hours ago, trurl said:
I am guessing these are the same cause and you have filled up /tmp Post diagnostics which might clear this up as well as let us take a look at SMART report for that disk if it is responding.

The pre-clear ran for 4 or 5 hours at least before I went to sleep, and it failed sometime after that. I also know the drive was actually spinning at the time. If it lost communication with the drive, it was while in flight.

I'll post this evening if I can get the disk to respond. I'll have to wait for the 1st drive to complete one cycle (of 2) before I kill everything to check connections, if you think that might be the issue, but I believe it will have finished by the time I get home.

Thanks.
TODDLT (Author) Posted February 21, 2019

11 hours ago, trurl said:
I am guessing these are the same cause and you have filled up /tmp Post diagnostics which might clear this up as well as let us take a look at SMART report for that disk if it is responding.

I think it's not able to read or see the drive now. It looks like it did at one time, but when I click on "start short self test" it just blinks at me and does nothing. No self tests have been logged, and capabilities says "Cannot read capabilities". There is a green dot on the drive in unassigned devices.

My other drive is only 50% done with the post-read, so it will likely be between midnight and 2 AM before it's done. I'll reboot either tonight or in the AM and see if it finds the drive.
TODDLT (Author) Posted February 21, 2019 (edited)

23 hours ago, trurl said:
I am guessing these are the same cause and you have filled up /tmp Post diagnostics which might clear this up as well as let us take a look at SMART report for that disk if it is responding.

OK, life got a little strange this morning. Please let me know if you have any ideas.

Last night, after the good drive completed a successful preclear, I stopped the 2nd cycle, shut down the server, and checked cables. I restarted and everything looked normal. One drive showed a pre-cleared status and the other did not. However, I could not go into the drives via unassigned devices and run a SMART test. Same result: it blinks but does nothing. So I started the 2nd cycle on one, restarted a clean 1st cycle on the other, and went to bed.

This morning I again have an email message saying the pre-read failed, this time only 15 minutes into the pre-read.

Here is where it gets bizarre. I go to pull up the main page on the server and it is very slow to respond. Eventually, after a couple of tries, I get the array to show up, BUT:
- The unassigned devices section doesn't want to resolve. When it finally does respond, it shows both pre-clears still running. The "good" drive is only running at half speed (93 MB/sec). The supposedly bad drive is running at 190 MB/sec (despite the failure email).
- Some of my array drives are showing up in unassigned devices, not up in the array.
- My cache drives are showing up in unassigned devices and blinking in and out of the cache drive section right in front of my eyes.
- My boot device is showing up in unassigned devices, not in the boot device section.
- The server is not accessible via Windows, but the array shows online in the window.

A few notes about my configuration that may play in here:
- All of my array spinners are connected via LSI controller cards.
- The cache drives are both connected directly to the motherboard.
- The two eSATA ports being used for the preclears are connected to MB SATA ports.
- Two array SSDs are connected to the MB ports, and those show up steady in the array, not in unassigned devices.

I stopped the array: offline, all the devices show in their proper place.

I started the array: everything comes back to normal and only the correct devices are in unassigned devices. However, the shares are still not accessible. Both pre-clears are still running. Then the LSI-connected drives start appearing in unassigned devices again.

I stopped both pre-clears and it looks like it ALL goes back to normal. My shares are now visible again. Drives are now in the proper place again. Nothing is "moving" around the window.

I will throw into the mix that I have pre-cleared two drives at once before with this exact hardware configuration, when I bumped up to 6 TB parity 4-6 months ago. There have been updates to both the unassigned devices and preclear plugins since that time.

Then I get a red error from "Fix Common Problems". I am used to the SSD warning, but new items now show up, one regarding the /tmp file. Please see the attached image. I have now removed the ca.cleanup plugin (and have never actually used it before). What do you make of the error message? It seems related to the original comment you made about the tmp file being full.

Edited February 21, 2019 by TODDLT
trurl Posted February 21, 2019

23 hours ago, trurl said:
Post diagnostics which might clear this up as well as let us take a look at SMART report for that disk if it is responding.
TODDLT (Author) Posted February 21, 2019

I just tried to pull a diagnostics file prior to reboot and the array started blinking between "started" and "undefined". The diagnostics window is not showing anything. I closed the diagnostics window and now ALL my drives are in unassigned devices.

I'm just going to reboot. I'm assuming this is the tmp file being full?
trurl Posted February 21, 2019

Some of these problems might suggest a problem with flash. Put your flash in your PC and let it checkdisk. Are you using a USB2 port for flash?

After reboot, post diagnostics.
TODDLT (Author) Posted February 21, 2019

5 minutes ago, trurl said:
Some of these problems might suggest a problem with flash. Put your flash in your PC and let it checkdisk. Are you using a USB2 port for flash? After reboot post diagnostics.

Shut down, pulled the flash, and ran chkdsk in Windows. No errors found.

This is my original flash and it has some age on it. I have a replacement I bought but never bothered to swap over. If you think this is an issue, I can make the swap.
trurl Posted February 21, 2019

12 minutes ago, trurl said:
Are you using a USB2 port for flash? After reboot post diagnostics.
TODDLT (Author) Posted February 21, 2019

16 minutes ago, trurl said:
Some of these problems might suggest a problem with flash. Put your flash in your PC and let it checkdisk. Are you using a USB2 port for flash? After reboot post diagnostics.

It's a USB3 port actually, and has been that way for some years. I was pulling diagnostics.

svr-diagnostics-20190221-0833.zip
trurl Posted February 21, 2019

You have 2 Toshiba disks that aren't giving a SMART report:
988GK045FAXG
988YK07LFAXG

You can try to enable SMART reporting on each with this command, substituting the correct letter for X:

smartctl -s on /dev/sdX
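For more than one disk, the command above can be wrapped in a small loop. The following is a minimal sketch, not something from the thread: the function name `enable_smart` is hypothetical, and the device paths must be supplied by whoever runs it. It only uses `smartctl -s on` (enable SMART) and `smartctl -H` (print the overall health assessment).

```shell
#!/bin/sh
# Enable SMART on each device given as an argument, then print its health summary.
# (enable_smart is a hypothetical helper name used only in this sketch.)
enable_smart() {
    for dev in "$@"; do
        echo "== $dev =="
        # -s on enables SMART; -H prints the overall health self-assessment.
        smartctl -s on "$dev" && smartctl -H "$dev"
    done
}

# Pass the real device nodes on the command line; with no arguments nothing runs.
enable_smart "$@"
```

To find which letter belongs to which serial number, `ls -l /dev/disk/by-id/` lists the by-id names (which normally include the serial) alongside the `sdX` node each one points to.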
TODDLT (Author) Posted February 21, 2019 (edited)

45 minutes ago, trurl said:
You have 2 Toshiba disks that aren't giving a SMART report: 988GK045FAXG 988YK07LFAXG You can try to enable SMART reporting on each with this command, substitute the correct letter for X: smartctl -s on /dev/sdX

OK, attached are the two SMART reports. Thanks, I didn't know how to get this working.

Those are the new drives. The one that failed is K045FAXG.

TOSHIBA_HDWN160_988YK07LFAXG-diagnostics-20190221 (sdg).txt
TOSHIBA_HDWN160_988GK045FAXG-diagnostics-20190221 (sdh).txt

Edited February 21, 2019 by TODDLT
trurl Posted February 21, 2019

Those look OK. You might try again after checking connections. Make sure you check power connections all the way back to the PSU.
trurl Posted February 21, 2019

Probably unrelated, but why are you running Unraid 6.6.3?
TODDLT (Author) Posted February 21, 2019

2 minutes ago, trurl said:
Those look OK. You might try again after checking connections. Make sure you check power connections all the way back to the PSU.

Do you think the tmp file filling up was due to a bad connection to the drive?

Power is via a split connector going to both drives. It seems unlikely that only one of the two would fail, though I will look for loose connections. I'll verify the SATA cables where they plug into the MB and try again.
TODDLT (Author) Posted February 21, 2019

2 minutes ago, trurl said:
Probably unrelated but why are you running Unraid 6.6.3?

.... because I'm behind on updates.... Thanks, I hadn't checked in a while. I'll do the update too (now).
mathomas3 Posted February 21, 2019

I haven't looked into your reports... but I did have something like this at one point in my setup... It ended up being that my power supply was too small... I had to upgrade to a 1200 watt...
TODDLT (Author) Posted February 21, 2019

6 minutes ago, mathomas3 said:
I havent looked into your reports... but I did have something like this at one point in my setup... It ended up being my power supply was too small... had to upgrade to a 1200watt...

Hmmm.. I have 14 HDDs connected and 4 SSDs. 2 of the connected HDDs are "spares" and spun down. My hardware configuration is current (see signature).

Yes, that's a lot, but I have a 750W Corsair PSU. I've had this many connected before and had no issues. It starts up fine, meaning 14 HDDs spin up together.

Do you think this is a PSU issue with one failing and one not?
TODDLT (Author) Posted February 21, 2019

I just started the preclear on the bad drive only. I have to leave in about 10 minutes, but I will see how this goes and check back an hour or so later. If it's still working, I'll start the 2nd preclear and see if that changes anything (power drain).

When I first tried to start the preclear from unassigned devices, none of the drive info was populated in the startup window. I did it a 2nd time and all the drive info appeared.
mathomas3 Posted February 21, 2019

I would suggest that you upgrade the PSU... I had 10 drives with a 650 and it would power up, but a parity check would kill it... and you have many more than I had... I currently have a 1200 watt for 8 drives.
mathomas3 Posted February 21, 2019

2 hours ago, TODDLT said:
hmmm.. I have 14 HDDs connected and 4 SSDs. 2 of the connected HDDs are "spares" and spun down. My hardware configuration is current (see signature). Yes, that's a lot but I have a 750W Corsair PSU. I've had this many connected before and had no issues. It starts up fine meaning 14 HDDs spin up together. Do you think this is a PSU issue with one failing and one not?

I was having some strange errors like you are having... and I would expect you to continue having these strange errors until you upgrade it.
JorgeB Posted February 21, 2019

8 minutes ago, mathomas3 said:
I currently have a 1200 watt for 8 drives.

That's overkill, though there's nothing wrong with it. I use 450W for up to 8 drives, 500/550W for up to 12/14, and 650W for up to 20/22.
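JorgeB's rule of thumb can be written down as a simple lookup. The sketch below is only an illustration of those numbers: it assumes the paired figures mean 500W for up to 12 drives and 550W for up to 14, and it uses 22 as the upper bound for 650W. The function name `psu_rule_of_thumb` is hypothetical, and none of this replaces checking the actual drive and PSU specifications.

```shell
#!/bin/sh
# Map a drive count to the PSU wattage from the rule of thumb quoted above.
# (psu_rule_of_thumb is a hypothetical name; the breakpoints are one reading of the post.)
psu_rule_of_thumb() {
    n=$1
    if [ "$n" -le 8 ]; then echo 450
    elif [ "$n" -le 12 ]; then echo 500
    elif [ "$n" -le 14 ]; then echo 550
    elif [ "$n" -le 22 ]; then echo 650
    else echo "unknown"   # beyond the range covered by the post
    fi
}

echo "14 drives: $(psu_rule_of_thumb 14) W"
```

By this reading, the 14 HDDs mentioned earlier in the thread fall in the 550W bracket, below the 750W unit already installed.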
TODDLT (Author) Posted February 21, 2019

OK, 2 hours of preclear and no issues on the drive that previously reported bad twice. I spun up the whole array, and no change. I started the preclear on the other drive, the one that passed one cycle, and it's running now.

The only oddity is that the "good" drive is running at half speed. It's 30% through the pre-read (2nd cycle) and I think I'll restart it and see what happens. On the 1st cycle it ran over 200 MB/sec at the front end and averaged 180 MB/sec overall. 30% into the preclear, it's running at 95 MB/sec.
TODDLT (Author) Posted February 21, 2019

OK, I restarted the preclear of the original "good" drive and the speed is back up to normal. I even did a full array spin-up with both pre-clears running and things keep moving along. I'm going to leave it alone for the afternoon and we'll see what happens.

Thanks all for the input. I'll drop a note when I get home this evening/night.