Compass Posted October 30, 2016

Hi All,

Well, it's finally happened after many years of trouble-free service: the dreaded red X. I've been playing around with my server over the last few days, re-ordering disks etc., as I just upgraded to V6.2.2 and have bought 3 new 6TB drives to add to the array. Only 1 has been precleared so far, and it has not been added to the array. Now my Disk 2 has the X (device is disabled, contents emulated), and the dashboard page says it's faulty. I am planning on adding 2 of the new 6TB drives as parity; my current parity is 4TB.

I can see the data (movies) on disk2 from my Windows PC as both disk and user shares. However, in one of disk 2's disk shares there is a folder called hsperfdata_abc. I'm not sure what this is and have never seen it before; my user settings are secure and I'm the only user.

I've checked and re-checked all the connections; all are good. I have attached the syslog.

I don't have enough room on my other array drives to move the data over to them, so I am wondering what my options might be:

1: Move the data over my network to my Windows PC.
2: Add the already precleared 6TB as my cache drive (not sure if I can do that as it's bigger than my parity, but technically not in the array), move the data to that and hold it there until I can replace the failed drive.
3: Wait for you smart people to tell me what to do....

Thanks in advance

Attachment: tower-syslog-20161031-0656.zip

JorgeB Posted October 30, 2016

For V6 always post the complete diagnostics: Tools -> Diagnostics

Compass Posted October 30, 2016

OK thanks, done. The disk in question ends in 6241.

Attachment: tower-diagnostics-20161031-0802.zip

JorgeB Posted October 30, 2016

SMART for disk2 looks fine. The server was rebooted, so there is no info in the syslog about what happened. You have two choices:

-Rebuild disk2 using the same disk (in this case it is probably a good idea to check/replace all cables and run an extended SMART test before rebuilding).
-Do a parity swap: use a new 6TB for parity and the old parity to rebuild disk2.

Compass Posted October 30, 2016

I've tried running disk 2 off the motherboard and off the SASLP, so with different cables, and ended up with the same result. I'm currently running an extended SMART test and will post what it says.

Does the Parity Swap procedure work on V6.2.2?

JorgeB Posted October 30, 2016

"Does the Parity Swap procedure work on V6.2.2?"

Yes. If you do it, keep old disk2 intact until the rebuild finishes; it can still be useful if something goes wrong.

Compass Posted October 30, 2016

Righto, thanks again. The extended SMART test is going to take a while; I will let you know the outcome.

Compass Posted October 30, 2016

The extended SMART test looks OK.

Attachment: WDC_WD20EARS-00S8B1_WD-WCAVY6506241-20161031-1506.txt

JorgeB Posted October 30, 2016

It does. You can try to rebuild using the same disk or do the parity swap, whichever you prefer.

Compass Posted October 31, 2016

I did the parity swap, but now disk 1 is down. See the diagnostics; I haven't shut the server down yet... bugger.

The disk in question is WDC_WD20EARS-00MVWB0_WD-WMAZA0523138-20161101-1916. During the parity swap/check it came back with millions of errors.

This is the error message on the Disk1 page:

scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46

Attachment: tower-diagnostics-20161101-1922.zip

JorgeB Posted October 31, 2016

The parity copy completed successfully, but disk1 dropped out a couple of hours after the rebuild of disk2 started, so disk2 needs to be rebuilt again.

It looks like disk1 timed out, eventually making the SASLP crash, so I would start by powering down, reseating that controller and checking the cables to disk1. Then power up and post new diags so we can see SMART for disk1 and decide the best way to proceed.

Keep old disk2 intact; it may be needed.

Compass Posted October 31, 2016

See attached. Thanks again for your help.

Attachment: tower-diagnostics-20161101-2036.zip

JorgeB Posted October 31, 2016

That file system corruption is expected, since disk2 wasn't completely rebuilt and disk1 is being emulated using a corrupt disk2. The main problem is that disk1 has pending sectors, so it may be impossible to rebuild disk2. To confirm, do an extended SMART test on disk1 and post the results.

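Pending sectors show up in the SMART attribute table as attribute 197 (Current_Pending_Sector). A minimal sketch of reading that value from `smartctl -A` style text, assuming the usual smartmontools ten-column attribute layout. The function names and the sample table are made up for illustration and are not taken from the diagnostics in this thread.

```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            attrs[fields[1]] = int(raw) if raw.isdigit() else raw
    return attrs


def has_pending_sectors(attrs):
    """True when the raw Current_Pending_Sector count is above zero."""
    value = attrs.get("Current_Pending_Sector", 0)
    return isinstance(value, int) and value > 0


# Hypothetical sample, NOT the real values for the disk discussed here.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       12
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

attrs = parse_smart_attributes(SAMPLE)
print("pending sectors present:", has_pending_sectors(attrs))
```

A non-zero raw value only flags sectors the drive could not read reliably; the extended SMART test requested above is what confirms whether they are actually unreadable.
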
Compass Posted October 31, 2016

OK, started the extended test on disk 1.

Compass Posted November 1, 2016

See attached: it failed.

Attachment: WDC_WD20EARS-00MVWB0_WD-WMAZA0523138-20161101-2151.txt

JorgeB Posted November 1, 2016

That's what I was afraid of: it's not going to be possible to rebuild disk2. Fortunately the old disk2 seems to be OK, and as long as you didn't write anything to the array after it was disabled, all data from that disk should be OK. For disk1 you can try two things, depending on how important and/or easy it is to replace its data:

-If any missing data is easily replaceable, do a new config with all the old disks except disk1 (include old parity and old disk2), use a spare in disk1's place and let parity sync. Then mount old disk1 in your cache slot (or using the Unassigned Devices plugin) and copy all data to the array. With some luck you should be able to copy most of it.

-If data from disk1 is very difficult to replace, you can try this option first, but only if there were no writes to disk2 after it was disabled. Do a new config with all the old disks except parity, use the new 6TB parity, and before starting the array check "parity is already valid". Start the array, stop the array, unassign old disk1, assign a spare to its slot, then start the array to begin the rebuild. If the rebuild is not completely successful you can still use the first option.

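Why unreadable sectors on disk1 block the rebuild of disk2: with single parity, each parity byte is the XOR of the corresponding bytes on every data disk, so a missing disk can only be recomputed where all the other disks read back correctly. A toy sketch with made-up byte values; the function names are illustrative and are not Unraid internals.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recompute the one missing data block from parity plus all survivors."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three toy "data disks", four bytes each.
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
disk3 = bytes([0x01, 0x02, 0x03, 0x04])
parity = xor_blocks([disk1, disk2, disk3])

# disk2 is missing, disk1 and disk3 are fully readable: the rebuild is exact.
rebuilt_ok = rebuild_missing([disk1, disk3], parity)

# disk1 has one unreadable byte (modelled here as 0x00): the rebuilt disk2
# is wrong at exactly that position and correct everywhere else.
disk1_damaged = bytes([0x11, 0x00, 0x33, 0x44])
rebuilt_bad = rebuild_missing([disk1_damaged, disk3], parity)

print(rebuilt_ok == disk2, rebuilt_bad == disk2)
```

The same reasoning is behind the second option above: the old disk2 is intact and matches the new parity only if nothing was written after it was disabled, which is what allows disk1 to be rebuilt from it.
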
Compass Posted November 1, 2016

There have been no writes to any of the disks since these issues started (if by that you mean adding stuff to the array).

Is it possible to assign disk 1 to the cache slot first to see if the data is recoverable, before doing any of the above?

I will have to preclear the other 2 6TB drives before I do anything else, which will take a few days.

JorgeB Posted November 1, 2016

"Is it possible to assign disk 1 to the cache slot first to see if the data is recoverable? Before doing any of the above?"

You can, but not knowing which option you're going to do, you can end up with a UUID collision. Also, your current array has filesystem corruption, so it can crash. I would suggest the following:

-take a screenshot of all assignments
-tools -> new config -> Retain array configuration: -> select "all" -> check "Yes I want to do this" -> Apply
-on the main page change the assigned disk2 to the old disk WDC_WD20EARS-00S8B1_WD-WCAVY6506241; you'll probably need to reassign old disk1 as well
-leave the remaining assignments as they are, including the new parity
-before starting the array check "parity is already valid"
-start array

All your data should come online, including disk1. You can check disk1's contents, but don't try to copy anything from it, and don't write anything to the array.

If all looks good and you want to do option 2, you just need to stop the array, unassign old disk1, assign a spare (e.g., the old parity disk) and start the array to rebuild.

This way you don't need to preclear any more disks for now, and it will leave old disk1 intact in case you need to do option 1.

Compass Posted November 1, 2016

Thanks. I'm already 11 hrs into preclearing one of the other 6TB drives; I will let that finish and then proceed with the above.

Out of interest, I used the Unassigned Devices plugin and mounted the old disk2, and all the data looks OK.

Again, I haven't written to or moved any files on the array whilst all this is happening.

Compass Posted November 4, 2016

OK, I've done the following:

-printed a screenshot of all assignments
-stopped the array
-tools -> new config -> Retain array configuration: -> select "all" -> check "Yes I want to do this" -> Apply
-on the main page changed the assigned disk2 to the old disk WDC_WD20EARS-00S8B1_WD-WCAVY6506241
-left the remaining assignments as they are, including the new parity
-before starting the array, checked "parity is already valid"

However, there is still a message next to the parity disk that "all data on this disk will be erased when array is started".

Do I still go ahead and start the array? Will it start a parity sync straight away?

JorgeB Posted November 4, 2016

"however there is still a message next to the parity disk that "all data on this disk will be erased when array is started""

That is normal, but if "parity is already valid" is checked it will not be rebuilt. You can start the array.

Compass Posted November 4, 2016

Yep, cool. I'm 50% into the rebuild. I did option 2: after checking the contents of Disk 1, which all looked good, I stopped the array, assigned my old parity disk to the Disk 1 slot to rebuild onto, and restarted the array.

It started rebuilding, but Disk 1 has an orange triangle next to it (with 'device contents emulated' when hovered over). It seems to be being written to and there are no errors being recorded. Fingers crossed the orange triangle disappears after it's finished?

The old Disk 2 seems to be fine now too. Weird; time will tell.

JorgeB Posted November 4, 2016

"...restarted the array and it started rebuilding but Disk 1 has an orange triangle next to it (with 'device contents emulated' when hovered over)... fingers crossed the orange triangle disappears after it's finished?"

That is normal during the rebuild; it will change to green once it finishes.

Compass Posted November 4, 2016

It turned green and seems to be working, however I'm getting this REISERFS error (see screenshot attachment). I have attached diagnostics too.

Also, is there a 'global' security setting? I've got all my User Shares set to Secure and my Disk Shares set to Public, but I can't add anything to either type of share.

Attachment: tower-diagnostics-20161105-1913.zip

JorgeB Posted November 4, 2016

You need to check the filesystem on disk1.

https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Drives_formatted_with_ReiserFS_using_unRAID_v5_or_later

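For orientation, the check on the linked page comes down to running reiserfsck in check mode against the disk's array device, with the array started in Maintenance mode. A hedged sketch that only builds and prints the command rather than running it. It assumes array disk N is exposed as /dev/mdN, which should be verified against the wiki page for the unRAID version in use; the helper name is hypothetical.

```python
import shlex


def reiserfsck_check_command(disk_number):
    """Build a read-only ReiserFS check command for array disk N."""
    if disk_number < 1:
        raise ValueError("array data disks are numbered from 1")
    # Assumption: array disk N is exposed as /dev/mdN (the parity-protected
    # device), so disk1 would be /dev/md1.
    return ["reiserfsck", "--check", f"/dev/md{disk_number}"]


command = reiserfsck_check_command(1)
print(shlex.join(command))
```

The command is printed instead of executed on purpose: the check reports what it finds, and any repair step it recommends should be followed from its own output and the wiki page, not guessed at.
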