Rajahal Posted October 10, 2010
I want to remove two empty data drives from my array. Normally I would just verify parity, remove the drives, then initconfig. However, for kicks, I decided I want to try the method of zeroing out the drives before removing them, thereby avoiding the lengthy (and slightly risky) parity sync. I figure this is documented in the wiki somewhere, but it is late and I am tired, so I can't find it.

SSD Posted October 10, 2010
How to remove a disk without losing parity protection

Rajahal Posted October 10, 2010
Great, thanks! I should be able to zero both drives at once, right? Or is there some reason I should do them one at a time?

SSD Posted October 11, 2010
Zeroing disks involves updating parity. Zeroing several disks at once may be pretty rough on the parity drive and could slow things down so much that it makes more sense to run them sequentially. Maybe you can do some experimenting and let us know if running in parallel is faster.

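To illustrate why zeroing a disk means updating parity, and why the zeroed disk can then be pulled without a parity sync, here is a minimal Python sketch. It assumes a single parity drive computed as a byte-wise XOR across the data disks; the tiny three-byte "disks" and the function names are made up for the illustration and are not unRAID code.

```python
from functools import reduce


def xor_parity(disks):
    """Return the byte-wise XOR of equally sized disks (assumed parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def zero_disk(disks, parity, index):
    """Overwrite one disk with zeros while keeping parity in step.

    Each write updates parity as old_parity XOR old_data XOR new_data.
    Because new_data is all zeros, that reduces to old_parity XOR old_data.
    """
    old = disks[index]
    new_parity = bytes(p ^ d for p, d in zip(parity, old))
    new_disks = list(disks)
    new_disks[index] = bytes(len(old))  # bytes(n) is n zero bytes
    return new_disks, new_parity


# Hypothetical three-disk array with three bytes per disk.
disks = [b"\x12\x34\x56", b"\xab\xcd\xef", b"\x0f\xf0\x55"]
parity = xor_parity(disks)

zeroed, parity_after = zero_disk(disks, parity, 2)
remaining = zeroed[:2]  # the array with the zeroed disk removed

# Parity still matches the full array and the array without the zeroed disk.
print(parity_after == xor_parity(zeroed), parity_after == xor_parity(remaining))
```

Every block written to the disk being zeroed also costs a read and a write on the parity drive, which is why zeroing two disks in parallel makes both jobs compete for the same parity drive.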
Rajahal Posted October 12, 2010
Will do, thanks.

Rajahal Posted October 13, 2010
OK, well that didn't go as expected. Syslog attached. It is basically half HDIO_GET_IDENTITY errors (from the Supermicro cards) and half reiserfs IO errors.

Here's what happened. I followed your instructions in the link above (though specifying the block size as 2048 as Joe suggested). The disk did NOT immediately show up as 'unformatted' as expected. Instead, it immediately showed up as having 0 free space (see disk9 in the screenshot below). However, the writes to that drive and the parity drive were increasing in unison, so I let it continue. When I came back hours later, the writes had all stopped, so I assume it is done. I'm tempted to continue with the procedure as normal, but I figured I should get some feedback before I do.

Screenshot (disk9 is the zeroed drive, and I also plan on zeroing disk8):

EDIT: OK, weird. Now I'm finding that I can't write to the array. Not just the zeroed disk, but any disk. Reading from the array seems normal, though.

Joe L. Posted October 13, 2010
Quote: "Syslog attached."
No, it is not attached.

Rajahal Posted October 13, 2010
Huh, weird, I could have sworn I attached it. Anyway, I'll have to do it when I get home tonight, I don't have it with me. I'm not sure how helpful it will be, though, since it just has two errors repeated over and over.

Rajahal Posted October 14, 2010
OK, syslog attached for real this time.
syslog_10-12-10.zip

Joe L. Posted October 14, 2010
Disk9 (/dev/md9) has been mounted as read-only to prevent further corruption. It is in need of file-system repair.
See here in the wiki for instructions: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems

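One way to confirm a read-only mount without digging through the syslog is to look at the mount options the kernel reports. The Python sketch below parses text in the format of /proc/mounts (device, mount point, filesystem type, options, two numeric fields). The sample lines, including the /mnt/disk8 and /mnt/disk9 mount points, are invented for the illustration, and it is an assumption that the filesystem appears with the `ro` option once it has been switched to read-only.

```python
def is_mounted_read_only(mounts_text, device):
    """Return True or False for a mounted device, or None if it is not listed.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == device:
            return "ro" in fields[3].split(",")
    return None


# Invented sample; on a live system the text would come from /proc/mounts.
sample = (
    "/dev/md8 /mnt/disk8 reiserfs rw,noatime,nodiratime 0 0\n"
    "/dev/md9 /mnt/disk9 reiserfs ro,noatime,nodiratime 0 0\n"
)

print(is_mounted_read_only(sample, "/dev/md9"))
```

On a running server the same function could be fed `open("/proc/mounts").read()` instead of the sample text.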
Rajahal Posted October 14, 2010
Thanks Joe. reiserfsck failed too. It immediately output:

reiserfs_open: the reiserfs superblock cannot be found on /dev/md9
Failed to open the filesystem.
If the partition tables has not been changed, and the partition is valid and really contains a reiserfs partition, then the superblock is corrupted and you need to run this utility with --rebuild-sb.

HOWEVER, now on the unRAID web interface disk 9 is showing up as unformatted, as it should. So, keeping in mind that this is an empty drive and there's nothing that I care to save on it, should I run --rebuild-sb or should I continue with the drive zeroing process?

Oh, I should also mention that I had to hard reset the server to get it to allow me to unmount (umount) disk9. When I typed in those commands it pretty much locked up (lost access to the web page and telnet), so I hard reset it. A parity check is now running, as expected. However, the reads on the unformatted disk9 are increasing at the same rate as all the other disks. I figured a parity check would ignore all unformatted disks. Is this expected behavior?

Joe L. Posted October 14, 2010
Quote: "Thanks Joe. reiserfsck failed too. It immediately output: reiserfs_open: the reiserfs superblock cannot be found on /dev/md9 Failed to open the filesystem. If the partition tables has not been changed, and the partition is valid and really contains a reiserfs partition, then the superblock is corrupted and you need to run this utility with --rebuild-sb. HOWEVER, now on the unRAID web interface disk 9 is showing up as unformatted, as it should. So, keeping in mind that this is an empty drive and there's nothing that I care to save on it, should I run --rebuild-sb or should I continue with the drive zeroing process?"

If it is an empty drive (not yet formatted), then you can ignore the error message in the log saying that it cannot be mounted. There is no way it can be mounted until it has a file-system on it.

Quote: "Oh, I should also mention that I had to hard reset the server to get it to allow me to unmount (umount) disk9. When I typed in those commands it pretty much locked up (lost access to the web page and telnet), so I hard reset it. A parity check is now running, as expected. However, the reads on the unformatted disk9 are increasing at the same rate as all the other disks. I figured a parity check would ignore all unformatted disks. Is this expected behavior?"

Parity has absolutely no idea of the contents of a drive. It uses the disk identically for parity calculations regardless of whether it is formatted or not. To parity, it is just a set of bits. The disk is included in the parity calculations as long as it is assigned on the devices page.

You can press the "Format" button if you wish to format disk9. You can do that even while parity is being calculated. (Obviously anything previously on it will be gone, but you said it was empty, so that is probably no problem.)

Joe L.

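The point that parity sees only bits can be shown with a short Python sketch. It again assumes single XOR parity; the three 16-byte "disks" (one holding file-system-like data, one holding random garbage, one all zeros as a stand-in for an unformatted disk) are hypothetical. Whatever the contents, any one disk can be rebuilt from the others plus parity.

```python
import os
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 16-byte disks with very different contents.
disks = [
    b"filesystem data!",  # stands in for a formatted disk
    os.urandom(16),       # random bytes with no file system at all
    bytes(16),            # all zeros, like a freshly zeroed disk
]
parity = xor_blocks(disks)

# Rebuild each disk in turn from the other disks plus parity.
rebuilt = []
for i in range(len(disks)):
    others = [d for j, d in enumerate(disks) if j != i]
    rebuilt.append(xor_blocks(others + [parity]))

print(rebuilt == disks)
```

Nothing in the calculation looks at file-system structures, which matches the observation that reads on the unformatted disk9 kept pace with every other disk during the parity check.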
Rajahal Posted October 15, 2010
Well, I continued with the procedure and everything seemed to go normally. The server is working on the final parity check now. I don't know why this 500 GB drive started to freak out, but it is safely out of the array now, so I'm not worried about it.

Too bad I didn't get to play with removing both drives simultaneously, but I didn't want to risk it given the troubles I had with the 500 GB drive. Once I get both drives out, maybe I'll be able to run some tests on my test server (if it isn't already sold by then; someone is sniffing around it).

Rajahal Posted October 16, 2010
I just started zeroing my second drive (the 640 GB one). It looks like this one is going the exact same course as the 500 GB drive - same errors, same symptoms. Could this be a result of using the SuperMicro cards? If so, we might want to add the 'reiserfsck' step to the official procedure for those using these cards.

schmegg Posted October 16, 2010
Just curious, Rajahal: how long did the zeroing process take you (using the 2048 block size)?

Rajahal Posted October 16, 2010
Not sure since I didn't really time it, but definitely less time than a parity check. So probably something like 5-6 hours. That will change depending on the size of the drive, of course.

SSD Posted October 19, 2010
Quote: "I just started zeroing my second drive (the 640 GB one). It looks like this one is going the exact same course as the 500 GB drive - same errors, same symptoms. Could this be a result of using the SuperMicro cards? If so, we might want to add the 'reiserfsck' step to the official procedure for those using these cards."

Please provide the details of what you had to do and I will update the original post.

Rajahal Posted October 19, 2010
OK, here are the differences:

Step 3 looks different. Instead of showing up as unformatted, the drive shows up as having 0 free space (screenshot earlier in this thread). Refreshing the page shows the writes to the zeroed drive and the parity drive increasing, so that's the only way to know that it is working. Likewise, the only way to know that the zeroing is finished is to refresh the page several times and see that the writes for that drive haven't changed.

After the zeroing completed, I was unable to write to the array (any disk, not just the zeroed disk). After looking at my syslog, Joe said that the disk had been mounted as read-only. Running reiserfsck on that disk fixed it. So I guess you should add a step right after the zeroing finishes (so between steps 5 and 6) saying that anyone using a SuperMicro AOC-SASLP-MV8 card should run reiserfsck on the zeroed disk. Also warn them that they won't be able to write to their array until the reiserfsck finishes.

The rest of the procedure works as you have described.

One final note - when I was going through the procedure, I was at first a bit confused by step 2. I didn't know if I should follow the syntax in the screenshot, or the 2048k block size syntax that Joe recommended. I ended up using the 2048k syntax, and it worked fine. If that is the recommended syntax to use, then I would suggest updating the screenshot (you can just type it in, you don't actually have to run the command). Basically, don't give people two options when one will suffice.

Thanks very much for updating this! When I have some more time, I'll try to build a new array and test out zeroing two drives at once, just to see what happens.

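The "refresh the page until the writes stop changing" check described above can be automated. The Python sketch below is only an illustration: `read_counter` is a hypothetical callable that returns the current write count for the drive (how to obtain that number on an unRAID server is not covered in this thread), and the simulated values stand in for successive page refreshes.

```python
import time


def wait_until_stable(read_counter, stable_polls=3, interval=60.0, sleep=time.sleep):
    """Poll a counter until it is unchanged for stable_polls consecutive polls.

    read_counter is a hypothetical callable returning the current write count.
    Returns the final counter value.
    """
    last = read_counter()
    unchanged = 0
    while unchanged < stable_polls:
        sleep(interval)
        current = read_counter()
        if current == last:
            unchanged += 1
        else:
            unchanged = 0
            last = current
    return last


# Simulated write counts, as if read from successive page refreshes.
values = iter([10, 20, 30, 30, 30, 30])
result = wait_until_stable(lambda: next(values), sleep=lambda seconds: None)
print(result)
```

A counter that has stopped moving only shows that writes have stopped. It does not distinguish a finished zeroing run from one that stalled on an error, so the syslog is still worth checking.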