May 10, 2010 (Author)

Quote: The problem I've got here is kind of different: sometimes, when the server is shut down uncleanly, the "format issue" appears because of the filesystem verification. After the fs check, the disk is mounted, but the files on that disk aren't included in the "/mnt/user" folder, so there is no Samba access to these files. I have to reboot the server to get this solved.

That should be impossible. What do you mean by "format issue" in this case?

Quote: The format button shouldn't affect the drives assigned to the array; the format process should be done prior to the addition, and only with selected non-array disks.

The problem with this approach is that adding a disk which has not been set to "all zeros" to an existing array will necessitate running parity-sync on that array after the new disk has been added. While parity-sync is running, the array is vulnerable to a single-disk error.
May 10, 2010

Quote: The problem I've got here is kind of different: sometimes, when the server is shut down uncleanly, the "format issue" appears because of the filesystem verification. After the fs check, the disk is mounted, but the files on that disk aren't included in the "/mnt/user" folder, so there is no Samba access to these files. I have to reboot the server to get this solved.

Quote: That should be impossible. What do you mean by "format issue" in this case?

Tom,

In releases prior to 4.5.3, if a file-system journal is replayed because of a non-clean shutdown, it can take as long as 5 or 6 minutes for that file-system to be "mounted". During that time, if you refreshed the browser, it showed "Unformatted" for the file-systems not yet mounted. I'm pretty sure the "Format" button was shown as a result. Once the file-system journal was replayed and the file-system mounted, the format button would go away.

On those older releases, it was possible to format a valid data drive. It was just less likely, since most times the file-system would mount within a fraction of a second.

It sounds as if the user-file-system and shares were already in place, but the disk that had not yet been mounted did not get picked up by the user-shares, and that is why the user rebooted.

I'm pretty sure you now (in 4.5.3) have the display saying "Mounting" as the journal is replayed. It is just that now, for some people, all the disks fail to mount, with apparently no delay. (I've not seen it here on my older MD1200 array.)

Joe L.
May 10, 2010

Quote: It is just that now, for some people, all the disks fail to mount, with apparently no delay. (I've not seen it here on my older MD1200 array.)

I've seen this on my larger system, and I did not add any new drives. I cannot make it happen on unRAID Basic no matter what I do, and I'm too chicken to try reproducing it on my large array with valid data.
May 11, 2010

Joe L., that's exactly what's happening here, but I'm quite sure that the problem lies in the export of user shares to Samba. If I have a user share that is allocated on only one disk, and this disk gets checked at boot, this user share isn't exported to Samba.

Tom, IMHO the format procedure should be done in two parts, like it is in WHS: the first should be like the pre_clear script, and the second should be the addition of the drive itself. This approach has some advantages, such as protecting array drives from the format procedure and keeping the offline period to a minimum.
May 12, 2010

Quote: In releases prior to 4.5.3, if a file-system journal is replayed because of a non-clean shutdown, it can take as long as 5 or 6 minutes for that file-system to be "mounted"

The reason it takes so long to replay the journal is that it is fighting with the parity check. I'd like to see unRAID wait until the journal replay has finished (or, if that is hard to detect, wait for a timed delay) before the automatic parity check starts.

A few times (before I had my UPS) I ran a manual parity check after the automatic one, and a couple of sync errors were corrected. I have a feeling that it had something to do with the timing of the replay of the journaled entries versus the parity check.
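The "wait until the journal replay has finished, or fall back to a timed delay" idea in the post above can be sketched as a polling loop with a timeout. This is only an illustration, not how unRAID works internally: it assumes a file-system counts as ready once its mount point shows up in `/proc/mounts`-style text, and the function names, the `/mnt/disk1`-style paths, and the sample device and file-system names are all hypothetical.

```python
import time
from typing import Callable, Iterable, Set


def mounted_points(proc_mounts_text: str) -> Set[str]:
    """Return the mount points listed in /proc/mounts-style text.

    Each line looks like: <device> <mount point> <fstype> <options> ...
    """
    points = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.add(fields[1])
    return points


def wait_for_mounts(
    expected: Iterable[str],
    read_mounts: Callable[[], str],
    timeout_s: float = 600.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every expected mount point is mounted.

    Returns True once all are present, or False when timeout_s elapses
    first (the "timed delay" fallback).
    """
    wanted = set(expected)
    deadline = clock() + timeout_s
    while True:
        if wanted <= mounted_points(read_mounts()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the real /proc/mounts.
    sample = "/dev/md1 /mnt/disk1 reiserfs rw 0 0\n"
    ready = wait_for_mounts(["/mnt/disk1"], lambda: sample, timeout_s=5)
    print("start parity check" if ready else "timed out; start anyway")
```

On a real Linux system, `read_mounts` could be `lambda: open("/proc/mounts").read()`. Passing the reader, sleep function, and clock in as parameters keeps the loop easy to exercise without touching real disks.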
May 12, 2010 (Author)

Quote: Tom, IMHO the format procedure should be done in two parts, like it is in WHS: the first should be like pre_clear script, and the second should be the addition of the drive itself. This approach has some advantages, like the protection of array drives to format procedure and a minimum offline period.

That's pretty much exactly how it works now.

EDIT: Agreed, it could be more intelligent. As it stands, the array is off-line while new disks are being cleared. There are two reasons for this. First, when the code was written, hard drives were much smaller and clearing didn't take all that long. Second, I didn't want to deal with potential problems where the clearing process took enough resources to cause 'glitching' in video streams. Also, at the moment a cleared disk is added to the array, the driver would stall for a bit, again possibly introducing glitching. Anyway, that's the thinking behind the current method.
May 14, 2010

Tom, I think that the "resources race" can be controlled using something like ionice, and it's surely preferable to have a small glitch at the end of the process than a large amount of off-line time.
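As a rough illustration of the ionice suggestion above: the `ionice` utility can start a command in the "idle" I/O scheduling class (`-c 3`), so that it only gets disk time when nothing else is asking for it. The sketch below just builds such a command line. The helper name and the `dd` clearing command with its placeholder target path are hypothetical, and whether the idle class is actually honored depends on the I/O scheduler the kernel is using.

```python
from typing import List, Sequence


def idle_io_command(command: Sequence[str]) -> List[str]:
    """Prefix a command so that ionice runs it in the idle I/O class."""
    return ["ionice", "-c", "3", *command]


if __name__ == "__main__":
    # Hypothetical clearing command; the target path is a placeholder
    # and the command is only printed here, never executed.
    clear_cmd = ["dd", "if=/dev/zero", "of=/path/to/new/disk", "bs=1M"]
    print(" ".join(idle_io_command(clear_cmd)))
```

The resulting list can be handed to `subprocess.run` as-is. This addresses only the contention during clearing; it would not remove the brief driver stall at the moment the cleared disk joins the array.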
May 14, 2010 (Author)

Quote: ... for sure its preferable to have a small glitch at the end of the process than a large amount of off-line time.

Haha, I should send you some of my emails complaining about glitches.
May 14, 2010

Quote: ... for sure its preferable to have a small glitch at the end of the process than a large amount of off-line time.

Quote: Haha I should send you some of my emails complaining about glitches

Everybody has different priorities, and it's very hard to manage them all. Users can be quite tiresome at times, myself included. Thanks a lot for your effort, Tom!
May 31, 2010

I just upgraded from 4.5.3 TEST to 4.5.4 with no issues. I was holding off on installing my new, larger parity drive until this fix came out. The upgrade went perfectly. I then put my old parity drive in and added it to the array. It was cleared and then formatted just fine. Everything tested perfectly and the system seems stable.

I reviewed my syslog and found only one line I didn't like after a clean bootup:

Tower kernel: ACPI Warning: Incorrect checksum in table [OEMB] - B9, should be B8 (20090903/tbutils-314)

Any idea what this is? It could also have been there before the upgrade without my noticing it. Thanks!
This topic is now archived and is closed to further replies.