unRAID Server Release 5.0-beta8d Available



Download | Release Notes

 

(Edit: download link now refers to -beta8d.)

 

There's a critical bug fix in this release having to do with data rebuild.  There is a corner case where, during a data rebuild of a disabled disk (or a disk replaced with a larger one), if a write request occurs for another disk in the same stripe being rebuilt, it's possible the data for the disk being rebuilt is not actually written.  Later, this will cause a Parity Check 'sync error'.
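For context, unRAID's single parity is a per-stripe XOR across the data disks, which is why a missed write during a rebuild surfaces later as a sync error. A toy sketch of the XOR relationship, using made-up byte values (this is illustrative only, not unRAID code):

```shell
#!/bin/sh
# Single-parity XOR relationship, shown with toy byte values.
# Parity is the XOR of all data disks in a stripe...
d1=$(( 0x3c )); d2=$(( 0xa5 )); d3=$(( 0x0f ))
parity=$(( d1 ^ d2 ^ d3 ))

# ...so a missing disk (say d2) is rebuilt as parity XOR the survivors:
rebuilt=$(( parity ^ d1 ^ d3 ))
printf 'rebuilt=0x%02x original=0x%02x\n' "$rebuilt" "$d2"
```

If a concurrent write updates another disk and parity in that stripe but the reconstructed value for the rebuilding disk never reaches the platter, the stored data no longer satisfies the XOR equation — which is exactly what a later parity check flags.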

 

If you are running any previous 5.0-beta release, please upgrade to this release.

 

In addition, I will be updating the 4.7 series (to 4.7.1) with the same parity sync bug fix.  Please do not use this thread to request any other changes in 4.7.1 - the decision is already made that the only change from 4.7 to 4.7.1 will be this driver bug fix.

 

Since it took a very long time to reproduce this bug and figure out what was happening (by running multiple full-disk parity syncs), I had a lot of time to make more improvements and fix other bugs.  :P  Here are some highlights:

 

a) New feature called "cache-only" shares along with a re-written mover script.  The idea with a cache-only share is that the entire share exists only on the cache drive and is never moved to the array.  When a new share is created, and you have a cache disk, you are able to select 'only' for the "Use cache disk" setting.  In this case the top-level share name directory will only be created on the cache disk.  To support this, the 'mover' script is a bit different: it will never move top-level directories on the cache disk which don't also exist on the array.  Note to plug-in developers: if you have created a custom 'mover' script, please examine what is happening with the mover script of 5.0-beta8, since your custom script could now interfere with proper operation of the 'cache-only' share feature.
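To make the new rule concrete, here is a hedged sketch (the function and directory names are invented, not taken from the actual mover script) of the "only move top-level cache directories that also exist on the array" decision:

```shell
#!/bin/sh
# Sketch of the beta8 mover rule (illustrative only, not the real script):
# a top-level cache directory is eligible for moving only when a directory
# of the same name already exists on the array side.
should_move() {            # $1 = top-level cache dir, $2 = array root
  share=$(basename "$1")
  [ -d "$2/$share" ]
}

# Demo with throwaway directories standing in for the cache disk and array:
cache=$(mktemp -d); array=$(mktemp -d)
mkdir "$cache/Movies" "$cache/appdata" "$array/Movies"

for d in "$cache"/*/; do
  if should_move "$d" "$array"; then
    echo "$(basename "$d"): move to array"
  else
    echo "$(basename "$d"): stays on cache (cache-only)"
  fi
done
rm -rf "$cache" "$array"
```

In this sketch, "Movies" would be moved because it exists on both sides, while "appdata" stays put — the cache-only behavior described above.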

 

b) Reading of disk temperatures has been further improved - please report in this thread if you still can't see your temperatures reported correctly, or if spin-up/down is not working correctly (but see next paragraph).

 

Speaking of spin-up/down, a bug introduced in -beta6 could cause a configuration value to be set incorrectly.  If spin-up/down delay is not working after booting this release, please check the "Spin down delay" setting for each disk and make sure it's set correctly, then re-test.

 


I've upgraded from 5.0b7 to 5.0b8, and I can't get to the web interface.  I logged on to the server, and ran the emhttp command by hand and get a segmentation fault.

 

It is on the network, as I can telnet to it.

 

Here's what's in the syslog for that:

 

Jul  7 19:20:32 Tower emhttp: unRAID System Management Utility version 5.0-beta8
Jul  7 19:20:32 Tower emhttp: Copyright (C) 2005-2011, Lime Technology, LLC
Jul  7 19:20:32 Tower emhttp: Plus key detected, GUID: 0781-5406-0000-060512030038
Jul  7 19:20:32 Tower emhttp: rdevName.22 not found
Jul  7 19:20:33 Tower emhttp: diskFsStatus.1 not found
Jul  7 19:20:33 Tower kernel: emhttp[5598]: segfault at 0 ip b75ac760 sp bfc50c80 error 4 in libc-2.11.1.so[b7533000+15c000]


Couple of questions, actually. Does the known issue in this release just produce sync errors, or does it produce data loss in the event of a disk rebuild?

 

From what Tom says, my understanding is that there is a risk of the reconstructed data being corrupt, but that would be evident on a 'Parity check', and that this could be corrected by repeating the rebuild.

 

Also, from what I read, does this mean that the 'MBR unknown' issue is a thing of the past?

 

From the release notes, it would appear that a lot of the 'MBR unknown' checks have been removed.  However, from what is written, it is not clear whether multiple partitions will still cause the checks to fail.


Upgraded my 5.0b2 to 5.0b8 just now.

 

Everything seems normal.  

Upon first startup, first three drives (cache plus two Samsung F4s) started immediately.  

Remaining two drives (both Seagate ST32000542AS) showed "resizing" on main menu.  

After a few moments, all drives are now online.

Temperatures look normal.  Parity reports okay.

 

Footnote/update:  At first, my network shares appeared normal but were inaccessible within Windows Explorer.  (I could see them, but clicking on them gave an error in Windows)

Restarted my Windows machine and everything's normal- shares are back and they're functional. 

 

 


I've installed it... I can't get to the web interface.  I logged on to the server, and ran the emhttp command by hand and get a segmentation fault.

 

It is on the network, as I can telnet to it.

 

I have exactly the same problem. Restarted, but emhttp never came up. Telnet in and start it manually, and get segfault.

 

Attached syslog.

syslog.txt


Couple of questions, actually. Does the known issue in this release just produce sync errors, or does it produce data loss in the event of a disk rebuild?

Depends on the case; there is a chance of data loss unfortunately.  :o

 

Also, from what I read, does this mean that the 'MBR unknown' issue is a thing of the past?

Time will tell.


I've upgraded from 5.0b7 to 5.0b8, and I can't get to the web interface.  I logged on to the server, and ran the emhttp command by hand and get a segmentation fault.

 

It is on the network, as I can telnet to it.

 

Here's what's in the syslog for that:

 

Jul  7 19:20:32 Tower emhttp: unRAID System Management Utility version 5.0-beta8
Jul  7 19:20:32 Tower emhttp: Copyright (C) 2005-2011, Lime Technology, LLC
Jul  7 19:20:32 Tower emhttp: Plus key detected, GUID: 0781-5406-0000-060512030038
Jul  7 19:20:32 Tower emhttp: rdevName.22 not found
Jul  7 19:20:33 Tower emhttp: diskFsStatus.1 not found
Jul  7 19:20:33 Tower kernel: emhttp[5598]: segfault at 0 ip b75ac760 sp bfc50c80 error 4 in libc-2.11.1.so[b7533000+15c000]

 

Looks to be a problem with 'Plus' key - I'll fix it ASAP and post -beta8a.


From the release notes, it would appear that a lot of the 'MBR unknown' checks have been removed.  However, from what is written, it is not clear whether multiple partitions will still cause the checks to fail.

As long as there is a "partition 1" on the Cache drive, the Cache drive will not be re-partitioned.  Also note that the Cache drive can have any of these file systems: reiserfs, ntfs (read-only), ext2/3/4 (though 'lost+found' will appear as a share).  If unRAID ever formats the Cache drive, however, it will currently always build reiserfs.


Thank you Tom, looking forward to upgrading to this version, hopefully by tonight. Just a quick FYI: the announcement page is still showing 5.0-beta7 as the latest beta. Since you just recently posted, you may not have had a chance to update it yet, but with everything going on, a friendly reminder.


Upgraded my 5.0b2 to 5.0b8 just now.

 

Everything seems normal.  

Upon first startup, first three drives (cache plus two Samsung F4s) started immediately.  

Remaining two drives (both Seagate ST32000542AS) showed "resizing" on main menu.  

During startup it could say "Mounting", or "Resizing", or show a numeric value, depending on timing.  "Resizing" is just the second phase of mounting, where it attempts to resize (enlarge) the file system in case this start follows a disk upgrade.  If the disk is the same size as before, it may say "Resizing" but really it's not doing anything.

 

Update:  Shares appear normal (they show up in Windows Explorer) but clicking on any of them in Windows explorer gives an error- '....refers to a location that is unavailable'

This is a Windows problem, but how are these shares configured (ie, public, secure or private)?  There have been no changes in this area.


Thank you Tom, looking forward to upgrading to this version, hopefully by tonight. Just a quick FYI: the announcement page is still showing 5.0-beta7 as the latest beta. Since you just recently posted, you may not have had a chance to update it yet, but with everything going on, a friendly reminder.

 

What announcement page?


Thank you Tom, looking forward to upgrading to this version, hopefully by tonight. Just a quick FYI: the announcement page is still showing 5.0-beta7 as the latest beta. Since you just recently posted, you may not have had a chance to update it yet, but with everything going on, a friendly reminder.

 

What announcement page?

I think madburg means the News section at the top of the forum's main page.

This is a Windows problem, but how are these shares configured (ie, public, secure or private)?

 

You're correct- a Windows peculiarity.  I've since found that restarting my Windows machine cleared things up, and I've updated my initial message with that information.  Thanks!


From the release notes, it would appear that a lot of the 'MBR unknown' checks have been removed.  However, from what is written, it is not clear whether multiple partitions will still cause the checks to fail.

As long as there is a "partition 1" on the Cache drive, the Cache drive will not be re-partitioned.  Also note that the Cache drive can have any of these file systems: reiserfs, ntfs (read-only), ext2/3/4 (though 'lost+found' will appear as a share).  If unRAID ever formats the Cache drive, however, it will currently always build reiserfs.

Ha! a read-only NTFS cache drive... Not too useful as a "cache" drive, is it?

 

Will it mount as writable if we load the ntfs-3g driver?

 

Joe L.

As long as there exists a "partition 1" on the Cache drive, the Cache drive will not be re-partitioned.

 

Yes, that I understood.

 

I was thinking about checks on the partition table area of a data drive.  Does unRAID expect the entries for partitions 2-4 to be all zeros?


From the release notes, it would appear that a lot of the 'MBR unknown' checks have been removed.  However, from what is written, it is not clear whether multiple partitions will still cause the checks to fail.

As long as there is a "partition 1" on the Cache drive, the Cache drive will not be re-partitioned.  Also note that the Cache drive can have any of these file systems: reiserfs, ntfs (read-only), ext2/3/4 (though 'lost+found' will appear as a share).  If unRAID ever formats the Cache drive, however, it will currently always build reiserfs.

Ha! a read-only NTFS cache drive... Not too useful as a "cache" drive, is it?

 

Will it mount as writable if we load the ntfs-3g driver?

 

Joe L.

 

The idea was that if you had an NTFS-formatted drive, you could plug it into the Cache slot and access the files contained therein.  It's a work in progress, but I thought I'd mention it in case someone plugs one in (or plugs in an ext2/3/4 drive).  ;)


As long as there exists a "partition 1" on the Cache drive, the Cache drive will not be re-partitioned.

 

Yes, that I understood.

 

I was thinking about checks on the partition table area of a data drive.  Does unRAID expect the entries for partitions 2-4 to be all zeros?

Yes, this has always been the case and still is.  Actually it's stricter than that: the partition table area of the MBR plus the MBR signature bytes (that is, all bytes in the MBR from offsets 446-511) must match an exact format.  The change is that the code used to also require offsets 0-445 to be all zeros, but that check is gone now (except when looking for a "factory-cleared" signature).
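As an illustration of this kind of check (not unRAID's actual code), a small script can verify the two properties described above: the 0x55AA signature at offsets 510-511 and all-zero entries for partitions 2-4 (offsets 462-509):

```shell
#!/bin/sh
# Rough illustration of the MBR layout checks described above
# (not unRAID's actual code).
check_mbr() {   # $1 = disk image or device
  # MBR signature bytes at offsets 510-511 must be 55 aa:
  sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -v -tx1 | tr -d ' \n')
  [ "$sig" = "55aa" ] || { echo "bad MBR signature"; return 1; }
  # Partition entries 2-4 (offsets 462-509) must be all zeros:
  p234=$(dd if="$1" bs=1 skip=462 count=48 2>/dev/null | od -An -v -tx1 | tr -d ' \n')
  case "$p234" in
    *[!0]*) echo "partition entries 2-4 not zero"; return 1 ;;
  esac
  echo "MBR layout acceptable"
}

# Demo against a zeroed 512-byte image carrying only the signature
# (octal \125\252 is hex 55 aa):
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
check_mbr "$img"
rm -f "$img"
```

Note the real check is stricter still (partition 1's entry must also match an exact format); this sketch only covers the two pieces that are unambiguous from the description above.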


How do we avoid this bug? Not write anything while the replacement drive is being rebuilt?

By updating to 5.0-beta8 or 4.7.1 (which isn't out yet).

 

Barring that, yes: with a data rebuild in progress, don't write.  With an "upgrade existing disk to a larger one" operation, it's potentially unavoidable.

 

Something interesting about this bug: it has existed since day 1 and is also present in older releases of the original Linux 'md' driver (later releases of the 'md' driver have changed greatly, and I didn't go and analyze whether the bug is still there).


How do we avoid this bug? Not write anything while the replacement drive is being rebuilt?

By updating to 5.0-beta8 or 4.7.1 (which isn't out yet).

 

Barring that, yes: with a data rebuild in progress, don't write.  With an "upgrade existing disk to a larger one" operation, it's potentially unavoidable.

 

Something interesting about this bug: it has existed since day 1 and is also present in older releases of the original Linux 'md' driver (later releases of the 'md' driver have changed greatly, and I didn't go and analyze whether the bug is still there).

 

Just thought I'd mention that practical experience with this bug indicates that, if it happens at all, it affects a small number of sectors very early in the disk.  Users have run complete parity checks after a rebuild, and normally the sync errors occur only within the first 0-2 minutes of the check.  Although it is comforting to know that this defect is fixed, practically speaking it has never been a serious problem or caused data loss.


Thank you Tom, looking forward to upgrading to this version, hopefully by tonight. Just a quick FYI: the announcement page is still showing 5.0-beta7 as the latest beta. Since you just recently posted, you may not have had a chance to update it yet, but with everything going on, a friendly reminder.

 

What announcement page?

I think madburg means the News section at the top of the forum's main page.

Yes, sorry, the "NEWS" section.


a) New feature called "cache-only" shares along with a re-written mover script.  The idea with a cache-only share is that the entire share exists only on the cache drive and is never moved to the array.  When a new share is created, and you have a cache disk, you are able to select 'only' for the "Use cache disk" setting.  In this case the top-level share name directory will only be created on the cache disk.  To support this, the 'mover' script is a bit different: it will never move top-level directories on the cache disk which don't also exist on the array.  Note to plug-in developers: if you have created a custom 'mover' script, please examine what is happening with the mover script of 5.0-beta8, since your custom script could now interfere with proper operation of the 'cache-only' share feature.

OK, after a quick look at 5.0-beta8 (and likely onward...):

  • The unMENU conditional-sync mover package is no longer needed at all, since the mover no longer issues a sync command.  It should do no harm, though: it edits the script to add spin-down logic that never gets invoked, so it has no effect.

  • The unMENU package to exclude directories with a leading underscore will have no effect, since there is no longer any logic to exclude directories based on their leading character.  The script will do no harm; it will simply have no effect at all.

    The new logic is that a directory must exist as a user share before the mover script will move anything into it.  Since there are no user shares with leading underscores or leading periods, odds are those directories will stay on the cache drive.  (I've not tested this, but it is my theory.)  In any case, the unMENU script to exclude directories with underscores will neither do harm nor have any effect.

  • The third unMENU package, which eliminates extra syslog entries, will edit the mover script but have no effect either: it relocates the "-print" argument from the end of one line to the beginning of the next, essentially making no change.

 

All the above were deduced by examining the new mover script as un-packed/un-compressed from the new beta8 release.  I do not have a cache drive nor have I even loaded or booted my server on the new beta8 release.  I put this here since odds are others will ask the same question, again and again...

 

If you currently have them installed, you can use the package manager in unMENU to disable re-install on re-boot and they will not be installed the next time you reboot.  They had their place in earlier releases, but now are no longer needed.

 

Joe L.

 

 

