unRAID Server Release 4.7 "final" Available


limetech


I am a relatively new unRaid user but I am very surprised to have accidentally discovered this thread about a potential threat to the integrity of my data. Or have I misread the posts?

 

I realise that it is explicitly a flaw in the underlying Unix/Linux core, and I am also aware that I have no means of assessing the level of threat that this situation represents, but I do feel that I should not have had to find out about it simply by chance!

 

Surely the correct thing to do is to post a sticky message in Announcements so that intending and existing users of unRaid can make an informed choice about which way they want to proceed?

 

Anything else is simply brushing it all under the carpet in my eyes.

  • 2 months later...

I am a relatively new unRaid user but I am very surprised to have accidentally discovered this thread about a potential threat to the integrity of my data. Or have I misread the posts?

 

I realise that it is explicitly a flaw in the underlying Unix/Linux core, and I am also aware that I have no means of assessing the level of threat that this situation represents, but I do feel that I should not have had to find out about it simply by chance!

 

Surely the correct thing to do is to post a sticky message in Announcements so that intending and existing users of unRaid can make an informed choice about which way they want to proceed?

 

Anything else is simply brushing it all under the carpet in my eyes.

 

Wow. I am shocked to find this thread as well. I have never wandered this far down it since last summer, when I purchased unraid and was under the false impression that 4.7 was in fact a stable release. So we are nearly a year on now and we do not have a fix for this? :( With due respect, I have to be blunt here, and sorry if this offends, but this is not acceptable at all. Can we have a statement about the future and whether there will even be a 4.7.1 release to address these issues, or a stable 5.X release anytime soon?

The whole reason I chose unraid and 4.7 was because it was proven and stable. It's not like this is free; we have paid for a licenced product here whose primary job is to store data redundantly, and it fails at task number 1.

 

I am also not an experienced unraid user and would like to know if I am understanding this correctly. So these bugs in 4.7 "stable" will only show themselves if we do one of the following:

 

1: Replace a drive

2: Rebuild/reconstruct a drive

3: Add a new drive to the array?

4: Recalculate parity if it previously failed?

 

Any other instances?

 

 

 


I too have been wanting this fix as I run two 4.7 servers and I don't want to move to 5.0. In the meantime, I strictly follow one rule: if a disk rebuild or disk upgrade is in progress, I do not write to the array. The probability of the bug being triggered is small, but the severity is great, so luckily the workaround is easy.

 

Does this bug affect parity rebuilds too?

It would affect any rebuild, but a parity rebuild followed by a parity "check" would detect it and correct parity.

 

The problem is when re-constructing a data drive, as there is no equivalent "check".

 

As stated, the work-around is NOT to write to a data drive when re-constructing that drive in the array.
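
To make the difference concrete, here is a toy sketch of the single-parity arithmetic in bash, using made-up single-byte values (this is only an illustration of the math, not anything taken from the actual md/unRAID driver):

#!/bin/bash
# Toy illustration of single-parity math, one byte per "disk".
d1=0xA7; d2=0x3C; d3=0x55          # example data bytes on three data disks
parity=$(( d1 ^ d2 ^ d3 ))         # the parity disk holds the XOR of all data disks

# If disk 2 fails, its contents are re-created from parity plus the survivors:
rebuilt_d2=$(( parity ^ d1 ^ d3 ))
printf 'rebuilt d2 = 0x%X (original was 0x%X)\n' "$rebuilt_d2" "$d2"

# A parity "check" can re-read every data disk and verify the XOR still matches,
# so a wrong parity block is detectable and correctable after the fact.
# A re-constructed data block has no second copy to compare against: if the
# stripe was computed from stale data during the rebuild, the wrong bytes are
# written to the replacement disk and nothing ever flags them.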

There is a second, equally serious bug in the 4.7 version of unRAID, as shown here:

http://lime-technology.com/forum/index.php?topic=16523.0

and here

http://lime-technology.com/forum/index.php?topic=16471.0

and here:

http://lime-technology.com/forum/index.php?topic=15385.0

 

Attempting to re-construct a super.dat file will result in the MBR of existing data drives being re-written, often pointing to the wrong starting sector.  The result: drives that show as un-formatted (until the partitioning is corrected in the MBR), and a potential loss of all data if the unRAID owner does something on their own that wipes the drive.

 

An un-writable super.dat, or a complete replacement of the flash drive, will cause this bug to show itself in the 4.7 series.
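
For anyone worried they may have hit this one, the starting sector recorded in a drive's MBR can be read without writing anything to the disk. This is only a sketch (substitute the real device for /dev/sdX), and the "normally 63" below is just the usual starting sector for a 4.x data-disk partition, so treat any other number as a reason to stop and ask before formatting or writing anything:

#!/bin/bash
# Read the first partition's starting sector straight out of the MBR.
# The first partition-table entry begins at byte 446 of the MBR; its 32-bit
# starting-LBA field sits 8 bytes further in (byte 454), stored little-endian.
DEV=/dev/sdX    # placeholder - substitute the real data disk, e.g. /dev/sdb

dd if="$DEV" bs=1 skip=454 count=4 2>/dev/null | od -An -tu4

# On a healthy 4.x data disk this normally prints 63.  Anything else suggests
# the MBR has been rewritten and the partition no longer lines up with the
# file system, which is why the disk then shows up as "unformatted".
# Do NOT let anything write to or format the disk until the MBR is corrected.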

 

Joe L.

 

Hi Joe L. (or anyone else who can help),

 

Can you explain to me what you mean by:

 

1: "attempting to reconstruct a super.dat file"

Do you mean altering files on the USB/flash drive that holds the unRAID OS? Why would you do this?

 

2: "An un-writable super.day". What is this super.dat file? Where is it? Why is it? Why would it get corrupted/altered or be unwritable. I do not know what it does so excuse my questions.

 

Regards


Keep in mind that when 4.7 was released, the data corruption bug had not yet been discovered. It wasn't part of unRAID itself but of the underlying OS - and it affected (I believe) every Linux OS.

 

Doesn't speak to why it's taken this long for a patch though.

 

Is there evidence of this affecting other Linux OSes? Has anyone EVER seen or experienced this outside of unRAID? I have not heard of it.


Is there evidence of this affecting other Linux OSes? Has anyone EVER seen or experienced this outside of unRAID? I have not heard of it.

Look here:

http://www.spinics.net/lists/raid/msg33994.html

The problem referenced deals only with RAID6-specific code. RAID4 is quite different, and significantly simpler.

 

[ Objection sustained ... please continue, counselor ... ]

 

 

I think it affects the raid5 code. raid5.c is mentioned and I believe unRAID is based on the code in raid5.c


I think it affects the raid5 code. raid5.c is mentioned and I believe unRAID is based on the code in raid5.c

Although unRAID appears to be built on a RAID4 layout, (I now see that) the raid5.c driver is used for RAID4, RAID5, and RAID6 layouts.

 

[ So, objection overruled(!!) on cross- ... ]  Thanks.

 



Although unRAID appears to be built on a RAID4 layout, (I now see that) the raid5.c driver is used for RAID4, RAID5, and RAID6 layouts.

 

[ So, objection overruled(!!) on cross- ... ]  Thanks.

The bug has been there for years... it took unusual circumstances to fall into it.  You have to be writing to a given block at the exact same time you are ALSO writing it to compute parity (or to re-construct a disk block).  The "write" of the file that was not yet flushed from the disk buffer to the physical disk is ignored, and the block marked for writing is forgotten about.


The bug has been there for years... it took unusual circumstances to fall into it.  You have to be writing to a given block at the exact same time you are ALSO writing it to compute parity (or to re-construct a disk block).  The "write" of the file that was not yet flushed from the disk buffer to the physical disk is ignored, and the block marked for writing is forgotten about.

 

While we are waiting for the drama surrounding 5.0 to settle, maybe a 4.7 release to fix a known data-loss situation would be in order?  Just sayin'

  • 2 weeks later...


 

While we are waiting for the drama surrounding 5.0 to settle, maybe a 4.7 release to fix a known data-loss situation would be in order?  Just sayin'

 

 

Can we assume the priority is 5-RC2 then, given the activity we see on the version 5-RC1 thread now? Or is a 4.7.1 version still being worked on with the above required fixes?


RC1 is working fine for me  :o.  I haven't done a parity check though.  I will run one later tonight.

 

Hello;

 

Did you have any of the unmenu packages installed? Did you disable those (and how did you do that cleanly, since there is no built-in remove function), or did you just install the bare-bones RC1?

 

If you had unmenu before, did you continue to use your previous installation and did everything come up as before - or should one rather remove and reinstall unmenu from scratch?

 

Thanks for sharing your experience.

 

 

 



 

Hello;

 

Did you have any of the unmenu packages installed? Did you disable those (and how did you do that cleanly, since there is no built-in remove function), or did you just install the bare-bones RC1?

 

If you had unmenu before, did you continue to use your previous installation and did everything come up as before - or should one rather remove and reinstall unmenu from scratch?

 

Thanks for sharing your experience.

to "remove" unMENU packages, use the package manage to remove the "Re-Install on Reboot" for those you do not wish to be re-installed.  Then, reboot.

 

unMENU works on every version of unRAID so far, although the more recent versions are needed to support some features in the myMain plugin.  (Not sure if everything will work without some help on pre-4.3 versions of unRAID.)

 

To prevent unMENU from running at all, edit your config/go script to remove all but the lines

#!/bin/bash

# Start the Management Utility

/usr/local/sbin/emhttp &

 

(Save a copy first so you can revert back to any additions you might have made.)

 

Joe L.



 

Hello;

 

Did you have any of the unmenu packages installed? Did you disable those (and how did you do that cleanly, since there is no built-in remove function), or did you just install the bare-bones RC1?

 

If you had unmenu before, did you continue to use your previous installation and did everything come up as before - or should one rather remove and reinstall unmenu from scratch?

 

Thanks for sharing your experience.

 

I had several unmenu packages installed.  I didn't stop any of them.  I just followed the instructions on the wiki to install RC1, without stopping any of the plugins installed from unmenu packages.  Everything is working fine with my stuff.  No errors.

 

However, the only errors I've got are dropped packets on my network, but I don't think those are related.

 

I had clean powerdown, bwm, monthly parity check, pci utilities, and bubba installed.

  • 2 weeks later...

I have an unRAID box that I set up for a friend; it had been working fine since January, but one drive has failed in the array. I did a SMART report on the drive and from what I can tell it is OK, but for some reason when I try to rebuild the array it gets to 45-50% after about a day and then the rebuild rate drops to below 10K. I have tried this rebuild twice but hit the same problem both times. We reseated all the cables for the second rebuild but still did not get a successful rebuild. If you can help diagnose the problem it would be much appreciated; I have attached the syslog and a SMART report for the disabled disk...

 

thanks

James

 

Steve-syslog-5-13-2012.txt

Smart-Report-5-13-2012.txt


I have an unRAID box that I set up for a friend; it had been working fine since January, but one drive has failed in the array. I did a SMART report on the drive and from what I can tell it is OK, but for some reason when I try to rebuild the array it gets to 45-50% after about a day and then the rebuild rate drops to below 10K. I have tried this rebuild twice but hit the same problem both times. We reseated all the cables for the second rebuild but still did not get a successful rebuild. If you can help diagnose the problem it would be much appreciated; I have attached the syslog and a SMART report for the disabled disk...

 

Both the attached SMART report and the syslog look great, no issues at all.  I was expecting to see numerous disk errors with lots of timeouts, which is what you usually see when the speed drops badly, but there are none at all.  It has been running the rebuild of Disk 2 for a day and a half now, so it seems like it should be done by now, but obviously it is not.  In the webGui, does it still look like it is making progress?  And are the temps reasonable?  Give it a little more time before giving up.

 

You will need to check SMART reports for ALL of the drives, not just Disk 2.  Any one of the drives could be the bottleneck.
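
If it helps, here is a quick way to capture a report from every drive at once (only a sketch: adjust the /dev/sd[a-z] glob to match your controllers, and on older smartctl builds a SATA disk may need "-d ata" added if the plain command complains):

#!/bin/bash
# Dump a SMART report for every drive into /boot/smart/ so the files can be
# attached to the thread (assumes the flash is mounted at /boot as usual).
mkdir -p /boot/smart
for dev in /dev/sd[a-z]; do
    name=$(basename "$dev")
    smartctl -a "$dev" > "/boot/smart/smart-${name}.txt" 2>&1
done
# The flash drive itself has no SMART data; its file will just contain an
# error message that can be ignored.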

 

Edit:  Support issues should really be in one of the support forums, not a release announcement thread.


I released a 4.7 with the bug fixed -- see my sig.

 

Which bug is fixed? Your sig has a link to something about SATA controller cards with a patched kernel?

 

Start reading here

 

 

 

Why has your work not been recognised and made a sticky? Is it because Tom has not / will not associate with a kernel for version 4.7 that was not tested by him? I am not meaning to sound rude; I am just trying to understand this. 4.7 is the current production stable release for unRAID and it is broken because of the bug that you have now fixed. Why is this not big news? When you originally said you had fixed the bug and I checked your thread, I did not understand that it was not just about the primary thread title (SATA controller card compatibility) and in fact patched the 4.7 "parity rebuild bug" (am I calling it by an accurate name?). Well done for doing this, and thank you for your efforts. Like I say, I don't think there is much awareness of it. Is it that people do not want to run an unofficially supported version patched by someone other than limetech/Tom?

 

Regards


The bug is extremely rare and requires only a correcting parity check to remedy. It is gone in version 5.

Not always true.  It affects BOTH parity calculation (where your statement is true)

AND it could affect the re-construction of a failed drive. 

 

In this second case your statement is not true, as there is no easy-to-run process to re-construct a failed drive a second time, and an interim parity sync would eliminate the ability to correctly re-construct the failed drive a second time anyway.

