unRAID6-beta7/8 POSSIBLE DATA CORRUPTION ISSUE: PLEASE READ


limetech

tldr: If you are running unRAID 6-beta7 or 6-beta8, there is a risk of data corruption when writing to ReiserFS-formatted volumes.  We recommend that you either boot version 6-beta6 or do not write to ReiserFS-formatted volumes!!

 

UPDATE:  BETA 9 WITH REISERFS FIX HAS BEEN POSTED FOR DOWNLOAD!!

 

During the course of preparing version 6-beta9, we updated the Linux kernel to 3.16.2 in order to pick up a number of btrfs improvements and fixes.  While building the kernel we discovered numerous source file corruptions that seemed to happen as a result of the ‘make’.  After considerable angst, hardware swaps, and late nights, we finally found that the problem is a regression introduced into Linux kernel 3.16:

http://www.gossamer-threads.com/lists/linux/kernel/1995761

 

These patches obviously didn’t get into 3.16.2 and we’re not waiting for 3.16.3 – we will have 6.0-beta9 out ASAP with patches applied and tested.

 

The corruption seems to happen much more frequently with 3.16.2, which is not present in any unRAID OS release.  I have not seen any corruption on array devices using –beta7 or –beta8, but the possibility exists.

 

EDIT:

 

Link to -beta6:

http://dnld.lime-technology.com/beta/unRAIDServer-6.0-beta6-x86_64.zip

 

Rather than downgrading, might be easier to just not write to ReiserFS array devices until I can get -beta9 out, which is consuming 99% of my attention right now (the other 1% is monitoring this thread).

 

Link to comment
Sounds like a good reason to migrate over to XFS to me...  I added a new drive to my array yesterday, but for the life of me couldn't find anywhere to change the FS to XFS when adding the drive (I had pre-cleared beforehand, could that have been the issue?).

With array stopped, click on the device link on the Main page and you will be able to set the desired file system type.

Link to comment

Ouch! This is the risk with running betas! We knew the job was dangerous when we took it! Considering this is my second round of betas with LimeTech (6 and 5) and this is the first "possible" data corruption issue that has occurred I still consider this pretty good.

 

Unfortunately, I have been running 6b7 and 6b8 for so long, and recently replaced my parity drive, so if my data is corrupted then it is. I'll wait patiently for b9.

Link to comment

The best thing you can do to protect yourselves right now is this:

 

DO NOT WRITE ANY DATA TO A REISERFS DISK IN THE ARRAY

 

XFS and BTRFS filesystems remain unaffected by this issue.
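
If you are not sure which of your disks are ReiserFS, something like the following might help. This is only a rough sketch, assuming a standard Linux /proc/mounts table; the function name list_reiserfs_mounts is made up for illustration and is not part of unRAID. It only reads the mount table and does not write to any disk.

```shell
#!/bin/bash
# Hypothetical helper (not an unRAID tool): print the mount point of every
# filesystem whose type is "reiserfs".
# Input is a mounts table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# The optional argument only exists so the function can be tried on a sample file.
list_reiserfs_mounts() {
    local table="${1:-/proc/mounts}"
    # Quietly do nothing if the table cannot be read.
    [ -r "$table" ] || return 0
    awk '$3 == "reiserfs" { print $2 }' "$table"
}

# Show the running kernel release, then the ReiserFS mount points (if any).
uname -r
list_reiserfs_mounts
```

Anything this prints is a mount point you should avoid writing to until the fix is out. No output from the function means no ReiserFS volumes are currently mounted.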

Link to comment

I am doing this by disabling mover for the time being.

Link to comment

Just finished migrating my disks to XFS. Data migrations were always ReiserFS->XFS. No danger here, right?

Link to comment

You should be fine.

Link to comment

Thanks for the heads up. I set the mover to only move on the 1st of the month. It sounds like this is a case where reiserfs is not being as actively maintained as it once was and things are falling through the cracks? Would you advise migrating our data to a different FS?

 

I advise that no one running beta 7 or 8 move data onto or off of reiserfs until after we release beta 9 with the fix.  That said, as far as encouraging migration away from ReiserFS goes, it would be easy for me to point at ReiserFS and say, "See!  This is why XFS and BTRFS are better and we included them," but that's not really true.  The truth of the matter is that file systems are complex beasts and issues like this can occur from time to time.  It's happened to pretty much every file system out there.  None is immune to issues.  So all of this said, I guess the more direct response to your question is that I wouldn't advise for it or against it.  That would be like moving to another country after you get in a car accident once thinking "I won't get in another car accident if I lived THERE."

 

EDIT:  clarified first comment to reflect only beta 7/8.

Link to comment

It was a bug introduced by someone thinking they were simplifying some code - why they are touching several-year-old proven code is beyond me.

 

This kind of thing has happened before, e.g., the famous (or infamous) ext4 corruption bug of 2012:

http://www.phoronix.com/scan.php?page=news_item&px=MTIxNDQ

 

I guess the reply would be similar to Robin Seggelmann's when asked about the 'heartbleed' bug: "oops!"

Link to comment

So all of this said, I guess the more direct response to your question is that I wouldn't advise for it or against it.  That would be like moving to another country after you get in a car accident once thinking "I won't get in another car accident if I lived THERE."

 

That is such a salesman answer  :P Maybe the better question is on your personal server what FS are you using?  ;D

 

P.S. I won't personally hold LT responsible if my server blows up (blah blah legal stuff).  ::)

Link to comment

I'm running almost entirely ReiserFS in my production system at home.  I only use BTRFS for cache devices.  I assigned XFS to two array disks that I just recently got for testing.

Link to comment

So would putting this
mv /usr/local/sbin/mover /usr/local/sbin/mover.old

into the go script and rebooting, stop mover from running?

Link to comment

You can disable the cache function under "Share Settings" in the webGui.

Link to comment

What form is this corruption likely to take? Specific files that are written, or something that might possibly affect other files that are not being written as well?

 

I waited until just this last weekend to go from beta6 to beta8, but my cache is not used for caching user shares. All user share data goes directly to the array.

 

I guess I could set them to use cache, but what does cache do if you overwrite something in a user share that is already in the array? Does it write to cache and leave the old version on the array until mover runs, or does it delete the old version from the array immediately, or does it overwrite the file on the array directly?

 

Never really thought about how caching works with file updates before.

 

Link to comment

We're still trying to learn more about this issue and will share as we discover, but for now, here's what we know:

 

It's more likely to affect small files than larger ones, but writing to the filesystem in general can potentially cause corruption to other files on the device.  That is why we are recommending that everyone stop writing to their reiserfs disks until beta 9 can be released with the appropriate fix.
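
Since corruption may not be limited to the file being written, one way to notice later changes is to compare files against recorded checksums. The following is a minimal sketch, not an official procedure: the function names make_manifest and check_manifest are hypothetical, and it assumes GNU find, sort, xargs, and md5sum are available. It can only flag files that changed after the manifest was made, so it says nothing about files that were already damaged. Keep the manifest somewhere other than a ReiserFS disk, since writing it is itself a write.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only (not part of unRAID).

# make_manifest DIR MANIFEST
# Record an md5 checksum for every regular file under DIR.
# MANIFEST should live outside DIR, and off any ReiserFS volume.
make_manifest() {
    local dir="$1" manifest="$2"
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r md5sum ) > "$manifest"
}

# check_manifest DIR MANIFEST
# Re-read every file listed in MANIFEST and print only those whose checksum
# no longer matches. Returns non-zero if any file fails verification.
check_manifest() {
    local dir="$1" manifest="$2"
    ( cd "$dir" && md5sum --quiet -c - ) < "$manifest"
}
```

A possible use would be to run make_manifest once per disk with the manifest stored on an unaffected device, then run check_manifest again after upgrading and look at whatever it reports.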

Link to comment
