unRAID6-beta7/8 POSSIBLE DATA CORRUPTION ISSUE: PLEASE READ

limetech · September 9, 2014

TL;DR: If you are running unRAID 6-beta7 or 6-beta8, there is a risk of data corruption when writing to ReiserFS-formatted volumes. We recommend that you either boot version 6-beta6 or do not write to ReiserFS-formatted volumes!!

UPDATE: BETA 9 WITH REISERFS FIX HAS BEEN POSTED FOR DOWNLOAD!!

During the course of preparing version 6-beta9, we updated the Linux kernel to 3.16.2 in order to pick up a number of btrfs improvements and fixes. During the process of building the kernel we discovered numerous source file corruptions that seemed to happen as a result of the ‘make’. After considerable angst, hardware swaps, and late nights, we finally found that the problem is a regression introduced in Linux kernel 3.16:

http://www.gossamer-threads.com/lists/linux/kernel/1995761

These patches obviously didn’t get into 3.16.2 and we’re not waiting for 3.16.3 – we will have 6.0-beta9 out ASAP with patches applied and tested.

The corruption seems to happen much more frequently with 3.16.2, which is not present in any unRAID OS release. I have not seen any corruption on array devices using -beta7 or -beta8, but the possibility exists.

EDIT:

Link to -beta6:

http://dnld.lime-technology.com/beta/unRAIDServer-6.0-beta6-x86_64.zip

Rather than downgrading, might be easier to just not write to ReiserFS array devices until I can get -beta9 out, which is consuming 99% of my attention right now (the other 1% is monitoring this thread).

smdion · September 9, 2014

Can you make the links live for Beta 6 on the Downloads page?

heffe2001 · September 9, 2014

Sounds like a good reason to migrate over to XFS to me... I added a new drive to my array yesterday, but for the life of me couldn't find anywhere to change the FS to XFS when adding the drive (I had pre-cleared beforehand, could that have been the issue?).

limetech · September 9, 2014

Sounds like a good reason to migrate over to XFS to me... I added a new drive to my array yesterday, but for the life of me couldn't find anywhere to change the FS to XFS when adding the drive (I had pre-cleared beforehand, could that have been the issue?).

With array stopped, click on the device link on the Main page and you will be able to set the desired file system type.

limetech · September 9, 2014

Can you make the links live for Beta 6 on the Downloads page?

Took down the Beta section from there; Edited first post to include link to -beta6.

jphipps · September 9, 2014

After the fix is in place, are there any utilities that can be used to detect the corruption?

needo · September 9, 2014

Ouch! This is the risk with running betas! We knew the job was dangerous when we took it! Considering this is my second round of betas with LimeTech (6 and 5) and this is the first "possible" data corruption issue that has occurred I still consider this pretty good.

Unfortunately, I have been running 6b7 and 6b8 for so long, and recently replaced my parity drive, so if my data is corrupted then it is. I'll wait patiently for b9.

jonp · September 9, 2014

After the fix is in place, are there any utilities that can be used to detect the corruption?

We are investigating, but as of this time, no.

jonp · September 9, 2014

Ouch! This is the risk with running betas! We knew the job was dangerous when we took it! Considering this is my second round of betas with LimeTech (6 and 5) and this is the first "possible" data corruption issue that has occurred I still consider this pretty good.

Unfortunately, I have been running 6b7 and 6b8 for so long, and recently replaced my parity drive, so if my data is corrupted then it is. I'll wait patiently for b9.

The best thing you can do to protect yourselves right now is this:

DO NOT WRITE ANY DATA TO A REISERFS DISK IN THE ARRAY

XFS and BTRFS filesystems remain unaffected by this issue.
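
To see which mounted disks the warning above applies to, one option is to filter the kernel's mount table by filesystem type. This is a minimal sketch, assuming the standard /proc/mounts layout (device, mount point, type, options). The function name list_reiserfs_mounts and its optional file argument are made up for illustration and are not part of unRAID:

```shell
# Print the mount point of every filesystem mounted as reiserfs.
# $1: mounts table to read; defaults to /proc/mounts.
#     (The argument is a hypothetical convenience so the function
#     can be tried against a saved copy of the table.)
list_reiserfs_mounts() {
    local table="${1:-/proc/mounts}"
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk '$3 == "reiserfs" { print $2 }' "$table"
}
```

Called with no argument it reads /proc/mounts. Any mount point it prints should be treated as do-not-write while running -beta7 or -beta8.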

needo · September 9, 2014

The best thing you can do to protect yourselves right now is this:

DO NOT WRITE ANY DATA TO A REISERFS DISK IN THE ARRAY

XFS and BTRFS filesystems remain unaffected by this issue.

I am doing this by disabling mover for the time being.

gfjardim · September 9, 2014

The best thing you can do to protect yourselves right now is this:

DO NOT WRITE ANY DATA TO A REISERFS DISK IN THE ARRAY

XFS and BTRFS filesystems remain unaffected by this issue.

Just finished migrating my disks to XFS. Data migrations were always ReiserFS->XFS. No danger here, right?

jonp · September 9, 2014

Just finished migrating my disks to XFS. Data migrations were always ReiserFS->XFS. No danger here, right?

You should be fine.

archedraft · September 9, 2014

Thanks for the heads up. I set the mover to only move on the 1st of the month. It sounds like this is a case where reiserfs is not being as actively maintained as it once was and things are falling through the cracks? Would you advise migrating our data to a different FS?

jonp · September 9, 2014

Thanks for the heads up. I set the mover to only move on the 1st of the month. It sounds like this is a case where reiserfs is not being as actively maintained as it once was and things are falling through the cracks? Would you advise migrating our data to a different FS?

I advise that no one running beta 7 or 8 move data onto or off of reiserfs until after we release beta 9 with the fix. That said, as far as encouraging migration away from ReiserFS goes, it would be easy for me to point at ReiserFS and say, "See! This is why XFS and BTRFS are better and why we included them," but that's not really true. The truth of the matter is that file systems are complex beasts and issues like this can occur from time to time. It's happened to pretty much every file system out there. None are immune to issues. So all of this said, I guess the more direct response to your question is that I wouldn't advise for it or against it. That would be like moving to another country after you get in a car accident once thinking "I won't get in another car accident if I lived THERE."

EDIT: clarified first comment to reflect only beta 7/8.

limetech · September 9, 2014

Thanks for the heads up. I set the mover to only move on the 1st of the month. It sounds like this is a case where reiserfs is not being as actively maintained as it once was and things are falling through the cracks? Would you advise migrating our data to a different FS?

It was a bug introduced by someone thinking they were simplifying some code - why they are touching several-year-old proven code is beyond me.

This kind of thing has happened before, e.g., the famous (or infamous) ext4 corruption bug of 2012:

http://www.phoronix.com/scan.php?page=news_item&px=MTIxNDQ

I guess the reply would be similar to Robin Seggelmann's when asked about the 'heartbleed' bug: "oops!"

archedraft · September 9, 2014

So all of this said, I guess the more direct response to your question is that I wouldn't advise for it or against it. That would be like moving to another country after you get in a car accident once thinking "I won't get in another car accident if I lived THERE."

That is such a salesman answer. Maybe the better question is: on your personal server, what FS are you using?

P.S. I won't personally hold LT responsible if my server blows up (blah blah legal stuff). ::)

jonp · September 9, 2014

That is such a salesman answer. Maybe the better question is: on your personal server, what FS are you using?

I'm running almost entirely ReiserFS in my production system at home. I only use BTRFS for cache devices. I assigned XFS to two array disks that I just recently got for testing.

eroz · September 9, 2014

I'm running almost entirely ReiserFS in my production system at home. I only use BTRFS for cache devices. I assigned XFS to two array disks that I just recently got for testing.

So would putting this

mv /usr/local/sbin/mover /usr/local/sbin/mover.old

into the go script and rebooting, stop mover from running?

jonp · September 9, 2014

So would putting this
mv /usr/local/sbin/mover /usr/local/sbin/mover.old
into the go script and rebooting, stop mover from running?

You can disable the cache function under "Share Settings" in the webGui.

trurl · September 9, 2014

What form is this corruption likely to take? Specific files that are written, or something that might possibly affect other files that are not being written as well?

I waited until just this last weekend to go from beta6 to beta8, but my cache is not used for caching user shares. All user share data goes directly to the array.

I guess I could set them to use cache, but what does cache do if you overwrite something in a user share that is already in the array? Does it write to cache and leave the old version on the array until mover runs, or does it delete the old version from the array immediately, or does it overwrite the file on the array directly?

Never really thought about how caching works with file updates before.

jonp · September 9, 2014

What form is this corruption likely to take? Specific files that are written, or something that might possibly affect other files that are not being written as well?

I waited until just this last weekend to go from beta6 to beta8, but my cache is not used for caching user shares. All user share data goes directly to the array.

I guess I could set them to use cache, but what does cache do if you overwrite something in a user share that is already in the array? Does it write to cache and leave the old version on the array until mover runs, or does it delete the old version from the array immediately, or does it overwrite the file on the array directly?

Never really thought about how caching works with file updates before.

We're still trying to learn more about this issue and will share as we discover, but for now, here's what we know:

It's more likely to affect small files than larger ones, but writing to the filesystem in general can potentially cause corruption to other files on the device. That is why we are recommending that everyone stop writing to their reiserfs disks until beta 9 can be released with the appropriate fix.

trurl · September 9, 2014

If we go back to beta6 is there a simple way we can install docker using the loopback method? Maybe something in the go file? Or should we just install it the old way for now and wait on beta9?

gfjardim · September 9, 2014

Tom or Jon, is

mount -o remount,ro /dev/md<number>

a safe way of avoiding this bug?

limetech · September 9, 2014

Tom or Jon, is
mount -o remount,ro /dev/md<number>
a safe way of avoiding this bug?

Yeah that should work. Very close to having -beta9 ready.
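
For anyone with several array disks, the remount above could be looped over every ReiserFS md device. This is a rough sketch, assuming the array devices show up as /dev/md<number> in /proc/mounts, as in the command quoted above. remount_reiserfs_ro and the APPLY variable are hypothetical names, and by default the function only prints the commands it would run:

```shell
# For each /dev/md* device mounted as reiserfs, print a read-only
# remount command; with APPLY=1 in the environment, run it instead.
# $1: mounts table to read; defaults to /proc/mounts (hypothetical
#     argument, handy for trying the function on a saved copy).
remount_reiserfs_ro() {
    local table="${1:-/proc/mounts}"
    local dev mnt fstype rest
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    while read -r dev mnt fstype rest; do
        [ "$fstype" = "reiserfs" ] || continue
        case "$dev" in
            /dev/md*) ;;          # array device: handle it
            *) continue ;;        # anything else: leave alone
        esac
        if [ "${APPLY:-0}" = "1" ]; then
            mount -o remount,ro "$dev"
        else
            echo "mount -o remount,ro $dev"
        fi
    done < "$table"
}
```

Review the printed commands first, then rerun as root with APPLY=1 set to execute them. A remount can fail if files on the disk are still open for writing.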

jonp · September 9, 2014

If we go back to beta6 is there a simple way we can install docker using the loopback method? Maybe something in the go file? Or should we just install it the old way for now and wait on beta9?

Beta 9 is imminent. Wait for it..
