Data Loss?


liujason

I've been using unRAID for 4+ years, and this is the first time I've noticed this. I'm panicking a bit right now.

 

Today, when I tried to open Lightroom to edit some older pictures (from July 2014 and February 2015), almost 1 in 8 raw files (CR2) opened as corrupted images. I had to fall back to the JPG versions, and even some of those are corrupted. Strangely, the files were fine when Lightroom indexed them at the time, but today when I open them again, a small portion of the files have become corrupted.  :-[ :-[ (By "corrupted" I mean that some images open with distortion, some with very bright areas, and some are simply black.)

 

I have a 4 disk + 1 parity setup, running the latest 6.0 public release. I have been doing monthly parity checks diligently for the last several years, and they have always reported 0 errors. (The last check, on Wed 01 Jul 2015 11:43:52 AM EDT, four days ago, found 0 errors.)

 

What is happening here? What can I do to make sure the files are OK?

 

Thanks!

 

Jason

 

Link to comment

What file system (ReiserFS, XFS, BTRFS) are you using on the disks which hold the corrupted files?

 

If ReiserFS, did you ever use 6.0 Beta 7 or Beta 8 ... Those versions of Unraid had a problem in the ReiserFS driver which could corrupt existing files.

 

Unfortunately this sort of corruption problem is unlikely to be fixable.

 

Keeping checksums and backups of files is usually the best means of recovering from such problems in the future.

 

 

Link to comment

All disks are ReiserFS. It is very likely that I used Beta 7 and 8, as I always try to keep using the latest versions. (It turns out the earliest version 6 I used was 14b; see a later post in the thread.)

 

Are there any tools available to detect which files are impacted?

 

Keeping checksums and backups of files is usually the best means of recovering from such problems in the future.

 

Unfortunately, some of my backups stored on unRAID are also corrupted, as I have been relying heavily on unRAID to store files accurately and permanently. Are you suggesting I use a different solution to back up the files on unRAID?

 

Again, is there anything available to figure out the impact? Thanks!

 

Jason

Link to comment

I would suggest reading Gary's post here:

 

http://lime-technology.com/forum/index.php?topic=31020.msg279579#msg279579

 

Very informative.

 

Are there any tools available to detect which files are impacted?

 

Bottom line is that unless you have checksums of your data, there isn't a way that I know of to figure out which files are affected, other than comparing them against separate backups of all your files (if you have any).

Link to comment

Bottom line is that unless you have checksums of your data, there isn't a way that I know of to figure out which files are affected, other than comparing them against separate backups of all your files (if you have any).

 

Sounds like an unknown percentage of my files is permanently damaged.  :(

 

Thanks for the link, but I am not sure how a backup would help in this case.

 

If I were to diligently copy all files to a backup drive, say monthly, how would I know that the files I am copying out of unRAID are not corrupted without opening them? It would be equally frustrating to find that my backed-up files were already corrupted.

 

 

Link to comment

The normal process is to create checksums for all your files at the time you make the backup, and store these checksums with the backup. You can then also create checksums for the files on the server at any point, and compare these to the backups. That way you know if the files have changed since you made the backup.

 

There are a number of utilities in the User Customizations section of the forum to help with creating and maintaining checksums.
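
As a rough illustration of the process described above (not one of the forum utilities, just a minimal sketch), the following Python builds a checksum manifest for a directory tree and compares two manifests. The choice of SHA-256 and the function names `sha256_of`, `build_manifest` and `compare_manifests` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(old: dict, new: dict) -> dict:
    """Report files whose content changed, disappeared, or appeared."""
    return {
        "changed": sorted(p for p in old if p in new and old[p] != new[p]),
        "missing": sorted(p for p in old if p not in new),
        "added": sorted(p for p in new if p not in old),
    }
```

A manifest built at backup time can be saved alongside the backup and later compared with a fresh manifest of the server. Note that a "changed" entry only says the bytes differ; it cannot tell a deliberate edit from corruption, so it is most useful for files that should never change, such as raw photos.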

Link to comment

If I were to diligently copy all files to a backup drive, say monthly, how would I know that the files I am copying out of unRAID are not corrupted without opening them? It would be equally frustrating to find that my backed-up files were already corrupted.

 

That is the beauty of having checksums of your files. What I do is verify that all the files on my unRAID server are good before I manually back them up to my backup hard drives. That way, if any file reports as corrupt, I can find that file on my backup drives and copy the uncorrupted version back to unRAID. Once I have verified that all my files are good, I run a program that copies only the files that have changed from unRAID to my backup hard drives.
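
The verify-then-copy routine above could be sketched roughly as follows. This is a hypothetical Python illustration, not the poster's actual tooling; `file_digest`, `verified_backup`, and the manifest format (relative path mapped to SHA-256 digest) are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_backup(source: Path, backup: Path, manifest: dict) -> dict:
    """Copy files to backup only if they still match their recorded digest."""
    copied, skipped, suspect = [], [], []
    for rel_path, expected in sorted(manifest.items()):
        src = source / rel_path
        dst = backup / rel_path
        if not src.is_file() or file_digest(src) != expected:
            # Source no longer matches its checksum: report it and leave
            # any existing backup copy untouched.
            suspect.append(rel_path)
            continue
        if dst.is_file() and file_digest(dst) == expected:
            # Backup already holds this exact content.
            skipped.append(rel_path)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel_path)
    return {"copied": copied, "skipped": skipped, "suspect": suspect}
```

The key design choice is that a file failing verification is never copied, so a good backup copy is not overwritten by a corrupt one; the "suspect" list tells you which files to restore from the backup instead.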

 

Anyways, as far as determining which of your files are corrupt, do you happen to have mostly pictures? You could try loading them in some sort of picture organizer and seeing which ones look goofy... If music, try playing the music and hopefully if it's corrupt it won't play... Try the same with movies...

Link to comment

Anyways, as far as determining which of your files are corrupt, do you happen to have mostly pictures? You could try loading them in some sort of picture organizer and seeing which ones look goofy...

 

Thanks, archedraft, for trying to throw a bucket of water on my fire rather than blaming me for not backing up or not having checksums. I guess I'm not the only one who relies solely on unRAID for backing up their files.

 

The most precious files are years of pictures, videos (not movies, but home made videos scraped from VCR tapes), voice recordings (again, home made voice recordings scraped from cassette tapes), documents (tax files, investment records, receipts for tax purposes in PDFs, etc.). There are a lot of them (perhaps 1TB out of my 9TB are those types of files).

 

Still really frustrated that this could happen. I feel my data would be safer without unRAID right now; it sounds like it was unRAID that broke my data.

Link to comment

No worries, I lost a full TB of data before I started backing up my data, so I know how you feel (granted, it was completely my fault that the data was deleted in the first place). Honestly, unless you have the original files stored on a memory card, tape or something similar, it may not even be worth the time to go back through all the data. If you do happen to have the original files, then I would at least check over that data, because you can actually replace those files.

Link to comment

If you do happen to have the original files then I would at least check over that data because you can actually replace the files.

 

No, I don't. Usually my workflow goes [camera] -> [unraid NFS], or [scan] -> [unraid NFS]. For some documents I might still have the hard copy. Since I still don't know the extent of the damage (1 in 8 of the raw files I opened are corrupted), it is hard to know which files to recover.

Link to comment

What is the checksum workflow that you'd recommend?

Personally, I use Corz Checksum http://corz.org/windows/software/checksum/ because it is pretty robust, fast, has quite a few options, and Gary told me to. I have looked at the ones in the customization section but decided against using them just because I do not know enough about how they work. Sure, I could learn, but Corz works great for me. How I use Corz: I right-click on my file share and click verify checksums. Once that has finished, I right-click and click create checksums and then synchronize, so that any new files get added to the existing checksum file. Once I have done this for all my shares and I am happy with the results, I copy the files over to my backup drives.

 

and is there a recommended script/tool?

I believe you can set up Corz with the Windows Scheduler and have it run on any interval you want. I have not set this up before.

 

Can I suggest making automatic checksums part of unRAID's core functionality?

+1, would be a pretty nice feature.

Link to comment

I'm a Mac user, but my wife uses a Windows 7 laptop (I've almost convinced her to switch to ChromeOS for what she needs). I'll borrow her laptop and try the tool. Thanks!

 

(but I guess the checksum will be run on already corrupted files. Sigh...)

 

You could attempt to set up a Windows virtual machine (although it is way overkill for running only one program). Honestly, I wish Corz would update his Linux version and add some of the great features found in the Windows version.

Link to comment

... I guess I'm not the only one who relies solely on unraid for backing up their files.

 

The most precious files are years of pictures, videos (not movies, but home made videos scraped from VCR tapes), voice recordings (again, home made voice recordings scraped from cassette tapes), documents (tax files, investment records, receipts for tax purposes in PDFs, etc.). There are a lot of them (perhaps 1TB out of my 9TB are those types of files).

 

Still really frustrated that this could happen. I feel my data would be safer without unRAID right now; it sounds like it was unRAID that broke my data.

Unless you have more than one copy of a file you do not have a backup regardless of the systems involved.

 

I have important files like those you have listed, including Lightroom, but the originals all stay on the system where they were created and they get backed up nightly to unRAID. I also back them up monthly to external drives for storage offsite.

 

unRAID is a backup only if it is an additional copy of the files.

Link to comment

unRAID is a backup only if it is an additional copy of the files.

 

True. Lesson learned the hard way.

 

But it sounds like without checksums on the files, there is no way to truly know that the files being backed up are not corrupted. Offsite backups might already have been corrupted. For example, if I hadn't discovered the corruption by opening the files last night and had backed them up to an external hard drive, the backup copies would have been broken from the start.

 

I really think checksums are the key here. Regardless of how many backup copies one keeps, if the source files are corrupted, it is only the corrupted files that get backed up again and again.

 

I'm seriously considering switching to Google Drive (1TB for $10/month) for storage without having to worry about backups.

 

Link to comment

... I guess I'm not the only one who relies solely on unraid for backing up their files.

 

It depends on what you mean by that. If you mean you do NOT back up because you depend on the fault-tolerant features of UnRAID, you do NOT have a backup. If you're backing up to a 2nd UnRAID server, then that's fine.

 

But as already discussed, your files seem to already be corrupted, so there's no way to recover at this point if you don't have the originals.

 

What I do (as outlined in the post archedraft linked to) is create checksums using Corz for every file I copy to my UnRAID server. At the same time, I copy the files, with their checksums, to my backups. So at any time I can simply right-click, "Verify checksums" to validate any file(s) I may have questions about; and I can do the same thing to validate my backups as well.

 

About once/year I do a complete checksum verification of my backups (takes several days of "computer time" -- not a lot of mine); and any time there's a detected parity error (VERY rare) I do the same for the primary array.

 

ALL of my data (40+ TB) is thoroughly backed up. It's all stored on fault-tolerant UnRAID servers [2 main ones -- a "Media" server and a "Misc" server]; is then backed up on a fault-tolerant UnRAID backup server; and in addition everything is stored on a set of hard disks that are kept in a fireproof data-rated safe. ALL of these have checksums stored with the data ... and I have a set of PDF files with the directories of the disks. I do NOT plan to ever lose any data  :) :)  [but it could, of course, still happen -- NO backup, no matter how good, is absolutely 100% reliable.]

 

 

 

Link to comment

NO backup, no matter how good, is absolutely 100% reliable.

 

Thanks for your reply Gary. Would you consider cloud based solutions (Google Drive for example) a fairly reliable backup? (instead of setting up multiple backup systems, running checksums, etc?)

Link to comment

Not necessarily - cloud based backups do not always protect you against files getting corrupted locally. As soon as a local file gets corrupted, that change gets synced to the cloud, so the cloud copy is also corrupted.

 

You need a backup solution that has some sort of 'snapshot' concept of what your data is at a certain point in time so that if necessary you can revert to that version.
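
To show what "snapshot" means in practice, here is a deliberately naive Python sketch that keeps a full, separately named copy of the data for each point in time. The names `take_snapshot` and `restore_file`, and the dated directory layout, are assumptions for illustration; real snapshot tools typically avoid storing full copies every time.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def take_snapshot(source: Path, snapshot_root: Path, label: Optional[str] = None) -> Path:
    """Copy source into a new, uniquely named directory under snapshot_root."""
    name = label or datetime.now().strftime("%Y-%m-%d_%H%M%S")
    target = snapshot_root / name
    # copytree refuses to write into an existing directory, so an older
    # snapshot is never silently replaced by newer (possibly corrupt) data.
    shutil.copytree(source, target)
    return target


def restore_file(snapshot: Path, rel_path: str, destination: Path) -> None:
    """Bring one file back from a chosen snapshot."""
    target = destination / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(snapshot / rel_path, target)
```

Unlike a sync, a later snapshot taken after a file has gone bad does not touch the earlier one, so the older good version can still be restored.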

Link to comment

Sorry, thought of another question. If my files got corrupted randomly, then why didn't my monthly parity check catch the corruption?

Depends on how the corruption happened. If corrupt data was written that way for whatever reason, all parity would know is that some data was written and it would be updated to reflect the new data.
Link to comment

Not necessarily - cloud based backups do not always protect you against files getting corrupted locally.  As soon as a local file gets corrupted that change gets synced to the cloud so the cloud copy is also corrupted. 

Good call. So I will probably just use Google Drive from my computer instead of from unRAID (as it seems to be the source of the corruption).

 

 

You need a backup solution that has some sort of 'snapshot' concept of what your data is at a certain point in time so that if necessary you can revert to that version.

 

When I googled 'snapshot', I found snapraid... seems like a competitor. Does anyone have experience with snapraid?

 

Depends on how the corruption happened. If corrupt data was written that way for whatever reason, all parity would know is that some data was written and it would be updated to reflect the new data.

 

One post suggests this corruption was caused by a bad ReiserFS driver (in beta 7 or 8). Still, I would think it would be caught as a parity check error?

Link to comment

One post suggests this corruption was caused by a bad ReiserFS driver (in beta 7 or 8). Still, I would think it would be caught as a parity check error?

This is exactly the sort of scenario I would expect parity would not notice. It knows nothing of file systems, only bits. If reiser wrote corrupt data, parity would just update so it agreed with that corrupt data.
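
A toy model may make this clearer. The sketch below assumes simple single XOR parity over three tiny "disks" of equal size; it is a simplification for illustration, not unRAID's actual code, and the names `xor_parity`, `write_block` and `parity_check` are made up.

```python
from functools import reduce


def xor_parity(disks):
    """Byte-wise XOR across equal-length 'disks'."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def write_block(disks, disk_index, new_data):
    """Simulate a normal write: data disk and parity are updated together."""
    disks = list(disks)
    disks[disk_index] = new_data
    return disks, xor_parity(disks)


def parity_check(disks, parity):
    """A parity check only asks: does parity agree with what is on the disks?"""
    return xor_parity(disks) == parity


disks = [b"photo-ok", b"taxes-ok", b"video-ok"]
parity = xor_parity(disks)

# A buggy driver writes garbage through the normal write path.
disks, parity = write_block(disks, 0, b"ph\x00to-ok")
print(parity_check(disks, parity))   # True: parity agrees with the corrupt data

# Rebuilding disk 0 from the other disks plus parity returns the corrupt bytes.
rebuilt = xor_parity([disks[1], disks[2], parity])
print(rebuilt == disks[0])           # True

# By contrast, data that changes behind parity's back is detected.
disks[1] = b"taxes-oK"
print(parity_check(disks, parity))   # False
```

In this model, a parity check can only flag a disagreement between the disks and parity; it has no notion of what the file contents were supposed to be, which is what checksums add.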
Link to comment

One post suggests this corruption was caused by a bad ReiserFS driver (in beta 7 or 8). Still, I would think it would be caught as a parity check error?

No - it was a bug at the Linux level that wrote data to the wrong sectors on a disk. Parity is file system agnostic and just makes sure that whatever bit pattern is written to a particular sector on a disk can be reconstructed if that disk fails. It does not know that higher level code in the system has written the wrong data to a sector.

 

This particular bug was not unRAID specific - it could occur on any Linux system running ReiserFS with that particular mix of kernel/driver. However, because unRAID tended to use cutting edge versions of Linux during its beta cycle, it is likely that unRAID users were amongst the first to stress test this Linux version heavily enough to uncover this sort of bug. That is the risk you take with beta software, which means you should be extremely wary of using a beta release on any system for which you do not have a fall-back position if problems are found.

Link to comment

When I googled 'snapshot', I found snapraid... seems like a competitor. Does anyone have experience with snapraid?

The few people I know who have looked at snapRAID were not very impressed with it.

 

BTW: ANY Linux based system using ReiserFS with that particular mix of kernel/driver would have suffered from data corruption.

 

Snapshot is a much more generic concept than a particular system type or software. In the context of backups it typically means having a copy of your data as it was at a particular point in time, held off the system for which it provides the backup. Ideally there should also be some concept of off-site storage in case of physical damage (e.g. fire/flood), but not everyone goes that far, preferring to take the risk.

Link to comment

Archived

This topic is now archived and is closed to further replies.
