
Data Loss?


liujason


Interesting discovery, tl;dr version: it looks like I didn't use beta 7 or 8 after all, and the first 6.0 beta I used was beta 14b on March 4th. Files from before March 4th are corrupted, and files from after March 4th are OK.

 

After some digging, and after matching the timing against the release notes (http://lime-technology.com/wiki/index.php/Release_Notes), I found that my last backup of the unRAID config files from the flash drive was of 5.0.5, made on March 4th, 2015. Assuming I updated to version 6 on that day, the version I was using should be 6.0-beta14b.

 

Furthermore, I double-checked the raw files around that date and found that the majority of those from before March 4th are broken (my ballpark guesstimate is that around 70% of the raw files and 10% of the JPGs are broken). HOWEVER, raw files SINCE March 4th are fine! The cut-over date corresponds well with the upgrade around March 4th.

 

Now I'm puzzled again about what caused the files from before March 4th to be corrupted.

 

Link to comment

Clearly SOMETHING resulted in file corruption on the 4th.  Whether it was UnRAID, or an errant program that simply wrote to your file system and UnRAID dutifully wrote what was requested and updated parity to "protect" the corrupted files, is one of life's little mysteries.

 

Re: your question about cloud backups => A cloud backup is like any other backup. If you back up a corrupted file, the backup will also be corrupted. That's why I write every file I copy to UnRAID to my backup at the same time -- so they both have the same source. The backups are NEVER used otherwise.

 

A cloud backup is certainly an alternative to a backup server or a set of backup disks => and does have the advantage of being "off site" ... so you're protected against physical destruction; fire; floods; etc.

 

Link to comment

Is it only Lightroom files that are affected? If so then maybe it has something to do with that software rather than unRAID.

 

Is Lightroom trying to use the files from the server? What happens if you copy the files to the Lightroom machine and then try to use them?

Link to comment


I really feel honoured that so many great minds are helping me figure this out. trurl, garycase, archedraft, itimpi, I really appreciate your help.

 

No. It is not just Lightroom. The files are indeed corrupted. I could not open them in any way. It is not just the camera raw files either: videos, some JPGs, PDFs, and even some text files are affected (opening the text files in Sublime shows corrupted characters).

 

I dug up a copy of one of the disks, the one holding my most precious files, which I made when I upgraded the drive in October 2012. I checked a couple of the corrupted files and they are fine in the backup. I mounted the ReiserFS drive in Ubuntu through VMware and am currently copying all the files with 'cp -rf'. It is a long process (10MB/s, should take about 29 hours). I double-checked to make sure the systems don't go to sleep. I hope it doesn't mess anything up this time. (Come to think of it, I could hook the backup drive directly to unRAID via USB and run the cp command with nohup... oh well, it is past midnight...)

 

Like Gary said, something happened around the 4th. Or the files could have been damaged well before the 4th. I don't want to blame unRAID outright, but the corrupted files have the same modified date and time as the rest of the batch of good files, which suggests the problem is not coming from outside the unRAID setup. If my computer had messed the files up, or updated them in any way, the system should have recorded a different modified date. The same applies to apps running on unRAID. CR2 files (camera raw) are never updated, and their modified date/time stays at capture time. So to me, it seems the files were altered at the system level in unRAID.
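The reasoning above (same size and modified time, different contents) can be checked mechanically against the 2012 copy. What follows is only a minimal sketch, assuming both trees are mounted on the same machine; the function names are made up for illustration and SHA-256 is simply one reasonable hash choice.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large raw/video files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def silently_changed(backup_root, server_root):
    """Yield relative paths whose size and mtime match but whose bytes differ."""
    for dirpath, _dirs, files in os.walk(backup_root):
        for name in files:
            old = os.path.join(dirpath, name)
            rel = os.path.relpath(old, backup_root)
            new = os.path.join(server_root, rel)
            if not os.path.isfile(new):
                continue  # missing on the server; a different problem
            a, b = os.stat(old), os.stat(new)
            # Compare whole seconds, since filesystems differ in mtime precision.
            same_meta = (a.st_size == b.st_size
                         and int(a.st_mtime) == int(b.st_mtime))
            if same_meta and sha256_of(old) != sha256_of(new):
                yield rel
```

Calling something like `list(silently_changed("/mnt/backup_2012", "/mnt/user/Photos"))` (both paths hypothetical) would list the files that changed without their metadata changing, which is the pattern described above.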

 

 

 

 

Link to comment

Around the date you have narrowed the corruption down ....

 

Did you rebuild a disk onto a replacement drive?

Did you select "trust parity" and cancel the parity check?

 

Have you run a parity check recently?    If not I would suggest a non-correcting check to start with to see if your parity disk is for some reason out of sync with the data disks.

 

Link to comment


I have millions of jpg and cr2 files on my unRaid server, created over 4 years. I just went through a bitrot check, verifying that my backup servers have the same files as my working server. I found fewer than a dozen corrupted files.

 

My workflow using unRaid 5

 

Camera to Windows Lightroom

 

Within an hour, back up all image files to unRaid via cwRsync.

Once per day, back up the working unRaid server to the backup servers using rsync.

Once per week, back up to the off-site unRaid servers (2 rotated in this role).

 

I only wish that I had implemented jbatrlett, bonienl, and weebo's system years ago. Thankfully I didn't find much corruption. The corz system was not to my liking since it created additional checksum files and required a Windows computer. The last thing I wanted was more files to manage. The system I refer to above is Linux based and stores the checksums in the extended attributes of the file. It was only released in the last year. I hope to get checksums fully into my workflow soon.
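For readers unfamiliar with the idea, here is a rough sketch of what storing a checksum in a file's extended attributes can look like. It is not the script referred to above; the attribute name `user.sha256` and the function names are assumptions for illustration, and `os.setxattr`/`os.getxattr` are only available on Linux, on filesystems that support user extended attributes.

```python
import hashlib
import os

XATTR_NAME = "user.sha256"  # hypothetical attribute name, chosen for this sketch


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_checksum(path):
    """Record the file's current hash in an extended attribute (Linux only)."""
    os.setxattr(path, XATTR_NAME, file_sha256(path).encode("ascii"))


def verify_checksum(path):
    """Return True/False for match/mismatch, or None if no hash is readable."""
    try:
        stored = os.getxattr(path, XATTR_NAME).decode("ascii")
    except OSError:
        return None  # attribute missing, or xattrs unsupported here
    return stored == file_sha256(path)
```

The appeal of this design is that no sidecar files are created: the hash travels with the file, provided the copy tool preserves extended attributes (for rsync, that is what the `-X`/`--xattrs` option is for).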

Link to comment

+1

 

Is this already in the developers' pipeline?

 

If not, is there a comprehensive setup/how-to here for what can be done today?

I'm following the threads mentioned above, and it seems there is no consensus on what is best (storing the checksum in extended attributes, in an SQL file, or somewhere else).

In addition, it is not a solution for the daily life of an average unRAID user.

(I copy my data to the server once a week, or even less often.)

 

Thoughts on backing up to a second server:

When performing a backup to the second server, it could be smart to do a "dry run" with rsync first and check the log of which files would be backed up.

Then one should be able to judge how much data, and which files, it is reasonable to see backed up.

If old files (usually photos and videos) suddenly get updated on the backup side, the user should be aware that something is wrong.

Thoughts on this approach?
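One way the dry-run idea could be automated is sketched below. It assumes the log comes from `rsync --dry-run --itemize-changes` in the 11-character flag format used by rsync 3.x; the function name and the example paths are made up for illustration.

```python
def overwritten_files(itemized_output):
    """Return paths of EXISTING files that rsync says it would re-transfer.

    Expects the text printed by `rsync --dry-run --itemize-changes ...`.
    Assumes the 11-character "YXcstpoguax" flag field of rsync 3.x.
    """
    flagged = []
    for line in itemized_output.splitlines():
        flags, sep, path = line.partition(" ")
        if not sep or len(flags) != 11:
            continue  # header, summary, or "*deleting" line
        is_transfer = flags[0] in "<>" and flags[1] == "f"
        is_new = set(flags[2:]) == {"+"}  # brand-new files show only '+'
        if is_transfer and not is_new:
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    import subprocess
    import sys

    # Usage: python check_backup.py /mnt/user/Photos/ backup:/mnt/user/Photos/
    # (script name and both paths are hypothetical)
    src, dst = sys.argv[1], sys.argv[2]
    result = subprocess.run(
        ["rsync", "-a", "--dry-run", "--itemize-changes", src, dst],
        capture_output=True, text=True, check=True,
    )
    for path in overwritten_files(result.stdout):
        print("would overwrite existing file:", path)
```

Note that rsync by default only compares size and modification time, so a file whose contents changed while both stayed the same will only show up here if `--checksum` is added to the dry run, at the cost of reading every file on both sides.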

 

I would also appreciate an rsync GUI for setting up backups between two unRAID servers, presuming many of us run a second server as a backup for the first one.

While the rsync command line is really powerful, it is a PITA in terms of usability.

When used irregularly and/or by inexperienced users, it probably poses a higher risk than doing nothing (no backup).

 

@all eager beta testers: I think it is not smart to run beta software on your one and only production system!

IMHO even release candidates still carry a certain amount of risk.

Of course $h!7 also happens due to bad hardware, but that is a risk you have to take.

 

Meh, initially I just wanted to post a "+1"  ;D

Link to comment

Archived

This topic is now archived and is closed to further replies.
