unRAID6-beta7/8 POSSIBLE DATA CORRUPTION ISSUE: PLEASE READ


limetech

squid and everybody else who uses md5 or any other kind of checksum

 

what are you using to make these?

how automated is this?

as i have like a million files on these servers.. would hate to do this one by one

 

i thought i saw some script but didn't have the time to try it yet but this would be my first priority after upgrading to beta9

seems that script was to find duplicates ...

 

Link to comment

use corz checksum.  It will take *forever* for your first run (mine took me about a week straight, and I had one computer doing the checksums on one server, and another running it on the other server)

 

After that, simple updates will take a couple of minutes.  I run it every week to automatically create the md5's for the new files.
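
For anyone who wants the same "slow first run, quick updates" behaviour without a Windows box, here is a minimal Python sketch of the idea. It is not corz checksum and does not reproduce its file layout; the sidecar name `checksums.md5` and the function names are made up for illustration. Each folder gets one md5sum-style list, and a later run only hashes files that have no entry yet.

```python
import hashlib
import os

SIDECAR = "checksums.md5"  # hypothetical name: one hash list per folder


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_sidecar(path):
    """Read 'hash  name' lines (md5sum text style) into a {name: hash} dict."""
    entries = {}
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if not line:
                    continue
                file_hash, _, name = line.partition("  ")
                entries[name] = file_hash
    return entries


def update_tree(root):
    """Hash only files that have no entry yet; return how many were added."""
    added = 0
    for folder, _dirs, files in os.walk(root):
        sidecar_path = os.path.join(folder, SIDECAR)
        known = load_sidecar(sidecar_path)
        new_names = sorted(n for n in files if n != SIDECAR and n not in known)
        if not new_names:
            continue
        # Append, so hashes made before a risky upgrade are never rewritten.
        with open(sidecar_path, "a", encoding="utf-8") as handle:
            for name in new_names:
                full_path = os.path.join(folder, name)
                handle.write(f"{md5_of_file(full_path)}  {name}\n")
                added += 1
    return added
```

Calling `update_tree("/mnt/disk1")` (path is only an example) once a week would then pick up just the new files. Existing entries are deliberately left alone, so a file that changed after it was hashed will show up at verify time instead of being silently re-hashed.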

Link to comment

thx for the input here so far - may i say some ppl have probs, others not. in other words (besides the precautions suggested) some ppl here blow this issue way out of proportion.

there is a risk that you have a prob at this point, but all may be fine (always considering stopping senseless risky file moving at this point)

it is not ebola

 

cheers, L

Link to comment

Yeah, it's from windows :(  Corz used to have a linux version, I don't think there is anymore.

 

I actually originally made my md5 files using a script that organized them in the same fashion as corz using md5sum via an ubuntu box.  It was tons faster.  But, for simplicity's sake, I now just do the updates via my windows box.

 

 

Link to comment

it does exist.  it is just very limited.  I'm going to try to use it to run my verification

 

EDIT: by limited I mean some of the more useful "management" functions are not there.  But create and verify are, and they run much faster on the machine versus over the network once I am doing more than two disks at a time (two saturates my network with my 2TB drives).

 

It is in the zip you download, in the "extra" folder iirc, and you also need to recreate a symlink

 

You can see the conversation here http://corz.org/windows/software/checksum/checksum-tricks.php#section-Requests scroll down and look for jumperalex.

 

EDIT2: ACK I forgot that I used BLAKE2 hashes and unraid doesn't have those installed.  I had meant to convert to straight md5 but forgot.  Which is good since that would have happened while on beta8 :-/  Verify now running again over windows.  Time for bed

Link to comment

 

I have not started looking at video files yet, but have noticed that while I moved a ton of TV shows around using MC it's left phantom folders on the source drive all over the place. I don't know if this means specifically that there was an issue - but this is not common. Usually moving this way is very clean.

 

I would say the risk is obviously proportional to the amount of data movement you've done over the last 6 weeks. Even no movement has risk, but it grows and grows the more you move.

 

Mmm... I am noticing something slightly similar... I am moving files to freshly formatted XFS drives and I notice my free space seems to go down rather quickly.. I wasn't paying attention to it, thinking I would find it using some way of duplicate scanning.. But I figured I'd report it here after all. Maybe it's related.

Link to comment

In the middle of checking all of my md5 checksums, and what I've found so far is that none of my media files (mkv) are corrupt.  However, it looks like most of my .nfo, .tbn, and .jpg files are mismatching the stored checksum.  I can't however remember if I've ever told xbmc to overwrite them during a backup...  But at least none of the mkv's are damaged.  I can always just delete all of the auxiliary files, and xbmc will recreate them.

 

That being said, the checking process will probably take me days if not weeks to complete.  But things look promising so far.

 

You could always open a few of the files listed as corrupt.  Tbn files are JPEG files FWIW. The nfo files are XML files if I recall correctly.
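
If there are too many mismatched files to open by hand, a rough first pass can be scripted. The sketch below assumes what is said above (tbn/jpg are JPEG, nfo is XML) and only checks the file structure: a JPEG should start with the bytes FF D8 and normally end with FF D9, and an XML file should parse. The function names are made up, and a pass here does not prove the file is undamaged; it only flags the obviously truncated or mangled ones.

```python
import os
import xml.etree.ElementTree as ET


def looks_like_jpeg(path):
    """True if the file starts with the JPEG start marker and ends with the end marker."""
    with open(path, "rb") as handle:
        head = handle.read(2)
        handle.seek(0, os.SEEK_END)
        if handle.tell() < 4:
            return False  # too short to hold both markers
        handle.seek(-2, os.SEEK_END)
        tail = handle.read(2)
    return head == b"\xff\xd8" and tail == b"\xff\xd9"


def looks_like_xml(path):
    """True if the file parses as well-formed XML."""
    try:
        ET.parse(path)
    except ET.ParseError:
        return False
    return True


def quick_check(path):
    """Pick a check by extension; returns None for types this sketch does not cover."""
    ext = os.path.splitext(path)[1].lower()
    if ext in (".jpg", ".jpeg", ".tbn"):
        return looks_like_jpeg(path)
    if ext == ".nfo":
        return looks_like_xml(path)
    return None
```

Some writers append extra bytes after the JPEG end marker, and not every .nfo out there is XML, so treat a `False` as "go look at this one" and not as proof of corruption.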

Link to comment

squid and everybody else who uses md5 or any other kind of checksum

 

what are you using to make these?

how automated is this?

as i have like a million files on these servers.. would hate to do this one by one

 

i thought i saw some script but didn't have the time to try it yet but this would be my first priority after upgrading to beta9

seems that script was to find duplicates ...

 

I'm working on posting a script I wrote that generates SHA256 strings off of the files and stores it as part of the extended attributes of the file.
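
Until that script is posted, here is a rough sketch of how the approach could look in Python. This is not the script mentioned above, just an illustration of the idea: the attribute name `user.sha256` and the function names are assumptions, and `os.setxattr` / `os.getxattr` are Linux-only and need a filesystem that supports user extended attributes.

```python
import hashlib
import os

ATTR_NAME = "user.sha256"  # hypothetical attribute name, chosen for this sketch


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_hash(path):
    """Compute the SHA256 and save it on the file itself as an extended attribute."""
    digest = sha256_of_file(path)
    os.setxattr(path, ATTR_NAME, digest.encode("ascii"))
    return digest


def check_hash(path):
    """Return True/False for match/mismatch, or None if no hash could be read.

    None covers both "never hashed" and "filesystem has no xattr support";
    this sketch does not tell the two apart.
    """
    try:
        stored = os.getxattr(path, ATTR_NAME)
    except OSError:
        return None
    return stored.decode("ascii") == sha256_of_file(path)
```

The appeal of keeping the hash in an attribute is that no sidecar files are needed. Whether the attribute survives a move depends on the tool doing the copying, so that is worth testing before relying on it.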

Link to comment

I have only checked my music library because of the ease with foobar. I did move my music library from one disk to another while on beta8 but before this announcement. Of the 19,000+ songs I have, over 6000 now report minor issues or unrecoverable issues (about 5700 minor, and 300 unrecoverable).

 

I thought we weren't supposed to be posting in this thread!  ;)  ;D

 

I didn't check many MP3s, because I don't have many in my personal data, but of those I did check, some were "corrupt", but in fact, I don't think they were. They were minor issues related to reporting incorrect length etc.

 

I checked WAVs, FLACs and MP3s. Zero FLACs were corrupt, a few WAVs had issues, but were also not corrupt -- they were just unusual WAVS belonging to a set of samples that probably had weird stuff embedded that FooBar couldn't handle.

 

My assessment of my tests is that none of the data I checked was corrupted. I haven't been running Beta 8 very long though.

 

Some examples of the Foobar "error" reports:

 

1 item could not be correctly decoded.

List of undecodable items:
"\\micro\dimeforscale movie podcast\DFSMC 032C Popeye\TB REC\tb_robin_1.flac"

 

-- That one was already truncated before writing it to the server.

 

9 items could not be correctly decoded.
103 items decoded with minor problems.

List of undecodable items:
"\\micro\dimeforscale movie podcast\DFSMC 017 The Room\HH recording\huell-part-1.mp3"
.
.
.
"\\micro\dimeforscale movie podcast\DFSMC 003 The Cutting Edge\BB review\80590^RecordScratch.mp3"

 

I don't think these MP3s are damaged. I checked some and they are not truncated, they are just odd formats that Foobar doesn't like. e.g. the call recording is 16kHz, 256kbps and recorded by a Skype call recorder.

Link to comment

and a big thx to tom and other known culprits!!! thx for b9 - lets hope that takes care of the known probs. a double hand clapping for accelerating this release so much - my understanding was that b9 takes a little longer than 12/09. problem recognized - and in short time solved... my respect!!!

beta's are beta's - shit happens. i respect the commitment of the LT team to get probs solved quick once they appear!

now lets hope that it was the 100% solution  ::)

 

cheers, L

Link to comment

I ask because I don't use windows at all--am I the only one??  ;)

 

No, you're not the only one.  This household uses *nix exclusively.  Like you, I went to investigate Corz Checksum, only to find it always comes with the "for Windows" text appended!

 

http://corz.org/linux/software/checksum/ - Checksum for Linux..

 

Oh WOW that is really new.  As in since I hounded him to update / fix it just a few weeks ago (cause it was broken). checksum-sh date is 7 Sep 2014 :)

 

More recently, people have started asking me about the Linux side of checksum, as the popularity of NAS boxes increases, I guess. So I've started thinking about it again.

 

OK sorry I know it is a bit OT, I'm just pleased he saw the benefits of working on the linux version again. 

Link to comment

Fwiw, I have checksums of my important data that were created before beta 7 or 8 and whenever beta 9 comes out I'll start running the checks. I'll report back with my findings. I upgraded to beta 8 right after it was released so my server is (unfortunately) a pretty good candidate to run the checksums to see if there was any corruption. Another positive is I verified all my checksums about a month ago so if there are any corrupt files it is highly likely to be from beta 8.

 

Just finished checking my files and did not receive any corrupt file messages. This is great news for me and I am hopeful that others are finding similar results.

 

Some background: I never used beta 7, I upgraded to beta 8 as soon as it was released, I did not move or restructure my files while using beta 8.

Link to comment

Well corz checksum has completed verification and as best i can tell no problems.  All "errors" are for missing hashes (new files) and "changed" which I can confirm are part of my rotating VM backups (backup1 becomes backup2 and a new backup1 is created but checksum has no way to know).
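
For anyone sorting through a similar report, the three buckets described here (fine, "changed", and no stored hash) can be sketched in a few lines of Python. This is not how corz checksum works internally; the function names and the `stored` dictionary (relative path mapped to an md5 made before the upgrade) are assumptions for illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks and return the hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(root, stored):
    """Sort every file under root into ok / changed / missing_hash.

    stored maps a path relative to root to the md5 recorded earlier.
    """
    report = {"ok": [], "changed": [], "missing_hash": []}
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            full_path = os.path.join(folder, name)
            rel_path = os.path.relpath(full_path, root)
            if rel_path not in stored:
                # Written after the last hashing run, so nothing to compare against.
                report["missing_hash"].append(rel_path)
            elif md5_of_file(full_path) == stored[rel_path]:
                report["ok"].append(rel_path)
            else:
                report["changed"].append(rel_path)
    return report
```

"changed" only means the content no longer matches the stored hash. Files that are rewritten on purpose, like rotating backups, land there too, so that list still needs a human to go through it.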

 

Except for one single file: a .torrent that is oddly "changed" but I know for a fact it was written when I was running beta5a and all my other .torrents from when I loaded beta8 (skipped 6 and 7) still function.

 

The new files are of course at-risk files since they are my most recent writes, but fortunately I don't have a lot of those and as mentioned below, they are almost all "large" files.  When I get the chance I'll check them using one of the tools created to look for mismatched sizes or whatever.

 

Full Disclosure: The primary writes to my ReiserFS array are .torrents, .mkvs (from torrents and rips), backups of my VM's, and Acronis backups from my PC and laptops.  All Plex writes (so all my small writes) occur within an ArchVM on an ext4 SSD.  So there is every chance that my use case wasn't going to generate a lot of problems based on it being a "small file problem"

Link to comment

I have beta9 running now.

 

Also only two more drives (8TB) to convert to xfs , all the other drives (8) have been done.

 

Is there a quick reliable way to do this converting? Can you write a simple guide how you did this? The long and tedious way of offloading, formatting etc... Is there a more clever way without breaking parity? I have 2 servers, one with 8 and one with 18 disks. Not really looking forward to this task. But I guess someday it has to be done....

Link to comment
