Constant Errors and log full



Hi all,

 

I keep getting errors filling my log files and cannot figure out what is causing them.

 

The error:

Aug 11 09:09:04 Tower kernel: BTRFS warning (device loop0): csum failed ino 2761 off 6340608 csum 990492611 expected csum 3904854184.

 

Not sure what information you need.

 

1 parity drive

6 HDDs

3 cache drives (all Samsung SSDs)

Core i7.

Network: bond0: IEEE 802.3ad Dynamic link aggregation (1 onboard + 1 TP-Link gigabit NIC, connected to a Netgear ProSafe 24-port smart switch)

unRAID OS 6.1.9

 



You should post your diagnostics, but odds are it's a problem with your docker.img file. You can try a scrub on it, but your best course of action is to just nuke it and then reinstall your apps via CA's Previous Apps section.
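If you go the nuke route, a minimal sketch of the steps (this assumes the default docker.img location on the cache drive; check Settings -> Docker in the webGUI for your actual path):

# Stop the Docker service first via Settings -> Docker, then:
rm /mnt/cache/docker.img    # assumed default path; adjust to your setup
# Re-enable Docker in the webGUI to create a fresh image,
# then reinstall your containers from Apps -> Previous Apps.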
2 weeks later...

I just wanted to chime in and say that I see exactly the same thing. Every once in a while I'll have a docker app (usually Plex, but not always) crash and error when I try to restart it. I'll simply delete the docker.img, then re-add all my dockers from the my-* templates, and continue on until it happens again. It's honestly pretty annoying.

 

I manage 5 systems (mine, my test machine, my father's, and 2 friends') and I've seen this problem on 3 of them. Interestingly, the 3 machines that I've seen this problem on all have Samsung SSDs for cache drives (1x 750 EVO, 2x 840 EVO). The other machines are using platter drives for their cache drives. Maybe it's coincidence. I've seen this in all of the recent builds of 6.x that I've been running, though it's tough to remember exactly which builds at this point. Certainly I can say that it's present in 6.1.9 and 6.2RC5, since those are the two versions I'm currently running on various machines and it just happened again tonight.

 

I've attached the diagnostics file from my main machine, which just had a burst of CSUM errors as of this evening at 21:34.

 

Anyone else seeing this? Anyone have any suggestions?

 

Cheers,

 

-A

nas-diagnostics-20160909-2147.zip



The problem here isn't with the docker.img file, but rather with the cache drive itself. The errors pile up every day at 3:40 AM, which is when your mover starts.
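If you want to confirm that correlation yourself, something like this against the syslog should show it (assuming unRAID's usual /var/log/syslog location):

grep 'csum failed' /var/log/syslog | head -n 20    # timestamps clustering around 03:40 implicate the mover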

 

It could also be caused by the unclean shutdown that your system performed. While I don't personally use btrfs for any of my drives (I prefer XFS), I have heard on these threads that that filesystem doesn't particularly like just hitting the reset button / power switch, and really wants a proper shutdown to avoid errors.

 

You're going to have to do a scrub on the cache drive: https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Checking_and_fixing_file_systems
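For reference, the basic scrub commands look something like this (a sketch assuming the cache pool is mounted at /mnt/cache; follow the wiki page above for the full procedure):

btrfs scrub start /mnt/cache     # reads every block and verifies checksums, repairing from a redundant copy where the pool has one
btrfs scrub status /mnt/cache    # check progress and the error counts once it finishes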


Hm. Seems I may have provided a misleading example. True, I was forced to do a hard shutdown a couple days ago after the unRAID system locked up due to a kernel panic. But that's definitely something I try to avoid at all costs. I have a 9kVA UPS and NUT/Powerdown plugins installed deliberately to try to minimize this. This is also the only system of mine that's still running btrfs for the cache drive. I too have moved pretty much exclusively to xfs, but I haven't gone through the hassle of migrating this system yet. I guess it might be time.

 

Anyway, apologies for my lack of attention to detail. At this point, when I see a csum error in the log I guess I just assume it's a loop0 csum error. :P I've attached a log from one of my friends' servers, which IS running xfs on all drives and still sees csum errors.

 

Cheers,

 

-A

tower-diagnostics-20160910-0822.zip

