(SOLVED) SQUASHFS errors (Failed to read block) and locked up /mnt/cache

Johnny Joker · January 5

Hello people!

I've been struggling for a few weeks now to get to the bottom of this behavior.

My system is running UNRAID 6.12.6 stable.

In the beginning of December I noticed that some of my Docker Containers were missing. Checking the logs and notification history I noticed that all missing containers were updated that night and were missing since then (orphaned image left behind in the Docker overview).

Trying to reinstall them just gave me a "Command failed".

I then tried to reboot the system but it locked up trying to unmount /mnt/cache (Error 32, device is busy).

I've tried using the various commands for unmounting loop2 manually but nothing seemed to work (I don't have the logs from that anymore) and in the end I had to power off the system by holding the power button.
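For anyone hitting the same "device is busy" error, here is a minimal sketch of one way to see which processes still hold files open under a mount point, by walking the `cwd` and `fd` symlinks in `/proc`. The function name `find_holders` and its `proc_root` parameter are made up for illustration and are not part of Unraid. It only sees user-space processes; something held by the kernel, such as a loop device backed by a file on the mount, will not show up this way.

```python
import os

def find_holders(mount_point, proc_root="/proc"):
    """Map PID -> sorted list of paths under mount_point that the process holds.

    Hypothetical helper: inspects /proc/<pid>/cwd and /proc/<pid>/fd/* only.
    """
    base = mount_point.rstrip("/")
    prefix = base + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # not a symlink, or it vanished while we were looking
            if target == base or target.startswith(prefix):
                holders.setdefault(int(entry), set()).add(target)
    return {pid: sorted(paths) for pid, paths in holders.items()}

# Example (run as root to see every process):
#   for pid, paths in find_holders("/mnt/cache").items():
#       print(pid, paths)
```

An empty result while the unmount still fails would point towards a kernel-side holder rather than a stray process.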

After a reboot the missing containers could be reinstalled and everything looked okay, but a few hours later the SQUASHFS errors came back so I tried another USB port. I'm using a USB splitter directly attached to my mainboard's header but tried an external port nonetheless.

I'm not sure about the speed of the internally attached one but the external is definitely 2.0 - same behavior.

I plugged it into my PC and could access it after letting Windows repair it, but still decided it would be a good idea to replace the USB stick since it has been running for ~5 years and SanDisk for the boot drive isn't necessarily recommended.

Okay, so I took a new (old, but rarely ever used) Kingston USB Stick and redeployed it using the installer/imaging tool + the latest backup I downloaded from connect.myunraid.net .

This all went well, it was up and running but then it happened again (missing containers), and after checking the logs I've noticed the SQUASHFS errors reappeared.

Since then I've run MEMTEST twice (the bundled AND the official version) and all 4 passes completed successfully. Also the CPU is not overclocked. I've also let the system check/verify the parity and all is good there. Scrubbing/checking the Docker volume and the cache drive comes back clean (no errors).

Attached diagnostics have been created while the errors are happening. 2 containers (netdata & jackett) have orphaned images after tonight's Auto-Update.

I've stopped the array to recreate the error I mentioned above (unable to unmount /mnt/cache) but this time, contrary to my belief, it stopped everything alright.

After starting the array again (no reboot) the Docker service is not starting.

Reboot -> everything is coming back up and the missing containers can be installed.

Could you please check my logs and try to figure out what the problem is?

I reckon that I would have to install a clean USB (without using the backup) and copy the configs?

Cheers, Joker

jraid-diagnostics-20240105-1131.zip

Edited January 6 by Johnny Joker

JorgeB · January 5

Jan  4 04:16:01 jRAID kernel: SQUASHFS error: xz decompression failed, data probably corrupt
Jan  4 04:16:01 jRAID kernel: SQUASHFS error: Failed to read block 0xbfea4: -5
Jan  4 04:16:01 jRAID kernel: SQUASHFS error: Unable to read fragment cache entry [bfea4]

This usually indicates a flash drive problem.
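To check whether these errors return after swapping the stick, a small script can summarize them from the syslog. The following is a rough sketch that assumes the line format shown above; the function name `summarize_squashfs_errors` and the log path in the usage comment are assumptions, not something taken from the diagnostics. The `-5` in the "Failed to read block" line is the kernel's EIO (I/O error) code.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Jan  4 04:16:01 jRAID kernel: SQUASHFS error: Failed to read block 0xbfea4: -5
SQUASHFS_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"SQUASHFS error:\s+(?P<msg>.*)$"
)
# Block addresses vary per error; mask them so identical failures group together.
ADDR_RE = re.compile(r"0x[0-9a-f]+|\[[0-9a-f]+\]")

def summarize_squashfs_errors(lines):
    """Return (first_timestamp, last_timestamp, Counter of error kinds)."""
    kinds = Counter()
    first = last = None
    for line in lines:
        match = SQUASHFS_RE.match(line.rstrip("\n"))
        if not match:
            continue
        if first is None:
            first = match.group("stamp")
        last = match.group("stamp")
        kinds[ADDR_RE.sub("<addr>", match.group("msg"))] += 1
    return first, last, kinds

# Example (the path is an assumption; adjust to wherever your syslog lives):
#   with open("/var/log/syslog", errors="replace") as fh:
#       first, last, kinds = summarize_squashfs_errors(fh)
#   print(first, last, dict(kinds))
```

If the counter stays empty for a few days after the swap, that is consistent with the old stick having been the problem.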

Johnny Joker · January 5

Hey JorgeB, thanks for checking.

I'm now recreating the USB stick I'm currently using (Kingston DataTraveler 2.0 16GB) with the backup I took before shutting down.

If it still happens afterwards I think I would/should get a brand new one and replace the flash drive again?

Is it possible that I brought that error over to the new stick with the created backup?

JorgeB · January 5

1 hour ago, Johnny Joker said:

If it still happens afterwards I think I would/should get a brand new one and replace the flash drive again?

Yep.

1 hour ago, Johnny Joker said:

Is it possible that I brought that error over to the new stick with the created backup?

Seems unlikely

Johnny Joker · January 6

I'll mark this as solved.

The system is now running stable for the last ~24h.

Thanks!

Johnny Joker · February 1

The error came back so I just bought a new HP 32GB USB2 Stick (the smallest the shop had) and am replacing the Stick.

Current Diagnostics attached.

jraid-diagnostics-20240201-1348.zip

JorgeB · February 1

1 minute ago, Johnny Joker said:

and am replacing the Stick.

And you still see the same errors with the new flash drive?

Johnny Joker · February 1

Sorry, I was a bit unclear - I just replaced the stick a few minutes ago and the system is up. So far no errors yet. Diagnostics were taken before replacing the Stick.

I'll check tomorrow morning, since for the last few days I've seen the log full of them in the morning.

If this stick shows those errors too, I'll probably start swapping hardware if you don't have any other ideas: mainboard and RAM, but not the CPU, since I know the other one I have is funky from having been overclocked in the past. (I have almost identical hardware in my HTPC, which is kind of obsolete since I got a FireTV Stick.)

Edited February 1 by Johnny Joker

JorgeB · February 1

Every time I've seen those squashfs errors they have always been flash drive related, so hopefully it will fix it.

Johnny Joker · February 6

No errors so far.

Server locked up once so hard that I couldn't even access it via ssh and had to turn it off but I think it was something else (Palworld Dedicated server gone haywire).
