
Possibly failing cache drive? "Your drive is either completely full or mounted read-only" (it's not full)



This is the second time in a few days that I've hit this error. Fix Common Problems alerts me that there are errors, and I get two: "Your drive is either completely full or mounted read-only" (but my drives are not full), and something about my Docker.img being full (but it's not).

 

Both times my Docker service will fail and on my Docker tab I'll see:

 

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 898

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 967
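For anyone hitting the same warnings: "Connection refused" on a Unix socket generally means the socket file still exists but nothing is listening on it, which fits the Docker service having stopped. Below is a minimal sketch of how that state could be checked from Python. The helper name `probe_unix_socket` is made up for illustration, and the path is simply the one from the warnings above.

```python
import os
import socket


def probe_unix_socket(path: str) -> str:
    """Return a rough status for a Unix domain socket path (hypothetical helper)."""
    if not os.path.exists(path):
        return "missing"  # no socket file at all
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(2.0)
    try:
        sock.connect(path)
        return "listening"  # something accepted the connection
    except ConnectionRefusedError:
        return "refused"  # file exists, but no process is listening
    except OSError as exc:
        return f"error: {exc}"  # permissions, timeout, etc.
    finally:
        sock.close()


# Path taken from the warnings in this post.
print(probe_unix_socket("/var/run/docker.sock"))
```

A result of "refused" would match the PHP warnings; it says nothing about why the daemon stopped.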

 

I've attached the diagnostics after the 2nd time this error has happened.

 

Also, both times, when I try to stop the array it gets stuck on "Retry unmounting user share(s)...", and no amount of trying to umount manually or finding and killing processes fixes it. The only way I can get it unstuck is to run "reboot" in the console, which gets detected as an unclean shutdown.

 

My best guess is that one of my cache drives is dying. One of them is older and has 161 TB written (347424289620 total LBAs). The other is only at around 30 TB.

apollo-diagnostics-20230307-0858.zip


You have got this in your syslog:

Mar  6 18:40:40 Apollo kernel: BTRFS critical (device sdb1): corrupt leaf: block=4926130962432 slot=84 extent bytenr=4950498230272 len=49152 unknown inline ref type: 129

which indicates corruption on the 'cache' pool. The SMART information for that drive does not indicate an issue, however.
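If you want to check a saved syslog for more lines like this, here is a rough sketch. The regular expression is based only on the one line quoted above, so other BTRFS messages may be formatted differently and could be missed. The function name `find_btrfs_problems` is just illustrative.

```python
import re

# Pattern modelled on the single syslog line quoted in this thread (an assumption,
# not a complete description of kernel BTRFS message formats).
BTRFS_LINE = re.compile(
    r"kernel: BTRFS (?P<level>critical|error|warning) "
    r"\(device (?P<device>[^)]+)\): (?P<message>.*)"
)


def find_btrfs_problems(syslog_text: str) -> list[tuple[str, str, str]]:
    """Return (level, device, message) for each matching BTRFS line."""
    hits = []
    for line in syslog_text.splitlines():
        match = BTRFS_LINE.search(line)
        if match:
            hits.append((match["level"], match["device"], match["message"]))
    return hits


sample = (
    "Mar  6 18:40:40 Apollo kernel: BTRFS critical (device sdb1): corrupt leaf: "
    "block=4926130962432 slot=84 extent bytenr=4950498230272 len=49152 "
    "unknown inline ref type: 129"
)

for level, device, message in find_btrfs_problems(sample):
    print(f"{level} on {device}: {message}")
```

In practice you would pass in the contents of the syslog file from the diagnostics zip instead of `sample`.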

22 minutes ago, itimpi said:

You have got this in your syslog:

Mar  6 18:40:40 Apollo kernel: BTRFS critical (device sdb1): corrupt leaf: block=4926130962432 slot=84 extent bytenr=4950498230272 len=49152 unknown inline ref type: 129

which indicates corruption on the 'cache' pool. The SMART information for that drive does not indicate an issue, however.

 

I saw that, and yeah, I ran short SMART tests with no errors. The attributes all look fine except for the high number of LBAs written. I'm seeing that a brand new replacement SSD would only be $65, so I'll probably just replace it anyway. But I am curious:

  1. Can an SSD be dying and not report any SMART/Attribute errors?
  2. Is it possible it's not dying and my btrfs pool just needs to be re-balanced or something?

 

But I'm also not convinced 161 TBW is enough to kill a drive when Samsung's site says "600 TBW for 1 TB model" (my cache is two Samsung 860 EVO 1TB drives).
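As a sanity check on those numbers, here is a quick sketch of the arithmetic. It assumes each LBA in the SMART total is 512 bytes, which is an assumption and not something reported in this thread. Under that assumption the 161 figure is in binary TiB, while the 600 TBW rating is presumably in decimal TB.

```python
# Assumption: one LBA in the SMART "total LBAs written" count is 512 bytes.
BYTES_PER_LBA = 512


def lbas_to_bytes(total_lbas: int, bytes_per_lba: int = BYTES_PER_LBA) -> int:
    """Convert a total-LBAs-written count to bytes."""
    return total_lbas * bytes_per_lba


def endurance_used(total_lbas: int, rated_tbw: float) -> float:
    """Fraction of the rated endurance consumed, with the rating in decimal TB."""
    return lbas_to_bytes(total_lbas) / (rated_tbw * 10**12)


total_lbas = 347424289620          # from the first post
written = lbas_to_bytes(total_lbas)

print(f"{written / 10**12:.1f} TB (decimal)")   # 177.9 TB
print(f"{written / 2**40:.1f} TiB (binary)")    # 161.8 TiB
print(f"{endurance_used(total_lbas, 600):.1%} of a 600 TBW rating")  # 29.6%
```

So by this estimate the older drive would be at roughly 30% of the quoted rating, if the 512-byte assumption holds.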

Edited by s449
