[SOLVED] Filled cache drive, everything went crazy??


ct1996


Hello, I recently added a 500GB SSD to my cache pool (existing 128GB SSD) and formatted it as JBOD in BTRFS following the outlined procedure. Everything was great, and I had about 580GB available, with the rest used by my docker image/appdata.


Anyways, I was downloading a bunch of stuff today and didn't keep an eye on my drive, and now everything is goin' haywire. Currently, only Plex and my torrent docker are working OK. Plex does its temp transcoding on the array (I was gonna change this, but it's what's saving me right now, I'm sure, lol). And rtorrent doesn't do any writes, so I'm betting that's why it's okay.

I manually started mover, deleted some torrents that I'll re-download later, and now I have about 80GB of free space. My log is spittin' these out like crazy though:

 

Sep 11 01:01:19 Tower shfs: share cache full
Sep 11 01:01:19 Tower shfs: share cache full
Sep 11 01:01:20 Tower kernel: mdcmd (452131): spindown 7
Sep 11 01:01:20 Tower emhttpd: error: mdcmd, 2639: Input/output error (5): write
Sep 11 01:01:20 Tower kernel: md: do_drive_cmd: disk7: ATA_OP e0 ioctl error: -5
Sep 11 01:01:21 Tower kernel: mdcmd (452132): spindown 7
Sep 11 01:01:21 Tower emhttpd: error: mdcmd, 2639: Input/output error (5): write
Sep 11 01:01:21 Tower kernel: md: do_drive_cmd: disk7: ATA_OP e0 ioctl error: -5
Sep 11 01:01:22 Tower kernel: mdcmd (452133): spindown 7
Sep 11 01:01:22 Tower emhttpd: error: mdcmd, 2639: Input/output error (5): write
Sep 11 01:01:22 Tower kernel: md: do_drive_cmd: disk7: ATA_OP e0 ioctl error: -5

 

Trying to update a docker image shows the cache is read-only (or at least that's what it says).


I really don't wanna restart the server, but will if necessary. I'm continuing to let mover run, and hoping that's enough.

Anybody with more experience, please advise? I was reading about potentially running a balance command, but I'm not sure if that applies here. Thanks

 

 

Edit: Everything is broken now. Gonna let my appdata finish backing up, then try the balance command. Hopefully it's OK for a multi-drive cache.

Edited by ct1996
31 minutes ago, johnnie.black said:

All the syslog files are filled with the same repeating error, but the start of the problem is not visible, reboot, use the server for a few minutes and post new diags.

Sorry to be obnoxious, but this is with the mover running and that command I just posted also "running", right? Is it safe to restart?
I'm nervous because mover *seems* to be working, and I have some sensitive data I'd rather let finish moving if restarting is gonna mess up my cache (I know how sensitive BTRFS can be).

Edit:

That command finished; its output is: Done, had to relocate 158 out of 586 chunks (the log is still being spammed, however)
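
To put that balance output in perspective, here is a small sketch that turns the "relocate X out of Y chunks" line into a percentage. The function name `relocated_pct` is made up for illustration, and it assumes the summary line has exactly the shape quoted above.

```shell
#!/bin/sh
# Hypothetical helper: report what share of chunks a finished balance relocated.
# Expects a summary line shaped like the one quoted above:
#   "Done, had to relocate 158 out of 586 chunks"
relocated_pct() {
    printf '%s\n' "$1" | awk '
        /had to relocate/ {
            # Pick the number after "relocate" and the number after "of".
            for (i = 1; i <= NF; i++) {
                if ($i == "relocate") moved = $(i + 1)
                if ($i == "of")       total = $(i + 1)
            }
            if (total > 0) printf "%.1f\n", 100 * moved / total
        }'
}

relocated_pct "Done, had to relocate 158 out of 586 chunks"   # prints 27.0
```

So roughly a quarter of the chunks on the pool were rewritten by that run.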
Edit2:
It's late, so I'm gonna head to bed and try restarting tomorrow. Hopefully it will be fixed, as mover will have run and the balance is already done. I'm praying a restart fixes it at this point, lol.

Thanks for your help and assistance so far.
 

Edited by ct1996
17 hours ago, johnnie.black said:

You won't be able to restart while a balance is running, do it when it finishes.

 

You can stop the mover by typing:

 


mover stop


 

Thanks a bunch for your help. The unraid box is now up and running again perfectly. I didn't even have to restore appdata or do anything, just a restart after mover/balancer ran fixed everything. You're the man!!

Is there any way to prevent this happening again? Or should I just be extra careful to not let my cache drive pool get full?

2 minutes ago, ct1996 said:

 

Thanks a bunch for your help. The unraid box is now up and running again perfectly. I didn't even have to restore appdata or do anything, just a restart after mover/balancer ran fixed everything. You're the man!!

Is there any way to prevent this happening again? Or should I just be extra careful to not let my cache drive pool get full?

What do you have set for the Min Free Space for the cache drive? When the free space drops below this value, Unraid will stop writing new files to the cache drive and instead write them directly to the array. This may impact performance for such files, as they are now constrained by array write speeds, but it should at least keep things under control.

1 minute ago, itimpi said:

What do you have set for the Min Free Space for the cache drive? When the free space drops below this value, Unraid will stop writing new files to the cache drive and instead write them directly to the array. This may impact performance for such files, as they are now constrained by array write speeds, but it should at least keep things under control.

They were previously at:

Warning disk usage: 99%

Critical disk usage: 100%

 

That was probably the issue. I set it to Warning: 80%, Critical: 90%; hopefully that is much safer!

7 minutes ago, ct1996 said:

They were previously at:

Warning disk usage: 99%

Critical disk usage: 100%

 

That was probably the issue. I set it to Warning: 80%, Critical: 90%; hopefully that is much safer!

That is not the Min Free Space setting; those are the utilisation warning settings. The Min Free Space is a size value (e.g. 20G), and it is recommended to be more than the largest file you are likely to write.
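
To make the rule concrete, here is a rough sketch of the decision as described above. It is only an illustration, not Unraid's actual code: `pick_target` and `free_kb_of` are made-up names, the `/mnt/cache` mount point is an assumption about your setup, and 20G is treated as 20 GiB.

```shell
#!/bin/sh
# Simplified illustration of the Min Free Space rule described above.
# NOT Unraid's real implementation; function names are hypothetical.

# Print the space available (in KiB) on the filesystem holding path $1.
free_kb_of() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Decide where a new file would land.
#   $1 = free KiB on the cache, $2 = Min Free Space in KiB
pick_target() {
    if [ "$1" -lt "$2" ]; then
        echo "array"   # free space is below the floor, so bypass the cache
    else
        echo "cache"
    fi
}

min_free_kb=$((20 * 1024 * 1024))   # the 20G example value, in KiB

# On a real server (assuming the pool is mounted at /mnt/cache):
#   pick_target "$(free_kb_of /mnt/cache)" "$min_free_kb"
pick_target $((80 * 1024 * 1024)) "$min_free_kb"   # 80G free -> cache
pick_target $((5 * 1024 * 1024)) "$min_free_kb"    # 5G free  -> array
```

Note that the comparison only looks at how much space is free right now, not at how big the incoming file will end up, which is why the value should be larger than your biggest expected file.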

1 minute ago, itimpi said:

That is not the Min Free Space setting; those are the utilisation warning settings. The Min Free Space is a size value (e.g. 20G), and it is recommended to be more than the largest file you are likely to write.

Ah, I dug around and found the setting under Global Share Settings. It is at 20GB; I'll change that to 90GB when I stop the array next 🙂

Thank you for the advice.

On 9/11/2018 at 3:02 AM, ct1996 said:

added a 500GB SSD to my cache pool (existing 128GB SSD) and formatted it as JBOD in BTRFS following the outlined procedure. Everything was great, and I had about 580GB available

Just thought I would add a couple of FAQ links regarding this bit. 

 

"Single" is the setting to get all of the capacity added together:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/#comment-480421

 

This one might also be relevant:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/#comment-480420

 

 
