Large copy/write on btrfs cache pool locking up server temporarily



1 hour ago, allanp81 said:

What changed in Unraid itself though?

I suspect it is a side-effect of moving to later revisions of the Linux kernel and packages rather than of Unraid-specific code. I very much doubt that Limetech are ignoring this issue, but we will see when we get the next beta whether they have found anything.


Mine actually got better when I turned off NCQ (set it to No), changed the scheduler to none, and changed md_num_stripes to 8192.

I don't know which change had the biggest effect since I made them all at the same time, but now everything works even though I have big transfers to the btrfs cache drives while the mover is running.
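For anyone who wants to script the first two of those tweaks instead of clicking through the GUI, here is a minimal sketch that writes the equivalent sysfs knobs directly. It assumes the cache SSD shows up as /dev/sdb (a placeholder; adjust to your system) and must run as root; md_num_stripes itself is an Unraid array tunable changed under Settings > Disk Settings rather than via sysfs.

```python
from pathlib import Path

DEVICE = "sdb"  # placeholder: substitute your cache SSD's device name

def write_sysfs(path: str, value: str) -> None:
    """Write a value to a sysfs attribute and report what changed."""
    Path(path).write_text(value)
    print(f"{path} -> {value}")

# Setting queue_depth to 1 effectively disables NCQ for the device.
write_sysfs(f"/sys/block/{DEVICE}/device/queue_depth", "1")

# 'none' bypasses the I/O scheduler entirely on multi-queue kernels.
write_sysfs(f"/sys/block/{DEVICE}/queue/scheduler", "none")
```

These settings do not persist across reboots, so the Unraid Disk Settings page (or the go file) is still the place to make them stick.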

  • 2 weeks later...
On 4/23/2020 at 2:43 PM, johnnie.black said:

The problem is that it doesn't affect everyone; I have had a pool of MX500s working for more than a year without any issues.

I actually used to have this problem, then it went away for months, and now it's back. No significant hardware changes in that time.

12 hours ago, johnnie.black said:

Anyone having this issue should check whether the docker image is on the array; if it is, move it to the cache or just outside the array. There are reports this can help, though probably mostly when the I/O wait comes from copying to the array.

Should the docker image be on the cache anyway?

4 hours ago, johnnie.black said:

Yes, same for appdata, but not everyone is doing it.

Doesn't Unraid default the appdata location to /mnt/user/appdata? Is that share created as a cache-only share by default? I don't remember because I manually set all of mine up a long time ago.

2 minutes ago, aptalca said:

Doesn't Unraid default the appdata location to /mnt/user/appdata? Is that share created as a cache-only share by default? I don't remember because I manually set all of mine up a long time ago.

Based on looking at diagnostics, I think it is probably cache-prefer along with domains and system, but I haven't actually tested that myself since I set all that up maybe even before cache-prefer was implemented.
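For anyone unsure where their docker image actually lives, a quick check is to look at the underlying disk mounts rather than the user share, since /mnt/user merges /mnt/cache and /mnt/disk*. The sketch below is a hypothetical helper (not from this thread) that assumes the default image path of system/docker/docker.img; adjust the relative path if yours differs.

```python
import glob
import os

RELATIVE_PATH = "system/docker/docker.img"  # path relative to the share root

def find_backing_disks(rel_path: str) -> list[str]:
    """Return the /mnt/cache* or /mnt/diskN mounts that contain rel_path."""
    hits = []
    for mount in sorted(glob.glob("/mnt/cache*") + glob.glob("/mnt/disk[0-9]*")):
        if os.path.isfile(os.path.join(mount, rel_path)):
            hits.append(mount)
    return hits

if __name__ == "__main__":
    locations = find_backing_disks(RELATIVE_PATH)
    if not locations:
        print("docker.img not found under /mnt/cache* or /mnt/disk*")
    elif any(loc.startswith("/mnt/disk") for loc in locations):
        print("docker.img is (at least partly) on the array:", locations)
    else:
        print("docker.img is on the cache:", locations)
```

If it turns up on a /mnt/diskN mount, the usual advice is to set the system share to cache-only (or prefer) and re-run the mover with the Docker service stopped so the image can be moved.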


I also experienced this same issue with my two Samsung_SSD_840_EVO_250GB drives that I had in btrfs RAID1 for my cache pool. Large file transfers would cause insane iowait, system load would shoot up to 60-80, and Docker containers and the Unraid web UI would become entirely unresponsive.

 

Since moving the pool from raid1 to single, the issue seems to be gone. With 6.9 reportedly having the option for multiple cache pools, I would really like to revisit the option for a mirrored cache pool.
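For reference, converting an existing btrfs pool from RAID1 back to single is just a balance with convert filters. Whether the poster rebuilt the pool or converted it in place isn't stated, but the generic in-place route looks like the sketch below, assuming the pool is mounted at /mnt/cache and you have a backup before changing profiles.

```python
import subprocess

MOUNTPOINT = "/mnt/cache"  # adjust to your pool's mount point

def convert_to_single(mountpoint: str) -> None:
    """Rewrite data as 'single' and metadata as 'dup' with a btrfs balance."""
    subprocess.run(
        ["btrfs", "balance", "start",
         "-dconvert=single",  # data: one copy
         "-mconvert=dup",     # metadata: two copies on the same device
         mountpoint],
        check=True,
    )
    # Print the resulting allocation profiles so the conversion can be verified.
    subprocess.run(["btrfs", "filesystem", "df", mountpoint], check=True)

if __name__ == "__main__":
    convert_to_single(MOUNTPOINT)
```

On Unraid the same balance can be started from the cache device's page in the web UI, so the script is only there to make the underlying command explicit.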

  • 4 weeks later...

Thought I'd throw my five cents in. I had this issue with some Crucial 512GB NVMe sticks in a mirror, and when I was doing a bit of an upgrade to my Unraid box I decided to try some different SSDs to see if I could get past this btrfs problem. I got some Silicon Power 512GB SSDs, threw them in a mirror, and sadly hit the same problem. So I guess I'm just getting really unlucky with my choice of NAND in these things, or it's just bloody random lol.

  • 2 weeks later...

@limetech please look into this issue. I have a single MX500 SSD with btrfs encryption installed, and everything comes to a halt when concurrent file streams are being handled. The UI shows the SSD reading at 500 MB/s continuously, and the average load is 27. This major issue is at least two years old.
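If it helps anyone reproduce the numbers above, iowait is easy to watch without extra tools; the snippet below is a small diagnostic sketch (not from this thread) that samples /proc/stat every few seconds and prints the percentage of CPU time stuck in iowait during a large copy.

```python
import time

def read_cpu_times() -> tuple[int, int]:
    """Return (total jiffies, iowait jiffies) from the aggregate 'cpu' line."""
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]  # first line is the overall CPU summary
    values = [int(v) for v in fields]
    return sum(values), values[4]  # field 5 (index 4) is iowait

if __name__ == "__main__":
    prev_total, prev_iowait = read_cpu_times()
    while True:
        time.sleep(5)
        total, iowait = read_cpu_times()
        delta_total = total - prev_total
        share = 100.0 * (iowait - prev_iowait) / delta_total if delta_total else 0.0
        print(f"iowait over the last 5s: {share:.1f}%")
        prev_total, prev_iowait = total, iowait
```

A system that is otherwise idle but shows a large iowait share while the btrfs pool is being written to is showing the same lockup pattern described throughout this thread.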
