Large writes to cache drive causing unresponsive behavior



Hello.

 

I've started to get an issue similar in scope to what users were having here. More specifically, large writes to the cache drive slow to a crawl or near stop and cause Dockers and VMs to become unresponsive (the Unraid UI often still works).

 

Everything was working fine for a week or two after one of my last changes, the addition/upgrade to 10GbE. Large transfers were fast and fantastic and well worth the upgrade, but now things slow to a crawl, then Dockers and the VMs freeze or possibly crash while the cache disk(s) attempt to write the large data. iowait climbs to a very large value, around 70%, as reported by netdata/top.
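In case it helps correlate the stall with iowait, here is a minimal sketch (nothing special, just sampling /proc/stat once a second with Python) that can be left running while a large transfer is in progress:

```python
import time

def cpu_times():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    values = [int(v) for v in fields]
    return sum(values), values[4]  # (total jiffies, iowait jiffies)

prev_total, prev_iowait = cpu_times()
while True:
    time.sleep(1)
    total, iowait = cpu_times()
    delta_total = total - prev_total
    pct = 100.0 * (iowait - prev_iowait) / delta_total if delta_total else 0.0
    print(f"{time.strftime('%H:%M:%S')}  iowait {pct:5.1f}%")
    prev_total, prev_iowait = total, iowait
```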

 

The 10GbE change does not seem to be at fault, since even transfers limited to 1GbE still cause the issue, albeit more slowly.

 

Debug steps I have attempted so far, with no positive results:

  • Checked logs; nothing outstanding seen in relation to drive read/write errors or health issues
  • Converted the cache from the original setup (btrfs RAID 1) to a single-drive XFS
  • Checked Ethernet cables
  • Checked SATA cables
  • Attempted to run trim on the drives (see the sketch below)
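For the trim step above, this is roughly what I mean, as a minimal sketch; it assumes the cache is mounted at /mnt/cache and that fstrim (util-linux) is available:

```python
import subprocess

CACHE_MOUNT = "/mnt/cache"  # assumed Unraid cache mount point

# "fstrim -v" prints how many bytes were discarded on the mount
result = subprocess.run(
    ["fstrim", "-v", CACHE_MOUNT],
    capture_output=True, text=True,
)
if result.returncode == 0:
    print(result.stdout.strip())   # e.g. "/mnt/cache: ... bytes trimmed"
else:
    # A failure here often means discard/TRIM is not passed through
    # (for example, SSDs behind some HBAs).
    print("fstrim failed:", result.stderr.strip())
```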

 

History:

  • Upgraded from 6.6.6 to 6.7.2 shortly after release; had no issues
  • Two weeks ago, upgraded to 10GbE; no issues until now.

 

Hardware setup:

  • Intel i7-4790K
  • Asus Z97M-PLUS
  • 32GB RAM
  • LSI 9211-8i (latest IT-mode firmware, used with the array drives)
  • 1x parity drive
  • 4x drives in array
  • 1x unassigned 512GB SSD for scratch use and other VMs
  • (originally) 2x SanDisk SSD Plus 1TB in btrfs RAID 1 for cache
  • (now) 1x SanDisk SSD Plus 1TB, XFS, for cache
  • Aquantia AQtion 10G Pro 10GbE NIC (AQN-107)

 

Any help would be highly appreciated; I will try to post the debug data when I can tonight.

 

It's very odd and confusing that everything was working fine up until now and then went bad with no outward changes or visible errors. I'd also like to go back to using a parity-backed cache if possible.

14 minutes ago, stephen_m64 said:

I will try to post the debug data when I can tonight

I won't speculate on what is happening until we get the complete diagnostics zip.

 

But I question the wisdom of caching very large writes. Cached data is best moved to the array during periods of inactivity. And cached data can only be moved to the array at the slower speeds of writing to the parity array. Probably makes more sense to transfer very large amounts of data directly to the parity array, so nothing needs to be moved.

 

 


True, and transfers directly to the array seem to be fine. But the entire point of 10GbE is to reduce the bottlenecks of 1GbE with multiple users and allow quick offload of large files to the cache so it can move them to the array at its leisure.

 

Working with or moving files at near-native SSD speeds is a key goal. (And it looked to work fine for about two weeks.)

 

You won't "question the wisdom of caching" after transferring 32GB+ files over 10GbE to SSDs vs 1GbE to spinning disks. 😂

22 minutes ago, stephen_m64 said:

You won't "question the wisdom of caching" after transferring 32GB+ files over 10GbE to SSDs vs 1GbE to spinning disks.

Depends. We haven't seen your diagnostics, and you haven't mentioned whether or not you are completely filling cache faster than the data can be moved off. Since you haven't mentioned it, I assumed you didn't know, and we would have to wait on your diagnostics to fill in the details.
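One quick way to answer that, as a rough sketch (it assumes the cache pool is mounted at /mnt/cache), is to watch cache free space while the transfer runs:

```python
import shutil
import time

CACHE_MOUNT = "/mnt/cache"  # assumed cache mount point

while True:
    usage = shutil.disk_usage(CACHE_MOUNT)
    free_gib = usage.free / 2**30
    used_pct = 100.0 * usage.used / usage.total
    print(f"{time.strftime('%H:%M:%S')}  cache {used_pct:5.1f}% used, {free_gib:7.1f} GiB free")
    time.sleep(5)
```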

5 minutes ago, trurl said:

Depends. We haven't seen your diagnostics, and you haven't mentioned whether or not you are completely filling cache faster than the data can be moved off. Since you haven't mentioned it, I assumed you didn't know, and we would have to wait on your diagnostics to fill in the details.

I have tested with the cache drive at a normal level of use (60% full) and with it near empty (around 60GB used), with and without trim, and still see the issue. (Post-trim, the issue still happens but takes longer to appear as the drive's written blocks are used up.)

 

Happy to answer any questions, and I will get the zip file posted tonight.

 

Note: I work in the storage industry, in R&D of solid-state drives. I have a good bit of experience in this area; I just don't know how/why this issue has appeared in Unraid all of a sudden.

1 minute ago, stephen_m64 said:

I have tested with the cache drive at a normal level of use (60% full) and with it near empty (around 60GB used), with and without trim, and still see the issue. (Post-trim, the issue still happens but takes longer to appear as the drive's written blocks are used up.)

 

Happy to answer any questions, and I will get the zip file posted tonight.

 

Note: I work in the storage industry, in R&D of solid-state drives. I have a good bit of experience in this area; I just don't know how/why this issue has appeared in Unraid all of a sudden.

Look at your full SMART report for the cache disk and see how much of your spare flash is available. If the disk has had to mark areas of the flash invalid, and is no longer able to cycle written blocks through the garbage-collection queue, then any new write will require an erase before the data can be programmed. This causes the system to wait for the SSD to discard that area of the NAND. The issue becomes much more apparent with lower free space, as the SSD must then shuffle data around to create a contiguous section to accept the new block of data. I see you work in the storage industry, so I'd be surprised if you hadn't looked at this already, but it's worth mentioning anyway.
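As a rough sketch, something like this pulls the relevant attributes with smartctl (smartmontools); the keyword list is only a guess since attribute names vary by vendor, and /dev/sdX is a placeholder for the cache SSD:

```python
import subprocess

DEVICE = "/dev/sdX"  # placeholder: replace with the cache SSD
KEYWORDS = ("reserv", "spare", "wear", "reallocated")

# "smartctl -A" prints the vendor-specific SMART attribute table
out = subprocess.run(
    ["smartctl", "-A", DEVICE],
    capture_output=True, text=True,
).stdout

for line in out.splitlines():
    if any(key in line.lower() for key in KEYWORDS):
        print(line)
```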


I already looked at the SMART data for the drives, and other than the vendor-unique SMART attribute tables, things such as reallocated sectors and the media wearout indicator look fine. The drives are actually quite new, upgrades from smaller drives, with about 3.4 months of power-on hours.

 

Even with a trim and the drive near empty (the most optimal state for writes), the issue is still seen, albeit taking a bit longer to appear, as mentioned in the posts above.

 

I hate having to take the system down over and over again, but I plan to pull the drives and test them individually for failure, time permitting.

 

 

1 hour ago, Benson said:

The problem relates to SSD performance; it depends on the read/write throughput and the I/O response time when the drive is busy.

 

The performance of the SSDs is not really in question. They are cheaper SATA SSDs, yes, but still far faster than any spinning-disk media, especially for simple workloads like a sequential write.

 

The system should not crumble and become unresponsive during large writes; as stated above, this did not happen previously.


"Fast" is subjective; the drive's speeds are decent for the price and faster than the platter drives.

If I had the money I would have truly fast NVMe drives, but that's neither here nor there.

 

No disrespect, but please have something constructive to add when replying in this thread.

 

As requested above, the debug logs are attached. I did remove a few files containing sensitive info.

tower-diagnostics-20190801-0115.zip

9 hours ago, itimpi said:

It might be worth mentioning which these were? The diagnostics are meant to be anonymised to remove this type of information.

Removed two files. The syslog had more private info in it since I have some more verbose debugging messages enabled.

 

lsof had some network connections and IP addresses that need not be public.


Ah, I missed that you had already tried to trim. In the thread you posted, the HBA cards aren't able to trim; I guess your SSD is plugged into the motherboard, though.

 

I wonder what a straight benchmark of the SSD would show? Maybe trim the drives in another system and see how they go afterwards?
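Something like this crude sequential-write sketch would show a mid-transfer stall as a sudden drop in per-chunk throughput (the test path on the cache is an assumption; remember to delete the file afterwards):

```python
import os
import time

TEST_FILE = "/mnt/cache/write_test.bin"  # assumed path on the cache
CHUNK = 8 * 1024 * 1024                  # 8 MiB per write
CHUNKS_PER_REPORT = 128                  # report every 1 GiB
TOTAL_GIB = 32                           # roughly the transfer size that triggers the issue

buf = os.urandom(CHUNK)
with open(TEST_FILE, "wb") as f:
    for gib in range(TOTAL_GIB):
        start = time.monotonic()
        for _ in range(CHUNKS_PER_REPORT):
            f.write(buf)
        os.fsync(f.fileno())             # force the chunk out to the SSD
        elapsed = time.monotonic() - start
        print(f"GiB {gib + 1:2d}: {1024 / elapsed:7.1f} MiB/s")

os.remove(TEST_FILE)
```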

