High IOWAIT %


EarthYak


Starting today, I have noticed delays when transferring large files to my Unraid server. I looked through the syslog but found nothing highlighted at the times of the file transfers. When I copy files to the cache over the network (most recently at 2155 on 02/09/2019 in the logs), I can see high IOWait percentages, but nothing appears in the syslog.

 

I have balanced and scrubbed the cache today, before 2155.

 

I have experienced similar issues before. Over the last year I have replaced the HBA, cables and SSDs, and then had 6+ months without an issue. Does anyone have any ideas?

 

Thanks in advance.

IOwait numbers.PNG

hydra-diagnostics-20190902-2101.zip
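
For anyone wanting to log the IOWait percentage described above rather than read it off a screenshot, here is a minimal sketch in Python. It assumes a Linux host where the first line of /proc/stat is the aggregate "cpu" line. The function names are my own for illustration and are not part of Unraid or any tool mentioned in this thread.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Only the first 8 are summed, because guest time is already counted
    # inside user and nice.
    total = sum(after[:8]) - sum(before[:8])
    if total <= 0:
        return 0.0
    return 100.0 * (after[4] - before[4]) / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    with open(path) as stat_file:
        before = parse_cpu_line(stat_file.readline())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_cpu_line(stat_file.readline())
    return iowait_percent(before, after)
```

Calling sample_iowait() during a large copy and again when the server is idle gives two numbers to compare, which can then be lined up against syslog timestamps.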


I am intermittently seeing errors like these in the syslog, but so far not while I have been actively using the system.

 

Sep 3 12:16:20 Hydra kernel: print_req_error: I/O error, dev sdf, sector 196540536
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 968554 off 3047620608 (dev /dev/sdf1 sector 196540472)
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 515522 off 22855680 (dev /dev/sdf1 sector 637687392)
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 515522 off 58761216 (dev /dev/sdf1 sector 396601400)
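
Since these messages only show up intermittently, it can help to count them per device across the whole syslog. Below is a rough Python sketch that does this for the two message types shown above. The regular expressions are written against these four lines only and are an assumption about the general format, so other kernel versions may word the messages differently.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this post (an assumption,
# not an exhaustive description of kernel message formats).
IO_ERROR = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")
BTRFS_CORRECTED = re.compile(
    r"BTRFS info \(device (?P<dev>\w+)\): read error corrected: "
    r".*\(dev \S+ sector (?P<sector>\d+)\)"
)

SAMPLE = """\
Sep 3 12:16:20 Hydra kernel: print_req_error: I/O error, dev sdf, sector 196540536
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 968554 off 3047620608 (dev /dev/sdf1 sector 196540472)
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 515522 off 22855680 (dev /dev/sdf1 sector 637687392)
Sep 3 12:16:22 Hydra kernel: BTRFS info (device sdf1): read error corrected: ino 515522 off 58761216 (dev /dev/sdf1 sector 396601400)
"""


def summarize(lines):
    """Count block-layer I/O errors and corrected BTRFS read errors per device."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            counts[(match.group("dev"), "io_error")] += 1
            continue
        match = BTRFS_CORRECTED.search(line)
        if match:
            counts[(match.group("dev"), "read_error_corrected")] += 1
    return counts


def report(lines):
    """Return one human-readable summary line per (device, event) pair."""
    return [
        f"{dev}: {kind} x{count}"
        for (dev, kind), count in sorted(summarize(lines).items())
    ]
```

Passing the lines of a saved syslog file to report() gives a quick per-device tally, which makes it easier to see whether one SSD accounts for all of the errors.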

3 minutes ago, johnnie.black said:

Those are checksum errors on the cache pool; one of the devices is dropping. See here for better pool monitoring:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=700582

 

Thanks for this info and the helpful FAQ post. I have taken it on board and set up an alert.

 

This leaves me with two questions: what do I do about the 4 unrecoverable errors, and is there a possible cause for this that I should be running down?
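
As an illustration of the kind of pool alert mentioned above, here is a hedged Python sketch that parses the text printed by `btrfs device stats` and returns only the counters that are not zero. The sample text, its values and the /mnt/cache mount point are assumptions made for the example; they are not taken from my diagnostics, and this is not necessarily how the FAQ script does it.

```python
import re
import subprocess

# Each line of `btrfs device stats` output looks like:
#   [/dev/sdf1].read_io_errs     0
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)\s*$")

# Hypothetical output, for illustration only.
SAMPLE = """\
[/dev/sdf1].write_io_errs    0
[/dev/sdf1].read_io_errs     12
[/dev/sdf1].flush_io_errs    0
[/dev/sdf1].corruption_errs  4
[/dev/sdf1].generation_errs  0
[/dev/sdg1].write_io_errs    0
[/dev/sdg1].read_io_errs     0
[/dev/sdg1].flush_io_errs    0
[/dev/sdg1].corruption_errs  0
[/dev/sdg1].generation_errs  0
"""


def nonzero_counters(stats_text):
    """Return {(device, counter): value} for every counter above zero."""
    found = {}
    for line in stats_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match.group("value")) > 0:
            found[(match.group("dev"), match.group("counter"))] = int(match.group("value"))
    return found


def read_pool_stats(mount_point="/mnt/cache"):
    """Run `btrfs device stats` on the pool; the mount point is an assumption."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A scheduled job could call nonzero_counters(read_pool_stats()) and send a notification whenever the result is not empty, so a dropping device is noticed before the next large transfer.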

8 minutes ago, EarthYak said:

This leaves me with two questions, what do I do with the 4 unrecoverable errors

Best way forward is to back up the cache data, re-format and restore.

 

11 minutes ago, EarthYak said:

is there a possible cause for this I should be running down?

Those connection issues with SSDs are usually cable related; Samsung SSDs especially can be very picky about connection quality. Also keep in mind that trim won't work when they are connected to that LSI, so if possible use the Intel SATA ports instead for both.

 


Thank you, I will give that a go over the weekend.

  • 2 weeks later...
On 9/3/2019 at 4:16 PM, johnnie.black said:

Best way forward is to back up the cache data, re-format and restore.

Those connection issues with SSDs are usually cable related; Samsung SSDs especially can be very picky about connection quality. Also keep in mind that trim won't work when they are connected to that LSI, so if possible use the Intel SATA ports instead for both.

 

This has worked. I have run the system for 4 days now without an error. I just need to find somewhere in the case to secure them now!

 

Thanks for the help.
