Disappearing Cache Drives?



Good evening,

 

I am relatively new to unRAID, and so far it has been a very good experience. There is just one issue I cannot figure out. My system will be humming along nicely, and out of the blue a warning appears: "BTRFS cache pool missing devices". I have a two-drive cache pool configured, and periodically one of the cache devices (not always the same one) just disappears. I've read through the forums and found other instances; however, some of those threads are old and involve versions much older than my 6.6.7. The only way I can get the cache drive to return is by completely powering down the system and then turning it back on again; apparently a reboot isn't sufficient. I have attached my Diagnostics file and am hoping someone here can point me in the right direction. The latest occurrence was today, April 28, 2019, at 5:29 PM EDT.

 

Thank you,

Erik

nexus-diagnostics-20190428-1928.zip


There is a problem with the NVMe device:

 

Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90558672
Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90558912
Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90559168
Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90558680
Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90559424
Apr 28 14:38:47 Nexus kernel: BTRFS info (device sdh1): read error corrected: ino 9883863 off 1115926528 (dev /dev/nvme0n1p1 sector 90558616)
Apr 28 16:19:15 Nexus crond[2667]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Apr 28 17:01:37 Nexus kernel: mdcmd (116): spindown 2
Apr 28 17:21:05 Nexus kernel: mdcmd (117): spindown 1
Apr 28 17:24:48 Nexus kernel: nvme nvme0: I/O 18 QID 3 timeout, aborting
Apr 28 17:24:48 Nexus kernel: nvme nvme0: I/O 19 QID 3 timeout, aborting
Apr 28 17:24:48 Nexus kernel: nvme nvme0: I/O 20 QID 3 timeout, aborting
Apr 28 17:24:48 Nexus kernel: nvme nvme0: I/O 21 QID 3 timeout, aborting
Apr 28 17:25:18 Nexus kernel: nvme nvme0: I/O 18 QID 3 timeout, reset controller
Apr 28 17:25:48 Nexus kernel: nvme nvme0: I/O 0 QID 0 timeout, reset controller
Apr 28 17:27:19 Nexus kernel: nvme nvme0: Device not ready; aborting reset
Apr 28 17:27:19 Nexus kernel: nvme nvme0: Abort status: 0x7
### [PREVIOUS LINE REPEATED 3 TIMES] ###
Apr 28 17:28:20 Nexus kernel: nvme nvme0: Device not ready; aborting reset
Apr 28 17:28:20 Nexus kernel: nvme nvme0: Removing after probe failure status: -19
Apr 28 17:28:20 Nexus kernel: nvme nvme0: Device not ready; aborting reset

This is a hardware problem: the NVMe device logged read errors, then its I/O timed out, the controller resets failed, and it eventually dropped offline.
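For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch that counts the three kinds of kernel messages quoted above (medium errors, I/O timeouts, and the final removal). The regular expressions are based only on the wording in this log and may need adjusting for other kernel versions; the `summarize` function and the `PATTERNS` names are my own, not part of unRAID.

```python
import re
from collections import Counter

# Patterns modelled on the kernel messages quoted in this thread.
# Assumption: other kernel versions may word these messages differently.
PATTERNS = {
    "medium_error": re.compile(r"critical medium error, dev (nvme\w+)"),
    "io_timeout": re.compile(r"nvme (nvme\d+): I/O \d+ QID \d+ timeout"),
    "removed": re.compile(r"nvme (nvme\d+): Removing after probe failure"),
}


def summarize(lines):
    """Count how many syslog lines match each NVMe failure pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above, plus one unrelated line.
sample = [
    "Apr 28 14:38:47 Nexus kernel: print_req_error: critical medium error, dev nvme0n1, sector 90558672",
    "Apr 28 17:01:37 Nexus kernel: mdcmd (116): spindown 2",
    "Apr 28 17:24:48 Nexus kernel: nvme nvme0: I/O 18 QID 3 timeout, aborting",
    "Apr 28 17:25:18 Nexus kernel: nvme nvme0: I/O 18 QID 3 timeout, reset controller",
    "Apr 28 17:28:20 Nexus kernel: nvme nvme0: Removing after probe failure status: -19",
]

result = summarize(sample)
print(dict(result))
```

To run it against a real log, pass an open file instead of `sample`, for example `summarize(open(path))`, where `path` is wherever your syslog lives. A non-zero `removed` count means the kernel gave up on the device, which matches the "missing devices" warning.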

  • 4 months later...
On 4/29/2019 at 1:38 AM, ErikM1970 said:

The only way I can get the cache drive to return is by completely powering down the system and then turning it back on again--apparently a reboot isn't sufficient.

nexus-diagnostics-20190428-1928.zip 188.3 kB · 3 downloads

Hi Erik,

 

I have the same problem on unRAID version 6.7.2 with a brand new Crucial P1 500GB NVMe SSD.

 

Have you managed to fix this issue, by any chance?

 

With thanks,

G


Good morning,

 

In my situation, I ended up visiting the Supermicro website to see which NVMe drives they had tested/certified for use with my system board. I found a certified Toshiba 1TB NVMe drive and installed it in place of the Crucial NVMe. The results were much better but not perfect: I did have one recurrence where the drive disappeared. Once I power-cycled the system and scrubbed the cache pool, all was well again. As of this posting, there have been no more problems. I'm hoping this will improve with future releases of unRAID; for now, it is tolerable and just a minor inconvenience, as opposed to the major disruption I had with the Crucial NVMe.
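After a scrub, the per-device error counters that `btrfs device stats` prints are a quick way to see whether a pool member has logged errors. Below is a small Python sketch that picks the non-zero counters out of that output. The example text is made up for illustration (it is not from my system), and `nonzero_stats` is just a helper name I chose.

```python
import re

# One line of `btrfs device stats` output looks like:
#   [/dev/nvme0n1p1].read_io_errs     0
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_stats(text):
    """Return {(device, counter): value} for every counter above zero."""
    bad = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match["value"]) > 0:
            bad[(match["dev"], match["counter"])] = int(match["value"])
    return bad


# Hypothetical output for illustration only; the numbers are invented.
example = """\
[/dev/nvme0n1p1].write_io_errs    0
[/dev/nvme0n1p1].read_io_errs     5
[/dev/nvme0n1p1].corruption_errs  0
[/dev/sdh1].read_io_errs     0
"""

flagged = nonzero_stats(example)
print(flagged)
```

On a live system you would feed it the real command output, captured with something like `subprocess.run(["btrfs", "device", "stats", mount_point], capture_output=True, text=True).stdout`, where `mount_point` is your cache pool's mount point (I am assuming `/mnt/cache` on unRAID). An empty result means no device has recorded errors since the counters were last reset.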

 

Hope this helps,

Erik
