Matt4982 Posted February 15, 2019

Hey guys, just installed an SSD in my Unraid setup. I took the drive from a working laptop. When I try to format it, either through Unraid or the Unassigned Devices plugin, it fails to do so. Here's a copy of the log coming up:

Feb 15 11:45:39 Unraid kernel: ata14.00: status: { DRDY ERR }
Feb 15 11:45:39 Unraid kernel: ata14.00: error: { ABRT }
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: configured for UDMA/133
Feb 15 11:45:39 Unraid kernel: ata14: EH complete
Feb 15 11:45:39 Unraid kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Feb 15 11:45:39 Unraid kernel: ata14.00: irq_stat 0x40000001
Feb 15 11:45:39 Unraid kernel: ata14.00: failed command: READ DMA
Feb 15 11:45:39 Unraid kernel: ata14.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 10 dma 4096 in
Feb 15 11:45:39 Unraid kernel: res 51/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
Feb 15 11:45:39 Unraid kernel: ata14.00: status: { DRDY ERR }
Feb 15 11:45:39 Unraid kernel: ata14.00: error: { ABRT }
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: configured for UDMA/133
Feb 15 11:45:39 Unraid kernel: ata14: EH complete
Feb 15 11:45:39 Unraid kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Feb 15 11:45:39 Unraid kernel: ata14.00: irq_stat 0x40000001
Feb 15 11:45:39 Unraid kernel: ata14.00: failed command: READ DMA
Feb 15 11:45:39 Unraid kernel: ata14.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 11 dma 4096 in
Feb 15 11:45:39 Unraid kernel: res 51/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
Feb 15 11:45:39 Unraid kernel: ata14.00: status: { DRDY ERR }
Feb 15 11:45:39 Unraid kernel: ata14.00: error: { ABRT }
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: supports DRM functions and may not be fully accessible
Feb 15 11:45:39 Unraid kernel: ata14.00: NCQ Send/Recv Log not supported
Feb 15 11:45:39 Unraid kernel: ata14.00: configured for UDMA/133
Feb 15 11:45:39 Unraid kernel: ata14: EH complete
Feb 15 11:45:39 Unraid kernel: sdb: unable to read partition table
Feb 15 11:45:39 Unraid unassigned.devices: Reload partition table result: BLKRRPART failed: Input/output error /dev/sdb: re-reading partition table
Feb 15 11:45:39 Unraid unassigned.devices: Formatting disk '/dev/sdb' with 'xfs' filesystem.
Feb 15 11:45:39 Unraid unassigned.devices: Format disk '/dev/sdb' with 'xfs' filesystem failed! Result:
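For anyone triaging a similar log later, the signal to look for is the drive aborting reads: status { DRDY ERR } with error { ABRT } on a READ DMA command. A quick, hedged sketch for counting those failures — it uses a two-line excerpt of the log above so it runs anywhere; on a live Unraid box you'd point the `grep` at /var/log/syslog instead:

```shell
#!/bin/sh
# Count aborted ATA commands in a syslog excerpt.
# The excerpt reuses lines from the log in this post; on a real server,
# grep /var/log/syslog directly instead of this sample file.
cat > /tmp/syslog.excerpt <<'EOF'
Feb 15 11:45:39 Unraid kernel: ata14.00: failed command: READ DMA
Feb 15 11:45:39 Unraid kernel: ata14.00: error: { ABRT }
Feb 15 11:45:39 Unraid kernel: ata14.00: failed command: READ DMA
EOF
grep -c 'failed command' /tmp/syslog.excerpt   # prints 2 for this excerpt
```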
JorgeB Posted February 15, 2019

That looks like a connection problem, but you should always post the complete diags.
Matt4982 Posted February 15, 2019

2 minutes ago, johnnie.black said: "That looks like a connection problem, but you should always post the complete diags."

Attached is the diags zip. I have tried two different PCIe controllers in my Dell R510, so hopefully that's not the case.

unraid-diagnostics-20190215-1154.zip
JorgeB Posted February 15, 2019

You're using a controller with port multipliers; they're prone to issues. It looks like there are no onboard SATA ports? If so, it should work fine on a cheap 2-port ASMedia controller. It will also work on the HBA, but trim won't work.
Matt4982 Posted February 15, 2019

4 minutes ago, johnnie.black said: "You're using a controller with port multipliers, they're prone to issues, looks like there are no onboard SATA ports? If so it should work fine on a cheap 2 port Asmedia controller. It will also work on the HBA, but trim won't work."

Do you have a suggestion for a card without port multipliers? I guess I'm not entirely sure what the difference would be. Unfortunately, with the 12-bay R510 the internal SATA ports are disabled and you'd have to splice to get power, so that's why I was looking at PCIe cards that did both data and power.

Edited February 15, 2019 by Matt4982: Added Amazon links
Matt4982 Posted February 15, 2019 Author Share Posted February 15, 2019 These are the two cards I've tried. https://www.amazon.com/gp/product/B01452SP1O/ref=oh_aui_search_asin_title?ie=UTF8&psc=1 https://www.amazon.com/gp/product/B00WUZPMHE/ref=oh_aui_search_asin_title?ie=UTF8&psc=1 Quote Link to comment
JorgeB Posted February 15, 2019

3 minutes ago, Matt4982 said: "These are the two cards I've tried."

The diagnostics were with one of those? Weird, as it's reported in the syslog as a 10-port SATA controller, but it's likely the problem. Sorry, I can't recommend a known-working card of that type, since they are not commonly used with Unraid. You could always get an NVMe device; those would work without issues, but of course they are more expensive.
Matt4982 Posted February 15, 2019

28 minutes ago, johnnie.black said: "The diagnostics were with one of those? Weird as it is reported on the syslog as a 10 port SATA controller, but it's likely the problem, but sorry can't recommend a known working one of that type, since they are not commonly used with Unraid. You could always get an NVMe device, those would work without issues, but of course they are more expensive."

Yep, the second one is the one currently in the system. Of course, the array is also running on an H200 flashed to IT mode.
JorgeB Posted February 15, 2019

6 minutes ago, Matt4982 said: "Of course the array is also running on an H200 flashed to IT mode."

I saw that one, and if you connect the SSD there it will work, or it should work, but as mentioned, no trim.
Matt4982 Posted February 22, 2019

Alright, so I got a new card and switched to an NVMe drive. It allowed me to format after some struggles and I got set up; however, the Docker containers are now crawling. I looked through the logs and am seeing tons of timeouts for the drive. Not sure if I'm missing something that needs to be done.

Feb 22 00:15:50 Unraid kernel: nvme nvme0: I/O 927 QID 16 timeout, completion polled
Feb 22 00:15:50 Unraid kernel: nvme nvme0: I/O 928 QID 16 timeout, completion polled
Feb 22 00:15:50 Unraid kernel: nvme nvme0: I/O 929 QID 16 timeout, completion polled
Feb 22 00:24:32 Unraid kernel: nvme nvme0: I/O 24 QID 0 timeout, completion polled
Feb 22 00:25:34 Unraid kernel: nvme nvme0: I/O 5 QID 0 timeout, completion polled
Feb 22 00:46:03 Unraid kernel: nvme nvme0: I/O 2 QID 0 timeout, completion polled
Feb 22 00:47:04 Unraid kernel: nvme nvme0: I/O 29 QID 0 timeout, completion polled
Feb 22 00:48:12 Unraid kernel: nvme nvme0: I/O 929 QID 16 timeout, completion polled
Feb 22 00:48:12 Unraid kernel: nvme nvme0: I/O 930 QID 16 timeout, completion polled

unraid-diagnostics-20190222-0836.zip
JorgeB Posted February 22, 2019

That NVMe device uses the SM2262 controller, which has some quirks with Linux; you can try updating to v6.7-rc, which includes a patch for that.
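As a side note for anyone who can't update: a workaround often suggested for NVMe power-state (APST) quirks on Linux is to cap the allowed power-state latency with a kernel parameter. This is a hedged sketch only — the parameter `nvme_core.default_ps_max_latency_us` is a standard nvme driver option, but whether it helps this particular drive is untested, and the file layout below assumes a stock Unraid syslinux.cfg:

```
# /boot/syslinux/syslinux.cfg (excerpt) -- add the nvme_core option to
# the existing append line; 0 disables APST power-state transitions.
label Unraid OS
  menu default
  kernel /bzimage
  append initrd=/bzroot nvme_core.default_ps_max_latency_us=0
```

A reboot is needed for the change to take effect.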
Matt4982 Posted February 22, 2019

Ok, new version installed. Now getting this error over and over:

Feb 22 10:24:50 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:50 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:51 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:51 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:52 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:52 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:53 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 15997 start 0
Feb 22 10:24:53 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 15997 start 0
Feb 22 10:24:53 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:53 Unraid kernel: BTRFS info (device nvme0n1p1): no csum found for inode 12700 start 0
Feb 22 10:24:54 Unraid kernel: btrfs_dev_stat_print_on_error: 122 callbacks suppressed
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7316, flush 0, corrupt 0, gen 0
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7317, flush 0, corrupt 0, gen 0
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7318, flush 0, corrupt 0, gen 0
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7319, flush 0, corrupt 0, gen 0
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7320, flush 0, corrupt 0, gen 0
Feb 22 10:24:54 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7321, flush 0, corrupt 0, gen 0
Feb 22 10:24:55 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7322, flush 0, corrupt 0, gen 0
Feb 22 10:24:55 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7323, flush 0, corrupt 0, gen 0
Feb 22 10:24:55 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7324, flush 0, corrupt 0, gen 0
Feb 22 10:24:55 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 2, rd 7325, flush 0, corrupt 0, gen 0

unraid-diagnostics-20190222-1626.zip
JorgeB Posted February 22, 2019

Looks like there are still issues; the problems start here:

Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 3, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 5, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 6, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 7, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 8, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 9, flush 0, corrupt 0, gen 0
Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 10, flush 0, corrupt 0, gen 0

These are read errors, though I'm not seeing any earlier error about the device dropping. There is also this:

Feb 22 10:14:47 Unraid kernel: nvme nvme0: failed to set APST feature (-19)

which I'm not sure is a reason for concern. Either there's a problem with the device, or there are still some issues with the current kernel. If you can, replace it with a Samsung NVMe device; those are known to work without issues.
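As background on how to read those lines: btrfs keeps cumulative per-device error counters (wr/rd/flush/corrupt/gen) and the kernel prints all five on each error, which is why the rd value climbs line by line. On a live system the same counters are reported by `btrfs device stats` on the pool's mount point (the /mnt/cache path below is an assumption). A minimal sketch that just pulls the counters out of one of the quoted log lines:

```shell
#!/bin/sh
# Extract the btrfs per-device error counters from a kernel log line.
# The sample line is copied from the log above; on a running server the
# live counters would come from e.g. `btrfs device stats /mnt/cache`
# (mount point is an assumption).
line='Feb 22 10:14:57 Unraid kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 10, flush 0, corrupt 0, gen 0'
printf '%s\n' "$line" | sed -n 's/.*errs: //p'
# prints: wr 0, rd 10, flush 0, corrupt 0, gen 0
```

Note the counters are lifetime totals for the device; they only reset when cleared explicitly (`btrfs device stats -z`), so a nonzero rd count alone doesn't tell you whether errors are still occurring now.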
Matt4982 Posted February 23, 2019

So, at this point I feel pretty hosed, and this might be beyond my knowledge. The appdata folder is on the cache drive, and since it's getting read errors, most of the Docker containers aren't working. I tried to use the mover, but it doesn't appear to be working. I also tried going in through Midnight Commander and moving the appdata, but it is timing out. I did a backup via CA Backup and Restore prior to all this, so now I'm wondering what the easiest method would be to just restore the appdata onto the array.
JorgeB Posted February 23, 2019

Not sure if CA Backup permits changing the restore folder, but if it doesn't, it should just be a zipped file; you should be able to restore it manually to wherever you want.
Squid Posted February 24, 2019

Quoting JorgeB: "Not sure if CA Backup permits to change the restore folder, but if it doesn't it should just be a zipped file, you should be able to restore it manually to wherever you want."

You just change the "source" in its settings to adjust where a restore goes.

Sent via telekinesis
Matt4982 Posted February 24, 2019

If I move them manually from the zip, do I need to do anything special to retain the proper permissions? The zip is currently on a different source. I worry about doing a restore through CA Backup and Restore, since some of my dockers are already off the cache drive and the backup will now be a few days old.