
Shares missing, unable to write to cache


DingHo


Hello,

I transferred my array to a new system yesterday. Everything seemed fine: it booted without problems, and the array and all docker containers started without issue. Before bed last night I added some new drives to begin preclearing. When I checked this morning, I noticed several problems. The docker containers' versions are marked 'not available'. Running Fix Common Problems reported "unable to write to cache", "unable to write to docker image", and "call traces found on your server". In the Shares tab, all shares are missing. The data still seems to be there if I look on the data disks.

 

My new system is an Asus P8V77-V with an i5-3570k, running Unraid v6.3.5. In order to add more drives, I also added a Supermicro AOC-SAS2LP-MV8. I've noticed people have issues with this controller. However, the cache disk is not connected to that controller.

 

I've attached the syslog.  Please let me know what else would be helpful to diagnose.

 

Any help is greatly appreciated.

scour-diagnostics-20180418-0732.zip

Link to comment

Your cache device dropped offline:

 

Quote

Apr 18 02:00:32 Scour kernel: ata8.00: status: { DRDY }
Apr 18 02:00:32 Scour kernel: ata8: hard resetting link
Apr 18 02:00:42 Scour kernel: ata8: softreset failed (1st FIS failed)
Apr 18 02:00:42 Scour kernel: ata8: hard resetting link
Apr 18 02:00:52 Scour kernel: ata8: softreset failed (1st FIS failed)
Apr 18 02:00:52 Scour kernel: ata8: hard resetting link
Apr 18 02:01:27 Scour kernel: ata8: softreset failed (1st FIS failed)
Apr 18 02:01:27 Scour kernel: ata8: limiting SATA link speed to 3.0 Gbps
Apr 18 02:01:27 Scour kernel: ata8: hard resetting link
Apr 18 02:01:32 Scour kernel: ata8: softreset failed (1st FIS failed)
Apr 18 02:01:32 Scour kernel: ata8: reset failed, giving up
Apr 18 02:01:32 Scour kernel: ata8.00: disabled

 

Check/replace cables.
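
For anyone checking their own syslog for the same pattern, here is a minimal Python sketch that groups libata failure messages by port. The regular expression and the list of failure markers are assumptions based only on the excerpt quoted above, so other kernel versions may word these lines differently.

```python
import re

# Matches kernel libata lines such as:
#   Apr 18 02:00:32 Scour kernel: ata8: hard resetting link
# The pattern is an assumption derived from the quoted excerpt only.
ATA_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<port>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)

# Message fragments taken from the excerpt; not an exhaustive list.
FAILURE_MARKERS = ("softreset failed", "reset failed, giving up", "disabled")


def find_dropped_ports(lines):
    """Return {port: [(timestamp, message), ...]} for failure lines."""
    dropped = {}
    for line in lines:
        match = ATA_LINE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        if any(marker in msg for marker in FAILURE_MARKERS):
            dropped.setdefault(match.group("port"), []).append(
                (match.group("ts"), msg)
            )
    return dropped


# Lines copied from the syslog excerpt in this thread.
SAMPLE = [
    "Apr 18 02:01:27 Scour kernel: ata8: hard resetting link",
    "Apr 18 02:01:32 Scour kernel: ata8: softreset failed (1st FIS failed)",
    "Apr 18 02:01:32 Scour kernel: ata8: reset failed, giving up",
    "Apr 18 02:01:32 Scour kernel: ata8.00: disabled",
]

for port, events in find_dropped_ports(SAMPLE).items():
    print(port, "first failure at", events[0][0], "last message:", events[-1][1])
```

To scan a real log, pass an open file instead of `SAMPLE`, e.g. `find_dropped_ports(open("syslog.txt"))`, where `syslog.txt` is a hypothetical file name. The timestamp of the first failure can then be compared against scheduled jobs on the server.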

Link to comment

Update:

After the pre-clears finished, I rebooted the server.  The array started successfully. The cache drive and shares are accessible and the dockers are functional.

I successfully made a copy of everything on the cache drive without any errors. The SMART short and extended tests ran without errors.

 

All the errors started at 2am, which is when the SSD trim is scheduled for the cache drive. Perhaps the controller doesn't support TRIM?

Link to comment
1 hour ago, johnnie.black said:

Check/replace cables.

 

I tried multiple cables. They are not the issue.


I'm able to reproduce it when I invoke TRIM. I switched the drive to another controller, invoked TRIM, and the issue did not recur. I think the other controller did not support TRIM.

Link to comment
11 hours ago, pwm said:

Might it be something wrong with the drive that manifests when it gets the trim command - and the second controller doesn't support trim?

 

On the other controllers, I get the errors whenever TRIM runs. This is what I get when the drive is on the 'good' controller:

Apr 18 15:40:07 Scour root: /etc/libvirt: 1022.5 MiB (1072185344 bytes) trimmed
Apr 18 15:40:07 Scour root: /var/lib/docker: 16.6 GiB (17835913216 bytes) trimmed
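
A rough way to see which mount points a TRIM run actually touched is to parse lines like the two above. This is a minimal Python sketch; the regular expression is an assumption based only on those two lines, and `/mnt/cache` is assumed here to be the cache mount point.

```python
import re

# Matches the "<mount>: <size> (<n> bytes) trimmed" part of a syslog line.
# The pattern is an assumption derived from the two lines quoted above.
FSTRIM_LINE = re.compile(r"(?P<mount>/\S*): .*\((?P<bytes>\d+) bytes\) trimmed")


def trimmed_mounts(lines):
    """Return {mount_point: bytes_trimmed} for every matching line."""
    result = {}
    for line in lines:
        match = FSTRIM_LINE.search(line)
        if match:
            result[match.group("mount")] = int(match.group("bytes"))
    return result


# Lines copied from this post.
SAMPLE = [
    "Apr 18 15:40:07 Scour root: /etc/libvirt: 1022.5 MiB (1072185344 bytes) trimmed",
    "Apr 18 15:40:07 Scour root: /var/lib/docker: 16.6 GiB (17835913216 bytes) trimmed",
]

mounts = trimmed_mounts(SAMPLE)
for mount, nbytes in mounts.items():
    print(f"{mount}: {nbytes} bytes trimmed")

# "/mnt/cache" is an assumed mount point for the cache device.
print("cache mount trimmed:", "/mnt/cache" in mounts)
```

If the cache mount point is absent from the result, the run only trimmed other filesystems, which is worth knowing before concluding anything about the controller.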
Link to comment

The cache device is not being trimmed, only the docker and libvirt images.

 

In the diags you posted, the SSD was installed on the Asmedia controller, which uses the AHCI driver and should support TRIM without any issues. The same goes for the Intel onboard controller. The SAS2LP won't support TRIM.
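
One way to check what the kernel reports for a given device on a given controller is the block layer's `discard_max_bytes` value in sysfs, where 0 means discard is not available for that device. The following is a minimal Python sketch, not an Unraid tool; the device name `sdb` is a hypothetical example and the helper names are made up for illustration.

```python
from pathlib import Path


def parse_discard_max_bytes(text):
    """Convert the raw sysfs file contents to an integer."""
    return int(text.strip())


def supports_discard(discard_max_bytes):
    """A value of 0 means the block layer reports no discard support."""
    return discard_max_bytes > 0


def device_discard_max_bytes(device, sysfs_root="/sys/block"):
    """Read queue/discard_max_bytes for a device, or None if unreadable."""
    path = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        return parse_discard_max_bytes(path.read_text())
    except (OSError, ValueError):
        return None


# "sdb" is a hypothetical device name; substitute the cache SSD's name.
value = device_discard_max_bytes("sdb")
if value is None:
    print("could not read discard_max_bytes for sdb")
else:
    print("discard supported:", supports_discard(value))
```

Running this with the SSD on each controller in turn would show whether the kernel exposes discard for the drive on that path, independent of whether a scheduled TRIM job has run.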

Link to comment

Archived

This topic is now archived and is closed to further replies.
