(Solved) Mover Error



I'm trying to free up space on my cache drive, but the mover isn't moving some files off it. I turned on mover logging and the syslog shows "error: Invalid argument" for multiple files.

 

Some things I've tried to resolve this issue:

1. Stopped all Docker containers and ran the mover.

2. Set new Docker-safe permissions and ran the mover.

3. Installed the Open Files plugin and checked it for the files listed in the syslog, but it doesn't show any of them in use.

 

I'm not sure what is causing this error. I thought it might be the backups running, but it continues even though I've stopped the backups.

 

Any ideas as to what is causing this?

 

tower-diagnostics-20190603-1602.zip

Edited by nickro8303
1 hour ago, nealbscott said:

Run the plugin 'fix common problems' and see if it finds anything.  I had files on my cache drive, but the share they belonged to was not enabled to use  that cache, therefore the mover never did anything with them.  The good news is that the plugin caught it.

Thanks for the reply. I've already checked the settings on that share several times to make sure it is set to use the cache. I ran the Fix Common Problems plugin but it didn't find anything.

8 hours ago, John_M said:

You have a cable problem (or, much less likely, a controller problem) with your cache SSD. Shut down, replace the SATA cable, check the power cable and power up again.

I've replaced the cables several times. I've also replaced the motherboard, the power supply, the controller and even the SSD. Those CRC errors won't go away. The cache disk isn't plugged into the controller; it's plugged into the motherboard.


If it isn't the cables then it's the controller or the SSD itself. You still have a hardware problem, irrespective of what you've changed. Your syslog is full of messages like this, which makes it difficult to read:

Jun  1 06:13:29 TOWER kernel: ata6.00: failed command: READ FPDMA QUEUED
Jun  1 06:13:29 TOWER kernel: ata6.00: cmd 60/08:78:40:dc:5c/00:00:18:00:00/40 tag 15 ncq dma 4096 in
Jun  1 06:13:29 TOWER kernel:         res 41/84:08:40:dc:5c/00:00:18:00:00/00 Emask 0x410 (ATA bus error) <F>
Jun  1 06:13:29 TOWER kernel: ata6.00: status: { DRDY ERR }
Jun  1 06:13:29 TOWER kernel: ata6.00: error: { ICRC ABRT }
Jun  1 06:13:29 TOWER kernel: ata6: hard resetting link
Jun  1 06:13:29 TOWER kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun  1 06:13:29 TOWER kernel: ata6.00: supports DRM functions and may not be fully accessible
### [PREVIOUS LINE REPEATED 1 TIMES] ###

Is it a Marvell-based controller? They are known to have problems and they seem to be getting worse with newer kernels. If so, you'll need to replace it. There are inexpensive dual-port ASMedia SATA controllers that work well with Unraid.
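
For anyone facing a syslog flooded like the one quoted above, a quick tally per port can show whether a single link is responsible. The following is only a minimal Python sketch, not an Unraid tool: the function name `count_ata_errors` is made up for illustration, and the pattern assumes nothing beyond the kernel line format seen in the excerpt.

```python
import re
from collections import Counter

# Matches kernel lines of the form seen in the excerpt, for example:
#   "Jun  1 06:13:29 TOWER kernel: ata6.00: error: { ICRC ABRT }"
ERROR_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: error: \{ ([A-Z ]+) \}")


def count_ata_errors(lines):
    """Return a Counter keyed by (port, flag), e.g. ("ata6", "ICRC").

    `lines` is any iterable of syslog lines (an open file works).
    This helper is hypothetical and exists only for illustration.
    """
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            port, flags = match.groups()
            # A single line can carry several flags, such as ICRC and ABRT.
            for flag in flags.split():
                counts[(port, flag)] += 1
    return counts
```

Passing it an open syslog file (the path depends on your system) gives one count per port and flag. For the excerpt above it would record one ICRC and one ABRT against ata6, so a real log dominated by a single port points at that port's cable, connector or device.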

16 minutes ago, John_M said:

Is it a Marvell-based controller? They are known to have problems and they seem to be getting worse with newer kernels. If so, you'll need to replace it. There are inexpensive dual-port ASMedia SATA controllers that work well with Unraid.

I believe it is, but the SSD is not plugged into the controller. Would it still cause these errors even though the SSD isn't on those ports?


No. If the controller isn't in use then it can't cause any problems. The problem is with the SATA link between the controller and the SSD. It keeps resetting. Usually that means a bad SATA cable. Other causes can be bad power distribution, a bad controller port (or a generally problematic controller) or a bad SSD. You could try moving the SSD to the port used by one of your hard disks. Also, are you using power splitters? Change things one at a time and make notes.


SATA cables that have long parallel runs can crosstalk so obsessive cable neatness is best avoided. Most SATA cables leave a lot to be desired. The nasty stiff flat red ones give me the most trouble. I prefer the more flexible ones with latching connectors and an 8-shaped cross section.

15 minutes ago, John_M said:

SATA cables that have long parallel runs can crosstalk so obsessive cable neatness is best avoided. Most SATA cables leave a lot to be desired. The nasty stiff flat red ones give me the most trouble. I prefer the more flexible ones with latching connectors and an 8-shaped cross section.

I'm not kidding when I say I've replaced almost every part in this server over the last few months trying to stop those CRC errors. The only parts I haven't replaced are the CPU and the RAM modules. I am using those flat red cables, and my cable management isn't perfect either. I'll pick up some better cables and try replacing them again, but I'm still not sure that this is what's causing the mover to error out on those files.


You might well have more than one problem but with the syslog so full of CRC errors it's difficult to see the wood for the trees. When a CRC error happens the controller retries and eventually resets the link, dropping the speed if necessary. This particular SATA port/cable/SSD combination can't even maintain a 1.5 Gb/s link. The occasional CRC error is no big problem but, as I said, you have so many of them it's difficult to see what else is going on so you need to fix that first and then see if there's anything else that isn't as it should be. I'd unplug the SSD and literally swap it with one of the hard disks as a way of narrowing down the problem. Unraid recognises drives by their serial number, not by where they are connected so you won't have to reconfigure anything. I can't see any way that this particular problem could be RAM or CPU related.
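
To make that concrete, here is a rough Python sketch of two small helpers: one lists the link speed each port negotiated after every reset, and the other hides the ATA chatter so that whatever else is in the syslog becomes visible. Both names (`link_speeds`, `without_ata_noise`) are invented for this example, and the patterns assume only the kernel line shapes quoted earlier in the thread.

```python
import re

# "Jun  1 06:13:29 TOWER kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)"
LINK_UP = re.compile(r"kernel: (ata\d+): SATA link up ([\d.]+) Gbps")

# Any kernel line that starts with an ataN / ataN.NN prefix, plus the
# indented "res ..." continuation line that follows a failed command.
ATA_NOISE = re.compile(r"kernel:\s+(?:ata\d+(?:\.\d+)?: |res [0-9a-f]{2}/)")


def link_speeds(lines):
    """Map each port to the speeds (in Gbps) it negotiated, in log order."""
    speeds = {}
    for line in lines:
        match = LINK_UP.search(line)
        if match:
            speeds.setdefault(match.group(1), []).append(float(match.group(2)))
    return speeds


def without_ata_noise(lines):
    """Yield only the lines that are not ATA error or reset chatter."""
    for line in lines:
        if not ATA_NOISE.search(line):
            yield line
```

A port whose list keeps ending at 1.5 is one that cannot hold a faster link. The filtered output is only a reading aid, since the underlying hardware fault still has to be fixed first.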


These are the ones I've been buying for the last couple of years. I haven't used them with Samsung SSDs though. I've had no issues using them with SanDisk or WD Blue SSDs. I also like the ones you tend to get bundled with ASRock motherboards - they're thicker than these, black and very flexible.


Well, I changed the cables and swapped ports with another drive and, to no surprise, I'm still getting CRC errors on the cache drive. I'm telling you, there is no getting rid of these errors. I've literally replaced everything.

 

Anyway, I connected a spare drive, transferred the Backup data off the cache drive and then back to the array, but now I'm getting an error when trying to write to the Backup share.

 

[Attached image: screenshot of the error when writing to the Backup share]

 

The only way it will let me write to that share is if I enable cache on it. Any ideas?

tower-diagnostics-20190606-0002.zip

14 minutes ago, John_M said:

You said in an earlier post that you've replaced the SSD. Given johnnie.black's comment about Samsung ones being very picky, have you tried a different brand?

I have not tried another brand yet. As far as I can tell, though, there is no issue with the SSD. When I replaced it, the CRC error count picked up where it left off with the old one and continued. I've not seen any failures to read from or write to the SSD.

 

Any ideas as to the error I just posted about?

 

 

image.png

 

Edited by nickro8303

I'm glad you've found a workaround but I can't see that stopping the CRC errors, since they are a hardware issue. The error message looks like file system corruption, likely caused by the inability of the SATA controller to communicate reliably with the SSD. Your syslog will show you if the CRC errors are still happening. You haven't fixed the problem, just hidden it for a while.

 

Well, j.b has pointed out the likely cause and his advice is the best you'll get on the subject. I've suggested trying a different brand, with a couple of examples of what works for me. I don't have any other suggestions, I'm afraid.
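
One way to check whether the CRC errors are still happening after a change is to look only at entries logged after the time you made it. The Python sketch below is an illustration under stated assumptions: `icrc_errors_since` is a hypothetical helper, the timestamp layout is taken from the excerpt earlier in the thread, and because syslog lines carry no year the caller has to supply one. Month names are parsed with `%b`, so an English locale is assumed.

```python
import re
from datetime import datetime

# "Jun  1 06:13:29 TOWER kernel: ata6.00: error: { ICRC ABRT }"
ICRC_LINE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"ata\d+(?:\.\d+)?: error: \{[^}]*\bICRC\b"
)


def icrc_errors_since(lines, since, year):
    """Return the timestamps of ICRC errors logged at or after `since`.

    `year` is needed because the syslog timestamp does not include one.
    """
    hits = []
    for line in lines:
        match = ICRC_LINE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if stamp >= since:
            hits.append(stamp)
    return hits
```

An empty result for the period after a cable or port swap would suggest the change helped. Any hits mean the link is still unreliable, whatever the mover or the share appears to be doing.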

