
Disk error when transferring files for too long at a time


Solved by JorgeB


Greetings!

I've got a home-built server running Unraid 6.12.0 (stable). The specs should be fine, but here goes:

  • ASRock Z790 PG-ITX/TB4
  • 13th Gen Intel® Core™ i9-13900KS @ 3168 MHz
  • 64 GiB DDR5

 

Storage:

 

Connections:

  • Server -> LAN: Ethernet (Cat6, possibly shielded I believe)
  • Server -> Enclosure 1: Active USB-C 3.1 Gen 2 5m cable (https://www.club-3d.com/en/detail/2507/)
  • Enclosure 1 -> Enclosure 2 (daisy chained): Included USB-C cable (I believe also 3.1 Gen 2)

 

Misc:

  • I have a 1TB nvme ssd set as cache pool

 

I recently bought an Ultrastar 20TB HDD to use as the parity drive, at the same time as I decided to migrate all my disks from a now-disassembled QNAP NAS to Unraid. I had ~15TB of data spread across my old disks, so my plan was to unload all of it onto the Ultrastar, move the old disks to the new enclosures, let Unraid format them, and then transfer the data back, running without a parity drive for the duration of the migration. I admit I've taken a suboptimal and rather naive approach but here I am xD

 

So the plan is to make the Ultrastar the parity drive once I've moved all the data off it. My first blunder was most likely that I formatted it and set it up in Unraid. From there I realized I had to set up an exclusive share that only the Ultrastar would hold data for, and I excluded that disk from all other shares (Disk 20 is the temporary slot I gave it).

 

Things have worked rather well until now, when I went to move all the data from my "server-migration" share to its intended final destination, the "media" share. What I haven't dared to do is let the Mover spread the data and then move the folder to the intended share myself.

What I have tried however is to:
 

  1. Copy using Unraid's built-in feature. I also tried chopping it up into queued chunks, but to no avail.
  2. Copy using my Windows machine. Still share to share so I didn't really expect it to work any better.
  3. Copy to my Windows machine as intermediary. Then over to Unraid.

 

Number 3 has worked the best, but it also forces me to chop the transfers down into chunks of ~600GB, which possibly matters.
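For anyone wanting to plan such chunks up front rather than by hand, here is a minimal sketch in Python. It only groups files into batches under a size limit and does not copy anything. The function names (`plan_batches`, `iter_files`) and the example path are made up for illustration, and the ~600GB limit is just the figure from this post.

```python
import os
from typing import Iterable, Iterator, List, Tuple


def iter_files(root: str) -> Iterator[Tuple[str, int]]:
    """Yield (path, size_in_bytes) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                yield path, os.path.getsize(path)


def plan_batches(files: Iterable[Tuple[str, int]],
                 limit_bytes: int) -> Iterator[List[str]]:
    """Group files, in order, into batches whose total size stays at or
    below limit_bytes. A single file larger than the limit gets a batch
    of its own."""
    batch: List[str] = []
    used = 0
    for path, size in files:
        # Close the current batch if this file would push it over the limit.
        if batch and used + size > limit_bytes:
            yield batch
            batch, used = [], 0
        batch.append(path)
        used += size
    if batch:
        yield batch


# Hypothetical usage (the path is an assumption, adjust to your own share):
# for i, batch in enumerate(plan_batches(iter_files("/mnt/user/server-migration"),
#                                        600 * 10**9), start=1):
#     print(f"batch {i}: {len(batch)} files")
```

Each batch can then be copied and checked before the next one starts, so a disconnect mid-way only costs one batch.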

 

What is my problem?


Like the title says, I inevitably get disk read errors during my transfers. Using method 3, I usually get 2 batches of 600GB to my machine and then onto Unraid before the error strikes: all drives dismount, and 8 out of 10 times I have to recreate my boot drive because it is no longer recognized when rebooting. I haven't dared try any other methods, as this seems to work every time, with an accompanying "Array turned good" notification on startup to calm my nerves a bit xD

I'm struggling to understand what causes this and would greatly appreciate your help.

I would throw the active USB-C cable under the bus without blinking as a possible perpetrator if it wasn't for its excellent performance when I copied the entirety of my ~15TB in approx. 3-4 copy runs. It ran for over a day non-stop with no problems, reading from all the old disks, which at the time were part of the QNAP NAS and used a different filesystem (I sadly don't remember which).

 

I recently suspected that copying from the server while the Mover was running could be the issue, but I can now confirm that letting the Mover finish first did not matter one bit.

 

My last suspicion is that my oldest WD drive (4.5+ years) may be the root of all evil, even though I could have sworn that in one of the first occurrences there were no read errors on that disk. Ever since I noticed its age, however, it has always been one of the disks with errors.

I'll be happy to change it if it actually is the culprit.


Terribly sorry for such a long post. I could really use your help.

Diagnostics are attached.

skynet-diagnostics-20230616-0120.zip

Jun 16 01:11:51 SkyNet kernel: hub 4-1.3:1.0: hub_ext_port_status failed (err = -71)
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot reset (err = -71)
### [PREVIOUS LINE REPEATED 4 TIMES] ###
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: Cannot enable. Maybe the USB cable is bad?
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot disable (err = -71)
Jun 16 01:11:51 SkyNet kernel: hub 4-1.3:1.0: hub_ext_port_status failed (err = -71)
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot reset (err = -71)
### [PREVIOUS LINE REPEATED 4 TIMES] ###
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: Cannot enable. Maybe the USB cable is bad?
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot disable (err = -71)
Jun 16 01:11:51 SkyNet kernel: hub 4-1.3:1.0: hub_ext_port_status failed (err = -71)
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot reset (err = -71)
### [PREVIOUS LINE REPEATED 4 TIMES] ###
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: Cannot enable. Maybe the USB cable is bad?
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot disable (err = -71)
Jun 16 01:11:51 SkyNet kernel: hub 4-1.3:1.0: hub_ext_port_status failed (err = -71)
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot reset (err = -71)
### [PREVIOUS LINE REPEATED 4 TIMES] ###
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: Cannot enable. Maybe the USB cable is bad?
Jun 16 01:11:51 SkyNet kernel: usb 4-1.3-port1: cannot disable (err = -71)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jun 16 01:11:51 SkyNet kernel: hub 4-1.3:1.0: hub_ext_port_status failed (err = -71)

 

Please note that we don't recommend USB for array/pool disks, they are very prone to random disconnects and bad at error handling in general.
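To see at a glance which port is throwing these errors, the lines above can be tallied with a short script. This is only a rough sketch in Python: it assumes the line layout shown in this excerpt, including the `### [PREVIOUS LINE REPEATED N TIMES] ###` markers, and the function name `count_usb_errors` is made up for illustration. For reference, errno 71 on Linux is EPROTO (protocol error).

```python
import re
from collections import Counter

# Matches kernel lines that carry an error code, for example:
#   "... kernel: usb 4-1.3-port1: cannot reset (err = -71)"
LINE_RE = re.compile(
    r"kernel: (?:usb|hub) (?P<dev>\S+): .*\(err = (?P<err>-?\d+)\)"
)
# Matches the repeat markers seen in the syslog excerpt.
REPEAT_RE = re.compile(r"### \[PREVIOUS LINE REPEATED (\d+) TIMES\] ###")


def count_usb_errors(lines):
    """Return a Counter keyed by (device, errno) for USB/hub error lines.

    A repeat marker adds its count to the error line directly before it;
    markers that follow a line without an error code are ignored.
    """
    counts = Counter()
    last_key = None
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            last_key = (match.group("dev"), int(match.group("err")))
            counts[last_key] += 1
            continue
        repeat = REPEAT_RE.search(line)
        if repeat and last_key is not None:
            counts[last_key] += int(repeat.group(1))
        else:
            # Any other line breaks the link to the previous error line.
            last_key = None
    return counts


# Hypothetical usage, assuming the syslog was saved as "syslog.txt":
# with open("syslog.txt", encoding="utf-8", errors="replace") as fh:
#     for (dev, err), n in count_usb_errors(fh).most_common():
#         print(dev, err, n)
```

If nearly all errors land on one hub port, that points at the device or cable behind that port rather than at an individual disk.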


Thank you for getting back so quickly!

I did notice those logs but half disregarded them, since it initially worked flawlessly and non-stop for several days in a row. I thought that if it was USB related it might be the flash drive, but I'm a bit out of my depth making such an assumption, and the flash drive is only supposed to be read and written quite infrequently, if I'm not mistaken?

 

Now that I think back I never used the second enclosure (which is daisy chained/connected to the first one via usb-c) back when everything ran as expected. Perhaps it's that, perhaps it's just usb in general. The enclosures only support usb-c though so I'm in a pickle then.
For a server without space for 3.5" drives what would you recommend for an Unraid setup?

 

 


I see. Thank you kindly for excellent service. I greatly appreciate the help.

I'll be looking to set up a SAS solution instead then xD

Edit: I've ultimately decided to go with a new case and a SAS HBA PCIe expansion card. For those interested, the parts I ordered were:

  • Fractal Design Node 804
  • LSI SAS 9300-8i
  • 2x YIWENTEC Mini SAS (SFF-8643) to 4 x SATA - 1m
Edited by SciKo
