Help with slow parity build



Hi All,

 

Sorry if this is stupid or obvious, but I'm having some trouble figuring it out. I'm pretty new to unRAID. I've had a series of issues, but the main current issue is that my parity build is going really slowly. 

 

Hardware: I recently changed from an old 2013 Revo Aspire PC connected to an 8-bay Orico enclosure with a USB-C to USB-A cable, to an 8th-gen Intel NUC i5 with a USB-C to USB-C cable, hoping to make everything run a bit snappier since the CPU was always maxed out. (I know external enclosures are not recommended, but that's what I had.) I had a set of drives including 1x 10TB parity, and 2x 10TB, 2x 4TB and 3x 2TB data drives. With the NUC I bought a 480GB SSD to use as cache; I was cacheless before that.

 

Initially things were fine, then the enclosure was unplugged during a data write and a 10TB drive was disabled. I was reasonably convinced that it wasn't damaged, so I unassigned it and reassigned it for a rebuild. This was taking forever. It seemed the total read speed of the array stays constant at about 30-35MB/s, so each drive will read at about 6MB/s and the data drive would write at 6MB/s. Once the 2TB drives are out of the way the remaining drives go up to 8-9MB/s, and once the 4TB drives are done the 10TB drives go at 18-19MB/s. It got to 99.6% complete and my son threw a toy behind the TV and it unplugged again :( Hence why external enclosures aren't recommended.

 

I then tried to do a SMART test on the drive, but even though the drives are active and working, none of the drives in the Orico enclosure appear spun up, so they won't test. I think that's because they all get listed as JMicron Generic 0-7 instead of their actual model (WD__ etc.).

 

So I copied the emulated data off disk3 (the disabled 10TB drive), did a new config and am now rebuilding the parity, thinking that maybe that drive was faulty and slowing things down, but I'm getting the same thing. Currently, the rebuild has been active for 16 hours, is 3.7% complete and has about 16 days to go. Oddly, on the old Revo with the same drives, I would build parity in around 22 hours or so, which seemed normal. Why would the new system be slower?
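
As a sanity check on those numbers, here is a minimal Python sketch of the extrapolation. It assumes the average speed so far simply continues, which is not quite how the build behaves (the per-drive speed rises as the smaller drives finish), and `remaining_hours` is just a name made up for illustration.

```python
def remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: assumes the average speed so far continues."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


# Figures from the post: 16 hours elapsed, 3.7% complete.
hours_left = remaining_hours(16, 0.037)
print(f"about {hours_left / 24:.1f} days left")

# Compare the projected total with the ~22 hours the old PC needed.
slowdown = (16 / 0.037) / 22
print(f"roughly {slowdown:.0f}x slower than before")
```

That lands in the same ballpark as the estimate on the main page, and puts the new setup at something like twenty times slower than the old one with the same drives.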

 

It's all a bit of a mess. Can anyone help clarify where the bottleneck in this system is? I've attached my diagnostics file. I'm sure I've left out some crucial information, so just let me know what I can do to help.

 

Thanks!

blackbox-diagnostics-20200224-2248.zip


EDIT - Just googled and can see the Orico 8 bay (NS800U3, if that's the model) is USB 3.0. I would have assumed that with multiple drives sharing a single USB link, you're possibly saturating it? But from what I can see it's rated for up to 5Gbps.
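
For reference, a rough Python sketch of what those ratings mean in MB/s. The only inputs are the nominal signalling rates (480 Mbit/s for USB 2.0 High Speed, 5 Gbit/s for USB 3.0) and the 8b/10b line encoding USB 3.0 uses; protocol overhead is ignored, so these are upper bounds, and the function name is made up for illustration.

```python
def max_mb_per_s(line_rate_mbit: float, encoding_efficiency: float = 1.0) -> float:
    """Convert a signalling rate in Mbit/s to an upper bound in MB/s."""
    return line_rate_mbit * encoding_efficiency / 8


usb2 = max_mb_per_s(480)            # USB 2.0 High Speed
usb3 = max_mb_per_s(5000, 8 / 10)   # USB 3.0, 8 data bits per 10 line bits

print(f"USB 2.0 ceiling: {usb2:.0f} MB/s")
print(f"USB 3.0 ceiling: {usb3:.0f} MB/s")
```

Real-world bulk transfers over USB 2.0 usually land well below its ceiling, so an array total stuck at 30-35MB/s is much closer to a USB 2.0 link than to anything a 5Gbps link should be limited to.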

 

I'm confused by what you're saying about the rebuilds. unRAID parity will rebuild the one or two disks that are missing:

 

"It seemed the total read speed of the array stays constant at about 30-35MB/s, so each drive will read at about 6MB/s and the data drive would write at 6MB/s. Once the 2TB drives are out of the way the remaining drives go up to 8-9MB/s, and once the 4TB drives are done the 10TB drives go at 18-19MB/s. It got to 99.6% complete and my son threw a toy behind the TV and it unplugged again"

 

I'm a bit confused about the above. I would have assumed that when the array was unplugged and one of the drives was disabled, that one drive would then be rebuilt from the parity drive?

 

Also, are the drives in the enclosure each passed through individually, i.e. with bare metal access by the OS? Or are they in their own kind of array?

 

Sorry, trying to get my head around this...!

 

Edited by sdamaged
Very close, it's the NS800C3 (http://my.orico.cc/goods.php?id=6531), basically the same but USB C instead of B. The NUC has Thunderbolt capability, so I thought in theory it should get decent speeds, 5Gbps for both ends. Again, I'm aware USB isn't the most stable, but I've protected the ports as much as possible now, and I had been using it with Windows and Drivepool for a while with no major issues, and can't afford to change all my hardware. Also, it worked well on the old Acer, so I'm not sure if it is something to do with the NUC or something to do with the cable. I'm not certain of the capability of the cable, so I ordered a Thunderbolt 3 cable with 40Gbps capability to eliminate that as a factor. 

 

Sorry for the confusion about the rebuilds. I've tried a few different things in the last couple of weeks. Initially, just a repair of the disabled drive from parity, but that was taking days to complete and when it was almost done got knocked again (I've since sorted that out). When I tried again it estimated 44 days to rebuild, so I cancelled thinking the drive was screwed, copied the emulated data to another external drive outside the array and did a new config without that drive and another empty 2Tb drive which had increasing errors, and am now rebuilding the parity with 5 hopefully healthy data drives.

 

I'll attach a screenshot of my main page. The drives all seem to be independent with individual bare metal access. The total array read speed stays fairly constant between 30-35MB/s, but the individual disk read speeds change depending on how many are being used to calculate parity, i.e. if the 2TB drives aren't in use, the remaining 3 drives will do 9-10MB/s each, and if the 4TB drives are out the 10TB will do 30-35MB/s. The write speed on the parity will be the same as any individual drive. I don't blame you for struggling to get your head around it, since I still can't either. Is it possible something about the NUC isn't playing well with unRAID, since the Acer used to work?
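
To put numbers on that pattern, here is a minimal Python sketch of a parity build where all data-drive reads share one link with a fixed total throughput. The drive list (1x 10TB, 2x 4TB, 2x 2TB) is my reading of the current array, the 35MB/s figure is the top of the observed range, and the model ignores the parity writes that go over the same cable, so treat it as a rough estimate only. The function name is made up for illustration.

```python
def parity_build_phases(data_drive_sizes_tb, link_mb_s):
    """Split a parity build into phases and estimate each one.

    Assumes every data drive is read in parallel over one shared link whose
    total read throughput is fixed at link_mb_s, and that a drive stops being
    read once the build position passes its capacity.
    Returns a list of (active_drives, per_drive_mb_s, hours) tuples.
    """
    sizes = sorted(data_drive_sizes_tb)
    phases = []
    position_tb = 0.0
    for index, size_tb in enumerate(sizes):
        span_tb = size_tb - position_tb
        if span_tb <= 0:
            continue  # same size as the previous drive, no new phase
        active = len(sizes) - index          # drives still being read
        per_drive = link_mb_s / active       # link is shared equally
        hours = span_tb * 1_000_000 / per_drive / 3600  # 1 TB = 1,000,000 MB
        phases.append((active, per_drive, hours))
        position_tb = size_tb
    return phases


# Assumed current data drives: 1x 10TB, 2x 4TB, 2x 2TB.
phases = parity_build_phases([10, 4, 4, 2, 2], link_mb_s=35)
for active, per_drive, hours in phases:
    print(f"{active} drives reading: {per_drive:.1f} MB/s each, {hours:.0f} h")
total_hours = sum(hours for _, _, hours in phases)
print(f"total: {total_hours / 24:.1f} days")
```

The per-drive speeds this produces (about 7, 12 and 35MB/s) follow the same steps as what I'm seeing, which fits a single shared limit somewhere between the drives and the CPU rather than a problem with any one disk. For comparison, the old 22-hour build of a 10TB parity works out to roughly 126MB/s on average.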

 

Thanks so much for looking at this, I'll definitely buy you a beer if you can solve it (or even if you give it a crack!)

 

Screen Shot 2020-02-25 at 10.01.32 am.png

Screen Shot 2020-02-25 at 10.13.54 am.png

8 hours ago, xMaverickx said:

The drives all seem to be independent with individual bare metal access. The total array read speed stays fairly constant between 30-35MB/s, but the individual disk read speeds change depending on how many are being used to calculate parity, i.e. if the 2TB drives aren't in use, the remaining 3 drives will do 9-10MB/s each, and if the 4TB drives are out the 10TB will do 30-35MB/s.

That suggests the USB link is the bottleneck, as if it's running at USB 2.0 instead of 3.0; 35MB/s is about right for USB 2.0. The correct speed is detected during initialization, at least for the USB controller, but I can't see the actual link speed for the USB devices. Try a different cable/port to see if it makes any difference.
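
If you want to check the negotiated link speed yourself: on Linux the kernel exposes it per USB device in sysfs, as a `speed` file holding the rate in Mbit/s (480 for USB 2.0 High Speed, 5000 for USB 3.0). Below is a minimal Python sketch that reads those files. It assumes you have Python available on the server, and the function names are made up for illustration.

```python
from pathlib import Path


def usb_link_speeds(sysfs_root="/sys/bus/usb/devices"):
    """Return {device_name: (product, speed_mbit)} for USB devices with a speed file."""
    results = {}
    for dev in sorted(Path(sysfs_root).glob("*")):
        speed_file = dev / "speed"
        if not speed_file.is_file():
            continue  # interface entries such as 2-1:1.0 have no speed attribute
        try:
            speed = float(speed_file.read_text().strip())
        except (OSError, ValueError):
            continue
        product = "unknown"
        product_file = dev / "product"
        if product_file.is_file():
            try:
                product = product_file.read_text().strip()
            except OSError:
                pass
        results[dev.name] = (product, speed)
    return results


def describe(speed_mbit):
    """Map a sysfs speed value to the usual USB speed name."""
    if speed_mbit >= 5000:
        return "SuperSpeed (USB 3.x)"
    if speed_mbit >= 480:
        return "High Speed (USB 2.0)"
    return "Full/Low Speed (USB 1.x)"


if __name__ == "__main__":
    for name, (product, speed) in usb_link_speeds().items():
        print(f"{name:10} {product:30} {speed:>7.0f} Mbit/s  {describe(speed)}")
```

If the entry for the enclosure reports 480, the link has fallen back to USB 2.0, and the cable or port is the first thing to swap.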

 

Also, CPU is overheating and throttling down:

 

Feb 24 20:57:42 BlackBox kernel: CPU7: Core temperature above threshold, cpu clock throttled (total events = 27)
Feb 24 20:57:42 BlackBox kernel: CPU3: Core temperature above threshold, cpu clock throttled (total events = 27)
Feb 24 20:57:42 BlackBox kernel: CPU6: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU4: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU0: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU5: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU2: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU1: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 28)
Feb 24 20:57:42 BlackBox kernel: CPU7: Package temperature above threshold, cpu clock throttled (total events = 28)
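
To see how often this happens over a longer period, a small Python sketch like the one below can summarise the throttle messages from a syslog. It only matches the message format shown above, the function name is made up for illustration, and you would feed it the lines of whichever syslog file you have (for example the one inside the diagnostics zip).

```python
import re

# Matches kernel messages of the form shown in the log excerpt above.
THROTTLE_RE = re.compile(
    r"CPU(?P<cpu>\d+): (?P<scope>Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<total>\d+)\)"
)


def summarize_throttling(log_lines):
    """Return {(cpu, scope): highest 'total events' value seen}."""
    summary = {}
    for line in log_lines:
        match = THROTTLE_RE.search(line)
        if match is None:
            continue
        key = (int(match.group("cpu")), match.group("scope"))
        summary[key] = max(summary.get(key, 0), int(match.group("total")))
    return summary


sample = [
    "Feb 24 20:57:42 BlackBox kernel: CPU7: Core temperature above threshold, cpu clock throttled (total events = 27)",
    "Feb 24 20:57:42 BlackBox kernel: CPU3: Core temperature above threshold, cpu clock throttled (total events = 27)",
    "Feb 24 20:57:42 BlackBox kernel: CPU6: Package temperature above threshold, cpu clock throttled (total events = 28)",
]
for (cpu, scope), total in sorted(summarize_throttling(sample).items()):
    print(f"CPU{cpu} {scope}: {total} throttle events so far")
```

A count that keeps climbing during the parity build would point at cooling in the NUC as a second issue, separate from the USB link.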

 

 


Also, I wonder why the drives are showing as JMicron generic instead of the actual drive serials. This suggests to me that perhaps they are not connected bare metal? I still think it's an issue with USB 2.0 though, as said above, especially as the speeds seem about right for USB 2.0. Probably stating the obvious here, but make sure you're connected to one of the blue USB ports.

Edited by sdamaged
It's a USB-C to USB-C cable, but I can't remember where I got it from, so maybe it's a cheap one... We may be on the right track because when it was working better with the old computer it was a different USB-C to USB-A cable. I just assumed a USB-C to USB-C cable would automatically be USB 3.1 and better. When my new cable comes I'll report back. 

 

I don't know why the drives all show as JMicron. They were like that on Windows as well, and I guess it's a product of the enclosure. If I move the same drives to a different 2-bay dock they show up with their proper serials. It's annoying because I can't initiate a SMART test on them, since it says they aren't spun up, but they work and do self-tests periodically.


Found these bits of info online too:

 

"If you're using a cheapo third party USB-C cable, it might only support USB 2.0 High Speed transfer rates (if it doesn't have wires for all the USB 3.1 pins)."

 

"Type C defines a connector. It doesn't define speed. This is like assuming all M.2 slots are faster than SATA. M.2 can also use SATA based drives in addition to NVMe." 

 

I would say mystery solved...!

Edited by sdamaged
Yep, I bit the bullet and cancelled the parity build that had been going for 2 days (that was hard) and plugged in the old USB-C to USB-A 3.0 cable and it's going 10x faster. Thanks a lot for your help. How can I send you some beer money?

Edited by xMaverickx
  • 2 years later...

Hi, sorry to resurrect an old thread. 

 

I have bought the Orico 8 bay (NS800U3) which is the USB3.0 version of the unit discussed in the thread above.

 

I’m experiencing all the symptoms of slowness in parity rebuilds (such as 5.7MB/s for 7 days) and the drives are all being identified as:

Jmicron_Generic_Disk00
Jmicron_Generic_Disk01
Jmicron_Generic_Disk02
Jmicron_Generic_Disk03
Jmicron_Generic_Disk04
Jmicron_Generic_Disk05
Jmicron_Generic_Disk06
Jmicron_Generic_Disk07

 

To note: if you shut down the server, remove ANY one of the 8 disks and restart, the Orico drive bay reallocates the remaining drives so that they are seen as Disk00 to Disk06 (Disk07 no longer exists)...

 

... when I remove the disk I assigned to Drive4 (Disk04) in the array I see that Unraid reports Drive7 (Disk07) is missing due to the Orico reassignment, which I think is kicking off another 7 to 10 day rebuild. 

 

Since the S.M.A.R.T. data is visible in the “Identify” tab, I would like to know if there’s a way to set Unraid to use the device serial number to identify the disks within the Orico unit, just as it does when a drive is directly connected to a SATA port on the motherboard?

 

This way I can be sure Unraid is seeing the correct drive(s) as present or missing, instead of relying on the shifting JMicron_Generic names.
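
In the meantime, one way to at least confirm which physical drive sits behind each DiskNN name is to read the identity block that `smartctl -i` prints. The Python sketch below is only a diagnostic aid and does not change how Unraid itself identifies the disks. Whether `-d sat` is the right device type for this JMicron bridge is an assumption on my part, the function names are made up for illustration, and the device path in the usage comment is a placeholder.

```python
import subprocess


def parse_identity(smartctl_output: str) -> dict:
    """Pick the model and serial lines out of `smartctl -i` text output."""
    wanted = ("Device Model", "Serial Number")
    fields = {}
    for line in smartctl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            fields[key.strip()] = value.strip()
    return fields


def read_identity(device: str) -> dict:
    """Run smartctl against one device and return its model and serial.

    "-d sat" asks smartctl to talk to the drive through the USB bridge using
    SCSI-to-ATA translation; that this works for this enclosure is an assumption.
    """
    result = subprocess.run(
        ["smartctl", "-i", "-d", "sat", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_identity(result.stdout)


# Usage (placeholder device path, run as root):
#   print(read_identity("/dev/sdX"))
```

Noting the serials down before pulling a disk makes it possible to check, after the restart, which drive each renumbered name now refers to.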

 

 

 

 
