Startech PEXSATA221 and troubles generating parity



Hello,

 

I'm having trouble creating parity in my new unraid server. I'm posting this in a new thread because I have been unable to find a similar one.

I have just moved all my data to my unRAID disks (without parity) and then started building parity. The first time it stopped at roughly 34%. There were messages about errors on parity and some errors in one disk's error counter, the web page became very unresponsive (minutes to refresh), and all my attempts to shut down from the web page or the console failed.

After forcing a shutdown and starting up again, I set all disks to never spin down and started the parity build again. It is now near 45%. When I came back from the sofa the system looked somewhat unresponsive, so I forced a spin-up of all disks. It then wrote a little more, but now it seems to have really stopped (the disk activity light looks off).

I have also reset the counters to watch the activity (but there were no errors in the counters).

 

This is my current status:

 

 

Device  Identification                                         Temp.  Size     Free       Reads  Writes  Errors
parity  WDC_WD20EARX-00PASB0_WD-WCAZA8431017 (sdf) 1953514552  33°C   2 TB     -          0      197     0
disk1   WDC_WD15EARS-00Z5B1_WD-WMAVU2191182 (sdh) 1465138552   30°C   1.5 TB   40.34 GB   204    0       0
disk2   WDC_WD15EADS-00R6B0_WD-WCAVY0249852 (sdg) 1465138552   31°C   1.5 TB   1.05 TB    199    0       0
disk3   WDC_WD20EARS-00MVWB0_WD-WCAZA5876707 (sdd) 1953514552  29°C   2 TB     175.83 GB  210    0       0
disk4   WDC_WD20EARS-00MVWB0_WD-WMAZA0938029 (sda) 1953514552  30°C   2 TB     31.39 GB   209    0       0
disk5   WDC_WD20EARS-00MVWB0_WD-WCAZA5630745 (sdb) 1953514552  28°C   2 TB     291.48 GB  209    0       0
disk6   WDC_WD20EARS-00MVWB0_WD-WMAZA0939580 (sdc) 1953514552  30°C   2 TB     305.77 GB  209    0       0
flash   USB_DISK                                               -      2.03 GB  1.94 GB    145    17      0

 

Array Status

--------------------------------------------------------------------------------

Started

Stop will take the array off-line.

Parity-Sync in progress.

Cancel will stop Parity-Sync.

WARNING: canceling Parity-Sync will leave the array unprotected!

 

Total size:2TB

 

Current position:905.67GB (45%)

Estimated speed:139.35KB/sec

Estimated finish:130644minutes

 

Some minutes after:

 

Total size:2TB

 

Current position:906.29GB (45%)

Estimated speed:1.03 MB/sec

Estimated finish: 17632 minutes
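
As a rough sanity check on those estimates: the remaining time is simply the unsynced size divided by the current speed. The Python sketch below is only an illustration. The function name and the decimal units (1 GB = 1,000,000 KB, 2 TB taken as 2000 GB) are assumptions, not how unRAID itself computes the figure, so the results only land within about 1% of the numbers shown above.

```python
def remaining_minutes(total_gb, position_gb, speed_kb_per_s):
    """Estimate minutes left, assuming decimal units (1 GB = 1,000,000 KB)."""
    remaining_kb = (total_gb - position_gb) * 1_000_000
    return remaining_kb / speed_kb_per_s / 60

# Figures from the two status snapshots above (2 TB assumed to mean 2000 GB).
first = remaining_minutes(2000, 905.67, 139.35)        # GUI reported 130644 minutes
second = remaining_minutes(2000, 906.29, 1.03 * 1000)  # GUI reported 17632 minutes

print(f"first snapshot:  about {first / 60 / 24:.0f} days remaining")
print(f"second snapshot: about {second / 60 / 24:.0f} days remaining")
```

Either figure is far beyond what a healthy sync at tens of MB/s would need, which is why the speed itself points to a problem.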

 

 

My system:

 

As you can see, all the drives are WD (plus one more, formatted NTFS and not mounted, which I am keeping aside until I am sure all the data is safe).

The motherboard is an ASUS E35M1-M PRO (with integrated processor and graphics), with 4 GB of Kingston HyperX RAM and a supplementary SATA controller, a Startech PEXSATA221.

Antec 900 case and Tooq 700 PSU.

This system ran most of these disks under W2k8R2 for months without any issue (although it never let the disks spin down, that's all).

I will leave the system trying to finish while I sleep (it is now 2:34 in the morning, Spanish time, and I'm tired).

 

Thanks.


I'm running last beta version 5.12a?

 

I will send you the logs as soon as I'm back home.

 

The disks have been working without errors for months in a desktop, and for the last 2-3 months in a Windows stripe set. I formatted them in unRAID and copied a large amount of data onto them, also without errors.

Yes, the speed is really slow now; at the beginning it was really fast, around 70 MB per second.

 

Thanks.


mbryanr is right, you have lots and lots of disk errors.  They started on September 30th:

 

Sep 30 21:13:40 Tower kernel: ata8: illegal qc_active transition (00000001->ffffffff)
Sep 30 21:13:40 Tower kernel: ata8.00: exception Emask 0x2 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep 30 21:13:40 Tower kernel: ata8.00: failed command: READ DMA EXT
Sep 30 21:13:40 Tower kernel: ata8.00: cmd 25/00:00:60:3c:24/00:04:00:00:00/e0 tag 0 dma 524288 in
Sep 30 21:13:40 Tower kernel:          res 50/00:00:5f:40:24/00:00:00:00:00/e0 Emask 0x2 (HSM violation)
Sep 30 21:13:40 Tower kernel: ata8.00: status: { DRDY }
Sep 30 21:13:40 Tower kernel: ata8: hard resetting link
Sep 30 21:13:40 Tower kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 30 21:13:40 Tower kernel: ata8.00: configured for UDMA/133
Sep 30 21:13:40 Tower kernel: ata8: EH complete

 

...and they continue up to the present.

 

From earlier in the syslog, I can see that 'ata8' corresponds to this 1.5 TB WD EARS disk:

 

Sep 30 21:11:37 Tower kernel: ata8.00: ATA-8: WDC WD15EARS-00Z5B1, 80.00A80, max UDMA/133

 

First you need to determine the drive letter (SDA, SDB, etc.) of that drive.  Just look at the main page of the unRAID web GUI and you should find it easily.  Second, you need to obtain a SMART report for that drive.  Follow the instructions here: wiki link.  Once you have the SMART report, upload it to this thread and we'll take a look at it.  There's a good chance the drive is dying.
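
For anyone who wants to do the first step from the syslog rather than by eye, here is a minimal Python sketch. It only covers the "which model sits on which ataN port" part. The function name is made up for illustration, and matching the model against the GUI row by swapping spaces for underscores is an assumption based on how the identification strings appear in this thread.

```python
import re

# Matches libata identification lines such as:
#   "... kernel: ata8.00: ATA-8: WDC WD15EARS-00Z5B1, 80.00A80, max UDMA/133"
IDENT_RE = re.compile(r"kernel: (ata\d+)\.\d+: ATA-\d+: ([^,]+), ([^,]+),")

def ports_to_models(lines):
    """Return {port: (model, firmware)} for every identification line found."""
    mapping = {}
    for line in lines:
        match = IDENT_RE.search(line)
        if match:
            port, model, firmware = match.groups()
            mapping[port] = (model.strip(), firmware.strip())
    return mapping

sample = [
    "Sep 30 21:11:37 Tower kernel: ata8.00: ATA-8: WDC WD15EARS-00Z5B1, 80.00A80, max UDMA/133",
    "Sep 30 21:13:40 Tower kernel: ata8: hard resetting link",
]
mapping = ports_to_models(sample)
print(mapping)

# Assumption: the web GUI shows the same model with underscores instead of spaces.
gui_identification = "WDC_WD15EARS-00Z5B1_WD-WMAVU2191182 (sdh)"
model, _firmware = mapping["ata8"]
print(gui_identification.startswith(model.replace(" ", "_")))
```

If two drives of the same model are installed, the model alone is not enough and the serial number has to be compared as well.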


Beat me to it Raj. ;D

 

Looks like the WD 1.5TB drive is sdh from the post above...

 

That may also explain these entries, which kill the SMB share:

Oct  2 00:47:24 Tower emhttp: shcmd (245): :>/etc/samba/smb-shares.conf

Oct  2 00:47:24 Tower emhttp: get_config_idx: fopen /boot/config/shares/32GB.cfg: No such file or directory - assigning defaults


Yes, the two disks that have reported errors are these:

 

disk1 WDC_WD15EARS-00Z5B1_WD-WMAVU2191182 (sdh) 1465138552 27°C 1.5 TB 40.34 GB 1818593 80 0

 

disk2 WDC_WD15EADS-00R6B0_WD-WCAVY0249852 (sdg) 1465138552 30°C 1.5 TB 1.05 TB 1834323 82 0

 

I have attached the SMART reports for both disks (run from unMENU, one of the few things I know how to do :) )

 

Thanks

smart.txt


Device Model:     WDC WD15EARS-00Z5B1

199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       19

 

Bad cabling is the usual suspect for the above, especially since no sectors were pending or reallocated.
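
To make that reasoning concrete, here is a small Python sketch that pulls the raw value out of an attribute row like the one above and applies the same rule of thumb. The function names are hypothetical, the column positions assume the usual smartctl attribute table layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw), and the rule is only a heuristic, not a diagnosis.

```python
def parse_smart_attribute(line):
    """Split one row of the SMART attribute table into a dict.

    Assumes the raw value is a plain integer in the tenth column,
    which holds for the counters used here but not for every attribute.
    """
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "raw": int(fields[9]),
    }

def looks_like_link_problem(crc_errors, pending_sectors, reallocated_sectors):
    """CRC errors with no pending or reallocated sectors point at cabling or power."""
    return crc_errors > 0 and pending_sectors == 0 and reallocated_sectors == 0

row = "199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       19"
crc = parse_smart_attribute(row)
print(crc["name"], crc["raw"])

# Pending and reallocated counts are 0 in the attached report, as noted above.
print(looks_like_link_problem(crc["raw"], pending_sectors=0, reallocated_sectors=0))
```

The raw CRC count never resets, so what matters after swapping cables is whether it keeps increasing.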

 

Line 1701: Sep 30 22:56:14 Tower kernel: ata8: illegal qc_active transition (00000001->ffffffff)

Check your SATA connections first...

http://forums.debian.net/viewtopic.php?f=7&t=56304

 

 

Those can also be attributed to poor power supply voltages, but first things first.

I also saw this against ata8

Line 2039: Oct  1 01:11:06 Tower kernel: ata8.00: device reported invalid CHS sector 0 <which may be the drive not responding properly>

 

 


I have just replaced the cables on both drives with two new ones that were still in their bags, although they could also be old stock.

They are marked:

serial ata 26awg copartner e119932 awm 2725 80°C 30V VW-1

They are the same kind as the other two that were connected.

 

I could purchase some sata cabling tomorrow, but not today (it's 21:43 here).

 

What kind of test could I do now?

 

Thanks.


I started building parity again (I'm not sure whether the short SMART test I had launched had finished, sorry about that).

From unMENU I see this, and it sounds bad (I left out the colors because the formatting came out terrible when I copied them):

 

Oct  3 21:50:28 Tower emhttp: get_config_idx: fopen /boot/config/shares/write.cfg: No such file or directory - assigning defaults (Other emhttp)

Oct  3 21:50:28 Tower emhttp: Restart SMB... (Other emhttp)

Oct  3 21:50:28 Tower emhttp: shcmd (57): killall -HUP smbd (Minor Issues)

Oct  3 21:50:28 Tower emhttp: shcmd (58): ps axc | grep -q rpc.mountd (Other emhttp)

Oct  3 21:50:28 Tower emhttp: _shcmd: shcmd (58): exit status: 1 (Other emhttp)

Oct  3 21:50:28 Tower emhttp: shcmd (59): /usr/local/sbin/emhttp_event svcs_restarted (Other emhttp)

Oct  3 21:50:29 Tower emhttp_event: svcs_restarted (Other emhttp)

Oct  3 21:51:50 Tower kernel: ata7.00: exception Emask 0x32 SAct 0x0 SErr 0x0 action 0xe frozen (Errors)

Oct  3 21:51:50 Tower kernel: ata7.00: irq_stat 0xffffffff, unknown FIS 00000000 00000000 00000000 00000000, host bus  (Drive related)

Oct  3 21:51:50 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:51:50 Tower kernel: ata7.00: cmd 25/00:00:c8:11:88/00:04:00:00:00/e0 tag 0 dma 524288 in (Drive related)

Oct  3 21:51:50 Tower kernel:          res 50/00:00:c7:15:88/00:00:00:00:00/e0 Emask 0x32 (host bus error) (Errors)

Oct  3 21:51:50 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:51:50 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:51:51 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Oct  3 21:51:51 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:51:51 Tower kernel: ata7: EH complete (Drive related)

Oct  3 21:52:59 Tower kernel: mdcmd (42): spinup 1 (Routine)

Oct  3 21:52:59 Tower kernel:  (Routine)

Oct  3 21:54:41 Tower kernel: ata7.00: exception Emask 0x32 SAct 0x0 SErr 0x0 action 0xe frozen (Errors)

Oct  3 21:54:41 Tower kernel: ata7.00: irq_stat 0xffffffff, unknown FIS 00000000 00000000 00000000 00000000, host bus  (Drive related)

Oct  3 21:54:41 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:54:41 Tower kernel: ata7.00: cmd 25/00:00:10:3a:ce/00:04:01:00:00/e0 tag 0 dma 524288 in (Drive related)

Oct  3 21:54:41 Tower kernel:          res 50/00:00:0f:3e:ce/00:00:01:00:00/e0 Emask 0x32 (host bus error) (Errors)

Oct  3 21:54:41 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:54:41 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:54:42 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Oct  3 21:54:42 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:54:42 Tower kernel: ata7: EH complete (Drive related)

Oct  3 21:54:43 Tower kernel: ata7: illegal qc_active transition (00000001->ffffffff) (Drive related)

Oct  3 21:54:43 Tower kernel: ata7.00: exception Emask 0x2 SAct 0x0 SErr 0x0 action 0x6 frozen (Errors)

Oct  3 21:54:43 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:54:43 Tower kernel: ata7.00: cmd 25/00:f8:18:b7:ce/00:02:01:00:00/e0 tag 0 dma 389120 in (Drive related)

Oct  3 21:54:43 Tower kernel:          res 50/00:00:0f:ba:ce/00:00:01:00:00/e0 Emask 0x2 (HSM violation) (Errors)

Oct  3 21:54:43 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:54:43 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:54:43 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Oct  3 21:54:43 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:54:43 Tower kernel: ata7: EH complete (Drive related)

Oct  3 21:57:03 Tower kernel: ata7: limiting SATA link speed to 1.5 Gbps (Drive related)

Oct  3 21:57:03 Tower kernel: ata7.00: exception Emask 0x32 SAct 0x0 SErr 0x0 action 0xe frozen (Errors)

Oct  3 21:57:03 Tower kernel: ata7.00: irq_stat 0xffffffff, unknown FIS 00000000 00000000 00000000 00000000, host bus  (Drive related)

Oct  3 21:57:03 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:57:03 Tower kernel: ata7.00: cmd 25/00:00:10:76:dc/00:04:02:00:00/e0 tag 0 dma 524288 in (Drive related)

Oct  3 21:57:03 Tower kernel:          res 50/00:00:0f:7a:dc/00:00:02:00:00/e0 Emask 0x32 (host bus error) (Errors)

Oct  3 21:57:03 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:57:03 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:57:04 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310) (Drive related)

Oct  3 21:57:04 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:57:04 Tower kernel: ata7: EH complete (Drive related)

Oct  3 21:57:35 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen (Errors)

Oct  3 21:57:35 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:57:35 Tower kernel: ata7.00: cmd 25/00:00:10:96:dc/00:04:02:00:00/e0 tag 0 dma 524288 in (Drive related)

Oct  3 21:57:35 Tower kernel:          res 40/00:00:0f:7a:dc/00:00:02:00:00/e0 Emask 0x4 (timeout) (Errors)

Oct  3 21:57:35 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:57:35 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:57:36 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310) (Drive related)

Oct  3 21:57:36 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:57:36 Tower kernel: ata7: EH complete (Drive related)

Oct  3 21:57:36 Tower kernel: ata7.00: exception Emask 0x10 SAct 0x0 SErr 0x990000 action 0xe frozen (Errors)

Oct  3 21:57:36 Tower kernel: ata7.00: irq_stat 0x00400000, PHY RDY changed (Drive related)

Oct  3 21:57:36 Tower kernel: ata7: SError: { PHYRdyChg 10B8B Dispar LinkSeq } (Errors)

Oct  3 21:57:36 Tower kernel: ata7.00: failed command: READ DMA EXT (Minor Issues)

Oct  3 21:57:36 Tower kernel: ata7.00: cmd 25/00:00:10:96:dc/00:04:02:00:00/e0 tag 0 dma 524288 in (Drive related)

Oct  3 21:57:36 Tower kernel:          res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error) (Errors)

Oct  3 21:57:36 Tower kernel: ata7.00: status: { DRDY } (Drive related)

Oct  3 21:57:36 Tower kernel: ata7: hard resetting link (Minor Issues)

Oct  3 21:57:44 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310) (Drive related)

Oct  3 21:57:44 Tower kernel: ata7.00: configured for UDMA/133 (Drive related)

Oct  3 21:57:44 Tower kernel: ata7: EH complete (Drive related)
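
A log this long is easier to judge when the failures are tallied by type. The Python sketch below counts the error class that libata prints after "Emask" on each "res" line; the function name is hypothetical and the sample is just four lines copied from the excerpt above, so treat it as an illustration rather than a finished tool.

```python
import re
from collections import Counter

# Matches result lines such as:
#   "... kernel:   res 50/00:00:c7:15:88/00:00:00:00:00/e0 Emask 0x32 (host bus error)"
RES_RE = re.compile(r"res [0-9a-f/:]+ Emask 0x[0-9a-f]+ \(([^)]+)\)")

def count_error_classes(lines):
    """Tally the Emask descriptions found on libata 'res' lines."""
    counts = Counter()
    for line in lines:
        match = RES_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    "Oct  3 21:51:50 Tower kernel:          res 50/00:00:c7:15:88/00:00:00:00:00/e0 Emask 0x32 (host bus error) (Errors)",
    "Oct  3 21:54:41 Tower kernel:          res 50/00:00:0f:3e:ce/00:00:01:00:00/e0 Emask 0x32 (host bus error) (Errors)",
    "Oct  3 21:54:43 Tower kernel:          res 50/00:00:0f:ba:ce/00:00:01:00:00/e0 Emask 0x2 (HSM violation) (Errors)",
    "Oct  3 21:57:35 Tower kernel:          res 40/00:00:0f:7a:dc/00:00:02:00:00/e0 Emask 0x4 (timeout) (Errors)",
]
counts = count_error_classes(sample)
for error_class, count in counts.most_common():
    print(f"{error_class}: {count}")
```

A mix of host bus errors, HSM violations and timeouts on a single port, together with the link dropping to 1.5 Gbps, fits a controller, cable or power problem better than bad sectors on the disk.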

 


This is my next step: since I'm going to buy some SATA cables tomorrow, I have created a new configuration using only the disks on the onboard controller that were already in the array (4 drives + parity).

Parity is now being built at 84-114 MB/s.

 

I will see if it finishes without errors as a test for the rest of my drives and for unraid itself.


Hello,

 

I'm having trouble creating parity in my new unraid server.  ...

 

and, then, in a follow-up, ...

 

I'm running last beta version 5.12a?

 

Forgive this old geezer's observation, but wouldn't the OP be better off starting his new unRAID server with v4.7 and, once that appears stable, giving v5.0-b12a a try?

Sounds a little creepy: "beta" hardware with beta software. That's like "nested, circular finger-pointing" waiting to happen.

 

-- UhClem

 


Just to update you: I've had no problems without the controller and the two 1.5 TB drives, so I'm going to return the controller. I have already ordered an Adaptec 1430SA controller which, from what I've seen, is a plug-and-play solution, and I don't want any more issues. It will also give me the possibility to expand the system with more disks :)

The controller will take a few weeks to arrive because I bought it from a distant eBay seller. I'm also going on vacation, so I don't know whether the card will arrive before I leave.


Archived

This topic is now archived and is closed to further replies.
