How do I determine which disk is ata8?



Hopefully this is a simple question, but so far I've not found a way, short of watching the unRAID boot-up messages, jotting things down quickly and hoping for the best. One of my disks - ATA8 - is continually showing error messages in the unmenu syslog snapshots.

e.g.

Feb 9 15:43:17 MediaServer kernel: ata8.01: status: { DRDY DF }

Feb 9 15:43:17 MediaServer kernel: ata8.01: hard resetting link

Feb 9 15:43:17 MediaServer kernel: ata8.01: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Feb 9 15:43:17 MediaServer kernel: ata8.01: configured for UDMA/33

Feb 9 15:43:17 MediaServer kernel: ata8: EH complete

 

Also, one of my disks, and I don't know which one, is clicking a lot during parity syncs and checks, which slow to a crawl while the clicking is happening. While there are clicks, there are numerous errors attributed to disk ATA8. I'd like to know which one it is so I can change the cables, copy the data off the disk, replace it etc. Is there a simple way to determine which one it is?

 

I'm running 4.6 rc5 and attached is a screenshot from unmenu. Thanks for any help.

 

 

[EDIT] From what I recall of the boot-up spew, it may be one of the 1 TB WD EACS drives, but even then, there are four of them (disks 14-17).

unRAIDFeb9th2011.jpg.9bae891f14e6677ef5e3ca72394a0bb4.jpg

Link to comment

I have two hot-swap 5-bay enclosures (disks 11-20), but I've currently removed disks 12 and 13. All my drives have their serial numbers written and taped on the sides, so it's not hard to know which disk is where. The problem is that I know each disk's serial number, sd(x) designation and md(y) designation. It's the ATA(z) reference that's eluding me.

Part one of syslog attached (all up to login); more available if needed.

UnRAIDSyslogFeb9-11a.txt

Link to comment

Determining ata numbers and relating them to the drive numbers has become much harder with recent kernels, especially with the advent of the SAS cards.  They are often scrambled, with no relation to the SCSI IDs.
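For what it's worth, on newer kernels the resolved sysfs path of a block device usually contains the ataN port name, which gives a rough way to relate sd(x) to ata(z). The sketch below assumes that layout; older kernels (possibly including the one in unRAID 4.6) may not expose it, and drives on SAS cards will simply return nothing. The function names are my own, not part of any tool, and it only recovers the port (ata8), not the device behind a port multiplier (ata8.01).

```python
import os
import re

# Assumption: the resolved sysfs path looks like
# /sys/devices/pci.../ata8/host7/target7:1:0/7:1:0:0/block/sdj
ATA_IN_PATH = re.compile(r"/(ata\d+)/")

def ata_port_from_sysfs_path(path):
    """Return the 'ataN' component of a resolved sysfs path, or None if absent."""
    match = ATA_IN_PATH.search(path)
    return match.group(1) if match else None

def map_block_devices(sys_block="/sys/block"):
    """Map each sdX name to its ata port (or None) by resolving sysfs symlinks."""
    mapping = {}
    for name in sorted(os.listdir(sys_block)):
        if name.startswith("sd"):
            real = os.path.realpath(os.path.join(sys_block, name))
            mapping[name] = ata_port_from_sysfs_path(real)
    return mapping
```

Calling `map_block_devices()` on the server would return something like a dict of sdX names to ataN strings, with `None` for any drive whose path has no ata component.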

 

You should probably not refer to it as ATA8 (capitalized), because ATA-8 is a revision of the ATA standard, not a channel; it is what the drive reports in the syslog line "ATA-8: WDC WD20EADS-00S2B0, 01.00A01, max UDMA/133".  It should be referred to as ata8 (without the caps), as the syslog does.  As the symbol ata8, it refers to a channel, in this case the channel to one of your port multipliers and the 5 SATA ports on it.  More accurately, the drive you are asking about is connected to ata8.01, the second of the 5 possible drives on ata8, the others being ata8.00, ata8.02, ata8.03, and ata8.04.  This port multiplier's set of drives begins with 2 WD20EADS, then an empty port, then 2 WD10EACS, so your drive is the second WD20EADS, which is sdj, serial ending in 7974.

 

But that seems strange as it is the only drive that is NOT in the array, is not even the Cache drive, and therefore cannot be the drive affecting your parity syncs and checks.  I did notice an ICRC in the 6 syslog lines showing in your screen pic, so your first step should be to replace the SATA cable to sdj, then make sure it is well seated in that drive enclosure.

 

The easiest way to find the clicking bad drive would be to get SMART reports for all of the drives.  An even faster way is to use the MyMain screen of UnMENU, and change to the SMART display, the one that shows critical SMART values, and red-flags any that are serious.  I suspect the bad clicking drive will be very obvious!
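If you would rather check the SMART reports by hand, here is a minimal sketch that scans the attribute table printed by `smartctl -A /dev/sdX` and reports a few attributes with non-zero raw values. The shortlist of attributes is my own assumption about what is worth watching, not the set that MyMain actually red-flags, and `flag_smart` is a made-up name.

```python
# Assumption: these are the attributes worth watching; adjust to taste.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

def flag_smart(report_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged
```

Save each drive's `smartctl -A` output to a file, pass the text to `flag_smart`, and any drive that comes back with a non-empty result deserves a closer look.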

Link to comment

Feb 9 15:43:17 MediaServer kernel: ata8.01: status: { DRDY DF }

Feb 9 15:43:17 MediaServer kernel: ata8.01: hard resetting link

Feb 9 15:43:17 MediaServer kernel: ata8.01: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Feb 9 15:43:17 MediaServer kernel: ata8.01: configured for UDMA/33

Feb 9 15:43:17 MediaServer kernel: ata8: EH complete

 

 

Feb  9 15:32:14 MediaServer kernel: ata8.01: ATA-8: WDC WD20EADS-00S2B0, 01.00A01, max UDMA/133

Feb  9 15:32:14 MediaServer kernel: ata8.01: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)

Feb  9 15:32:14 MediaServer kernel: ata8.01: configured for UDMA/100

 

 

Feb  9 15:32:14 MediaServer emhttp: pci-0000:02:00.0-scsi-0:0:0:0 host2 (sdi) WDC_WD20EADS-00S2B0_WD-WCAVY1606662

Feb  9 15:32:14 MediaServer emhttp: pci-0000:02:00.0-scsi-0:1:0:0 host2 (sdj) WDC_WD20EADS-00S2B0_WD-WCAVY1937974
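
The lookup above was done by reading the syslog by eye. A rough sketch of the same idea in Python follows: it collects the model string from each "ataN.MM: ATA-x:" identify line and counts "hard resetting link" events per ata device. The regular expressions are based only on the lines quoted in this thread, so treat them as assumptions about the log format, and `summarize_ata` is a hypothetical helper name.

```python
import re
from collections import Counter

# Identify line, e.g.
# "... kernel: ata8.01: ATA-8: WDC WD20EADS-00S2B0, 01.00A01, max UDMA/133"
IDENTIFY_RE = re.compile(r"kernel: (ata\d+\.\d+): ATA-\d+: ([^,]+),")
# Link reset line, e.g. "... kernel: ata8.01: hard resetting link"
RESET_RE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): hard resetting link")

def summarize_ata(lines):
    """Return (models, resets): model per ata device and reset count per ata device."""
    models = {}
    resets = Counter()
    for line in lines:
        match = IDENTIFY_RE.search(line)
        if match:
            models[match.group(1)] = match.group(2).strip()
            continue
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
    return models, resets

# Usage (assuming the syslog lives at /var/log/syslog):
#   with open("/var/log/syslog") as f:
#       models, resets = summarize_ata(f)
```

This only ties an ata device to a model; where two drives share a model, as with the two WD20EADS here, the emhttp lines are still needed to pick out the serial number.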

 

Link to comment

Thanks, Rob.

I think I may have done us all a disservice. The ata8 drive is probably not the cause of my bigger problems. The drive ending in 7974 was just put into the port multiplying enclosure for that particular reboot. It was being precleared and put into service as disk 12, to replace one of the missing drives. I've learned a lot from your analysis of the syslog.

 

Doing what you suggested in unmenu seems to show that the parity drive itself is the worst of the bunch. It's just a few weeks old, but it's one of the problematic Seagates which need a firmware upgrade.

Included is a screenshot. I'm also not enjoying the dark colour of some of the other drives.

 

 

 


unRaidFeb9th2011unmenupic.jpg.4de00cfdecd981d5f6b0eb60aaabb722.jpg

Link to comment

Thanks very much!

 

 


 
