Parity drive errors


Solved by trurl


I don't check the dashboard every day, but I checked a couple of weeks ago and found that one of my parity drives had an error and was disabled. I started an extended self-test, and these are the results. I'm looking for advice on whether I should ignore the error or find out whether the drive is covered under warranty for replacement.

 

Thanks in advance.

 

 

ATA Error Count: 11733 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 11733 occurred at disk power-on lifetime: 18079 hours (753 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 50 ff ff ff ef 00  19d+15:20:45.234  READ DMA EXT
  25 00 08 ff ff ff ef 00  19d+15:20:45.234  READ DMA EXT
  35 00 88 ff ff ff ef 00  19d+15:20:45.231  WRITE DMA EXT
  35 00 30 ff ff ff ef 00  19d+15:20:45.229  WRITE DMA EXT
  35 00 40 ff ff ff ef 00  19d+15:20:45.227  WRITE DMA EXT

Error 11732 occurred at disk power-on lifetime: 18079 hours (753 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 c8 ff ff ff ef 00  19d+15:18:55.847  READ DMA EXT
  35 00 18 ff ff ff ef 00  19d+15:18:55.841  WRITE DMA EXT
  35 00 40 ff ff ff ef 00  19d+15:18:55.837  WRITE DMA EXT
  35 00 40 ff ff ff ef 00  19d+15:18:55.833  WRITE DMA EXT
  35 00 40 ff ff ff ef 00  19d+15:18:55.829  WRITE DMA EXT

Error 11731 occurred at disk power-on lifetime: 18059 hours (752 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 f8 ff ff ff ef 00  18d+19:49:36.163  READ DMA EXT
  25 00 88 ff ff ff ef 00  18d+19:49:36.161  READ DMA EXT
  25 00 40 ff ff ff ef 00  18d+19:49:36.158  READ DMA EXT
  25 00 40 ff ff ff ef 00  18d+19:49:36.140  READ DMA EXT
  25 00 80 ff ff ff ef 00  18d+19:49:36.139  READ DMA EXT

Error 11730 occurred at disk power-on lifetime: 18059 hours (752 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 ff ff ff ef 00  18d+19:48:32.973  READ DMA EXT
  25 00 40 ff ff ff ef 00  18d+19:48:32.970  READ DMA EXT
  25 00 40 ff ff ff ef 00  18d+19:48:32.940  READ DMA EXT
  35 00 b0 ff ff ff ef 00  18d+19:48:32.939  WRITE DMA EXT
  ea 00 00 00 00 00 a0 00  18d+19:48:32.873  FLUSH CACHE EXT

Error 11729 occurred at disk power-on lifetime: 18059 hours (752 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 ff ff ff ef 00  18d+19:47:56.650  READ DMA EXT
  ea 00 00 00 00 00 a0 00  18d+19:47:56.636  FLUSH CACHE EXT
  35 00 08 ff ff ff ef 00  18d+19:47:56.636  WRITE DMA EXT
  25 00 40 ff ff ff ef 00  18d+19:47:56.615  READ DMA EXT
  25 00 08 ff ff ff ef 00  18d+19:47:56.614  READ DMA EXT

 

Link to comment
1 minute ago, Nomad32 said:

I don't check the dashboard every day

You must set up Notifications to alert you immediately by email or other agent as soon as a problem is detected. Don't let one unnoticed problem turn into several and end in data loss.

 

That is only a part of the SMART report, and doesn't include the results of the extended test.

 

Attach Diagnostics to your NEXT post in this thread. That will give us that information and a lot more so we can get a more complete understanding of your situation.

Link to comment
  • Solution
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK   001   001   010    NOW  0 (0 6)
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: unknown failure    10%     18606         0

Both of these are a little unusual. The RAW_VALUE on reallocated isn't a simple count like we usually see, but the FAIL column says NOW, so I would say it has failed.

Extended self-test also says failed, but it doesn't know why. Usually we get something like read failure.
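
The reading of the FAIL column can be sketched in Python. This assumes the brief attribute layout shown above (ID, name, flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE) and the usual SMART convention that an attribute is failing when its normalized VALUE is at or below a nonzero THRESH. `SmartAttribute` and `parse_attribute` are hypothetical helpers, not part of smartctl.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    flags: str
    value: int    # normalized current value
    worst: int    # lowest normalized value ever recorded
    thresh: int   # vendor failure threshold
    fail: str     # FAIL column as printed ("NOW", "Past", or "-")
    raw: str      # vendor-specific raw value, kept as text

    def failing_now(self) -> bool:
        # A threshold of 0 is treated as "no threshold" (assumed convention).
        return self.thresh > 0 and self.value <= self.thresh

    def failed_in_past(self) -> bool:
        return self.thresh > 0 and self.worst <= self.thresh


def parse_attribute(line: str) -> SmartAttribute:
    # RAW_VALUE may contain spaces, so split on the first seven gaps only.
    attr_id, name, flags, value, worst, thresh, fail, raw = line.split(None, 7)
    return SmartAttribute(int(attr_id), name, flags,
                          int(value), int(worst), int(thresh), fail, raw)


line = "  5 Reallocated_Sector_Ct   PO--CK   001   001   010    NOW  0 (0 6)"
attr = parse_attribute(line)
print(attr.name, attr.value, attr.thresh, attr.failing_now())
# Reallocated_Sector_Ct 1 10 True
```

The sketch deliberately ignores RAW_VALUE: as noted above, it is not a plain count on this drive, so the normalized VALUE against THRESH is the only comparison that can be made safely.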

 

In any case, yes, it should be replaced.

 

You are also having problems with disk4, and it has a rather high Reallocated count. In fact, I would say that one needs replacing as well.

 

Do any of your other disks show SMART ( 👎 ) warnings on the Dashboard page?

 

What do you check on the Dashboard on those rare occasions?

 

Unrelated: your system share has files on the array, and you should update your plugins.

 

 

 

 

Link to comment

OK, thank you. Disk 4 has shown errors for a while now, but never anything bad enough to make it crap out on me. I think I'll pick up a couple of larger drives to replace the parity drives, and then use the good parity drive to replace disk 4.

 

Oh well - thank you so much for letting me know.

Link to comment
  • 2 weeks later...

I ended up buying two new 14TB drives. I started by removing the dead 12TB parity drive and replacing it with one of the 14TB drives. Unraid automagically started to rebuild parity on the new parity disk.

 

Once that completed, I followed the steps on the parity swap procedure page: https://docs.unraid.net/legacy/FAQ/parity-swap-procedure/

 

For anyone curious: that procedure is for adding a new drive that is larger than your current parity drive. You remove the data drive you're replacing, reassign the current parity drive to that data slot, and then assign the new disk as parity. Unraid first copies parity onto the new disk, then starts the array and rebuilds the data disk's contents onto the old parity drive (now converted to data).
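
As a rough illustration of the slot changes described above, here is a small Python model. It only sketches the bookkeeping and says nothing about how Unraid implements the procedure: the drive labels and the `parity_swap` function are hypothetical, and the single rule it enforces (the new parity drive must not be smaller than the one it replaces) is the precondition mentioned in the post.

```python
def parity_swap(slots: dict[str, str], sizes_tb: dict[str, int],
                data_slot: str, new_drive: str) -> tuple[dict[str, str], str]:
    """Return (updated slot assignments, drive removed from the array)."""
    old_parity = slots["parity"]
    removed = slots[data_slot]
    if sizes_tb[new_drive] < sizes_tb[old_parity]:
        raise ValueError("new parity drive is smaller than the drive it replaces")
    updated = dict(slots)
    updated["parity"] = new_drive      # the new, larger disk takes over parity
    updated[data_slot] = old_parity    # the old parity disk becomes the data disk
    return updated, removed


# Labels are made up for illustration; sizes are the ones mentioned in the thread.
slots = {"parity": "old-12TB-parity", "disk4": "failing-disk4"}
sizes_tb = {"old-12TB-parity": 12, "new-14TB": 14}

updated, removed = parity_swap(slots, sizes_tb, "disk4", "new-14TB")
print(updated)   # {'parity': 'new-14TB', 'disk4': 'old-12TB-parity'}
print(removed)   # failing-disk4
```

The model covers a single parity slot only; a system with two parity drives, like the one in this thread, would repeat the same reasoning for whichever parity slot is being swapped.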

Link to comment
