Disabled disk with red cross - not sure how to proceed next

moleboy · September 16, 2022

Hi,

I shut down my server whilst an electrician was working in the house and then rebooted to find I had a disk that won't mount. I'm hoping I don't need a new disk and haven't lost data. I've read the help and followed the process to Check the filesystem, but I'm not sure how to proceed. Any help greatly appreciated!

Here's what the filesystem check shows, and I've attached my diagnostics zip file:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 2
        - agno = 0
        - agno = 6
        - agno = 5
        - agno = 4
        - agno = 7
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

I have also attached the SMART report for the drive in question and the latest SMART errors. Not sure if they are relevant?

ATA Error Count: 43 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 43 occurred at disk power-on lifetime: 25353 hours (1056 days + 9 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 40 30 38 0d ea 40 08      00:19:32.589  WRITE FPDMA QUEUED
  61 40 48 00 04 28 40 08      00:19:32.589  WRITE FPDMA QUEUED
  61 20 38 48 7e 01 40 08      00:19:32.588  WRITE FPDMA QUEUED
  60 00 28 40 da cc 40 08      00:19:32.588  READ FPDMA QUEUED
  60 00 20 40 d9 cc 40 08      00:19:32.588  READ FPDMA QUEUED

Error 42 occurred at disk power-on lifetime: 1374 hours (57 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 53 30 1f 0f 00 40  Error: ICRC, ABRT 48 sectors at LBA = 0x00000f1f = 3871

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 40 10 0f 00 e0 08   6d+00:47:23.907  WRITE DMA EXT
  25 00 40 d0 33 00 e0 08   6d+00:47:23.093  READ DMA EXT
  25 00 40 90 2e 00 e0 08   6d+00:47:23.089  READ DMA EXT
  25 00 40 50 29 00 e0 08   6d+00:47:23.086  READ DMA EXT
  25 00 40 10 24 00 e0 08   6d+00:47:23.083  READ DMA EXT

Error 41 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 40 28 40 bc a7 40 08   1d+05:20:41.719  WRITE FPDMA QUEUED
  61 40 10 80 54 a8 40 08   1d+05:20:41.719  WRITE FPDMA QUEUED
  61 40 08 40 4f a8 40 08   1d+05:20:41.707  WRITE FPDMA QUEUED
  61 40 00 00 4a a8 40 08   1d+05:20:41.702  WRITE FPDMA QUEUED
  61 40 f8 c0 44 a8 40 08   1d+05:20:41.696  WRITE FPDMA QUEUED

Error 40 occurred at disk power-on lifetime: 29 hours (1 days + 5 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 40 38 00 a8 c9 40 08   1d+04:49:41.264  WRITE FPDMA QUEUED
  61 40 20 40 40 ca 40 08   1d+04:49:41.264  WRITE FPDMA QUEUED
  61 40 18 00 3b ca 40 08   1d+04:49:41.247  WRITE FPDMA QUEUED
  61 40 10 c0 35 ca 40 08   1d+04:49:41.247  WRITE FPDMA QUEUED
  61 40 08 80 30 ca 40 08   1d+04:49:41.237  WRITE FPDMA QUEUED

Error 39 occurred at disk power-on lifetime: 28 hours (1 days + 4 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 40 80 80 cf de 40 08   1d+04:06:32.997  WRITE FPDMA QUEUED
  61 40 68 c0 67 df 40 08   1d+04:06:32.987  WRITE FPDMA QUEUED
  61 40 60 80 62 df 40 08   1d+04:06:32.987  WRITE FPDMA QUEUED
  61 40 58 40 5d df 40 08   1d+04:06:32.968  WRITE FPDMA QUEUED
  61 40 50 00 58 df 40 08   1d+04:06:32.968  WRITE FPDMA QUEUED

mothership-diagnostics-20220916-1117.zip mothership-smart-20220916-1152.zip

JorgeB · September 16, 2022

Sep 16 10:06:23 Mothership kernel: ata8.15: Port Multiplier detaching
Sep 16 10:06:23 Mothership kernel: ata8.00: disabled
Sep 16 10:06:23 Mothership kernel: ata8.01: disabled
Sep 16 10:06:23 Mothership kernel: ata8.02: disabled
Sep 16 10:06:23 Mothership kernel: ata8.00: disabled

We don't recommend using controllers with SATA port multipliers because they are known to cause various issues. Reboot and post new diags after array start.
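For anyone wanting to spot the same pattern in their own syslog, here is a minimal Python sketch that groups libata messages by port and flags ports that logged a port multiplier detach. The function names and the regular expression are my own illustration (not part of Unraid or its diagnostics tooling), and the pattern only assumes the line format shown in the log excerpt above.

```python
import re

# Matches kernel lines such as
# "Sep 16 10:06:23 Mothership kernel: ata8.00: disabled"
# Assumption: the format is the one seen in the excerpt above.
ATA_LINE = re.compile(r"kernel: (ata\d+)\.(\d+): (.+)$")


def summarize_ata_events(lines):
    """Group libata messages by port: {port: [(device, message), ...]}."""
    events = {}
    for line in lines:
        match = ATA_LINE.search(line.strip())
        if match:
            port, device, message = match.groups()
            events.setdefault(port, []).append((device, message))
    return events


def ports_with_multiplier_detach(events):
    """Return the ports that logged a 'Port Multiplier detaching' message."""
    return sorted(
        port
        for port, items in events.items()
        if any("Port Multiplier detaching" in msg for _, msg in items)
    )


log = [
    "Sep 16 10:06:23 Mothership kernel: ata8.15: Port Multiplier detaching",
    "Sep 16 10:06:23 Mothership kernel: ata8.00: disabled",
    "Sep 16 10:06:23 Mothership kernel: ata8.01: disabled",
    "Sep 16 10:06:23 Mothership kernel: ata8.02: disabled",
    "Sep 16 10:06:23 Mothership kernel: ata8.00: disabled",
]

events = summarize_ata_events(log)
affected_ports = ports_with_multiplier_detach(events)
# Unique devices behind ata8 that were reported as disabled.
disabled = sorted({dev for dev, msg in events["ata8"] if msg == "disabled"})
print(affected_ports, disabled)
```

Here every disk behind the same port drops at once, which points at the shared controller path rather than at a single drive.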

moleboy · September 16, 2022

mothership-diagnostics-20220916-1240.zip

Thanks for the reply. Here are the latest diagnostics following a reboot and array start.

It's been a while since I built the machine, but it's a Mini-ITX build, so I'm not sure I had the luxury of multiple ports. It's been working fine for 3-4 years, but I realise it might not be ideal.

JorgeB · September 16, 2022

You can use an add-on controller, just use one without port multipliers. The disk itself looks OK. Replace the SATA cable, since there are some recent UDMA CRC errors. Since the emulated disk is mounting, and assuming its contents look correct, you can rebuild on top:

https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
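To show where the CRC diagnosis comes from, the `ER` value of `84` in the SMART error log posted earlier can be decoded bit by bit. The sketch below is only an illustration: the helper names are mine, and I name only the error register bits I am sure of (ICRC, UNC, IDNF, ABRT), printing any other bit generically. ICRC is an interface CRC error, meaning the data was damaged in transit between disk and controller, which is why the cable is suspected rather than the disk surface.

```python
# Error register bits that smartctl abbreviates as shown.
# Only the bits I am certain of are named; others print as "bitN".
KNOWN_BITS = {
    0x80: "ICRC",  # interface CRC error (cable/connector/controller path)
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_error_register(value):
    """Return the names of the bits set in an ATA error register value."""
    names = []
    for bit in range(7, -1, -1):
        mask = 1 << bit
        if value & mask:
            names.append(KNOWN_BITS.get(mask, f"bit{bit}"))
    return names


def power_on_hours_to_days(hours):
    """Split power-on hours into (days, hours), as smartctl prints them."""
    return divmod(hours, 24)


print(decode_error_register(0x84))    # ER value from the error log above
print(power_on_hours_to_days(25353))  # error 43
print(power_on_hours_to_days(1374))   # error 42
```

Error 43 was logged at 25353 power-on hours and error 42 at 1374 hours, so only the latest entry in that log is recent.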

moleboy · September 16, 2022

Ah brilliant. Thanks very much. I'll do what you suggest and let you know how I get on.

moleboy · September 16, 2022

Just as a follow-on about replacing the controller: the card I currently have is PCI 2.0.

https://www.amazon.co.uk/dp/B005B0A6ZS/ref=as_li_ss_tl?language=en_US&ie=UTF8&linkCode=gg2&linkId=c0893f24a38dd363c84f64c8f1f3ff32&tag=serverforums-20&th=1

Will my system be okay with one of the recommended PCI 3.0 cards?

My build was taken from this guide; it is the 6-bay Mini-ITX NAS (scroll down to see). The pertinent information from the build was:

"The case has room for 6x3.5" HDD as well as 2x2.5" SSD. The motherboard has 4 onboard SATA, so we need to add either a 2 port card (for 6xHDD) or a 4 port card (for 6xHDD + 2xSSD) to take advantage of the rest of the drive bays.

I recommend a MSATA SSD 833 for cache if you’re going to use Unraid. This leaves 6x3.5" for parity and data. I’d only run 1 parity drive in a 6 bay NAS like this. If you go for the 4 port card, you can have 2 more SSDs as unassigned devices. (VM storage, unpack drives, non-parity/non-crucial data, etc.)"

moleboy · September 16, 2022

Link to build

https://forums.serverbuilds.net/t/guide-nas-killer-4-0-fast-quiet-power-efficient-and-flexible-starting-at-125/667/13

JorgeB · September 16, 2022

38 minutes ago, moleboy said:

Will my system be okay with one of the recommended PCI 3.0 cards?

Yes, PCIe 3.0 is backward compatible with older PCIe.
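As a rough illustration of what backward compatible means in practice: a PCIe link runs at the highest generation and lane count that both the card and the slot support. The sketch below is a simplification with hypothetical names, and the example card and slot values are made up for illustration, not taken from this build. The per-lane figures are the usual approximate throughput after encoding overhead.

```python
# Approximate usable throughput per lane in MB/s, after encoding overhead.
PER_LANE_MB_S = {1: 250, 2: 500, 3: 985}


def negotiated_link(card_gen, card_lanes, slot_gen, slot_lanes):
    """Return (generation, lanes, approx MB/s) for a card in a given slot.

    Simplified model: the link uses the lower of the two generations
    and the lower of the two lane counts.
    """
    gen = min(card_gen, slot_gen)
    lanes = min(card_lanes, slot_lanes)
    return gen, lanes, PER_LANE_MB_S[gen] * lanes


# Hypothetical example: a PCIe 3.0 x2 card in a PCIe 2.0 x16 slot.
link = negotiated_link(card_gen=3, card_lanes=2, slot_gen=2, slot_lanes=16)
print(link)
```

In that hypothetical case the card works, but at PCIe 2.0 speed on two lanes, roughly 1000 MB/s shared between all disks on the card.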

moleboy · September 16, 2022

Hi again,

I seem to have more disks unable to mount now after swapping some SATA cables out. I think I'm going to get a new controller to eliminate all possibilities. I've been going round in circles a bit but think this might do the job. I think it matches one of the recommended chipsets although it's quite a bit more expensive than my last one.

What do you think?

Thanks again, and apologies for my lack of technical knowledge

https://www.amazon.co.uk/SATA3-0-Expansion-PCI‑E3-0-Interface-Adapter-default/dp/B097MQ6K31/ref=sr_1_4?crid=2JCG40YXTXV&keywords=Asmedia+ASM1166&qid=1663348894&sprefix=asmedia+asm1166%2Caps%2C53&sr=8-4

JorgeB · September 16, 2022

26 minutes ago, moleboy said:

What do you think?

Should be fine.
