Unraid 6.11: Disk "disappeared" during upgrade from a smaller to a larger drive. Restarted, put the drive in a different port, and now the drive is emulated with no way to rebuild the array


Unraid 6.11

Upgrading from a WD 6TB drive to a 22TB drive.
Stopped host.

Replaced the 6TB WD drive with the 22TB drive in the same physical drive bay.
Powered on. Selected the new drive and let it start the rebuild.
Shortly after, it reported READ errors and said it couldn't find the drive; the drive was then listed under Unassigned Devices.

Captured the diagnostics (attached).
Powered off.

Moved the 22TB drive to a new bay.

Now Unraid shows a red cross next to the 22TB drive and says that it is emulated (the 6TB drive's contents, or 5.78TB or so).

  • How do I get my array to reconstruct (rebuild) onto the 22TB drive?
  • I couldn't find any other option, so it's currently doing a Read-Check?
  • Is it possible to reconstruct at all?
  • Do I need to create a new array (New Config with the 22TB drive already added), wait the 1.5 days for it to build parity, and then copy back the content from the 6TB drive at about 30MB/s 😐 versus 150-200MB/s for a rebuild?
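
To put the two speeds in the last question side by side, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1,000,000 MB) and a constant transfer rate, and it only covers moving the roughly 5.78TB of data; parity sync or rebuild time is not modelled. The function name `transfer_hours` is made up for this illustration.

```python
def transfer_hours(size_tb, rate_mb_s):
    """Hours needed to move size_tb terabytes at rate_mb_s MB/s (decimal units)."""
    size_mb = size_tb * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    return size_mb / rate_mb_s / 3600


if __name__ == "__main__":
    # Rates quoted in the post: ~30 MB/s manual copy vs 150-200 MB/s rebuild.
    for rate in (30, 150, 200):
        hours = transfer_hours(5.78, rate)
        print(f"{rate:>3} MB/s: {hours:5.1f} h ({hours / 24:.1f} days)")
```

Under those assumptions the manual copy comes out at a little over two days, against roughly 8 to 11 hours at the rebuild speeds.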
     

Hoping that someone has a suggestion that will save me days of manual copying.

Thanks for your attention :)

ctu-diagnostics-20230609-1434.zip

Thanks for that 🙂
It's not easy to locate the required files on the Broadcom site, but I think I have them now.

How do I know if I need the IR or IT version?

Using the command ./sas2flash -list -c 0
yielded this:

Firmware Product ID            : 0x2713 (IR)

 

So they are all IR.

Should I also update the BIOS at the same time, as that is ancient too? Do the firmware and BIOS need to match?

I have 3 cards, all with identically ancient versions:

./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

    Adapter Selected is a LSI SAS: SAS2008(B2)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:01:00:00
1  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:02:00:00
2  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:03:00:00

 

 

Firmware and NVDATA appear to have been updated. The BIOS flash SAID it was successful, but the listing shows that the BIOS did NOT update 😐

 

./sas2flash -o -f 2118ir.bin -b x64sas2.rom
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

	Advanced Mode Set

	Adapter Selected is a LSI SAS: SAS2008(B2)

	Executing Operation: Flash Firmware Image

		Firmware Image has a Valid Checksum.
		Firmware Version 20.00.07.00
		Firmware Image compatible with Controller.

		Valid NVDATA Image found.
		NVDATA Version 14.01.00.00
		Checking for a compatible NVData image...

		NVDATA Device ID and Chip Revision match verified.
		NVDATA Versions Compatible.
		Valid Initialization Image verified.
		Valid BootLoader Image verified.

		Beginning Firmware Download...
		Firmware Download Successful.

		Verifying Download...

		Firmware Flash Successful.

		Resetting Adapter...
		Adapter Successfully Reset.

	Executing Operation: Flash BIOS Image

		Validating BIOS Image...

		BIOS Header Signature is Valid

		BIOS Image has a Valid Checksum.

		BIOS PCI Structure Signature Valid.

		BIOS Image Compatible with the SAS Controller.

		Attempting to Flash BIOS Image...

		Verifying Download...

		Flash BIOS Image Successful.

		Updated BIOS Version in BIOS Page 3.

	Finished Processing Commands Successfully.
	Exiting SAS2Flash.
root@ctu:/lsi# ./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

	Adapter Selected is a LSI SAS: SAS2008(B2)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2008(B2)     20.00.07.00    14.01.00.09    07.11.00.00     00:01:00:00
1  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:02:00:00
2  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:03:00:00
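
The listing above shows only controller 0 on the new firmware. A minimal Python sketch for picking out the controllers that still need flashing is given here. It assumes the `-listall` table keeps the column layout pasted in this thread; `parse_listall` and `needs_flash` are names made up for the example, and the script only prints candidate commands, it does not run them. The printed commands reuse the flags and file names from the post and add `-c` to select a controller, as in the earlier `-list -c 0` call.

```python
import re

# One table row: index, controller, FW version, NVDATA, x86-BIOS, PCI address.
ROW = re.compile(
    r"^\s*(?P<num>\d+)\s+(?P<ctlr>\S+)\s+(?P<fw>\d+(?:\.\d+){3})\s+"
    r"(?P<nvdata>\d+(?:\.\d+){3})\s+(?P<bios>\d+(?:\.\d+){3})\s+(?P<pci>\S+)\s*$"
)


def parse_listall(text):
    """Return one dict per controller row found in `sas2flash -listall` output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            row = match.groupdict()
            row["num"] = int(row["num"])
            rows.append(row)
    return rows


def needs_flash(rows, target_fw):
    """Indices of controllers whose firmware version differs from target_fw."""
    return [row["num"] for row in rows if row["fw"] != target_fw]


# Table copied from the listing in this post.
SAMPLE = """\
Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2008(B2)     20.00.07.00    14.01.00.09    07.11.00.00     00:01:00:00
1  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:02:00:00
2  SAS2008(B2)     07.00.00.00    07.00.00.03    07.11.00.00     00:03:00:00
"""

if __name__ == "__main__":
    for idx in needs_flash(parse_listall(SAMPLE), "20.00.07.00"):
        # Only printed, never executed: review before running anything.
        print(f"./sas2flash -o -c {idx} -f 2118ir.bin -b x64sas2.rom")
```

One possible reason the x86-BIOS column stayed at 07.11.00.00, not confirmed anywhere in this thread: x64sas2.rom is, as far as I know, the UEFI boot image, whereas the x86-BIOS column reports the legacy option ROM, which usually ships under a different file name in the same firmware package.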

 

Edited by belorion
command output from firmware and BIOS attempt
The firmware on all 3 SAS adapters is now updated, but nothing has changed: it still pauses when attempting to rebuild the array. I will attempt to update the BIOS as well tomorrow with a boot disk / USB. I've attached the diagnostic logs.

 

./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

	Adapter Selected is a LSI SAS: SAS2008(B2)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2008(B2)     20.00.07.00    14.01.00.09    07.11.00.00     00:01:00:00
1  SAS2008(B2)     20.00.07.00    14.01.00.09    07.11.00.00     00:02:00:00
2  SAS2008(B2)     20.00.07.00    14.01.00.09    07.11.00.00     00:03:00:00

	Finished Processing Commands Successfully.
	Exiting SAS2Flash.

 

ctu-diagnostics-20230610-2001.zip

Many different combinations have been attempted.

I determined that 1 x 22 TB drive was on the onboard SATA controllers and 1 x 22 TB drive was on the LSI controllers. So I moved ALL the drives off the onboard controllers and onto LSI (I only have 16 drives in 24 bays total). That failed in the same way.

 

I swapped the bays of the 22TB drives around, i.e. the good one with the bad one. The bad one still dropped off even when it was in the "good" bay.

 

I tried taking all drives off the internal SATA controller and reverting to the 6TB drive that I had been trying to upgrade to 22TB. That still failed.

 

Putting ONLY the good 22TB drive (parity) on the onboard SATA controller worked, with the 6TB drive in its original (LSI) bay. It took almost 2 days to recompute parity (as all the above attempts had left parity mangled 😐). The 6TB drive did have 98 read errors whilst reconstructing parity, and it failed its SMART test due to the number of online hours (1592 days). This was one of the reasons I was doing this (replacing older drives with bigger ones). This is my "other" Unraid server, made from the leftovers of the main one, i.e. reusing older drives.

 

I am now trying to replace the 6TB drive with the second 22TB WD drive on the onboard SATA controller. It has been going for 1.5 hours and is projected to take 2 more days.

 

I haven't upgraded the BIOS yet (my USB sticks weren't recognised, so I ordered some more from Amazon). I will attempt it after the 22TB drive is fully accepted into the array.

 

 

SMART ERROR MESSAGE

======================

Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 38221 hours (1592 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.
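
The numbers in the SMART excerpt above can be checked with a few lines of Python. This is only an arithmetic sketch: the idea that the 49.710-day wrap comes from a 32-bit millisecond counter is an assumption that happens to match the figure, and `hours_to_days` is a name made up here. Note that the "power-on lifetime" line records when the error was logged; on its own it does not say what caused it.

```python
def hours_to_days(hours):
    """Split a power-on hour count into (whole days, remaining hours)."""
    return divmod(hours, 24)


# Assumption: Powered_Up_Time is a 32-bit millisecond counter, so it wraps at 2**32 ms.
MS_PER_DAY = 1000 * 60 * 60 * 24
WRAP_DAYS = 2**32 / MS_PER_DAY

if __name__ == "__main__":
    days, hours = hours_to_days(38221)
    print(f"38221 hours = {days} days + {hours} hours")
    print(f"32-bit millisecond counter wraps after {WRAP_DAYS:.3f} days")
```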

  • 2 weeks later...

As the bot is asking to close this: it isn't done yet. I've managed to get the system working with a reduced number of disks and with the 22TB drives on the onboard motherboard SATA controllers. Copying the content back is extremely slow at 25-35MB/s, hence the long delay between posts.

 

I am now attempting to *gently* add the drives back in one at a time, as it's a 2-day hit for the parity sync / calculation each time. I say gently because I'm using reconstruct write to simulate a large load on the system (power draw). I've also put one of the drives back into the machine without including it in the array and am copying its contents back over the network (trying to stress it). It's giving me link errors but keeps running, just slowly:

 

Jun 25 18:38:24 ctu kernel: ata3: EH complete
Jun 25 18:44:50 ctu kernel: ata3.00: exception Emask 0x10 SAct 0x40000000 SErr 0x4890000 action 0xe frozen
Jun 25 18:44:50 ctu kernel: ata3.00: irq_stat 0x08400040, interface fatal error, connection status changed
Jun 25 18:44:50 ctu kernel: ata3: SError: { PHYRdyChg 10B8B LinkSeq DevExch }
Jun 25 18:44:50 ctu kernel: ata3.00: failed command: READ FPDMA QUEUED
Jun 25 18:44:50 ctu kernel: ata3.00: cmd 60/00:f0:00:fe:d7/04:00:1d:03:00/40 tag 30 ncq dma 524288 in
Jun 25 18:44:50 ctu kernel:         res 40/00:00:00:fe:d7/00:00:1d:03:00/40 Emask 0x10 (ATA bus error)
Jun 25 18:44:50 ctu kernel: ata3.00: status: { DRDY }
Jun 25 18:44:50 ctu kernel: ata3: hard resetting link
Jun 25 18:44:52 ctu kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jun 25 18:44:52 ctu kernel: ata3: hard resetting link
Jun 25 18:44:57 ctu kernel: ata3: link is slow to respond, please be patient (ready=0)
Jun 25 18:45:02 ctu kernel: ata3: COMRESET failed (errno=-16)
Jun 25 18:45:02 ctu kernel: ata3: hard resetting link
Jun 25 18:45:06 ctu kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 25 18:45:06 ctu kernel: ata3.00: configured for UDMA/133
Jun 25 18:45:06 ctu kernel: ata3: EH complete
Jun 25 20:42:57 ctu kernel: ata3.00: exception Emask 0x10 SAct 0x3000 SErr 0x4890000 action 0xe frozen
Jun 25 20:42:57 ctu kernel: ata3.00: irq_stat 0x08400040, interface fatal error, connection status changed
Jun 25 20:42:57 ctu kernel: ata3: SError: { PHYRdyChg 10B8B LinkSeq DevExch }
Jun 25 20:42:57 ctu kernel: ata3.00: failed command: READ FPDMA QUEUED
Jun 25 20:42:57 ctu kernel: ata3.00: cmd 60/00:60:e8:9e:64/04:00:2a:03:00/40 tag 12 ncq dma 524288 in
Jun 25 20:42:57 ctu kernel:         res 40/00:00:e8:a2:64/00:00:2a:03:00/40 Emask 0x10 (ATA bus error)
Jun 25 20:42:57 ctu kernel: ata3.00: status: { DRDY }
Jun 25 20:42:57 ctu kernel: ata3.00: failed command: READ FPDMA QUEUED
Jun 25 20:42:57 ctu kernel: ata3.00: cmd 60/60:68:e8:a2:64/01:00:2a:03:00/40 tag 13 ncq dma 180224 in
Jun 25 20:42:57 ctu kernel:         res 40/00:00:e8:a2:64/00:00:2a:03:00/40 Emask 0x10 (ATA bus error)
Jun 25 20:42:57 ctu kernel: ata3.00: status: { DRDY }
Jun 25 20:42:57 ctu kernel: ata3: hard resetting link
Jun 25 20:42:59 ctu kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jun 25 20:42:59 ctu kernel: ata3: hard resetting link
Jun 25 20:43:04 ctu kernel: ata3: link is slow to respond, please be patient (ready=0)
Jun 25 20:43:09 ctu kernel: ata3: COMRESET failed (errno=-16)
Jun 25 20:43:09 ctu kernel: ata3: hard resetting link
Jun 25 20:43:13 ctu kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 25 20:43:13 ctu kernel: ata3.00: configured for UDMA/133
Jun 25 20:43:13 ctu kernel: ata3: EH complete
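
To see at a glance which port is affected and how often the link is being reset, the syslog excerpt above can be tallied per ATA port. A minimal sketch follows, assuming the message wording stays exactly as in this excerpt; the event names and the `summarize` helper are made up for the example. Errors of this kind (interface fatal error, ATA bus error, COMRESET failed) are often associated with cabling, backplane, or power problems rather than bad sectors, but the log alone does not prove that.

```python
import re
from collections import Counter

# Matches e.g. "kernel: ata3: ..." and "kernel: ata3.00: ...", capturing the port.
LINE = re.compile(r"kernel: (?P<port>ata\d+)(?:\.\d+)?: (?P<msg>.*)$")

# Event label -> substring to look for (wording taken from the excerpt above).
EVENTS = {
    "exception": "exception Emask",
    "failed_command": "failed command",
    "hard_reset": "hard resetting link",
    "link_down": "SATA link down",
    "comreset_failed": "COMRESET failed",
}


def summarize(log_text):
    """Count link-related events per ATA port in a syslog excerpt."""
    counts = {}
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        for name, needle in EVENTS.items():
            if needle in match.group("msg"):
                counts.setdefault(match.group("port"), Counter())[name] += 1
    return counts


# A few lines copied from the first incident in the excerpt.
SAMPLE = """\
Jun 25 18:44:50 ctu kernel: ata3.00: exception Emask 0x10 SAct 0x40000000 SErr 0x4890000 action 0xe frozen
Jun 25 18:44:50 ctu kernel: ata3.00: failed command: READ FPDMA QUEUED
Jun 25 18:44:50 ctu kernel: ata3: hard resetting link
Jun 25 18:44:52 ctu kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jun 25 18:44:52 ctu kernel: ata3: hard resetting link
Jun 25 18:45:02 ctu kernel: ata3: COMRESET failed (errno=-16)
Jun 25 18:45:02 ctu kernel: ata3: hard resetting link
Jun 25 18:45:06 ctu kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
"""

if __name__ == "__main__":
    for port, events in summarize(SAMPLE).items():
        print(port, dict(events))
```

Running it against the full syslog from the diagnostics zip, instead of the short sample, would show whether only ata3 is affected or whether other ports see the same resets.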

 
