2 disks have failed, and one of them leaves the array stuck at "mounting" even when it is not connected


xoC
Solved by JorgeB


BTW both disk1 & 2 were disabled.

 

If I'm not mistaken, the only way to get them enabled again in the array is to unassign the drives, start the array (maintenance mode here), stop the array, reassign the drives, and start the array again (maintenance mode again here), at which point they appear in blue and a rebuild starts?

Link to comment
48 minutes ago, JorgeB said:

If the disks were disabled, you fixed the filesystem on the actual disk, not the emulated disk, and rebuilding will rebuild the emulated disk on top. Were the emulated disks mounting before rebuilding?

 

One was mounting, and I didn't do anything to that one. The other didn't mount, and the system log said "can't mount disk, run xfs repair because bad primary superblock" or something like that.

 

41 minutes ago, itimpi said:

Did you run it from the GUI without the -n option? The default is to run a read-only check.

 

I ran it with -n, then without -n, and it told me to run it with -L, which I did. After it completed, I ran -n again and there was no error, but the disk was still unmountable. When I ran it from the command line with -L, the disk did mount after that.
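For reference, the order of xfs_repair runs described above can be written out as a short Python sketch. It only builds and prints the command lines and runs nothing. The device path `/dev/md1` is a placeholder assumption, not taken from this thread; the right device depends on the Unraid version and on whether the emulated disk or the physical disk is the target. `-n` (check only) and `-L` (zero the log) are standard xfs_repair options, and `-L` can discard recent metadata changes, so it is normally used only when xfs_repair itself asks for it.

```python
def xfs_repair_steps(device):
    """Return the xfs_repair invocations in the order described in the post."""
    return [
        ["xfs_repair", "-n", device],  # read-only check, modifies nothing
        ["xfs_repair", device],        # actual repair attempt
        ["xfs_repair", "-L", device],  # zero the log, only if the repair asked for it
        ["xfs_repair", "-n", device],  # read-only re-check after the repair
    ]


# "/dev/md1" is a hypothetical device path used purely for illustration.
for step in xfs_repair_steps("/dev/md1"):
    print(" ".join(step))
```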

Link to comment

I'm in maintenance mode; how can I mount an array disk? When not in rebuild, I was mounting via unassigned devices to verify that it mounts, but I don't know how to do that with an array disk.

 

If that helps:

image.png.a35d958cfc13d535b759ec1fbfa7695b.png

image.png.6076439ca32923dcd7dd6c949fe006c9.png

 

Since the rebuild started, the system log has been free of errors.

Link to comment
25 minutes ago, xoC said:

how can I mount an array disk?

You can't, wait for the rebuild to finish and start in normal mode.

 

P.S. The rebuild can be done in normal mode.

 

26 minutes ago, xoC said:

When not in rebuild, I was mounting via unassigned devices

Except for emulated disks, you can only do this if the disks are mounted read-only, or it will invalidate parity, and there is no point in doing it for emulated disks.

Link to comment

So it seems to work with everything plugged back in the way it has been for 2+ years.

Could the corrupted file system have been what was preventing the rebuilds, as it just tried again and again to mount the disk during the rebuild?

 

Anyway, I'll monitor it closely over the next few days. Thanks a lot for your answers.

Link to comment
  • 2 months later...
47 minutes ago, xoC said:

It's one of the controllers on the motherboard (which has 3 controllers managing 10 ports)

6 of those are from the Intel SATA controller, those are fine, 2 are from a JMB controller, they *should* be fine, and the remaining two are from a Marvell controller, and that's where the disabled disk is connected.

Link to comment
  • 2 weeks later...

So I've bought a 6-port card based on the ASM1166 and upgraded its firmware to the latest version.

What's the procedure to migrate the disks?

I have in mind to move all the ones connected to the Marvell & JMicron controllers. Can I just move the 4 disks at the same time, restart, and have them recognized?

Link to comment

So, we're back to SMART errors.

The disk was at "reported uncorrect = 1" when I re-plugged it (before rebuilding the array).
I acknowledged the issue and let the server run; it rebuilt and there was no error.
Overnight I received an email after the parity check saying it failed, with "Disk 5 - ST4000VN006-3CW104_ZW603BKR (sdk) - active 32 C (disk has read errors) [NOK]", and reported uncorrect had gone up to 2.

image.thumb.png.50139862eb6d31add2359d049c514c38.png

 

The system log shows some read errors on disk 5, on sectors close to each other, but no more disk resets with the new controller. It is quite a recent disk, BTW.

 

Nov 26 04:15:21 NAStorm kernel: ata22.00: exception Emask 0x0 SAct 0x7f SErr 0x0 action 0x0
Nov 26 04:15:21 NAStorm kernel: ata22.00: error: { UNC }
Nov 26 04:15:21 NAStorm kernel: I/O error, dev sdk, sector 146077392 op 0x0:(READ) flags 0x0 phys_seg 59 prio class 2
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077328
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077336
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077344
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077352
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077360
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077368
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077376
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077384
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077392
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077400
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077408
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077416
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077424
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077432
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077440
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077448
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077456
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077464
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077472
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077480
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077488
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077496
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077504
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077512
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077520
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077528
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077536
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077544
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077552
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077560
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077568
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077576
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077584
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077592
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077600
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077608
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077616
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077624
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077632
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077640
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077648
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077656
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077664
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077672
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077680
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077688
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077696
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077704
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077712
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077720
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077728
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077736
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077744
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077752
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077760
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077768
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077776
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077784
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077792
Nov 26 04:30:15 NAStorm root: Fix Common Problems: Error: disk5 (ST4000VN006-3CW104_ZW603BKR) has read errors
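
A long run of `md: diskN read error` lines like the one above is easier to judge once it is condensed into a count and a sector range. The Python sketch below is one possible way to do that; the function name and the `step=8` default are assumptions for illustration (8 is simply the spacing between consecutive sector numbers in this excerpt), and the sample holds only the first three lines of the log.

```python
import re

# Matches the md driver lines shown in the log excerpt above.
MD_READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")

# First three md lines copied from the excerpt; paste a full syslog here instead.
SAMPLE = """\
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077328
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077336
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077344
"""


def summarize_read_errors(log_text, step=8):
    """Return {disk: (count, first_sector, last_sector, contiguous)}.

    `step` is the expected gap between consecutive reported sectors
    (assumed to be 8, as observed in this log).
    """
    sectors_by_disk = {}
    for disk, sector in MD_READ_ERROR.findall(log_text):
        sectors_by_disk.setdefault(disk, []).append(int(sector))

    summary = {}
    for disk, sectors in sectors_by_disk.items():
        sectors.sort()
        # Contiguous means every reported sector follows the previous one by `step`.
        contiguous = all(b - a == step for a, b in zip(sectors, sectors[1:]))
        summary[disk] = (len(sectors), sectors[0], sectors[-1], contiguous)
    return summary


for disk, (count, first, last, contiguous) in summarize_read_errors(SAMPLE).items():
    print(f"{disk}: {count} errors, sectors {first}-{last}, contiguous={contiguous}")
```

Run against the full excerpt above, this should report 59 errors on disk5 spanning sectors 146077328 to 146077792 as a single contiguous run, which matches the observation that the failing sectors sit close to each other.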

 

I'm attaching current diagnostics.

nastorm-diagnostics-20231127-1642.zip

Link to comment
