
Array Starting - Mounting disks seems totally stuck after update



Hello

I am hoping someone can help me get my system up and running again.
After updating to 6.12.8 and restarting Unraid, I see 3 disks with SMART errors, and the array is stuck at mounting disks, so it never comes online. I powered the system off and checked for loose cables or other evidence of a physical problem, but I could not see anything out of the ordinary.

I don't know where to start looking for a solution. I have seen in other posts that the diagnostics file is required, so I have generated one and attached it (I hope).

Also, I don't know if my system had errors prior to the update, as I am not monitoring it regularly. It's been running steadily for several years until today.

I hope someone can see the issue and point me in the right direction.

Thanks in advance
 

musse-unraid-diagnostics-20240301-2126.zip

Link to comment

I will take your advice and set up Notifications. When it's just running, it's easy to forget about it.
I have started an extended SMART self-test right now.

Would it be time to get a new set of disks?

Link to comment
Posted (edited)

Is the SMART test likely to solve the issue or should I expect to do something else after the test?

It's only 10% done so I expect that it will take quite some time to complete the test.
Is there anything I can do in the meantime to help solve the issue?

The log file seems stuck in this loop:
Mar  1 22:23:50 Musse-unRAID kernel: ata2: EH complete
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: irq_stat 0x40000008
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: failed command: READ FPDMA QUEUED
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: cmd 60/c0:90:20:ed:66/00:00:57:00:00/40 tag 18 ncq dma 98304 in
Mar  1 22:23:53 Musse-unRAID kernel:         res 41/40:00:20:ed:66/00:00:57:00:00/40 Emask 0x409 (media error) <F>
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: status: { DRDY ERR }
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: error: { UNC }
Mar  1 22:23:53 Musse-unRAID kernel: ata2.00: configured for UDMA/133
Mar  1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=2s
Mar  1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 Sense Key : 0x3 [current] 
Mar  1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 ASC=0x11 ASCQ=0x4 
Mar  1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 CDB: opcode=0x28 28 00 57 66 ed 20 00 00 c0 00
Mar  1 22:23:53 Musse-unRAID kernel: I/O error, dev sdc, sector 1466363168 op 0x0:(READ) flags 0x0 phys_seg 24 prio class 2
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363104
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363112
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363120
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363128
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363136
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363144
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363152
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363160
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363168
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363176
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363184
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363192
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363200
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363208
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363216
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363224
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363232
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363240
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363248
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363256
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363264
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363272
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363280
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363288
Mar  1 22:23:53 Musse-unRAID kernel: ata2: EH complete
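
To make the loop above easier to read, here is a minimal Python sketch that decodes the READ(10) command from the `CDB:` line and collects the `md` read-error sectors. The function names `decode_read10` and `summarize` are made up for illustration, and the 512-byte logical sector size is an assumption (it does agree with the `dma 98304` figure in the log, since 192 sectors x 512 bytes = 98304). The sample text is an abridged copy of the lines above.

```python
import re

# Abridged copy of the log excerpt above (only three of the md errors kept).
LOG = """\
Mar  1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 CDB: opcode=0x28 28 00 57 66 ed 20 00 00 c0 00
Mar  1 22:23:53 Musse-unRAID kernel: I/O error, dev sdc, sector 1466363168 op 0x0:(READ) flags 0x0 phys_seg 24 prio class 2
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363104
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363112
Mar  1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363288
"""

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def decode_read10(cdb_hex):
    """Return (lba, sector_count) from a 10-byte READ(10) CDB given as hex text."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(raw) != 10 or raw[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(raw[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(raw[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


def summarize(log):
    """Collect the failing device, the decoded read, and the md error sectors."""
    cdb = re.search(r"CDB: opcode=0x28 ((?:[0-9a-f]{2} ?){10})", log)
    dev = re.search(r"I/O error, dev (\w+), sector (\d+)", log)
    md = [int(s) for _, s in re.findall(r"md: disk(\d+) read error, sector=(\d+)", log)]
    lba, count = decode_read10(cdb.group(1).strip())
    return {
        "device": dev.group(1),
        "io_error_sector": int(dev.group(2)),
        "read_lba": lba,
        "read_sectors": count,
        "read_bytes": count * SECTOR_BYTES,
        "md_sectors": md,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

The decoded address matches the `I/O error, dev sdc, sector 1466363168` line, so every pass of the loop is the same 192-sector read failing at the same place on `sdc`. The first `md` sector is 64 lower than the device sector, which would be consistent with a partition that starts at sector 64, although the log itself does not state that.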

Edited by Flænz
Link to comment

It's still running the test, and it has been at 90% since Saturday noon. Maybe it is stuck.

I have received new disks, so I was thinking of simply replacing the parity disk right away.

Is it possible to restart and turn off the "auto mount" of the array and then start replacing the Parity and one of the other faulty disks without completing the SMART test?

I would really like to get the server up and running again

Link to comment
37 minutes ago, Flænz said:

Is it possible to restart and turn off the "auto mount" of the array and then start replacing the Parity and one of the other faulty disks without completing the SMART test?

Yes. The SMART test would be automatically aborted if you power off to replace the drive.

Link to comment
Posted (edited)

Thanks for all the advice. It's been 3 years since I put the system together, so in the meantime I have forgotten everything ;-)
Right now I am just afraid to mess it up and then all the data is gone.

Unraid seems a bit unresponsive at the moment. I try to set autostart to No, but the setting does not get saved, and when I try to restart, it does not seem to execute the command.

Can I do something in the Terminal to stop the array autostart and restart the Unraid server?
 

 

Edited by Flænz
Link to comment
Posted (edited)

I have tried from 2 different laptops running Windows, but I am denied access when I try to edit disk.cfg directly.
I can open the file in TextPad, but I can't save or overwrite it.

When I issue the command "restart" in the CLI, nothing happens.

Should I hard power off the server, put the flash drive directly into the laptop, and see if I can edit the disk.cfg file there?
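
For the offline edit asked about here, a minimal Python sketch of changing one setting in a `key="value"` style config text follows. It rests on assumptions this thread does not confirm: that the autostart setting in disk.cfg is a line of the form `startArray="yes"`, and that the file uses one quoted `key="value"` pair per line. The helper name `set_cfg_value` and the `otherSetting` line in the sample are hypothetical. Back up the original file before changing anything.

```python
import re


def set_cfg_value(text, key, value):
    """Return cfg text with key="value" set: replace the existing line or append one."""
    # Match the whole "key=..." line without swallowing a trailing \r or \n.
    pattern = re.compile(r"^" + re.escape(key) + r"=[^\r\n]*", re.MULTILINE)
    line = f'{key}="{value}"'
    if pattern.search(text):
        # A function replacement avoids any backslash handling in the new line.
        return pattern.sub(lambda _match: line, text, count=1)
    separator = "" if (not text or text.endswith("\n")) else "\n"
    return text + separator + line + "\n"


if __name__ == "__main__":
    # Hypothetical sample contents; the real disk.cfg will contain other keys.
    sample = 'startArray="yes"\notherSetting="1"\n'
    print(set_cfg_value(sample, "startArray", "no"))
```

The function works on text only, so reading the file from the flash drive and writing the result back is left to whatever editor or script is used on the laptop.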

 

Edited by Flænz
Link to comment
1 hour ago, Flænz said:

afraid to mess it up and then all the data is gone

Each data disk in the array is an independent filesystem that can be read all by itself on any Linux system. And parity contains none of your data.

 

4 hours ago, Flænz said:

turn off the "auto mount" of the array and then start replacing the Parity and one of the other faulty disks

Parity seems to be the only disk needing replacement. I don't see any need to turn off autostart.

Link to comment

I didn't notice before that you have a disabled disk. With a disabled disk you cannot remove parity, at least not without losing the data on that disk. Has it been disabled for long? Do you know if there's any data there? Do you still have the old disk?

Link to comment

Some years back I did something because one of the disks failed, if I remember correctly.
I can't quite remember what I did back then, but I fear that the disk is gone and I might have tried to put a new (old) disk in as a replacement.

I most likely did something wrong, because looking inside the PC there are actually 5 disks connected, which is kind of a surprise to me.

Link to comment

Ok, then that must be done.
I will learn from this and pay much more attention in future. It's when things stop working that you know what you have lost!

It cannot be brought back from the historical devices?


Can you describe to me what I have to do next 

I guess that I can disconnect "Disk 4", which is "not installed", and then use this slot to connect the new parity disk. I don't have a vacant connection otherwise.
I have a new 4TB disk ready for parity replacement, but I have not plugged it in yet.
 

Link to comment
4 minutes ago, Flænz said:

It cannot be brought back from the historical devices?

No, but if you still have the old disk you can see if there's any data there.

 

4 minutes ago, Flænz said:

Can you describe to me what I have to do next

Tools - New config - Keep array and pool assignments - apply. Then go back to Main and start the array.

 

Link to comment
