Flænz Posted March 1 Hello, I am hoping someone can help me get my system up and running again. After updating to 6.12.8 and restarting Unraid, I see 3 disks with SMART errors, and the array is stuck at mounting disks, so it does not come online. I powered off the system and checked whether any cables had come loose or whether there was other evidence of a physical fault, but I could not see anything out of the ordinary. I don't know where to start looking for a solution. I have seen in other posts that the diagnostics file is required, so I managed to create one and attach it (I hope). I also don't know whether my system had errors prior to the update, as I am not monitoring it regularly. It had been running steadily for several years until today. I hope someone can see the issue and point me in the right direction. Thanks in advance. musse-unraid-diagnostics-20240301-2126.zip
trurl Posted March 1 1 minute ago, Flænz said: "I am not monitoring it regularly" Set up Notifications to alert you immediately by email or other agent as soon as a problem is detected.
trurl Posted March 1 Parity has pending sectors. Run an extended SMART self-test on it.
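For readers who prefer the command line, the advice above can be sketched with `smartctl` from smartmontools. This is a minimal sketch, not the method used in the thread: the device name `/dev/sdX` is a placeholder, the helper name `pending_sectors` is hypothetical, and the sample attribute row is made up so the snippet runs without a disk attached.

```shell
#!/bin/sh
# Print the raw value of SMART attribute 197 (Current_Pending_Sector)
# from `smartctl -A` output read on stdin. The raw value is the 10th column.
pending_sectors() {
    awk '$1 == 197 { print $10 }'
}

# Real use on the server (replace /dev/sdX with the parity device):
#   smartctl -A /dev/sdX | pending_sectors   # current pending sector count
#   smartctl -t long /dev/sdX                # start the extended self-test
#   smartctl -a /dev/sdX                     # full report, incl. self-test status

# Demo with a made-up attribute row (illustrative values only):
sample='197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8'
printf '%s\n' "$sample" | pending_sectors
```

A non-zero value here means the drive has sectors it could not read and has not yet remapped, which is why the extended test is the next step.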
Flænz Posted March 1 I will take your advice and set up Notifications. When it's just running, it's easy to forget about it. I have started an extended SMART self-test right now. Would it be time to get a new set of disks?
Flænz Posted March 1 (edited) Is the SMART test likely to solve the issue, or should I expect to do something else after the test? It's only 10% done, so I expect it will take quite some time to complete. Is there anything I can do in the meantime to help solve the issue? The log file seems stuck in this loop:
Mar 1 22:23:50 Musse-unRAID kernel: ata2: EH complete
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: irq_stat 0x40000008
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: failed command: READ FPDMA QUEUED
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: cmd 60/c0:90:20:ed:66/00:00:57:00:00/40 tag 18 ncq dma 98304 in
Mar 1 22:23:53 Musse-unRAID kernel: res 41/40:00:20:ed:66/00:00:57:00:00/40 Emask 0x409 (media error) <F>
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: status: { DRDY ERR }
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: error: { UNC }
Mar 1 22:23:53 Musse-unRAID kernel: ata2.00: configured for UDMA/133
Mar 1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=2s
Mar 1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 Sense Key : 0x3 [current]
Mar 1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 ASC=0x11 ASCQ=0x4
Mar 1 22:23:53 Musse-unRAID kernel: sd 2:0:0:0: [sdc] tag#18 CDB: opcode=0x28 28 00 57 66 ed 20 00 00 c0 00
Mar 1 22:23:53 Musse-unRAID kernel: I/O error, dev sdc, sector 1466363168 op 0x0:(READ) flags 0x0 phys_seg 24 prio class 2
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363104
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363112
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363120
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363128
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363136
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363144
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363152
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363160
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363168
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363176
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363184
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363192
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363200
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363208
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363216
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363224
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363232
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363240
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363248
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363256
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363264
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363272
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363280
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363288
Mar 1 22:23:53 Musse-unRAID kernel: ata2: EH complete
Edited March 1 by Flænz
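The numbers in the log above are internally consistent, which can be checked with a little shell arithmetic. The sketch below decodes the READ(10) CDB bytes printed by the kernel; the helper names `cdb_lba` and `cdb_count` are hypothetical. The 64-sector gap between the device sector and the first md sector is an inference (it would fit a partition that starts at sector 64), not something the thread states.

```shell
#!/bin/sh
# Decode a READ(10) CDB given as ten hex bytes (as printed by the kernel).
# Counting from byte 0: bytes 2-5 hold the big-endian start LBA,
# bytes 7-8 hold the transfer length in sectors.
cdb_lba()   { echo $(( 0x$3$4$5$6 )); }
cdb_count() { echo $(( 0x$8$9 )); }

cdb="28 00 57 66 ed 20 00 00 c0 00"   # copied from the log above
lba=$(cdb_lba $cdb)                   # unquoted on purpose: split into bytes
count=$(cdb_count $cdb)

echo "start LBA: $lba"
echo "sectors:   $count ($(( count * 512 )) bytes)"
# First "md: disk0 read error" sector in the log is 1466363104.
echo "md offset: $(( lba - 1466363104 ))"
```

This prints a start LBA of 1466363168 (the sector in the `I/O error` line), 192 sectors or 98304 bytes (matching `dma 98304`), and an offset of 64. The 24 md read errors, 8 sectors apart, cover the same 192 sectors, so the whole loop describes one failed read being retried.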
trurl Posted March 2 If it fails the extended test, it needs to be replaced.
JorgeB Posted March 2 If you want to get the array back online, you can unassign parity and it should start; then get a new parity disk as soon as possible.
Flænz Posted March 3 It's still running the test, and I have already put in an order for new disks; they arrive on Monday. I hope the test is OK, and then I will start replacing the disks one by one.
Flænz Posted March 4 It's still running the test, and it has been at 90% since Saturday noon. Maybe it is stuck. I have received the new disks, so I was thinking of simply changing the parity disk right away. Is it possible to restart and turn off the "auto mount" of the array and then start replacing the Parity and one of the other faulty disks without completing the SMART test? I would really like to get the server up and running again.
itimpi Posted March 4 37 minutes ago, Flænz said: "Is it possible to restart and turn off the 'auto mount' of the array and then start replacing the Parity and one of the other faulty disks without completing the SMART test?" Yes. The SMART test would be automatically aborted if you power off to replace the drive.
Flænz Posted March 4 Is it OK to start replacing the disks even though they have SMART errors and the array cannot start at the moment?
JorgeB Posted March 4 The array should start if you unassign/disconnect parity. Did you try that?
Flænz Posted March 4 (edited) Thanks for all the advice. It's been 3 years since I put the system together, so in the meantime I have forgotten everything. Right now I am just afraid to mess it up and then all the data is gone. Unraid seems a bit unresponsive at the moment. I try to set autostart to "No", but the setting does not get saved, and when I try to restart, it does not seem to execute the command. Can I do something in the Terminal to stop the array autostart and restart the Unraid server? Edited March 4 by Flænz
JorgeB Posted March 4 If auto start is enabled and you cannot disable it using the GUI, edit disk.cfg on your flash drive (config/disk.cfg) and change startArray="yes" to "no", then type reboot in the CLI.
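The edit described above can also be made from the server console instead of over the network share. The following is a minimal sketch under stated assumptions: it assumes the flash drive is mounted at `/boot` on the running server (so the file would be `/boot/config/disk.cfg`), and the function name `disable_autostart` is hypothetical. It keeps a backup copy before changing anything.

```shell
#!/bin/sh
# Change startArray="yes" to startArray="no" in the given disk.cfg,
# leaving the original as <file>.bak. All other lines are left untouched.
disable_autostart() {
    cfg="$1"
    cp "$cfg" "$cfg.bak" &&
    sed 's/^startArray="yes"/startArray="no"/' "$cfg.bak" > "$cfg"
}

# Real use on the server console (path is an assumption, see above):
#   disable_autostart /boot/config/disk.cfg && reboot

# Demo on a throwaway file so the sketch can be tried safely:
tmp=$(mktemp)
printf 'startArray="yes"\n' > "$tmp"
disable_autostart "$tmp"
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

Note that the command given in the post is `reboot`; the sketch chains it only after the edit succeeds.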
Flænz Posted March 4 (edited) I have tried from 2 different laptops running Windows, but I am denied access to edit disk.cfg directly. I can open the file in TextPad, but I can't save or overwrite it. When I issue the command "restart" in the CLI, nothing happens. Should I hard power-off the server, put the flash drive directly into the laptop, and see if I can edit the disk.cfg file there? Edited March 4 by Flænz
trurl Posted March 4 1 hour ago, Flænz said: "afraid to mess it up and then all the data is gone" Each data disk in the array is an independent filesystem that can be read all by itself on any Linux system. And parity contains none of your data. 4 hours ago, Flænz said: "turn off the 'auto mount' of the array and then start replacing the Parity and one of the other faulty disks" Parity seems to be the only disk needing replacement. I don't see any need to turn off auto start.
Flænz Posted March 4 OK, I like that, and thank you for all the help. It seems to me that I cannot do anything as long as it's stuck at "Array Starting * Mounting disks". Am I missing something?
JorgeB Posted March 4 I didn't notice before that you have a disabled disk. With a disabled disk you cannot remove parity, at least not without losing the data on that disk. Has it been disabled for long? Do you know if there's any data there? Do you still have the old disk?
Flænz Posted March 4 Some years back I did something because one of the disks failed, if I remember correctly. I can't quite remember what I did back then, but I fear that the disk is gone and that I might have tried to put a new (old) disk in as a replacement. I most likely did something wrong, because looking inside the PC there are actually 5 disks connected, which is kind of a surprise to me.
JorgeB Posted March 4 See if you can get the syslog to see if the issue is still with parity or there's also a problem with disk1 filesystem:
cp /var/log/syslog /boot/syslog.txt
Then attach it here.
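Before attaching the copied file, it can be summarized to see which array disk the read errors belong to. This is a rough sketch rather than an official tool: the function name `md_error_summary` is hypothetical, and it only counts lines in the `md: diskN read error` format seen earlier in this thread. In this thread the `disk0` errors appear to correspond to parity.

```shell
#!/bin/sh
# Count "md: diskN read error" lines per disk in a syslog file.
md_error_summary() {
    awk '/md: disk[0-9]+ read error/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^disk[0-9]+$/) count[$i]++
         }
         END { for (d in count) print d, count[d] }' "$1" | sort
}

# Real use after the copy above:
#   md_error_summary /boot/syslog.txt

# Demo with three lines taken from the log earlier in the thread:
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363104
Mar 1 22:23:53 Musse-unRAID kernel: md: disk0 read error, sector=1466363112
Mar 1 22:23:53 Musse-unRAID kernel: ata2: EH complete
EOF
md_error_summary "$tmp"   # prints: disk0 2
rm -f "$tmp"
```

If only one disk shows up in the summary, the read errors come from that disk alone. Filesystem problems on a data disk would appear as different log messages, which this sketch does not look for.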
Flænz Posted March 4 The "5th drive" is actually in the historical devices. syslog.txt
JorgeB Posted March 4 The problem still appears to be parity, but with a disabled disk, to remove it you will need to do a new config, and you will lose any data that was on disk5. It's probably the only option now.
Flænz Posted March 4 OK, then that must be done. I will learn from this and pay much more attention in future. It's when things stop working that you know what you have lost! It cannot be brought back from the historical devices? Can you describe to me what I have to do next? I guess that I can disconnect "Disk 4", which is "not installed", and then use that slot to connect the new parity; I don't have a vacant connection otherwise. I have a new 4TB disk ready for parity replacement, but I have not plugged it in yet.
JorgeB Posted March 4 4 minutes ago, Flænz said: "It cannot be brought back from the historical devices?" No, but if you still have the old disk you can see if there's any data there. 4 minutes ago, Flænz said: "Can you describe to me what I have to do next" Tools - New config - Keep array and pool assignments - apply. Then go back to Main and start the array.
Flænz Posted March 4 It seems that I cannot do anything as long as the array is "Starting".