shadowlord Posted July 19

Unraid version 6.12.10

My server has been running well for a couple of years. In the last week, it has crashed daily and can only be recovered with a power-off/power-on. I think it may be the USB drive, but I wanted to check. Diagnostics are attached.

Symptoms:
- All Docker containers and VMs are no longer responsive
- Unraid webGUI is inaccessible
- WireGuard VPN is running and will accept new connections (weird...)
- Network card responds to pings
- Unable to SSH into the Unraid server from within the local LAN
- Console attached directly to the Unraid server shows the login screen
- Attempts to log in with a valid username/password via the console show an error message like "bash-completion: input/output error"
- Once or twice, after power cycling the computer, the BIOS said "no valid boot device found"; after shutting down for a minute or so and booting up again, I could get into Unraid, and it seemed normal for a while

Notes:
- I have backed up my USB drive via Unraid Connect and downloaded a copy of that backup to my local PC

server03-diagnostics-20240718-1736.zip
JorgeB Posted July 19

Enable the syslog server and post that after a crash.
shadowlord Posted July 19 (Author)

Okay. Thanks, I've enabled the syslog server. It hasn't crashed yet, but I've attached an updated diagnostics file.

I'm specifically interested in the line below, which leads me to suspect a USB drive failure. sda is the USB boot drive, and I've been seeing these errors in the log recently (maybe for the past 2-3 weeks):

Jul 19 02:10:11 server03 kernel: critical medium error, dev sda, sector 16978124 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2

server03-diagnostics-20240719-0529.zip
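For anyone who wants to check a saved syslog for the same error, here is a minimal Python sketch. It assumes the kernel message keeps the exact wording quoted above ("critical medium error, dev ..., sector ..."). The function names and the second sample line are made up for illustration; they are not part of Unraid or its diagnostics.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: critical medium error, dev sda, sector 16978124 op 0x0:(READ) ..."
# Assumption: the message wording is the same as in the log line quoted in this thread.
MEDIUM_ERROR = re.compile(
    r"critical medium error, dev (?P<dev>\w+), sector (?P<sector>\d+)"
)


def find_medium_errors(lines):
    """Return a list of (device, sector) tuples, one per medium-error line."""
    hits = []
    for line in lines:
        match = MEDIUM_ERROR.search(line)
        if match:
            hits.append((match.group("dev"), int(match.group("sector"))))
    return hits


def count_by_device(hits):
    """Count how many medium errors were reported for each device."""
    return Counter(dev for dev, _sector in hits)


# The first line is copied from the post above; the second is invented filler
# so the example shows a line that does not match.
sample = [
    "Jul 19 02:10:11 server03 kernel: critical medium error, dev sda, "
    "sector 16978124 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2",
    "Jul 19 02:10:12 server03 kernel: some unrelated message",
]

hits = find_medium_errors(sample)
print(hits)
print(count_by_device(hits))
```

To scan a real file, pass an open file object instead of `sample`, since iterating over it yields one line at a time. Repeated hits on the boot device over days or weeks would fit the failing-flash theory, although the count alone does not prove it.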
JorgeB Posted July 19

That does look like a failing flash drive; you should replace it.
shadowlord Posted July 19 (Author)

Would a failing flash drive explain the symptoms that I've been seeing?

Symptoms:
- All Docker containers and VMs are no longer responsive
- Unraid webGUI is inaccessible
- WireGuard VPN is running and will accept new connections (weird...)
- Network card responds to pings
- Unable to SSH into the Unraid server from within the local LAN
- Console attached directly to the Unraid server shows the login screen
- Attempts to log in with a valid username/password via the console show an error message like "bash-completion: input/output error"
Veah Posted July 19

I have had symptoms 1, 2, 5 and 7 while a flash drive was failing.
JorgeB Posted July 19

A failing flash drive can cause all sorts of issues, so the first thing to do is replace it.
shadowlord Posted July 20 (Author)

Thanks. A replacement USB flash drive is on the way (ordered when I first made this post). Normally, I'd just run out to the local store and grab any old stick, but Lime Technology only allows you to change sticks once per year (without contacting them), and I want a low-profile one so that it doesn't get physically broken when I'm moving my server around.
JonathanM Posted July 20

5 hours ago, shadowlord said:
I want a low-profile one so that it doesn't get physically broken when I'm moving my server around.

Use a full-size one and mount it inside the case; that way it can't get misplaced or broken without opening the case.
shadowlord Posted July 21 (Author)

On 7/20/2024 at 7:47 AM, JonathanM said:
Use a full size one and mount it inside the case, that way it can't get misplaced or broken without opening the case.

Cool! I did not know this was an option. I will start looking into this (and also check which internal USB headers are still open on my server's motherboard).
shadowlord Posted July 21 (Author)

The server crashed after about 2 days of stability. Here is the syslog. I believe the problems start around timestamp Jul 21 01:58:59. (My new USB stick is still on the way, so I haven't changed the flash drive yet.)

syslog-10.10.10.20 - copy.log
Solution JorgeB Posted July 21

Jul 21 01:43:55 server03 kernel: sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=0s
Jul 21 01:43:55 server03 kernel: sd 0:0:0:0: [sda] tag#0 Sense Key : 0x3 [current]
Jul 21 01:43:55 server03 kernel: sd 0:0:0:0: [sda] tag#0 ASC=0x11 ASCQ=0x0
Jul 21 01:43:55 server03 kernel: sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 01 03 12 24 00 00 40 00
Jul 21 01:43:55 server03 kernel: critical medium error, dev sda, sector 16978468 op 0x0:(READ) flags 0x80700 phys_seg 8 prio class 2

Flash drive issues.
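As a side note on reading those lines: opcode 0x28 is the SCSI READ(10) command, sense key 0x3 is MEDIUM ERROR, and ASC 0x11 with ASCQ 0x00 is an unrecovered read error. The Python sketch below decodes the CDB bytes from the log to show that they point at the same sector as the "critical medium error" line. The function name and the two lookup tables are illustrative only, and the tables cover just the values seen in this log.

```python
# Only the codes that appear in the log above; the full tables are much longer.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    READ(10) layout: byte 0 = opcode (0x28), bytes 2-5 = logical block
    address (big-endian), bytes 7-8 = transfer length in blocks (big-endian).
    Returns (lba, block_count).
    """
    data = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(data) != 10 or data[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(data[2:6], "big")
    blocks = int.from_bytes(data[7:9], "big")
    return lba, blocks


# CDB bytes copied from the log line above.
lba, blocks = decode_read10("28 00 01 03 12 24 00 00 40 00")
print(lba, blocks)                # starting block and number of blocks requested
print(SENSE_KEYS[0x3])            # meaning of "Sense Key : 0x3"
print(ASC_ASCQ[(0x11, 0x00)])     # meaning of "ASC=0x11 ASCQ=0x0"
```

The decoded starting block is 16978468, the same number the kernel reports as the failing sector, and the request covered 64 blocks. The two numbers line up here because the stick evidently uses 512-byte logical blocks, which is also the unit of the kernel's sector count.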
shadowlord Posted July 22 (Author)

Okay, this is not good. I followed the steps for restoring my backup .zip file to the new USB key, but it behaves like a completely new installation. My hostname is reset to "Tower" instead of server03, and I can't access the webGUI at the IPv4 address shown on the console. Help!
shadowlord Posted July 22 (Author)

Okay. It seems that the USB flash backup .zip was missing key files needed to boot Unraid. For reference, I used the backup zip from Unraid Connect and the Unraid USB Creator tool. After I manually copied all the files from my previous-but-failing USB flash drive onto the new USB flash drive, I was able to boot into my config. However, the registration page now says:

Quote:
Multiple License Keys Present: There are multiple license key files present on your USB flash device and none of them correspond to the USB Flash boot device. Please remove all key files, except the one you want to replace, from the /config directory on your USB Flash boot device. Alternately you may purchase a license key for this USB flash device. If you want to replace one of your license keys with a new key bound to this USB Flash device, please first remove all other key files first.

I guess I have to figure out which files are the licence key files in the /config folder...
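One way to see which key files are present is to list every file ending in .key in the flash drive's config directory. The Python sketch below does that. It assumes the flash drive is mounted at /boot, so that the folder is /boot/config, and that license files use the .key extension (as Trial.key and Plus.key do in this thread). The function name is made up for illustration, and the script only lists files; it does not delete anything.

```python
from datetime import datetime
from pathlib import Path


def list_key_files(config_dir):
    """Return the *.key files directly inside config_dir, sorted by path.

    Returns an empty list if config_dir does not exist or is not a directory.
    """
    directory = Path(config_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("*.key") if p.is_file())


if __name__ == "__main__":
    # Assumption: the Unraid flash drive is mounted at /boot.
    # Adjust the path if the stick is plugged into another computer.
    for key_file in list_key_files("/boot/config"):
        info = key_file.stat()
        modified = datetime.fromtimestamp(info.st_mtime)
        print(f"{key_file.name}\t{info.st_size} bytes\tmodified {modified:%Y-%m-%d}")
```

The modification dates can help tell an old trial or superseded key from the current one, but it is worth copying the files somewhere safe before removing any of them.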
shadowlord Posted July 22 (Author)

I deleted 2 very old keys in my flash drive's /config directory, Trial.key and Plus.key. Then I got the expected "Registration key / USB Flash GUID mismatch" page. I clicked the Replace Key button, which took me to the Unraid website, where I had to link my old key to my Unraid.net account.

I returned to the Unraid webGUI and went to the "Registration key / USB Flash GUID mismatch" page. I clicked Replace Key again and was presented with the "Replace Key" page on the unraid.net website, with a big red "GUID to be blacklisted" box containing my old USB key GUID, along with a new GUID for my new key. After acknowledging, I clicked Confirm.

I then got a window that said "success" at the top, but indicated that while my "Pro Key Replaced Successfully" (sic), there was a "Post Install LIcense Key Error" with the error message "Missing key file" (sic). I clicked the "Fix Error" button. It then showed me a "Thank you for choosing Unraid OS!" screen. I am now able to start the array, and my containers are starting up.

(Sorry for writing down all the details. I'm using this post to document what I did as a record for my future self.)