Aaron Oz Posted December 16, 2018
Recently, upon power up (from a complete power down), UNRAID starts with the array offline. It says "stale configuration", but everything looks correct. If I just reboot the server from there, the array starts automatically after the reboot. Looking at the log file (attached), I see an error that says: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Would it make sense that if the CMOS battery is dead, and the motherboard can't keep time while the power is off, UNRAID would see the configuration as stale due to a time mismatch? That theory makes sense to me because UnRaid fixes the clock, and across a reboot the motherboard keeps the updated time, which would explain why the array starts then. But I thought I'd check with those more knowledgeable than myself to make sure this isn't a sign of a potentially bigger problem.
tower-syslog-20181216-1236.zip
Frank1940 Posted December 16, 2018 (edited)
The dead battery would certainly be the first thing I would look at. They are cheap and available almost anywhere batteries are sold. You should be able to find the battery number in the MB manual. They are also quite easy to replace once you open the case.
You should also upload the Diagnostics file, as it provides much more information than the syslog: Tools >>> Diagnostics. (If you had done that, I could have hazarded a guess about the age of the battery by knowing when the MB was introduced.)
If it is a dead battery and you made any BIOS changes, recheck that things are set correctly!
Aaron Oz (Author) Posted December 19, 2018
Whelp... it doesn't seem to have been the battery. I just happened to put an entirely new MB into my server. All works great! First boot up, no problem. I shut the server down last night and restarted it this morning, and I had "Stale Configuration" and the array was offline. Everything was right, so I started it, no problems. I've attached the full diagnostics zip file. If anyone can see anything, that would be appreciated!
tower-diagnostics-20181219-0804.zip
JorgeB Posted December 19, 2018
I believe the stale config message is because there were two missing disks at first:

...
Dec 19 08:03:33 Tower kernel: md: import disk3: (sdj) Hitachi_HUA722020ALA331_YBJXDE6F size: 1953514552
Dec 19 08:03:33 Tower kernel: mdcmd (5): import 4
Dec 19 08:03:33 Tower kernel: md: import_slot: 4 missing
Dec 19 08:03:33 Tower kernel: mdcmd (6): import 5 sdi 64 1953514552 0 Hitachi_HUA722020ALA331_B9G4M4NF
...
Dec 19 08:03:33 Tower kernel: md: import disk9: (sdb) WDC_WD5000KS-00MNB0_WD-WCANU2446927 size: 488386552
Dec 19 08:03:33 Tower kernel: mdcmd (11): import 10
Dec 19 08:03:33 Tower kernel: md: import_slot: 10 missing
Dec 19 08:03:33 Tower kernel: mdcmd (12): import 11 sdc 64 976762552 0 ST31000528AS_6VP35NMG
Dec 19 08:03:33 Tower kernel: md: import disk11: (sdc) ST31000528AS_6VP35NMG size: 976762552

They came online right after that:

Dec 19 08:03:38 Tower kernel: sd 13:0:3:0: [sdm] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 19 08:03:39 Tower kernel: sd 13:0:3:0: [sdm] Write Protect is off
Dec 19 08:03:39 Tower kernel: sd 13:0:3:0: [sdm] Mode Sense: f7 00 10 08
Dec 19 08:03:39 Tower kernel: sd 13:0:3:0: [sdm] Write cache: disabled, read cache: enabled, supports DPO and FUA
Dec 19 08:03:39 Tower kernel: .ready
Dec 19 08:03:39 Tower kernel: sd 13:0:2:0: [sdl] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 19 08:03:39 Tower kernel: sd 13:0:2:0: [sdl] Write Protect is off
Dec 19 08:03:39 Tower kernel: sd 13:0:2:0: [sdl] Mode Sense: cf 00 10 08
Dec 19 08:03:39 Tower kernel: sd 13:0:2:0: [sdl] Write cache: disabled, read cache: enabled, supports DPO and FUA
Dec 19 08:03:39 Tower kernel: sdm: sdm1
Dec 19 08:03:39 Tower kernel: sdl: sdl1
Dec 19 08:03:39 Tower kernel: sd 13:0:2:0: [sdl] Attached SCSI disk
Dec 19 08:03:39 Tower kernel: sd 13:0:3:0: [sdm] Attached SCSI disk

But it's not normal, so maybe check the connections on those disks.
P.S. Unrelated, but the onboard SATA controller is set to IDE; change it to AHCI.
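For anyone wanting to repeat this check on their own log, here is a rough sketch of how the two patterns quoted above could be pulled out of a syslog. It only matches the exact message formats shown in this thread ("md: import_slot: N missing" and "[sdX] Attached SCSI disk"); the function names are made up for illustration, and a slot reported as missing is not necessarily a late disk (it may simply be unassigned), so treat the output as a hint rather than a diagnosis.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative, not part of Unraid).
# Both read syslog text on standard input.

# Print the slot number of every "md: import_slot: N missing" line.
missing_slots() {
    grep -E 'md: import_slot: [0-9]+ missing' |
        sed -E 's/.*import_slot: ([0-9]+) missing.*/\1/'
}

# Print "timestamp device" for every "[sdX] Attached SCSI disk" line,
# so the attach time can be compared with the md import time.
attached_disks() {
    grep -E '\[sd[a-z]+\] Attached SCSI disk' |
        sed -E 's/^([A-Za-z]{3} +[0-9]+ [0-9:]{8}) .*\[(sd[a-z]+)\] Attached SCSI disk.*/\1 \2/'
}
```

Assuming the live log is at /var/log/syslog, usage would be along the lines of `missing_slots < /var/log/syslog` and `attached_disks < /var/log/syslog`. If a disk's attach timestamp is later than the md import lines, it was not ready when the array was assembled.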
Frank1940 Posted December 19, 2018
Is this server located in a cold environment? One thing that can cause hard drives to come online 'late' is slow spin-up of the drive motor. (Twenty-some years ago, I had a hard drive with slow spin-up on startup, and I had to set the BIOS to do a long memory test rather than a quick one.)
Aaron Oz (Author) Posted December 19, 2018
Interesting... great observations. And Frank, your thinking is correct, except the slow spin-up isn't due to a cold environment. Those two disks are SAS, connected to a Dell PERC H310. Watching the boot sequence, I noticed that those drives get spun up more slowly (........). So it seems the array tries to start before the drives are fully spun up. By the time I go in there, the drives are up and connected and the array is complete.
That leads me to think one of a few things:
- SAS SCSI drives aren't used in UnRaid that often and they all spin up more slowly (I don't think that's true).
- For some reason my H310 / SAS drives spin up more slowly than anyone else's (can't explain that, but it seems likely).
- I've missed an obvious setting, just like I missed having the onboard controller in IDE vs. AHCI.
FYI, that H310 is controlling two other SATA drives, too. So I don't think it's the card causing the problem.
JorgeB Posted December 19, 2018
It's likely related to those being SAS drives; they could be on a spin-up delay or something. But if they always spin up, it's nothing to worry about.
Aaron Oz (Author) Posted December 20, 2018
15 hours ago, johnnie.black said: "It's likely related to those being SAS drives, they could be on a spin up delay or something, but if they always spin up nothing to worry about."
Cool. Thanks for the input. Yeah, they've never not spun up. Is there a variable to tell UnRaid to delay before starting the array? If my kids or wife decide to turn on the server while I'm not around, they won't know to log in and start the array manually. It would be nice if starting the array could be delayed a few more seconds.
JorgeB Posted December 20, 2018
14 minutes ago, Aaron Oz said: "Is there a variable to tell UnRaid to delay before starting the array?"
Not that I know of. You can try disabling the spin-up delay in the LSI BIOS, or deleting the BIOS completely since it's not needed; that should also get rid of the delay.
JorgeB Posted December 20, 2018
Another option that might work is adding a delay to the go file before emhttp starts, e.g.:

    #!/bin/bash
    sleep 30
    # Start the Management Utility
    /usr/local/sbin/emhttp &
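A possible variation on the fixed sleep, offered only as an untested sketch: instead of always waiting 30 seconds, poll until the late disks actually show up, with a timeout as a fallback. The `wait_for_paths` function is a made-up helper, and the `/dev/disk/by-id/...` names in the comments are placeholders; the real names would have to be taken from the server itself. It also assumes that a device node appearing means the disk is ready for the array, which has not been verified here.

```shell
#!/bin/bash
# Hypothetical helper (not part of Unraid):
#   wait_for_paths TIMEOUT PATH...
# Polls once per second until every PATH exists or TIMEOUT seconds
# have passed. Returns 0 if all paths exist, 1 on timeout.
wait_for_paths() {
    local timeout=$1
    shift
    local waited=0 path missing
    while :; do
        missing=0
        for path in "$@"; do
            [ -e "$path" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Possible use in the go file (placeholder disk names, assumed layout):
#   wait_for_paths 60 /dev/disk/by-id/EXAMPLE-SAS-DISK-1 /dev/disk/by-id/EXAMPLE-SAS-DISK-2
#   # Start the Management Utility
#   /usr/local/sbin/emhttp &
```

With this approach emhttp starts as soon as both disks are visible, and after at most the timeout if they never appear, so a genuinely missing disk does not block the web interface indefinitely.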