rmp5s Posted January 23, 2019 Share Posted January 23, 2019 I'm...um...a bit...concerned... Was wondering why I could no longer access one of my shares and, come to find out, they're ALL GONE!! This has to be a UI glitch or something...I mean...seriously. This can't be right. Anyone else had this happen? Link to comment
rmp5s Posted January 23, 2019 Author Share Posted January 23, 2019 Is it because I added a new drive and it's "clearing"? Link to comment
trurl Posted January 23, 2019 Share Posted January 23, 2019 5 minutes ago, rmp5s said: I'm...um...a bit...concerned... Was wondering why I could no longer access one of my shares and, come to find out, they're ALL GONE!! This has to be a UI glitch or something...I mean...seriously. This can't be right. Anyone else had this happen? This is usually a browser issue. If you have an adblocker, whitelist your server. Try clearing browser cache, try another browser. When you say you can't access them, do you mean over the network? If you can't access your shares over the network, go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post. Link to comment
rmp5s Posted January 23, 2019 Author Share Posted January 23, 2019 3 minutes ago, trurl said: This is usually a browser issue. If you have an adblocker, whitelist your server. Try clearing browser cache, try another browser. When you say you can't access them, do you mean over the network? If you can't access your shares over the network, go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post. I rebooted the server and they're back. This stupid disk clearing nonsense, which took nearly 30 HOURS to get 24% complete, started over though. SUPER annoying. I REALLY thought it would detect that 24% had been done and start from there. But. No. Apparently not. Might get to add my new drive by mid March! lol Another thing that I'm stumped by: NONE of my VMs can access the server. Can't access shares, can't even ping it. Nothing. If you could shed some light on that, I'd REALLY appreciate it. Link to comment
JorgeB Posted January 23, 2019 Share Posted January 23, 2019 4 hours ago, rmp5s said: This stupid disk clearing nonsense, which took nearly 30 HOURS to get 24% complete That's way too long, a 10TB disk takes around 18 hours to clear, and you can use the server normally during the clear. Link to comment
Fireball3 Posted January 23, 2019 Share Posted January 23, 2019 9 minutes ago, johnnie.black said: a 10TB disk takes around 18 hours to clear Depends on the preclear script. This is a 4 TB drive and I think the speeds are OK. Can't imagine how a 10 TB would finish in 18 hours!? I don't have the preclear reports of my 10 TB at hand right now.
========================================================================1.15b
== invoked as: ./preclear_bjp.sh -A -f /dev/sdd
== WDCWD40PURX-64NZ6Y0 WD-WCC7K5YELJJ3
== Disk /dev/sdd has been successfully precleared
== with a starting sector of 1
== Ran 1 cycle
==
== Using :Read block size = 1000448 Bytes
== Last Cycle's Pre Read Time : 9:42:56 (114 MB/s)
== Last Cycle's Zeroing time : 7:49:21 (142 MB/s)
== Last Cycle's Post Read Time : 10:08:54 (109 MB/s)
== Last Cycle's Total Time : 27:42:11
==
== Total Elapsed Time 27:42:11
========================================================================
Link to comment
JorgeB Posted January 23, 2019 Share Posted January 23, 2019 Depends on the preclear script. This is a 4 TB drive and I think the speeds are OK. Can't imagine how a 10 TB would finish in 18 hours!? I was referring to clearing a disk, not pre-clearing. Link to comment
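The difference between the two estimates is easier to see with the arithmetic written out. The following is only a rough sketch, assuming decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes) and a constant sustained write rate, which real disks do not hold across the whole platter. The function names are made up for illustration.

```python
def clear_hours(size_tb: float, mb_per_s: float) -> float:
    """Hours needed to write zeros across a whole disk at a steady rate."""
    seconds = size_tb * 1e12 / (mb_per_s * 1e6)
    return seconds / 3600


def implied_mb_per_s(size_tb: float, hours: float) -> float:
    """Average write rate implied by clearing a disk in the given time."""
    return size_tb * 1e12 / (hours * 3600) / 1e6


# Zeroing pass only, using the 4 TB size and 142 MB/s from the preclear report.
print(f"4 TB at 142 MB/s: {clear_hours(4, 142):.2f} h")

# Rate a 10 TB disk would have to average to clear in 18 hours.
print(f"10 TB in 18 h: {implied_mb_per_s(10, 18):.1f} MB/s")
```

The first figure comes out close to the 7:49:21 zeroing time in the report, and the second lands around 154 MB/s, so both posts are consistent once the pre-read and post-read passes of a preclear are left out.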
trurl Posted January 23, 2019 Share Posted January 23, 2019 9 hours ago, trurl said: go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post. Link to comment
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 11 hours ago, trurl said: Attached. 16 hours ago, johnnie.black said: That's way too long, a 10TB disk takes around 18 hours to clear, and you can use the server normally during the clear. Ok...good to hear something's out of whack. It's back up to 18-ish% now...after 24hrs or so...kinda nuts. I've been using it anyway, but I'd really like to add the drive to the array so I can finish what I'm doing, ya know? 15 hours ago, Fireball3 said: Depends on the preclear script. This is a 4 TB drive and I think the speeds are OK. Can't imagine how a 10 TB would finish in 18 hours!? Yea, I'm (we're) not talking about preclear. I tried to run that and it took 3 days to get halfway through step two... So, I stopped it and just added the drive. That's what started this "clearing" thing. I've never heard of anything like this. Why is this even necessary? Even my coworkers are like, "why is it even doing that?" We've all added shares to all different kinds of arrays and we're all like, "I've never heard of any file system requiring a drive to be zeroed before it can be added." Is there a way to skip this? Why does it do this? tower-diagnostics-20190123-1555.zip Link to comment
Squid Posted January 24, 2019 Share Posted January 24, 2019 Might not be your problem, but you've got bad memory
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1548247119 SOCKET 0 APIC 3
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: CPU 1: Machine Check Event: 0 Bank 9: c814394c00800091
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: TSC 0
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: ADDR 0
Your event log in the BIOS would probably help you narrow down which stick. Link to comment
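For anyone wanting to check their own syslog for the same kind of entries, here is a minimal sketch. The pattern is an assumption based only on the lines quoted above, not a complete list of EDAC messages, and `find_memory_errors` is a hypothetical helper name.

```python
import re

# Assumption: match only the two message shapes quoted in this thread.
MCE_PATTERN = re.compile(r"EDAC .*?(MEMORY ERROR|Machine Check Event)")


def find_memory_errors(lines):
    """Return the syslog lines that look like EDAC memory error reports."""
    return [line.rstrip("\n") for line in lines if MCE_PATTERN.search(line)]


sample = [
    "Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR",
    "Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: CPU 1: Machine Check Event: 0 Bank 9: c814394c00800091",
    "Jan 22 19:17:05 Tower kernel: md: recovery thread: clear ...",
]

for hit in find_memory_errors(sample):
    print(hit)
```

To run it against a real log, pass an open file object instead of `sample`; the path to the syslog depends on the system and is not assumed here.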
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 And now, for no apparent reason, the array stopped. Time to start clearing all over again!! Link to comment
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 46 minutes ago, Squid said: Might not be your problem, but you've got bad memory
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1548247119 SOCKET 0 APIC 3
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: CPU 1: Machine Check Event: 0 Bank 9: c814394c00800091
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: TSC 0
Jan 23 04:38:39 Tower kernel: EDAC sbridge MC0: ADDR 0
Your event log in the BIOS would probably help you narrow down which stick. Oh nice. Thanks. Link to comment
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 (edited) So...the million dollar question: HOW CAN I SKIP CLEARING!? I didn't have to clear my cache drive! Can't I just format the new drive and add it? I'd like to use it THIS WEEK! rofl Edited January 24, 2019 by rmp5s Link to comment
JonathanM Posted January 24, 2019 Share Posted January 24, 2019 44 minutes ago, rmp5s said: HOW CAN I SKIP CLEARING!? Set a new config and assign all the drives where you want them, and let parity rebuild. Keeping parity valid when you add a drive requires that the drive be clear. So you can either clear the disk or rebuild parity, your choice. Link to comment
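Why a cleared drive keeps parity valid can be shown with a toy example. This is a sketch that assumes simple single-parity XOR across the data disks; the three-byte "disks" and the `parity` helper are invented for illustration and say nothing about how the md driver is actually implemented.

```python
from functools import reduce


def parity(disks):
    """Byte-wise XOR across equally sized disks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
old_parity = parity([disk1, disk2])

cleared = bytes(3)                   # all zeros, like a cleared disk
dirty = bytes([0x01, 0x00, 0x00])    # leftover data from a previous use

# XOR with zero changes nothing, so the existing parity is still correct.
print(parity([disk1, disk2, cleared]) == old_parity)

# Any nonzero byte on the new disk makes the existing parity wrong.
print(parity([disk1, disk2, dirty]) == old_parity)
```

That is the trade-off described above: either the new disk is zeroed so the old parity still holds, or parity is recomputed to include whatever is on it.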
trurl Posted January 24, 2019 Share Posted January 24, 2019 30 minutes ago, jonathanm said: Set a new config and assign all the drives where you want them, and let parity rebuild. Keeping parity valid when you add a drive requires that the drive be clear. So you can either clear the disk or rebuild parity, your choice. Ordinarily clearing is what you want since the array remains protected during the process, whereas if you elect to rebuild parity then it isn't protected until parity is rebuilt. Possibly whatever is causing your slowness will also affect the parity rebuild anyway. Here is another possibility: Set a New Config without that new disk, check the box saying parity is valid, then just use your array without that additional disk and try to get to the bottom of your hardware problems. Link to comment
Squid Posted January 24, 2019 Share Posted January 24, 2019 This is interesting
Jan 22 19:17:05 Tower kernel: mdcmd (50): check correct
Jan 22 19:17:05 Tower kernel: md: recovery thread: clear ...
Jan 22 19:17:05 Tower kernel: md: using 1536k window, over a total of 7814026532 blocks.
Been *forever* since I've added a new disk rather than replacing one so can't say for sure, but this looks to me that you're also running a parity check at the same time as a clear. If that is so, then you're really going to bog down the entire system. Link to comment
trurl Posted January 24, 2019 Share Posted January 24, 2019 12 minutes ago, Squid said: This is interesting
Jan 22 19:17:05 Tower kernel: mdcmd (50): check correct
Jan 22 19:17:05 Tower kernel: md: recovery thread: clear ...
Jan 22 19:17:05 Tower kernel: md: using 1536k window, over a total of 7814026532 blocks.
Been *forever* since I've added a new disk rather than replacing one so can't say for sure, but this looks to me that you're also running a parity check at the same time as a clear. If that is so, then you're really going to bog down the entire system. That first screenshot doesn't have enough reads and writes to make me think a parity check is going on. I'm guessing that is just how it logs the start of the clearing. I haven't really looked for that in other logs though. Link to comment
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 Hmm... So, what'd I screw up? I'm noticing the rebuild time is now going UP. It took a while to build parity, but not an exorbitant amount of time. Adding this drive?...I don't know if it will EVER get done at this rate. Tempted to skip it but if the only way to do so is to break parity, not sure I want to do that. There's actually data on the array now. Data I don't want to lose. And, can anyone point me in the right direction for my VM connectivity issues? My VMs can't so much as ping the server... Thanks everyone for the help, btw. It is greatly appreciated. Link to comment
rmp5s Posted January 24, 2019 Author Share Posted January 24, 2019 How do you restart a parity check? Just noticed mine says... "Parity is valid"...but it says the last check was incomplete and gives an error. Doesn't exactly give the warm and fuzzy. Maybe I'll stop the clearing, run a parity check, then when that's done, start the clearing and it'll finish this month? Link to comment
trurl Posted January 24, 2019 Share Posted January 24, 2019 It still thinks parity is valid. Not sure what happened as far as that incomplete parity check report. Did you miss my suggestion above? 1 hour ago, trurl said: Set a New Config without that new disk, check the box saying parity is valid, then just use your array without that additional disk and try to get to the bottom of your hardware problems. If you New Config without that new disk, then you can use your array again without waiting for it to clear. And since parity is valid (and you will tell it so) you won't even need to rebuild parity. Then you can see if you still have problems without the disk. Link to comment
JorgeB Posted January 24, 2019 Share Posted January 24, 2019 There have been various reports with slow rebuilds/clears and the ST8000DM004, likely you got a dud. Link to comment
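To put a number on how slow this clear is, the figures already in the thread can be combined: 7814026532 blocks from the log, and 24% done after roughly 30 hours. This is a back-of-the-envelope sketch that assumes each md "block" is 1 KiB (which makes the total come out at about 8 TB, matching the drive model) and that progress is linear. The function names are hypothetical.

```python
BLOCK_BYTES = 1024  # assumption: md reports the total in 1 KiB blocks


def observed_mb_per_s(total_blocks: int, fraction_done: float, hours: float) -> float:
    """Average write rate so far, in MB/s (1 MB = 10**6 bytes)."""
    bytes_done = total_blocks * BLOCK_BYTES * fraction_done
    return bytes_done / (hours * 3600) / 1e6


def hours_remaining(fraction_done: float, hours: float) -> float:
    """Time left if the clear keeps going at the same average pace."""
    return hours * (1 - fraction_done) / fraction_done


rate = observed_mb_per_s(7814026532, 0.24, 30)
left = hours_remaining(0.24, 30)
print(f"average rate: {rate:.1f} MB/s, about {left:.0f} more hours at this pace")
```

Under those assumptions the average works out to under 20 MB/s, far below the 142 MB/s zeroing rate in the preclear report earlier in the thread, which fits the suspicion that the drive or some other hardware is at fault.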
JonathanM Posted January 24, 2019 Share Posted January 24, 2019 7 hours ago, rmp5s said: Data I don't want to lose. It's backed up on another drive not connected to the array, right? Link to comment
Fireball3 Posted January 24, 2019 Share Posted January 24, 2019 9 minutes ago, jonathanm said: It's backed up on another drive not connected to the array, right? Link to comment