wheel

Everything posted by wheel

  1. I have no idea how exactly the move went down during the "windows lag" moments, but the "missing" subfolder (and all of its subfolders) straight-up moved into another (adjacent) subfolder. I'm double-checking to make sure everything is still there, but this is looking more and more like user error, solved by a find /mnt -iname maneuver. Thank you SO MUCH for steering me in that direction!
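     (For reference, a minimal sketch of the kind of search that did the trick, assuming Unraid's standard /mnt layout and the "HD" folder name from the posts below; the exact path and folder name are illustrative.)
         # Search every disk and user-share mount for a directory named "HD",
         # case-insensitively, to see where a misplaced folder actually landed:
         find /mnt -type d -iname "HD" 2>/dev/null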
  2. I can probably guess at a decent number of them, but one for sure should be (edited)/
  3. I think I may have done a poor job of explaining myself initially: HD is gone. The files within HD are gone. The file size of those files (multiple terabytes on each of the 18 drives in the array) is still being taken up by SOMETHING, but I can't see or access any of it anymore, because the folder named "HD" (which once contained them) has disappeared both from my user shares (where it was a subfolder of "features") and from each individual disk it was on (where it was a subfolder of each disk's "features" folder).
     This occurred immediately after trying to move a folder from "HD" to "HD-2" (both within the "features" folder) at the user-share level (not share-to-disk or vice versa) using Windows SMB. Windows SMB lagged, which I didn't realize, and I tried to move the same folder a second time. When Windows caught up with itself, I saw the folder get "moved" twice (two successful moves of the same folder). After that, /HD (in user shares and on disks alike) was invisible and inaccessible from SMB.
     I really wish I had captured diagnostics at this point, but I've had so many "inaccessible" SMB issues over the years that were resolved by a simple reboot that I just went ahead and rebooted instinctively. Once the system was back up and running, with "HD" still missing, I went through Telnet/MC (which has previously shown me folders that were "missing" over SMB until the SMB issues were resolved) and saw the "HD" folders were ALSO gone on that side of things (i.e., not JUST an SMB issue). That's when I came here with my questions.
     I had no idea one folder being moved in a weirdly-Windows way could potentially lock me out of a massive pile of data, but here I am, confused and lost, really hoping I'm not about to spend the next few months moving whatever folders did survive over to some other storage solution before I have to wipe all 18 of these drives and start replacing things from scratch.
     Edit: for additional clarity, I'm guessing this is dozens of TB total. My "free space" in the array has been stable at about 10TB both before and after the "disappearance" of the HD folder, which is what gives me hope that its contents are salvageable (since they still seem to exist somewhere across the drives, if only as dead bits taking up space).
  4. Bad news: after upgrading the RAM and running xfs_repair without -n on every drive (and receiving the following results on each drive), (edited) is still missing on SMB and Telnet (but its data is still clearly "taken up" from the array). Diagnostics attached (and I also have a set of diagnostics I took before exiting maintenance mode for the repairs, if that helps). Any more ideas would be seriously appreciated! Getting a little more concerned that something weird happened, now. Thank you both for all of the help so far!
     Phase 1 - find and verify superblock...
     Phase 2 - using internal log
             - zero log...
             - scan filesystem freespace and inode maps...
             - found root inode chunk
     Phase 3 - for each AG...
             - scan and clear agi unlinked lists...
             - process known inodes and perform inode discovery...
             - agno = 0
             - agno = 1
             - agno = 2
             - agno = 3
             - agno = 4
             - agno = 5
             - agno = 6
             - agno = 7
             - agno = 8
             - agno = 9
             - agno = 10
             - agno = 11
             - agno = 12
             - agno = 13
             - agno = 14
             - process newly discovered inodes...
     Phase 4 - check for duplicate blocks...
             - setting up duplicate extent list...
             - check for inodes claiming duplicate blocks...
             - agno = 2
             - agno = 0
             - agno = 1
             - agno = 3
             - agno = 4
             - agno = 5
             - agno = 6
             - agno = 7
             - agno = 8
             - agno = 9
             - agno = 10
             - agno = 11
             - agno = 12
             - agno = 13
             - agno = 14
     Phase 5 - rebuild AG headers and trees...
             - reset superblock...
     Phase 6 - check inode connectivity...
             - resetting contents of realtime bitmap and summary inodes
             - traversing filesystem ...
             - traversal finished ...
             - moving disconnected inodes to lost+found ...
     Phase 7 - verify and correct link counts...
     done
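     (Given the "moving disconnected inodes to lost+found" line above, a minimal sketch of where recovered-but-unnamed items would end up after a repair; the disk number is illustrative, and the directory only exists if the repair actually relocated anything.)
         # xfs_repair places disconnected files/directories in a lost+found
         # directory at the top of each repaired filesystem:
         ls -la /mnt/disk1/lost+found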
  5. I tracked down a locally-available pair of mobo-compatible Crucial 4GB sticks which I could get and install this afternoon; based on what I read (across various forums) about xfs_repair yesterday, it seems like I want it to work as well as possible on the first try if I want the best chance of keeping all data, so: Does it make more sense to run xfs_repair -P now (before modifying my system at all by pulling out and installing more RAM) or switch out the RAM later today and try xfs_repair -n again to see if that message (prompting me to -P) pops up again (or is replaced by new concerns)?
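     (A minimal sketch of the two invocations being weighed here, assuming the array is in maintenance mode and that /dev/md1 is the device for the disk being checked; the device name is illustrative, -n keeps the run read-only, and -P turns off prefetching to reduce memory use.)
         # Read-only check with prefetching disabled (nothing is modified):
         xfs_repair -n -P /dev/md1
         # Actual repair run (no -n), once the dry-run output looks acceptable:
         xfs_repair -P /dev/md1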
  6. This is totally news to me! I built this server for unraid well over a decade ago and haven't touched anything (hardware-wise) since, so I'm actually shocked it only had 2GB of RAM (with my "modern-day" thinking) this whole time. It's an incredibly ancient motherboard now, but from what I'm seeing online, my American Megatrends ECS A885GM-A2 board is compatible with up to 32GB of Crucial's old DDR3L-1600 UDIMM (non-ECC) sticks (all listed as EOL on their website, but apparently still available at a few retailers). I clearly need to upgrade the RAM regardless, but it would be AMAZING if doing that and running xfs_repair could save me a few weeks / months of work trying to save the other folders on this system before (from what I'm reading) I'd need to format the whole thing for safety and start from scratch anyway. Two questions:
     (1) Any potential benefits (for this rescue attempt or any other future functions) to just bumping this box up to 32GB of RAM, or should 8GB or 16GB be plenty of future-proofing for a vanilla system? (I realize the answer here may be "hard to predict the future" - if so, could my repair operations benefit from more RAM, or is just getting past the 4GB barrier the best I can hope for in terms of odds of success?)
     (2) Are there any extra safety tips I should follow (in this particular, precarious instance) before removing the old RAM and installing the new sticks, or once those sticks are installed before running xfs_repair?
     Thank you both so much for the assistance with this! Been pretty concerned about the work involved with "getting back to normal" and this news is helping greatly.
  7. These are the (identical) results for every drive in my array (EDIT: actually, each Phase 4 seems to have the agno entries in a different order; the rest of the results look identical). I've completely exhausted any help existing posts could provide, unless I'm searching way off base. Potentially hundreds of terabytes of (replaceable but not easily) data are in you gurus' hands now. Cannot thank you enough for any possible help.
     Phase 1 - find and verify superblock...
     Memory available for repair (1308MB) may not be sufficient.
     At least 1911MB is needed to repair this filesystem efficiently
     If repair fails due to lack of memory, please turn prefetching off (-P) to reduce the memory footprint.
     Phase 2 - using internal log
             - zero log...
             - scan filesystem freespace and inode maps...
             - found root inode chunk
     Phase 3 - for each AG...
             - scan (but don't clear) agi unlinked lists...
             - process known inodes and perform inode discovery...
             - agno = 0
             - agno = 1
             - agno = 2
             - agno = 3
             - agno = 4
             - agno = 5
             - agno = 6
             - agno = 7
             - agno = 8
             - agno = 9
             - agno = 10
             - agno = 11
             - agno = 12
             - agno = 13
             - agno = 14
             - process newly discovered inodes...
     Phase 4 - check for duplicate blocks...
             - setting up duplicate extent list...
             - check for inodes claiming duplicate blocks...
             - agno = 2
             - agno = 3
             - agno = 0
             - agno = 1
             - agno = 4
             - agno = 5
             - agno = 6
             - agno = 7
             - agno = 8
             - agno = 9
             - agno = 10
             - agno = 11
             - agno = 12
             - agno = 13
             - agno = 14
     No modify flag set, skipping phase 5
     Phase 6 - check inode connectivity...
             - traversing filesystem ...
             - traversal finished ...
             - moving disconnected inodes to lost+found ...
     Phase 7 - verify link counts...
     No modify flag set, skipping filesystem flush and exiting.
  8. Attaching new diagnostics after running file system checks (all seemingly OK?) on each XFS drive in the array, as other users have been advised to do in past threads, using the in-GUI 6.x methods from the unraid FAQ.
     EDIT TO ADD: All other subfolders in this share (and all other shares) are still totally fine. Just the one that I was moving items FROM is now gone in every way except array file size.
     EDIT 2: The move was from one subfolder to another subfolder within the same share. No disk share / user share cross-contamination issues here.
     EDIT 3: Learning all sorts of fun Linux stuff trying to figure out what's happened here. Just used Telnet to (edited), and I'm getting "no such file or directory." There's no way I've really lost this entire multi-terabyte folder due to a single glitch while moving one file out of it, right? It'd be the craziest thing that's ever happened to me in over a decade with unraid if so, for sure.
  9. Vanilla, no-cache-drive system; the only app/plugin/docker even set up is Krusader. I was using Windows SMB to clean up some files like usual (moving folders from a long-standing, enormous folder - many, many terabytes - to a smaller, new folder) when my system lagged substantially, then eventually moved the file I was asking it to move (a request that had doubled itself because I thought Windows hadn't received the move request the first time I made it, and tried it again before the lag became apparent). The next thing I know, that entire source subfolder I was moving from (edited) is gone. (Edit: even stranger, it's gone as a subfolder from EACH INDIVIDUAL DRIVE, not just the "main" shared folder!)
     My first thought was that Windows SMB had glitched up (which it does every now and then in terms of "unreadable" unraid folders) and a reboot would fix things. I idiotically rebooted my unraid system for good measure without pulling diagnostics first. The many, many terabytes from the folder (spread across all disks in the array) appear to still be used up on my disks (total free space is the same amount it was before I started cleaning this afternoon).
     I've had shares and subfolders disappear from SMB before, and found ways around that, but I've NEVER seen a subfolder disappear from view when Telnetting into any of my unraid servers. Any clues on how to restore visibility of and access to this subfolder would be greatly appreciated. I'm not seeing any obvious warnings about file system corruption like others on the forum seem to have had with missing subfolders in the past, but I may be reading the diagnostics incorrectly.
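     (A minimal sketch of one way to confirm, per disk, that the space really is still allocated even though the folder is no longer visible; it assumes Unraid's /mnt/disk* mounts and the "features" parent share named elsewhere in these posts.)
         # Per-disk usage of the parent share vs. overall free space:
         du -sh /mnt/disk*/features 2>/dev/null
         df -h /mnt/disk*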
  10. I do, but temps were hitting the threshold on one or two drives quickly enough to make the auto-pause feature effectively useless for most of the day. Luckily, a buddy with an air-conditioned home came through in the clutch, and I was able to power through all my replacements/parity checks there. The next long-term step is definitely a hardware-based cooling improvement!
  11. I have a feeling I already know the answer to my question from years of experience and having searched recent posts, but: I'm running a large server (20 disks, including dual parity) on a MB/CPU combo that rebuilds at "normal" speeds when I add a new drive (replacing an older, smaller one), but takes twice as long for parity checks (due to dual parity calculations, as I've been told in the past, with no simple upgrade path - it's a "dumb" box just used for local A/V storage, content replaceable from another backed-up server, though in a time-consuming way for many reasons, so this hasn't been something I'm concerned about 95% of the time). I've historically always run a parity check after a rebuild to confirm the new disk was "rebuilt" without errors.
     I'm planning on adding between 3 and 5 drives (as "rebuilds" of existing, smaller drives) to the system before the end of summer, and with temperatures heading in the direction they are, I would like to do this as soon as possible. Most of the year, parity check temperatures are a good 10 degrees Celsius below the safe operating temperatures, but by late summer, almost all disks start flirting with 60C during a parity check unless I'm constantly running the air conditioner in the room with the unraid box (which is already getting expensive). I'm staring down the barrel of limited time before the usual local power outages from people running air conditioners during heat waves, and I'm curious if anyone (empirically) sees one path as materially more risky than the other:
     - continuing to parity check (for an additional few days each time) as I add each of these new 3 to 5 drives, with all disks running hotter than usual (in each intense parity check) for additional days or weeks, or
     - adding all the drives one after the other and then parity checking the lot of them once at the end, before outdoor temperatures rise further and seasonal power outages could even start risking me getting caught unawares on UPS issues (just bought a new one a few months ago, but I'm always paranoid about "surprise" power issues and battery failures these days).
     I know the first instinct for a lot of people will be to guide me towards finding a way to lower the drive temperatures even further using internal methods, but I've already researched that path, and replacing hotswap cage fans is going to take more time and effort than I will be able to expend this summer thanks to work obligations, so I'm just looking to take one of the two paths listed above (knowing neither is ideal) in hopes of "beating the heat clock" as safely as possible. Greatly appreciate any guidance and advice from anyone who's been in a similar situation or knows a good reason (which I haven't found online yet) why adding multiple drives consecutively, with one parity check at the end, isn't that much riskier for this "rebuild" expansion at all!
  12. Subject basically says it all: the tower received a bump during some cleaning while the array was being written to, and I'm thinking that caused the errors in the attached diagnostics, but I wanted to check in here to make sure it doesn't look to any experts like there's some deeper underlying issue I should resolve before simply going for a rebuild (which is going to take about 3 days, since I still haven't found an ideal motherboard/CPU combo for a speed upgrade on that front). Thanks in advance for any guidance before I start the parity rebuild! tower2-diagnostics-20220923-1552.zip
  13. Had to unexpectedly travel for work these past few weeks (RIP, returns window), but GOOD NEWS! Back home now and confirmed: setting the system clock to a modern date in the Unraid menu fixed everything. Based on a few other topics I'm seeing on system clocks, this might be tied into issues others have had with their system clocks changing/resetting after various forms of unraid shutdown (I'd had an unclean shutdown, with an OK non-correcting parity check afterward, not too long before this incident with the system clock), so I'm going to mark this as solved and hope the details within help the team with any overall system clock issues. Thank you all so much for the help and guidance!
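     (A minimal sketch of checking the clock from the console before/after correcting it in the GUI; it assumes the hwclock utility is available and run as root.)
         # Current system time vs. the hardware clock the BIOS keeps:
         date
         hwclock --show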
  14. Definitely noticing that now. Multi-hour power outage last Tuesday (7/19), if that makes a difference on that front (non-correcting parity check came back fine after everything got back to normal). Sign of a deeper-rooted issue? (Edit: UPS-protected box, with a manual power-off button-press before the UPS died.) (Edit 2: the gfjardim preclear plugin was running on a USB-connected drive through Unassigned Devices WHEN THE POWER OUTAGE OCCURRED, despite the manual shutdown before the UPS died, if that makes a difference in anything.)
  15. Nope; neither. ISP modem/wifi combo (wifi disabled) into a Netgear R6 series wifi router, both with vanilla default settings. Direct ethernet cable for this particular box. Haven't modified anything in that chain at any point in the better part of the past 5 years.
  16. Yeah, I tried everything listed above, and no luck. System still can't reach githubusercontent by ping (raw still OK), and the manual plugin installation attempt gives me an SSL verification failure error. Any ideas for next steps to find the cause of the github / amazonaws blockage on my end? Or if it's ISP-based, does this mean I basically need to just accept unraid as a deprecated OS for my living situation, at least until I move to a new home with a new potential ISP? Seriously, any ideas at all will be incredibly appreciated! Thanks to everyone for the help so far.
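     (A minimal sketch of the kind of console checks that can narrow down where the github/amazonaws blockage sits; the host name is the one mentioned in these posts, and the commands assume ping and wget are available on the Unraid console.)
         # Does name resolution + ICMP work, and does an HTTPS fetch fail at
         # the certificate-verification step or earlier?
         ping -c 3 raw.githubusercontent.com
         wget -O /dev/null https://raw.githubusercontent.com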
  17. Done. Thank you for checking! CA-Logging-20220723-0847.zip
  18. Yep; screen gives the wiggly-line "loading" animation for a second, then always goes to "Download of appfeed failed. Community Applications requires your server to have internet access. The most common cause of this failure is a failure to resolve DNS addresses. You can try..." ...and then I've gone through all the things to try, using unraid Network Settings, router settings, changing one thing at a time in each as a troubleshooting method... nada. Checked forum posts for others getting this error message, and tried some of their workarounds. Still the same error message. All three servers listed in the forum as handling the appfeed are currently up and reporting no connection errors. I'm totally at a loss and out of ideas (other than giving up and messing with Crystal Disk or whatever, which wouldn't be the end of the world... but damn, it'd be nice to have a preclear option back in my comfort zone, if possible).
  19. I'm definitely not explaining my situation properly. I use my server daily, access it daily, and am constantly on top of drive-replacement-related issues. That's kind of all I use unraid for: a big, dumb-terminal NAS box for file storage and access. WORM stuff. If the default was to keep those boxes completely "off the internet", I would probably never take extra steps to get them communicating with the outside world, if only for security purposes - I just get no personal benefit from (what absolutely seem like amazing and useful) extra features that unraid now offers which it didn't a decade or so ago when I started using it. By no means am I trying to cast shade at any of these features, if anyone's taking my posts the wrong way; I'm just trying to get my stress test functionality back for new disks while changing as little of my comfortable-for-10ish-years setup as possible. Seriously thankful for all the help so far!
  20. Sorry, poor phrasing on my part - I haven't clicked on the Apps tab since I first set it up (I think back when I upgraded to the 6.9 series, a couple years back). I definitely haven't tried installing anything (apps or otherwise) since the version bump, until now (trying to install this new "preclear for unassigned devices" service). There's a chance I've been having issues connecting to the Community Apps system for the entire couple of years I've had it set up - I just wouldn't know because I haven't been clicking on the tab to try using it for anything (no need on my part). I also don't see myself using it again for anything once I get preclear working again, based on my past decade or so of unraid use. Again, apologies for not phrasing that more clearly! Rough morning.