Garbonzo

Members
  • Posts: 57
  • Joined
  • Last visited


Garbonzo's Achievements

Rookie (2/14) • 3 Reputation

  1. I have been sticking with macvlan for some time now, but I want to switch to ipvlan to see if it stops the roughly once-a-month crash I keep having... I don't currently have the skill to figure out WHY the crash is happening, but I do keep getting the message telling me to switch to ipvlan. Back when the macvlan issue started, I read through the "help" and found it pretty confusing and complicated, and since I wasn't having problems at the time, I put it on the back burner. Now that I am trying to switch, I see there may be issues depending on which Docker version I'm on, and so on... plus my mediastack is on a custom network, and the latest instructions I read said to put anything that needs to be proxied in "bridge", so I am wondering where I will land there. If anyone has good, current info on just making the switch, I would appreciate it. -G
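     EDIT: For anyone else attempting the same move, a rough sketch of recreating a custom Docker network with the ipvlan driver from the console. The subnet, gateway, parent interface (br0), and the mediastack name are placeholders from my setup, so swap in your own; on Unraid itself I believe the equivalent is just flipping Settings > Docker > Docker custom network type to ipvlan.

       # Remove the old macvlan-based custom network (stop its containers first)
       docker network rm mediastack

       # Recreate it with the ipvlan driver; subnet/gateway/parent are placeholders
       docker network create -d ipvlan \
         -o ipvlan_mode=l2 \
         -o parent=br0 \
         --subnet=192.168.1.0/24 \
         --gateway=192.168.1.1 \
         mediastack

       # Reattach a container to the rebuilt network (container name is a placeholder)
       docker network connect mediastack some-container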
  2. Disk5 started throwing errors while I was out of town for Xmas. It has been emulated since then, and I am going to try to move forward with it today, but I may not have time to open the case and move some drives around (the 2x 14tb drives I added just last month run hot where they are mounted without another fan, so I want to relocate them a bit). I am not sure if I should move the contents of Disk5 to the mostly empty Disk6 and just replace Disk5 later, so at least the array is back operating in the green... or remove and re-add Disk5 and run some tests/diags to make sure I am ok.

     This seems to happen whenever I have a hard drive problem... errors of any kind, really. I just don't feel like I know where to START or the best way to BEGIN reading/trying to understand what is going on. I understand there are many variables with hard drives, but is there any resource that will give me some direction, like: look "here" to determine the type of error, then, based on that, the probable reasons WHY it occurred?

     Also, I have many of these shucked 8tb Seagate drives from Costco (SMR) that are problematic for multiple reasons. Several have been removed over time for various "reasons" but are working without issue in other devices (both in TrueNAS Scale and in a 5-bay USB enclosure using drive pooling). Other than doing a pre-clear, how can I check whether one of these drives would be a good temporary replacement for Disk5 until I can get ANOTHER new drive? The NEW Seagate 14tb drives that Costco had turned out to be the newer dual-actuator Exos 2X14 drives. A step up for sure! (though they run even warmer than the SMR drives). Whenever they get them back in stock, I am planning to replace ALL of the 8tb's over this coming year... as I CAN (hopefully NOT as I HAVE TO).

     Anyway, every time I have an issue, I post the diags and get my problem solved, but I never feel like I have learned how to figure anything out for myself moving forward. Obviously I appreciate the help, but I would like to be moving toward self-sufficiency for these kinds of situations. So, two related questions: the best way to handle the "x" disabled drive TODAY, and the best way to understand why/what caused the failure. TIA, -G ezra-diagnostics-20231226-0540.zip
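     EDIT: Partly answering my own "where do I even look" question for the next person: the SMART attributes are the usual first stop, and they are readable from the console. A minimal sketch; /dev/sdf is a placeholder for whichever disk you are checking:

       # Full SMART report for one drive (device name is a placeholder)
       smartctl -a /dev/sdf

       # The attributes most often tied to a dying disk:
       #   5   Reallocated_Sector_Ct   (sectors already remapped)
       #   197 Current_Pending_Sector  (sectors waiting to be remapped)
       #   198 Offline_Uncorrectable   (sectors that failed offline checks)
       smartctl -A /dev/sdf | grep -E "Reallocated|Pending|Uncorrect"

       # Long self-test, then read the result once it finishes
       smartctl -t long /dev/sdf
       smartctl -l selftest /dev/sdf

     Non-zero (and climbing) values in 5/197/198 are the ones I'd treat as a red flag on a drive I was about to trust as a temporary replacement.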
  3. Somehow the Windows Server VM that is connected to these terrible drives for the purpose of a second "backup" had its network discovery turned off. I need to better understand the plugin anyway, and the common script in general; it's over my head atm, tbh. I really would love a way to delay the mount until the VM is online... that is something I have to do manually every time I restart the server, and I would LOVE to automate it! But sincere thanks for helping me focus in and find the problem I was having with the mount. It surprised me that the plugin searched, found the server and share, and let me set it up again... though to be fair, when I pinged the EzraWin server it showed the IP address, it just didn't return any pings. So I guess DNS resolved the IP, but I am shocked it took the username/password and showed me the mountable shares... but really, thanks!
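     EDIT: Sketching out the delayed-mount idea for my own future reference (untested). Something like this as a User Scripts job at array start: wait for the VM to answer on the SMB port, then mount. The hostname, share name, mount point, and credentials file are placeholders from my setup:

       #!/bin/bash
       HOST=EzraWin
       SHARE=Backup                      # placeholder share name
       MNT=/mnt/remotes/EzraWin_Backup   # placeholder mount point

       # Wait up to 10 minutes for SMB (port 445) to come up on the VM
       for i in $(seq 1 60); do
           nc -z -w 2 "$HOST" 445 && break
           sleep 10
       done

       # Mount only if it isn't already mounted
       if ! mountpoint -q "$MNT"; then
           mkdir -p "$MNT"
           mount -t cifs "//$HOST/$SHARE" "$MNT" -o credentials=/boot/config/smb-credentials
       fi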
  4. Ok, back home, and here are the new diags. I really do appreciate you taking a look! ezra-diagnostics-20231210-1702.zip
  5. Yeah, sorry, that was my bad. I forgot I had turned that on while trying to figure this out when I first noticed it a few days ago, after the update... I will get on that and post back soon. Thanks for the quick reply, though!
  6. I had been using UD without issues before updating from 6.12.4 to 6.12.6, but I have not been able to mount anything since. The log shows this happening every few seconds... here are my diags: ezra-diagnostics-20231210-1212.zip I really don't know where to start and haven't been able to google anything particularly helpful on my own, so I'm asking for help from those that know... TIA -G
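     EDIT: For anyone hitting this later, a generic first step (noting it here for my own reference too): pull the repeating lines out of the syslog from the console before posting diags. The grep pattern is just a guess at how the plugin labels its log lines:

       # Recent Unassigned Devices chatter in the syslog (pattern is a guess)
       grep -i "unassigned" /var/log/syslog | tail -n 20

       # Or watch the log live while clicking Mount in the GUI
       tail -f /var/log/syslog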
  7. Yep, I actually had forgotten that, thank you... and the way you describe doing things seems the most practical as well. Last time I was swapping a dead drive; this time I will have the peace of mind of still having the old drive, so thanks for the help on that. I was thinking I may just add the second 14 to the array and leave the other drives, so I have more room to move things around for the time being. There currently is no parity2; I had 2 in the past but had to pull one for another use. Sorry if I was confusing.
  8. Sorry to reuse a solved issue, but this is (hopefully) the conclusion, and maybe it will help someone else. I have replaced a parity drive in the past; I followed a video from Spaceinvaderone and it went fine, as far as I recall. Here is where I am today: I finally bought 2x 14tb drives (to start upgrading past the 8tb drives I am using). I am using shucked drives again, but I have my reasons for that... unfortunately.

     I was originally thinking I'd replace the parity drive and either add or replace one of the other drives (so I can maybe use the 8's I pull in my TrueNAS experiment). I realize it is still basically "a spin of the wheel" as to which drive will die first, but if there is any insight the log can give as to which data drive to pull (besides the parity), it is over my head, so any assistance is welcomed. I hate to @ people, but @JorgeB has been helpful in the past, and I didn't want to PM in case this info helps someone else find their way to a solution of sorts.

     It seems like I should be able to leave the 8tb parity in place, add one of the 14tbs as parity2, then remove the first parity drive (although if parity2 stays named that way, it will mess with my OCD for sure). But I don't recall if there was a reason that wasn't/isn't the way to go. Then either just add the other 14 to the main array, or use it to replace the WORST 8tb offender <- please advise which that "could" be.

     I will re-watch the videos I used last time (or newer ones if they are out there) before attempting this, but since the two drives are going to finish pre-clearing today, I may have time to shuck them and put them in this evening... I just wanted some experienced perspectives on the best approach in case I am missing something obvious. Again, I am sorry to throw this out there in such a vague way; I will do more research before pulling the trigger, but if I can't get to it tonight it will probably be another week or more, so I was trying to preemptively get some assistance about order of operations, etc. TIA! -G ezra-diagnostics-20231116-0854.zip
  9. Right! I guess I just had to process the fact that having 2 parity drives protects against 2 failures INCLUDING one of the parity drives (which could happen to me). So, as long as it covers that case, it is the upside I need right now. Thanks again
  10. Ok, these were just the only drives that showed the agno = x x lines, so I wasn't 100% sure... but if that looks OK, I just wanted thoughts on adding a second parity drive (it would be another drive of similar age/use) to cover me a little more... unless there is some downside to doing that (other than my having to dismantle my TrueNAS test box, but that's fine). It should give me extra protection until I can physically replace the bad drive with a larger/better one. Thanks for everything, guys, really! -G
  11. I realize this is like 5 yrs old, but considering the drive issues I am dealing with (you are actually helping me with them currently), I thought something like this might be a good idea. I was wondering if there is a specific tool for Unraid that you recommend for creating and reconciling the checksums. TIA
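     EDIT: In case a plain console approach helps anyone else while waiting on plugin suggestions, a minimal sketch with standard tools; the share path and manifest location are placeholders:

       # Build a checksum manifest for everything in a share (paths are placeholders)
       cd /mnt/user/Media
       find . -type f -exec sha256sum {} + > /boot/checksums/Media.sha256

       # Later, verify the files against the manifest; --quiet prints only mismatches
       cd /mnt/user/Media
       sha256sum -c /boot/checksums/Media.sha256 --quiet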
  12. Sorry about the formatting (or lack of):

       Phase 1 - find and verify superblock...
               - reporting progress in intervals of 15 minutes
       Phase 2 - using internal log
               - zero log...
               - 09:18:57: zeroing log - 521728 of 521728 blocks done
               - scan filesystem freespace and inode maps...
               - 09:19:00: scanning filesystem freespace - 32 of 32 allocation groups done
               - found root inode chunk
       Phase 3 - for each AG...
               - scan and clear agi unlinked lists...
               - 09:19:00: scanning agi unlinked lists - 32 of 32 allocation groups done
               - process known inodes and perform inode discovery...
               - agno = 0 ... agno = 31 (all 32 AGs, interleaved)
               - 09:19:22: process known inodes and inode discovery - 135040 of 134944 inodes done
               - process newly discovered inodes...
               - 09:19:22: process newly discovered inodes - 32 of 32 allocation groups done
       Phase 4 - check for duplicate blocks...
               - setting up duplicate extent list...
               - 09:19:22: setting up duplicate extent list - 32 of 32 allocation groups done
               - check for inodes claiming duplicate blocks...
               - agno = 0 ... agno = 31 (all 32 AGs, interleaved)
               - 09:19:22: check for inodes claiming duplicate blocks - 135040 of 134944 inodes done
       Phase 5 - rebuild AG headers and trees...
               - 09:19:27: rebuild AG headers and trees - 32 of 32 allocation groups done
               - reset superblock...
       Phase 6 - check inode connectivity...
               - resetting contents of realtime bitmap and summary inodes
               - traversing filesystem ...
               - traversal finished ...
               - moving disconnected inodes to lost+found ...
       Phase 7 - verify and correct link counts...
               - 09:19:45: verify and correct link counts - 32 of 32 allocation groups done
       done

     So I assume there is some other flag or something I need to include, but nothing sticks out to me looking at that... other than some of the wording being ambiguous, like "verify" instead of "verifying" and "rebuild" instead of "rebuilding". It almost reads like it is telling ME to do those steps, but since some lines have timestamps, it appears that is just what the process is doing. -g
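     EDIT: Noting the flag situation down for my own reference (from the man page, so double-check me): -n is what makes it check-only; without it the tool actually repairs. The device name below is a placeholder (it may be mdXp1 on newer releases); on Unraid you point it at the md device for the array slot, with the array started in maintenance mode:

       # Check only - report problems but modify nothing (-n = no modify)
       xfs_repair -n /dev/md4

       # Actually repair (array in maintenance mode); add -v for verbose detail
       xfs_repair /dev/md4
       xfs_repair -v /dev/md4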
  13. I had just run it without the -n; that was the first run after. Next time I can take it down, I will re-run it and see what it looks like. Thanks, I will try to get to it tomorrow. -g
  14. As a side question, was there anything (besides studying the logs on a regular basis) that would have clued me in to the disk4 situation? You have always been helpful in pointing out this kind of thing whenever I post a diagnostics, but I feel like maybe I should have seen a warning (similar to hdd temps or something), though I could be wrong. Anyway, thanks for pointing it out; I think it's fixed, I just wanted to know how to catch these things if/as they occur. I am not sure if that is just diligence (scanning the log for a key word or two on a regular basis) or some other plugin or setting I am missing... that type of question. Cheers, -g

     EDIT: Ok, so not OK, disk4 still shows (after removing the -n):

       Phase 1 - find and verify superblock...
               - reporting progress in intervals of 15 minutes
       Phase 2 - using internal log
               - zero log...
               - 10:29:34: zeroing log - 521728 of 521728 blocks done
               - scan filesystem freespace and inode maps...
               - 10:29:36: scanning filesystem freespace - 32 of 32 allocation groups done
               - found root inode chunk
       Phase 3 - for each AG...
               - scan (but don't clear) agi unlinked lists...
               - 10:29:36: scanning agi unlinked lists - 32 of 32 allocation groups done
               - process known inodes and perform inode discovery...
               - agno = 0 ... agno = 31 (all 32 AGs, interleaved)
               - 10:29:56: process known inodes and inode discovery - 134912 of 134816 inodes done
               - process newly discovered inodes...
               - 10:29:56: process newly discovered inodes - 32 of 32 allocation groups done
       Phase 4 - check for duplicate blocks...
               - setting up duplicate extent list...
               - 10:29:56: setting up duplicate extent list - 32 of 32 allocation groups done
               - check for inodes claiming duplicate blocks...
               - agno = 0 ... agno = 31 (all 32 AGs, interleaved)
               - 10:29:56: check for inodes claiming duplicate blocks - 134912 of 134816 inodes done
       No modify flag set, skipping phase 5
       Phase 6 - check inode connectivity...
               - traversing filesystem ...
               - traversal finished ...
               - moving disconnected inodes to lost+found ...
       Phase 7 - verify link counts...
               - 10:30:13: verify and correct link counts - 32 of 32 allocation groups done
       No modify flag set, skipping filesystem flush and exiting.

     I really want to understand... I need to learn more about file systems (that aren't NTFS) as well, and find some good videos on this type of stuff, as reading/understanding some of this takes me a lot longer to grok the situation. Based on other threads, I am unsure if there are additional flags I should use for the xfs_repair, OR if I need to rebuild (though that may only apply if the drive is disabled; this one is mounting fine atm). So ALSO - ORDER OF OPERATIONS (considering replacing the parity is playing into this game). Again, a sincere thanks for the help. (As a side note, disk5 has some agno's listed as well, but disks 1-3 seem fine.) -g
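     EDIT 2: On the "how do I catch this myself" question, this is the kind of thing I have in mind as a scheduled User Scripts job; the grep patterns and the notify call are my best guesses (the notify script path is what I've seen referenced for Unraid), not something I've battle-tested:

       #!/bin/bash
       # Sketch: scan the syslog for disk/filesystem error keywords and raise
       # an Unraid notification if any show up. Patterns are guesses; tune them.
       MATCHES=$(grep -iE "xfs.*(corrupt|error)|I/O error|read error" /var/log/syslog | tail -n 5)

       if [ -n "$MATCHES" ]; then
           /usr/local/emhttp/webGui/scripts/notify \
               -s "Possible disk/filesystem errors in syslog" \
               -d "$MATCHES" \
               -i "warning"
       fi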
  15. I was talking about the 55 listed in the error column on the Array Devices tab, but looking back at my email alerts from the previous month with errors, there were 3000+, so it was a different "event". I REALLY thought I had replaced parity when I pulled the parity2 drive that was in there...

     It may be a week or 2 until I research (and get a little more coin together for) something like 16tb+ enterprise drives, at minimum a single drive to start with for parity. So in the meantime, I think I will pull 2 of these 8tb drives from the TrueNAS box and get them running pre-clears in unraid, then rebuild them as parity/parity2 (not sure if I can do that in one go)... unless replacing disk4 is a better option (although these are going to be in that 3+ yr range as well), as it looks like it was disk5 I replaced earlier this year. I'd love to be on something like 4x 20tb drives by the end of the year if possible, so I can feel SOLID about this media server and use the other drives to learn new stuff with.

     I just have so many issues to resolve on this unraid server... this macvlan-to-ipvlan switch has been confusing for me: by the time I had read and re-read how to set up ipvlan with multiple NICs, that seemed to be fixed in 6.12.4 (yet Fix Common Problems still gripes). But mainly I need to track down why accessing this server has seemed SO LAGGY for the last few months, from both wired and wireless connections. It's possible that it's network related, but learning Wireshark at the same time has me moving VERY slowly in figuring out what is going on. Here is all I REALLY know: when I first upgraded/migrated from dual Xeon X5680's with 72gb ram to an AMD Ryzen 7 1700 w/32gb, it seemed to perform at a similar "feel" to the old server for a few months... then got worse. I always planned to upgrade to a 5000-series CPU (I never even intended on the 1700, it just came with the MB I bought), but I don't want to throw money at the problem atm if that isn't the main issue.

     I will start the correcting parity check now and move on from there... oh, and the file system check on disk4 and the extended SMART test @JorgeB recommended as well, and I guess posting the extended SMART results would be req'd. Again, thanks for your help; I will report back and/or mark solutions afterwards. -G