BAlGaInTl

Members
  • Posts

    33
Posts posted by BAlGaInTl

  1. I woke up this morning to find all of my Docker containers gone.  I dug in a bit more and found the following message for my cache pool, where the appdata folder resides:

     

    Unmountable: Unsupported or No File System

     

    Not sure how this happened, and I'm not great at troubleshooting it.  I'm not sure if something got corrupted?

     

    I run the backup utility, and it ran last night, so I have all my appdata.  Just wondering if there is a quick way to restore this?

     

    I've attached my diagnostics

    dmitri-diagnostics-20231030-0654.zip

     

    Edit: Changed from solved to add details below.

  2. 2 hours ago, ich777 said:

    Set this variable in the template to false:

    grafik.thumb.png.ca250cec53e016c2881bf0697ac77115.png

    (I've updated the description from the variable a bit for new installations, the UPDATE_CHECK will only work on the stable branch)

    I was just coming back to post that I found that was the issue.  I disabled that since it also happens every 60 minutes and it fixed it.

     

    Thanks for the confirmation.

     

     

  3. On 12/1/2022 at 11:11 PM, dremox1 said:

    Hello ich777,

     

    My friend is running a Valheim dedicated server using your docker, and we have an issue where all players are disconnected from the server on what seems to be an hourly schedule.  A connection error icon pops up and flashes on and off in the top left of the game screen, and we all get kicked at the same time.

     

    Attempted fixes:

    Remove crossplay 

    Change worldsave time (less frequent)

     

    There is nothing denoting a problem in the server logs either, as far as we can see (we are also not very experienced; this is our first server).  Not sure if you have seen anything like this before or if you have any suggestions to resolve it.  Any help would be greatly appreciated.

     

    On 12/1/2022 at 11:29 PM, ich777 said:

    How many players are connected to the server? Are you using the official version or the beta branch where you can test the new update?

    What kind of Internet connection does your friend have?

     

    I've never experienced such an issue at all and it seems that something is wrong with the Internet connection here from the server if this also happens when crossplay is enabled.

     

    This is a bit hard to troubleshoot since this is not your server.

     

    Another take on this: does he have other things running on the server which might restart the container? Or, better put, is the container restarted after such a disconnect happens (this should be visible in the log)?

     

    I came here looking for the same thing.

     

    I'm running into exactly the same issue with the public test beta.  We have 1-6 players at any given time.  We've used the ich777 container several times before without issue.  The only difference with this one is that I'm using a different appdata location and I've added:

     

    '-beta public-test -betapassword yesimadebackups'

     

    to the 'GAME_ID'.

     

    One interesting thing that I've noticed is that world files are NOT being created in the backups folder.  It remains empty. Since this behavior was happening each hour, and the default backup is 62 min, I tried setting the backup variable to 'false'.   This did not fix the issue and players are still being disconnected about every 60 minutes.
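For anyone trying to tell a 60-minute cause from a 62-minute one, a rough sketch like the following computes the gaps between disconnect times you have noted down. The function name and the sample times are made up for illustration, and it assumes all times fall on the same day (it does not handle crossing midnight).

```python
from datetime import datetime

def gaps_in_minutes(timestamps, fmt="%H:%M"):
    """Return the gap in minutes between each pair of consecutive times."""
    times = [datetime.strptime(t, fmt) for t in timestamps]
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]

# Hypothetical disconnect times written down by hand, not real data.
noted = ["19:05", "20:05", "21:05"]
print(gaps_in_minutes(noted))
```

If the gaps sit at 60 rather than 62, that points away from the backup interval and toward something else on an hourly timer.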

     

    I'm going to keep digging, but so far, I've had no luck.

     

    If I can provide anything to help in troubleshooting, let me know.

  4. 31 minutes ago, JonathanM said:

    The built in memtest will still somewhat exercise the ECC, but doesn't bypass the ECC functionality, so you have to look in the BIOS for memory error logging to see if the ECC registered a correctable bit error.

     

    The way ECC is supposed to work is if the memory error is correctable, it corrects and logs the error, similar to the way Unraid corrects for a read error on a hard drive. If the error can't be corrected, it hard locks the machine to keep silent errors from multiplying and corrupting things, similar to how Unraid drops a hard drive when it gets a write error.

     

    I agree more could be done to educate about the limitations of the built in memtest, it's many years old at this point.

     

    That's good info, thanks.

     

    So I confirmed that it does look like just one module is causing the issue.  I'll run off one 32GB module for the time being and try to RMA the other.

     

    Thanks to all for the help.

  5. Just now, trurl said:

    yes

     

    I wonder if there is a way that can be clarified in the boot menu, or removed as a boot option?

     

    Could have saved me a lot of time.  There is nothing to indicate that it isn't working.  It booted and ran just fine.  Recognized the ram and did 3+ passes in 24 hours.

     

    Also, that should probably be spelled out here since it seems that ECC is actually a good idea for Unraid servers even if it isn't necessary:

     

    https://wiki.unraid.net/Manual/Troubleshooting#RAM_Issues

     

    That says it doesn't have all the features, but not supporting ECC is a pretty big one, especially since MemTest still runs and seems to be doing its thing just fine.

     

    Perhaps it would be better if the old version just isn't included anymore, and users are simply directed to download the latest version for troubleshooting?

  6. 22 minutes ago, trurl said:

    builtin memtest doesn't work with ECC memory. You have to get the official memtest86

     

    Seriously?  I wasted a lot of time then.  That's the first time I've seen that.

     

    For now, I'm verifying that it wasn't just seated improperly by putting the stick I think may have been failing back.  If I start getting errors again, I may download the full MemTest86, but not sure if I will need to.

     

    Is it a licensing issue for including the most recent version that supports ECC with Unraid?

  7. 5 minutes ago, JorgeB said:

    Don't see how.

    Yeah... I don't know what I was thinking.

     

    Looking at the system info that I uploaded, it says 64GB.

     

    I pulled the second stick and started getting errors.  Moved that one I removed to the same slot I was testing and so far no errors.  Looks like I may have a stick going bad?  Fingers crossed I don't get any more errors.

     

    So is Memtest just worthless then?

  8. I'm working on doing that now.

     

    I pulled one stick and realized I have a 64GB kit and Unraid is only reporting 32GB with both sticks installed.  Memtest was reporting 64GB.  Could it be a weird issue with Unraid?

     

    Seems weird that I could go 24 hours with no errors in Memtest, and then I get memory errors within 10-30 minutes in Unraid.

  9. Hello All,

     

    I recently started noticing that my server was constantly running parity checks.  Turns out, it seems it was randomly rebooting once a day, and I can't figure out why.

     

    I looked at the logs, and I see a bunch of errors like this:

     

    Jul  4 15:11:48 Dmitri kernel: mce: [Hardware Error]: Machine check events logged
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: Corrected error, no action required.
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: CPU:0 (17:71:0) MC18_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|Scrub]: 0xdc2041000000011b
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: Error Addr: 0x00000007c3f5d0c0
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: IPID: 0x0000009600150f00, Syndrome: 0x000400040a801202
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
    Jul  4 15:11:48 Dmitri kernel: EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#2channel#1 (csrow:2 channel:1 page:0x0 offset:0x0 grain:64 syndrome:0x4)
    Jul  4 15:11:48 Dmitri kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

     

    So I thought maybe it was memory related.  I rebooted my server and ran Memtest for 24 hours with no errors.
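For anyone seeing similar log lines, here is a minimal sketch that tallies corrected errors per csrow/channel, which can hint at which module is involved. The regular expression is based only on the single EDAC line format shown above, so other kernels or platforms may log it differently; the function name is my own.

```python
import re
from collections import Counter

# Pattern derived from the one EDAC line in the log above, e.g.
# "EDAC MC0: 1 CE ... (csrow:2 channel:1 page:0x0 ...)".
# Other systems may format this line differently.
EDAC_RE = re.compile(r"EDAC MC(\d+): (\d+) CE .*?\(csrow:(\d+) channel:(\d+)")

def count_corrected_errors(lines):
    """Sum corrected-error counts keyed by (controller, csrow, channel)."""
    totals = Counter()
    for line in lines:
        match = EDAC_RE.search(line)
        if match:
            mc, count, csrow, channel = (int(g) for g in match.groups())
            totals[(mc, csrow, channel)] += count
    return totals

# The EDAC line copied from the log above.
sample = [
    "Jul  4 15:11:48 Dmitri kernel: EDAC MC0: 1 CE Cannot decode normalized "
    "address on mc#0csrow#2channel#1 (csrow:2 channel:1 page:0x0 offset:0x0 "
    "grain:64 syndrome:0x4)",
]
print(dict(count_corrected_errors(sample)))
```

To run it against a real log, pass an open file object instead of `sample`. If every hit lands on the same csrow and channel, that is at least consistent with a single bad stick or slot.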

     

    Hardware:

    ASRockRack X470D4U

    AMD Ryzen 9 3900X 12-Core @ 3800 MHz
    32GB DDR4 ECC

     

    This is a relatively new development.  Any idea on what may be going on?

     

    Thanks.

    dmitri-diagnostics-20220704-1554.zip

  10. On 3/12/2021 at 8:53 PM, Squid said:

    Reason is this (when I try and add a "user" share)

     

    image.thumb.png.9af413f6c71603660f83a1fbc5911301.png

     

    From the command prompt, you need to rename the share to something else:

    
    cd /mnt/user
    mv user my_users

     

     

    So renaming the share worked and everything showed up.  I just had to give users permission to that share again.

     

    It seems like the upgrade script should check to see if that's an issue before the upgrade. 
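As a rough idea of what such a pre-upgrade check could look like, the sketch below flags share names that collide with a reserved name. This is not how Unraid's upgrade process actually works. The reserved set holds only "user" because that is the one name this thread confirms, and the case-insensitive comparison is my assumption.

```python
# Only "user" is confirmed as a problem name by this thread; extend the set
# if other reserved names are known. Hypothetical helper, not Unraid code.
RESERVED_SHARE_NAMES = {"user"}

def find_reserved_shares(share_names, reserved=RESERVED_SHARE_NAMES):
    """Return the share names that match a reserved name, ignoring case."""
    return sorted(name for name in share_names if name.lower() in reserved)

# Example share list; in practice this could come from listing /mnt/user.
shares = ["Media", "appdata", "user"]
conflicts = find_reserved_shares(shares)
if conflicts:
    print("Rename these shares before upgrading:", ", ".join(conflicts))
```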

     

    Either way, thank you for the help on getting this fixed.  I was racking my brain.

     

    Thanks.

  11. 1 hour ago, Squid said:

    What does that mean?

     

    It just doesn't seem to exist in my shares.  I know it was set up, because I have //server/user mapped on multiple computers here in my house and they all stopped working.

     

    This is what I see on my shares now:

     

    shares.thumb.PNG.36ca1a7b84af21fe3df39773627e15b1.PNG

     

    Media is the main share where I have movies, TV, home videos, etc.

     

    The share "user" used to be there I think... It used two separate physical disks from the rest of the shares.  It worked for so long without issue, I don't even remember how I had it set up. :)

     

    The two disks are still there... I can still browse the physical disks and see the files.  I just can't connect by SMB, and all my network mappings broke.

     

    Can a share named "user" just be recreated using Disk 3,4 and it will go back to working?

     

    I'm just perplexed as to how it disappeared in the first place.  There seem to be config files pointing to it in diagnostics, but it seems to be gone from every Unraid menu.

  12. 9 hours ago, Squid said:
    
    shareReadList=""
    shareWriteList="..."

    That share has no authorized users to read it.  Try setting up the user permissions again in the share (Toggle something off / on if necessary)

     

    I guess that's the problem... there is no share listed for it. There's no place for me to set permissions.

     

    I don't know where it could have gone.

     

     Maybe I'm missing something.

  13. The diagnostics are attached.

     

    The shares used to be visible at //dmitri/user/

     

    What was once the "user" share was segregated from my other shares and was on Disk 3/4 only.  Like I said... I can still see the data physically on the disk, but the share is gone. None of my applications or media touch those two drives.  All of that stuff is working as normal.

     

    Any help/insight is appreciated.

     

    Thanks.

     

     

    dmitri-diagnostics-20210312-0700.zip

  14. I recently upgraded to 6.9.0, and I thought everything went fine.  Today I rebooted my Windows system, and it said it couldn't reconnect to my user shares.  These are the ones that I had specifically set up for different user accounts.

     

    All of the other regular shares for media and such are there.

     

    I've tried rebooting, but no luck.

     

    I can see that the disks are there with no errors.  I can also see that the files are still there. 

     

    What would make the "user" shares disappear?

  15. Having just considered moving from standard software RAID to unRAID, this is what I would suggest... Assuming that you don't already have an easy to access backup.

     

    Get a good deal on a WD 8 or 10TB external drive (Easy Store or My Book).

     

    Copy the data to that and verify that it's good.

     

    Build your new pool without a parity drive.

     

    Copy the data from the external to the new pool, and verify the data.

     

    Shuck the 8 or 10TB drive, and install it as your parity drive.

     

    Now you have the new pool/system set up, and plenty of room for easy expansion since your parity drive is now much larger.

     

    Keep your old 4TB parity drive as a spare in case a drive fails.
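For the "verify" steps above, one option is to compare checksums between the original and the copy. This is a minimal sketch with function names of my own choosing; it only checks that every file under the source exists in the copy with identical contents, and it ignores permissions, timestamps, and extra files in the copy.

```python
import hashlib
import os

def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(source_root, copy_root):
    """Return (relative_path, problem) pairs for missing or differing files."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(copy_root, rel)
            if not os.path.isfile(dst):
                problems.append((rel, "missing"))
            elif file_sha256(src) != file_sha256(dst):
                problems.append((rel, "mismatch"))
    return sorted(problems)
```

Calling `compare_trees` with the old location and the new one (both paths are yours to fill in) returns an empty list when everything matches. Expect it to be slow on multi-terabyte copies, since every file is read twice.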

     

     

  16. Okay, I had another thought today. I'm starting the build tonight...

    I have the 2 x 10TB drives shucked and ready to go.

    Am I better off doing 1 parity and 1 data to start? Or is it better to start with 2 data drives so that the data is balanced across the two drives from the start?

    Either way, I'll be adding the 3rd drive in a few weeks.

    Thanks again for all the help.


  17. 14 minutes ago, jonathanm said:

    Depends on the enclosure. USB is not recommended for array members, for multiple reasons, disk ID being one. Heat is another: USB enclosures are typically designed for light usage, while a parity build or check keeps the drives active 100% for many hours, which will likely overheat the drives if they are in the enclosure. Some enclosures also remap the drive in some way, making the content unreadable once shucked.

     

    If you insist on using them via USB, and they are not readable once you shuck them, all is not lost. As long as you have valid parity and the drives stay healthy, you can unshuck and rebuild them directly connected one at a time. Still not recommended though.

     

    I was afraid of that.

     

    It's a WD My Book.

     

    I was only planning on building a very basic pool without parity initially.  Then simply copy all the files from my current NAS over.  Then shuck the drives and put them in the proper server.