DjJoakim

Members
  • Posts

    46

Posts posted by DjJoakim

  1. 23 hours ago, sjrahn said:

    You have too much memory.

     

    But seriously, I too have 128GB and run into similar issues, and I believe the only fix is to adjust the write caching to RAM. I have been researching this issue for weeks; lots of people have similar issues, and the only resolution seems to be the following:

     

    Download the Tips and Tweaks plugin and lower the vm.dirty_ratio and vm.dirty_background_ratio values, or set those values yourself from the command line.

     

    I need to do some more in-depth testing, but I currently have mine set to 4 and 2. I haven't fully tested whether it fixes my specific symptoms when running the mover, since I have undertaken other mitigations to resolve them, but I do know I am seeing similar I/O waits with the new values too.

     

    If you want to find previous topics, search for 'io wait' or 'dirty_ratio', but here is the one in particular that led me to this fix:

     

     

    Wow, too much memory! I didn't think that could cause errors, only solve them! Hehe.
    I have read more about I/O waits and, just like you say, that seems to be the problem.

    The only thing confusing me is that it was working fine before, and I haven't changed anything since then. At first I was starting to think some disk in my array was causing this.

     

    I have changed the dirty_ratio and I hope this solves the issue. Thanks a lot for the response! :)
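
    To get a feel for what those percentages mean on a 128GB machine, here is a rough Python sketch. It assumes the common Linux defaults of 20 for vm.dirty_ratio and 10 for vm.dirty_background_ratio (not checked against Unraid), reads the quoted "4 and 2" as dirty_ratio and dirty_background_ratio in that order, and uses all installed RAM as the base. The kernel applies the ratios to available memory rather than installed memory, so the real thresholds will be somewhat lower. dirty_limit_bytes is a made-up helper name.

```python
GIB = 1024 ** 3


def dirty_limit_bytes(memory_bytes: int, ratio_percent: int) -> int:
    """Rough upper bound on dirty (not yet written) page cache for a ratio."""
    if not 0 <= ratio_percent <= 100:
        raise ValueError("ratio must be between 0 and 100")
    return memory_bytes * ratio_percent // 100


if __name__ == "__main__":
    ram = 128 * GIB  # installed RAM from the posts above
    settings = [
        ("dirty_ratio, assumed default", 20),
        ("dirty_background_ratio, assumed default", 10),
        ("dirty_ratio, lowered", 4),
        ("dirty_background_ratio, lowered", 2),
    ]
    for label, ratio in settings:
        limit = dirty_limit_bytes(ram, ratio)
        print(f"{label} ({ratio}%): about {limit / GIB:.2f} GiB")
```

    Under these assumptions, lowering dirty_ratio from 20 to 4 shrinks the amount of unwritten data that can pile up in RAM from roughly 25.6 GiB to roughly 5.1 GiB, which would mean the mover has far less to flush to the array in one go.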

  2. Hey!

    A while ago, maybe 3-4 months, I added a cache drive to my array.
    Some of the new files land on the cache disk first, and then every Wednesday they get moved to the array. This has been working fine for some months, but recently I noticed that every time the mover runs, my whole array freezes: I can't see any Docker containers, everything is super slow, and the UI even crashes sometimes. As soon as it's done moving, everything carries on as usual.

    I don't really know where to start looking for the cause. I don't get any parity errors or anything like that.

    I have attached my latest diagnostics.

     

    Thanks!

    slave-diagnostics-20230823-2035.zip

  3. 37 minutes ago, ich777 said:

    Everything seems correct, where do you have an issue?

    Is this an old instance from the container or did you just install it?

    How much RAM and CPU does the container use?

     

    You can ignore for example the GPU and shader errors because a dedicated server has no GPU or doesn't need any shaders to work.

     

    I just tried it on my server, I've attached the log from a completely fresh install which is working perfectly fine, at the bottom you see this message:

    #DSL Dedicated server loaded.

     

    Here is my log: sotf.log

     

    grafik.thumb.png.15f8f96d09f73da127413a813c22e43f.png

     

    I can also connect to it just fine:
    grafik.thumb.png.007540a4330520939befdea7cc186619.png

     

     

    So it looks to me like your server is also working just fine and you should be able to connect to it.

     

    Thank you for the fast reply!

    Yeah, I figured I could ignore all the errors, but it still got me wondering whether they had something to do with why it didn't run.
    Comparing the logs, they look pretty much the same.

    I have been trying to get this working for about a week now, but this is a fresh install from today.

    When the server is idle, the container uses 6 GB of RAM.

     

    These are the last lines in my log, and I guess yours looked pretty much the same before you connected.

     

    WARNING: Shader Unsupported: 'Sons/Trees/LeavesOptimized' - All subshaders removed
    WARNING: Shader Did you use #pragma only_renderers and omit this platform?
    WARNING: Shader If subshaders removal was intentional, you may have forgotten turning Fallback off?
    ERROR: Shader Sons/Trees/LeavesOptimized shader is not supported on this GPU (none of subshaders/fallbacks are suitable)
    WARNING: Shader Unsupported: 'Sons/Trees/LeavesOptimized' - All subshaders removed
    WARNING: Shader Did you use #pragma only_renderers and omit this platform?
    WARNING: Shader If subshaders removal was intentional, you may have forgotten turning Fallback off?
    WARNING: Shader Unsupported: 'Sons/Gui/Unlit/PickupIconGraph' - All subshaders removed
    WARNING: Shader Did you use #pragma only_renderers and omit this platform?
    WARNING: Shader If subshaders removal was intentional, you may have forgotten turning Fallback off?
    ERROR: Shader Sons/Gui/Unlit/PickupIconGraph shader is not supported on this GPU (none of subshaders/fallbacks are suitable)
    WARNING: Shader Unsupported: 'Sons/Gui/Unlit/PickupIconGraph' - All subshaders removed
    WARNING: Shader Did you use #pragma only_renderers and omit this platform?
    WARNING: Shader If subshaders removal was intentional, you may have forgotten turning Fallback off?
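
    One way to check a log like this without reading every shader message is to look only for the "#DSL Dedicated server loaded." line from the quoted working log. A minimal Python sketch, assuming the log is plain text with one message per line; summarize_log is a hypothetical helper and not part of the container:

```python
LOADED_MARKER = "#DSL Dedicated server loaded."


def summarize_log(log_text: str) -> dict:
    """Separate the ignorable shader noise from the line that matters."""
    lines = [line.strip() for line in log_text.splitlines()]
    shader_messages = [
        line for line in lines
        if line.startswith(("WARNING: Shader", "ERROR: Shader"))
    ]
    other_errors = [
        line for line in lines
        if line.startswith("ERROR:") and not line.startswith("ERROR: Shader")
    ]
    return {
        "loaded": any(LOADED_MARKER in line for line in lines),
        "shader_messages": len(shader_messages),
        "other_errors": other_errors,
    }


if __name__ == "__main__":
    sample = "\n".join([
        "WARNING: Shader Unsupported: 'Sons/Trees/LeavesOptimized' - All subshaders removed",
        "ERROR: Shader Sons/Trees/LeavesOptimized shader is not supported on this GPU (none of subshaders/fallbacks are suitable)",
        "#DSL Dedicated server loaded.",
    ])
    print(summarize_log(sample))
```

    If "loaded" comes back true and "other_errors" is empty, the server side is probably fine and the problem is more likely on the network side.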

     

    Well, I guess there is just something with my network that makes me unable to see the server for some reason. I had this problem with Valheim before, and the only solution was to connect through my LAN address, but the SONS server doesn't show up under LAN... :( Why didn't they add a console so you could force a connection to an IP?

     

    Well, I guess there is nothing wrong with the server and the container, so thanks anyway for your help and, as I said before, for the amazing work you put into this!

    I will be scratching my head trying to figure out my network and see if I can find a solution.

     

    Thanks!

  4. Hey @ich777, first of all, thanks for the amazing job you do with all your Docker containers. I have been running many of them through the years.

    But unlike the others, Sons of the Forest gives me a bit of a headache.

    All ports are open, but when I start the game, I get TONS of errors. I have read in this thread that you should ignore the yellow WINE errors because "they will always be there", but I am getting a lot of crazy errors; I will post some of them below.
    And since you can't connect to the server by IP address, I don't really know whether it's just me not finding the server in SONS or whether the server isn't started...

    Hopefully you will see something in the logs.

     

    Let me know if there is anything more you need.

     

    Thanks!

     

    (Since the log was too long, I posted some of it on Pastebin)

     

    https://pastebin.com/zHRdWATf

     

  5. On 10/29/2016 at 6:56 PM, Squid said:

    Run A Custom Script At Parity Check / Rebuild Start And Stop

    Use it to run a custom script to (as an example), shut down various docker applications, etc.  Adjust the variables within the script file.

     

    Note that you either need to run this in the background or at array start.  Running this in the foreground will not work.

     

     

    #!/usr/bin/php
    <?PHP
    # A simple script to allow you to run a custom script when a parity check starts or stops
    # Adjust the following variables to suit:
    
    $checkInterval = 300;                             # Number of seconds in between checks
    $startScript   = "full path to the script";       # The full path to the script to run when a parity check starts
    $stopScript    = "full path to the stop script";  # The full path to the script to run when a parity check stops
    
    # Don't touch anything below
    
    while (true) {
      $vars = parse_ini_file("/var/local/emhttp/var.ini");
      if ( $vars['mdState'] == "STOPPED" ) {
        break;
      }
      if ( ($vars['mdResyncPos'] != 0) && $vars ) {
        echo "Parity Check / Sync / Rebuild in progress.  Executing the start script ($startScript)";
        exec($startScript,$output);
        foreach ($output as $line) {
          echo $line."\n";
        }
        while (true) {
          $vars = parse_ini_file("var/local/emhttp/var.ini");
          if ( ($vars['mdResyncPos'] == 0) && $vars ) {
            echo "Parity Check / Sync / Rebuild finished.  Executing the stop script ($stopScript)";
            exec($stopScript,$output);
            foreach ($output as $line) {
              echo $line."\n";
            }
            break;
          } else {
            sleep($checkInterval);
          }
        }
      } else { 
        sleep($checkInterval);
      }  
    }
    ?>
    
     

     

    custom_script_parity_check_start_stop.zip 1010 B · 58 downloads

     

    So, another update on this one.
    Thanks for the script! It has been working fine for me after I added the missing / in /var/local.... I also had to add "bash" in front of the path to my script, since it didn't have permission to run otherwise.

    But since 6.12.3, the script doesn't work anymore...
    I know it's an old script, but do you maybe know why it stopped working? I have tried searching myself, but I can't find anything...

     

    Thanks!

  6. 2 hours ago, bonienl said:

    macvlan call traces impact the complete system.

     

    Your situation is a bit weird. I studied your first diagnostics and everything looks alright. There is nothing in the logs which explains a loss of communication.

     

     

    Alright, I see. I have now changed it to ipvlan; if it works I will mark this as the solution. Thanks for now :)

  7. 1 minute ago, bonienl said:

     

    In your latest log there is a macvlan call trace, change docker settings to use ipvlan instead.

     

     

    Alright, I will try it, but does this affect the rest of Unraid or only the Docker containers?

    Because it feels like after about a day Unraid just can't connect to the internet: if I try to update the OS I get "not available", and if I try to update plugins I also get "not available".

     

    Thanks

     

  8. 1 hour ago, JorgeB said:

    Were you having the issue when the diags were saved? I don't see anything out of the ordinary logged.

     

    Yes, but to be safe, here is a fresh diagnostic from just now.
    I haven't rebooted Unraid for 2-3 days, and the problem has been there for the last 2 days.

    Also, when I try "Check for update" on the plugin page, I get this:
     

    Checking connectivity ...
    No response, aborting!

     

     

    slave-diagnostics-20230705-1129.zip

  9. Hey!

    So I have an Unraid server that has been running fine for the last 3-4 years.
    Since 6.12.0 and 6.12.1 came out, I have had a weird network error that I never had before.
    After 1-2 days of uptime, it feels like the OS loses its connection to the internet(?). All my Docker containers etc. work and have internet access, but I can't update anything: plugins, containers and the OS update all show "not available", like the picture below.

     

    If i reboot the server, it works again for 1-2 days... 

     

    I have posted my diagnostics. Has anyone had the same problem and figured out a solution?

    I haven't changed anything in my firewall or on the Unraid server; the only change I made was updating the Unraid OS.

     

    Thanks!

     

    Edit - Just found out there is a new release, 6.12.2. I will try it and see if it fixes the issue.

    Edit2 - 6.12.2 didn't fix anything.. :(

     

    notav.png

    slave-diagnostics-20230702-1339.zip

  10. Just now, JorgeB said:

    Yes.

     

    Bad RAM will corrupt data, btrfs detects corrupt data, so problem is related.

     

    Alright, then it looks like I have bad RAM (since memcheck gave me errors), and that caused corruption on my cache drive, which is still there now.
    I will get new RAM from the retailer. As soon as I get it, I will run a scrub on the cache, and hopefully I don't have a lot of data corruption in my array.

  11. 3 minutes ago, itimpi said:

    You can run a scrub by clicking on the drive (or first member if a pool) and selecting the scrub option.

     

    A scrub is completely independent of Unraid's parity system. A btrfs formatted drive has internal block checksums so that it can check its own data integrity.

     

    Oh okay, I see. But hold on a minute, I didn't put two and two together: my array is formatted as XFS, but my cache is BTRFS. Could this mean that the BTRFS errors I get are related to the cache drive?
    But then I shouldn't be getting errors when I run memcheck, right?

     

    It feels like I don't have to run a scrub on my whole array, maybe only on the cache. Am I right?

     

    Thanks for clearing things up for me.

  12. 8 hours ago, JorgeB said:

    Run a scrub, any corrupt files will be listed in the syslog, then delete/restore those files from a backup.

     

    Sorry for being a total noob, but this isn't the same as a parity check, right? I did some googling and can't find much information on how to do it. Some people talk about a scrub being done during a parity check, but I don't really know what to believe.

  13. 6 minutes ago, itimpi said:

    Simply fixing the ram will not fix existing errors so they need to be fixed.    Only after that has been done do you worry about any new ones.

     

    Maybe a stupid question, but how do I fix the existing errors? Do I remove the files I think are corrupted, or is there some other way?

     

    Thanks.

  14. On 5/4/2023 at 8:37 PM, itimpi said:

    You will have to correct any current corruption after the RAM is fixed, but then hopefully it will stop happening.

     

    So I set the memory back to 2133MHz, and I am not getting any errors while the server is running, but when I run the mover I get BTRFS errors. Does that indicate that some files are corrupt because of the earlier bad memory, or is my memory still bad?
    I will run memcheck again at 2133MHz and leave it for 24h just to be sure...

  15. 21 minutes ago, JonathanM said:

    Replace the failing memory. Don't try to use the machine until memtest passes at least 1 full pass, preferably let it run for 24 hours.

     

    Do you think all my problems are related to the memory?
    I have already contacted the supplier where I bought the memory, and they will replace it under warranty. The machine is working fine (if you don't read the syslog). Should I be worried about keeping it running until I replace the memory?

     

    Thanks

  16. Hey!

     

    I upgraded my Unraid system recently, and after that I got a lot of errors.

    The upgrade was RAM, CPU and motherboard, and I also added an M.2 SSD cache drive.

     

    My system;

    ASUSTeK COMPUTER INC. ROG STRIX Z370-H GAMING (Latest BIOS that supports 128GB ram)

    Corsair 128GB (4x32GB) DDR4 3600MHz

    Intel® Core™ i7-8700K CPU @ 3.70GHz

     

    I have attached my diagnostics and also a picture of my memcheck run (yes, I didn't let it finish, but since it gave me errors I gave up after 17h).

     

    The problem began when I found out that not all files were being transferred off my cache, so I started wondering why. After looking at the syslog, I noticed a lot of errors like this:

     

    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1
    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 589, gen 0
    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1
    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 590, gen 0
    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1
    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 591, gen 0
    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1

     

    So I googled the problem a bit and found that people were getting this error when their RAM was failing. Since I had overclocked my RAM (because it's 3600MHz), I first tried setting it back to normal, 1.2V and 2133MHz, but the errors remained.

    That's when I started memcheck, which gave me the errors shown in the picture.

     

    So yes, the memory is only 2 weeks old and it could be DOA, but what confuses me is that when I run the mover, i.e. trigger the cache to move to the array, I get these errors. Is that normal when RAM is failing, or should I also consider that the M.2 drive might be broken?

    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 591, gen 0
    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1
    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 592, gen 0
    May  4 17:13:58 Slave  shfs: copy_file: /PATHTOTHEFILE /PATHTOTHEFILEAGAIN.partial (5) Input/output error
    May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1
    May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 593, gen 0
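
    Since the same messages repeat many times, it can help to boil the syslog down to which inodes failed their checksum and how high the corrupt counter has climbed. A minimal Python sketch that assumes the exact line format shown above; the regular expressions and the summarize_btrfs helper are illustrative only and may need adjusting for other kernel versions:

```python
import re
from collections import Counter

# Patterns written against the syslog lines quoted above (an assumption).
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<dev>[^)]+)\): "
    r"csum failed root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)
CORRUPT_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r".*?corrupt (?P<corrupt>\d+)"
)


def summarize_btrfs(lines):
    """Count csum failures per (device, root, inode) and track the
    highest 'corrupt' counter seen per device."""
    failures = Counter()
    highest_corrupt = {}
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            key = (match["dev"], int(match["root"]), int(match["ino"]))
            failures[key] += 1
            continue
        match = CORRUPT_RE.search(line)
        if match:
            dev = match["dev"]
            count = int(match["corrupt"])
            highest_corrupt[dev] = max(highest_corrupt.get(dev, 0), count)
    return failures, highest_corrupt


if __name__ == "__main__":
    sample = [
        "May  4 17:13:58 Slave kernel: BTRFS warning (device dm-7): csum failed root 5 ino 617557 off 90546176 csum 0xb6988775 expected csum 0xce5d6bad mirror 1",
        "May  4 17:13:58 Slave kernel: BTRFS error (device dm-7): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 593, gen 0",
    ]
    failures, highest_corrupt = summarize_btrfs(sample)
    print(failures)
    print(highest_corrupt)
```

    In the excerpts above every csum failure points at the same inode (617557) and offset, which would fit one damaged file being re-read over and over rather than many different files.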

     

    I am thankful for all the help i can get!

     

    EDIT
    I also want to add that in my last parity check I got over 500 errors. Maybe it's related!
     

    image0.jpeg

    slave-diagnostics-20230503-1928.zip

  17. On 4/28/2023 at 11:03 PM, DjJoakim said:

     

    Sorry for bumping this old post, but this doesn't seem to be working anymore. Does anyone have another solution like this?
    Or maybe I am just doing it wrong, but I can't get it working.

     

    Thanks!

     

    So if anyone else runs into problems with this script: on line 24 there is a / missing, and the script will give you errors because of that.

    So add a / in front of var/local/emhttp/var.ini on line 24.

     

    If you get permission errors like I did, add "bash" in front of the path to the start and stop scripts, like "bash /path/to/script/".
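
    The reason the missing / matters is that a path without a leading slash is resolved relative to whatever directory the script happens to be started from, so the file is most likely not found. For anyone who prefers to see the polling logic on its own, here is a rough Python sketch of the same checks. The mdState and mdResyncPos keys and the file path come from the quoted script; the key="value" file layout, the sample values and the helper names are assumptions for illustration:

```python
VAR_INI = "/var/local/emhttp/var.ini"  # absolute path, note the leading slash


def parse_var_ini(text: str) -> dict:
    """Parse simple key="value" lines (assumed layout of var.ini)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith((";", "#")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def array_stopped(values: dict) -> bool:
    """Mirror of the script's check for a stopped array."""
    return values.get("mdState") == "STOPPED"


def resync_running(values: dict) -> bool:
    """A non-zero mdResyncPos is what the script treats as 'in progress'."""
    position = values.get("mdResyncPos", "0")
    return position.isdigit() and int(position) != 0


if __name__ == "__main__":
    # Made-up sample content; on a real server you would read VAR_INI instead.
    sample = 'mdState="STARTED"\nmdResyncPos="123456"\n'
    values = parse_var_ini(sample)
    print("array stopped:", array_stopped(values))
    print("check/rebuild running:", resync_running(values))
```

    This only reproduces the decision the quoted script makes; it does not explain why the script stopped working on 6.12.3.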

  18. On 10/29/2016 at 6:56 PM, Squid said:

    Run A Custom Script At Parity Check / Rebuild Start And Stop

    Use it to run a custom script to (as an example), shut down various docker applications, etc.  Adjust the variables within the script file.

     

    Note that you either need to run this in the background or at array start.  Running this in the foreground will not work.

     

     

    #!/usr/bin/php
    <?PHP
    # A simple script to allow you to run a custom script when a parity check starts or stops
    # Adjust the following variables to suit:
    
    $checkInterval = 300;                             # Number of seconds in between checks
    $startScript   = "full path to the script";       # The full path to the script to run when a parity check starts
    $stopScript    = "full path to the stop script";  # The full path to the script to run when a parity check stops
    
    # Don't touch anything below
    
    while (true) {
      $vars = parse_ini_file("/var/local/emhttp/var.ini");
      if ( $vars['mdState'] == "STOPPED" ) {
        break;
      }
      if ( ($vars['mdResyncPos'] != 0) && $vars ) {
        echo "Parity Check / Sync / Rebuild in progress.  Executing the start script ($startScript)";
        exec($startScript,$output);
        foreach ($output as $line) {
          echo $line."\n";
        }
        while (true) {
          $vars = parse_ini_file("var/local/emhttp/var.ini");
          if ( ($vars['mdResyncPos'] == 0) && $vars ) {
            echo "Parity Check / Sync / Rebuild finished.  Executing the stop script ($stopScript)";
            exec($stopScript,$output);
            foreach ($output as $line) {
              echo $line."\n";
            }
            break;
          } else {
            sleep($checkInterval);
          }
        }
      } else { 
        sleep($checkInterval);
      }  
    }
    ?>
    
     

     

    custom_script_parity_check_start_stop.zip 1010 B · 51 downloads

     

    Sorry for bumping this old post, but this doesn't seem to be working anymore. Does anyone have another solution like this?
    Or maybe I am just doing it wrong, but I can't get it working.

     

    Thanks!

  19. 3 hours ago, JorgeB said:

    If you finished the previous parity check run another one to see if no more errors are detected, if it didn't finish run a correcting check followed by a non correcting one.

     

    It didn't finish, but when I rebooted the server, disk 7 came up as "emulated" with "Unmountable: Wrong or no file system" again, and then Unraid started a data rebuild, so I will see what happens when the rebuild is done.
