Posts posted by je82

  1. Looks like it's /mnt/disk4, could this be correct? I found this inside syslog.txt:

     

    Sep 27 11:16:04 NAS kernel: XFS (dm-3): Ending clean mount
    Sep 27 11:16:04 NAS kernel: xfs filesystem being mounted at /mnt/disk4 supports timestamps until 2038 (0x7fffffff)

  2. I'm trying to understand what XFS dm-3 is. When I run this from the CLI:

    dmsetup info -c dm-3
    Device does not exist.

    What am I looking at here?

     

    dmsetup ls

     

    returns:


    md1     (254:0)
    md2     (254:1)
    md3     (254:2)
    md4     (254:3)
    md5     (254:4)
    md6     (254:5)
    md7     (254:6)
    md8     (254:7)
    md9     (254:8)
    sdb1    (254:9)
    sdc1    (254:10)
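
    A rough sketch of what I'll try next to map dm-3 to a name (assuming the node actually exists; the N in dm-N is the device-mapper minor number, so it should line up with the second number in the (major:minor) pairs above):

    cat /sys/block/dm-3/dm/name      # prints the mapper name behind dm-3
    ls -l /dev/mapper/               # the symlinks here point at ../dm-N
    dmsetup info -c                  # the Min column matches the N in dm-N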

  3. Today I saw this in the log from Unraid:

     

    Quote

    Oct 20 11:18:11 NAS kernel: <TASK>
    Oct 20 11:18:11 NAS kernel: dump_stack_lvl+0x46/0x5a
    Oct 20 11:18:11 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
    Oct 20 11:18:11 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
    Oct 20 11:18:11 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
    Oct 20 11:18:11 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
    Oct 20 11:18:11 NAS kernel: iterate_dir+0x95/0x146
    Oct 20 11:18:11 NAS kernel: __do_sys_getdents64+0x6b/0xd4
    Oct 20 11:18:11 NAS kernel: ? filldir+0x1a3/0x1a3
    Oct 20 11:18:11 NAS kernel: do_syscall_64+0x80/0xa5
    Oct 20 11:18:11 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
    Oct 20 11:18:11 NAS kernel: RIP: 0033:0x1485fc4d55c3
    Oct 20 11:18:11 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
    Oct 20 11:18:11 NAS kernel: RSP: 002b:00007ffec5d47228 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
    Oct 20 11:18:11 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 00001485fc4d55c3
    Oct 20 11:18:11 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
    Oct 20 11:18:11 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000006240 R09: 00000000004958d0
    Oct 20 11:18:11 NAS kernel: R10: fffffffffffff000 R11: 0000000000000293 R12: 0000000000453dd4
    Oct 20 11:18:11 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
    Oct 20 11:18:11 NAS kernel: </TASK>
    Oct 20 11:18:11 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair
    Oct 20 11:18:23 NAS kernel: XFS (dm-3): Internal error !xfs_dir2_namecheck(dep->name, dep->namelen) at line 462 of file fs/xfs/xfs_dir2_readdir.c.  Caller xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: CPU: 17 PID: 30521 Comm: find Tainted: P        W  O      5.15.46-Unraid #1
    Oct 20 11:18:23 NAS kernel: Hardware name: Supermicro Super Server/X12SPI-TF, BIOS 1.4 07/11/2022
    Oct 20 11:18:23 NAS kernel: Call Trace:
    Oct 20 11:18:23 NAS kernel: <TASK>
    Oct 20 11:18:23 NAS kernel: dump_stack_lvl+0x46/0x5a
    Oct 20 11:18:23 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
    Oct 20 11:18:23 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
    Oct 20 11:18:23 NAS kernel: iterate_dir+0x95/0x146
    Oct 20 11:18:23 NAS kernel: __do_sys_getdents64+0x6b/0xd4
    Oct 20 11:18:23 NAS kernel: ? filldir+0x1a3/0x1a3
    Oct 20 11:18:23 NAS kernel: do_syscall_64+0x80/0xa5
    Oct 20 11:18:23 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
    Oct 20 11:18:23 NAS kernel: RIP: 0033:0x14ccdcd7e5c3
    Oct 20 11:18:23 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
    Oct 20 11:18:23 NAS kernel: RSP: 002b:00007fff742a4de8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
    Oct 20 11:18:23 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 000014ccdcd7e5c3
    Oct 20 11:18:23 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
    Oct 20 11:18:23 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000000030 R09: 0000000000450ad0
    Oct 20 11:18:23 NAS kernel: R10: 0000000000000100 R11: 0000000000000293 R12: 0000000000453dd4
    Oct 20 11:18:23 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
    Oct 20 11:18:23 NAS kernel: </TASK>
    Oct 20 11:18:23 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair
    Oct 20 11:18:23 NAS kernel: XFS (dm-3): Internal error !xfs_dir2_namecheck(dep->name, dep->namelen) at line 462 of file fs/xfs/xfs_dir2_readdir.c.  Caller xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: CPU: 10 PID: 30521 Comm: find Tainted: P        W  O      5.15.46-Unraid #1
    Oct 20 11:18:23 NAS kernel: Hardware name: Supermicro Super Server/X12SPI-TF, BIOS 1.4 07/11/2022
    Oct 20 11:18:23 NAS kernel: Call Trace:
    Oct 20 11:18:23 NAS kernel: <TASK>
    Oct 20 11:18:23 NAS kernel: dump_stack_lvl+0x46/0x5a
    Oct 20 11:18:23 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
    Oct 20 11:18:23 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
    Oct 20 11:18:23 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
    Oct 20 11:18:23 NAS kernel: iterate_dir+0x95/0x146
    Oct 20 11:18:23 NAS kernel: __do_sys_getdents64+0x6b/0xd4
    Oct 20 11:18:23 NAS kernel: ? filldir+0x1a3/0x1a3
    Oct 20 11:18:23 NAS kernel: do_syscall_64+0x80/0xa5
    Oct 20 11:18:23 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
    Oct 20 11:18:23 NAS kernel: RIP: 0033:0x14ccdcd7e5c3
    Oct 20 11:18:23 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
    Oct 20 11:18:23 NAS kernel: RSP: 002b:00007fff742a4de8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
    Oct 20 11:18:23 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 000014ccdcd7e5c3
    Oct 20 11:18:23 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
    Oct 20 11:18:23 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000006240 R09: 00000000004958d0
    Oct 20 11:18:23 NAS kernel: R10: fffffffffffff000 R11: 0000000000000293 R12: 0000000000453dd4
    Oct 20 11:18:23 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
    Oct 20 11:18:23 NAS kernel: </TASK>
    Oct 20 11:18:23 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair

     

    On my log server I can see that nearly 25,000 lines of errors like this were captured, over a span of almost 1.5 hours! I don't know whether something was actually happening for 1.5 hours or whether it simply took that long to write out the log. Anyway, I don't understand this message; I only just noticed it, everything works fine as far as I can tell, and the web GUI shows no disk errors. What am I looking at here? What could be the cause? Should I be worried? The server has had no issues for years.
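
    For my own notes, the kernel message says to unmount and run xfs_repair. A rough sketch of how I'd check it, assuming dm-3 really is the device behind /mnt/disk4 (it shows up as md4 / 254:3 in dmsetup ls above) and the array is stopped or in maintenance mode first:

    xfs_repair -n /dev/mapper/md4    # -n = read-only check, reports problems without changing anything
    xfs_repair /dev/mapper/md4       # actual repair, only after reviewing the -n output and with the filesystem unmounted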

  4. I ran some backups and a ton of files were written to the cache SSD, over 4 million files.

    The mover ran, and I just noticed it's still running. It's been 16 hours, the SSD is still filled with data, and I can't see anything actually moving.

     

    Two questions:

    1. How can I see what the mover is doing? (See the rough sketch at the end of this post for what I've been poking at.)

    2. How can I stop the mover and restart it?

     

    Thanks.
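
    This is roughly what I've been poking at so far (a sketch only; paths and the stop argument may differ between Unraid releases, so worth double-checking on your version):

    ps aux | grep -i [m]over              # find the mover process and its PID
    ls -l /proc/<PID>/fd | grep /mnt      # shows which file it currently has open (<PID> from the line above)
    /usr/local/sbin/mover stop            # recent releases accept "stop"; older ones may not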

  5. Darn, scary read. I have NFS disabled, as well as hardlink support, but I found someone who was doing things over SMB via a VM hosted on the same Unraid machine, which is actually something I was also doing when this error occurred. At first I thought it was due to rsync running, but looking closer, the rsync script finished successfully just seconds before the error occurred, so it's more likely I managed to trigger it doing things over SMB from the VM. Makes me wonder if you can craft a bad SMB packet and crash shfs on demand?

     

    I'll monitor the situation and hope for the best. Thanks for the information @JorgeB

    • Like 1
  6. Running Version 6.10.3, I had SHFS crash on me for the first time. The system is very stable; uptime was over half a year, and the same before that. I've never had this happen before and would like to investigate why it suddenly happened.

     

    The log isn't very helpful, no issues until suddenly:

    Sep 27 10:33:40 NAS shfs: shfs: ../lib/fuse.c:836: unref_node: Assertion `node->refctr > 0' failed.
    Sep 27 10:33:41 NAS emhttpd: error: get_filesystem_status, 6258: Transport endpoint is not connected (107): scandir Transport endpoint is not connected

     

    All I know is that rsync was running at the time it happened; my guess is that this is the culprit? Did it run out of memory, or was it something else? Like I said, I am not getting much help from the logs.

     

    For future reference, is there any way to start SHFS again if this occurs, or do I have to unmount the array and remount it to get it back up and running? I had all my Docker containers mapped directly instead of via shfs, so the only thing that happened when shfs crashed was that I couldn't access any data via the SMB shares.

  7. I have this issue too; it seems very random, "not available" at random times on random Dockers.

     

    My uneducated guess is that Docker is limiting the number of requests it accepts from a client, and if you have too many Dockers and run a check, it floods their system and some requests get blocked, ending up as "not available". None of the fixes in this thread solved my issue; it randomly comes up as "not available" on random Dockers at random times.

     

    I cannot use Squid's patch because I am on an old version, so for now I have to live with it.

  8. Had a major Unraid crash today.

     

    So once more the oom-killer ran while the mover was running and killed my biggest VM. It seems like whenever the fuse process is active it consumes a massive amount of extra virtual memory that is not visible via the normal tools for looking at the memory pool.

     

    After the oom-killer had killed my VM, memory consumption while the mover was running was 30% according to Unraid. My VM consumes 50% of the total memory, which means 20% should have been left over.

     

    I start the VM again without waiting for the mover to stop.

     

    Once I do this, the web GUI instantly becomes unresponsive.

     

    I SSH into the machine and try to see what's consuming memory. I start with the heaviest Docker container and try to stop it with the command "docker stop name".

    This yields no result; it cannot be stopped because something is not responding (I don't remember exactly how the message was worded).

     

    I try the command "docker kill name".

    This yields the exact same result as docker stop.

     

    Next I try virsh list --all to see if my VM is running; the command just hangs and never responds. I wait over 10 minutes.

     

    Next I run top and sort by virtual memory usage. I see that SHFS is requesting a massive amount of virtual memory, so I do a kill -9 PID to free up memory.

    The process is killed, but nothing really happens: the web GUI is still completely unresponsive, no Docker containers can be shut down via the CLI, and trying to interact with virsh yields no result, it is completely unresponsive.

     

    Finally I give up and issue the powerdown command in the CLI.

    Looking at my log server, it actually does something this time: it shuts down all the Docker containers and unmounts the disks (why I could not shut down the Dockers myself is my main question here).

     

    Eventually it locks up trying to umount several mounts, such as:

    umount sys/fs/cgroup target is busy

    umount sys/fs/cgroup/cpu target is busy

    umount sys/fs/cgroup/cpu/freezer target is busy

    umount /mnt target is busy

     

    I wait for some minutes, then hard reset using the hardware reset button.

     

    I am out of options here; my main 2 questions are:

     

    1. Why is so much virtual memory being consumed by the fuse operation at random times? How can I put a cap on this? I use Unraid for VM hosting, and it's becoming unreliable because the fuse system just eats memory at random, and the oom-killer then defaults to killing the process using the most memory, which is always going to be my VM because it statically consumes 50% of the total memory. (What I've been using to watch the memory is sketched at the end of this post.)

     

    2. Any ideas why docker and virsh were both inoperable? virsh was totally unresponsive; issuing any command to it yielded no response. The strange part is that docker was responding just fine, but I couldn't shut down any of the containers using stop or kill; both responded the same way, that something (I cannot remember exactly what it said) is not responding and therefore the container cannot be stopped.

     

    Thanks.
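
    For completeness, this is how I've been trying to see where the memory goes (a rough sketch; nothing Unraid-specific here):

    grep -i "out of memory\|oom-killer" /var/log/syslog     # what the oom-killer actually killed and why
    ps -eo pid,comm,vsz,rss --sort=-vsz | head -n 15        # processes ordered by virtual memory (vsz, in KiB)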

     

  9. I'm not a proficient coder, but can someone tell me whether it would be possible to rewrite some of the functions of "Active Streams" and have it log opened media files to a log file?

     

    I've been experimenting with Samba logging, and the problem is that it is far too noisy. If you open a playlist containing 10,000 MP3s, for whatever reason the player does a quick stat check on each file to see that it is indeed there, and this makes Samba logging produce 10,000 entries for "opened files", which makes the log sort of useless.

     

    My thought is: since Active Streams has an indicator of how long a file has been open, could you, let's say, make it only log a file as opened if it streams data for 10+ seconds, and skip the ones that come and go in under 10 seconds?

     

    I'd also like to know how Active Streams knows when someone opens something. Is it hooked up to some kind of API to know when to run smbstatus, or is it continuously running smbstatus every second, draining server resources even though nothing is going on? (Maybe this is not as taxing as I think, but still.)

     

    Is it even worth starting to look into this, or should I stop right here? I really want some way of logging what's going on via SMB on my Unraid server without things getting crazy noisy, and the logging options in SMB do seem to get too noisy. I need some way to ignore quick SMB connections but keep and log only the ones that are relevant because they are streaming data. (A rough sketch of the polling idea I have in mind is below.)
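
    Something like this is what I had in mind, just to make the idea concrete (very much a sketch: the awk column numbers are a guess and would need adjusting to the smbstatus -L output of your Samba version, and the paths are just examples):

    #!/bin/bash
    # Sketch: only log SMB files that stay open for at least one full poll
    # interval (~10s), so quick stat/open-close churn never hits the log.
    PREV=/tmp/smb_open.prev
    CUR=/tmp/smb_open.cur
    LOG=/var/log/smb_long_opens.log
    touch "$PREV"
    while true; do
        # smbstatus -L lists locked/open files; skip the header lines and
        # print share path + file name (column positions are an assumption).
        smbstatus -L 2>/dev/null | awk 'NR>3 && NF>=8 {print $7 "/" $8}' | sort -u > "$CUR"
        # a file present in two consecutive snapshots has been open 10+ seconds
        comm -12 "$PREV" "$CUR" | while read -r f; do
            grep -qF -- "$f" "$LOG" 2>/dev/null || echo "$(date '+%F %T') $f" >> "$LOG"
        done
        mv "$CUR" "$PREV"
        sleep 10
    done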

  10. Just now, dlandon said:

    These are correct.  Any settings you put in smb-extra.conf will overwrite those because smb-extra.conf is included after those settings.  The problem with the ones you removed is that they were after the smb-extra.conf include and were overwriting your settings.  You hadn't rebooted in a while and the legacy settings were building up.  Normally this doesn't cause any problems, but you are changing those settings in smb-extra.conf.  This is the reason for my latest changes.

     

    You can check your settings with 'testparm' to be sure they are set the way you want.

    Thanks, yes, I noticed a lot of changes to SMB going on in both Recycle Bin and Unassigned Devices :) All works now, thanks for clarifying! Have a good day and thanks for the amazing plugin!
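
    For anyone finding this later, a quick way to confirm which values actually won after all the includes (a sketch; grep for whatever parameters you care about):

    testparm -s 2>/dev/null | grep -E 'log level|vfs objects|full_audit'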

  11. 1 minute ago, dlandon said:

    Remove these.

    OK, but there is still a recycle bin entry directly inside smb.conf saying:

            # recycle bin parameters
            syslog only = No
            syslog = 0
            logging = 0
            log level = 0 vfs:0

     

    Will this setting not overwrite my own log level setting in smb-extra.conf?

  12. Just now, dlandon said:

    Reboot.  Some changes were made and the legacy settings are still in the smb.conf.

    Damn, my Unraid box is actually a production server, so I cannot reboot it right now. Can I just modify the files and run:

    Quote

    smbcontrol all reload-config

    samba restart

    If yes, which parts can I safely remove? The recycle bin entry in smb.conf, I take it? And perhaps one of the multiple includes?

    Thank you again for the support.

  13. What's going on with this plugin? Constant changes to SMB are messing with my config, and I am trying to figure out what to do here.

     

    I want to log my SMB user activity, so I have configured this in the extra settings:

     

    Quote

    log level = 2 vfs:2
    max log size = 50480
    log file = /home/logserver/smblogs/samba.%m
    vfs objects = full_audit
    full_audit:prefix = %u|%I|%m|%S
    full_audit:failure = none
    full_audit:success = open

     

    This does not appear to work because the Recycle Bin plugin seems to have put its SMB configuration everywhere. First we have:

    /etc/samba/smb.conf

     

    Quote

            # recycle bin parameters
            syslog only = No
            syslog = 0
            logging = 0
            log level = 0 vfs:0

     

    and a little lower in the same smb.conf file:

     

    Quote

    # hook for recycle bin log settings
            include = /etc/samba/smb-recycle.bin.conf

    # recycle_bin
            include = /etc/samba/smb-recycle_bin.conf

     

    A double entry including multiple configs. Checking these files, we have:

    smb-recycle.bin.conf

    Quote

    [global]
       syslog only = No
       syslog = 0
       logging = 0
       log level = 0 vfs:0

    smb-recycle_bin.conf

    Quote

    #vfs_recycle_start
    #Recycle bin configuration
    [global]
       syslog only = No
       syslog = 0
       logging = 0
       log level = 0 vfs:0
    #vfs_recycle_end

     

    Is this working as intended? Why is it insisting on being in [global], and why are there multiple entries leading to the same configuration multiple times?

     

    I want some kind of activity logging of files being accessed via SMB. I put this into the extra settings, but I believe the global log level 0 that the Recycle Bin plugin puts in so many places overwrites my settings, since my config works on Unraid without the Recycle Bin plugin installed.

     

    I love this plugin and it's a must-have, so I really want to make sure I am not breaking anything by fiddling with these settings. Your advice would be greatly appreciated, thank you!

  14. You can find the digest in use by running:

    Quote

    docker image inspect --format '{{index .RepoDigests 0}}' lscr.io/linuxserver/mariadb:latest

     

    Replace "lscr.io/linuxserver/mariadb:latest" with whatever the id of the container you want to extract the digest from

     

    But I could not find the exact digest among the tags, so I had to query the database by running SELECT @@version;, which gave me the version and solved the issue.
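
    For anyone else wanting to pin: the string printed by the inspect command can also be used directly as the image reference, so future pulls can't move it (sketch; <digest-from-inspect> is a placeholder for whatever the command printed):

    docker pull lscr.io/linuxserver/mariadb@sha256:<digest-from-inspect>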

  15. Not sure if this is the correct forum for this question, but I have a Docker container that I do not want to be updated anymore; it should stay where it is right now because it is a crucial application for some of the stuff I am running on Unraid.

     

    The particular Docker container is MariaDB (lscr.io/linuxserver/mariadb).

    Since it is just set to lscr.io/linuxserver/mariadb, it will always grab the latest from this list: https://hub.docker.com/r/linuxserver/mariadb/tags

     

    But I don't want to update it; I want to make it stay where it is now. How can I figure out which exact version it is running right now? docker ps just shows "lscr.io/linuxserver/mariadb".

     

    Any ideas?

     

     

  16. On 2/21/2022 at 4:42 PM, bonienl said:

    I can only speak for myself, I use the Dynamix File Integrity plugin to monitor bitrot.

    Never had any occurrence in the last 6 years, besides some false positives due to file processing.

    Though I keep monitoring, bitrot isn't a real issue for me.

     

    I really wish there were a way to run this plugin only on selected shares. Right now (if I remember correctly) it runs on the entire array, and with nearly 150 TB of data and over 10 million files, that's not going to happen. I installed the plugin and it would have taken nearly a month just to process all the files once. I'd be fine with just protecting certain files from bitrot that live in particular shares.

  17. Hi,

     

    I recently had issues with SHFS draining my RAM. I have done some tweaking on my Docker containers and moved their working path from /mnt/user/Appdata/ to /mnt/cache_appdata/Appdata/, which is my dedicated SSD for Docker appdata. This tremendously lowered both CPU and memory usage on my system.

     

    With everything up and running, I went from 83% of 128 GB memory used to 62%, and CPU usage is almost 15% lower, so I am very happy with this finding.

     

    To my question: since I have a dedicated SSD for my appdata folder and the drive itself is nowhere near full, would there be any benefit to setting this specific share to "Only" instead of "Prefer"? I don't know how Unraid operates internally, but my guess is that there could be a small benefit to setting it to "Only" in this case; am I right or wrong? I do know the risks of doing this: if the drive runs out of space, it's trouble for the apps running there, but I think I have that under good control. But if there is zero benefit, there is no reason to accept the risk of the cache becoming full. Do you experts have a take on this?

  18. 11 minutes ago, Frank1940 said:

     

    These have been the defaults for Linux since back in the days when most computers had 100MB of RAM.  With that amount of RAM, those percentages make a lot more sense.  If you install the Tips and Tweaks plugin, you can change the 'dirty' percentages.  I have been using 1% and 2% for years.

     

    One thing to check is that all your VM's and Dockers are writing their temp files to an actual device (HD or SSD) and not to Unraid's Ram-based file system.  An error in configuration is often the problem for this type of problem.  (Some programs are sloppy about cleaning up these files...)

     

    I re-ran some things that triggered the oom-killer before, and it does not trigger now after tuning the vm dirty values, so that was definitely the issue. I learn something every day :) (The knobs I ended up touching are sketched below.)
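
    For reference, these are the sysctls involved (a sketch; the 1%/2% values are just what Frank1940 mentioned, the Tips and Tweaks plugin does the same thing from the GUI, and sysctl -w changes do not survive a reboot unless re-applied at startup):

    sysctl vm.dirty_background_ratio vm.dirty_ratio     # show current values
    sysctl -w vm.dirty_background_ratio=1               # start background writeback earlier
    sysctl -w vm.dirty_ratio=2                          # cap dirty pages at 2% of RAM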

    • Like 1
  19. OK, I managed to trigger the oom-killer again; it is the rsync script. Not sure if I can limit the amount of memory it uses?

     

    This is the command I am running; do you see any problems with it, perhaps? Any advice on the matter would be greatly appreciated.

     

    Quote

    rsync -avu --chown=nobody:users --chmod=Du=rwx,Dg=rwx,Do=rwx,Fu=rwx,Fg=rwx,Fo=rwx --delete --backup --backup-dir=/mnt/disk5/Deleted --suffix="-deleted-"$(date +"%Y%m%d") --stats --exclude=.Recyc* -e "ssh -i /root/.ssh/NAS-rsync-key -T -o Compression=no -x" $src $dest >> /home/nasbackuplogs/logs/$(date +"%Y%m%d")-media-backup2.log
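
    Following up on my own question: if the memory really is rsync's own allocation, one thing I might try is capping the script's address space so allocations fail inside rsync instead of waking the oom-killer. A minimal sketch (the 4 GiB value is arbitrary, ulimit -v takes KiB and applies to the whole subshell; the full option list from above would go back in):

    ( ulimit -v 4194304; rsync -avu --delete "$src" "$dest" )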

     
