Posts posted by sdfyjert

  1. After a lot of further investigation today I am getting more and more convinced the issue is with unraid crashing due to some bad data on a disk. Here's how it went

    1. I loaded a usb-stick backup from a few days ago (before all hell broke loose)
    2. I boot up unraid (array off), the disk that was marked earlier today (but not when the issues started) as dirty and in need of parity fix is now green
      (this sounds like a bug)
    3. I start the array.... all hell breaks loose (reboot)
    4. Safe mode... the same...
    5. I disable everything (docker, VM manager)
    6. I manually disable the drive that needed rebuild (that was marked as dirty days after the reboot issues started).
    7. Start the array - no reboot.

     

    Reboot in safe mode

    1. Keep array offline
    2. Start an extended smart test on the dirty drive (now marked as green just by loading an older backup on the usb stick)
    3. Let it run for 10 minutes... no reboot.
      (stopped it there, as with the array running it would have rebooted as all the previous times)

    I am unfortunately leaning more and more towards a software issue... this is extremely discouraging so far.

  2. I have already tried that, since day 0, the machine would reboot the moment disk check was starting.

    I have also tried switching the power cables for the drives to different PSU outputs with no positive results.

     

    Going through the logs, I found the following entry, twice and in only one of them, that would normally look suspicious:

    Apr 10 13:37:53 nas kernel: mpt3sas 0000:01:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM

     

    But it only exists in a log from earlier this morning (twice). In all previous days and since that it hasn't appeared again.
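
To check how often a message like this shows up across saved syslogs, a small helper along these lines could be used. This is only a sketch: `count_in_logs` is a made-up name, and the log paths in the example are placeholders rather than real Unraid locations.

```bash
#!/bin/bash
# count_in_logs PATTERN FILE...
# Hypothetical helper: prints "<count> <file>" for each log file, where
# <count> is the number of lines containing PATTERN as a fixed string.
count_in_logs() {
    pattern=$1
    shift
    for log_file in "$@"; do
        # grep -c still prints 0 when nothing matches (it only exits non-zero)
        count=$(grep -c -F -- "$pattern" "$log_file")
        printf '%s %s\n' "$count" "$log_file"
    done
}

# Example (paths are placeholders):
# count_in_logs 'invalid VPD tag' /path/to/syslog /path/to/syslog.1
```

A count of 2 in one file and 0 everywhere else would match what is described above.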

    The latest thing I did before the issues started appearing was an update on the filemanager (dynamix). I would like to believe it is not related in any way.

     

  3. I have checked the syslog (was already recording it) and there's nothing out of the ordinary there. The messages I see in the display with the pictures posted earlier do not appear in the syslog which means syslog is not started at that point yet.

     

    As of today things have taken a turn for the worse. Now it randomly reboots, one of the drives got disabled and marked for errors (it is currently being emulated).

    Taking the machine offline is not much of an option right now.

    Running check/fix is impossible as it just reboots the system.

    Running SMART short tests all drives appear to be fine (extended SMART cannot run as it reboots before they are finished).

     

    Currently waiting for the Easter holidays to pass so I can get a new PSU delivered and test it. If things continue down this path I am considering installing TrueNAS on the same hardware (just different drives), out of curiosity, to verify whether the problem is hardware or software related.

     

     

    In the meantime, any ideas are welcome.

    fingers crossed

  4. I video recorded the display while booting in normal mode and in safe mode, and I get the following. Sorry for the poor quality but it's moving really fast, it's the best the camera could capture.

     

    It has been running parity checks and even a rebuild recently with the exact same configuration for over a year. Here are the two last ones, not long ago.
    image.thumb.png.08f2faa560a19e8ba648cb201c9019d1.png
     

    Any clues?

     

    image.thumb.png.6e816ae5d0c63bb6516153c8966de553.png

    image.thumb.png.024f16e6e9df511e9daeb89bfc884142.png

  5. Hi, I have noticed every time I move a large number of large files using dynamix the docker containers and the web interface of unraid nearly freeze (they respond very sporadically). I find this very strange given it is running on an i7-8700 with load of below 50% during the move.

     

    Is there something I can tweak to fix this or is this a known issue that will be resolved with future updates?

    Please let me know how I can help you help me.

    Thank you

    nas-syslog-20230330-1236.zip

  6. Short story

    On the server there are currently 2 USB devices attached. The Unraid stick and an SD-CARD reader with a card inside. Upgrade succeeded but device was no longer bootable. Upgrading with ONLY the usb stick present it worked fine.

     

    Artifacts

    After the upgrade I now have the following device listed and no way of removing it. My cache drive appears to be all in order and properly listed. Is this something I should worry about? How can I remove it?

    image.thumb.png.c8bea6db5aab124e199a36d0ab0c129f.png

     

     

    The Odyssey

    New upgrade out. Checking the change log, I read about fixes with regard to data corruption so must have upgrade ASAP.

     

    But first things first... fresh Flash Backup... (seriously, NEVER ever skip that step. IMO it should be part of the upgrade process - please guys seriously consider adding it in the upgrade work flow).

     

    I click upgrade... downloads upgrade, installs it and asks to reboot.

    Flawless... ok, reboot.

    10 minutes later... unraid nowhere to be seen in the network... searching for that bloody HDMI or DP cable (how I miss having a KVM)... after a looong search, I found one I could use.

     

    Latest messages on the terminal, boot device failed trying again click a button to reboot...

     

    Ok, let's recover using the back-up (never ever ever upgrade without a backup).

     

    Downloading the creator (mac)..., plugging in the Unraid stick... no USB stick listed. I click refresh, nothing. I must be doing something wrong... onward to the wiki. In the documentation it says USB stick size MAX 32GB (NOT TRUE). Really? I've been using the 128GB stick for years now... damn, might have something changed and it's now an issue?

     

    Taking out another stick 32GB max size, formatting FAT32, loading the creator, refresh... nothing. No stick listed... OK, so USB creator broken on mac... onward to Win10... all steps repeated... NO STICK LISTED (you can only get away with such things because most users are experienced, but seriously, even the open source apps copy-files-and-make-stick-bootable can always find the usb stick).

     

    Ok, so creator is broken (btw, was also broken years ago when I first purchased the unraid license).

    I'm thinking, it's UEFI, just copy the bloody files over and run the make_bootable_mac should do it. So I copy the files, open terminal, run the make_bootable_mac... will not run, unknown developer and the such... oook, preferences, allow... again... another binary that it tries to run same issue... same process, again... and another binary, same issue same process... DONE! All messages look ok except for that cannot get current directory (multiple times).

     

    ❯ ./make_bootable_mac
    INFO: make_bootable_mac v1.3
    
    INFO: The following drive appears to be the unRAID USB Flash drive:
    /dev/disk2
     123.7GB
    
    Permit UEFI boot mode [Y/N]: Y
    To continue please enter your admin Password:
    Sorry, try again.
    Password:
    INFO: Unmounting /dev/disk2
    Forced unmount of all volumes on disk2 was successful
    INFO: Writing MBR on /dev/disk2
    0+1 records in
    0+1 records out
    447 bytes transferred in 0.004785 secs (93417 bytes/sec)
    INFO: Mounting /dev/disk2
    Volume(s) mounted successfully
    syslinux for Mac OS X; created by Geza Kovacs for UNetbootin unetbootin.sf.net
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    "/dev/disk2s1" unmounted successfully.
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    /dev/disk2s1        	DOS_FAT_32                     	/Volumes/UNRAID
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    mountpoint is '/Volumes/UNRAID'
    checkpoint1
    checkpoint2
    checkpoint3
    Processing: /Volumes/UNRAID/ldlinux.sys
    checkpoint4
    checkpoint5
    checkpoint6
    checkpoint7
    Processing: /Volumes/UNRAID/ldlinux.c32
    checkpoint8
    checkpoint9
    checkpoint10
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    "/dev/disk2s1" unmounted successfully.
    checkpoint11
    checkpoint12
    checkpoint13
    checkpoint14
    checkpoint15
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    /dev/disk2s1        	DOS_FAT_32                     	/Volumes/UNRAID
    syslinux installed successfully!
    
    INFO: the Unraid OS USB Flash drive is now bootable and may be ejected.

     

    Stick back to the NAS, booting... no luck, same errors 😰

    Last try, stick to a windows machine, run the make_bootable.bat.... just a single line saying it's done.

    Stick back to the NAS, booting... YAY.

     

    But I REALLY want the upgrade... there are only 2 USB devices on the machine. Let's remove the SD-CARD reader... maybe something got it confused? 🤔

     

    Upgrade... reboot... success! YAY!

     

     

    btw, It feels embarrassing that windows make_bootable worked when the mac one was failing.

     

     

  7. Hi, I am trying to set a device script to run after automounting. 

    I switched on automounting for the device (works fine) then set a filename in the

    Device Script: /boot/config/plugins/unassigned.devices/scripts/import-script

    paste the script in the big text-area and click save at the bottom of the page.

     

    I can see the file is in place and generated with the script in-place with permissions

    -rw------- 1 root root  5189 Jun  4 11:25 import-script

     

    In the /boot/config/plugins/unassigned.devices/unassigned.devices.cfg I see the following "suspicious" entry 

    command.1 = "/boot/config/plugins/unassigned.devices/packages/"

     

     

    I unmount then automount... script has not run.

     

    I open the settings again, Device Script shows packages folder.

    I switch to the scripts folder, no file is shown for selection.

     

    What am I doing wrong?


  9. I have a cache-only share (`/mnt/user/tmp`). Lately Fix Common Problems has been reporting, worryingly often, that the cache-only tmp share is appearing on the array disks. This is not new; I have seen it in the past, but it was rare, about once a year. Now all of a sudden it has become a once-a-week issue.

     

    Some facts

    • The settings of that share have been unchanged for over 2 years now. Actually, none of my shares have had their settings modified for over 2 years.
    • The last two times it happened the folders found on the array are cache-only folders mounted to dockers running sonarr and radarr.
      • Sonarr (move files)
      • Radarr (move files)
    • The last two times it happened, I find only empty folders on the array drive (that belong to the cache-only share).
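
To see which array disks currently hold a copy of the share folder, and whether those copies really are empty, something like the following could be run over SSH. It is a sketch under assumptions: `list_share_copies` is a made-up helper, and it assumes array disks are mounted as `disk<N>` directories under a common root (`/mnt` in the example).

```bash
#!/bin/bash
# list_share_copies MNT_ROOT SHARE
# Hypothetical helper: for every MNT_ROOT/disk<N>/SHARE directory that exists,
# print the path and how many regular files it holds (0 = only empty folders).
list_share_copies() {
    mnt_root=$1
    share=$2
    for dir in "$mnt_root"/disk[0-9]*/"$share"; do
        [ -d "$dir" ] || continue   # skips an unmatched glob or a non-directory
        # tr strips the padding some wc implementations add
        file_count=$(find "$dir" -type f | wc -l | tr -d ' ')
        echo "$dir files=$file_count"
    done
}

# Example: list_share_copies /mnt tmp
```

Any line printed means the share exists on an array disk; `files=0` means that copy contains nothing but empty folders.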

     

    How can I help you help me?

    And how likely is it that I have some underlying bigger issue? Should I worry about my data?

  10. Hi, I just upgraded my rig from an old A10 to a decent i7 with an Asus Prime with Z390 chipset. Now that I have the fire-power I wanted to play a bit with the VMs. I wanted to passthru the ALC but it's nowhere to be found in the audio devices and then also nowhere to be found in the IOMMU groups. My A10 board also had an ALC sound card (really old one) and it was always there listed. The one shipped with the Asus Prime Z390 is ALC S1220A 8-Channel (which normally should be by default included with the Linux Kernel).

     

    Any ideas / clues / hints how to make unraid aware of the audio card?

     

  11. Update to v6.9.0 was flawless so I installed v6.9.1 without much thought.... lesson learned.

     

    • 1st reboot... nothing working. No network no nothing. Cold sweat started running already... move the NAS to a display or a display to the NAS.... either way this is going to be a disaster... 
    • 2nd reboot... (just in case)... NAS is accessible through the network!

    Not too bad! Starting the array... (more cold sweat)

    Array started... Parity Check started (oooookey, so no clean shutdown/reboot obviously there),

     

    • Opening Docker page... big red letters, Containers are starting...
    • 5 minutes later... no change, reload page, still containers are starting... damn... ok, parity does add some performance hit to the A10 tiny CPU running it xD 
    • 8 minutes later, reloading page.... Ahhhh the red letters are gone, all containers appear to be running!

    Next check, running some manual tests to verify all custom user-scripts have run... 

    To my big surprise at this point, they did.

     

    This was a bit more excitement than what I am looking for from a NAS but definitely time for some beer after all that cold sweat.

    Cheers, guys, thank you for all the hard work and for the Docker upgrades that appear to be coming out xD

     

  12. I am trying to run some commands and push them into the background, but the GUI terminal seems to get stuck and the script doesn't really seem to finish (I do not see the echo in the window, and running ps aux | grep avahi-publish shows the process does not exist).

    Any suggestions?

    #!/bin/bash
    #description=Generate avahi aliases
    #foregroundOnly=true
    #name=Avahi aliases (sub-domains)
    
    echo "plex.inas.local"
    /usr/bin/avahi-publish -a -R plex.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &
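
One possible explanation, which is an assumption and not something confirmed for the user-scripts plugin, is that the background process keeps the script's output stream open (so the window keeps waiting) or gets a hangup when the script ends. A generic way to rule both out is to detach the command's stdin, stdout and stderr and start it with `nohup`. The helper name `run_detached` below is made up for illustration.

```bash
#!/bin/bash
# run_detached COMMAND [ARGS...]
# Hypothetical helper: start COMMAND in the background with stdin, stdout and
# stderr detached, so the caller is not left waiting on inherited output pipes,
# and with nohup so a hangup signal does not terminate it.
run_detached() {
    nohup "$@" >/dev/null 2>&1 </dev/null &
}

# Example, using the command from the script above:
# run_detached /usr/bin/avahi-publish -a -R plex.inas.local "$(avahi-resolve -4 -n inas.local | cut -f 2)"
```

The trade-off is that any error output from the command is discarded; redirecting to a log file instead of `/dev/null` keeps it available for debugging.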

     

  13. Almost got it working (was easier than expected).

     

    Here's what I have so far, for whoever might be interested in going down that path

     

    Step-1: Change the default ports for the unraid user interface

    Go to http://inas.local:8008/Settings/ManagementAccess and change http and https ports. I set them to 8008 and 8443. Just remember to check your ports are not already used by some docker container, etc.

     

    Step-2: Install Nginx Proxy Manager (docker from user-apps) - or build your own container. I had my own configs ready from the past server but Nginx proxy manager works just as well and as an added bonus you get a comfortable web-gui with it.

     

    Step-3: Configure your subdomains (depends on how you went with step-2)

     

    Step-4: Install the user-scripts plugin (you can put the commands in /boot/config/go instead, but better skip the pain). Click to create a new script and add the avahi commands in there. For each subdomain add the following command

    /usr/bin/avahi-publish -a -R subdomain.domain.tld unraid-ip &
    
    ##
    ## In my case, for my unraid registered with inas.local
    ## To add jackett.inas.local
    ## Instead of the IP I used avahi-resolve to actually acquire the IP for the inas.local domain
    ## This way if the unraid IP changes you do not need to change IPs in the script.
    
    /usr/bin/avahi-publish -a -R jackett.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &
    

     

    This is what my final script looks like

    #!/bin/bash
    #description=Generate avahi aliases
    #foregroundOnly=true
    #name=Avahi aliases (sub-domains)
    
    echo "sonarr.inas.local"
    /usr/bin/avahi-publish -a -R sonarr.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &
    
    echo "plex.inas.local"
    /usr/bin/avahi-publish -a -R plex.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &
    
    echo "proxy.inas.local"
    /usr/bin/avahi-publish -a -R proxy.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &
    
    echo "jackett.inas.local"
    /usr/bin/avahi-publish -a -R jackett.inas.local $(avahi-resolve -4 -n inas.local | cut -f 2) &

     

    Next steps (help wanted and appreciated)

     

    1. Utilise the args in user-scripts (can we have variable number of variables?) and adjust the script to use them accordingly.

    2. Wrap the avahi-publish in a script to monitor for network changes and react accordingly (interface went up, restart avahi-publish, etc).
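
For the first item, the repeated echo/publish pairs could be folded into a loop over positional arguments, which works for any number of aliases. This is a sketch only: whether the user-scripts plugin actually passes a variable number of arguments is not something I have verified, and `publish_aliases`, `resolve_base_ip`, `BASE_HOST`, `AVAHI_PUBLISH` and `AVAHI_RESOLVE` are illustrative names, not plugin or Unraid settings.

```bash
#!/bin/bash
# Sketch: publish any number of mDNS aliases for one base host.
# All three variables are illustrative overrides with defaults taken from
# the script above; they are not Unraid or plugin settings.
BASE_HOST=${BASE_HOST:-inas.local}
AVAHI_PUBLISH=${AVAHI_PUBLISH:-/usr/bin/avahi-publish}
AVAHI_RESOLVE=${AVAHI_RESOLVE:-avahi-resolve}

resolve_base_ip() {
    # avahi-resolve prints "<name><TAB><address>"; keep the address column
    "$AVAHI_RESOLVE" -4 -n "$BASE_HOST" | cut -f 2
}

publish_aliases() {
    ip=$(resolve_base_ip)
    if [ -z "$ip" ]; then
        echo "could not resolve $BASE_HOST" >&2
        return 1
    fi
    # One background avahi-publish per alias, all pointing at the same IP
    for alias_name in "$@"; do
        echo "$alias_name"
        "$AVAHI_PUBLISH" -a -R "$alias_name" "$ip" &
    done
}

# Example:
# publish_aliases sonarr.inas.local plex.inas.local proxy.inas.local jackett.inas.local
```

Resolving the IP once up front also means a failed lookup is reported a single time instead of launching every avahi-publish with an empty address.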

     

    I acknowledge my documentation is insufficient to help people with little knowledge on the subject so feel free to ask and I'll do my best to answer.

  14. First tests, just manually running avahi-publish, reveal I forgot one important component: I also need to proxy the unraid web interface. This means I will need to move it to a different port and put it behind the nginx proxy as well. Is it feasible to change the unraid web interface port, or will everything break?

  15. Hey guys,

     

    Before moving to unraid I had an ubuntu server installation for NAS. The only thing I miss from my old setup is the aliases I had configured for avahi. I still have the script I had built for the aliases, which used systemd to start them and bind them to network events.

    I would really love to put that back in action.

     

    In practice, this is the service I had created for systemd in ubuntu server

    [Unit]
    Description=Publish %I as alias for %H.local via mdns
    Wants=network-online.target
    After=network-online.target
    BindsTo=sys-subsystem-net-devices-enp5s0.device
    Requires=avahi-daemon.service
    
    [Service]
    Type=simple
    ExecStart=/bin/bash -c "/usr/bin/avahi-publish -a -R %I $(avahi-resolve -4 -n inas.local | cut -f 2)"
    
    [Install]
    WantedBy=multi-user.target
    

     

    To use the service it was as simple as creating and starting a parameterised service

    systemctl enable [email protected]
    systemctl start [email protected]

     

    I know unraid is using avahi but I am not familiar with the folder structure and how to make the changes also permanent (some startup scripts that copy files I presume?)

    So, where are the files I would need to touch? Also, given unraid is not using systemd, I presume I will need to write an equivalent init.d script or there's some way with unraid?

     

    Looking forward to info to get this started. Once ready I will of course share whatever scripts come out of it so anyone can easily reuse them.

     

    Cheers.

  16. Update

    The next morning, the files that were removed from the share but were still on the disk reappeared on the share 🤔

    I removed them again from the share and this time they're gone for good.

     

    Quote

    how did you check that the file does not exist in the share?

    ssh, `ls /mnt/user/movies > /mnt/user/tmp/movies_share.txt` and `ls /mnt/disk1/movies > /mnt/user/tmp/movies_disk1.txt` and run a diff on the files (though I'm dead certain there's a faster way with pipes)
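
For reference, the same check can be done without the temporary files. The sketch below pipes the disk listing through a loop and prints only the names that are missing from the share; `only_on_disk` is a made-up helper name.

```bash
#!/bin/bash
# only_on_disk SHARE_DIR DISK_DIR
# Hypothetical helper: print every entry that exists in DISK_DIR
# but is not visible under the same name in SHARE_DIR.
only_on_disk() {
    share_dir=$1
    disk_dir=$2
    ls -1 "$disk_dir" | while IFS= read -r name; do
        [ -e "$share_dir/$name" ] || printf '%s\n' "$name"
    done
}

# Example: only_on_disk /mnt/user/movies /mnt/disk1/movies
```

In bash, `diff <(ls /mnt/user/movies) <(ls /mnt/disk1/movies)` is the direct pipe-style equivalent of the two-file diff.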

     

    Quote

    Parity is not aware of files (or file systems) as it works purely at the disk sector level.

    I understand how 1+1+1+1 = 1 works (🤣)

    I thought maybe during parity build it locks some sectors? If not the case and considering it's fs driver level perhaps something relating to concurrency. If it happens again

     

    Quote

    No idea if it will help but posting your system diagnostics zip file

    I highly doubt it will be of any use but here it is for what it's worth

     

    inas-diagnostics-20201026-1003.zip

  17. Here's the situation, I deleted some files (movies) using Plex. The titles got removed from Plex and the files no longer exist in the respective share. After noticing that the disk space is not freed I checked the actual hard disk (i.e. /mnt/disk1/movies) and the files are still there.

     

    Clues

    • The array is entirely on xfs (latest version of unraid).
    • At the same time the parity was being built (parity first time - just added the parity drive today)
    • There was some "intense" activity throughout the array and caches (fyi: this share does not use cache, cache = NO)

     

    Is this a bug or something expected?

    Will the files be automatically removed later (i.e. after the parity build is finished) or do I have to manually go and remove them?

     

  18. Changing the cache to YES for that share would actually do me more harm.

     

    I have a limited amount of cache (SSD drives) and they're primarily used with shares related to video-editing.

    If I set the movies share to cache=YES, it would easily saturate the cache, causing issues for the shares that really need it.

     

    Regarding the network-based approach, moving files from disk to disk on the machine itself is significantly faster than doing it over samba through another machine. The same file that I can move on the machine in 1 minute will take significantly more time when moved through Finder with the share mounted as network storage (samba).

     

    Given there's no "native" file-explorer in Unraid another temporary solution would be perhaps to "monitor" the FS for changes and when this behaviour is recorded invoke the mover for the particular files (and god forbid your array does not run out of space in the meantime 😂).

    As this behaviour could cause issues and headaches for people unaware of it (I only noticed it in time because the "Fix Common Problems" plugin spotted it), in my opinion this should be addressed at the OS level so the user settings are 100% applied in real time.