
Posts posted by NAS-newbie

  1. I have now run about 60% of the parity check and no more errors have been discovered beyond the 513 found in the first 8% of the array. My array consists of 3 "enterprise grade" disks (20TB parity, 20TB data and 10TB data) and 3 smaller and older "consumer grade" disks left over from when I first tried out UnRAID (3TB and 2x2TB) that do not contain any data (I moved what little was on them to the new drives a long time ago and the new ones have not filled up enough to start using the small disks since).

    Assuming no more parity errors are found, I feel quite sure one or more of these old drives (which I probably should have removed when adding the new, much larger disks) are to blame for the parity errors.

    As they add relatively little capacity to the array and seem likely to be causing these serious problems, I am planning to remove them from the array.

    Given that I have not selected "correct parity" in the check I currently have running, what is the best way to do this operation?

    As I have backups of all critical data, I am considering risking the "new configuration" route (which leaves a window of about 36 hours where the array has no parity protection, during which I may have to restore from backups if a drive fails) and letting that build new parity.

    Any suggestions for a faster/safer/better procedure than the sketch below are appreciated!
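
    For reference, the rough procedure I have in mind (my own sketch based on reading about Tools → New Config, so please correct me if I have a step wrong):

    1. Stop the array.
    2. Go to Tools → New Config and choose to preserve the current assignments.
    3. Unassign the three small disks, leaving the parity disk and the two large data disks in place.
    4. Start the array and let parity rebuild (the ~36 hour unprotected window mentioned above).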

  2. I disabled all containers and VMs (to avoid more writes until I have done some more tests) and let the check continue running. Late at night I started receiving a lot of messages like this in the syslog (possibly when the "mover" ran). An out-of-memory error seems strange as the box has 64GB of memory, so some size-limited pool must be exhausted...

    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: nchan: Out of shared memory while allocating message of size 9059. Increase nchan_max_reserved_memory.
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: *274131 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/devices?buffer_length=1 HTTP/1.1", host: "localhost"
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: MEMSTORE:00: can't create shared message for channel /devices
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [crit] 6892#6892: ngx_slab_alloc() failed: no memory
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: shpool alloc failed
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: nchan: Out of shared memory while allocating message of size 277. Increase nchan_max_reserved_memory.
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: *274140 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/update1?buffer_length=1 HTTP/1.1", host: "localhost"
    May 20 23:50:22 NAS nginx: 2023/05/20 23:50:22 [error] 6892#6892: MEMSTORE:00: can't create shared message for channel /update1
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [crit] 6892#6892: ngx_slab_alloc() failed: no memory
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [error] 6892#6892: shpool alloc failed
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [error] 6892#6892: nchan: Out of shared memory while allocating message of size 3603. Increase nchan_max_reserved_memory.
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [error] 6892#6892: *274142 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [error] 6892#6892: MEMSTORE:00: can't create shared message for channel /var
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [crit] 6892#6892: ngx_slab_alloc() failed: no memory
    May 20 23:50:23 NAS nginx: 2023/05/20 23:50:23 [error] 6892#6892: shpool alloc failed

    and eventually the check was paused at about 35% (interestingly, with no more errors found than at ~8%).

    New diagnostics included if anybody can have a look. I have now restarted the parity check to see if more errors are encountered across the rest of the disks or if the "only" ones are the initial ~500...
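
    In case it helps someone searching later: as far as I understand (treat this as an assumption on my part), the nchan messages come from the web GUI's nginx, which publishes status updates through a fixed-size shared memory pool, so the 64GB of system RAM is not the limit. Restarting nginx from a terminal should clear that pool without a reboot:

        # Restart the web GUI's nginx to release nchan's shared memory pool
        # (standard Slackware-style service script location on UnRAID)
        /etc/rc.d/rc.nginx restart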

    nas-diagnostics-20230521-0645.zip

  3. I have recently replaced the motherboard & CPU of my UnRAID server, and after this I wanted to check that parity was OK, but contrary to what I expected/hoped I already have over 500 errors with under 10% of the check done. Sadly it was a few months ago that I ran the last check, so I can't say for sure if the problem is with the new hardware or existed before the switch (let's call it a lesson for the next hardware change: run a check just before). I have not asked for the errors to be corrected, to be better able to investigate if this type of problem occurs again. With the new hardware I initially had some problems getting UnRAID to start properly, so I had to do a few unclean shutdowns, but I do not think data was written at those times (as the system most likely did not start the array). The new motherboard has ECC memory and I ran a memory test for a number of hours without any errors before testing it with UnRAID. I also tried installing Linux on an SSD that I connected in turn to each of the six SATA channels I use for UnRAID (2 are regular separate SATA ports and 4 go through a "slim SAS" cable with 4 SATA connectors on the other end) and they at least worked reliably enough to seemingly run Linux without any errors.

    For what it is worth, here are the diagnostics. What would be the best action to take in this situation? Run some more diagnostics? As shown below, I plan to start with the SMART data.
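
    One thing I will check in the meantime is each disk's SMART data for cabling or media errors (device names like /dev/sdb are just examples from my setup; adjust as appropriate):

        # Full SMART report for one array disk; repeat per device
        smartctl -a /dev/sdb
        # Quick look at the attributes most often implicated in parity errors
        smartctl -A /dev/sdb | grep -Ei 'crc|reallocated|pending'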

    nas-diagnostics-20230520-2051.zip

  4. I temporarily added a video card (in addition to the one built into the CPU) and then I got to the console, and UnRAID also booted as it should again in normal mode. I then did a fresh backup of the flash drive just in case, removed the video card again, and it still boots as it should, so I can't reproduce the problem. Let's hope it was a one-time thing that will not come back to haunt me 🙂
    Thanks for the help anyhow!

  5. Yes, the BIOS shows the normal message about pressing Del to enter it, but then the screen goes blank. There is no complaint that the OS image is not bootable, so I assume the UnRAID boot starts but hangs, or the console is directed "elsewhere" (or both eventually, as the machine does not reach DHCP and get an IP as it did before)...

     

    Does UnRAID boot if it for some reason finds no graphics board/chip (headless)?

  6. I rebuilt my UnRAID system a few days ago with a new CPU/MOBO and it has worked perfectly until I tried enabling the CPU's Quick Sync (I just downloaded a plugin that did this, as well as another plugin that provides dashboard information about the GPU) with the intention of later trying it with Plex, and for a completely unrelated reason had to restart the server. I initiated this cleanly from the web console, but after this I can't connect over HTTP and the native machine console does not show anything even during UnRAID boot (maybe a natural consequence of enabling the GPU that I was not aware of, or the system crashed/hung during boot) 😞

     

    Is there a way to force UnRAID into "safe boot" (or perhaps enable a serial console or in some other way debug the problem) in this situation, or do I have to rebuild a new USB stick (my backup is quite old, so I would prefer to avoid that)?
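
    One idea I want to check with those who know better: since the flash drive is FAT32, it can be edited on another machine. If I read the stock menu file correctly (an assumption on my part, not verified on every release), moving "menu default" under the safe mode label in syslinux/syslinux.cfg should make the next boot come up without plugins:

        label Unraid OS Safe Mode (no plugins, no GUI)
          menu default
          kernel /bzimage
          append initrd=/bzroot unraidsafemode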

  7. I have been playing with a Home Assistant VM for a few days and it has worked great. Now I realized I have located the VM's primary file on a share with backup=yes rather than in the domains share, and thought I could fix this by:

    1. Stopping the VM.
    2. Moving the directory with the qcow2 file to domains.
    3. Editing the VM to specify the new primary disk location (under domains) of the qcow2 file.
    4. Starting the VM again.

    but sadly the VM does not boot any longer; when I go to the console it just shows a "mapping table..." and a "press ESC in one second..." text rather than booting, and no network interfaces are displayed on the VM main page.

    I even tried moving the directory back and changing the VM definition back, but with the same result, i.e. still no boot 😞

     

    I tried running "qemu-img check" on the image and it did not find any errors.

     

    Could I still somehow have destroyed the VM just by moving it, or what may have happened? Any tips on things to try?
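
    In case it helps with diagnosis, this is what I believe the primary disk element in the VM's XML should look like after the move (the path is only an illustration of my layout; adjust as appropriate). My working assumption is that if the source file does not point at the new location, or if editing the VM recreated it with a fresh NVRAM so the UEFI boot entry was lost, symptoms like mine would appear:

        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2' cache='writeback'/>
          <source file='/mnt/user/domains/homeassistant/haos.qcow2'/>
          <target dev='hdc' bus='virtio'/>
        </disk>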

     

  8. Sorry for not being very clear - I use a container with Privoxy and it accepts OpenVPN config files from any VPN provider like PIA, NordVPN etc. As I mentioned, it actually works (it passes all leakage tests I have tried, etc.) so it is OK from a privacy point of view, but for some reason I see the mentioned latency problem when looking up new web pages. I assume the cause is some setup problem on my side and was hoping somebody else has seen a similar problem and solved it, or has some ideas how I can diagnose it...

  9. Replying to an old thread, but I am having a similar problem with NordVPN set up as a "proxy server" in UnRAID, and it is not the bandwidth but rather the "latency" when going to a new page that is the problem - bandwidth is 200Mbit/s or more, but each time I go to a new page it takes like 5 seconds until the page renders... It feels to me like it could be DNS that for some reason is EXTREMELY slow. Any suggestions on what may be wrong?
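
    To put numbers on the DNS suspicion, this is the kind of test I am running from a machine behind the proxy (the resolver address is just an example; compare the VPN's DNS with a public one):

        # The ";; Query time:" line in the output shows the lookup latency
        dig example.com
        dig @1.1.1.1 example.com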

  10. On 3/12/2023 at 2:33 PM, Nodiaque said:

    It's because of the Docker image. The official Docker image does a chmod on startup to its own user. I myself created an openhab user and assigned it the ID used by the Docker image. You can check the official Docker image's documentation about that.

    Thanks for the info - I found information here https://www.openhab.org/docs/installation/docker.html about creating the openhab user, but I am not very knowledgeable about Docker (I just install ready-made containers in UnRAID and have not had this problem before) so I do not understand how to follow these instructions - should this be done in UnRAID before or after installing the container (I have already done a lot of setup in openHAB, so I do not want to redo the installation), or is it something I should do inside the container somehow?
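
    For anyone who can confirm: my reading of the linked page is that the user is created on the UnRAID host (not inside the container), with the UID/GID the image runs as - 9001 in the official docs, though treat that number as an assumption to verify against your own container:

        # On the UnRAID host: create a group and system user matching the
        # UID/GID the openHAB image uses, so ownership lines up on both sides
        groupadd -g 9001 openhab
        useradd -r -u 9001 -g openhab -s /sbin/nologin openhab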

  11. I am trying to pass my DVD burner through to an Ubuntu Linux VM to be able to BURN DVDs/BDs with data from the array. The burner is on the same controller as some of the array disks.

    "lsscsi" and "System devices tab" lists it as "[9:0:0:0]    cd/dvd  Optiarc  DVD RW AD-7200A  1.08  /dev/sr0"

    Executing "cat /proc/sys/dev/cdrom/info" in UnRAID shows:
    CD-ROM information, Id: cdrom.c 3.20 2003/12/17

    drive name:             sr0
    drive speed:            48
    drive # of slots:       1
    Can close tray:         1
    Can open tray:          1
    Can lock tray:          1
    Can change speed:       1
    Can select disk:        0
    Can read multisession:  1
    Can read MCN:           1
    Reports media changed:  1
    Can play audio:         1
    Can write CD-R:         1
    Can write CD-RW:        1
    Can read DVD:           1
    Can write DVD-R:        1
    Can write DVD-RAM:      1
    Can read MRW:           0
    Can write MRW:          0
    Can write RAM:          1

     

    I do not know KVM virtualization very well, so I am trying to follow various answers in forums and am right now trying:

     

    <hostdev mode='subsystem' type='scsi' managed='no'>
      <source>
        <adapter name='scsi_host9'/>
        <address bus='0' target='0' unit='0'/>
      </source>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
    </hostdev>

    but I am not sure if this is right, or how to find what to specify for "controller". The commands "hwinfo --storage-ctrl" and "hwinfo --disk" are not available in UnRAID.

     

    Anyhow, this setting does result in the DVD showing up in the VM, but NOT as a writer 😞

     

    In the VM "lsscsi" shows:
    [1:0:0:0]    cd/dvd  QEMU     QEMU DVD-ROM     2.5+  /dev/sr1
    [4:0:0:0]    cd/dvd  Optiarc  DVD RW AD-7200A  1.08  /dev/sr0
    [9:0:0:0]    cd/dvd  QEMU     QEMU DVD-ROM     2.5+  /dev/sr2
    and "cat /proc/sys/dev/cdrom/info":
    CD-ROM information, Id: cdrom.c 3.20 2003/12/17

    drive name:             sr2     sr1     sr0
    drive speed:            4       4       1
    drive # of slots:       1       1       1
    Can close tray:         1       1       1
    Can open tray:          1       1       1
    Can lock tray:          1       1       1
    Can change speed:       1       1       0
    Can select disk:        0       0       0
    Can read multisession:  1       1       1
    Can read MCN:           1       1       1
    Reports media changed:  1       1       1
    Can play audio:         1       1       1
    Can write CD-R:         0       0       0
    Can write CD-RW:        0       0       0
    Can read DVD:           1       1       0
    Can write DVD-R:        0       0       0
    Can write DVD-RAM:      0       0       0
    Can read MRW:           1       1       0
    Can write MRW:          1       1       0
    Can write RAM:          1       1       0

     

    Any suggestions on what I can try to make it available with write capability? (One alternative I am considering is sketched below.)
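
    The alternative I have seen suggested in other threads (not verified by me, so treat it as a sketch) is to pass the burner as a block-device LUN with SCSI command filtering disabled, which is supposed to let the guest issue the write commands that the emulated CD-ROM path filters out:

        <disk type='block' device='lun' sgio='unfiltered'>
          <driver name='qemu' type='raw'/>
          <source dev='/dev/sr0'/>
          <target dev='sdb' bus='scsi'/>
        </disk>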

  12. I have all my shares set to either cache:prefer or cache:yes, but I still see some periodic activity that keeps one disk of the array from ever spinning down and also keeps the parity drive spun up (so it seems to be "updates of some kind").
     

    Naturally I want my disks to spin down, so I need to figure out what is causing this...

     

    Is there any type of disk update activity (changed ownership/permissions or anything else) that is not cached and instead goes directly to the array? Or any other thoughts on what this may be, or how to debug the problem further, given that the File Activity plugin seems to be "bypassed" and shows no activity whatsoever?
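
    These are the two checks I know of so far (assuming the disk that stays awake is disk1; inotifywait is not in stock UnRAID as far as I know, so that one would need to be installed first):

        # List processes holding files open on the disk that never spins down
        lsof /mnt/disk1
        # Watch the disk recursively for any file events (needs inotify-tools)
        inotifywait -m -r /mnt/disk1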

  13. I have a backup share with a number of folders under UnRAID 6.11.5. Most folders in the share seem to work as expected, but I found one folder (my Image folder) that contains 192 subfolders when I look at its content from the UnRAID terminal, or by navigating to the share in the UnRAID web console, but only shows 7 folders when viewed from Windows 10 - and the content of those folders is not even correct (one folder, for instance, seems empty from Windows but contains a number of files on the UnRAID system).

    I have not changed any SMB parameters (just enabled sharing) and I have tried setting all permissions and ownerships for the share to the defaults using the "New Permissions" tool. The share uses caching: Yes.
    The only unusual thing I can think of is that some folder names contain spaces and international characters, but it is not consistently those folders that are missing, and other folders with the same kind of names do work as expected...
    I can also mention that I have now tried mounting the share with NFS, and then all the directories and files are present, and that I tried the SMB share on another computer and it shows the exact same wrong content, so it does not seem to be a problem with the client computer.

     

    Edit: I received help from UnRAID support - after disabling "Settings->SMB Settings->Enhanced macOS interoperability" the problem disappeared and all files now seem to show up.
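
    If it helps anyone hitting the same thing: a quick way to check what SMB itself serves, independent of Windows Explorer, is to list the share with smbclient from any Linux box (the share and user names here are from my setup; substitute your own):

        # List the top level of the share as SMB presents it
        smbclient //NAS/backup -U youruser -c 'ls'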

  14. 4 hours ago, strike said:

    It should, but I have found out that's not the case - I remember having this issue myself long ago. Also, you should stop the container before making any changes, then restart it. I have several times changed the core.conf file while the container was running and then restarted, and found the changes were not sticking.

    Found the problem - my Deluge thin client was too old, so it refused to connect to the newer version installed on my UnRAID server.

  15. 18 minutes ago, NAS-newbie said:

    As I mentioned, the user name and password I have provided in the auth file do work when I connect using the connection manager in the web interface to 127.0.0.1:58846, so in my view that file can't be the problem - and yes, I am sure there are no other lines in the auth file except the default one and the one I added.

    I had NOT commented out the default one (for localclient), nor have I done so on my Raspberry Pi where I currently run Deluge and where I CAN connect to 58846 remotely without any problem.

    Just to verify, I tried commenting out the default line (and restarting the container), but this did NOT make any difference, except that I then can't log in using localclient any longer in the connection manager of the web GUI.

    Is it not the point that you CAN have multiple Deluge users by having several lines in this file?!
     

  16. 5 minutes ago, strike said:

    You're sure you don't have another line in the auth file that is not commented out? Can you post the content of the auth file?

    As I mentioned, the user name and password I have provided in the auth file do work when I connect using the connection manager in the web interface to 127.0.0.1:58846, so in my view that file can't be the problem - and yes, I am sure there are no other lines in the auth file except the default one and the one I added.

  17. I have managed to get deluged to work perfectly with my VPN provider (NordVPN using custom settings) and can connect through the web interface, download torrents with good speed etc., but can't for the life of me get remote connection over port 58846 to work (I am trying to connect over my local LAN, not through Internet/VPN).

    I have added a line to the "auth" file ("user:password:10") and, in the web GUI daemon preferences, enabled "remote connection" and verified that the port is indeed 58846, but still no luck...

    Using the connection manager in the web GUI I can connect to 127.0.0.1:58846 with my defined user name and password, so that part seems to work, but if I use the server's real IP it does not work from the GUI either, suggesting to me this could be a networking issue with the container, but I do not know how to debug it further... (My current config is summarized below.)
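
    For completeness, the relevant bits of my setup (the values are examples of what I believe the files should contain; the auth format is username:password:authlevel, where 10 is admin):

        # ~/.config/deluge/auth - one client per line
        user:password:10

        # ~/.config/deluge/core.conf - remote access must be enabled
        "allow_remote": true,

    To check whether deluged actually accepts outside connections, I assume running netstat -tlnp inside the container should show it listening on 0.0.0.0:58846 rather than 127.0.0.1:58846.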
