Jaster

Members
  • Posts

    409
  • Joined

  • Last visited

Posts posted by Jaster

  1. Hi Guys,


    I own 4 licenses and would like to get some more, but the current licenses are bound to different e-mails. Could I move the existing licenses to the same address and get the new ones there too?

     

    aaaaand, is there a plan for "floating" licenses?

  2. I'm running four production servers and two lab servers, and I'm ramping up to six production servers within the next two weeks.

    Data loss has never occurred - I had a lot of "bad luck" with BTRFS, but I moved away from it to XFS and am fine now.

    Unraid is VERY stable; however, some "rogue" VMs and Docker containers were able to cause issues (by flooding the log, etc.) - but that would also hurt any other OS/distro running a level 1 hypervisor.

     

    Two things are missing from my point of view:

    1. Some kind of reliable notifications - this is especially annoying if you fail over on WAN and don't receive notifications.

    2. VM snapshots (there are some solutions, but they aren't great).

     

    Again, that stuff does not come out of the box on other OSes either.

  3. 40 minutes ago, dada051 said:

    I never had problems with BTRFS, and I think a lot of other users haven't either. But why not. If you want to do VM backups, you can use QCOW2 disk images and https://github.com/wbynum/QEMUBackup as a Docker container to create backups.

     

    That is just an rsync job that copies the VM. It is non-incremental and works "best" when the VM is shut down.
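
    In essence it boils down to something like the following (a simplified sketch, not the actual script; VM name and paths are made up):

    #!/bin/bash
    # Simplified shutdown-then-copy backup: full, non-incremental.
    VM="Debian11"                        # hypothetical VM name
    SRC="/mnt/user/domains/Debian11/"    # hypothetical image location
    DEST="/mnt/user/backups/Debian11/"

    virsh shutdown "$VM"                 # ask the guest to power off
    while virsh domstate "$VM" | grep -q running; do sleep 5; done

    rsync -a --sparse "$SRC" "$DEST"     # copy the whole image every time
    virsh start "$VM"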

     

    I get a lot of comments that BTRFS is reliable - it has never been for me. Sure, if you barely do anything with it, it seems stable. Once you put some pressure on the FS, it starts degrading pretty fast.

  4. I voted for QEMU, but frankly speaking I don't want any of those features. If I had a wish, it would be the same one many other users have: snapshots/VM backups. I am running 4 production servers and two lab ones. Here is why I think there is a HUGE gap between what is supposed to work and how it actually works:

     

    BTRFS: The only "bad" experience I have ever had was related to BTRFS - no matter how I set it up, it always failed at some point. I was running hourly checks, regular balances, etc. Still, it kept failing, so I moved to XFS. Therefore snapshots on BTRFS are not an option.
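
    (By "checks" and "balances" I mean roughly this kind of scheduled maintenance - mount point and schedule here are just examples:)

    # example cron entries for a BTRFS pool mounted at /mnt/cache
    0 * * * *    btrfs dev stats -c /mnt/cache              # hourly error-counter check
    0 3 * * 0    btrfs scrub start -B /mnt/cache            # weekly scrub
    0 4 1 * *    btrfs balance start -dusage=75 /mnt/cache  # monthly partial balance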

     

    ZFS: I'm not a huge fan of a filesystem that eats tons of memory. For me, memory is the bottleneck for virtualization, as it isn't shareable across VMs and/or other services. Due to my needs I use HEDT hardware rather than server gear for the VM boxes, and memory there is limited to 128 or 256 GB. So that one is out as well.

     

    VM Backup Script/Plugin: That thing is nice, but it requires the VM to be shut down and always takes a full backup. My VMs (like those of a lot of people out there) are running 24/7. Some are servers, others are just a pain in the butt to start (hello to all the developers out there!). Also, being a full backup, it takes extremely long to create, transfer, etc., not to mention the extra disks it eats: storing about 30 days per VM comes to somewhere around 1-3 TB. That's a ton of space!

     

    QEMU Tools: virtbackup isn't stable yet and isn't integrated into Unraid. There are also some modifications required on the VM which (sometimes) get overwritten by editing the VM via the UI or by a server reboot. Also, the interaction is not very comfortable (you can't remove or restore VMs via the UI). Remember, you would also need some Docker container running a distro that delivers these tools to your server.


    For now, I am working on a backup/snapshot solution based on a QEMU Docker container, but there are a million things which require attention (the rough flow I have in mind is sketched below). Once it's done, I'll provide it to the community, but I'd say this is a feature which should be included in Unraid.
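
    Roughly, the idea is a live external-snapshot backup along these lines (a sketch only - domain name, disk target and paths are made up, and the quiesce step assumes the QEMU guest agent is installed inside the VM):

    #!/bin/bash
    # Back up a running VM via a temporary disk-only snapshot.
    VM="Win10"                                   # hypothetical domain name
    DISK="vda"                                   # disk target inside the VM
    SRC="/mnt/user/domains/Win10/vdisk1.img"     # current disk image
    OVERLAY="/mnt/user/domains/Win10/overlay.qcow2"
    DEST="/mnt/user/backups/Win10/$(date +%Y%m%d).img"

    # 1. Freeze the guest FS (guest agent) and redirect new writes into a
    #    temporary overlay, leaving the original image static.
    virsh snapshot-create-as "$VM" backup --disk-only --atomic --quiesce \
      --no-metadata --diskspec "$DISK,file=$OVERLAY"

    # 2. Copy the now-static base image while the VM keeps running.
    cp --sparse=always "$SRC" "$DEST"

    # 3. Merge the overlay back into the base image and pivot the VM to it.
    virsh blockcommit "$VM" "$DISK" --active --pivot --wait

    # 4. Drop the temporary overlay.
    rm -f "$OVERLAY"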

    • Like 2
  5. Hi Guys,

     

    I'm wondering if it is possible to add a custom notification agent. My issue with all the existing ones is the lack of support for MQTT (or any reliable messaging architecture), and therefore messages get lost in several use cases.


    Ideas?
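
    To illustrate what I'm after - a rough sketch of such an agent (broker address and topic are placeholders; it assumes Unraid hands the event details to agent scripts as environment variables such as EVENT, SUBJECT, DESCRIPTION and IMPORTANCE, and that the mosquitto clients are available):

    #!/bin/bash
    # Hypothetical MQTT notification agent: publish Unraid events to a broker
    # with QoS 1 so the broker queues/retries instead of silently dropping them.
    BROKER="192.168.1.10"            # placeholder broker address
    TOPIC="unraid/natasha/notify"    # placeholder topic

    PAYLOAD=$(printf '{"event":"%s","subject":"%s","description":"%s","importance":"%s"}' \
      "$EVENT" "$SUBJECT" "$DESCRIPTION" "$IMPORTANCE")

    mosquitto_pub -h "$BROKER" -t "$TOPIC" -q 1 -m "$PAYLOAD"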

  6. WTF?! I tried to fix the btrfs issues and ran into this...

    Please do NOT say "memtest" - the memory is fine, I tested it.

     

    If I boot in safe mode, I don't get any panic... how can I track the issue down?!

  7. Hi Guys,

     

    My BTRFS pool just went read-only.

     

    Aug 10 22:28:49 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11354, flush 1, corrupt 1, gen 0
    Aug 10 22:28:49 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11355, flush 1, corrupt 1, gen 0
    Aug 10 22:28:50 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11356, flush 1, corrupt 1, gen 0
    Aug 10 22:28:50 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11357, flush 1, corrupt 1, gen 0
    Aug 10 22:28:50 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11358, flush 1, corrupt 1, gen 0
    Aug 10 22:28:50 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11359, flush 1, corrupt 1, gen 0
    Aug 10 22:28:51 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11360, flush 1, corrupt 1, gen 0
    Aug 10 22:28:51 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11361, flush 1, corrupt 1, gen 0
    Aug 10 22:28:51 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11362, flush 1, corrupt 1, gen 0
    Aug 10 22:28:51 Vortex kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 12, rd 11363, flush 1, corrupt 1, gen 0

     

    Scrub does not seem to do anything.

     

    A check with btrfs dev stats -c /mnt/cache returns...

     

    [/dev/nvme0n1p1].write_io_errs 0
    [/dev/nvme0n1p1].read_io_errs 0
    [/dev/nvme0n1p1].flush_io_errs 0
    [/dev/nvme0n1p1].corruption_errs 0
    [/dev/nvme0n1p1].generation_errs 0
    [/dev/nvme1n1p1].write_io_errs 12
    [/dev/nvme1n1p1].read_io_errs 12152
    [/dev/nvme1n1p1].flush_io_errs 1
    [/dev/nvme1n1p1].corruption_errs 1
    [/dev/nvme1n1p1].generation_errs 0

     

    SOS please :)

     

     

    vortex-diagnostics-20210810-2230.zip

    vortex-syslog-20210810-2033.zip

  8. Hi, welcome.

    Let's clarify a few of your assumptions first:

    In general you run an array (parity based) and some BTRFS pools (RAID based). Depending on the setup you have 1-2 parity disks for the array, and the RAID level you define for a pool determines how many disks can fail without affecting your data.
    If a disk fails, you can replace it using hot-swap... Or you can "preinstall" a spare disk into Unraid without assigning it to any array. Once a disk fails, you stop the array and replace the failed disk with the spare (in the configuration, not physically!). This operation takes less than a minute, and Unraid will begin reconstructing the data onto the spare disk. The reconstruction can take a while, but your data is AVAILABLE all the time!

     

    As for Plex: I use a similar setup and travel more often than I am close to my media. I do not struggle to stream anywhere in the world - if you do, it is probably due to the upload bandwidth at your server (usually ISP limited). Keep in mind that it takes significantly more time to fully synchronize the data than to stream just the part you need at a reduced quality (adapted to the available bandwidth, which Plex does for you)...

     

    An offsite backup IS a good idea, but it is not the way to address the topics you actually raised.

  9. How is Unraid involved in this? What you are describing is a bare-metal/hardware question. The rough draft would be having the controller and some disks in one bay, a bunch of disks in the other bay, and connecting those via cables to another (or the same) controller inside the first bay.

  10. Hi,

    My server's log seems to be 100% full - I can't post the diagnostics (the download won't complete).

    But what I see in the log is:

     

    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [crit] 7857#7857: ngx_slab_alloc() failed: no memory
    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [error] 7857#7857: shpool alloc failed
    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [error] 7857#7857: nchan: Out of shared memory while allocating message of size 9431. Increase nchan_max_reserved_memory.
    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [error] 7857#7857: *80243 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [error] 7857#7857: MEMSTORE:00: can't create shared message for channel /disks
    May 31 04:20:22 Natasha nginx: 2021/05/31 04:20:22 [alert] 10902#10902: worker process 7857 exited on signal 6
    May 31 04:20:23 Natasha nginx: 2021/05/31 04:20:23 [crit] 7876#7876: ngx_slab_alloc() failed: no memory
    May 31 04:20:23 Natasha nginx: 2021/05/31 04:20:23 [error] 7876#7876: shpool alloc failed
    May 31 04:20:23 Natasha nginx: 2021/05/31 04:20:23 [error] 7876#7876: nchan: Out of shared memory while allocating message of size 9431. Increase nchan_max_reserved_memory.
    May 31 04:20:23 Natasha nginx: 2021/05/31 04:20:23 [error] 7876#7876: *80247 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    May 31 04:20:23 Natasha nginx: 2021/05/31 04:20:23 [error] 7876#7876: MEMSTORE:00: can't create shared message for channel /disks
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [crit] 7876#7876: ngx_slab_alloc() failed: no memory
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [error] 7876#7876: shpool alloc failed
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [error] 7876#7876: nchan: Out of shared memory while allocating message of size 9431. Increase nchan_max_reserved_memory.
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [error] 7876#7876: *80250 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [error] 7876#7876: MEMSTORE:00: can't create shared message for channel /disks
    May 31 04:20:24 Natasha nginx: 2021/05/31 04:20:24 [alert] 10902#10902: worker process 7876 exited on signal 6

     

    Can't even open a console :(

    Any "simple" solution?

  11. My issue doesn't seem to be the same, as the drives do spin down, but then "magically" spin up again after a short while... and spin down again...

    Tried the SAS plugin, but it hasn't changed much.

    Shall I post in that thread you linked?