hawihoney

Everything posted by hawihoney

  1. Thanks for your answer. Both M.2 NVMe cache pool disks report around 24 TB written within 2.5 months of operation. In the same period only 1.5 TB were read. I bet Amazon will refuse an RMA once a disk has been completely written 24 times over (they are 1 TB disks).
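For reference, a sketch of how those totals can be read from the drives themselves (device names are assumed to match this box; smartctl is available on Unraid):
# NVMe SMART health log: "Data Units Read/Written" are counted in units
# of 512,000 bytes, so a value of 1,000,000 is roughly 0.5 TB.
smartctl -a /dev/nvme0n1 | grep -E 'Data Units (Read|Written)'
smartctl -a /dev/nvme1n1 | grep -E 'Data Units (Read|Written)'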
  2. This is the second time that my cache pool dropped a disk and Unraid didn't notice. I rebooted, and only after the reboot did Unraid recognize that a cache pool disk was gone. A full balance started automatically on the remaining disk and is currently running. My questions: Why doesn't Unraid recognize when a cache pool disk disappears, and is there something I can do about it? I do have a script that queries the cache status, and that does inform me - but Unraid itself behaves as if nothing had happened. Can I get rid of BTRFS for my cache pools and use a different filesystem instead? Whenever I experience problems with Unraid, most of the time they are BTRFS related. Thanks in advance.
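A minimal sketch of the kind of check my script does (not the actual script; the /mnt/cache mount point and the plain echo notifications are assumptions):
#!/bin/bash
# Sketch only: warn when the btrfs cache pool drops a device or logs errors.
POOL=/mnt/cache   # assumed mount point of the cache pool

# A dropped pool member shows up as "missing" here, even while the pool stays mounted.
if btrfs filesystem show "$POOL" | grep -qi missing; then
    echo "WARNING: a device is missing from the $POOL pool"
fi

# Any non-zero counter (write/read/flush/corruption/generation errors) is worth a look.
errs="$(btrfs device stats "$POOL" | awk '$2 != 0')"
if [ -n "$errs" ]; then
    echo "WARNING: non-zero btrfs error counters on $POOL"
    echo "$errs"
fi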
  3. 187? Ugh, that's bad. In that situation I would usually replace the bad disk first. You need a 2 TB replacement disk for that step. IMHO this is a bad situation. There's a safe way and a risky way - I always take the safe way (see above).
  4. It will not rewrite the system; it will rewrite a replaced disk. And yes, you'll need to rebuild those three new disks one after another. Start with the parity disk.
  5. Perhaps it's something completely different. I run that script through User Scripts. Is there a difference between calling a script through User Scripts and from the command line? ***EDIT*** The script didn't work from the command line either. Forget about it - I don't want to butt into this thread too much. My Linux knowledge is somewhere around 0.01%, so it's ok if I can't get it to work. I've lived with that for nearly three decades.
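In case the difference matters to anyone, a User Scripts script can also be started from a console; a sketch, assuming the plugin's usual layout on the flash drive (path not verified on my box):
# Assumed User Scripts plugin layout; replace <script name> with the script's folder name.
bash "/boot/config/plugins/user.scripts/scripts/<script name>/script"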
  6. Thanks for your change. Unfortunately it didn't work either - even after 100 tries to move the temporary folder. Without the hidden temporary folder and its final 'mv' the script works perfectly. Please forget it, it must be something on my side - no idea what. I would suggest removing that particular change because it's related to my system only. [...]
Try #99 to make backup visible
mv: cannot move '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/.20201106_072853' to '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/20201106_072853': Permission denied
Preserve failed backup: .20201106_072853
  7. Interesting. When issued manually from a console on the source server it works:
root@Tower:~# mv '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/.20201105_183346' '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/20201105_183346'
root@Tower:~#
This is what I see on the target server after the mv:
root@TowerVM01:/mnt/disk17/Backup# ls -lisa /mnt/disk17/Backup/disk17/Notizen/
total 0
10952386795 0 drwxrwxrwx 3 hawi users 37 Nov 5 18:34 ./
8877660570 0 drwxrwxrwx 3 hawi users 29 Nov 5 18:33 ../
30738631 0 drwxrwxrwx 3 hawi users 51 Nov 5 18:33 20201105_183346/
This is the mount command:
mount -t cifs -o rw,nounix,iocharset=utf8,_netdev,file_mode=0777,dir_mode=0777,uid=99,gid=100,vers=3.0,username=hawi,password=******** '//192.168.178.101/disk17' '/mnt/hawi/192.168.178.101_disk17'
Looks ok to me. Is it possible that something from within the script is being held open when it is issued via SMB against a remote server?
  8. I don't use User Shares, so I used Disk Shares, and these don't work well with this script. I know, I know, it's not designed that way, but I want to share my experience. The reason I don't use User Shares is that I have two different remote backup locations, and using User Shares with huge directories and files over SMB to remote locations often crashes. Using Disk Shares helps most of the time. This is what I did: [...]
source_paths=( "/mnt/disk17/Notizen" )
backup_path1="/mnt/hawi/192.168.178.101_disk17/Backup"
#backup_path2="/mnt/hawi/192.168.178.102_disk1/Backup"
That's the result:
Create backup of /mnt/disk17/Notizen
Backup path has been set to /mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen
Create full backup 20201105_130528
sending incremental file list
Notizen/
[...]
sent 61,223 bytes  received 759 bytes  41,321.33 bytes/sec
total size is 57,872  speedup is 0.93
mv: cannot move '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/.20201105_130528' to '/mnt/hawi/192.168.178.101_disk17/Backup/disk17/Notizen/20201105_130528': Permission denied
Preserve failed backup: .20201105_130528
DONE
In your original post you mention 'shares', but disk shares are shares as well. I don't want to rant, I just want to add two feature requests: 1.) Always use the last subdirectory and don't check for /mnt/user vs. /mnt. In my case 'Notizen' would then be used as the new subdirectory for the backup path; it wouldn't matter whether the source is /mnt/user/Notizen or /mnt/disk17/Notizen, and the resulting backup path would be independent of user shares vs. disk shares (see the sketch below). 2.) I don't know why there's a permission problem - my own rsync jobs have worked this way for years. Any idea?
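A sketch of what I mean by feature request 1 (variable names are made up, not taken from the script):
# Use only the last path component as the backup subdirectory, so
# /mnt/user/Notizen and /mnt/disk17/Notizen both end up under .../Notizen.
source_path="/mnt/disk17/Notizen"
subdir="$(basename "$source_path")"
backup_target="${backup_path1}/${subdir}"   # e.g. /mnt/hawi/192.168.178.101_disk17/Backup/Notizen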
  9. Sorry, I need to ask an additional question: Consider an existing full backup. Two daily incremental backups exist as well. Now I delete a file. What's the state after the next run of the script? Does the file still exist in the full backup folder? It should. Does it still exist in the two existing incremental backup folders? It should. Does it exist in the newly created incremental backup? It should not. Thanks, this stuff is new to me.
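Just to check my understanding, a generic sketch of the --link-dest pattern such incremental scripts are usually built on (not the actual script; paths are made up):
# Every run creates a complete-looking snapshot; files that are unchanged
# since the previous snapshot are hardlinked instead of copied again.
SRC="/mnt/user/Notizen/"
DST="/mnt/backup/Notizen"
NEW="$DST/$(date +%Y%m%d_%H%M%S)"

rsync -a --delete --link-dest="$DST/latest" "$SRC" "$NEW"
ln -nsf "$NEW" "$DST/latest"

# A file deleted from the source is simply not linked into the new snapshot,
# but it remains in every older snapshot that still references it.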
  10. COW? Hmm, the system share on the cache is set to Auto. What does that mean? Never mind, I found it in the help text: Auto means COW on BTRFS. Thanks a lot.
  11. Oops, I've never seen that before. I have no clue what either of them does. Ok, I ran a corrective scrub. Does that mean everything's ok now?
  12. I'm running the stats regularly - that's how I saw the errors. But Unraid itself hasn't noticed the errors so far. If I call the stats, this is the result:
[/dev/nvme1n1p1].write_io_errs 0
[/dev/nvme1n1p1].read_io_errs 0
[/dev/nvme1n1p1].flush_io_errs 0
[/dev/nvme1n1p1].corruption_errs 0
[/dev/nvme1n1p1].generation_errs 0
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].read_io_errs 0
[/dev/nvme0n1p1].flush_io_errs 0
[/dev/nvme0n1p1].corruption_errs 0
[/dev/nvme0n1p1].generation_errs 0
Looking at the syslog at the same time shows:
Nov 2 11:00:04 Tower kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 15050349 off 1114939392 csum 0x382b6324 expected csum 0x54474642 mirror 2
Nov 2 11:00:04 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 15050349 off 1114939392 (dev /dev/nvme1n1p1 sector 534070312)
Nov 2 11:12:38 Tower kernel: BTRFS error (device nvme0n1p1): parent transid verify failed on 1481548267520 wanted 16496481 found 16461691
Nov 2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548267520 (dev /dev/nvme1n1p1 sector 337334848)
Nov 2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548271616 (dev /dev/nvme1n1p1 sector 337334856)
Nov 2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548275712 (dev /dev/nvme1n1p1 sector 337334864)
Nov 2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548279808 (dev /dev/nvme1n1p1 sector 337334872)
The link in your answer mentions scrub. Is scrub just another name for balance? Many thanks in advance.
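For reference, the commands behind the output above, plus the scrub that was suggested (pool mount point assumed to be /mnt/cache):
# Per-device error counters (the [.../nvmeXn1p1] lines above):
btrfs device stats /mnt/cache

# Scrub: re-read all data/metadata, verify checksums, and repair from the
# second copy where possible. It is a different operation from a balance.
# -B runs it in the foreground.
btrfs scrub start -B /mnt/cache
btrfs scrub status /mnt/cache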
  13. Thanks. After 70 hours I had to hard-reset the server because I could force-stop neither the script nor the dd process. I gave up on that project (I had tried to remove and zero two empty disks from the array in order to build a second cache pool later).
  14. Update: I managed to get the Docker/VM services to start. I had to delete the docker.img file - it was corrupt. All Dockers were recreated and are currently running. BUT: BTRFS still shows errors on my cache pool. What do I need to do to fix these? Many thanks in advance.
  15. Yesterday, all of a sudden, I received errors on my cache pool (2x NVMe M.2 disks). The Unraid main page didn't report these errors, even when disk 2 of that pool went offline. Today I restarted the server; Unraid comes up but can't start the Docker service. In the syslog I see lots of BTRFS errors, but Unraid still does not show any problems. It seems the cache pool no longer works, yet Unraid carries on as if nothing had happened. What are the steps to get the cache pool - and the Dockers and VMs - back into operation? A rebalance? Diagnostics attached. Many thanks in advance. tower-diagnostics-20201102-0757.zip
  16. The script issues exactly the same command. You run in maintenance mode, I'm in production. Hmm, now I'm confused. Is something wrong with my array? Is there a way to find out at which position dd is currently working?
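A minimal note on querying a running dd (generic GNU dd behaviour, nothing specific to the clearing script):
# Send SIGUSR1 to the running dd; it prints records in/out and bytes copied
# so far to its stderr without interrupting the transfer.
kill -USR1 "$(pidof dd)"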
  17. I want to remove a disk from the array and used this documentation: https://wiki.unraid.net/Shrink_array I'm using the safe method (parity always valid) and the user script mentioned there to zero a 3 TB disk. Turbo write was activated before starting. The process has been running for 51 hours now and there's no end in sight. The array usually writes at 50 MB/s with dual parity; the zeroing 'runs' at 10 MB/s. What's wrong with that process? Why isn't it zeroing at a similar speed? Why is writing zeros to a disk over 5 times slower than writing random bytes of file content? Any insights are highly appreciated. Many thanks in advance.
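For context, the clearing step boils down to writing zeros to the disk's md device so parity stays valid - roughly along these lines (the md device number is an example, not taken from my array):
# Sketch only: zero the array device of the disk to be removed.
# Writing to /dev/mdX (not /dev/sdX) keeps parity in sync;
# status=progress shows the current position and throughput.
dd bs=1M if=/dev/zero of=/dev/md3 status=progress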
  18. What's the product name of your HBA? LSI 3008 is the name of the chip - not the product.
  19. Go to this site: https://www.broadcom.com/support/download-search
Product Group: Storage Adapters, Controllers, and ICs
Product Family: SAS/SATA/NVMe Host Bus Adapters
--> Now select your HBA under Product Name (e.g. SAS 9300-8i, SAS 9300-8e, etc.)
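If the exact board name is unknown, a console on the server usually reveals it; a quick sketch (lspci is always available, sas3flash only if Broadcom's flash utility has been installed):
# List Broadcom/LSI controllers (PCI vendor ID 1000) with device and
# subsystem IDs; the subsystem line usually hints at the board model.
lspci -vnn -d 1000:

# With Broadcom's sas3flash utility installed, this prints the board name directly:
sas3flash -list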
  20. What steps do I need to take before moving from 6.8.3 to 6.9.0? I'm talking about passthrough of adapters (2x HBA for two VMs, 2x USB for two VMs, 1x GPU for Docker). Currently, on 6.8.3, I have possibly redundant settings as follows:
1.) /boot/syslinux/syslinux.cfg: xen-pciback.hide (hides both HBAs - is this one redundant?)
[...]
label unRAID OS
  menu default
  kernel /bzimage
  append xen-pciback.hide=(06:00.0)(81:00.0) initrd=/bzroot
[...]
2.) /boot/config/vfio-pci.cfg: BIND (hides both HBAs - or is this one redundant?)
BIND=06:00.0 81:00.0
3.) First VM: 1x HBA, 1x USB
[...]
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='usb' managed='no'>
  <source>
    <vendor id='0x0930'/>
    <product id='0x6544'/>
    <address bus='2' device='4'/>
  </source>
  <alias name='hostdev1'/>
  <address type='usb' bus='0' port='1'/>
</hostdev>
[...]
4.) Second VM
[...]
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='usb' managed='no'>
  <source>
    <vendor id='0x8564'/>
    <product id='0x1000'/>
    <address bus='2' device='10'/>
  </source>
  <alias name='hostdev1'/>
  <address type='usb' bus='0' port='1'/>
</hostdev>
[...]
What do I need to remove/add before booting the new 6.9.0 system? Thanks in advance.
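A minimal check that can be run after booting the new version, to see which driver actually holds the two HBAs (addresses taken from the config above):
# "Kernel driver in use:" should report vfio-pci for a device reserved for
# passthrough; if mpt3sas grabbed it, the host claimed the HBA instead.
lspci -nnk -s 06:00.0
lspci -nnk -s 81:00.0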
  21. May I ask some questions? 1.) The Plex folder on my cache contains "trillions" of directories and files. Those files change rapidly, and new files are added at high frequency. Does that mean the result would be trillions of hardlinks? Stupid question, I know, but I've never worked with rsync that way. 2.) What about backups to remote locations? I feed backups to Unraid servers at two different remote locations (see below). Will this work and create hardlinks at the remote location?
rsync -avPX --delete-during --protect-args -e ssh "/mnt/diskx/something/" "user@###.###.###.###:/mnt/diskx/Backup/something/"
Thanks in advance.
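For clarity, a sketch of what I mean by a hardlink-style run against a remote box ('previous' and 'new' are placeholder snapshot names; --link-dest is resolved on the receiving side, so it must point at the previous snapshot as seen from the remote server):
# Example only: unchanged files in the new snapshot are hardlinked against
# the previous snapshot that already exists on the remote server.
rsync -avPX --delete-during --protect-args -e ssh \
    --link-dest="/mnt/diskx/Backup/something/previous/" \
    "/mnt/diskx/something/" \
    "user@###.###.###.###:/mnt/diskx/Backup/something/new/"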
  22. Added it to User Scripts (Array Start). I'll rethink it once 6.9 goes stable. Thanks, man.
  23. Can you please elaborate a little? A quick Google search shows this as a BTRFS thing. Reading the 6.9 announcements suggests this is an incompatibility between Samsung (and similar) SSDs and Unraid when different sector alignments are used - at least that's how I understand it.
  24. I don't know. I just read the 6.9 beta notes. In one of the past beta releases a complete procedure was introduced to move data off these disks, reformat them, and move the data back. Beta releases are no option here, so I'm looking for a way to do something similar on stable 6.8.3. ***EDIT*** Your values look as if you are using the cache disk as an actual cache: 500 TB read and 233 TB written look reasonable. I use the cache pool as a Docker/VM store only. My values are 473 GB read and 22 TB written. That is definitely not reasonable.