  1. If you follow the details of the thread, my research shows that the problem is NOT SMB; the problem is the Unraid FUSE filesystem. Mount your SMB share under a user share (Unraid FUSE code in play): poor performance. Mount your SMB share under a disk share (no Unraid code): normal performance.
  2. For the time being I gave up on Unraid fixing this, I moved to Proxmox with ZFS: Removed link at the request of @limetech
  3. Seems like a lot of dangling volumes for the once or twice I changed a tag, and I can't control what the container author does. Should these not be cleaned up the same way orphan containers are detected and cleaned? I'll delete them for now.
  4. Can I safely delete them using docker volume prune?
  5. Unfortunately not, but we (90266) only have about two months of 90F+ in my garage, rest of the year mid 70's. My garage is insulated and drywalled, I have a 24/7 extractor fan in the ceiling, I have 4 top mounted extractor fans on the rack, and SNMP and email alert environment monitoring in the rack. On really hot days I run a high speed floor fan and leave the garage door about 6" open. The biggest problem I have is when we park hot engine cars in the garage, the servers don't like that. I've trained the family to let the cars cool down before parking inside in the summer. I wish I had planned better when we built the house and ran ducting for the rack to the outside.
  6. Hi, I'm wondering if BTRFS is the right solution for a resilient cache / storage solution? I run two Unraid servers: primary 40TB disk plus 2TB cache (4 x 1TB SSD), secondary 26TB disk plus 2TB cache (4 x 1TB SSD). On two occasions I've lost my entire cache volume due to one of the drives "failing". I say failing, but really both times it was my own fault: I didn't want to shut down, I pulled the wrong drive, and immediately plugged it back in. But this is no different from a drive failing, or a connection failing. Pulling disks during certification of large resilient storage systems is a perfectly good test. One would expect the loss of 1 disk in a 4 disk BTRFS RAID10 config to be a non-issue. Not so. First the log started showing BTRFS corruption issues; OK, it seems they are not being auto fixed. Then I ran a cache scrub: no errors, but still errors in the log. Scrub with repair: reported repaired. Then I started getting docker write failures; it seems my cache had become read-only and BTRFS corrupt. In both cases I resorted to rebuilding the cache from scratch, and restored appdata backups, but lost the VM's (unlike docker stop/restart, there is no easy way to back up VM's). I've run hardware RAID for a long time, including hardware that uses SSD caching; I've lost disks, pulled disks, but in all cases the array eventually comes back on its own. I simply do not have the same trust in Unraid's cache. I think it is fragile, I think it is unreliable to the point where it needs to be backed up constantly. I'd like to see Unraid/Limetech publish their resiliency test and performance plans. What is tested for, what are known failure scenarios, what are known recoverable scenarios, are my expectations of resiliency and performance unfounded?
And this is not about BTRFS, this is about Unraid, I don't care what Unraid uses for the cache volume, it could have supported SSD's in data volumes and no cache would be required, it could have used ZFS and we would have different problems, BTRFS was an Unraid choice, and I find it fragile. What are your experiences with cache resiliency?
  7. How are your SSD's in the array? My understanding is that SSD's are not supported in the array, something to do with how parity calculations are done.
  8. My two servers are in a rack in my garage, I do all work remotely, BIOS updates, BIOS config, OS installs, boot media selection, etc. No need for keyboard or monitor, no need to physically go to the rack for software maintenance tasks. I would not use a mobo without IPMI support.
  9. When I installed the SAS3 cards, I figured I may just as well go with a SAS3 backplane. So I bought refurbished SAS3 backplanes and replaced my SAS2 backplanes, much cheaper compared to a SAS3 case. See: https://blog.insanegenius.com/2020/02/02/recovering-the-firmware-on-a-supermicro-bpn-sas3-846el1-backplane/
  10. I use LSI-9340-8i, SM SAS3 backplane, and 4 x 860 Pro's in BTRFS cache in two Unraid servers. I won't call it a recommendation, but I've not seen errors with this hardware combo.
  11. I tested EVO 840, Pro 850 and Pro 860. Pro 860 works with LSI and TRIM. See: https://blog.insanegenius.com/2020/01/10/unraid-repeat-parity-errors-on-reboot/
  12. I am having a weird problem that I can't explain. I use HandBrakeCLI to convert a file \\server\share\foo.mkv to \\server\share\foo.tmp. I rename \\server\share\foo.tmp to \\server\share\foo.mkv. I store the \\server\share\foo.mkv modification time for later use. The \\server\share\foo.mkv attributes are read by MediaInfo, FFprobe, and MKVMerge. I come back later, and the stored time no longer matches the file modified time. No other apps are modifying the file, and I can't figure out why the modification time would change. I am running from Win10 x64 to Unraid over SMB to a user share. I ran ProcessMonitor on the Win10 machine, filtering by the file name, and after HandBrakeCLI exits, no modifications are made from the Win10 system to the file. The pattern is: HandBrakeCLI open write close; my code open read attributes close; FFprobe/MediaInfo/MKVMerge open read close. Wait. My code open read attributes close; timestamp changed from last read timestamp. I wrote a little monitoring app that compares the file modified time with the previous time every second. The pattern is always the same: the file modified time changes on HandBrakeCLI exit, then it changes two more times.
E.g.
4/30/2020 10:31:25 PM : 4/30/2020 10:31:21 PM != 4/27/2020 7:40:43 PM
4/30/2020 10:31:45 PM : 4/30/2020 10:31:36 PM != 4/30/2020 10:31:21 PM
4/30/2020 10:31:48 PM : 4/30/2020 10:31:48 PM != 4/30/2020 10:31:36 PM
E.g.
4/30/2020 10:54:02 PM : 4/30/2020 10:54:01 PM != 4/27/2020 7:40:43 PM
4/30/2020 10:54:12 PM : 4/30/2020 10:54:03 PM != 4/30/2020 10:54:01 PM
4/30/2020 10:54:17 PM : 4/30/2020 10:54:17 PM != 4/30/2020 10:54:03 PM
E.g.
4/30/2020 11:16:13 PM : 4/30/2020 11:16:12 PM != 4/27/2020 7:40:43 PM
4/30/2020 11:16:24 PM : 4/30/2020 11:16:15 PM != 4/30/2020 11:16:12 PM
4/30/2020 11:16:26 PM : 4/30/2020 11:16:26 PM != 4/30/2020 11:16:15 PM
I am speculating that the file modification is happening on the Unraid side.
A wild guess; maybe the FUSE code buffers the write, and the last buffered write updates the modified time, and when Samba comes back later, the now modified time is read, instead of the time at SMB file close? Any other ideas?
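For anyone who wants to reproduce this, here is a minimal sketch of the kind of once-per-second monitor described above. It is not the original monitoring app; the function name `watch_mtime` and its parameters are made up for illustration, and it simply polls `os.stat` and reports each change as "observed time, new time, old time".

```python
import os
import time
from datetime import datetime


def watch_mtime(path, interval=1.0, max_checks=None, sleep=time.sleep):
    """Poll the modification time of path and yield a tuple
    (observed_at, new_mtime, old_mtime) every time it changes.

    max_checks and sleep are hypothetical knobs added so the loop can be
    bounded and driven from a test; leave them at their defaults to poll
    forever at the given interval.
    """
    previous = os.stat(path).st_mtime
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(interval)
        checks += 1
        current = os.stat(path).st_mtime
        if current != previous:
            yield (
                datetime.now(),
                datetime.fromtimestamp(current),
                datetime.fromtimestamp(previous),
            )
            previous = current


# Example use (the path is a placeholder):
# for seen, new, old in watch_mtime(r"\\server\share\foo.mkv"):
#     print(f"{seen} : {new} != {old}")
```

Running the same loop against the disk share path instead of the user share path would be one way to check whether the extra timestamp changes only show up when the FUSE layer is involved.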
  13. I have some code that uses the .NET Core FileSystemWatcher to trigger when changes are made to directories. When the directory is an SMB share on Unraid, and changes are made from a docker container to the underlying directory, the SMB share does not trigger the change. E.g. SMB share \\server\share\foo points to /mnt/user/foo. A Windows client monitors for changes in \\server\share\foo. A Docker container writes changes to /mnt/user/foo. The Windows client is not notified of the changes. Is this expected behavior with Samba on Linux (I did not test), or is this something with Unraid and user shares that is not triggering Samba change notifications?
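Until the notification question is answered, one possible workaround is to poll the share and compare directory snapshots instead of relying on change notifications. The sketch below is only an illustration of that idea, not what the original code does; `snapshot` and `diff` are hypothetical helper names, and a change is detected by comparing file size and modification time.

```python
import os


def snapshot(directory):
    """Map every file path under directory to its (size, mtime) pair."""
    state = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_size, st.st_mtime)
    return state


def diff(old, new):
    """Return sorted lists of created, deleted, and modified paths."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return created, deleted, modified


# Example use (the path and the 5 second interval are placeholders):
# import time
# before = snapshot(r"\\server\share\foo")
# while True:
#     time.sleep(5)
#     after = snapshot(r"\\server\share\foo")
#     print(diff(before, after))
#     before = after
```

Polling costs a full directory walk per interval, so it trades latency and IO for not depending on the server pushing notifications.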
  14. Seems highly unlikely that this is an LSI controller issue. My guess is the user share FUSE code locks all IO while waiting for a disk mount to spin up.
  15. I noticed that existing SMB network IO will halt while a new disk spins up, even if that disk has nothing to do with servicing existing IO requests. E.g. start an ffmpeg encode session with source and destination media on the SMB network share, and wait for the other disks to spin down while ffmpeg chugs along. Then open file explorer and browse around the SMB filesystem. Every time you hit a share with disks spun down there will be a delay while the disks spin up, and while the disks are spinning up, ffmpeg transcoding halts until the disk is spun up. Expected behavior is that existing IO is not halted while unrelated disks are spun up that have nothing to do with servicing that IO.
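To put numbers on the stall rather than eyeballing ffmpeg, something like the following sketch could be run against a large file on the share while browsing to spun-down disks. It is a rough illustration only: `find_stalls`, the 1 MiB chunk size, and the one second threshold are all assumptions, and client-side caching may hide stalls on files that were read recently.

```python
import time


def find_stalls(path, chunk_size=1024 * 1024, threshold=1.0, clock=time.monotonic):
    """Read path sequentially and return a list of (offset, seconds) for
    every chunk whose read took at least threshold seconds.

    clock is injectable only so the timing logic can be tested.
    """
    stalls = []
    offset = 0
    # buffering=0 so each read() maps to one unbuffered read of the file
    with open(path, "rb", buffering=0) as f:
        while True:
            start = clock()
            chunk = f.read(chunk_size)
            elapsed = clock() - start
            if not chunk:
                break  # end of file
            if elapsed >= threshold:
                stalls.append((offset, elapsed))
            offset += len(chunk)
    return stalls


# Example use (the path is a placeholder):
# for offset, seconds in find_stalls(r"\\server\share\media\big.mkv"):
#     print(f"read at offset {offset} stalled for {seconds:.1f}s")
```

If the reported stalls line up with the moments an unrelated disk spins up, that supports the observation above; running the same read against a disk share would show whether the user share layer is required for the stall to occur.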