Everything posted by robobub

  1. Thanks, will update that. It was just a temporary, lazy way of having it dynamically exported over SMB without modifying smb-extra.conf. Better ideas are welcome; I'm new to unRAID. Out of curiosity, I checked whether it helped the "not supported" issue, but unfortunately it did not.
  2. I have a 3-drive array + 1 parity disk and 1 cache drive. For a share with cache=yes and a directory that hasn't been written to recently (it seems to happen when the directory doesn't exist yet on /mnt/cache/), writes initially fail as "not supported" but become possible soon after. EDIT: It's not a time delay. Only one directory at a time gets created on /mnt/cache/$share to match what is on /mnt/user/$share, see my test here. The array disks are Btrfs (encrypted), the cache is XFS (encrypted), there is plenty of free space on both, and everything is spun up. This does not occur on shares with cache=no. I ran tools like Docker Safe New Permissions with no effect. It's reproducible just by browsing to a different directory that exists on the array but not on /mnt/cache yet, and it also occurs in safe mode; diagnostics are attached to the latest post. I'm on 6.8.1, and Windows, via SMB, reports the write as not supported. Diagnostics: tower-diagnostics-20200118-1549.zip
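A rough sketch of how I check the mismatch locally (share and directory names are placeholders; the actual "not supported" error shows up over SMB from Windows, this just illustrates the cache/array directory difference and the failing write path):

```python
# Hypothetical reproduction sketch: check whether a directory that exists on the
# array (/mnt/user/<share>) is missing from the cache (/mnt/cache/<share>),
# then attempt a write through the user share and report any error.
import os

SHARE = "media"          # assumption: a cache=yes share
SUBDIR = "new_folder"    # assumption: exists on the array, not yet on the cache

user_path = os.path.join("/mnt/user", SHARE, SUBDIR)
cache_path = os.path.join("/mnt/cache", SHARE, SUBDIR)

print("exists on /mnt/user :", os.path.isdir(user_path))
print("exists on /mnt/cache:", os.path.isdir(cache_path))

try:
    # Writing through the user share is the path that fails for me.
    with open(os.path.join(user_path, "test_write.tmp"), "w") as f:
        f.write("test\n")
    print("write succeeded")
except OSError as e:
    print("write failed:", e)
```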
  3. While it was a conscious decision to exclude the cache, I and many others like to store some data on the cache longer term. Would it be possible to have an option to check cache disks for integrity? The cache happens to be my only disk without Btrfs, so no scrub is available either.
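In the meantime, this is roughly the kind of check I'm after, done by hand (paths and manifest location are my own placeholders, not anything Unraid provides; hash every file on the cache and compare against a saved manifest):

```python
# Rough manual stand-in for a cache integrity check (no Btrfs scrub on XFS):
# hash every file under /mnt/cache and compare against a previously saved manifest.
import hashlib
import json
import os

CACHE_ROOT = "/mnt/cache"                    # assumption
MANIFEST = "/boot/config/cache_hashes.json"  # assumption: kept on the flash drive

def hash_file(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                return h.hexdigest()
            h.update(block)

def build_manifest(root):
    return {
        os.path.join(dirpath, name): hash_file(os.path.join(dirpath, name))
        for dirpath, _, files in os.walk(root)
        for name in files
    }

current = build_manifest(CACHE_ROOT)
if os.path.exists(MANIFEST):
    with open(MANIFEST) as f:
        previous = json.load(f)
    for path, digest in previous.items():
        if path in current and current[path] != digest:
            print("possible corruption:", path)
with open(MANIFEST, "w") as f:
    json.dump(current, f)
```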
  4. I have an issue where the vpn goes down but the watchdog doesn't detect it. I've attached by supervisord.log with debug turned on. From looking at watchdog.sh it seems that I still technically have a vpn connection (port, ip address), it's just I can no longer access anything (e.g. pinging google or anything else doesn't work). Since I know it's likely just my VPN provider is poor and has a unique failure case, what's the best way to add / modify a script to the docker (e.g. checking ping and then touching /tmp/portclosed) to check this yet still get any updates from binhex's upstream docker? I could do this externally with a user.script and docker exec -d, but I'm not seeing a way around having to run this regularly and checking if the script is running as there's no way to synchronize with docker starting the container or restarting it. 20200116_vpn_notresponding_supervisord.zip
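For reference, this is the kind of extra check I have in mind, as a rough sketch (the target host, interval, and retry count are my own placeholders, and it assumes touching /tmp/portclosed trips the existing watchdog the same way a closed port does):

```python
# Rough sketch of an extra connectivity check run inside the container:
# ping an external host through the VPN and, on repeated failure, touch
# /tmp/portclosed so the existing watchdog restarts the tunnel.
import pathlib
import subprocess
import time

TARGET = "8.8.8.8"        # assumption: any host only reachable when the VPN works
INTERVAL = 60             # seconds between checks (assumption)
FAILURES_BEFORE_FLAG = 3  # tolerate brief blips (assumption)

failures = 0
while True:
    ok = subprocess.run(
        ["ping", "-c", "1", "-W", "5", TARGET],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode == 0
    failures = 0 if ok else failures + 1
    if failures >= FAILURES_BEFORE_FLAG:
        # Assumes the watchdog treats this file the same as a closed port.
        pathlib.Path("/tmp/portclosed").touch()
        failures = 0
    time.sleep(INTERVAL)
```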
  5. I had similar symptoms, using an older Samsung 830 SSD as a single Btrfs LUKS-encrypted cache. When copying a very large file, iowait would hit the 80s, the system would eventually become unresponsive, and write speeds were around 80 MB/s. However, moving to LUKS-encrypted XFS did not help things at all. In my case it came down to the LUKS encryption: with a non-encrypted cache, either Btrfs or XFS, iowait was much lower and write speeds were around 200 MB/s. That's despite being on an i7-3770, which has AES acceleration, with barely any CPU utilization. One guess is that the 830's controller doesn't handle incompressible data well, but looking at reviews, that's where it shined compared to SandForce controllers. Some searching led me to this post. Setting the IO scheduler to none for my cache drive helped a bit, but lowering nr_requests with any IO scheduler helped more, at least in my case.
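For anyone wanting to try the same, a minimal sketch of the sysfs tweaks (needs root; the device name and values are examples, not recommendations, and they don't persist across reboots):

```python
# Minimal sketch: set the I/O scheduler and lower nr_requests for the cache
# device via sysfs. Device name and values are examples only.
DEVICE = "sdb"             # assumption: the cache SSD
SCHEDULER = "mq-deadline"  # "none" also helped a bit; lowering nr_requests mattered more
NR_REQUESTS = "4"          # well below the usual default of 64 or 128

base = f"/sys/block/{DEVICE}/queue"

with open(f"{base}/scheduler", "w") as f:
    f.write(SCHEDULER)

with open(f"{base}/nr_requests", "w") as f:
    f.write(NR_REQUESTS)

# Read the current values back to confirm.
for name in ("scheduler", "nr_requests"):
    with open(f"{base}/{name}") as f:
        print(name, "=", f.read().strip())
```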
  6. So the issue is that one of my failing drives, which I'm running tests on, has become unresponsive and doesn't answer smartctl. It's interesting, though, since it is still making a bit of progress on badblocks, and the smartctl query that's hung even has a timeout parameter, but that doesn't help. The GUI is also querying that drive and hanging. There's nothing in dmesg about the drive. Removing the drive restores everything without rebooting.
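For what it's worth, a sketch of how an external query could at least bound the wait on a drive like this, by wrapping smartctl in a process-level timeout (the device path is a placeholder; this doesn't fix the underlying hang):

```python
# Sketch: bound how long a SMART query can block by wrapping smartctl in a
# process-level timeout, since smartctl's own timeout option didn't help here.
import subprocess

DEVICE = "/dev/sdX"  # placeholder for the unresponsive drive

try:
    result = subprocess.run(
        ["smartctl", "-a", DEVICE],
        capture_output=True,
        text=True,
        timeout=30,  # give up after 30 seconds instead of hanging indefinitely
    )
    print(result.stdout)
except subprocess.TimeoutExpired:
    # If the drive is stuck in uninterruptible I/O, even killing the process
    # may lag, but at least the caller stops waiting.
    print(f"{DEVICE} did not respond to smartctl within 30s; skipping it")
```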
  7. Thanks for the suggestions. I tried closing all browser sessions. I even sent SIGSTOP to preclear and badblocks and paused all docker containers, and the issue remains. The system is very responsive via the terminal and the docker container GUIs, even before any of the above. Preclear and badblocks had been running for days before this issue (I'm doing a lot of passes, as these are old drives and I'm paranoid). I do have a swapfile on my SSD on this machine until more memory arrives, but nothing is being swapped in or out (according to dstat). I've captured top before and after pausing everything (36% idle / 56% iowait and 72% idle / 25% iowait, respectively); let me know if there's anything else I can provide. Is restarting the nginx service safe to do, and a potential way of recovering? Though perhaps it's worth keeping the system in this state to figure out what happened. top-2020-01-14.zip
  8. I uploaded a zip file just now. The diagnostics command still hasn't completed and isn't using very much CPU, so it seems like another, potentially related issue; that's why I just grabbed the syslog. Let me know what other information I can provide to help debug both the GUI issue and the diagnostics collection. I'm comfortable with modifying some PHP code in /usr/local/sbin/diagnostics.
  9. unRAID's GUI stopped responding for me, reporting "500 Internal Server Error", with lots of "upstream timed out (110: Connection timed out)" and "auth request unexpected status: 504 while sending to client" in the syslog. All of my docker GUIs, shares, SMB, SSH, etc. still seem to work fine. I ran diagnostics on the command line, but it seems to be taking a very long time. I am running a fair amount of stuff (duplicati doing a large backup, downloads, preclears, and badblocks on unassigned drives that turned up issues), but it had been working fine for hours. I'll post my syslog for now and the diagnostics when that process finishes. Along with finding a cause, is there a way to recover the GUI without rebooting? Issues started around Jan 14 13:30. This is a recent setup: I started with 6.8.0 a few weeks ago and upgraded to 6.8.1 a few days ago.
  10. Unraid has redundancy, it's just distributed differently. And yes, the idea is that parity would not be synced in this scenario, because on the first read after bitrot occurs, the individual disk's filesystem notices a checksum mismatch. Unraid would then need to read all the corresponding sections from the other disks to figure out which exact bits changed in the extent. Yes, literally rebuilding from parity, but automatically and for just that one corrupted extent. This would require more synchronization between a checksumming filesystem and Unraid, and maybe that is not easily achievable.
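To make the idea concrete, a toy sketch with single (XOR) parity: the per-block checksum identifies which disk's copy went bad, and XOR of the parity block with the remaining good data blocks rebuilds just that block (all block contents here are made up):

```python
# Toy illustration of the proposed repair path with single (XOR) parity:
# a recorded checksum identifies WHICH disk's block is bad, then XOR of the
# parity block with the remaining good data blocks rebuilds it.
import hashlib
from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Three data disks plus one parity disk, one 8-byte block each.
data = [b"disk0---", b"disk1---", b"disk2---"]
parity = reduce(xor, data)

# Checksums recorded at write time (what Btrfs / File Integrity keep).
checksums = [hashlib.sha256(d).hexdigest() for d in data]

# Simulate silent corruption on disk 1.
data[1] = b"disk1-X-"

# Detection: the checksum tells us exactly which disk's block is bad.
bad = next(i for i, d in enumerate(data)
           if hashlib.sha256(d).hexdigest() != checksums[i])

# Repair: XOR the parity with all the *other* data blocks to rebuild the bad one.
rebuilt = reduce(xor, [d for i, d in enumerate(data) if i != bad], parity)
data[bad] = rebuilt

assert hashlib.sha256(data[bad]).hexdigest() == checksums[bad]
print(f"disk {bad} block repaired from parity:", data[bad])
```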
  11. I see where the misunderstanding is coming from. You've missed an important part of my feature request: integration with checksumming. That is what both BTRFS and the Dynamix File Integrity plugin I mentioned offer for detecting errors. A checksum tells you which drive has the error, and where. Then it's a matter of using parity to determine which bit(s) are corrupted. This is essentially how BTRFS and ZFS can do that silent-corruption repair when they have parity. Does that make sense?
  12. Please help me understand how that statement is incorrect, unless you're being pedantic and would prefer "the data can be reconstructed from parity and the other drives".
  13. Presumably with a watt meter that the server's PSU plugs into.
  14. +1. I have drives of varying size, and it's not easy to get the smaller ones used automatically; high-water bases its threshold on the largest drive. The way I have it set up now is to have certain shares exclude my large drives, which isn't ideal.
  15. That error means some data didn't transfer correctly over the SATA link, but it was able to correct itself via a retry. Since it's just that error and none of the other important SMART attributes have issues, I would just monitor it and see if it increases; it could have been a one-off problem that is no longer occurring. If it is still increasing, try changing the SATA cable and make sure it's not wrapped around any power cables. https://hardforum.com/threads/what-is-udma_crc_error_count.1558825/
  16. I would love to see silent-corruption detection natively integrated with parity checking, for automatic repair of the corruption. On a single drive, BTRFS can only detect that there is corruption, but the data needed to fix it exists in parity; you shouldn't have to go dig up a backup for this. The same goes for the Dynamix File Integrity plugin, from what I understand.
  17. I love that you can grow your storage gradually. I would love to see silent-corruption detection natively integrated with parity checking, for automatic repair of the corruption. On a single drive, BTRFS can only detect that there is corruption, but the data needed to fix it exists in parity; you shouldn't have to go dig up a backup for this. The same goes for the Dynamix File Integrity plugin, from what I understand.