ptr727

Everything posted by ptr727

  1. Flexible, yes, exactly the reason I chose to go from hardware RAID6 to Unraid. A dead disk meaning a single-disk loss, I don't care; but a partial loss, when viewed as a whole, is still a loss when a consistent state can only be recovered by restoring from backup. And if FUSE breaks the file locking that normally works on the underlying XFS disks, then my view of the logical filesystem is broken. I know I'm starting to sound like a broken record, and I don't want my feelings to reflect negatively on the great work done by LSIO, but if it really is FUSE that breaks file locking, then that is a big deal to me, and I'll dig down the rabbit hole.
  2. Ok, I didn't know that individual disks remain protected. As for the rabbit hole, the fact that it works for some users and not for others makes it even scarier to me. In my mind a filesystem is foundational, and it needs to be rock solid, always. I now regret not giving the ZFS-based solutions a second look.
  3. With "our issue", do you mean Unraid or LSIO? Yes, not an LSIO issue, but absolutely an Unraid issue; breaking filesystem locking is a big no-no. <rant-on> If appdata is "designed" to be on cache, then I have not seen any pre-purchase docs that told me that without a cache my Unraid will break if any apps require filesystem locking to, you know, work, before I spent good money on two Pro licenses and converted two working systems to Unraid. <rant-off> As for it only breaking SQLite, it probably breaks any code that relies on filesystem locking semantics to work, but SQLite in Docker is just prevalent. Now that I know what to search for, Google and GitHub are full of reports of Docker-based Sonarr, Radarr, Plex, etc. SQLite corruption on Unraid. I don't like the idea of having to use a dedicated disk to bypass FUSE breaking locking. Two alternatives come to mind: change the SQLite locking semantics used in Docker apps running on Unraid, or obviously have Unraid fix locking. Has there been any attempt at collaboration with Sonarr / Radarr / Lidarr / SQLite to try and find a SQLite configuration that works on Unraid FUSE?
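The "change the SQLite locking semantics" alternative could look something like this minimal Python sketch. This is a hypothetical mitigation, not what Sonarr/Radarr actually ship: it uses SQLite's real busy timeout and journal_mode pragmas to make a contended database retry instead of failing, and avoids WAL mode, which depends on shared-memory locking that is unreliable on network and FUSE filesystems. The database path is illustrative.

```python
import os
import sqlite3
import tempfile

# Illustrative database location; a real app would use its config path.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")

# timeout=30 makes the driver wait up to 30s on a locked database
# instead of immediately raising "database is locked".
con = sqlite3.connect(db_path, timeout=30)

# TRUNCATE journal mode avoids WAL; WAL relies on shared-memory
# locking that is known to misbehave on network/FUSE filesystems.
journal_mode = con.execute("PRAGMA journal_mode = TRUNCATE").fetchone()[0]

con.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY)")
con.execute("INSERT INTO t DEFAULT VALUES")
con.commit()
row_count = con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
con.close()
```

Whether this actually survives broken FUSE locking would still need testing; the point is only that SQLite's lock behavior is configurable at the application level.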
  4. Wow, really? That seems like a major problem for any code that relies on file system locking to work like a local fs. If I am now forced to use a single disk for any app that relies on fs locking, it also breaks the intent of a resilient filesystem, i.e. my app dies when the disk dies, vs. my app dies when disk plus parity die. If I did add a cache, would BTRFS be impacted by FUSE? I assume not. Any pointers to docs about the issue with FUSE and locking, or plans to address the issue?
  5. Here is my command:

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker run -d --name='sonarr' --net='br0' --ip='192.168.1.16' -e TZ="America/Los_Angeles" -e HOST_OS="Unraid" -e 'TCP_PORT_8989'='8989' -e 'PUID'='99' -e 'PGID'='100' -v '/mnt/user/download/':'/downloads':'rw' -v '/mnt/user/media/':'/media':'rw' -v '/mnt/user/appdata/sonarr':'/config':'rw' 'linuxserver/sonarr'
31eff3d78164112cccc1355564430e0f218094f8dc062d58488b6b3e56bbc260

I do not currently have any cache drives. I use /mnt/user/appdata for /config. Btw, I now notice that Plex is also misbehaving. Plex was unresponsive, restarted container:

...
Sqlite3: Sleeping for 200ms to retry busy DB.
Sqlite3: Sleeping for 200ms to retry busy DB.
Sqlite3: Sleeping for 200ms to retry busy DB.
Sqlite3: Sleeping for 200ms to retry busy DB.
Sqlite3: Sleeping for 200ms to retry busy DB.
Sqlite3: Sleepin
Critical: libusb_init failed
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
[cont-init.d] 60-plex-update: exited 0.
[cont-init.d] 99-custom-scripts: executing...
[custom-init] no custom scripts found exiting...
[cont-init.d] 99-custom-scripts: exited 0.
[cont-init.d] done.
[services.d] starting services
Starting Plex Media Server.
[services.d] done.
Starting Plex Media Server.
Starting Plex Media Server.
Starting Plex Media Server.
Starting Plex Media Server.
Starting Plex Media Server.
Starting Plex Media Server.
...
  6. For the USB creator tool, please enable UEFI by default. It took a bit of head scratching to figure out why the BIOS showed UEFI as a boot option for the USB device, but it would not boot. Once I recreated the USB drive with the UEFI option selected under advanced settings, it showed UEFI and booted UEFI. From what I understand about the boot format, adding the UEFI option will still allow legacy booting, but not having the UEFI option will break UEFI booting.
  7. Big deal? I don't know, who said anything about a big deal? I just asked if there is an alternative to recreating the container to get a log, and since it is such a big deal (see what I did there) to get the log, suggested an enhancement to make the details end up in logs. Btw, I found several other reports of DB corruption involving Sonarr, Radarr, Docker, and Unraid; could be coincidental. E.g.:

https://github.com/Sonarr/Sonarr/issues/1886
https://github.com/docker/for-win/issues/1385
https://forums.sonarr.tv/t/nzbdrone-db-constant-corruption-docker/17658
https://forums.sonarr.tv/t/database-file-config-nzbdrone-db-is-corrupt-restore-from-backup-if-available/21928

And this FAQ: https://github.com/Sonarr/Sonarr/wiki/FAQ#i-am-getting-an-error-database-disk-image-is-malformed

I'll report the docker log when I get home.
  8. Because it is currently running, and to get the logs as suggested I need to stop and recreate the container. I was hoping there would be a historic log file I could reference; it seems important enough to log?
  9. It could be my system, but I would expect to see other symptoms, and I am not. It could just be Sonarr that messes up its own DB, and maybe it has nothing to do with Docker or the config path mapping. Where can I get a log of my docker run command without recreating the container?
  10. Hi, I am having DB corruption issues with Lidarr and Sonarr. Unraid 6.7.0: install containers, add lots of media, run for a bit, and then errors. E.g.:

System.Data.SQLite.SQLiteException (0x80004005): database disk image is malformed
database disk image is malformed
at System.Data.SQLite.SQLite3.Reset (System.Data.SQLite.SQLiteStatement stmt) [0x00083] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLite3.Step (System.Data.SQLite.SQLiteStatement stmt) [0x0003c] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteDataReader.NextResult () [0x0016b] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteDataReader..ctor (System.Data.SQLite.SQLiteCommand cmd, System.Data.CommandBehavior behave) [0x00090] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at (wrapper remoting-invoke-with-check) System.Data.SQLite.SQLiteDataReader..ctor(System.Data.SQLite.SQLiteCommand,System.Data.CommandBehavior)
at System.Data.SQLite.SQLiteCommand.ExecuteReader (System.Data.CommandBehavior behavior) [0x0000c] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteCommand.ExecuteScalar (System.Data.CommandBehavior behavior) [0x00006] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteCommand.ExecuteScalar () [0x00006] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at Marr.Data.QGen.InsertQueryBuilder`1[T].Execute () [0x00046] in C:\BuildAgent\work\5d7581516c0ee5b3\src\Marr.Data\QGen\InsertQueryBuilder.cs:140
at Marr.Data.DataMapper.Insert[T] (T entity) [0x0005d] in C:\BuildAgent\work\5d7581516c0ee5b3\src\Marr.Data\DataMapper.cs:728
at NzbDrone.Core.Datastore.BasicRepository`1[TModel].Insert (TModel model) [0x0002d] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Datastore\BasicRepository.cs:111
at NzbDrone.Core.Messaging.Commands.CommandQueueManager.Push[TCommand] (TCommand command, NzbDrone.Core.Messaging.Commands.CommandPriority priority, NzbDrone.Core.Messaging.Commands.CommandTrigger trigger) [0x0013d] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Messaging\Commands\CommandQueueManager.cs:82
at System.Dynamic.UpdateDelegates.UpdateAndExecute4[T0,T1,T2,T3,TRet] (System.Runtime.CompilerServices.CallSite site, T0 arg0, T1 arg1, T2 arg2, T3 arg3) [0x00136] in <35ad2ebb203f4577b22a9d30eca3ec1f>:0
at (wrapper delegate-invoke) System.Func`6[System.Runtime.CompilerServices.CallSite,NzbDrone.Core.Messaging.Commands.CommandQueueManager,System.Object,NzbDrone.Core.Messaging.Commands.CommandPriority,NzbDrone.Core.Messaging.Commands.CommandTrigger,System.Object].invoke_TResult_T1_T2_T3_T4_T5(System.Runtime.CompilerServices.CallSite,NzbDrone.Core.Messaging.Commands.CommandQueueManager,object,NzbDrone.Core.Messaging.Commands.CommandPriority,NzbDrone.Core.Messaging.Commands.CommandTrigger)
at NzbDrone.Core.Messaging.Commands.CommandQueueManager.Push (System.String commandName, System.Nullable`1[T] lastExecutionTime, NzbDrone.Core.Messaging.Commands.CommandPriority priority, NzbDrone.Core.Messaging.Commands.CommandTrigger trigger) [0x000b7] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Messaging\Commands\CommandQueueManager.cs:95
at NzbDrone.Core.Jobs.Scheduler.ExecuteCommands () [0x00043] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Jobs\Scheduler.cs:42
at System.Threading.Tasks.Task.InnerInvoke () [0x0000f] in <6649516e5b3542319fb262b421af0adb>:0
at System.Threading.Tasks.Task.Execute () [0x00000] in <6649516e5b3542319fb262b421af0adb>:0

Is there a systemic problem with LSIO and Unraid 6.7.0, or is there something wrong with Sonarr/Radarr/Lidarr?
  11. Hi, I just installed on a new server running 6.7.0, and I pre-cleared 24 drives. When I got back to the web UI, I got a send-telemetry alert: ok, send; another alert, ok, send; another, ok, send, but slower; then slower and slower; send again, no response. It is really annoying to have to send what appears to be one report per drive; please just send once and be done with it, or ask if I want to send and then send automatically. There is also some problem where the system becomes less and less responsive with every send. I have now waited several minutes, refreshed the page, and still no UI. I can see from the Chrome trace that dynamix.js is doing something, and it took about 400s for it to render the page after the last send button press. See timeline in Chrome developer tools screenshot.
  12. Unraid 6.7.0. Server name is Server-2, local TLD is set to "home.insanegenius.net". Static IP, with a DNS entry for server-2.home.insanegenius.net. I am using my own wildcard certificate for *.home.insanegenius.net:

CN = *.home.insanegenius.net
OU = PositiveSSL Wildcard
OU = Domain Control Validated

I copy my PEM file to config/ssl/certs/certficate_bundle.pem. This server is called server-2.home.insanegenius.net. When I access https://server-2.home.insanegenius.net all is well, and it uses the *.home.insanegenius.net certificate as expected. When I access http://server-2.home.insanegenius.net, I get a 302 redirect, and the browser tries to open https://%2A.home.insanegenius.net/. Using Google Chrome developer tools, I can see the 302 redirect as follows:

Request:
Request URL: http://server-2.home.insanegenius.net/
Request Method: GET
Status Code: 302 Moved Temporarily
Remote Address: 192.168.1.36:80
Referrer Policy: no-referrer-when-downgrade

Response:
Connection: keep-alive
Content-Length: 154
Content-Type: text/html
Date: Thu, 16 May 2019 15:56:28 GMT
Location: https://*.home.insanegenius.net:443/
Server: nginx

Nginx incorrectly returns an invalid URI, "https://*.home.insanegenius.net:443", instead of the correct "https://server-2.home.insanegenius.net". When I look in the Nginx emhttp-servers.conf file, I can see that whatever code created this config created an invalid 302 redirect value:

server {
    #
    # Redirect http requests to https
    #
    listen *:80 default_server;
    listen [::]:80 default_server;
    return 302 https://*.home.insanegenius.net:443$request_uri;
}

It looks like the logic incorrectly uses the SSL cert CN instead of the server FQDN. The logic should be fixed, or the problem can be avoided by using something generic like "return 302 https://$host$request_uri;".
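For reference, the generic-redirect fix suggested in the post would make the server block look something like this sketch (assuming nothing else in Unraid's generated config depends on the CN-based URL; `$host` is nginx's standard variable for the host the client actually requested):

```nginx
server {
    # Redirect HTTP requests to HTTPS, preserving the host the
    # client asked for instead of the certificate CN.
    listen *:80 default_server;
    listen [::]:80 default_server;
    return 302 https://$host$request_uri;
}
```

With this form the redirect works for any server name, so a wildcard certificate CN can never leak into the Location header.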
  13. Ok, I discovered a few interesting things. Unraid SMB fails to set correct timestamps if the NTFS timestamps are not Linux friendly, i.e. not between 1970 and 2038. I had several files with NTFS timestamps at either the min value (1601) or the max value (30828). Explorer does not display a timestamp older than 1970. I also discovered some folders whose names ended in a dot, and Unraid SMB discards the trailing dot: I had a folder called "LocalState" and another called "LocalState..", and when the "LocalState.." folder was copied, it was copied into the "LocalState" folder. I wrote a little app (C# .NET Core) that iterates through all the files I want to copy to Unraid, fixes the timestamps, and warns me of files or folders that end with a dot. Once all my files were sanitized, robocopy completed successfully.
  14. I don't think this is a permission problem; this is a file timestamp problem. The files that will always be "older" are files that have a created date, but no modified or accessed date, interpreted as "1/1/1970 12:00:00 AM" UTC. I wrote a .NET Core app to set the destination file timestamps to the source timestamps; it does not work. To confirm, I wrote a native C++ app to do the same; same behavior. robocopy NTFS to NTFS: ok. robocopy NTFS to SMB W2K16 NTFS: ok. robocopy NTFS to SMB Unraid: fail. I did find various bug reports of CIFS not correctly preserving times, and in Linux there is no created time, just changed (attributes), modified (contents), and accessed times. I am not certain, but it also looked as if the timestamps I set were changed during the night, i.e. I set them last night, and this AM they were back to 1980 dates. I will have to look more. Try it out using the attached 7z file; extract to NTFS and run: robocopy.exe "C:\Users\piete\Downloads\ttk" "\\UNRAID\Transfer\test" /mir /fft The expected behavior is robocopy copies once, but the actual behavior is files report as older and copy over and over. ttk.7z
  15. I am copying files from my W2K16 server to Unraid using robocopy over SMB. I noticed a problem where it appears that the files and folders are not being correctly synced. I do:

C:\Users\Administrator\Desktop>robocopy.exe D:\Backup \\Unraid\backup /mir /r:1 /w:1 /fft /mt

-------------------------------------------------------------------------------
   ROBOCOPY     ::     Robust File Copy for Windows
-------------------------------------------------------------------------------

  Started : Wednesday, May 8, 2019 8:58:45 PM
   Source : D:\Backup\
     Dest : \\Unraid\backup\
    Files : *.*
  Options : *.* /FFT /S /E /DCOPY:DA /COPY:DAT /PURGE /MIR /MT:8 /R:1 /W:1

------------------------------------------------------------------------------

*EXTRA Dir -1 \\Unraid\backup\Profiles\SZ170R6V2\piete\AppData\Local\Packages\Microsoft.Windows.Cortana_cw5n1h2txyewy\L8EVMY~1\

I can run robocopy over and over, and that same directory will never delete, while "rd \\Unraid\backup\Profiles\SZ170R6V2\piete\AppData\Local\Packages\Microsoft.Windows.Cortana_cw5n1h2txyewy\L8EVMY~1\" does delete the folder. See the attached image: the directory on Unraid is there, but there are also directories that are not syncing, e.g. the "LocalState.." directory, note the double dots at the end of the folder name. When I rerun robocopy, I get hundreds of files reported as older on Unraid, when those files have never been changed. I am using the FAT file time /FFT option to work around the lack of nanosecond granularity on EXT4/XFS vs. NTFS. Same as the directory that does not delete, I can run robocopy clean, run it again, and it will copy a bunch of files again. It is as if the file timestamps are even less granular than /FFT, or the file timestamps get changed by Unraid after the copy, invalidating the previous operation. When I looked at two files where the timestamps were wrong, I could see that the Unraid timestamps are not the same as those on the W2K16 server. Is this an Unraid / SMB timestamp problem? See image. I am concerned that my data copy is not accurate. Any known problems with robocopy, or differences between Windows handling of names and Unraid handling of names over SMB?
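The /FFT behavior described above can be sketched in a few lines of Python. robocopy's documented /FFT semantics assume FAT file times with two-second precision, so timestamps within 2 seconds compare as equal; the function and sample values below are illustrative, not robocopy's actual code.

```python
def same_time_fft(src_mtime: float, dst_mtime: float) -> bool:
    """Treat timestamps as equal if within FAT's 2-second
    granularity, mimicking robocopy's /FFT comparison."""
    return abs(src_mtime - dst_mtime) < 2.0

# NTFS keeps 100ns precision, so sub-second rounding on the server
# is absorbed by the tolerance; but if the server rewrites the
# stored mtime by more than 2 seconds (or resets it to the epoch),
# the file compares as "older" on every run and is copied again.
print(same_time_fft(1000.0000001, 1000.9))   # within tolerance
print(same_time_fft(1000.0, 0.0))            # server reset the time
```

This matches the symptom in the post: /FFT can only mask small granularity differences, not timestamps that the server actively changes after the copy.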
  16. I didn't say it is easy, but it can be done (time, money, resources), e.g. an SMB handle is different from an FS handle and can be remapped as needed. A cache that needs reserved space in anticipation of a large file is wasted space, e.g. a thin provisioned VM image grows beyond the reserved size and fails permanently, or you copy a file that is too big and it fails permanently. An alternative is obviously to support SSD drives as data storage; then there is no longer a need for a cache of SSDs when the main array is made of SSDs.
  17. Hi, I am trying to use SSI, but it looks like it is not working as expected. Is SSI support enabled for nginx in this container? Update: After waiting a few minutes and refreshing the page, SSI started working. Weird: on first use it takes about 15s for the first page load, with no SSI; come back later, refresh, and SSI works. Any ideas how to make the first page load faster, and how to get SSI working on first load?
  18. Hi, I temporarily removed my cache to make space for new HDDs, and noticed the container would not start. I think the issue is that the config path explicitly points to a cache path instead of the appdata share.

Maybe wrong: /etc/netdata/override : /mnt/cache/appdata/netdata/
Works with no cache: /etc/netdata/override : /mnt/user/appdata/netdata
  19. Hi, this seems like a really cool project. I currently run nginx with a handcrafted configuration and custom SSL certs, and I will be trying to replace my custom config with yours. One thing I would like to ask you to add is hosting a simple static site using nginx. E.g. I have a static page that links to all the services in my home; my wife has a bookmark to it, and she uses it whenever she needs to access something I host. In my config the default site, and an explicit host, point to a directory, and I manually edit the content as needed. If you could add such a feature, I would not have to run a second nginx instance.
  20. Yes please. The cache needs to be transparent, and never interfere with file operations. Set cache high and low watermarks, as a % of space or absolute space: at the high watermark start moving files, e.g. at less than 10% free start moving; at the low watermark stop moving files, e.g. at more than 50% free stop moving. Pick files to move by age and access count, e.g. move the least accessed files first. Many well behaved apps and bulk copy apps, e.g. robocopy, will reserve space before copying file contents. This is an ideal opportunity for "thin provisioning" systems to allocate the storage in a physical location with enough space. E.g. min free space set to 2GB, an app creates a new file and sets the file size to 4GB; before the app starts writing the content, move the file creation to a drive with space, and by the time the write happens there is enough space. In this scenario the only failure case would be create, write, write, out of space.
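The watermark policy proposed above is a classic hysteresis loop, and can be sketched in a few lines. This is only an illustration of the suggestion, not anything Unraid implements; the 10% start and 50% stop thresholds come from the post, the function name is hypothetical.

```python
def mover_should_run(free_fraction: float, currently_moving: bool,
                     high_start: float = 0.10,
                     low_stop: float = 0.50) -> bool:
    """Hysteresis: start moving when free space drops below the high
    watermark, keep moving until free space rises above the low
    watermark, otherwise keep the current state."""
    if free_fraction < high_start:
        return True       # cache nearly full: start/keep moving
    if free_fraction > low_stop:
        return False      # plenty of room: stop moving
    return currently_moving  # in between: no state change
```

The two thresholds prevent the mover from flapping on and off around a single limit, which is exactly the "transparent, never interfere" behavior the post asks for.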
  21. Ok, so the only time a file copy should fail is if it is larger than the min free space and the copy operation did not try to reserve space before copying? My experience shows that performance starts to severely degrade as soon as space nearly runs out. I stopped copying, started the mover, and will take the array offline, increase min free space, and try again. It does seem like a high and low watermark for the cache, with auto move, is missing.
  22. Ok, that is fine, but that means it had better never run out of space. Will it start moving files when it is near out of space?
  23. Looks like cache min free space is at the default of 2000000. Per the help, that is "2000000 => 2,048,000,000 bytes" = 2GB. Some files may be larger, but if the cache is not large enough for the file, should it not go direct to HDD? I understand that based on how the write happens the filesystem may not know the final size, but still: run out of space, write to HDD; or get near out of space, start moving files. Or is it run out of space and fail?
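The help text's conversion checks out if the setting is in KiB, which appears to be the implied unit (an inference from the numbers, not something the help states outright):

```python
# Min free space setting of 2,000,000, interpreted as KiB:
min_free_kib = 2_000_000
min_free_bytes = min_free_kib * 1024
print(min_free_bytes)  # 2048000000, i.e. roughly 2 GB
```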
  24. I am busy moving several TB of data from my existing server to Unraid. I have a 4 x 1TB SSD cache, plus several HDDs. I tested performance with the share set not to use the cache, and it ran at about 400Mbps; I could see disk IO split between a storage drive and the parity drive. If I enable cache preferred, I get about 1Gbps, the expected saturation of the network, and all IO goes to the SSD cache. As the cache fills up, I get tons of warnings, and at around 95% full the speed drops to between 0 and a few Kbps, and the client starts timing out. I expect full speed until the cache is near full, then back to HDD speeds of around 400Mbps; I do not expect the system to fall over or crawl to a halt. I had to turn the cache function off as it is not working as I expect. Is there anything I can do to prevent the system becoming unresponsive when the cache gets full?
  25. Ah, got it, working now. Please add the instructions to your git md and the support thread; it will save many people lots of time getting things working.