NFS, Hardlinks, and Postgresql Issues (Read: Lots of Issues)

peachyojon · April 28, 2022

I should preface this by saying all was going well until I changed the structure of my array (cache pools, sizes, etc.) and moved some files with unBalancer. Then all hell broke loose - but only for PostgreSQL!

Also, I don't have backups and I've accepted my loss. It isn't too bad: the db was important, but it only had recent data on it. I am sourcing a proper backup solution and won't be changing the array again, but I'm new to all this and it has been fun.

My setup is an HP DL380p G8 running Unraid 6.10.0-rc4. I've got 50+ containers running something or other, many with a mix of NFS share mounts and standard binds to /mnt/user/data or /mnt/user/appdata.

I also had issues with my NFS shares: they would drop off after a single read/write, and I had to restart the container to get them back. From reading the forums, the fix is to do one of three things: disable hard links, remove the cache, or use CIFS. I chose to disable hard links, as I assumed all three were viable options.

Note: I started using NFS shares because my *arr stack, jellyfin, nzbd, and qbit/deluge were having permission issues, even though they're mostly lsio containers running as 1000:1000 (iirc). I think the issue was trying to make '/mnt/user/data/shows/??' available to multiple containers. Advice here is welcome too.
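For the permission side of this, one thing that can be checked without touching any container is whether everything under the shared directory is actually owned by the uid:gid the containers run as. Below is a minimal sketch in Python, assuming the 1000:1000 mentioned above is correct; the function name `find_ownership_mismatches` is made up for illustration and is not part of any tool mentioned here.

```python
import os


def find_ownership_mismatches(root, uid=1000, gid=1000):
    """Return (path, uid, gid) for every entry under `root` whose owner
    differs from the expected uid:gid. The 1000:1000 default is an
    assumption taken from the post, not a verified value."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are reported as themselves, not their targets
            st = os.lstat(path)
            if st.st_uid != uid or st.st_gid != gid:
                mismatches.append((path, st.st_uid, st.st_gid))
    return mismatches


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, uid, gid in find_ownership_mismatches(root):
        print(f"{uid}:{gid}  {path}")
```

Run on the host against the share (for example with /mnt/user/data as the argument), an empty result would suggest ownership is not the problem and the container settings are worth a closer look instead.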

This fixed my NFS issue but, I think, led me to the Postgres issue - even though Postgres is not using an NFS share; it is bind mounted to /mnt/user/appdata/postgres.

My current issue: once the array was reconfigured (1 parity, 6 disks) with a new cache drive, I had almost everything working with no problems. But when I tried to start the official PostgreSQL Docker container (version 14.2), I started to get this error:

2022-04-28 18:48:12.879 ACST [29] LOG:  could not link file "pg_wal/xlogtemp.29" to "pg_wal/000000010000000000000001": Function not implemented
2022-04-28 18:48:12.882 ACST [29] FATAL:  could not open file "pg_wal/000000010000000000000001": No such file or directory

Stack Overflow posts tell me my db is f*****. Fair enough. They recommended simply reinstalling the Postgres instance and copying the /data directory back. I did this, but the issue persisted. I thought maybe disabling hard links caused it (note the 'could not link file' in the log). The kicker is that I had it working last night with a fresh DB (after literally dozens of reinstalls, without trying anything new, it worked one time), but now it's not working again.

As noted, apart from that one time it worked, simply creating a brand-new container gives the same error as above, even though the files should not exist yet.

Questions:

- Could disabling hard link support in Global Share Settings be causing this issue?

- Why would the link issue persist even when I completely reinstall the container?
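One way to narrow down the first question is to try creating a hard link directly in the directory Postgres writes to. "Function not implemented" is the system message for errno ENOSYS, so if a plain link attempt fails the same way, the filesystem under the bind mount is refusing hard links regardless of what is in the database files. A minimal sketch in Python follows; the function name `hardlinks_supported` and the `linktest-` file prefix are made up for illustration.

```python
import errno
import os
import tempfile


def hardlinks_supported(directory):
    """Try to hard link a scratch file inside `directory`.

    Returns (True, None) on success, or (False, errno_name) if the
    link call fails, e.g. (False, "ENOSYS").
    """
    # Scratch file lives in the directory under test, so the link
    # attempt goes through the same filesystem Postgres would use.
    fd, src = tempfile.mkstemp(dir=directory, prefix="linktest-")
    os.close(fd)
    dst = src + ".link"
    try:
        os.link(src, dst)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    else:
        os.unlink(dst)
        return True, None
    finally:
        # Always clean up the scratch file, whether or not linking worked.
        os.unlink(src)


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    ok, err = hardlinks_supported(target)
    status = "work" if ok else f"fail ({err})"
    print(f"{target}: hard links {status}")
```

Running this on the host against /mnt/user/appdata/postgres, once with hard link support disabled and once with it enabled, should show whether the setting alone explains the error. If it does, that would also account for the second question: a reinstalled container still writes to the same bind-mounted path, so it would hit the same failure.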

I believe my long-term fix would be to:

- Properly establish bind mounts for all containers, or

- Migrate any share to CIFS/SMB

Please let me know which log files you need.

Edited April 28, 2022 by peachyojon

peachyojon · April 30, 2022

Re-enabled hard links and migrated to CIFS to fix the issues.
