February 11, 2011

This is a problem for which there are some obvious solutions, but I gotta believe it's already been done. I want something lightweight and flexible.

I have a LOT of files on unRAID. How can I get some assurance that the fileset is complete and correct?

The obvious solution is to write scripts that walk the subdirectories and stick the files in a database (or maybe NoSQL in this case, as there isn't much benefit to a relational db) with data like:

- drive
- path
- filename
- actual filesize
- hash type (md5, sha, etc.)
- [each hash]
- date/time first seen
- date/time last checked

Bonus points if it's written in something portable like Python.

There are a lot of advantages to such a db. I could use it for hash-based deduping. I could use it for heavy backups (create multiple, independent instances of each and every file, to build in LVM quorum-like things with 3 copies of everything). I could detect bit rot, and correct it if backups are available.

This is more important than ever because all our life's work has left the physical domain and entered the digital one. In my case, I'm thinking about thousands of pictures of my kids. Surely you guys must share this problem?
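To make the idea concrete, here is a rough Python sketch of the walk-and-hash approach described above. It is only an illustration under several assumptions: it uses SQLite from the standard library rather than a NoSQL store, it defaults to SHA-256, and the table name `files` and the helpers `hash_file`, `open_db`, `scan`, and `duplicates` are made up for this example. The "drive" value is just a label the caller passes in.

```python
import hashlib
import os
import sqlite3
import time


def hash_file(path, algo="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files never sit fully in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def open_db(db_path):
    """Open (or create) the catalog. The schema mirrors the field list above."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS files (
               drive TEXT, path TEXT, filename TEXT, size INTEGER,
               hash_type TEXT, hash TEXT,
               first_seen REAL, last_checked REAL,
               PRIMARY KEY (drive, path, filename, hash_type))"""
    )
    return conn


def scan(conn, drive, root, algo="sha256"):
    """Walk root, record new files, and return paths whose hash changed."""
    mismatches = []
    now = time.time()
    for dirpath, _dirs, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and special files
            digest = hash_file(full, algo)
            key = (drive, rel, name, algo)
            row = conn.execute(
                "SELECT hash FROM files WHERE drive=? AND path=? "
                "AND filename=? AND hash_type=?", key).fetchone()
            if row is None:
                # First time this file is seen: store its baseline hash.
                conn.execute(
                    "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                    (drive, rel, name, os.path.getsize(full),
                     algo, digest, now, now))
            else:
                if row[0] != digest:
                    mismatches.append(full)  # keep the stored baseline hash
                conn.execute(
                    "UPDATE files SET last_checked=? WHERE drive=? AND path=? "
                    "AND filename=? AND hash_type=?", (now,) + key)
    conn.commit()
    return mismatches


def duplicates(conn):
    """Return hashes that occur more than once (candidates for deduping)."""
    return [r[0] for r in conn.execute(
        "SELECT hash FROM files GROUP BY hash_type, hash HAVING COUNT(*) > 1")]
```

Running `scan(open_db("catalog.db"), "disk1", "/mnt/disk1")` once would build the baseline (the database name and mount path here are placeholders), and later runs would list files whose contents no longer match. A changed hash does not by itself prove bit rot, since the file may have been edited on purpose, so a real tool would probably also compare modification times before raising an alarm.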
This topic is now archived and is closed to further replies.