zaraki1311 Posted December 21, 2018

Hello,

Just posting to see if anyone has any thoughts as to what happened, or how to prevent this issue in the future. Last night, just before midnight, I got the errors below on my cache pool. The pool consists of four 250GB Samsung 860 EVOs that are fairly well aged, somewhere between 2-3 years of power-on time each, and it is set up as btrfs RAID 1. When I looked at my Unraid server this morning there was an error on the cache pool saying there was no file system. I rebooted the server and it still did not come back. The only thing I see with the disks is that one of them has some CRC errors in its SMART data. I decided to format the pool so I could get up and running again, but I need to know if there is something I can do to protect myself in the future, or if this might just be a drive failing and I should get new ones and move the cache to a different RAID level.

Dec 20 23:37:26 ArlongPark kernel: BTRFS critical (device sdg1): corrupt leaf: root=2 block=723569819648 slot=105, unexpected item end, have 12754 expect 12818
Dec 20 23:37:26 ArlongPark kernel: BTRFS: error (device sdg1) in __btrfs_free_extent:6953: errno=-5 IO failure
Dec 20 23:37:26 ArlongPark kernel: BTRFS info (device sdg1): forced readonly
Dec 20 23:37:26 ArlongPark kernel: BTRFS: error (device sdg1) in btrfs_run_delayed_refs:3058: errno=-5 IO failure
Dec 21 03:40:01 ArlongPark crond[2324]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
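For reference, the CRC counter mentioned above is SMART attribute 199 (UDMA_CRC_Error_Count); a rising count there usually points at a bad SATA cable or backplane connection rather than the drive itself. A minimal sketch of how to check both that counter and btrfs's own per-device error counters, assuming `/dev/sdg` is the suspect SSD and `/mnt/cache` is the pool mount (substitute your own names):

```shell
#!/bin/sh
# Assumed names: /dev/sdg is the suspect cache SSD, /mnt/cache the pool mount.
DEV=/dev/sdg
POOL=/mnt/cache

check_pool_health() {
    # SMART attribute 199: interface CRC errors (often cabling, not the drive)
    smartctl -A "$DEV" | grep -i crc
    # btrfs per-device error counters; nonzero write/flush errors are serious
    btrfs device stats "$POOL"
}

# Uncomment once DEV/POOL match your system:
# check_pool_health
```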
trurl Posted December 21, 2018

Syslog snippets are seldom sufficient. Did you get Diagnostics before rebooting?
zaraki1311 Posted December 21, 2018 (Author)

I did not specifically grab them at the time, but I do have a diagnostics file on the flash drive that contains logs from Dec 17th up to 9:20 this morning.

arlongpark-diagnostics-20181221-0920.zip
JorgeB Posted December 22, 2018

The error points to metadata corruption, but nothing jumps out; the pool appeared to be working normally until the error. Sorry, not much help.
zaraki1311 Posted December 22, 2018 (Author)

Is there anything I should try in order to protect myself better? I have been thinking about adding an NVMe drive or three to the mix and retiring the old drives. In that case, would RAID 5 be a better choice, since it almost seems the RAID 1 had no redundancy at all? Also, is there a good way to back up the cache? I am running a few VMs entirely on the cache, so I am in a bit of a bind, as the Windows backups may or may not have been working. I currently have CA Appdata Backup configured to back up my appdata, but that doesn't cover my VMs.
JorgeB Posted December 22, 2018

4 hours ago, zaraki1311 said:
Is there anything I should try in order to protect myself better?

You should back up frequently. You can either snapshot and use send/receive to another device (this is what I do), or use rsync, for example; both can be scripted to run daily. An example of how to set up snapshots with send/receive is here: https://forums.unraid.net/topic/51703-vm-faq/?do=findComment&comment=523800
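The snapshot plus send/receive approach described above can be sketched roughly as follows. All paths here are assumptions for illustration: `/mnt/cache/domains` is taken to be the VM share (and must be a btrfs subvolume, not a plain directory, for `btrfs send` to work) and `/mnt/disks/backup` a btrfs-formatted backup disk:

```shell
#!/bin/sh
# Rough sketch of a daily cache backup via btrfs snapshot + send/receive.
# Assumed paths: /mnt/cache/domains (a btrfs subvolume holding the VMs),
# /mnt/disks/backup (a btrfs-formatted backup disk).
SRC=/mnt/cache/domains
DEST=/mnt/disks/backup
SNAP="/mnt/cache/domains_snap_$(date +%Y%m%d)"

backup_vms() {
    # btrfs send requires a read-only (-r) snapshot
    btrfs subvolume snapshot -r "$SRC" "$SNAP" || return 1
    btrfs send "$SNAP" | btrfs receive "$DEST" || return 1
}

# Schedule via cron or the User Scripts plugin once the paths exist:
# backup_vms
```

Incremental runs can later pass `-p <previous-snapshot>` to `btrfs send` so only changed blocks are transferred.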