
Cache btrfs - write time tree block corruption


Solved by JorgeB

I'm having more issues with my cache drives, a 970 Evo and a 970 Evo Plus in raid0. In the past the Evo Plus was getting zero-log corruptions, which took a simple command and an array restart to fix. The current problem started when I issued a shutdown and changed the power strip the server was connected to. As far as I could tell, the server was fully shut down before I removed power.

 

Right now I have appdata, system, and domains directories stored on the cache. I have a partial old backup of a VM on the array, and full backups of appdata on the array. No backup of system.

 

I originally thought I would benefit from the cache drives, but really I only use it for the VM and quick docker container startups.

My questions are: First, will my SSDs run at full speed if they are moved into the array? The array has a 4TB HDD parity drive.

Second, how much of the cache do you think is corrupted? Could I safely back up what's left of the VM? I plan to move the entire VM image to the array, as I am also having issues with the VM manager.

Third, is there any way for me to fix the cache and continue with the current setup (though I still believe my best setup now is having the SSDs in the array)?

 

 

The problems the bad cache drives are causing me:

 

When I try to launch my VM, it tells me it can't start because the file system is read-only. Right now the VM manager is disabled while I troubleshoot the docker containers. I haven't yet tried launching the VM while the cache was in a good operating state.

 

Docker containers start when unraid first boots and run fine for anywhere from 5 minutes to about an hour. Once I get cache errors, some of the containers stop working while others continue operating fine. While the cache is in an error state, none of the containers can be restarted, even the good ones; I get a generic server error or a code 403.

 

 

Things of note:

I'm aware of the low disk space. I'm currently removing old files to make room for the VM image, and I'm saving for more drives. I hear it's possible to increase the parity drive size with some effort.

 

Also, the parity drive has some SMART errors. I've run multiple deep scans and the errors have stopped. Still planning to replace this drive.

 

The array is XFS

waffle-diagnostics-20240521-1737.zip


@JorgeB I've not had any issues with the cache until I added that 2nd NVMe. I also switched from xfs to btrfs for the cache drives. As far as I know I have not overclocked the RAM. I think the RAM was rated for 4000 MHz, but I've not been able to run it past whatever it is at right now (3666 MHz? 3800?). I understand the board and CPU have supported frequencies; I'll adjust the RAM to match.

 

How should I scrub and reset pool stats?

 

Any downside to having the 2 nvme drives in the array along with the HDDs? 

14 minutes ago, offroadguy56 said:

How should I scrub and reset pool stats?

See the link for the stats. For the scrub, you just need to click on the pool, then scroll down to the scrub section and run a correcting one.
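
For anyone who prefers the terminal, the same two steps can probably be done with the stock `btrfs` CLI. This is only a sketch, not what the GUI button necessarily runs: it assumes the pool is mounted at `/mnt/cache` (the path seen in the log later in this thread), and `scrub_commands` is a made-up helper name that just builds the command lines without executing anything.

```python
import shlex


def scrub_commands(mount_point="/mnt/cache"):
    """Build (but do not run) btrfs commands for a correcting scrub
    followed by viewing and resetting the per-device error counters.

    mount_point is an assumption; adjust it to the actual pool path.
    """
    return [
        # Correcting scrub: no -r flag, -B keeps it in the foreground
        # and prints a summary when it finishes.
        ["btrfs", "scrub", "start", "-B", mount_point],
        # Show the per-device error counters for the pool.
        ["btrfs", "device", "stats", mount_point],
        # Show the counters once more and reset them to zero (-z).
        ["btrfs", "device", "stats", "-z", mount_point],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running by hand.
    for argv in scrub_commands():
        print(shlex.join(argv))
```

Printing the commands first and pasting them into a root shell one at a time keeps it easy to stop if the first one reports errors.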

 

14 minutes ago, offroadguy56 said:

Any downside to having the 2 nvme drives in the array along with the HDDs?

If you mean assigned to a pool, no.


I've got the RAM back to the recommended speed of 3200. I found that it was at 4000, and I honestly don't remember setting it to 4000; I could have sworn I left it at 3600 to match the speed of the Infinity Fabric, which I've read helps performance/stability.

 

Anyway, I ran a scrub on the cache, and I don't see any indication of the server doing anything. Be aware the cache went into its error state within a minute, and the scrub was run while in the error state. I see in the log:

May 23 00:03:51 waffle ool www[23548]: /usr/local/emhttp/plugins/dynamix/scripts/btrfs_scrub 'start' '/mnt/cache' ''

 

I don't see anything else and it's been several minutes. Here is what the scrub status block shows. image.png.3dab44875bb7b52c721a3c2c716141d7.png
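
One thing worth checking here: as far as I know, a correcting scrub cannot repair anything while the filesystem is mounted read-only, so it helps to confirm the mount state before starting one. Below is a rough sketch that reads `/proc/mounts`-style text and reports whether a mount point carries the `ro` option. The helper names and the sample text are made up for illustration, and it assumes the forced read-only state shows up as `ro` in `/proc/mounts`.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style
    text, or None if the mount point is not listed."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # the last matching entry wins
    return found


def is_read_only(mounts_text, mount_point="/mnt/cache"):
    """True if mount_point is listed and mounted with the 'ro' option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "ro" in opts


if __name__ == "__main__":
    # Hypothetical sample; on the server, read the real /proc/mounts instead.
    sample = (
        "/dev/nvme0n1p1 /mnt/cache btrfs ro,noatime,ssd 0 0\n"
        "/dev/md1p1 /mnt/disk1 xfs rw,noatime 0 0\n"
    )
    print(is_read_only(sample, "/mnt/cache"))
    print(is_read_only(sample, "/mnt/disk1"))
```

On the server itself, the text would come from `open("/proc/mounts").read()` rather than the sample string.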
