[Solved] BTRFS Error (RAM seems ok)


Hi there :)

For the past few days (the server has been running solidly for 4 years), my Docker containers have been stopping one by one, and it's very hard to get everything running properly again (the same errors come back shortly afterwards :/)
When I try to restart them, Unraid shows an "Execution Error", so it's a dead end.


I searched and found some posts about bad RAM, so I ran MemTest86 twice (in parallel and non-parallel mode), and both runs PASSed.

I also found, for example, this topic that talks about a full cache, but that doesn't seem to be the case for me either, so I'm a little lost :/

 

root@Tower:~# btrfs dev stats /mnt/cache
[/dev/mapper/sdk1].write_io_errs    192912
[/dev/mapper/sdk1].read_io_errs     145891
[/dev/mapper/sdk1].flush_io_errs    1086
[/dev/mapper/sdk1].corruption_errs  628
[/dev/mapper/sdk1].generation_errs  0
[/dev/mapper/sdj1].write_io_errs    0
[/dev/mapper/sdj1].read_io_errs     0
[/dev/mapper/sdj1].flush_io_errs    0
[/dev/mapper/sdj1].corruption_errs  0
[/dev/mapper/sdj1].generation_errs  0
[/dev/mapper/sdl1].write_io_errs    0
[/dev/mapper/sdl1].read_io_errs     0
[/dev/mapper/sdl1].flush_io_errs    0
[/dev/mapper/sdl1].corruption_errs  0
[/dev/mapper/sdl1].generation_errs  0

 

I'm attaching diagnostics and some logs, in case anyone can point me to where I should head next to fix this issue, please :)

 

Thank you very much in advance !

logs.txt tower-diagnostics-20230924-1631.zip

Edited by AinzOolGown
typo

Solved by JorgeB

Not necessarily your problem, but corruption issues are generally related to bad RAM (as your searching has already indicated). FWIW, your particular RAM is not on the motherboard's QVL for memory, and G.Skill doesn't list that motherboard on the memory's QVL either. Personally, I only ever buy RAM from the MB QVL for the most trouble-free experience. But take it with a grain of salt. Just because neither the memory nor the MB says that they are compatible with each other doesn't mean that they are not.

  • Author

Hi Squid, thank you :)

Yep, I figured that out and ran tests when I built the server, but all the tests were successful, so I decided to keep them :)

It's been 4 years and I've never had a RAM issue since. If it really is a RAM problem, I don't understand how it can have been fine for 4 years and suddenly not be OK anymore :/

Memtest86 also tells me everything's fine 😭

  • Community Expert
20 hours ago, AinzOolGown said:
[/dev/mapper/sdk1].write_io_errs    192912
[/dev/mapper/sdk1].read_io_errs     145891
[/dev/mapper/sdk1].flush_io_errs    1086

These suggest the device dropped offline sometime in the past, see here for more info and better pool monitoring.
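
As a rough illustration of the kind of pool monitoring meant here, the sketch below filters `btrfs dev stats` output down to the counters that are not zero (in the output from the first post, that would be only the `sdk1` lines). The function name `nonzero_stats` is made up for this example, and `/mnt/cache` is simply the pool path used in the first post; this is not the script from the linked thread.

```shell
#!/bin/bash
# Print only the btrfs device counters whose value is not zero.
# Expects `btrfs dev stats` output on stdin, one "<[device].counter> <value>"
# pair per line.
nonzero_stats() {
    awk '$2 + 0 != 0 { print $1, $2 }'
}

# Intended use on the server (needs root and a mounted pool):
#   btrfs dev stats /mnt/cache | nonzero_stats
```

Once the cause has been dealt with, `btrfs dev stats -z /mnt/cache` resets the counters to zero, so that any new error stands out.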

  • Author

Thanks JorgeB ;)

 

That reminds me... Sorry, I forgot to tell you: just a few days before this behavior started, one disk of my array went offline for 2 or 3 days. Unraid emulated it, and I only noticed some hours later by chance. I changed a cable and rebuilt the array. All seemed good to me, but could it in fact have corrupted the array?

 

If that's it, how can I fix it, please?
I read about a "scrub" command; I have it for the cache pool but not for the data array.
Will it erase the whole cache? I won't lose any data, right?

 

Thank you :)

 

P.S.: Installed Squid's script from your link, thanks again !
 

I've included the most recent logs; it's starting to worry me :/
In the meantime, I got a notification: "Error on cache pool - No description"

LastLogs.txt

Edited by AinzOolGown

  • Community Expert
23 minutes ago, AinzOolGown said:

If that's it, how can I fix it, please?

See the link above.
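
On the scrub question: a scrub reads the data and verifies it against the stored checksums; it does not erase the pool. Below is a minimal sketch, assuming the pool is mounted at `/mnt/cache` as in the first post and uses a redundant profile, so that bad blocks can be repaired from the good copy. The `run` helper and the `DRY_RUN` and `POOL` variables are only illustration, so the commands can be reviewed before anything is executed; this is not necessarily the exact procedure from the linked thread.

```shell
#!/bin/bash
# By default the commands are only printed. Set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
POOL="${POOL:-/mnt/cache}"   # assumed mount point, taken from the first post

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

scrub_pool() {
    # -B keeps the scrub in the foreground and prints a summary at the end.
    run btrfs scrub start -B "$POOL"
    # Show the per-device error counters, then reset them (-z) so that
    # any new error is easy to spot later.
    run btrfs dev stats "$POOL"
    run btrfs dev stats -z "$POOL"
}

scrub_pool
```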

  • Author

Hi :)

 

So, I followed this process and changed all shares that use the cache to "Yes". VMs & Dockers are disabled, but the mover has finished and some shares (appdata/system) remain on it.

 

I've opted to completely replace my cache drives and SATA cables.

 

Can I just save the remaining content elsewhere and move it back once the new cache pool is operational?

Is there a better way to move the remaining files to the array ?

And how do I do this the right way (to preserve permissions and such)? I have a Mac and Path Finder ready for that :)

 

Also, I have a share set to "No" for cache, but strangely the latest files created on it are on the cache O_o

 

Thank you :)

Edited by AinzOolGown

  • Community Expert
1 hour ago, AinzOolGown said:

but the mover has finished and some shares (appdata/system) remain on it.

Enable mover logging, run the mover, post new diags.

  • Community Expert

The filesystem is read-only, so the data cannot be moved, it can be copied though.

  • Author

Can I just copy the content via SMB and do the opposite once the new drives are installed? (Sorry, I prefer to double-check: there are some production Docker containers for my business, and it would be hell to lose anything :/)

Edited by AinzOolGown

  • Community Expert
21 minutes ago, AinzOolGown said:

Can I just copy the content via SMB and do the opposite once the new drives are installed?

That should work, you can also use for example rsync.
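
A minimal sketch of the rsync route, for illustration. The function name `copy_tree` and the paths in the usage comment are placeholders to adapt to your own shares. The `-a` (archive) flag preserves permissions, ownership and timestamps, which is the part that matters when the data has to be copied back later.

```shell
#!/bin/bash
# Copy a directory tree while preserving permissions, owners and timestamps.
# Usage: copy_tree <source_dir> <destination_dir>
copy_tree() {
    local src="${1%/}"   # strip a trailing slash so it is added back once below
    local dst="${2%/}"
    mkdir -p "$dst" || return 1
    # The trailing slash on the source copies its *contents* into dst.
    # -a archive mode, -H keep hard links, -v list files as they are copied.
    rsync -aHv "$src/" "$dst/"
}

# Example with placeholder paths (not run automatically):
#   copy_tree /mnt/cache/appdata /mnt/disk1/cache-backup/appdata
```

rsync exits with code 23 when some files could not be transferred, so a non-zero result here does not necessarily mean that nothing was copied.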

  • Author

Sorry JorgeB, rsync gives me errors too :/
The command was:

 

rsync -a /mnt/cache/appdata/ /mnt/user/Backup-Saves/TowerDockers/Cache/appdata/



I tried a CA Backup and got the same result: it gives errors 😱

I ran a filesystem check on the cache; logs attached.

rsync errors.txt
 

Cache filesystem check.txt

Edited by AinzOolGown

  • Community Expert

Any corrupt files will fail to copy. You can try btrfs restore here: it will ignore the corruption and still copy those files, but they will still be corrupt.
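
For orientation only, here is a rough sketch of how `btrfs restore` is typically invoked. It reads directly from the device of an unmounted filesystem and writes whatever it can recover to another location. The device and target path below are placeholders, not values confirmed for this system (on an encrypted pool the device would be the unlocked `/dev/mapper/...` node). The hypothetical `restore_cmd` helper only prints the command line, so it can be checked before being run by hand.

```shell
#!/bin/bash
# Build and print (not execute) a btrfs restore command line.
# Usage: restore_cmd <device> <target_dir> [dry]
restore_cmd() {
    local dev="$1" target="$2" mode="${3:-}"
    local opts="-v -i -m"   # verbose, ignore errors, restore owner/mode/times
    if [ "$mode" = "dry" ]; then
        opts="$opts -D"     # -D: dry run, only list what would be restored
    fi
    echo "btrfs restore $opts $dev $target"
}

# Placeholder device and target directory; review the output first.
restore_cmd /dev/mapper/sdj1 /mnt/disk1/restore dry
restore_cmd /dev/mapper/sdj1 /mnt/disk1/restore
```

Starting with the dry run gives an idea of what is still readable before any space is used on the target disk.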

  • Author

Hi  :)

I copied what I could.

I deleted the cache pool, plugged in new SSDs, recreated a new cache pool and assigned the new SSDs with new cables.

 

Then I switched appdata back to cache "Only" and did a CA plugin restore.
The plugin said the restore was complete, but I have a notification saying an error occurred.
The syslog shows a lot of btrfs errors.
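
To get a quick overview of such syslog entries, something like the sketch below can count the kernel's BTRFS messages per severity and device. The function name `btrfs_log_summary` is invented for this example, and `/var/log/syslog` is assumed to be the log location; the pattern matches the usual kernel prefix `BTRFS error (device ...)`.

```shell
#!/bin/bash
# Count kernel BTRFS messages per severity and device from a log on stdin.
# Matches lines such as: "... kernel: BTRFS error (device dm-3): ..."
btrfs_log_summary() {
    grep -oE 'BTRFS (error|warning|critical) \(device [^)]+\)' \
        | sort | uniq -c | sort -rn
}

# Intended use on the server (log path is an assumption):
#   btrfs_log_summary < /var/log/syslog
```

Each output line is a count followed by the message prefix, most frequent first, which shows at a glance which device the errors are coming from.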

I switched to Discord somewhere in between, and it seems that my RAID controller card is the culprit.
Synd & Kilrah helped me and advised replacing my card with a 9300-8i, so I ordered one; now I'm waiting for it to arrive before retrying a CA restore.

Thank you JorgeB for your help and patience; when you're stressed, it's a big help to have support! :)

I'll post the next steps once I've tried with the new card.

  • 2 weeks later...
  • Author

Hi :)

So I received the HBA card and was wondering how to proceed in the cleanest way.

Do I scrub the SSDs as they are now (while they're still connected to the RAID controller) before replacing the RAID controller with the HBA card,

or:
1/ Replace the RAID controller with the HBA card
2/ Scrub the SSDs
3/ Delete + recreate the cache pool
4/ Restore the backup?

 

Please :)

Edited by AinzOolGown

  • Community Expert

If you are going to re-format the pool there's not much point in scrubbing, but you could do it before or after.

  • Author

Ok, thank you very much :)

  • Author

Hi :)

So, I replaced the card just now and tried to re-enable VMs, but the list is empty :/


 

Maybe it's linked to the cache settings in the shares.

I changed them, but maybe I have to invoke the mover?

 

Thanks !

 

P.S. I have a backup of libvirt.img

 

EDIT: Never mind, I replaced the libvirt.img and all VMs reappeared ;)

Edited by AinzOolGown

  • Author

As a follow-up, I'm seeing more btrfs errors after invoking the mover :/

 

I don't understand: the SSDs are new, the cables and HBA card are new too. I'm in despair.

 

Here are the fresh diagnostics:

tower-diagnostics-20231021-1207.zip

 

When I deleted and recreated the cache pool, no reformatting was offered or done by the system.
Do I need to force one?

Scrub was reporting "no error found" before I invoked the mover.

Edited by AinzOolGown

  • Author

Thank you very much JorgeB ;)

I think i'm damned :/

 

Snolly's post says that to view the firmware information, we need to "click on the drive's name and go to identity tab".
Mine looks like this:
[Attached screenshot: Capturedecran2023-10-21a18_30_15.png]

 

Also, my Crucial SSDs are not listed in /dev/

Maybe because Unraid can't see past the HBA card?
That doesn't make sense, since it can mount them successfully :/

 

Edited by AinzOolGown

  • Community Expert
16 hours ago, AinzOolGown said:

Mine looks like this:

The devices dropped offline, power cycling the server should bring them back.

  • Author

Hi JorgeB,

 

Thanks for bearing with me ^^

 

I'm thinking of changing the SSDs. This time I'm searching for possible Unraid incompatibilities for each SSD that looks good to me, but every model returns plenty of bad results...

Do you have a brand/model to recommend, please?

  • Community Expert

I've been happy with my MX500, other models I've been using without issues so far are the 860 and 870 EVO.
