kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: wr 0, rd 0, flush 0, corrupt 32996, gen 0

My logs have suddenly started filling with errors.

 

kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: wr 0, rd 0, flush 0, corrupt 32996, gen 0 

and 

failed command: READ FPDMA QUEUED
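
For anyone reading a line like the BTRFS one quoted above, here is a minimal sketch of how the per-device counters could be pulled out of it. It assumes the usual meaning of the fields (write, read, flush, corruption and generation errors, as reported by `btrfs device stats`); the function name `parse_btrfs_errs` is made up for this illustration.

```python
import re

# Log line copied from the post above.
LINE = ("kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: "
        "wr 0, rd 0, flush 0, corrupt 32996, gen 0")

# Assumption: the counters always appear in this order and format.
COUNTER_PATTERN = re.compile(
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, {counter_name: value}) or None if the line does not match."""
    match = COUNTER_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    bdev = fields.pop("bdev")
    return bdev, {name: int(value) for name, value in fields.items()}


bdev, counters = parse_btrfs_errs(LINE)
# Only list the counters that are non-zero.
nonzero = {name: value for name, value in counters.items() if value}
print(bdev, nonzero)
```

In this line only `corrupt` is non-zero, so the device itself did not report failed reads or writes here; the filesystem found data that did not match its checksums.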

 

image.thumb.png.da1a65650c07af8f8af8255253cb3563.png

 

Something very odd is that I cannot find which device md8p1 is. It's not listed on the Tools -> System Devices page.

I can find ata32, I think: it's the [32:0:0:0] disk ATA ST4000NE001-2MA1 EN01 /dev/sdi 4.00TB, but I have no idea what md8p1 could be.

 

Also, a parity sync is running at the moment, and the speed varies from 140 MB/sec to under 5 MB/sec.

 

Any help figuring out what the problem is would be greatly appreciated. Diagnostics are attached. 

 

cerint-diagnostics-20240714-1923.zip

Edited by [email protected]
added more info

  • Community Expert

md8 is disk8 on the Main tab.   The /dev/md? type devices are created by the Unraid driver for each disk in the main array when it is started.
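
As a rough illustration of that naming rule, the sketch below turns an md device name from the log into the disk name shown on the Main tab. It assumes names of the form `mdN` or `mdNp1`, where N is the array slot; the helper `md_to_array_disk` is hypothetical and not part of Unraid.

```python
import re

# Assumption: array devices are named md<N> or md<N>p<P>, optionally with a
# /dev/ prefix, where N is the slot number shown on the Main tab.
MD_PATTERN = re.compile(r"^(?:/dev/)?md(\d+)(?:p\d+)?$")


def md_to_array_disk(device):
    """Return the Main-tab name (for example 'disk8') for an md device name."""
    match = MD_PATTERN.match(device.strip())
    if match is None:
        raise ValueError(f"not an md array device: {device!r}")
    return f"disk{int(match.group(1))}"


print(md_to_array_disk("md8p1"))
print(md_to_array_disk("/dev/md8p1"))
```

Both calls give `disk8`, which matches the device named in the error at the top of this topic.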

  • Community Expert

Replace the cables for that disk, or swap with another one, then run a scrub.

  • Author

I am using Mini SAS HD SFF-8643 to SFF-8087 cables. Each one is connected to 4 drives. If it were an issue with the cable, shouldn't more than one drive have errors?

  • Community Expert
Just now, [email protected] said:

I am using Mini SAS HD SFF-8643 to SFF-8087 cables. Each one is connected to 4 drives. If it were an issue with the cable, shouldn't more than one drive have errors?

I have some cables like that where I can only get 3 of the 4 SATA connectors to work correctly so it is definitely possible to not have all the drives affected.

  • Author
1 minute ago, itimpi said:

I have some cables like that where I can only get 3 of the 4 SATA connectors to work correctly so it is definitely possible to not have all the drives affected.

These cables connect to a backplane, not to individual drives.

https://www.amazon.co.uk/ipolex-Internal-SFF-8643-SFF-8087-Foldable/dp/B0868H6L9D/ref=sr_1_4

 

I would expect things to happen in groups of 4 drives. I will move the cables around, see if another drive on another row starts throwing read errors, and report back.

  • 2 weeks later...
  • Author

So I removed all the drives and started adding them back in one by one, each time doing a "new config" and re-creating the parity. I am at the point where I added drive number 6, and during the parity sync it started showing read errors. I moved the drive to a different slot in the case and got read errors again. So it's not the cables, as each slot is connected to a different port/cable. Any other thoughts?

 image.thumb.png.f80ea7a7b67e29435ea7df072c0b9323.png

 

Thanks,

G

  • Author

Pausing and resuming the sync, then pausing again, created the following log:

 

sdq is the drive in the array that now has the read errors. I have no idea what this log actually means, or why it says that sdq is now sdv when in the array it's still sdq.

image.png.fdc649727ae37ade607e0afd59c02de5.png

image.thumb.png.e6e830e26ff03f9c5c590d672cc496b9.png

  • Community Expert

If you have used a different cable and slot and the errors persist, it may be the disk.

  • Author

OK, more updates. I added 3 x 4TB drives to my array using the New Config tool. I started a parity sync, and almost immediately one of the existing 16TB drives and one of the newly added 4TB drives started throwing read errors. I stopped the sync, removed 2 of the 4TB drives and started the sync again. The 16TB drive did NOT throw any errors and the sync completed. I have no idea why the 16TB drive gave read errors when the 3 x 4TB drives got added at the same time.

 

I have ordered some replacement drives and will remove the last 4TB drive. Very odd behaviour.

image.png

  • Author

I did a reboot and got some new errors now:

image.thumb.png.b912a377386a84572691fce6080bf823.png

 

For some reason disk 1 is now read-only.

  • Author

Moving the data out of disk1 causes more errors:

image.thumb.png.e33a83c04be8edfd5c8d406fd9581b47.png

  • Community Expert

All those disk errors are likely going to cause issues for btrfs.

  • Author

I am trying to move back to XFS, but it's going to be a slow process. I need to empty each drive, remove it from the array, and add it back with XFS format :( It's going to take weeks...

7 hours ago, [email protected] said:

I have no idea why the 16TB drive gave read errors when the 3 x 4TB drives got added at the same time.

Maybe the power got overloaded? How many drives do you have on each PSU lead? What size is your PSU? Any splitters? What kind? Molex-SATA or SATA-SATA?

  • Author
20 hours ago, JonathanM said:

Maybe the power got overloaded? How many drives do you have on each PSU lead? What size is your PSU? Any splitters? What kind? Molex-SATA or SATA-SATA?

Got a brand new Corsair 850 watt. In total it's 12 SSDs and 12 HDDs with 2 power leads from the PSU: 1 Molex and 1 SATA with an adaptor.

Also, I ran a parity sync the day before and it was fine (no read errors and no errors in the logs). Today, with the same drives and the same config, I ran a parity sync again and I am getting:

 

kernel: I/O error, dev sdk, sector 31251758424 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2

on an array drive that never had any errors before.
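
As an aside for reading lines like the one above, this sketch extracts the device, sector and operation, and converts the sector to a byte offset. It assumes the kernel reports these sectors in 512-byte units regardless of the drive's physical sector size; the names `IO_ERROR_PATTERN`, `SECTOR_BYTES` and `parse_io_error` are illustrative only.

```python
import re

# Log line copied from the post above.
LINE = ("kernel: I/O error, dev sdk, sector 31251758424 op 0x0:(READ) "
        "flags 0x80700 phys_seg 1 prio class 2")

# Assumption: block-layer sector numbers are in 512-byte units.
SECTOR_BYTES = 512

IO_ERROR_PATTERN = re.compile(
    r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+) "
    r"op 0x[0-9a-fA-F]+:\((?P<op>\w+)\)"
)


def parse_io_error(line):
    """Return a dict with dev, sector, op and byte offset, or None if no match."""
    match = IO_ERROR_PATTERN.search(line)
    if match is None:
        return None
    sector = int(match.group("sector"))
    return {
        "dev": match.group("dev"),
        "sector": sector,
        "op": match.group("op"),
        "offset_bytes": sector * SECTOR_BYTES,
    }


info = parse_io_error(LINE)
print(info["dev"], info["op"], f"{info['offset_bytes'] / 1e12:.2f} TB into the device")
```

Collecting the `dev` values across a whole syslog is a quick way to see whether the errors follow one drive or move around between drives.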

 

I can't figure this out. From the moment I built this new server it has been a pain. I may need to scrap it and start anew at this point.

 

image.png

12 minutes ago, [email protected] said:

12 SSDs and 12 HDDs with 2 power leads from the PSU: 1 Molex and 1 SATA with an adaptor.

Is there any way you can use more leads? Preferably 4-pin Molex style, as they can handle much more current. Each SATA connector can really only handle 2 drives, so if you try to use SATA splitters it's very easy to run out of current. It's much safer to use Molex -> SATA splitters if you really have to use splitters at all.

 

Ideally you shouldn't use any splitters, but rather source a PSU with enough connections available. Many modular PSUs have extra cables available specifically for them.
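
To put a number on the "2 drives per SATA connector" rule of thumb above, here is a small worked sketch. The limit of 2 is taken from this post as a rule of thumb, not from a PSU datasheet, and the function name `min_source_connectors` is made up for the example.

```python
def min_source_connectors(drives, drives_per_connector=2):
    """Smallest number of source SATA power connectors needed for `drives`
    drives if each connector feeds at most `drives_per_connector` of them."""
    if drives < 0 or drives_per_connector < 1:
        raise ValueError("drives must be >= 0 and drives_per_connector >= 1")
    # Ceiling division using integers only.
    return -(-drives // drives_per_connector)


# Assumption: only the 12 HDDs are counted here; SSDs usually draw far less.
print(min_source_connectors(12))
```

Under that rule, 12 HDDs fed through SATA splitters would need at least 6 source connectors, which is well beyond a single SATA lead with an adaptor.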

  • Author

It's a modular PSU, so I'll get a second PSU-to-Molex cable and remove the SATA one completely.

18 minutes ago, [email protected] said:

It's a modular PSU, so I'll get a second PSU-to-Molex cable and remove the SATA one completely.

Make sure you either source directly from the manufacturer for that SPECIFIC power supply, or verify the pinouts with a tester of some sort before attaching any drives. There have been multiple accounts on this forum alone of people frying multiple hard drives with cables that physically fit perfectly but were pinned differently, feeding the drives 12V where they were expecting 5V.

  • Author
  • Solution

It was the SAS ports on the motherboard. If I had drives connected to those ports, the array would become unstable. What was odd was that it was not only the drives connected to those 2 ports that threw errors. As soon as I connected the drives to a new PCIe HBA, it all sorted itself out. No more read errors, parity sync completes fine, etc...

 

Thanks everyone for their help,

G
