
kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: wr 0, rd 0, flush 0, corrupt 32996, gen 0




My logs are suddenly filling up with errors:

 

kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: wr 0, rd 0, flush 0, corrupt 32996, gen 0 

and 

failed command: READ FPDMA QUEUED

 


 

Something very odd is that I cannot find which device md8p1 is. It's not listed on the Tools -> System Devices page.

I think I can find ata32: it's the [32:0:0:0] disk ATA ST4000NE001-2MA1 EN01 /dev/sdi 4.00TB, but I have no idea what md8p1 could be.

 

Also, a parity sync is running at the moment and the speed varies from 140 MB/sec to under 5 MB/sec.

 

Any help figuring out what the problem is would be greatly appreciated. Diagnostics are attached. 

 

cerint-diagnostics-20240714-1923.zip
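For anyone triaging a similar log, the BTRFS line quoted in this post can be split into its per-device counters. The sketch below is only an illustration: the regex is written against the exact line format shown above, and the function name `parse_btrfs_errs` is made up for this example. As far as I know, the `corrupt` counter tracks checksum mismatches, so a large `corrupt` value with `wr` and `rd` at 0 suggests data arriving damaged rather than I/O failing outright on that device.

```python
import re

# Matches the per-device error counters BTRFS prints in the kernel log,
# based on the exact line format quoted in this thread.
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device (?P<fs_dev>[^)]+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return the device names and integer counters, or None if no match."""
    m = BTRFS_ERRS.search(line)
    if m is None:
        return None
    counters = {k: int(m[k]) for k in ("wr", "rd", "flush", "corrupt", "gen")}
    return {"fs_dev": m["fs_dev"], "bdev": m["bdev"], **counters}


line = (
    "kernel: BTRFS error (device md8p1): bdev /dev/md8p1 errs: "
    "wr 0, rd 0, flush 0, corrupt 32996, gen 0"
)
info = parse_btrfs_errs(line)
print(info)
```

Running this over the whole syslog and watching whether `corrupt` keeps climbing between parity syncs would show whether the problem is ongoing or historical.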

Edited by [email protected]
added more info
Just now, [email protected] said:

I am using Mini SAS HD SFF-8643 to SFF-8087 cables. Each one is connected to 4 drives. If it was an issue with the cable, shouldn't more than one drive have errors?

I have some cables like that where I can only get 3 of the 4 SATA connectors to work correctly so it is definitely possible to not have all the drives affected.

1 minute ago, itimpi said:

I have some cables like that where I can only get 3 of the 4 SATA connectors to work correctly so it is definitely possible to not have all the drives affected.

These cables connect to a backplane and not individual drives.

https://www.amazon.co.uk/ipolex-Internal-SFF-8643-SFF-8087-Foldable/dp/B0868H6L9D/ref=sr_1_4

 

I would expect things to happen in groups of 4 drives. I will move the cables around, see if another drive on another row starts throwing read errors, and report back.

  • 2 weeks later...

So I removed all the drives and started adding them back in one by one, each time doing a "New Config" and re-creating the parity. I am at the point where I added drive number 6, and during the parity sync it started showing read errors. I moved the drive to a different slot in the case and got read errors again. So it's not the cables, as each slot is connected to a different port/cable. Any other thoughts?


 

Thanks,

G


OK, more updates. I added 3 x 4TB drives to my array using the New Config tool. I started a parity sync and almost immediately one of the existing 16TB drives and one of the newly added 4TB drives started throwing read errors. I stopped the sync, removed 2 of the 4TB drives, and started the sync again. This time the 16TB drive did NOT throw any errors and the sync completed. I have no idea why the 16TB drive gave read errors when the 3 x 4TB drives were added at the same time.

 

I have ordered some replacement drives and will remove the last 4TB drive. Very odd behaviour.


20 hours ago, JonathanM said:

Maybe the power got overloaded? How many drives do you have on each PSU lead? What size is your PSU? Any splitters? What kind? Molex-SATA or SATA-SATA?

I've got a brand-new Corsair 850 W PSU. In total it's 12 SSDs and 12 HDDs on 2 power leads from the PSU: 1 Molex and 1 SATA with an adaptor.

Also, I ran a parity sync the day before and it was fine (no read errors and no errors in the logs). Today, with the same drives and the same config, I ran a parity sync again and I am getting:

 

kernel: I/O error, dev sdk, sector 31251758424 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2

on an array drive that never before had any errors.

 

I can't figure this out. From the moment I built this new server it has been a pain. I may need to scrap it and start anew at this point.
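The I/O error line quoted above can also be broken down to see which device failed and where on it the read landed. This is a rough sketch, not an official tool: the regex follows the line format as quoted, `parse_io_error` is a name invented for this example, and the byte offset assumes the kernel reports sectors in 512-byte units, which is the block layer's convention as far as I know.

```python
import re

# Matches the kernel block-layer error line as quoted in this thread.
IO_ERROR = re.compile(
    r"I/O error, dev (?P<dev>[^,\s]+), sector (?P<sector>\d+) "
    r"op \S+:\((?P<op>\w+)\)"
)

# Assumption: kernel log sectors are 512-byte units, whatever the
# drive's physical sector size is.
SECTOR_BYTES = 512


def parse_io_error(line):
    """Return device, operation, sector and byte offset, or None if no match."""
    m = IO_ERROR.search(line)
    if m is None:
        return None
    sector = int(m["sector"])
    return {
        "dev": m["dev"],
        "op": m["op"],
        "sector": sector,
        "byte_offset": sector * SECTOR_BYTES,
    }


line = (
    "kernel: I/O error, dev sdk, sector 31251758424 op 0x0:(READ) "
    "flags 0x80700 phys_seg 1 prio class 2"
)
hit = parse_io_error(line)
print(hit["dev"], hit["op"], f"{hit['byte_offset'] / 1e12:.2f} TB into the device")
```

Collecting these per device across several syncs makes it easier to tell whether errors follow one drive, one port, or move around, which is the pattern that mattered in this thread.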

 


12 minutes ago, [email protected] said:

12 SSDs and 12 HDDs on 2 power leads from the PSU: 1 Molex and 1 SATA with an adaptor.

Is there any way you can use more leads? Preferably 4 pin molex style, they can handle much more current. Each SATA connector can really only handle 2 drives, so if you try to use SATA splitters it's very easy to run out of current. It's much safer to use molex -> SATA splitters if you really have to use splitters at all.

 

Ideally you shouldn't use any splitters; instead, source a PSU with enough connections available. Many modular PSUs have extra cables available specifically for them.
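The rule of thumb above can be turned into a quick sanity check of a power layout. This is a minimal sketch under stated assumptions: the limit of 2 drives per SATA connector comes from this post, while the Molex limit of 4 and the 12/12 split of drives across the two leads are placeholder values made up for the example, not measured or rated figures. Check the real numbers for your PSU and drives before relying on anything like this.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    name: str        # label for the PSU lead (hypothetical names below)
    connector: str   # "sata" or "molex"
    drives: int      # number of drives powered from this connector


def overloaded_leads(leads, max_drives):
    """Return the names of leads feeding more drives than the given limit."""
    return [lead.name for lead in leads if lead.drives > max_drives[lead.connector]]


# "sata": 2 is the rule of thumb from the post above.
# "molex": 4 is a hypothetical placeholder, not a rated figure.
limits = {"sata": 2, "molex": 4}

# Hypothetical split of the 24 drives across the two leads described earlier.
leads = [Lead("molex-1", "molex", 12), Lead("sata-1", "sata", 12)]

print(overloaded_leads(leads, limits))
```

Counting drives is only a crude proxy for current draw (spin-up load on HDDs is much higher than idle load on SSDs), so treat a clean result as "not obviously wrong" rather than "safe".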

18 minutes ago, [email protected] said:

It's a modular PSU, so I'll get a second PSU-to-Molex cable and remove the SATA one completely.

Make sure you either source directly from the manufacturer for that SPECIFIC power supply, or verify the pinouts with a tester of some sort before attaching any drives. There have been multiple accounts on this forum alone of people frying multiple hard drives with cables that physically fit perfectly but were pinned differently, feeding the drives 12V where they were expecting 5V.

  • Solution

It was the SAS ports on the motherboard. If I had drives connected to those ports, the array would become unstable. What was odd was that it wasn't only the drives connected to those 2 ports that threw errors. As soon as I connected the drives to a new PCIe HBA, it all sorted itself out. No more read errors, parity sync completes fine, etc.

 

Thanks everyone for their help,

G
