Parity failure on every block, every time


Posted

Every time I run a parity check, it looks like every single block fails.  Corrected block count is huge.

Nothing is catching my attention in the logs.  I replaced the parity drive, did a parity check, then ran a second parity check which showed the same problem.

smartctl -a looks clean.  Everything is functioning, but I am obviously concerned about a drive failure.

All drives are btrfs.

Could a file system problem on one of the drives cause this problem?

 

Attached are the diags.

tower-diagnostics-20200604-1145.zip

Posted (edited)

I replaced the RAM just to eliminate that problem and ran memtest for 8+ hours with no errors.

I also rebuilt the USB boot drive (keeping /config) to eliminate that.

Same error, except it does not appear to be every block.  There is a large group at the beginning that seems OK.

New logs attached.  Ran parity for a couple of minutes, stopped, then ran it again.  The same blocks have the problem.

Attached new diags.

I am not going to swear to it but it seems like this problem started when I went to 6.8.2 or 6.8.3.  

 

tower-diagnostics-20200606-1333.zip

Edited by tasmaniac
Posted (edited)

Please check the BIOS: is the controller that the parity disk is attached to set to IDE mode instead of AHCI?

 

[1:0:1:0]    disk    ATA      ST4000VN008-2DR1 SC60  /dev/sdc   /dev/sg2 
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/1:0:1:0  [/sys/devices/pci0000:00/0000:00:14.1/ata1/host1/target1:0:1/1:0:1:0]

 

 

00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c] (rev 40)
    Subsystem: Gigabyte Technology Co., Ltd SB7x0/SB8x0/SB9x0 IDE Controller [1458:5002]
    Kernel driver in use: pata_atiixp
    Kernel modules: pata_atiixp

 

 


Jun  6 13:26:14 Tower kernel: ata1.01: ATA-10: ST4000VN008-2DR166,             ZM4166N6, SC60, max UDMA/133
Jun  6 13:26:14 Tower kernel: ata1.01: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Jun  6 13:26:14 Tower kernel: ata1.01: limited to UDMA/33 due to 40-wire cable
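
For anyone who wants to scan their own diagnostics for the same signs, here is a minimal sketch in Python, not an official Unraid tool. It assumes you have saved the `lspci -k` output and the syslog as text; the function names and sample constants are made up for illustration. It flags controllers bound to a `pata_*` kernel driver and devices the kernel reports as "limited to UDMA/.. due to 40-wire cable", which are the two hints visible in the excerpts above.

```python
import re

# Sample text copied from the lspci and kernel-log excerpts in this thread.
LSPCI_SAMPLE = """\
00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c] (rev 40)
    Subsystem: Gigabyte Technology Co., Ltd SB7x0/SB8x0/SB9x0 IDE Controller [1458:5002]
    Kernel driver in use: pata_atiixp
    Kernel modules: pata_atiixp
"""

KERNEL_LOG_SAMPLE = """\
Jun  6 13:26:14 Tower kernel: ata1.01: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Jun  6 13:26:14 Tower kernel: ata1.01: limited to UDMA/33 due to 40-wire cable
"""


def find_ide_mode_controllers(lspci_text):
    """Return (pci_address, driver) pairs for devices bound to a pata_* driver."""
    hits = []
    address = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device entry, e.g. "00:14.1 IDE interface ..."
            address = line.split()[0]
            continue
        match = re.search(r"Kernel driver in use:\s*(\S+)", line)
        if match and address and match.group(1).startswith("pata_"):
            hits.append((address, match.group(1)))
    return hits


def find_cable_limited_devices(log_text):
    """Return (ata_port, udma_mode) pairs for 'limited to UDMA/.. due to 40-wire cable' lines."""
    pattern = re.compile(r"(ata\d+\.\d+): limited to (UDMA/\d+) due to 40-wire cable")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(log_text)]


if __name__ == "__main__":
    for address, driver in find_ide_mode_controllers(LSPCI_SAMPLE):
        print(f"{address}: driver {driver} suggests the port is in IDE mode")
    for port, mode in find_cable_limited_devices(KERNEL_LOG_SAMPLE):
        print(f"{port}: limited to {mode}, typical of a SATA disk behind an IDE-mode port")
```

A hit from either function is only a hint that a port may be in IDE mode; the actual fix is still switching the SATA mode to AHCI in the BIOS.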

 

Edited by Benson
Posted

Nice catch.  Yes, the BIOS was set to IDE for a couple of SATA ports.  Now all are AHCI.

Ran parity for 10+GB, restarted parity, and the first 10GB came up clean.

Doing a full parity now.

Thanks for the help.  I will reply one last time once the complete parity check is done.

 

Posted

Just a thought...   If the SATA settings in BIOS were originally set to AHCI and reverted without user intervention, then it's just possible that the battery that keeps the BIOS settings CMOS RAM alive is in need of replacement.  It's easy to forget, since they often last for very many years, but I have had a couple die on me in the past. 

Posted
13 hours ago, Benson said:

controller set to IDE mode instead AHCI which parity disk attach.

Good catch! I've seen weird issues with those AMD controllers set to IDE before, like not being able to format or even assign a device, but this is the first time I've seen this!
