tasmaniac Posted June 4, 2020
Every time I run a parity check, it looks like every single block fails. The corrected block count is huge. Nothing is catching my attention in the logs. I replaced the parity drive, did a parity check, then ran a second parity check, which showed the same problem. smartctl -a looks clean. Everything is functioning, but I am obviously concerned about a drive failure. All drives are btrfs. Could a file system problem on one of the drives cause this? Attached are the diags. tower-diagnostics-20200604-1145.zip
JorgeB Posted June 4, 2020
Please start another correcting check without rebooting and post new diags after it runs for a couple of minutes.
tasmaniac Posted June 4, 2020
I did it twice just to make sure. No reboot. Correction always starts at 0. Looks like it is not caused by a reboot. tower-diagnostics-20200604-1225.zip tower-diagnostics-20200604-1227.zip
JorgeB Posted June 4, 2020
That's very strange. Try running memtest, though I would expect that if the RAM were bad enough to make every single sector incorrect, you'd see other issues.
tasmaniac Posted June 4, 2020
Will do. Strange, as this machine has been running Unraid for many years. I will post the results.
tasmaniac Posted June 6, 2020 (edited)
I replaced the RAM just to eliminate that as a cause and ran memtest for 8+ hours with no errors. I also rebuilt the USB boot drive (keeping /config) to eliminate that. Same error, except it does not appear to be every block. There is a large group at the beginning that seems OK. I ran a parity check for a couple of minutes, stopped it, then ran it again. The same blocks have the problem. New diags attached. I am not going to swear to it, but it seems like this problem started when I went to 6.8.2 or 6.8.3. tower-diagnostics-20200606-1333.zip
Edited June 6, 2020 by tasmaniac
Vr2Io Posted June 6, 2020 (edited)
Please check the BIOS: is the controller that the parity disk is attached to set to IDE mode instead of AHCI?

[1:0:1:0] disk ATA ST4000VN008-2DR1 SC60 /dev/sdc /dev/sg2
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/1:0:1:0 [/sys/devices/pci0000:00/0000:00:14.1/ata1/host1/target1:0:1/1:0:1:0]

00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c] (rev 40)
  Subsystem: Gigabyte Technology Co., Ltd SB7x0/SB8x0/SB9x0 IDE Controller [1458:5002]
  Kernel driver in use: pata_atiixp
  Kernel modules: pata_atiixp

Jun 6 13:26:14 Tower kernel: ata1.01: ATA-10: ST4000VN008-2DR166, ZM4166N6, SC60, max UDMA/133
Jun 6 13:26:14 Tower kernel: ata1.01: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Jun 6 13:26:14 Tower kernel: ata1.01: limited to UDMA/33 due to 40-wire cable
Edited June 6, 2020 by Benson
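For anyone who wants to check their own diagnostics for the same condition, the two clues in the excerpt above (a controller bound to a `pata_*` kernel driver, and a SATA drive reported as "limited to UDMA/33 due to 40-wire cable") can be searched for mechanically. The following is only a minimal sketch, not part of Unraid or its diagnostics tooling. The function names and sample constants are hypothetical, and it assumes `lspci -k`-style text where device lines start with a `bus:device.function` address (no PCI domain prefix) and the detail lines are indented.

```python
import re

# Sample text taken from the excerpt quoted in the post above.
SAMPLE_LSPCI = """\
00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c] (rev 40)
\tSubsystem: Gigabyte Technology Co., Ltd SB7x0/SB8x0/SB9x0 IDE Controller [1458:5002]
\tKernel driver in use: pata_atiixp
\tKernel modules: pata_atiixp
"""

SAMPLE_SYSLOG = """\
Jun 6 13:26:14 Tower kernel: ata1.01: ATA-10: ST4000VN008-2DR166, ZM4166N6, SC60, max UDMA/133
Jun 6 13:26:14 Tower kernel: ata1.01: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Jun 6 13:26:14 Tower kernel: ata1.01: limited to UDMA/33 due to 40-wire cable
"""


def find_ide_mode_controllers(lspci_text):
    """Return (pci_address, driver) pairs for controllers using a pata_* driver.

    Assumption: device lines begin at column 0 with an address such as
    "00:14.1", and "Kernel driver in use:" lines belong to the most
    recent device line.
    """
    hits = []
    current_addr = None
    for line in lspci_text.splitlines():
        device = re.match(r"([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s", line)
        if device:
            current_addr = device.group(1)
            continue
        driver = re.search(r"Kernel driver in use:\s*(\S+)", line)
        if driver and current_addr and driver.group(1).startswith("pata_"):
            hits.append((current_addr, driver.group(1)))
    return hits


def find_udma_limited_devices(syslog_text):
    """Return (ata_port, udma_mode) pairs for devices the kernel throttled
    because it believed a 40-wire (legacy IDE) cable was in use."""
    pattern = re.compile(
        r"(ata\d+\.\d+): limited to (UDMA/\d+) due to 40-wire cable"
    )
    return [match.groups() for match in pattern.finditer(syslog_text)]
```

Calling `find_ide_mode_controllers(SAMPLE_LSPCI)` and `find_udma_limited_devices(SAMPLE_SYSLOG)` on the excerpt flags the controller at 00:14.1 and the drive at ata1.01. A hit only suggests that a port may be in IDE/legacy mode; the BIOS setting itself still has to be checked by hand.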
tasmaniac Posted June 6, 2020
Nice catch. Yes, the BIOS was set to IDE for a couple of SATA ports. Now all are AHCI. I ran a parity check for 10+ GB, restarted it, and the first 10 GB came up clean. Doing a full parity check now. Thanks for the help. I will reply one last time once the full parity check is done.
S80_UK Posted June 7, 2020
Just a thought... If the SATA settings in the BIOS were originally set to AHCI and reverted without user intervention, then it's just possible that the battery that keeps the CMOS RAM holding the BIOS settings alive is in need of replacement. It's easy to forget, since they often last for very many years, but I have had a couple die on me in the past.
JorgeB Posted June 7, 2020
13 hours ago, Benson said: "controller set to IDE mode instead AHCI which parity disk attach."
Good catch! I've seen weird issues with those AMD controllers set to IDE before, like not being able to format or even assign a device, but this is the first time I've seen this!
Vr2Io Posted June 7, 2020
2 hours ago, johnnie.black said: "first time I see this!"
Same