JorgeB Posted September 17, 2023

30 minutes ago, cyberstyx said:
    Rebuild finished without errors

Unfortunately not:

    Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119384
    Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119392
    Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119400
    Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119408
    ...
    Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911496
    Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911504
    Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911512
    Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911520

So the rebuilt disk will once again be corrupt. You have multiple disk issues, which suggests a hardware problem such as a bad controller or PSU. You can try again with a new PSU if available.
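For anyone checking their own syslog after a rebuild, read errors like the ones quoted above can be tallied per disk before trusting the result. The following is a minimal sketch, assuming log lines in exactly the format shown in this post; the function name `summarize_read_errors` and the `sample` list are illustrative and are not part of Unraid.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119384
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def summarize_read_errors(lines):
    """Return {disk_number: [sectors]} for every md read error found."""
    errors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            errors[int(match.group(1))].append(int(match.group(2)))
    return dict(errors)


# Sample taken from the syslog excerpt quoted in this post.
sample = [
    "Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119384",
    "Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119392",
    "Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911496",
    "Sep 16 15:29:37 Tower kernel: md: disk8 read error, sector=3911504",
]

summary = summarize_read_errors(sample)
for disk, sectors in sorted(summary.items()):
    print(f"disk{disk}: {len(sectors)} read errors, "
          f"sectors {min(sectors)}-{max(sectors)}")
```

Errors on more than one disk during the same rebuild, as here, point toward a shared cause (power, controller, cabling) rather than a single failing drive.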
cyberstyx (Author) Posted September 17, 2023

12 minutes ago, JorgeB said:
    So the rebuilt disk will once again be corrupt. You have multiple disk issues, which suggests a hardware problem such as a bad controller or PSU. You can try again with a new PSU if available.

Since Disk1 was on the motherboard controller and Disk5 was on the PCI controller, it is probably a PSU issue. I will come back to this as soon as I get a new PSU and replace all the power cabling.

Thanks for your help JorgeB, have a nice Sunday.
trurl Posted September 17, 2023

Just thought I would comment on this:

On 9/16/2023 at 4:26 AM, cyberstyx said:
    Disk 8 is an SSD disk (for VMs and containers) on an expansion PCI card.

SSDs in the parity array cannot be trimmed, and can only be written at parity speed. The usual place for VMs and containers is an SSD in cache or another pool. If you have these on the array, they won't perform as well due to parity, and they will also keep array disks spun up since these files are always open.
cyberstyx (Author) Posted September 17, 2023

14 minutes ago, trurl said:
    Just thought I would comment on this: SSDs in the parity array cannot be trimmed, and can only be written at parity speed. The usual place for VMs and containers is an SSD in cache or another pool. If you have these on the array, they won't perform as well due to parity, and they will also keep array disks spun up since these files are always open.

Thank you for that info. I will read more about this and ask you again when I restore the system.
cyberstyx (Author) Posted September 30, 2023

On 9/17/2023 at 12:28 PM, JorgeB said:
    So the rebuilt disk will once again be corrupt. You have multiple disk issues, which suggests a hardware problem such as a bad controller or PSU. You can try again with a new PSU if available.

The shop was a week late sending the PSU I bought. I got the replacement a few days ago and had time today to remove the server PC from its installed location and swap the PSU.

I followed the suggested steps again: I made a new config, started the array, and then replaced the disk. The array is now being rebuilt. Hopefully it will finish tomorrow morning without any new surprises, and I will share my news then.
cyberstyx (Author) Posted September 30, 2023

I stopped the operation: after 195,362,860 writes, Disk9 was giving errors. I have attached the diagnostics file.

While the rebuild was running, shared folders were not working properly. Their configuration was there (in the Shares tab) and I could see the shared folders over the network, but they were empty. When I checked a share folder from the console I got "/bin/ls: reading directory '.': Input/output error". The disk contents under /mnt were there. When I stopped the rebuild, the share folder contents were visible again.

I started the array in Maintenance Mode so I could do a file system check on Disk 9 (with flag -n). I got this:

    Phase 1 - find and verify superblock...
    superblock read failed, offset 0, size 524288, ag 0, rval -1
    fatal error -- Input/output error

I've stopped for further instructions now.

tower-diagnostics-20231001-0005.zip
JorgeB Posted October 1, 2023

Disk 9 dropped offline, and this:

    Sep 30 21:55:52 Tower kernel: ata3: SError: { BadCRC }

usually means a bad SATA cable. Replace it and try again.

Also, the libvirt.img is corrupt; you'll need to restore it from a backup if available.
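To see which SATA link is logging CRC errors, the syslog can be scanned for `SError` lines that include the `BadCRC` flag. This is a minimal sketch, assuming lines in the format quoted above; the function name `find_crc_errors` is hypothetical, and mapping `ata3` back to a physical disk and cable still has to be done from the rest of the syslog or the diagnostics.

```python
import re

# Matches lines such as:
#   Sep 30 21:55:52 Tower kernel: ata3: SError: { BadCRC }
# and captures the port name plus the flags between the braces.
SERROR = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: SError: \{([^}]*)\}")


def find_crc_errors(lines):
    """Return {ata_port: count} for SError lines that carry the BadCRC flag."""
    counts = {}
    for line in lines:
        match = SERROR.search(line)
        if match and "BadCRC" in match.group(2).split():
            port = match.group(1)
            counts[port] = counts.get(port, 0) + 1
    return counts


# Sample lines taken from this thread; the second one must not match.
sample = [
    "Sep 30 21:55:52 Tower kernel: ata3: SError: { BadCRC }",
    "Sep 16 15:29:13 Tower kernel: md: disk9 read error, sector=219119384",
]

for port, count in sorted(find_crc_errors(sample).items()):
    print(f"{port}: {count} SError line(s) with BadCRC")
```

If the count for a port keeps rising after the cable has been replaced, the port, backplane, or power to that drive would be the next things to suspect.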
cyberstyx (Author) Posted October 1, 2023

4 hours ago, JorgeB said:
    Disk 9 dropped offline, and this: Sep 30 21:55:52 Tower kernel: ata3: SError: { BadCRC } usually means a bad SATA cable. Replace it and try again. Also, the libvirt.img is corrupt; you'll need to restore it from a backup if available.

Hello JorgeB,

I replaced the SATA cable, ran a filesystem check with -n on the disk, and restarted the rebuild.
cyberstyx (Author) Posted October 2, 2023

On 10/1/2023 at 12:11 PM, JorgeB said:
    Disk 9 dropped offline, and this: Sep 30 21:55:52 Tower kernel: ata3: SError: { BadCRC } usually means a bad SATA cable. Replace it and try again. Also, the libvirt.img is corrupt; you'll need to restore it from a backup if available.

The rebuild finished successfully with 0 errors. I also did a sample check of the file structure: all files are there and all are working as they should. I have attached the diagnostics file in case you want to have a look.

I will start tackling the libvirt.img corruption issue, probably tomorrow. I have some Unraid OS backups if needed, and the config of the VMs has not changed in a long time. I will also look into the proper usage of SSDs in Unraid, as suggested.

Thank you all again for your help, especially JorgeB. Four or more hardware failures one after the other (one PSU and various SATA cables), and the many errors they caused, are something only experts could have tackled.

Christos.

tower-diagnostics-20231002-1857.zip
JorgeB Posted October 2, 2023

Everything looks good, except the already mentioned libvirt corruption.