darkside40 Posted October 11, 2023: Hi there, I am running a parity check which shows 213 errors so far. The last one also had errors, but I thought that was because of an unclean shutdown beforehand. Is there any way to see whether a disk is faulty? The disks mount without problems, so I don't think there is a SMART problem. htms-diagnostics-20231011-1808.zip
JorgeB Posted October 11, 2023: And you are sure the previous one was a correcting check? If so, start by running memtest.
darkside40 Posted October 11, 2023: Yes, I am sure. Okay, then I will try that tomorrow, because I have to hook up a monitor for that.
darkside40 Posted October 15, 2023: Okay, I did 2 Memtest runs now. No errors. What would be the next step? I have an alternative controller card (Silverstone ECS06) lying around.
JorgeB Posted October 16, 2023: You can try that. Don't forget that you need to run at least 2 checks after changing something; the first one may still find errors.
darkside40 Posted October 16, 2023: So I assume the first run must be a correcting parity check, so that the second run should not find any more errors.
darkside40 Posted October 26, 2023: Okay, this took me a while, I had some really busy days. I exchanged all the SATA cables first. I ran a first correcting parity check that found 1839 errors, then did a second (non-correcting) check some hours later that found 978 errors. So I don't think it was the cabling. Next would be the controller, I think, with the same procedure. I also attached the diagnostics, so maybe someone with more experience than me could have a look at them and find something obvious. htms-diagnostics-20231026-0714.zip
JorgeB Posted October 26, 2023: Try the controller. It can also be a disk, and if it's that it can be a pain to find.
darkside40 Posted October 26, 2023: I can imagine. I did an XFS check on all disks, no issues. I had a look at all disks' SMART values; only one disk had 17 UDMA CRC errors, maybe that's the cause.
JorgeB Posted October 26, 2023: darkside40 said: "had 17 UDMA CRC Errors, maybe thats the cause." Unlikely, it would be an internal disk problem, and not seen on SMART.
darkside40 Posted October 26, 2023: Just to be sure: every time I change something I have to do a correcting run and after that a non-correcting run? Only that way can I verify that I found a solution.
JorgeB Posted October 26, 2023: Correct, and if you start removing one disk at a time to see if that is the problem, you need to resync parity first, then run a check.
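The two-pass routine, and the resync after removing a disk, can be illustrated with a toy model. This is only a minimal sketch assuming plain single XOR parity over equal-sized disks; the names `xor_parity` and `parity_check` are hypothetical and this is not Unraid's actual code.

```python
from functools import reduce
from operator import xor


def xor_parity(disks):
    """Return the XOR parity block list for equal-length data disks."""
    return [reduce(xor, stripe, 0) for stripe in zip(*disks)]


def parity_check(disks, parity, correct=False):
    """Count stripes whose stored parity differs from the recomputed one.

    With correct=True the stored parity is overwritten in place,
    mimicking a correcting check in this toy model.
    """
    errors = 0
    for i, expected in enumerate(xor_parity(disks)):
        if parity[i] != expected:
            errors += 1
            if correct:
                parity[i] = expected
    return errors


# Three hypothetical data disks with four blocks each.
disks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
parity = xor_parity(disks)
parity[1] ^= 0xFF  # simulate one stale parity block

first = parity_check(disks, parity, correct=True)  # correcting pass
second = parity_check(disks, parity)               # non-correcting verification

# Dropping a disk makes the old parity stale until it is resynced.
fewer = disks[:2]
stale = parity_check(fewer, parity)
after_resync = parity_check(fewer, xor_parity(fewer))

print(first, second, stale, after_resync)
```

In this model a clean second pass only shows that nothing changed between the two runs, which is why a non-zero second result points at hardware that is still misbehaving.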
darkside40 Posted October 26, 2023: Okay. I think the first and easiest thing now is to replace the memory, although Memtest found no errors. If that does not solve the problem I will replace the AAR1430 with an ECS06 controller; hopefully that does not cause problems like the last ASM1166 I tried a couple of months ago. If that does not help, yeah, then I have to check every single disk.
JorgeB Posted October 26, 2023: Yeah, leave the disks for last. It's very uncommon, but it can happen.
darkside40 Posted October 30, 2023: So it's not the RAM and it is not the parity disk. I replaced it because I thought: the failures always appear around 95% (it would be great if they began at 1%, then I could save a lot of time), so maybe the failure is somewhere at the end of that disk if it's checked sequentially. Next will be the controller.
darkside40 Posted November 24, 2023 (edited November 24, 2023): Okay, the hunt goes on. Yesterday I moved all the hardware into a new case to get rid of the old drive cages, and I had to replace the old parity drive because of a broken power connector. It was working, but to be sure. Then I did a parity sync, which completed but with 2039 read errors on Disk 3. If I replace the disk now with a new one and let it rebuild, shouldn't there be faulty data on it in that case? I mean, it shows the parity as valid, which I don't quite understand if there were read errors while building the parity. Or does it mean that there were read errors which could be resolved by multiple reads etc.?
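The concern about rebuilding from parity that was built during faulty reads can be sketched in a toy model. This is a minimal sketch assuming single XOR parity and a read that silently returns wrong data; `xor_blocks` and the disk contents are hypothetical, and how Unraid actually handles a reported read error during a sync is not shown here.

```python
from functools import reduce
from operator import xor


def xor_blocks(*columns):
    """XOR equal-length block lists together, stripe by stripe."""
    return [reduce(xor, stripe, 0) for stripe in zip(*columns)]


# Hypothetical contents of three data disks, four blocks each.
disk1 = [1, 2, 3, 4]
disk2 = [5, 6, 7, 8]
disk3 = [9, 10, 11, 12]

# Parity sync during which block 2 of disk3 is read incorrectly (assumed fault).
disk3_as_read = list(disk3)
disk3_as_read[2] = 0
parity = xor_blocks(disk1, disk2, disk3_as_read)

# Rebuilding disk3 from that parity plus the remaining disks.
rebuilt = xor_blocks(parity, disk1, disk2)

# Blocks where the rebuilt disk differs from the true contents.
bad_blocks = [i for i, (a, b) in enumerate(zip(disk3, rebuilt)) if a != b]
print(rebuilt, bad_blocks)
```

Under these assumptions the rebuilt disk reproduces exactly what was read during the sync, so any block that was misread then would come back wrong after a rebuild.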
darkside40 Posted November 24, 2023: My plan would now be to try to copy the data from the disk with the read errors to another fresh disk, place that in the array, rebuild parity and check it after that. Does that sound reasonable?
JorgeB Posted November 24, 2023: darkside40 said: "Than i did a Parity Sync, which completed but with 2039 Read Errors on Disk 3." Post new diags.
darkside40 Posted November 24, 2023: There you go. htms-diagnostics-20231124-1020.zip
JorgeB Posted November 24, 2023: It's not logged as a disk issue and the disk looks healthy. Check/replace cables and try again.
darkside40 Posted November 24, 2023: The cables are new (data and power), and there is no drive cage in between anymore.
JorgeB Posted November 24, 2023: Swap cables with another disk and retry. It could also be a PSU issue.
darkside40 Posted November 24, 2023: Okay, I'll try switching the cable and run a correcting check.
darkside40 Posted November 25, 2023: So now Disk 3 went into an error state during the last check. How should I proceed? New logs are attached. htms-diagnostics-20231125-0951.zip