NLS Posted September 28, 2022
So my syslog got full while rebuilding a disk from parity (as seen in my previous threads). The rebuild is still around 50% and progressing without any report of issues in the GUI, although it did pop up a notice about a single error (probably a bad sector on parity?)... but from that point on there were no further issues, and if I hadn't noticed the log getting full I would think things are OK.
The lines that filled up the syslog are as follows:
Sep 28 10:17:59 <my server> kernel: md: disk0 read error, sector=2169270496
(with the sector changing on every line)
The last entry (because the log filled up) was 70 minutes ago (10:17:59 or something, local time) and I am about 4 hours into the rebuild already. So I looked for the FIRST such entry in the log. What I found was very interesting. The first entry in the log, more than 1.5 million lines above the last, WAS IN THE SAME MINUTE (10:17:06). It actually "burst" 1.7 million lines within the same minute, so I doubt it correctly identifies the errors. (Is it realistic to find 1.7 million sectors with problems within 50 seconds? Plus who knows how many more after the log was full?)
Also, I am not sure which disk is "disk0", as I don't have that anywhere. Is it the parity? What is happening? I am posting a truncated version of the log.
Attachment: syslog.txt
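To check how tightly such a burst is packed, the read-error lines can be counted per second. The following is a minimal sketch, assuming the syslog lines follow the format quoted in the post above; the hostname `tower`, the sample sectors, and the function name `count_read_errors` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   Sep 28 10:17:59 tower kernel: md: disk0 read error, sector=2169270496
# "tower" is a placeholder hostname, not taken from the thread.
READ_ERROR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+.*?kernel:\s+"
    r"md: (?P<disk>disk\d+) read error, sector=(?P<sector>\d+)"
)


def count_read_errors(lines):
    """Return (errors per (disk, timestamp), set of distinct sectors)."""
    per_second = Counter()
    sectors = set()
    for line in lines:
        m = READ_ERROR.match(line)
        if m:
            per_second[(m["disk"], m["stamp"])] += 1
            sectors.add(int(m["sector"]))
    return per_second, sectors


# Illustrative sample lines (not copied from the attached syslog).
sample = [
    "Sep 28 10:17:06 tower kernel: md: disk0 read error, sector=2169270488",
    "Sep 28 10:17:06 tower kernel: md: disk0 read error, sector=2169270496",
    "Sep 28 10:17:59 tower kernel: md: disk0 read error, sector=2169270496",
    "Sep 28 10:17:59 tower kernel: ata6: SError: { UnrecovData 10B8B BadCRC }",
]
per_second, sectors = count_read_errors(sample)
for (disk, stamp), n in sorted(per_second.items()):
    print(f"{stamp} {disk}: {n} read errors")
print(f"{len(sectors)} distinct sectors")
```

On a real log, passing an open file object instead of `sample` would show whether the errors are spread over the rebuild or concentrated in a few seconds, and whether the same sectors repeat.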
NLS (Author) Posted September 28, 2022 (edited September 28, 2022)
...erm, actually I just noticed this in the GUI...
Current operation started on Wednesday, 28-09-2022, 07:34 (today)
Elapsed time: 4 hours, 51 minutes
Estimated finish: 2 hours, 32 minutes
Finding 219564171 errors
The number is a bit unrealistic. So, could this again be a cable issue, and is it on parity? Yes, it is on parity, as I also see this:
Parity WDC_WD40EFRX-68N32N0_WD-WCC7K6SYT2RN - 4 TB (sdb) * 493 984 963 5944 224 437 515
So, about half the reads (!?) fail? Any ideas? Also, is there any disk test appropriate for the parity disk? (which, as I understand it, doesn't have a proper filesystem?)
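As a quick check of the "about half the reads fail" estimate, the ratio can be worked out from the parity row. This sketch assumes the three numbers after the `*` are the reads, writes, and errors counters, in that order; that column reading is an assumption, as the post does not label them.

```python
# Counters copied from the parity row in the post above.
# ASSUMPTION: the columns are reads, writes, errors (the post does not label them).
reads = 493_984_963
writes = 5_944
errors = 224_437_515

# Fraction of reads that are reported as errors.
error_ratio = errors / reads
print(f"errors / reads = {error_ratio:.1%}")
```

Under that assumption the ratio comes out somewhat under one half, which is in line with the estimate in the post.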
JorgeB Posted September 28, 2022 (Solution)
ata6: SError: { UnrecovData 10B8B BadCRC }
This is usually the result of a bad SATA cable.
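Lines like the one JorgeB quotes can be pulled out of a large syslog and grouped by ATA port. This is a minimal sketch, assuming the kernel prints the flags between braces exactly as in the quoted line; the hostname, timestamp, and the function name `serror_flags` are illustrative.

```python
import re

# Matches e.g. "ata6: SError: { UnrecovData 10B8B BadCRC }"
SERROR = re.compile(r"(?P<port>ata\d+(?:\.\d+)?): SError: \{(?P<flags>[^}]*)\}")


def serror_flags(lines):
    """Map each ATA port to the set of SError flags seen for it."""
    found = {}
    for line in lines:
        m = SERROR.search(line)
        if m:
            found.setdefault(m["port"], set()).update(m["flags"].split())
    return found


# Illustrative sample; only the SError text itself comes from the thread.
sample_log = [
    "Sep 28 10:17:05 tower kernel: ata6: SError: { UnrecovData 10B8B BadCRC }",
    "Sep 28 10:17:06 tower kernel: md: disk0 read error, sector=2169270496",
]
flags_by_port = serror_flags(sample_log)
for port, flags in flags_by_port.items():
    print(port, sorted(flags))
```

Knowing which port (here `ata6`) reports the flags helps match the errors to a physical cable before swapping it.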
NLS (Author) Posted September 28, 2022
15 minutes ago, JorgeB said:
ata6: SError: { UnrecovData 10B8B BadCRC } This is usually the result of a bad SATA cable.
I am going to wait for the rebuild to finish (it is more than 70% now). I am not close to the server anyway. Or should I just stop it so the disk doesn't come online? Assuming it is the cable, what is the proper procedure after replacing it? How can I force disk 9 to rebuild from the start? Delete whatever partition it created? Also, is there any disk check appropriate for parity (before starting the rebuild)?
JorgeB Posted September 28, 2022
You should cancel the rebuild, replace the cable and start over. Since there's no other parity drive, the sectors where the read errors happened will be skipped in the rebuilt disk, resulting in data corruption, assuming they held data.
NLS (Author) Posted September 28, 2022 (edited September 28, 2022)
Seems it was indeed the cable. Which SUCKS, since anyone who followed my three latest threads can see I got all kinds of cable errors. And OK, the parity disk is on a normal SATA cable that goes to the mobo. The others are on 1-to-4 SAS/SATA cables that cannot be replaced very quickly (they need to be ordered from eBay etc.).
Anyway... now the emulated disk is being rebuilt from parity, with 0 errors so far (3.2%). Knock on wood.
(Then I will replace the parity with a new, bigger disk and rebuild, then replace the very old temporary 3TB disk that is being rebuilt right now with a new, bigger one and rebuild YET again, then use the old 4TB parity to replace one more older 3TB data disk and yes... rebuild AGAIN.)
Note: I never got a reply on how someone could possibly (surface?) check the parity disk.
ChatNoir Posted September 28, 2022
1 hour ago, NLS said:
Note: I never a reply on how someone could possibly (surface?) check the parity disk.
Extended SMART test?
NLS (Author) Posted September 28, 2022
4 hours ago, ChatNoir said:
Extended SMART test ?
I thought of that, but I don't think it is comparable to a full, actual surface test.
JorgeB Posted September 29, 2022
It does a full surface test, though there are some utilities you can use that also show, for example, slow sectors, and those won't appear in a SMART test result.