axeman Posted April 30, 2022

So I had one disk show read errors during the monthly parity check. I'm guessing read errors are a bit different from write errors in that Unraid doesn't disable the drive?

What are my next steps here? Just move the files off this disk and replace the disk? It's the oldest one, going on 10 years, I think, so I have no problem getting rid of it. Diags below.

Unraid 6.9.2

I have a bunch of these in syslog:

Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051960
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051968
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051976
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051984
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051992
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052000
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052008
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052016
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052024
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052032
Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949052040

About Disk 12:

Apr 19 07:39:52 Rigel kernel: xfs filesystem being mounted at /mnt/disk12 supports timestamps until 2038 (0x7fffffff)
Apr 19 07:39:52 Rigel emhttpd: shcmd (6511): xfs_growfs /mnt/disk12
Apr 19 07:39:52 Rigel root: meta-data=/dev/mapper/md12 isize=512 agcount=4, agsize=122094532 blks
Apr 19 07:39:52 Rigel root: = sectsz=512 attr=2, projid32bit=1
Apr 19 07:39:52 Rigel root: = crc=1 finobt=1, sparse=0, rmapbt=0
Apr 19 07:39:52 Rigel root: = reflink=0
Apr 19 07:39:52 Rigel root: data = bsize=4096 blocks=488378126, imaxpct=5
Apr 19 07:39:52 Rigel root: = sunit=0 swidth=0 blks
Apr 19 07:39:52 Rigel root: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Apr 19 07:39:52 Rigel root: log =internal log bsize=4096 blocks=238465, version=2
Apr 19 07:39:52 Rigel root: = sectsz=512 sunit=0 blks, lazy-count=1
Apr 19 07:39:52 Rigel root: realtime =none extsz=4096 blocks=0, rtextents=0

Attachment: WDC_WD20EARS-00MVWB0_WD-WCAZA5944835-20220428-2238 disk12 (sdo).txt
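For anyone triaging similar syslog output, here is a minimal Python sketch that counts the `md: diskN read error` lines per disk and reports the affected sector range. It assumes the exact line layout quoted in the post above; the function name `summarize_read_errors` and the `sample` list are illustrative and not part of Unraid.

```python
import re
from collections import defaultdict

# Matches md driver lines in the layout quoted in the post, e.g.
# "Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051960"
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def summarize_read_errors(lines):
    """Return {disk: (error_count, lowest_sector, highest_sector)}."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return {disk: (len(s), min(s), max(s)) for disk, s in sectors.items()}


# Three of the lines from the post, used here as sample input.
sample = [
    "Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051960",
    "Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051968",
    "Apr 28 22:32:51 Tower kernel: md: disk12 read error, sector=2949051976",
]

for disk, (count, first, last) in summarize_read_errors(sample).items():
    print(f"{disk}: {count} read errors, sectors {first}..{last}")
```

In the log above, the eleven reported sectors run from 2949051960 to 2949052040 in steps of 8, so the errors sit in one small contiguous region and are not scattered across the disk.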
itimpi Posted April 30, 2022
Do you have your monthly parity check set to correcting or non-correcting?
axeman (Author) Posted April 30, 2022
1 hour ago, itimpi said: Do you have your monthly parity check as correcting or non-correcting?
Yes, it is. And result said zero errors.
trurl Posted April 30, 2022
1 hour ago, axeman said: Yes, it is. And result said zero errors.
"Yes" is not the correct answer to a multiple choice question.
3 hours ago, itimpi said: correcting or non-correcting?
trurl Posted April 30, 2022
3 hours ago, axeman said: What are my next steps here?
Attach diagnostics to your NEXT post in this thread. Disable spindown on the disk and run an extended SMART test.
axeman (Author) Posted May 1, 2022
5 hours ago, trurl said: Attach diagnostics to your NEXT post in this thread. Disable spindown on the disk and run an extended SMART test.
Attached. These diagnostics were captured before the extended SMART test, as soon as I got the error. I'll run an extended test now.
5 hours ago, trurl said: "Yes" is not the correct answer to a multiple choice question.
Sorry. Yes, correcting.
Attachment: Disk 12 Errors SDO -diagnostics-20220428-2238.zip
JorgeB Posted May 1, 2022
It's logged as a disk issue, and SMART shows some issues; wait for the extended test result.
axeman (Author) Posted May 1, 2022
4 hours ago, JorgeB said: It's logged as a disk issue, and SMART shows some issues, wait for the extended test result.
The extended test shows no errors.
Attachment: rigel-smart-20220501-1014.zip
JorgeB Posted May 2, 2022
These can be intermittent. Since the test passed, the disk is OK for now, but you should keep monitoring, especially these attributes:
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 126
If they climb, you'll likely get more read errors.
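The advice above, watching whether the raw values of attributes 1 and 200 climb, can be sketched in Python as a comparison of two saved SMART reports. This is only an illustration: it assumes attribute rows in the layout quoted above (ID, name, flags, value, worst, threshold, fail, raw value), and the helper names `parse_raw_values` and `climbing`, as well as the `later` report, are hypothetical and not taken from the thread.

```python
import re

# One attribute row in the layout quoted in the post:
# ID  NAME  FLAGS  VALUE  WORST  THRESH  FAIL  RAW_VALUE
ATTR_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
)


def parse_raw_values(report):
    """Return {attribute_id: (name, raw_value)} for each attribute row found."""
    values = {}
    for line in report.splitlines():
        match = ATTR_LINE.match(line)
        if match:
            values[int(match.group(1))] = (match.group(2), int(match.group(8)))
    return values


def climbing(previous, current, watch=(1, 200)):
    """List (id, name, old_raw, new_raw) for watched attributes that increased."""
    old, new = parse_raw_values(previous), parse_raw_values(current)
    changes = []
    for attr_id in watch:
        if attr_id in old and attr_id in new and new[attr_id][1] > old[attr_id][1]:
            changes.append((attr_id, new[attr_id][0], old[attr_id][1], new[attr_id][1]))
    return changes


# Values quoted in the thread.
baseline = """\
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    126
"""

# Hypothetical later report, invented purely to exercise the comparison.
later = """\
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    130
"""

for attr_id, name, old_raw, new_raw in climbing(baseline, later):
    print(f"attribute {attr_id} ({name}) rose from {old_raw} to {new_raw}")
```

Comparing raw values, not the normalized 200/200 columns, follows the post: both attributes still show a normalized value of 200 even though Multi_Zone_Error_Rate already has a raw count of 126.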
axeman (Author) Posted May 8, 2022
On 5/2/2022 at 5:00 AM, JorgeB said: These can be intermittent. Since the test passed, the disk is OK for now, but you should keep monitoring, especially these attributes:
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 126
If they climb, you'll likely get more read errors.
Thanks, will do!
trurl Posted May 8, 2022
On 5/2/2022 at 5:00 AM, JorgeB said: especially these attributes
Those attributes are not monitored by default, so you have to add them to the custom attributes, which you can reach by clicking the specific disk to open its page.
axeman (Author) Posted May 11, 2022
On 5/8/2022 at 11:28 AM, trurl said: Those attributes are not monitored by default, so you have to add them to the custom attributes you can access by clicking on the specific disk to get to its page.
Thanks! Will do.