November 16, 2020

Solution: If you own WD Red Plus drives and you're experiencing errors, perform the following steps:

1) Run an Extended SMART test on the drive that is erroring.
2) Download the SMART results when the test is complete and check Raw_Read_Error_Rate in the .txt file.
3) Raw_Read_Error_Rate should be zero for WD Red Plus drives specifically (this is not true for all HDDs; ask for help if you're using a different type of drive)!
4) If Raw_Read_Error_Rate is not zero, your Red Plus drive will need replacing. RMA it if it is under warranty.
5) For extra certainty, run an Extended SMART test on the replacement drive to ensure it's working as expected.
6) Add "1,200" (no quotes) to the SMART attribute notifications of your WD Red Plus drives (the textbox next to "Custom Attributes").

OP below:

Hi all! New Unraid user here. Everything has been working swimmingly up until my first mover job (cache dumping contents onto spinny plates). My disk1 is receiving a crazy amount of errors.

This system, including all the drives, is brand new. I downloaded my diagnostics and found thousands of these in syslog.txt:

Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479712
Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479720
Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479728
Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479736
Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479744
Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479752

I'm currently running the SMART extended self-test on disk1. Results TBD.

My question is: is disk1 bunk? Given that all the drives are fresh off the press, so to speak, I would expect zero errors. Could there be a software reason for all these errors, other than a bad disk? Looking for help here before moving forward with an RMA. Cheers!

Edited December 3, 2020 by codearoni
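For anyone triaging a similar syslog, a short script can summarize the md read errors instead of scrolling through thousands of lines. This is only a sketch: the helper name `summarize_read_errors` is hypothetical (it is not part of Unraid), and it assumes the exact line format shown in the excerpt above.

```python
import re
from collections import defaultdict

# Matches lines like: "... kernel: md: disk1 read error, sector=15032479712"
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")

def summarize_read_errors(lines):
    """Return {disk: (count, lowest_sector, highest_sector)} for md read errors."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return {disk: (len(s), min(s), max(s)) for disk, s in sectors.items()}

# The six lines quoted in the original post.
sample = [
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479712",
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479720",
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479728",
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479736",
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479744",
    "Nov 16 10:08:58 Alexandria kernel: md: disk1 read error, sector=15032479752",
]

summary = summarize_read_errors(sample)
for disk, (count, low, high) in summary.items():
    print(f"{disk}: {count} read errors, sectors {low}..{high}")
```

To run it against a real diagnostics download, pass the lines of syslog.txt (for example `open("syslog.txt").read().splitlines()`) instead of `sample`. Seeing all errors on one disk in one contiguous sector range is a useful detail to include when asking for help.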
November 16, 2020 Community Expert
Syslog snippets are seldom sufficient. Without more information, my best guess is a bad connection, simply because that is the most frequent problem we see.

7 minutes ago, codearoni said: "downloaded my diagnostics"

Give them to us and we will have more information to understand what is happening and make recommendations. Attach the complete Diagnostics ZIP file to your NEXT post in this thread.
November 16, 2020 Community Expert
This one looks like it may be a disk problem:

Nov 16 03:40:22 Alexandria kernel: ata2.00: status: { DRDY SENSE ERR }
Nov 16 03:40:22 Alexandria kernel: ata2.00: error: { UNC }
Nov 16 03:40:22 Alexandria kernel: ata2.00: configured for UDMA/133
Nov 16 03:40:22 Alexandria kernel: sd 2:0:0:0: [sdc] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Nov 16 03:40:22 Alexandria kernel: sd 2:0:0:0: [sdc] tag#4 Sense Key : 0x3 [current]
Nov 16 03:40:22 Alexandria kernel: sd 2:0:0:0: [sdc] tag#4 ASC=0x11 ASCQ=0x4
Nov 16 03:40:22 Alexandria kernel: sd 2:0:0:0: [sdc] tag#4 CDB: opcode=0x88 88 00 00 00 00 03 80 00 4c 18 00 00 05 40 00 00
Nov 16 03:40:22 Alexandria kernel: print_req_error: I/O error, dev sdc, sector 15032405016

Let us know how the extended SMART turns out.
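As a side note for readers wondering how the CDB line relates to the failing sector: opcode 0x88 is the SCSI READ(16) command, and Sense Key 0x3 with ASC 0x11 indicates a medium error with an unrecovered read, which fits the `UNC` flag above. The sketch below decodes the CDB bytes from the log. The function name `decode_read16` is made up for illustration, and it only handles READ(16).

```python
def decode_read16(cdb_hex):
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes.

    Returns (starting logical block address, transfer length in blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    length = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, length

# CDB bytes copied from the syslog line above.
lba, length = decode_read16("88 00 00 00 00 03 80 00 4c 18 00 00 05 40 00 00")
print(f"read of {length} blocks starting at LBA {lba}")
```

The decoded starting address is 15032405016, the same sector that `print_req_error` reports for sdc, so the failed request and the I/O error line describe the same read.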
November 16, 2020 Author
Just posting an update: the extended SMART test is still at 40%. I might not have results ready until tomorrow. Thanks again for meandering through this issue with me, trurl.
November 17, 2020 Author
Attached is my SMART report for disk1. The text below the report download says "Completed without error".
alexandria-smart-20201117-0817.zip
November 17, 2020 Community Expert
That WD Red disk went from zero to this on SMART attribute 1:

ID# ATTRIBUTE_NAME       FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate  PO-R--  086   086   016    -    162

I would replace it.
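The row above shows why the report can look healthy at a glance: the normalized VALUE (086) is still well above THRESH (016) and the FAIL column is empty, yet RAW_VALUE is 162. A minimal sketch of checking the raw value from a saved report follows. The function names are hypothetical, the watched IDs 1 and 200 are the ones suggested elsewhere in this thread for this drive model, and the parser assumes the raw field is a plain integer, which does not hold for every attribute or every drive.

```python
def parse_attribute_row(row):
    """Parse one row of the SMART attribute table in the layout shown above."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flags": fields[2],
        "value": int(fields[3]),   # normalized value, e.g. "086" -> 86
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "fail": fields[6],
        "raw": int(fields[7]),     # assumption: raw value is a plain integer
    }

def needs_attention(attr, watched_ids=(1, 200)):
    """Apply the thread's rule for this model: watched raw values should be zero."""
    return attr["id"] in watched_ids and attr["raw"] != 0

attr = parse_attribute_row("1 Raw_Read_Error_Rate PO-R-- 086 086 016 - 162")
above_threshold = attr["value"] > attr["thresh"]
print(attr["name"], "raw =", attr["raw"],
      "| above threshold:", above_threshold,
      "| needs attention:", needs_attention(attr))
```

This rule is specific to the drive model discussed here. Other models report large nonzero raw values for attribute 1 as a matter of course, so the same check would produce false alarms on them.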
November 17, 2020 Author
Roger. Thank you so much, trurl! Just for my own notes and knowledge: can you briefly describe what you're seeing? Would a healthy disk have "000" for all of those fields?
November 17, 2020 Community Expert
1 minute ago, codearoni said: "Just for my own notes and knowledge: can you briefly describe what you're seeing. Would a healthy disc have "000" for all of those fields?"

Different disk models interpret that attribute differently. For WD Red, the raw value should be zero. If you have any other disks of that model, you should click on each one to get to its page and set Unraid to monitor that attribute.
November 19, 2020 Author
Just an update: I spun down the array and removed the disk. It's currently in RMA. After I get the replacement I'll start a rebuild. When it's all said and done, I'll update the OP with the steps I used to triage this issue. Hoping it'll help future WD Red owners.
November 28, 2020 Author
Hi trurl! While I've been waiting on my RMA'd disk, I've been looking into setting up Unraid to monitor said attribute for my WD Red drives. I've looked at the wiki and these forums, but I'm unsure how to add monitoring as discussed above. I assume I go to the disk page and enter a custom attribute (screenshot of what I'm talking about attached)? Is this correct? What would the syntax for this custom attribute look like?
November 28, 2020 Community Expert
Just as it says: "Custom attributes (use comma to separate numbers)". You want attributes 1 and 200, so just put 1,200 in the blank and click APPLY.
November 30, 2020 Author
Thanks trurl. Looks good now. I was making it more complicated than it needed to be (i.e. entering "Attribute = 0" to try to match the checkboxes below).

Final question: I'll be rebuilding the array soon. I am adding a 2nd parity drive and one more storage drive. Should I:
1) spin up the array with the replacement disk ONLY and rebuild FIRST, followed by spinning down the array and adding the new drives, or
2) spin up the array with the replacement disk plus the new drives, and rebuild all together?

I couldn't find any documentation on this particular scenario in the wiki. I would prefer to do #2 as I imagine it'll be faster, but I'm obviously more interested in doing this correctly than quickly.
November 30, 2020 Community Expert
Assuming you mean the new data disk is for a new slot, you can do #2 with New Config. If the new data disk isn't clear, you will have to rebuild parity1 at the same time, so there is no protection until that is done.
November 30, 2020 Author
Thanks trurl. Just to be clear: I'll be moving from 1x Parity and 3x Data drives to 2x Parity and 4x Data drives. It sounds like adding a 2nd parity drive will require a rebuild of Parity #1, so I might be better off doing #1: just adding the replacement data disk and rebuilding the array, then afterwards spinning down the array and adding Parity #2 and Data #4?
November 30, 2020 Community Expert
8 minutes ago, codearoni said: "Then afterwards, spinning down the array, and adding Parity #2 and Data #4?"

I have a feeling that Unraid will not allow these to be done in one step, as adding the extra data drive starts a clear operation and adding a parity drive starts a parity sync operation, and you cannot run both of these at the same time.
November 30, 2020 Author
Right on, ty itimpi! So as a general rule of thumb: adding multiple data drives at once = fine. Adding data + parity drives = not fine (do them as separate tasks). Makes sense. I'm just a new user and don't want to make any assumptions about how Unraid operates.
November 30, 2020 Community Expert
I should have reviewed the thread, since I overlooked the fact that you were replacing a disk. Of course that has to be done separately and before any other changes. You can add data and parity drives at the same time (through New Config), but you must replace / rebuild a disk separately. If the disk actually needs replacing due to problems, then that should be done before anything else.
December 1, 2020 Author
No worries trurl, I imagine you're managing thousands of threads on this board lol. I've begun a rebuild but it'll take 12 hours. Probably tomorrow I'll pop on and update the OP with a summary of steps taken for these particular drives (WD Red Plus).
December 2, 2020 Author
Everything has been updating swimmingly; it's just taking a while given the drives I got (14 hours each). I had a question about extended SMART tests though: can I run them while the array is up and running? Will things like mover jobs be interrupted by extended SMART tests if I run them at night?
December 2, 2020 Community Expert
17 minutes ago, codearoni said: "can I run them while the array is up and running?"

Yes, they run in the background, but avoid heavy I/O on the disk while they run or they will take much longer.
December 3, 2020 Author
Thanks to everyone for the help on this issue! I've updated my OP with my triage steps. Hopefully it'll help future WD Red users. I've got my array back online. The rebuild process was incredibly easy. The hardest part of this whole thing was waiting on the RMA drive. It's only strengthened the idea that Unraid was the right choice for my NAS.
This topic is now archived and is closed to further replies.