Cazzy Posted October 1, 2020 (edited)

So I was half drunk the other day when I got an email that my drive failed a write, for some reason lol. I ran the short and extended SMART tests and both came back clean. I saw posts from previous users suggesting to take the drive offline, remove it from the array, re-add it, and rebuild, but I want to make sure that's my only route here. Any reason, based on the files attached, to believe that removing the drive from the array and resyncing it won't work? What do you suggest, otherwise?

Diagnostics and SMART test are attached. I don't have a drive to replace it with yet since Best Buy doesn't have them on sale right now (I know, I know. I should've gotten backup drives when they were on sale. Lesson learned, lol).

Thanks in advance!

Attachments: plexhub-diagnostics-20201001-1051.zip, plexhub-smart-20201001-1050.zip

Edited October 1, 2020 by Cazzy
trurl Posted October 1, 2020

SMART looks OK, but syslog seems to indicate these may be actual disk problems. Which controller is this disk on?

Since it passed extended test, you can try to rebuild to the same disk and see if it works. Not necessary to remove, just:

1. Stop array
2. Unassign disabled disk
3. Start array with disabled disk unassigned
4. Stop array
5. Reassign disabled disk
6. Start array to begin rebuild of disabled disk

On Main, you should see Writes to rebuilding disk, Reads from parity and all other array disks, nothing in the Errors column. If there are problems post new diagnostics.
civic95man Posted October 1, 2020

Looks like your LSI card is resetting - may need to check cooling/reseat it in the slot. Also, not sure if it's related, but you're running an older version of the firmware (20.00.04.00). You should look into updating that.

28 minutes ago, Cazzy said: "Any reason, based on the files attached, to believe that removing the drive from the array and resyncing it won't work? What do you suggest, otherwise?"

Does the emulated drive appear mountable?
JorgeB Posted October 1, 2020

29 minutes ago, civic95man said: "Also, not sure if it's related, but you're running an older version of the firmware (20.00.04.00)."

This should definitely be updated. All p20 releases except 20.00.07.00 have known issues.
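For anyone who would rather check this from a log than by eye, here is a minimal Python sketch. It assumes the firmware version appears in an mpt2sas kernel line as `FWVersion(...)`, as it does in the syslog from this thread, and it treats 20.00.07.00 as the only good p20 release purely on the strength of the post above. The function names are made up for illustration.

```python
import re

# Assumption taken from the post above: 20.00.07.00 is the only p20
# firmware release without known issues.
KNOWN_GOOD_P20 = "20.00.07.00"

# Matches e.g. "FWVersion(20.00.04.00)" in an mpt2sas kernel log line.
FW_PATTERN = re.compile(r"FWVersion\((\d+\.\d+\.\d+\.\d+)\)")


def extract_fw_version(line):
    """Return the firmware version found in a log line, or None."""
    match = FW_PATTERN.search(line)
    return match.group(1) if match else None


def needs_update(version):
    """True for any p20 release other than 20.00.07.00.

    Versions outside the p20 series are not judged here, because the
    thread only makes a claim about p20.
    """
    return version.startswith("20.") and version != KNOWN_GOOD_P20


# Sample line copied from the syslog in this thread.
sample = (
    "Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: LSISAS2008: "
    "FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(07.39.00.00)"
)

version = extract_fw_version(sample)
print(version, needs_update(version))
```

This only reads what the driver reports at initialization. It does not talk to the card, and it is no substitute for confirming the version after flashing.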
Cazzy (Author) Posted October 1, 2020 (edited)

7 hours ago, civic95man said: "Looks like your LSI card is resetting - may need to check cooling/reseat it in the slot. Also, not sure if it's related, but you're running an older version of the firmware (20.00.04.00). You should look into updating that. Does the emulated drive appear mountable?"

I appreciate it! I just flashed it with 20.00.07.00 and confirmed it flashed correctly. Yes, it did appear as mountable. How were you able to tell the card was resetting, exactly, just for my knowledge? Would like to learn as I go along! Thanks, man!

7 hours ago, trurl said: "SMART looks OK, but syslog seems to indicate these may be actual disk problems. Which controller is this disk on? Since it passed extended test, you can try to rebuild to the same disk and see if it works. Not necessary to remove, just: Stop array; Unassign disabled disk; Start array with disabled disk unassigned; Stop array; Reassign disabled disk; Start array to begin rebuild of disabled disk. On Main, you should see Writes to rebuilding disk, Reads from parity and all other array disks, nothing in the Errors column. If there are problems post new diagnostics."

Doing this now and it looks to be rebuilding. I'll update if any issues come up once it's done! Thank you for the help so far!

Edited October 1, 2020 by Cazzy
civic95man Posted October 2, 2020 (edited)

14 hours ago, Cazzy said: "How were you able to tell the card was resetting, exactly, just for my knowledge? Would like to learn as I go along!"

In your syslog, towards the end, it's filled with this:

    Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290504
    Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290512
    Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290520
    Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290528
    Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290536

That looks like a failing disk for one reason or another, but further back in the log, you see this:

    Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: fault_state(0x7e23)!
    Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: sending diag reset !!
    Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: diag reset: SUCCESS
    Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
    Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(07.39.00.00)
    Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: Protocol=(
    Sep 27 16:56:09 PlexHub kernel: Initiator
    Sep 27 16:56:09 PlexHub kernel: ,Target
    Sep 27 16:56:09 PlexHub kernel: ),
    Sep 27 16:56:09 PlexHub kernel: Capabilities=(
    Sep 27 16:56:09 PlexHub kernel: TLR
    Sep 27 16:56:09 PlexHub kernel: ,EEDP
    Sep 27 16:56:09 PlexHub kernel: ,Snapshot Buffer
    Sep 27 16:56:09 PlexHub kernel: ,Diag Trace Buffer
    Sep 27 16:56:09 PlexHub kernel: ,Task Set Full
    Sep 27 16:56:09 PlexHub kernel: ,NCQ
    Sep 27 16:56:09 PlexHub kernel: )
    Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: sending port enable !!
    Sep 27 16:56:16 PlexHub kernel: mpt2sas_cm0: port enable: SUCCESS
    Sep 27 16:56:16 PlexHub kernel: mpt2sas_cm0: search for end-devices: start

where mpt2sas is the driver for the LSI card. This is where the card s--t the bed and where all of the disk problems originated, probably due to the outdated firmware.

14 hours ago, Cazzy said: "Yes, it did appear as mountable."

And I asked because sometimes when a disk becomes disabled, the emulated disk appears unmountable. This can generally be fixed with a file system repair without losing much, if any, data. But in some extreme cases, where it cannot be repaired or the recovered data is a mess, the contents of the physical drive may be a better choice. Really, it all comes down to a case-by-case basis.

Edited October 2, 2020 by civic95man
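The reading of the log described above (a controller fault and reset showing up before the flood of write errors) can be roughly automated. The following is only a sketch in Python: the two patterns are modeled on the lines quoted in this thread, other controllers or driver versions may log differently, and `summarize` is a hypothetical helper name.

```python
import re

# Patterns modeled on the log lines quoted in this thread (assumption:
# other setups may word these messages differently).
RESET_PATTERN = re.compile(
    r"mpt2sas_cm\d+: (fault_state\(0x[0-9a-fA-F]+\)|sending diag reset)"
)
WRITE_ERROR_PATTERN = re.compile(r"md: (disk\d+) write error, sector=(\d+)")


def summarize(lines):
    """Report whether a controller reset precedes the first md write error."""
    first_reset = None
    first_write_error = None
    write_errors = 0
    for index, line in enumerate(lines):
        if RESET_PATTERN.search(line):
            if first_reset is None:
                first_reset = index
        elif WRITE_ERROR_PATTERN.search(line):
            write_errors += 1
            if first_write_error is None:
                first_write_error = index
    reset_before_errors = (
        first_reset is not None
        and first_write_error is not None
        and first_reset < first_write_error
    )
    return {
        "write_errors": write_errors,
        "reset_seen": first_reset is not None,
        "reset_before_errors": reset_before_errors,
    }


# A few lines copied from the syslog excerpt in this thread.
sample_lines = [
    "Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: fault_state(0x7e23)!",
    "Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: sending diag reset !!",
    "Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: diag reset: SUCCESS",
    "Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290504",
    "Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290512",
]

print(summarize(sample_lines))
```

To run it against a real log, pass the lines of the syslog file from your diagnostics, for example `summarize(open(path).read().splitlines())`, with `path` set to wherever that file lives. A reset that precedes the write errors points at the controller rather than the disk, but it is a hint and not proof.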
Cazzy (Author) Posted October 7, 2020

On 10/2/2020 at 10:22 AM, civic95man said: "where mpt2sas is the driver for the LSI card. This is where the card s--t the bed and where all of the disk problems originated, probably due to the outdated firmware."

Looks like it was the firmware. After re-flashing, the drive was rebuilt. I checked the logs and those errors are gone. Ran a SMART test again, just to be safe, and it passed. I think we're good now! (Fingers crossed)

Thank you for your insight and for helping me out! @trurl as well!