snuffy47 Posted August 28, 2021
Howdy all,
Well, I had a motherboard taken out in a power outage. My fault: my UPS was not working (dead battery). All is fixed now; however, my parity check hit 3200+ errors on the first run and has now come back with 200+.
Looking for some help :)
Attachment: tower-diagnostics-20210828-1226.zip
JorgeB Posted August 29, 2021
First need to fix this:
    Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793824
    Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793832
    Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793840
    Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793848
Disk appears to be failing; an extended SMART test will confirm.
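For anyone who wants to pull these lines out of a syslog themselves, here is a minimal Python sketch. It assumes the read errors always look like the four lines quoted above; the function name `read_errors_by_disk` and the regular expression are made up for illustration and are not part of Unraid.

```python
import re
from collections import defaultdict

# Pattern based only on the log lines quoted in this thread (an assumption
# about the general format, not an official Unraid specification).
READ_ERROR = re.compile(r"kernel: md: (disk\d+) read error, sector=(\d+)")


def read_errors_by_disk(lines):
    """Group the sectors reported in md read-error lines by disk name."""
    errors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            errors[match.group(1)].append(int(match.group(2)))
    return dict(errors)


sample = [
    "Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793824",
    "Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793832",
    "Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793840",
    "Aug 25 21:21:13 Tower kernel: md: disk2 read error, sector=54793848",
]

for disk, sectors in read_errors_by_disk(sample).items():
    print(f"{disk}: {len(sectors)} read errors, first sector {sectors[0]}")
```

Errors that cluster on one disk, as here, point at that disk or its cabling rather than at the array as a whole.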
snuffy47 Posted August 29, 2021 Author
I think the SMART test completed.
Attachment: tower-smart-20210829-1204.zip
trurl Posted August 29, 2021
    # 1  Extended offline    Aborted by host    80%    46043    -
Disable spindown on that disk and run again.
snuffy47 Posted August 30, 2021 Author
Howdy,
Well, that took some time, but it completed, and it indicated no errors.
Help is always appreciated.
Attachment: tower-smart-20210830-1549.zip
trurl Posted August 30, 2021
Still, that disk does have an attribute (3, spin up time, not usually monitored) with something in the FAIL column. And it has a pending sector. And it is over 5 years old. And you are having problems caused by the disk. I think I would retire it.
trurl Posted August 30, 2021
From your earlier diagnostics, it looks like disk2 (and disk8) are empty or mostly so. Is that expected?
snuffy47 Posted August 30, 2021 Author
Howdy,
You are correct regarding the 2 disks. A while back I upgraded some equipment and disks. They are associated with 2 other disks, but the high water setting has not started using them again.
If I was going to do anything with Disk 2 currently, I would like to just remove it from the system.
I have not seen any performance problems that I can think of to date, though my response time for Plex movies seemed slow the last few days. I figured the parity checks and SMART tests were causing some of that.
Happy to provide further details.
trurl Posted August 30, 2021
All bits of all other disks must be reliably read to reliably rebuild a disk, so all disks in the array are important whether they have anything on them or not.
1 minute ago, snuffy47 said:
> If I was going to do anything with Disk 2 currently, I would like to just remove it from the system.
https://wiki.unraid.net/Manual/Storage_Management#Removing_data_disk.28s.29
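To see why every other disk matters during a rebuild, here is a small Python sketch of single parity computed as a byte-wise XOR across the data disks. The three tiny "disks" and the helper `xor_blocks` are invented for illustration; a real array works on whole drives, but the arithmetic is the same idea.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three made-up data "disks" of three bytes each (illustrative values only).
disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
disk3 = bytes([0x00, 0xFF, 0x0F])
parity = xor_blocks([disk1, disk2, disk3])

# Rebuilding disk2 needs parity plus every other data disk.
rebuilt = xor_blocks([parity, disk1, disk3])
print("clean rebuild matches:", rebuilt == disk2)

# If disk3 returns one wrong byte during the rebuild, the rebuilt disk2
# is wrong at that same position, even though disk2 was the disk replaced.
bad_disk3 = bytes([0x00, 0xFE, 0x0F])
corrupted = xor_blocks([parity, disk1, bad_disk3])
print("rebuild with a bad read matches:", corrupted == disk2)
```

An empty disk still contributes its bits (all zeros or whatever is on it) to parity, which is why an unreliable but empty disk can still spoil the rebuild of a different one.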
snuffy47 Posted August 30, 2021 Author
trurl,
Well, I have a disk I can replace it with, so I will go that route. Before I do, is there anything else I should complete first?
trurl Posted August 30, 2021
Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected? It is important to take care of any problem immediately so you don't get multiple problems that you may not easily or fully recover from.
snuffy47 Posted August 30, 2021 Author
I get email notifications; I am just not sure how well I monitor them, though.
Do you have a recommendation? I am sure it will run overnight when I change it.
trurl Posted August 30, 2021
39 minutes ago, snuffy47 said:
> when I change it
Make sure you double check connections. That is the main reason people have problems with replace/rebuild.
snuffy47 Posted September 2, 2021 Author
In panic mode.....
I have not opened up the box or changed anything yet.
What I started was a SMART test on disk 1 and disk 3, with the intention of testing all my drives and posting the results. These are attached, though my server started a scheduled parity check that I forgot to turn off.
Disk 3 is spitting out a ton of raw read errors. I am wondering if there is something loose ;(
One item I should note: I do have a 5-disk 5.25" insert. I was thinking I should eliminate it from the system as it is old, but I am 1 slot shy in my server to do that.
Going to back things up now and, I guess, go from there.
Attachments: RAID Forums.zip, tower-smart-20210902-1327 (1).zip, tower-smart-20210902-1327.zip
snuffy47 Posted September 3, 2021 Author
Update
Not sure if it was the correct approach, but I had 2 disks that would allow me to back up what I'll call my "would rather not lose" files. Crazy part is 3/4 of my data is media that falls under that category. For my "do not want to ever lose" files I already keep an external backup.
I ordered some new drives in case replacement is required, since backing up files used my spares.
Like I said, maybe not the best idea, but I am still not sure what is causing all the problems.
The raw read errors seem to have stopped now that I canceled the parity check.
General plan at this point, hoping the more experienced will correct me if I am off track:
1. Back up data.
2. Check connections on everything and run for a few days. I am struggling with this, as I feel there is failing hardware, but maybe not.
3. If the problem continues, change drive 3, hope it rebuilds, and run for a few days.
4. Have not gone past that.
trurl Posted September 4, 2021
    Serial Number: S2H7J1BZB15127
    ID# ATTRIBUTE_NAME       FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate  POSR-K  001   001   051    NOW  41013
The SMART attributes are internal to the disks. Each disk monitors how well it is working as it is used, and records that in these attributes it keeps in its firmware. Unraid monitors some of these attributes by default and will warn you about those it monitors, but Unraid is only looking at some of them, and Unraid only reports exactly what the disk is telling it.
If the disk says it is failing, believe it.
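As a rough illustration of how to read a row like the one above, here is a Python sketch that splits a brief-format SMART attribute row into its columns and applies the usual rule that an attribute is failing when its normalized VALUE is at or below THRESH. The names `SmartAttr`, `parse_row`, and `is_failing_now` are hypothetical helpers, and the second sample row is the Reallocated_Sector_Ct line quoted elsewhere in this thread.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int
    worst: int
    thresh: int
    fail: str
    raw: str


def parse_row(row: str) -> SmartAttr:
    """Split one attribute row (ID NAME FLAGS VALUE WORST THRESH FAIL RAW)."""
    parts = row.split(None, 7)
    return SmartAttr(
        int(parts[0]), parts[1], parts[2],
        int(parts[3]), int(parts[4]), int(parts[5]),
        parts[6], parts[7].strip(),
    )


def is_failing_now(attr: SmartAttr) -> bool:
    """Normalized value at or below a nonzero threshold means failing.

    A threshold of 0 is treated as "cannot fail" (an assumption in this sketch).
    """
    return attr.thresh > 0 and attr.value <= attr.thresh


rows = [
    "1 Raw_Read_Error_Rate POSR-K 001 001 051 NOW 41013",
    "5 Reallocated_Sector_Ct PO--CK 100 100 010 - 976",
]

for row in rows:
    attr = parse_row(row)
    print(attr.name, "failing now:", is_failing_now(attr), "raw:", attr.raw)
```

Note that the second row is not "failing" by the threshold rule, yet its raw count is still a reason for concern, so the threshold check is a floor and not the whole story.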
trurl Posted September 4, 2021
    Serial Number: ZCT0P3KA
    ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
      5 Reallocated_Sector_Ct   PO--CK  100   100   010    -    976
This one has way too many reallocated sectors for my comfort. And this is one of the attributes Unraid monitors by default, so it should have been telling you about it.
On 8/30/2021 at 4:24 PM, trurl said:
> All bits of all other disks must be reliably read to reliably rebuild a disk, so all disks in the array are important whether they have anything on them or not.
You have single parity, and at least 2 disks that can't be trusted. That is one more untrustworthy disk than you have parity disks. If you try to replace/rebuild one of these, the other may make the rebuild unreliable or impossible.
Copying important data off the array is probably the first priority.
Do any of your other disks show SMART warnings on the Dashboard page?
trurl Posted September 4, 2021
On 8/30/2021 at 4:01 PM, trurl said:
> Still, that disk does have an attribute (3, spin up time, not usually monitored) with something in the FAIL column. And it has a pending sector. And it is over 5 years old. And you are having problems caused by the disk. I think I would retire it.
Forgot about that one. So that is at least 3 unreliable disks in your array. How did you let things get this bad?
snuffy47 Posted September 4, 2021 Author
Well, aren't I in a mess.
I wasn't having any problems up to the power outage knocking my mobo out, but I guess I need to learn how to read the HD counts better. You get used to no problems and then you ignore things.
The data backup will take another day.
Do I toss a dart at the wall and try replacing disk 3 first?
The 2 TB drives and 8 TB I ordered won't be here till midweek.
trurl Posted September 4, 2021
That screenshot shows that on 2 of your disks, some of the attributes that Unraid monitors aren't good. So Unraid has been warning you about them. The other disk isn't showing its problems because the attribute that shows it is failing isn't normally monitored by Unraid.
snuffy47 Posted September 7, 2021 Author
Well, I cannot win for losing on this one....
Just had another crazy storm roll through, and I am having problems mounting my external drive that I was using to back up some of my data.
The messages I get are:
    Sep 7 18:56:19 Tower unassigned.devices: Adding disk '/dev/sdn1'...
    Sep 7 18:56:19 Tower unassigned.devices: Mount drive command: /sbin/mount -t 'ntfs' -o rw,auto,async,noatime,nodiratime,nodev,nosuid,nls=utf8,umask=000 '/dev/sdn1' '/mnt/disks/ST4000DM004-2CV104_ZFN3XMJZ'
    Sep 7 18:56:19 Tower unassigned.devices: Mount of '/dev/sdn1' failed: '$MFTMirr does not match $MFT (record 0).
    Failed to mount '/dev/sdn1': Input/output error
    NTFS is either inconsistent, or there is a hardware fault, or it's a SoftRAID/FakeRAID hardware. In the first case run chkdsk /f on Windows then reboot into Windows twice. The usage of the /f parameter is very important! If the device is a SoftRAID/FakeRAID then first activate it and mount a different device under the /dev/mapper/ directory, (e.g. /dev/mapper/nvidia_eahaabcc1). Please see the 'dmraid' documentation for more details. '
Any help is appreciated.
trurl Posted September 7, 2021
9 minutes ago, snuffy47 said:
> my external drive
Is it NTFS? If so, put it in Windows and see if it can fix it.
snuffy47 Posted September 10, 2021 Author
Well, your suggestion worked.
My drives are in hand. I am going to exchange 3 now.
One thing that popped up that I have never seen before is this, which I guess means things are really messed up:
    Event: unRAID file corruption
    Subject: Notice [TOWER] - bunker verify command
    Description: Found 7 files with BLAKE2 hash key corruption
    Importance: alert
    BLAKE2 hash key mismatch, /mnt/disk3/movies/A-X-L (2018)/A-X-L (2018).mkv is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/50 50 (2011)/50 50 (2011).mkv is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/Aquaman (2018)/Aquaman (2018).mkv is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/Postcards from the Edge (1990)/Postcards from the Edge (1990).mkv is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/Legend of Tarzan, The (2016)/The Legend of Tarzan2016.ISO is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/Mandy (2018)/Mandy (2018).avi is corrupted
    BLAKE2 hash key mismatch, /mnt/disk3/movies/The Chronicles of Narnia Prince Caspian (2008)/The Chronicles of Narnia Prince Caspian (2008).mkv is corrupted
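For context on what a "hash key mismatch" means, here is a minimal Python sketch of the general idea: hash the file now and compare it with a hash recorded earlier. This is not the tool that produced the notification above. It uses `hashlib.blake2b` from the standard library with its default digest size, and the helper names, the temporary file, and the way the recorded hash is kept are all assumptions for illustration.

```python
import hashlib
import os
import tempfile


def blake2_hex(path, chunk_size=1024 * 1024):
    """Return the BLAKE2b hex digest of a file, read in chunks."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_corrupted(path, recorded_hex):
    """True when the file's current hash differs from the recorded one."""
    return blake2_hex(path) != recorded_hex


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "movie.mkv")  # stand-in file, not a real movie
    with open(path, "wb") as handle:
        handle.write(b"original contents")
    recorded = blake2_hex(path)
    print("unchanged file corrupted:", is_corrupted(path, recorded))

    # Simulate a single changed byte, as a bad sector read might cause.
    with open(path, "wb") as handle:
        handle.write(b"original c0ntents")
    print("changed file corrupted:", is_corrupted(path, recorded))
```

A mismatch only says the bytes read today differ from the bytes hashed earlier; it does not say whether the disk, the cabling, or something else changed them.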
snuffy47 Posted September 10, 2021 Author
Well, things are rebuilding.....
Watching the drive, I am very unsure why this is so high. It's what the old drive indicated as a failure mode, but the new drive is not flagging anything.
    #  ATTRIBUTE NAME       FLAG    VALUE  WORST  THRESHOLD  TYPE      UPDATED  FAILED  RAW VALUE
    1  Raw read error rate  0x000f  079    076    006        Pre-fail  Always   Never   74902826
trurl Posted September 11, 2021
I edited your post and put that SMART report into a code block instead of a quote block as you had it. Now it lines up under the headings. Something to consider for future posts.
What model is that disk?
Post new diagnostics.