drawde

Members
  • Posts

    332

Converted

  • Gender
    Undisclosed

drawde's Achievements

Contributor

Contributor (5/14)

0

Reputation

  1. not sure if there's a way to download a SMART self-test but got the following:

     Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
     # 1  Extended offline  Completed without error  00%        2756             -

     Looks good to me. Going to run a preclear just for peace of mind before adding back into action (possibly as a 2nd parity for now). Thanks!

     EDIT: hmm, doesn't seem to want to mount under unassigned devices so i can run a preclear on it.. getting the following:

     Jul 20 17:31:19 Tower unassigned.devices: Adding disk '/dev/sdt1'...
     Jul 20 17:31:19 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdt1' '/mnt/disks/ST16000NM001G-2KK103_ZL2ATVRA'
     Jul 20 17:31:19 Tower kernel: XFS (sdt1): Mounting V5 Filesystem
     Jul 20 17:31:19 Tower kernel: XFS (sdt1): Log inconsistent (didn't find previous header)
     Jul 20 17:31:19 Tower kernel: XFS (sdt1): failed to find log head
     Jul 20 17:31:19 Tower kernel: XFS (sdt1): log mount/recovery failed: error -5
     Jul 20 17:31:19 Tower kernel: XFS (sdt1): log mount failed
     Jul 20 17:31:19 Tower unassigned.devices: Mount of '/dev/sdt1' failed: 'mount: /mnt/disks/ST16000NM001G-2KK103_ZL2ATVRA: can't read superblock on /dev/sdt1. '

     i tried xfs_repair -v /dev/sdt1 but it just started spamming .'s. I'll wait until parity is finished rebuilding on my other drive and try again later.
  2. Next time I will remember to capture the diagnostics immediately. However I also attached logs that go back to April, not sure if there is anything useful there, but you can see when the drive actually first started having issues yesterday.
  3. ID#  ATTRIBUTE_NAME       FLAGS   VALUE  WORST  THRESH  FAIL  RAW_VALUE
       1  Raw_Read_Error_Rate  POSR--  071    064    044     -     12246579
       7  Seek_Error_Rate      POSR--  080    060    045     -     92287026

     Thanks! Is that raw value the actual number of sectors? Are numbers this big consistent with the drive experiencing some failures and being disabled only once?
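On the raw-value question: for Seagate drives, the raw values of Raw_Read_Error_Rate and Seek_Error_Rate are commonly reported (a community interpretation, not something confirmed in this thread) to be 48-bit fields where the upper 16 bits count errors and the lower 32 bits count total operations. A minimal sketch under that assumption, splitting the two values quoted above; the function name is made up for illustration:

```python
from typing import Tuple


def split_seagate_raw(raw: int) -> Tuple[int, int]:
    """Split a 48-bit raw value into (errors, operations).

    ASSUMPTION: upper 16 bits = error count, lower 32 bits = operation count.
    This layout is a widely repeated community reading, not vendor documentation.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations


for name, raw in [("Raw_Read_Error_Rate", 12246579), ("Seek_Error_Rate", 92287026)]:
    errors, operations = split_seagate_raw(raw)
    print(f"{name}: errors={errors} operations={operations}")
```

Both values are below 2**32, so under this reading the error part is 0 for each, and the large numbers would be operation counts rather than failed sectors.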
  4. man, i really am messing up lol.. the drive in question is ST16000NM001G-2KK103 (sdt), also the only 16tb drive.
  5. LOL sorry, yes i did have a question. Is the hdd likely dying or could it be something else? i was never good at deciphering the smart reports, any feedback or tips would be appreciated. hardware-wise all i've done so far is reseat the drive. but took it out of service already. It's only been a few months. drive is still attached and running an extended SMART currently. if smart comes up good was thinking of running a preclear or two on it.
  6. I messed up and forgot to get diagnostics before rebooting (doh!) and I'm currently rebuilding parity on a new drive. I've attached diagnostics anyway. I do have syslogs dating back to April that are hopefully helpful though.

     tower-diagnostics-20210719-2046.zip syslog.txt.log
  7. hello all, trying to move some data around and i'm getting the following error:

     I: 2021/03/15 23:16:14 core.go:710: Command Started: (src: /mnt/disk10) rsync -avPR -X "TV/Show" "/mnt/disk6/"
     I: 2021/03/15 23:16:15 core.go:767: command:retcode(0):exitcode(0)
     I: 2021/03/15 23:16:15 core.go:1028: Command Finished
     I: 2021/03/15 23:16:15 core.go:1041: Current progress: 100.00% done ~ 0s left (10792.87 MB/s)
     I: 2021/03/15 23:16:15 core.go:972: removing:(rm -rf "/mnt/disk10/TV/Neds Declassified School Survival Guide")
     W: 2021/03/15 23:16:15 shell.go:51: transferProgress::(rm: cannot remove '/mnt/disk10/TV/Show Guide/banner.jpg': Read-only file system)
     W: 2021/03/15 23:16:15 shell.go:51: transferProgress::(rm: cannot remove '/mnt/disk10/TV/Show Guide/fanart.jpg': Read-only file system)
     W: 2021/03/15 23:16:15 shell.go:99: transferProgress:: waitError: exit status 1
     W: 2021/03/15 23:16:15 core.go:985: Unable to remove source folder (/mnt/disk10/TV/Show): exit status 1

     it appears the move worked: the files exist on the new drive, but they are also still on the source drive.
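The `Read-only file system` errors mean disk10 was read-only by the time the `rm` ran, which fits the copy succeeding while the source cleanup failed. A minimal sketch, assuming Linux `/proc/mounts`-style text, that reports whether a mount point carries the `ro` option. The sample lines and device names below are hypothetical and not taken from this server:

```python
from typing import Optional


def is_read_only(mounts_text: str, mount_point: str) -> Optional[bool]:
    """Return True/False for the 'ro' option of mount_point, or None if it is not listed."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # If a mount point appears more than once, the last entry wins.
        result = "ro" in fields[3].split(",")
    return result


# Hypothetical sample in /proc/mounts format (not from the poster's system).
SAMPLE = (
    "/dev/md10 /mnt/disk10 xfs ro,noatime,nodiratime 0 0\n"
    "/dev/md6 /mnt/disk6 xfs rw,noatime,nodiratime 0 0\n"
)

print(is_read_only(SAMPLE, "/mnt/disk10"))
print(is_read_only(SAMPLE, "/mnt/disk6"))
```

On a live system the same function could be fed the real table, for example `is_read_only(open("/proc/mounts").read(), "/mnt/disk10")`.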
  8. is this hitting million CPU suddenly for anyone else? had to stop it. every time i turn it on, my CPU usage goes through the roof.
  9. i'm having the same issue. thoughts? i've seen this in the past, i think OVMF doesn't work for me, as in other VMs i've also had to use SeaBIOS. EDIT: nevermind, seems the instructions have been updated. followed and working now.
  10. After my last parity check I'm seeing some read errors on one of my drives. I did an extended SMART test and they did not increase, and it's been a few days since my parity check and everything seems to be working okay. anything of concern?

     tower-smart-20191003-2355.zip tower-diagnostics-20191004-0356.zip
  11. CPU temp missing all of a sudden for anyone else?
  12. hello all, had unraid reboot on me randomly today. i have a script that copies the log to another location every 5 minutes, didn't see anything in there. when it came back online, Fix Common Problems reported an issue:

     Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged.

     I didn't have mcelog installed, so I installed it. But I guess this only logs going forward? In any case, my logs from after the reboot (before mcelog was installed) showed the following message:

     May 28 00:16:00 Tower kernel: smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
     May 28 00:16:00 Tower kernel: Performance Events: Fam17h core perfctr, AMD PMU driver.
     May 28 00:16:00 Tower kernel: ... version: 0
     May 28 00:16:00 Tower kernel: ... bit width: 48
     May 28 00:16:00 Tower kernel: ... generic registers: 6
     May 28 00:16:00 Tower kernel: ... value mask: 0000ffffffffffff
     May 28 00:16:00 Tower kernel: ... max period: 00007fffffffffff
     May 28 00:16:00 Tower kernel: ... fixed-purpose events: 0
     May 28 00:16:00 Tower kernel: ... event mask: 000000000000003f
     May 28 00:16:00 Tower kernel: rcu: Hierarchical SRCU implementation.
     May 28 00:16:00 Tower kernel: smp: Bringing up secondary CPUs ...
     May 28 00:16:00 Tower kernel: x86: Booting SMP configuration:
     May 28 00:16:00 Tower kernel: .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13
     May 28 00:16:00 Tower kernel: mce: [Hardware Error]: Machine check events logged
     May 28 00:16:00 Tower kernel: mce: [Hardware Error]: CPU 13: Machine Check: 0 Bank 5: bea0000000000108
     May 28 00:16:00 Tower kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff81654a1a MISC d012000101000000 SYND 4d000000 IPID 500b000000000
     May 28 00:16:00 Tower kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1559016932 SOCKET 0 APIC d microcode 8001126
     May 28 00:16:00 Tower kernel: #14 #15
     May 28 00:16:00 Tower kernel: smp: Brought up 1 node, 16 CPUs

     Is the zenstates fix still required for 6.7? I have my C-states in BIOS disabled already.

     tower-diagnostics-20190528-0431.zip
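The `Bank 5: bea0000000000108` value in the log above is the machine-check status register for that bank. A minimal sketch that decodes only the top-level flag bits common to the x86 machine-check architecture (bits 63 to 57) and converts the `TIME` field from Unix epoch seconds. It does not interpret the AMD-specific fields or the error code, and the function names are illustrative:

```python
from datetime import datetime, timezone

# Architectural flag bits of an MCi_STATUS register (bit position -> name).
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN", 59: "MISCV", 58: "ADDRV", 57: "PCC"}


def decode_mci_status(status: int) -> dict:
    """Return the top-level status flags as a name -> bool mapping."""
    return {name: bool((status >> bit) & 1) for bit, name in FLAGS.items()}


def mce_time_utc(epoch_seconds: int) -> str:
    """Format the TIME field (Unix epoch seconds) as a UTC timestamp."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


flags = decode_mci_status(0xBEA0000000000108)
print([name for name, is_set in flags.items() if is_set])
print(mce_time_utc(1559016932))
```

With this decoding the record is valid (VAL), uncorrected (UC) and marked processor-context-corrupt (PCC), with OVER clear, and the TIME field converts to 2019-05-28 04:15:32 UTC.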
  13. how concerned should i be? just upgraded to 6.7 a few hours ago, not sure if related but saw these errors in my syslog after CA updated my dockers (usual on sunday AM).

     May 12 00:19:32 Tower rc.diskinfo[4610]: SIGHUP received, forcing refresh of disks info.
     May 12 00:19:32 Tower kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
     May 12 00:19:32 Tower kernel: Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
     May 12 00:19:32 Tower kernel: (Note that use of the override may cause unknown side effects.)
     May 12 00:19:37 Tower kernel: sd 13:0:0:0: [sdb] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
     May 12 00:19:37 Tower kernel: sd 13:0:0:0: [sdb] tag#4 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
     May 12 00:19:37 Tower kernel: print_req_error: I/O error, dev sdb, sector 3907028992
     May 12 00:19:37 Tower kernel: sd 13:0:3:0: [sde] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
     May 12 00:19:37 Tower kernel: sd 13:0:3:0: [sde] tag#1 CDB: opcode=0x88 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
     May 12 00:19:37 Tower kernel: print_req_error: I/O error, dev sde, sector 7814036992
     May 12 00:19:41 Tower kernel: sd 13:0:1:0: [sdc] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
     May 12 00:19:41 Tower kernel: sd 13:0:1:0: [sdc] tag#2 CDB: opcode=0x88 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
     May 12 00:19:41 Tower kernel: print_req_error: I/O error, dev sdc, sector 7814036992
     May 12 00:19:46 Tower kernel: sd 13:0:11:0: [sdm] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
     May 12 00:19:46 Tower kernel: sd 13:0:11:0: [sdm] tag#23 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
     May 12 00:19:46 Tower kernel: print_req_error: I/O error, dev sdm, sector 3907028992
     May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP received, forcing refresh of disks info.
     May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:46 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info. 
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP received, forcing refresh of disks info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:47 Tower rc.diskinfo[4610]: SIGHUP ignored - already refreshing disk info.
     May 12 00:19:48 Tower rc.diskinfo[4610]: SIGHUP received, forcing refresh of disks info.

     the webUI does not show any errors though next to any of the disks and all disks are green. the rc.diskinfo and the ECC disabled i've seen before. I don't have ECC ram so i dont think that's something i should care about, and the rc.diskinfo i believe is just the preclear plugin, which i'm not running so i'm not too worried about. mostly the print_req_error I/O error i'm concerned about. it's the same 2 sectors across 4 different drives?

     May 12 00:19:37 Tower kernel: print_req_error: I/O error, dev sdb, sector 3907028992
     May 12 00:19:46 Tower kernel: print_req_error: I/O error, dev sdm, sector 3907028992
     May 12 00:19:37 Tower kernel: print_req_error: I/O error, dev sde, sector 7814036992
     May 12 00:19:41 Tower kernel: print_req_error: I/O error, dev sdc, sector 7814036992

     tower-diagnostics-20190512-0508.zip
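On the "same 2 sectors across 4 drives" observation: the `CDB:` lines in the syslog carry the raw SCSI read commands, and decoding them shows which block and how many blocks each failed request asked for. A minimal sketch that handles only the two opcodes seen in this log, READ(10) (0x28) and READ(16) (0x88). The function name is illustrative, and the 512-byte sector size used for the byte offset is an assumption:

```python
from typing import Tuple


def decode_read_cdb(cdb_hex: str) -> Tuple[str, int, int]:
    """Decode a READ(10) or READ(16) CDB given as space-separated hex bytes.

    Returns (command name, starting LBA, number of blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    opcode = cdb[0]
    if opcode == 0x28 and len(cdb) == 10:
        # READ(10): bytes 2-5 = LBA, bytes 7-8 = transfer length
        return "READ(10)", int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")
    if opcode == 0x88 and len(cdb) == 16:
        # READ(16): bytes 2-9 = LBA, bytes 10-13 = transfer length
        return "READ(16)", int.from_bytes(cdb[2:10], "big"), int.from_bytes(cdb[10:14], "big")
    raise ValueError(f"unsupported CDB: {cdb_hex}")


SECTOR_BYTES = 512  # ASSUMPTION: 512-byte logical sectors

for cdb_hex in (
    "28 00 e8 e0 88 00 00 00 08 00",
    "88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00",
):
    name, lba, blocks = decode_read_cdb(cdb_hex)
    print(f"{name}: lba={lba} blocks={blocks} byte_offset={lba * SECTOR_BYTES}")
```

The decoded LBAs match the sectors in the `print_req_error` lines, and each request is for 8 blocks. At 512 bytes per sector the offsets come to roughly 2.0 TB and 4.0 TB, so one possible reading is that these are reads at the same position near the end of same-sized drives, rather than unrelated bad sectors that happen to share a number.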