
xtrap225


Posts posted by xtrap225

  1. stopped the array (i don't know if i had to do that or not, kinda wish i tried before doing that).

     

    then ran 'mount -o remount,size=10G /run'

     

    it seemed to work. but i doubt that will survive a reboot.  anyone know how to make that change permanent?

     

    can i put it in my /boot/config/go or /boot/config/extra.cfg? (i don't know how to use that one; i tried to look it up in the manual but couldn't find it.)

     

    ** so i didn't have to stop my array to make the change, cause i did it again and made it 256M after thinking for more than a second and realizing i was potentially wasting a ton of RAM **
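
A minimal sketch of one way to make the remount persist, assuming the go file is the right place for it (that is an assumption, not something confirmed in this thread). The function name `add_remount_line` and the `GO_FILE` variable are made up for illustration; it appends the command only if it is not already in the file.

```shell
# Sketch only: append the /run remount to a go file if it is not already there.
# GO_FILE defaults to the Unraid path from the post; point it at a scratch copy to try it out.
GO_FILE="${GO_FILE:-/boot/config/go}"
REMOUNT_CMD='mount -o remount,size=256M /run'

add_remount_line() {
    local file="$1"
    # -x matches the whole line, -F treats the command as a fixed string
    if ! grep -qxF "$REMOUNT_CMD" "$file" 2>/dev/null; then
        printf '%s\n' "$REMOUNT_CMD" >> "$file"
    fi
}
```

Calling `add_remount_line "$GO_FILE"` twice leaves a single copy of the line, so it should be safe to re-run.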

  2. i have many computers(windows, mac, linux) and have tried all their respective browsers, chrome, firefox, edge, safari.

     

    i can't just bring down the server to try safe mode. i'm not sure i have used safe mode on unraid; is that an option when it boots, or is it like maintenance mode?

     

    i will have to plan that sort of shutdown.

  3. image.thumb.png.7534669fd488e4aa33b330b1779f27a5.png

     

    the log button on my unraid webpage hasn't worked for several versions. that is the page at http://X.X.X.X/webterminal/syslog

    syslog itself is working, both internally (from the command line of course, and also from the browser at http://X.X.X.X/Syslog) and externally (but really internal) to an observium docker. because of that i haven't bothered to open a forum post for help, but i thought it would be nice to fix this issue now.

     

    if anyone can help it would be appreciated.

     

     

    UNRAID-diagnostics-20221212-1418.zip

  4. i want to add some tangential information for others that are reading this thread.

     

    i reinstalled the dynamix file integrity plugin again and it slowed my parity right back down.  its running blake3 btw.  my cpu is Intel® Xeon W-2235 so it has the proper avx extensions for it.
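
A quick sketch for double checking the CPU flag side of this from the command line. The helper name `has_cpu_flag` is made up, and which flags the plugin's BLAKE3 build actually benefits from is an assumption here; `avx`, `avx2` and `avx512f` are just the standard names as they appear in /proc/cpuinfo.

```shell
# Sketch only: report whether a CPU flag shows up in a cpuinfo-style file.
has_cpu_flag() {
    local flag="$1" src="${2:-/proc/cpuinfo}"
    # look only at the first "flags" line and match the flag as a whole word
    grep -m1 '^flags' "$src" 2>/dev/null | grep -qw -- "$flag"
}

for f in avx avx2 avx512f; do
    if has_cpu_flag "$f"; then
        echo "$f: reported"
    else
        echo "$f: not reported"
    fi
done
```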

     

    so  i excluded the time-machine backup shares from its check, which is the only place i ever had so-called bitrot, but i am sure those were false positives.   as i have used them for restores more than a few times and never had any issues.

     

    i am unsure whether you are supposed to check time-machine shares at all, but i don't think it works properly with those.  if someone else thinks otherwise and has the same issue, you may wish to go over to the support thread for that plugin and report it.

     

    but once i excluded those, my parity is running ~150MBps on the newer drives and ~115MBps on the older drives.   it has increased my average system load to around 4, but that is just while it catches up i assume.

     

    also i am re-enabling spectre mitigations and removing that plugin. (mainly cause i didn't see any difference with them enabled and the newer 6.11.0 has even more mitigations built in)

    and i will update to 6.11.0 from 6.10.3 once this parity check, which i haven't completed in so long, is done.

     

    however i am now a little concerned. yesterday when i ran the upgrade assistant it said i was all good to go; now something has changed and it says nerdpack and devpack aren't compatible and should be uninstalled. so i might need to wait for an update on those, assuming that happens. i am pretty sure i need those, so i don't want to just rip them out.  i am lucky i didn't get the upgrade notification while i was doing my maintenance yesterday, cause i probably would have just done it and had major issues.

     

    i hope those packs (nerd & dev), aren't being abandoned and that they get an update soon that makes them compatible.

     

    so i don't want to reboot again until parity is done, and its about 50% done now, which is sooooooo much faster than it was running.

  5. i think i got it.

     

    all previous posts were with docker disabled.

     

    i have removed the dynamix file integrity plugin, and started up docker.

     

    so far so good. load average was floating around ~2.4 to ~3, and has now gone down to 2.2 and lower (~1.71).

     

    so that is ~12 dockers running and the parity check

    image.png.98d654080de7a600dfa3674da1c9c3ac.png

     

    so thank you for your help, but i believe all is well again. i guess that plugin is broken or defunct now, or maybe it isn't and i just need to set it up better if it's still useful.

     

  6. i know i said i would wait, but with just the SMB and VM services enabled (no VMs actually on), this is after 15min of parity check:

    image.thumb.png.eb8349c4629553aa930f4d6b1590741d.png

     

    i really thought it would be something to do with docker

    i stopped automatic timemachine backups, cause i found one had kicked off, and cancelled it.

    i stopped my sonarr, radarr, and lidarr services on another machine which could cause network traffic to the unraid server's array.

    it didn't help

    image.thumb.png.a8b99db534949129909ff92b6032bf1e.png

    my load average has jumped back up to ~26 and possibly rising.
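
For anyone wanting to track this over time instead of eyeballing it, here is a small sketch that prints a timestamped line with the 1, 5 and 15 minute load averages. It reads the standard Linux /proc/loadavg file; the function name `log_loadavg` is made up for illustration.

```shell
# Sketch only: print a timestamped line with the 1/5/15-minute load averages.
log_loadavg() {
    local src="${1:-/proc/loadavg}"
    local one five fifteen rest
    # /proc/loadavg looks like: "26.10 24.00 20.50 3/900 1234"
    read -r one five fifteen rest < "$src"
    printf '%s load: %s %s %s\n' "$(date '+%F %T')" "$one" "$five" "$fifteen"
}
```

Running `log_loadavg` in a loop with a `sleep 60` between calls, redirected to a file, would give a simple record to compare against each service being stopped or started.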

     

    dell-pc-diagnostics-20220924-1654.zip

  7. load average after reboot,

     

    nothing running array stopped = 0.15 to 0.3

     

    maint-mode parity started normally = 1.88

    also parity looks good.
    image.thumb.png.0c5b4cc0403586ecb2d15cdefff236bd.png

     

    array started normal (SMB still enabled), docker service disabled, vm service disabled. = ~3.02

    image.thumb.png.98d6c376f7a0df5eb26d9d95c31c44d4.png

    downloaded the diag after it was running like this for a  little bit.

     

    gonna stop it again, disable SMB, start it and see what happens, while i eagerly await your reply.

    dell-pc-diagnostics-20220924-1536.zip

  8. okay i stopped the docker service and no significant change still over 30 load average.

     

    stopped the vm service (even though there is only one vm and it wasn't on): same, still over 30 load average.

     

    i not just paused but cancelled the currently running parity and still no change.

     

    turned off array auto-start in prep for the reboot.  when it reboots and comes back, should i start in maint-mode? or just leave those services off, start the array normally and run a parity check for a while before uploading?

     

    how long is a while?

    did i miss turning off any other services, like smb or whatever? (i guess i would have to stop the array for that, so it might as well be after the reboot if you want it stopped.) i checked and no time-machine backups are running (really there is only one anyway, on the laptop i am writing this from).

  9. i have recently migrated to a new server and added two new parity drives; they are 12TB WD Red Plus NAS drives.

     

    they are internal to the server on the 'Intel Corporation C600/X79 series chipset SATA RAID Controller'

     

    besides the array there is the nvme cache and one drive for passthrough to a vm, which was off at the time the diag was taken.

     

    anyway the rest of the array is 22 disks via a 'Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)' which is connected to a NetApp DS4246, with IOM6 modules, (just using one).

     

    the drives are all seagate 3TB ST3000NM0033 except, 3 are 8TB WD RED NAS WD80EFAX drives.

     

    previously, about a month ago and already after switching to this server, i was getting ...

     

    Parity-Check | 2022-07-06, 00:28:08 | 8 TB | 21 hr, 28 min, 7 sec | 103.5 MB/s | OK | 0

     

    but that was before adding the new 12 TB parity on another internal controller.

     

    here is the parity sync for that.

     

    Parity-Sync | 2022-08-05, 08:18:10 | 12 TB | 1 day, 6 hr, 30 min, 13 sec | 109.3 MB/s | OK | 0

     

    but since then i have had super slow parity checks. i can't finish one, cause at the current speed it would take months.

     

    Total size:12 TB

    Elapsed time: 18 hours, 44 minutes

    Current position: 126 GB (1.0 %)

    Estimated speed: 1.2 MB/sec

    Estimated finish: 115 days, 7 hours, 58 minutes

    Sync errors corrected: 0
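
As a sanity check on that estimate, the arithmetic can be sketched in a few lines of shell: remaining bytes divided by the reported speed. The function name `eta_hms` is made up, and the inputs are the decimal (1 TB = 10^12 bytes) readings of the figures above, which is an assumption about how the page rounds them.

```shell
# Sketch only: eta_hms TOTAL_BYTES DONE_BYTES SPEED_BYTES_PER_SEC
# Prints the remaining time as days/hours/minutes using integer arithmetic.
eta_hms() {
    local total="$1" done_bytes="$2" speed="$3"
    local secs=$(( (total - done_bytes) / speed ))
    printf '%dd %dh %dm\n' $(( secs / 86400 )) $(( secs % 86400 / 3600 )) $(( secs % 3600 / 60 ))
}

# 12 TB total, 126 GB done, 1.2 MB/s
eta_hms 12000000000000 126000000000 1200000
```

With those inputs this prints `114d 12h 36m`, which is in the same ballpark as the 115 days shown on the page; the small difference is presumably down to the displayed speed being rounded.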

     

    i thought maybe my vm was slowing it  down so i turned that off, which crashes the whole system but thats a different issue. and since then i have not turned on the vm.

     

    how can i troubleshoot my slow parity? what should i look at?  i can't find any issues with the drives or their speeds, or with the controllers or their speeds. not sure where to look or what to test next.

     

    any help would be appreciated.

     

     

    dell-pc-diagnostics-20220924-1019.zip

  10. ever since i added the following to my /boot/config/go file to enable hardware transcoding in plex, i have been seeing GPU hangs in the syslog.

    #Setup drivers for hardware transcoding in Plex
    
    modprobe i915
    chown -R nobody:users /dev/dri
    chmod -R 777 /dev/dri

     

    here are the lines i'm seeing in the syslog

    Sep 22 23:25:22 WORK-PC kernel: [drm] GPU HANG: ecode 7:6:0xacdfbffe, reason: hang on vecs0, action: reset
    Sep 22 23:25:22 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    
    ...
    
    Sep 23 01:14:03 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:11 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:19 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:27 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:35 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:43 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:51 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:14:59 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    Sep 23 01:15:07 WORK-PC kernel: i915 0000:00:02.0: Resetting chip for hang on vecs0
    ...
    Sep 23 01:45:27 WORK-PC kernel: Plex Media Scan[28422]: segfault at b0 ip 000014cfbfe01097 sp 000014cfb19fe000 error 4 in libcrypto.so.1.0.0[14cfbfcf0000+204000]
    Sep 23 01:45:27 WORK-PC kernel: Code: 8b 4f 1c 31 d2 4c 89 e0 48 f7 f1 49 8b 07 48 63 ca 4c 8b 2c c8 4d 85 ed 74 37 49 8b 6f 08 48 8d 1c c8 90 49 ff 87 a0 00 00 00 <4d> 39 65 10 75 11 49 ff 47 68 49 8b 7d 00 4c 89 f6 ff d5 85 c0 74
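
To get a feel for how often this is happening, a one-line count of the reset messages can help. This is only a sketch: the function name `count_gpu_resets` is made up, and the default path is the /var/log/syslog file mentioned elsewhere in these posts.

```shell
# Sketch only: count i915 "Resetting chip for hang" lines in a syslog-style file.
count_gpu_resets() {
    grep -c 'Resetting chip for hang' "${1:-/var/log/syslog}"
}
```

Comparing the count before and after a change (for example after removing the `modprobe i915` line from the go file) gives a rough before/after measure.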

    when googling the error i found some people said that if you add some parameters to the kernel you can fix the issue.

    https://forum.manjaro.org/t/i915-gpu-hang-solved/37200/13

    Solved! Adding these parameters to the kernel is fixed: i915.modeset=1 i915.enable_rc6=1 i915.enable_fbc=1 i915.enable_guc_loading=1 i915.enable_guc_submission=1 i915.enable_huc=1 i915.enable_psr=1 i915.disable_power_well=0 i915.semaphores=1 It works in version 4.14 and 4.15

    on the other hand hoopster said in this thread that the Intel i5-9500 is not supported in kernel 4.19, but mine is an older cpu.

    Linux WORK-PC 4.19.56-Unraid #1 SMP Tue Jun 25 10:19:34 PDT 2019 x86_64 Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz GenuineIntel GNU/Linux
    Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz

    here is some more info if anyone thinks its relevant

    root@WORK-PC:~# systool -m i915 -av
    Module = "i915"
    
      Attributes:
        coresize            = "1261568"
        initsize            = "0"
        initstate           = "live"
        refcnt              = "0"
        taint               = ""
        uevent              = <store method only>
    
      Parameters:
        alpha_support       = "N"
        disable_display     = "N"
        disable_power_well  = "1"
        dmc_firmware_path   = "(null)"
        edp_vswing          = "0"
        enable_dc           = "-1"
        enable_dp_mst       = "Y"
        enable_dpcd_backlight= "N"
        enable_fbc          = "0"
        enable_guc          = "0"
        enable_gvt          = "N"
        enable_hangcheck    = "Y"
        enable_ips          = "1"
        enable_ppgtt        = "1"
        enable_psr          = "-1"
        error_capture       = "Y"
        fastboot            = "N"
        force_reset_modeset_test= "N"
        guc_firmware_path   = "(null)"
        guc_log_level       = "0"
        huc_firmware_path   = "(null)"
        invert_brightness   = "0"
        load_detect_test    = "N"
        lvds_channel_mode   = "0"
        mmio_debug          = "0"
        modeset             = "-1"
        nuclear_pageflip    = "N"
        panel_use_ssc       = "-1"
        prefault_disable    = "N"
        reset               = "2"
        vbt_firmware        = "(null)"
        vbt_sdvo_panel_type = "-1"
        verbose_state_checks= "Y"
    
      Sections:
        .altinstr_aux       = "0xffffffffa07a5b83"
        .altinstr_replacement= "0xffffffffa07a5638"
        .altinstructions    = "0xffffffffa07db727"
        .bss                = "0xffffffffa080ddc0"
        .data..read_mostly  = "0xffffffffa080da40"
        .data.once          = "0xffffffffa080d9d0"
        .data               = "0xffffffffa0809020"
        .exit.text          = "0xffffffffa07a5bdd"
        .fixup              = "0xffffffffa07a5bf4"
        .gnu.linkonce.this_module= "0xffffffffa080dac0"
        .init.text          = "0xffffffffa082f000"
        .note.Linux         = "0xffffffffa07a6024"
        .note.gnu.build-id  = "0xffffffffa07a6000"
        .orc_unwind         = "0xffffffffa07edebc"
        .orc_unwind_ip      = "0xffffffffa07dd1dc"
        .parainstructions   = "0xffffffffa07dc260"
        .rodata             = "0xffffffffa07a6080"
        .rodata.str1.1      = "0xffffffffa07bc59c"
        .smp_locks          = "0xffffffffa07dbc98"
        .strtab             = "0xffffffffa084c758"
        .symtab             = "0xffffffffa0830000"
        .text..refcount     = "0xffffffffa07a57d1"
        .text               = "0xffffffffa06fa000"
        .text.unlikely      = "0xffffffffa07a5c71"
        __bug_table         = "0xffffffffa080ad90"
        __ex_table          = "0xffffffffa080720c"
        __jump_table        = "0xffffffffa0809000"
        __ksymtab_gpl       = "0xffffffffa07a6040"
        __ksymtab_strings   = "0xffffffffa0807fd8"
        __param             = "0xffffffffa0807ab0"
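
Going by the Parameters list above, several of the options from the quoted Manjaro post (enable_rc6, enable_guc_loading, enable_guc_submission, enable_huc, semaphores) do not appear for this 4.19 i915 module, so passing them would probably not do anything useful here. A small sketch to check this directly against sysfs follows; the function name `check_module_params` is made up, and /sys/module/i915/parameters is the standard sysfs location for a loaded module's parameters.

```shell
# Sketch only: report which parameter names exist for a loaded kernel module.
# Usage: check_module_params DIR NAME...
check_module_params() {
    local dir="$1" p
    shift
    for p in "$@"; do
        if [ -e "$dir/$p" ]; then
            echo "$p: present"
        else
            echo "$p: missing"
        fi
    done
}

# parameter names taken from the quoted Manjaro post
check_module_params /sys/module/i915/parameters \
    modeset enable_rc6 enable_fbc enable_guc_loading enable_guc_submission \
    enable_huc enable_psr disable_power_well semaphores
```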

     

    work-pc-diagnostics-20190923-0557.zip

  11. i just want to bottom line this for anyone who jumped into their setup like me, because they wanted a bunch of storage on the cheap.

     

    i own a NetApp DS4246, with all 3TB sata drives in it.  using a Dell LSI SAS 9202-16e 6Gb/s SAS Host Bus Adapter to plug it into a pretty generic Lenovo desktop computer w/ 32GB of ram and an Intel i5-4570.

     

    anyway, the big mistake i made for years while running UnRaid was to think that i was supposed to use SAS or SCSI settings on the SMART setup because of the NetApp and/or the HBA, that is NOT correct.  pretty obviously when you think about it, SMART is internal to the drives and the drives are SATA so set your SMART up as SATA and it will work perfectly.

     

    No extra setup required.  (i just thought i would add this final note for anyone reading this thread).

  12. put that in my go file and started another smartctl long test on a different drive that i have less faith in, while waiting for my preclear to finish on sdp.

     

    sdp and sdm (the one i am running long test on now) are the only drives that when the system is rebooted lose their write-cache setting, and i have to use hdparm -W1 /dev/sdm and sdp to turn it back on. ( i have also now added that to the go file). 

     

    i am only making note of these things now so that people can benefit from this experience and so i will have the commands noted in this thread.

     

    root@WORK-PC:~# smartctl -d sat -t long -C /dev/sdm
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.56-Unraid] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in captive mode".
    Drive command "Execute SMART Extended self-test routine immediately in captive mode" successful.
    Testing has begun.
    Please wait 369 minutes for test to complete.
    Test will complete after Mon Sep  9 23:29:45 2019
    

    and here is a copy of my /boot/config/go file.

    root@WORK-PC:~# cat /boot/config/go
    #Setup smartd.conf
    mv /etc/smartd.conf /etc/smartd.conf.backup
    cp -p /boot/smartd.conf.bak /etc/smartd.conf
    chmod 644 /etc/smartd.conf
    chmod 755 /etc/rc.d/rc.smartd
    /etc/rc.d/rc.smartd start
    
    #enable write cache on sdm and sdp
    hdparm -W1 /dev/sdm
    hdparm -W1 /dev/sdp
    
    #Setup drivers for hardware transcoding in Plex
    modprobe i915
    chown -R nobody:users /dev/dri
    chmod -R 777 /dev/dri
    
    #!/bin/bash
    # Start the Management Utility
    /usr/local/sbin/emhttp &
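
One caveat with hard-coding sdm and sdp: sdX letters can change between boots, so the two hdparm lines may end up pointing at different drives. A hedged sketch of a slightly more defensive version is below. The function name `enable_write_cache` and the `HDPARM` override are made up for illustration, and the by-id path in the usage note is a placeholder, not a real device name from this system.

```shell
# Sketch only: turn on write cache for each listed device that actually exists.
# HDPARM can be overridden (e.g. HDPARM=echo) to dry-run without touching any drive.
HDPARM="${HDPARM:-hdparm}"

enable_write_cache() {
    local dev
    for dev in "$@"; do
        if [ -e "$dev" ]; then
            "$HDPARM" -W1 "$dev"
        else
            echo "skipping $dev: not present" >&2
        fi
    done
}
```

In the go file this could be called as `enable_write_cache /dev/disk/by-id/ata-EXAMPLE_SERIAL` (hypothetical name), since the /dev/disk/by-id links stay tied to the drive rather than to the detection order.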
    

    and for anyone with a netapp here is the /etc/smartd.conf i will be using.

    root@WORK-PC:~# cat /etc/smartd.conf
    #DEVICESCAN
    # Monitor LSI's disk SMART through SCSI generic
    /dev/sdb -d sat -a -s L/../../3/02
    /dev/sdc -d sat -a -s L/../../3/03
    /dev/sdd -d sat -a -s L/../../3/04
    /dev/sde -d sat -a -s L/../../3/05
    /dev/sdf -d sat -a -s L/../../3/06
    /dev/sdg -d sat -a -s L/../../3/07
    /dev/sdh -d sat -a -s L/../../3/08
    /dev/sdi -d sat -a -s L/../../3/09
    /dev/sdj -d sat -a -s L/../../3/10
    /dev/sdk -d sat -a -s L/../../3/11
    /dev/sdl -d sat -a -s L/../../3/12
    /dev/sdm -d sat -a -s L/../../3/13
    /dev/sdn -d sat -a -s L/../../3/14
    /dev/sdo -d sat -a -s L/../../3/15
    /dev/sdp -d sat -a -s L/../../3/16
    /dev/sdq -d sat -a -s L/../../3/17
    /dev/sdr -d sat -a -s L/../../3/18
    /dev/sds -d sat -a -s L/../../3/19
    /dev/sdt -d sat -a -s L/../../3/20
    /dev/sdu -d sat -a -s L/../../3/21
    /dev/sdv -d sat -a -s L/../../3/22
    /dev/sdw -d sat -a -s L/../../3/23
    /dev/sdx -d sat -a -s L/../../3/24
    /dev/sdy -d sat -a -s L/../../3/25
    /dev/sdz -d sat -a -s L/../../3/26

    here is the /var/log/syslog of the smartd loading up with the new config

    root@WORK-PC:~# less /var/log/syslog 
    
    Sep  9 17:12:01 WORK-PC smartd[9481]: Device: /dev/sdu [SAT], not found in smartd database.
    Sep  9 17:12:01 WORK-PC smartd[9481]: Device: /dev/sdu [SAT], not capable of SMART Health Status check
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdu [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdv [SAT], opened
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdv [SAT], SEAGATE ST3000NM0033, S/N:Z1Y1T5V8, WWN:5-000c50-0675c7981, FW:NS00, 3.00 TB
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdv [SAT], not found in smartd database.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdv [SAT], not capable of SMART Health Status check
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdv [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdw [SAT], opened
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdw [SAT], SEAGATE ST3000NM0033, S/N:Z1Y1X2MN, WWN:5-000c50-0675c76b3, FW:NS00, 3.00 TB
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdw [SAT], not found in smartd database.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdw [SAT], not capable of SMART Health Status check
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdw [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdx [SAT], opened
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdx [SAT], SEAGATE ST3000NM0033, S/N:Z1Y1X2JX, WWN:5-000c50-0675c876f, FW:NS00, 3.00 TB
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdx [SAT], not found in smartd database.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdx [SAT], not capable of SMART Health Status check
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdx [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdy [SAT], opened
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdy [SAT], SEAGATE ST3000NM0033, S/N:Z1Y1X2NR, WWN:5-000c50-0675c859a, FW:NS00, 3.00 TB
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdy [SAT], not found in smartd database.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdy [SAT], not capable of SMART Health Status check
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdy [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdz [SAT], opened
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdz [SAT], SEAGATE ST3000NM0033, S/N:Z1Y1T69A, WWN:5-000c50-0675c6f7d, FW:NS00, 3.00 TB
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdz [SAT], not found in smartd database.
    Sep  9 17:12:02 WORK-PC smartd[9481]: Device: /dev/sdz [SAT], not capable of SMART Health Status check
    Sep  9 17:12:03 WORK-PC smartd[9481]: Device: /dev/sdz [SAT], is SMART capable. Adding to "monitor" list.
    Sep  9 17:12:03 WORK-PC smartd[9481]: Monitoring 25 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
    Sep  9 17:12:04 WORK-PC smartd[9481]: Device: /dev/sdm [SAT], previous self-test was interrupted by the host with a reset
    Sep  9 17:12:05 WORK-PC smartd[9481]: Device: /dev/sdp [SAT], previous self-test was interrupted by the host with a reset
    Sep  9 17:12:06 WORK-PC smartd[10307]: smartd has fork()ed into background mode. New PID=10307.
    Sep  9 17:12:06 WORK-PC smartd[10307]: file /run/smartd.pid written containing PID 10307
    

     

  13. now if i figure out how to properly modify the smartd.conf to properly detect the drives with the -d sat instead of doing DEVICESCAN which is putting in scsi maybe i can have a better unraid smart experience in general.

     

    not sure if i should use /dev/sdp or the equivalent /dev/sgX  but assuming ill stick with 'sd' here is what my /etc/smartd.conf would look like.

     

    #DEVICESCAN
    # Monitor LSI's disk SMART through SCSI generic
    /dev/sdb -d sat -a -s L/../../3/02
    /dev/sdc -d sat -a -s L/../../3/03
    /dev/sdd -d sat -a -s L/../../3/04
    /dev/sde -d sat -a -s L/../../3/05
    /dev/sdf -d sat -a -s L/../../3/06
    /dev/sdg -d sat -a -s L/../../3/07
    /dev/sdh -d sat -a -s L/../../3/08
    /dev/sdi -d sat -a -s L/../../3/09
    /dev/sdj -d sat -a -s L/../../3/10
    /dev/sdk -d sat -a -s L/../../3/11
    /dev/sdl -d sat -a -s L/../../3/12
    /dev/sdm -d sat -a -s L/../../3/13
    /dev/sdn -d sat -a -s L/../../3/14
    /dev/sdo -d sat -a -s L/../../3/15
    /dev/sdp -d sat -a -s L/../../3/16
    /dev/sdq -d sat -a -s L/../../3/17
    /dev/sdr -d sat -a -s L/../../3/18
    /dev/sds -d sat -a -s L/../../3/19
    /dev/sdt -d sat -a -s L/../../3/20
    /dev/sdu -d sat -a -s L/../../3/21
    /dev/sdv -d sat -a -s L/../../3/22
    /dev/sdw -d sat -a -s L/../../3/23
    /dev/sdx -d sat -a -s L/../../3/24
    /dev/sdy -d sat -a -s L/../../3/25
    /dev/sdz -d sat -a -s L/../../3/26
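
One thing to watch in the list above: as far as I know from the smartd.conf man page, the last two fields of the `-s` schedule are the day of the week (1 = Monday to 7 = Sunday) and the hour (00 to 23). That would mean the entries ending in 24, 25 and 26 never match, so those three drives would not get their long test. Below is a sketch that generates the same staggered schedule but wraps into the next day once the hour passes 23. The function name `gen_smartd_lines` is made up for illustration.

```shell
# Sketch only: print one smartd.conf line per drive letter, one hour apart.
# Usage: gen_smartd_lines START_DAY START_HOUR LETTER...
# START_DAY is 1 (Monday) to 7 (Sunday); START_HOUR is 0 to 23, without a leading zero.
gen_smartd_lines() {
    local day="$1" hour="$2" letter
    shift 2
    for letter in "$@"; do
        printf '/dev/sd%s -d sat -a -s L/../../%d/%02d\n' "$letter" "$day" "$hour"
        hour=$(( hour + 1 ))
        if [ "$hour" -gt 23 ]; then
            # past 23:00, continue at 00:00 on the next weekday
            hour=0
            day=$(( day % 7 + 1 ))
        fi
    done
}

# same starting point as the list above: Wednesday (3) at 02:00, sdb through sdz
gen_smartd_lines 3 2 b c d e f g h i j k l m n o p q r s t u v w x y z
```

With this, sdb through sdw come out the same as above, while sdx, sdy and sdz land on day 4 at 00, 01 and 02.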

  14. actually how does this look w/ regards to smart output done through the netapp still but using proper LSI Fusion MPT SAS2 driver settings?

     

    does this look like it has the full output?

     

    root@WORK-PC:~# smartctl -d sat -a /dev/sdp
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.56-Unraid] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Device Model:     SEAGATE ST3000NM0033
    Serial Number:    Z1Y1X1PX
    LU WWN Device Id: 5 000c50 0675cb0e7
    Firmware Version: NS00
    User Capacity:    3,000,592,982,016 bytes [3.00 TB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    7200 rpm
    Form Factor:      3.5 inches
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   ACS-2 (minor revision not indicated)
    SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
    Local Time is:    Mon Sep  9 13:10:00 2019 EDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART Status not supported: Incomplete response, ATA output registers missing
    SMART overall-health self-assessment test result: PASSED
    Warning: This result is based on an Attribute check.

    General SMART Values:
    Offline data collection status:  (0x82) Offline data collection activity
                                            was completed without error.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                (  105) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   1) minutes.
    Extended self-test routine
    recommended polling time:        ( 369) minutes.
    Conveyance self-test routine
    recommended polling time:        (   2) minutes.
    SCT capabilities:              (0x50bf) SCT Status supported.
                                            SCT Error Recovery Control supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   081   063   044    Pre-fail  Always       -       156604416
      3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  Always       -       0
      4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       56
      5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       5
      7 Seek_Error_Rate         0x000f   086   060   030    Pre-fail  Always       -       406887179
      9 Power_On_Hours          0x0032   072   072   000    Old_age   Always       -       24771
     10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       23
    184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   095   095   000    Old_age   Always       -       5
    188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       8590065667
    189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
    190 Airflow_Temperature_Cel 0x0022   064   053   045    Old_age   Always       -       36 (Min/Max 34/36)
    191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       20
    193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1083
    194 Temperature_Celsius     0x0022   036   047   000    Old_age   Always       -       36 (0 21 0 0 0)
    195 Hardware_ECC_Recovered  0x001a   034   019   000    Old_age   Always       -       156604416
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

    SMART Error Log Version: 1
    ATA Error Count: 8 (device log contains only the most recent five errors)
            CR = Command Register [HEX]
            FR = Features Register [HEX]
            SC = Sector Count Register [HEX]
            SN = Sector Number Register [HEX]
            CL = Cylinder Low Register [HEX]
            CH = Cylinder High Register [HEX]
            DH = Device/Head Register [HEX]
            DC = Device Command Register [HEX]
            ER = Error register [HEX]
            ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.

    Error 8 occurred at disk power-on lifetime: 24721 hours (1030 days + 1 hours)
      When the command that caused the error occurred, the device was in an unknown state.

      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      04 51 00 00 00 00 00  Error: ABRT

      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      00 00 00 00 00 00 00 ff  15d+17:03:05.309  NOP [Abort queued commands]
      b0 d4 00 82 4f c2 00 00  15d+12:00:01.038  SMART EXECUTE OFF-LINE IMMEDIATE
      ec 00 01 00 00 00 00 00  15d+12:00:00.942  IDENTIFY DEVICE
      ec 00 01 00 00 00 00 00  15d+12:00:00.941  IDENTIFY DEVICE
      e5 00 00 00 00 00 00 00  15d+11:59:00.770  CHECK POWER MODE

    Error 7 occurred at disk power-on lifetime: 24708 hours (1029 days + 12 hours)
      When the command that caused the error occurred, the device was in an unknown state.

      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      04 51 00 00 00 00 00  Error: ABRT

      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      00 00 00 00 00 00 00 ff  15d+04:40:50.041  NOP [Abort queued commands]
      b0 d4 00 82 4f c2 00 00  14d+23:37:45.768  SMART EXECUTE OFF-LINE IMMEDIATE
      ec 00 01 00 00 00 00 00  14d+23:37:45.673  IDENTIFY DEVICE
      ec 00 01 00 00 00 00 00  14d+23:37:45.672  IDENTIFY DEVICE
      e5 00 00 00 00 00 00 00  14d+23:18:07.902  CHECK POWER MODE

    Error 6 occurred at disk power-on lifetime: 24687 hours (1028 days + 15 hours)
      When the command that caused the error occurred, the device was in an unknown state.

      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      04 51 00 00 00 00 00  Error: ABRT

      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      00 00 00 00 00 00 00 ff  14d+07:40:17.431  NOP [Abort queued commands]
      b0 d4 00 82 4f c2 00 00  14d+02:37:13.161  SMART EXECUTE OFF-LINE IMMEDIATE
      ec 00 01 00 00 00 00 00  14d+02:37:13.066  IDENTIFY DEVICE
      ec 00 01 00 00 00 00 00  14d+02:37:13.065  IDENTIFY DEVICE
      e5 00 00 00 00 00 00 00  14d+02:31:51.028  CHECK POWER MODE

    Error 5 occurred at disk power-on lifetime: 24675 hours (1028 days + 3 hours)
      When the command that caused the error occurred, the device was in standby mode.

      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 00 00 ff ff ff 4f 00  13d+19:52:05.743  READ FPDMA QUEUED
      e2 00 00 00 00 00 00 00  13d+19:52:04.691  STANDBY
      b0 d6 01 e0 4f c2 00 00  13d+19:52:04.691  SMART WRITE LOG
      b0 d6 01 e0 4f c2 00 00  13d+19:52:04.690  SMART WRITE LOG
      ef 10 02 00 00 00 00 00  13d+19:52:04.690  SET FEATURES [Enable SATA feature]

    Error 4 occurred at disk power-on lifetime: 24675 hours (1028 days + 3 hours)
      When the command that caused the error occurred, the device was active or idle.

      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 00 00 ff ff ff 4f 00  13d+19:52:00.514  READ FPDMA QUEUED
      60 00 00 ff ff ff 4f 00  13d+19:52:00.487  READ FPDMA QUEUED
      61 00 80 ff ff ff 4f 00  13d+19:52:00.483  WRITE FPDMA QUEUED
      61 00 f0 ff ff ff 4f 00  13d+19:52:00.481  WRITE FPDMA QUEUED
      61 00 78 ff ff ff 4f 00  13d+19:52:00.480  WRITE FPDMA QUEUED

    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%     24748         -
    # 2  Extended captive    Interrupted (host reset)      00%     24748         -
    # 3  Short captive       Completed without error       00%     24748         -
    # 4  Short offline       Aborted by host               30%     24748         -
    # 5  Extended captive    Interrupted (host reset)      20%     24721         -
    # 6  Extended captive    Interrupted (host reset)      20%     24708         -
    # 7  Extended captive    Interrupted (host reset)      20%     24687         -
    # 8  Short captive       Completed without error       00%     24681         -
    # 9  Short offline       Completed without error       00%     24681         -

    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.

     

  15. you are correct, it was there. i just didn't find it due to my own stupidity.

     

    is this the smart output?

     

    root@WORK-PC:~# smartctl -a /dev/sdp
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.56-Unraid] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor:               SEAGATE
    Product:              ST3000NM0033  SA
    Revision:             NA00
    Compliance:           SPC-3
    User Capacity:        3,000,592,982,016 bytes [3.00 TB]
    Logical block size:   512 bytes
    Rotation Rate:        7200 rpm
    Form Factor:          3.5 inches
    Logical Unit id:      0x50000c900064ff5c
    Serial number:        Z1Y1X1PX
    Device type:          disk
    Transport protocol:   SAS (SPL-3)
    Local Time is:        Mon Sep  9 09:47:42 2019 EDT
    SMART support is:     Available - device has SMART capability.
    SMART support is:     Enabled
    Temperature Warning:  Enabled

    === START OF READ SMART DATA SECTION ===
    SMART Health Status: OK

    Current Drive Temperature:     36 C
    Drive Trip Temperature:        <not available>

    Manufactured in week    of year
    Specified cycle count over device lifetime:  0
    Accumulated start-stop cycles:  56
    Read defect list: asked for grown list but didn't get it
    Vendor (Seagate Cache) information
      Blocks sent to initiator = 62
      Blocks received from initiator = 0
      Blocks read from cache and sent to initiator = 12
      Number of read and write commands whose size <= segment size = 0
      Number of read and write commands whose size > segment size = 0

    Vendor (Seagate/Hitachi) factory information
      number of hours powered up = 24761.00
      number of minutes until next internal SMART test = 28

    Error counter log:
               Errors Corrected by           Total   Correction     Gigabytes    Total
                   ECC          rereads/    errors   algorithm      processed    uncorrected
               fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
    read:          0        0         0         0          0          0.000           0
    write:         0        0         0         0          0          0.000           0

    Non-medium error count:       88


    [GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
    SMART Self-test log
    Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
         Description                              number   (hours)
    # 1  Background short  Completed                   -   24748                 - [-   -    -]
    # 2  Foreground long   Aborted (device reset ?)    -   24748                 0 [0xb 0x82 0x0]
    # 3  Foreground short  Completed                   -   24748                 - [-   -    -]
    # 4  Background short  Aborted (by user command)   -   24748                 0 [0xb 0x81 0x0]
    # 5  Foreground long   Aborted (device reset ?)    -   24721                 0 [0xb 0x82 0x0]
    # 6  Foreground long   Aborted (device reset ?)    -   24708                 0 [0xb 0x82 0x0]
    # 7  Foreground long   Aborted (device reset ?)    -   24687                 0 [0xb 0x82 0x0]
    # 8  Foreground short  Completed                   -   24681                 - [-   -    -]
    # 9  Background short  Completed                   -   24681                 - [-   -    -]

    Long (extended) Self-test duration: 22140 seconds [369.0 minutes]
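
The self-test log above shows several long tests that ended as "Aborted (device reset ?)". A minimal sketch for counting such entries from smartctl output follows. The function name `count_aborted_selftests` is hypothetical, and the pattern only assumes the "Aborted" / "Interrupted" wording seen in the logs in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read a smartctl self-test log on stdin and print
# how many entries were aborted or interrupted.
count_aborted_selftests() {
    # Log entries start with "# <number>"; match the status wording used
    # by both the SAS ("Aborted") and ATA ("Interrupted") log formats.
    # "|| true" keeps the exit status at 0 when the count is zero.
    grep -cE '^[[:space:]]*#[[:space:]]*[0-9]+.*(Aborted|Interrupted)' || true
}
```

Possible usage, assuming the same device as above: `smartctl -l selftest /dev/sdp | count_aborted_selftests`.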

  16. may i ask you smart folks, do you know how to re-detect that drive i put in without rebooting?

     

    i have a SAS2008 card connecting my desktop computer to that netapp previously discussed.

     

    Dell LSI SAS 9202-16e 6Gb/s SAS Host Bus Adapter

    from lspci

    03:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)

     

    from dmesg

     

    [  303.920071] mdcmd (13): import 12
    [  303.920072] md: import_slot: 12 missing

    root@WORK-PC:~# dmesg |grep -B2 -A2 SAS2008
    [   13.633974] mpt2sas_cm0: Current Controller Queue Depth(1948),Max Controller Queue Depth(2040)
    [   13.634737] mpt2sas_cm0: Scatter Gather Elements per IO(128)
    [   13.681883] mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(07.39.02.00)
    [   13.683391] mpt2sas_cm0: Protocol=(
    [   13.683391] Initiator
    --
    [   13.871549] mpt2sas_cm1: Current Controller Queue Depth(1948),Max Controller Queue Depth(2040)
    [   13.872281] mpt2sas_cm1: Scatter Gather Elements per IO(128)
    [   13.919956] mpt2sas_cm1: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(07.39.02.00)
    [   13.921470] mpt2sas_cm1: Protocol=(
    [   13.921471] Initiator
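
On the re-detect question above: on Linux, writing `- - -` to a SCSI host's sysfs `scan` file asks that host to rescan all channels, targets, and LUNs. The sketch below assumes root privileges. The function name `rescan_scsi_hosts` and its optional directory argument are hypothetical, and whether Unraid then picks up the disk in its slot without stopping the array is not confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: ask every SCSI host to rescan its bus.
# Optional argument: directory holding the hostN entries
# (defaults to the real sysfs location).
rescan_scsi_hosts() {
    sysfs_dir="${1:-/sys/class/scsi_host}"
    for scan_file in "$sysfs_dir"/host*/scan; do
        # Skip the unexpanded glob when no hosts are present.
        [ -e "$scan_file" ] || continue
        echo "rescanning ${scan_file%/scan}"
        # "- - -" is a wildcard for channel, target, and LUN.
        echo "- - -" > "$scan_file"
    done
}
```

Possible usage: run `rescan_scsi_hosts` as root, then check `dmesg | tail` and `lsblk` to see whether the drive showed up.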
