CS01-HS

Members
  • Posts

    479
Posts posted by CS01-HS

  1. Ah I see, that's still good power savings.

     

    Yes, I have the J5005-mITX with 6 HDDs (4 on the card, 2 on the board) and 2 cache SSDs. More information is linked in my signature.

     

    If you want to see great power savings look at this:

    https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.computerbase.de%2Fforum%2Fthreads%2Fselbstbau-nas-im-10-zoll-rack.1932675%2F

     

    He also posts here:

     

  2. I'm running a j5005 with an H1110 HBA (cut to fit the x1 slot) and 8 disks:

    • 2 SSD Cache (on the integrated intel)
    • 3 Array disks and a BTRFS RAID 5 pool of 3 disks (all distributed between the integrated ASMedia and the HBA)

    It's not a performance server, but I don't notice any slowness. Your idle wattage with the new setup (22W) is close to mine, though. Do you remember what your idle wattage was with the J4105 and HBA?

     

    Benchmarking the x1 controller in DiskSpeed does show limitations but not enough to affect usability (although if my controller ran 8 disks instead of 4 I could see it doubling? parity check time.)

    [Screenshot: DiskSpeed benchmark of the x1 controller]

  3. I saw a few of these hard resetting link errors during my mover run. Thankfully (?) no CRC errors reported. ata3 is a spinning disk attached to an integrated ASM1062 controller.

     

    I wonder if it might be related to the power-saving tweaks because nothing else changed. For now I've disabled them and will see if they reappear. Maybe coincidence but I'm posting in case others have the same issue.

    Nov  3 23:17:30 NAS move: move: file /mnt/cache/Download/movie_1.mp4
    Nov  3 23:17:33 NAS kernel: ata3.00: exception Emask 0x10 SAct 0x80 SErr 0x4050002 action 0x6 frozen
    Nov  3 23:17:33 NAS kernel: ata3.00: irq_stat 0x08000000, interface fatal error
    Nov  3 23:17:33 NAS kernel: ata3: SError: { RecovComm PHYRdyChg CommWake DevExch }
    Nov  3 23:17:33 NAS kernel: ata3.00: failed command: WRITE FPDMA QUEUED
    Nov  3 23:17:33 NAS kernel: ata3.00: cmd 61/00:38:58:44:51/04:00:2c:02:00/40 tag 7 ncq dma 524288 out
    Nov  3 23:17:33 NAS kernel:         res 40/00:30:58:40:51/00:00:2c:02:00/40 Emask 0x10 (ATA bus error)
    Nov  3 23:17:33 NAS kernel: ata3.00: status: { DRDY }
    Nov  3 23:17:33 NAS kernel: ata3: hard resetting link
    Nov  3 23:17:33 NAS move: move: file /mnt/cache/Download/movie_1.mp4
    Nov  3 23:17:33 NAS kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Nov  3 23:17:33 NAS kernel: ata3.00: supports DRM functions and may not be fully accessible
    Nov  3 23:17:33 NAS kernel: ata3.00: supports DRM functions and may not be fully accessible
    Nov  3 23:17:33 NAS kernel: ata3.00: configured for UDMA/133
    Nov  3 23:17:33 NAS kernel: ata3: EH complete
    Nov  3 23:17:35 NAS move: move: file /mnt/cache/Download/movie_2.mp4
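
    A minimal sketch for tallying these resets per port, assuming the syslog format shown above. `count_link_resets` is a hypothetical helper name and the default log path is an assumption; pass your own path if it differs.

```shell
#!/bin/bash
# Hypothetical helper: count "hard resetting link" events per ATA port.
# Assumption: the log uses the kernel message format shown in the excerpt above.
count_link_resets() {
    local log_file=${1:-/var/log/syslog}
    awk '/hard resetting link/ {
             # Find the "ataN:" field and strip the trailing colon
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+:$/) { sub(/:$/, "", $i); count[$i]++ }
         }
         END { for (port in count) print port, count[port] }' "$log_file"
}
```

    Running `count_link_resets /var/log/syslog` before and after disabling the tweaks gives a rough way to tell whether the resets actually stopped.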

     

  4. 23 minutes ago, jonathanm said:

    Possible reduced stability. It's almost like the inverse of overclocking. Circuits are designed and tested to run at spec, deviate and you risk bit errors. Depending on the quality of your specific silicon you might be fine, but you don't know for sure.

     

    How valuable is your data integrity?

    I wouldn't take any meaningful risk to save 4W.


    Would you distinguish between any of the power-saving tweaks (SATA links, I2C, USB, PCI and increasing dirty_writeback) in terms of risk, assuming a UPS/no unexpected power loss?

  5. 16 minutes ago, falconexe said:

    This is pretty sweet. I'm the developer of the Ultimate Unraid Dashboard (UUD). If you want to cross-promote your solution, feel free to also post or link it in my topic. I don't personally use Spin-up Groups, but this is a really neat solution to non-native data within Telegraf.

    Sure, maybe if/when it gets cleaned up.

     

    Just to be clear this isn't related to spin-up groups (which I don't use either), just standard drives. I wanted a way to easily track whether my drives were sleeping/waking too frequently. Someone cleverer might be able to integrate total wakes over the specified time range.

  6. I have a visually pleasing (but technically dirty) solution to my quest for a spin-up graph. I'm new to grafana and find it frustrating so if anyone has improvements feel free to post them.

     

    This is the end result:

    [Screenshot: the finished spin-up graph]

     

    Setup:

     

    1. Start with a User Scripts script, set to run every 5 minutes, that tracks drive activity and temperature in InfluxDB (borrowed from this php version.)

    Replace every XX with your system's settings (the default InfluxDB port is 8086).

    #!/bin/bash
    
    # User settings
    INFLUX_IP="XX"
    INFLUX_PORT="XX"
    HOSTNAME="XX"
    
    # Drive IDs (in /dev/disk/by-id/) in position order from top of graph to bottom
    declare -a DRIVE_LIST=(
      "XX"
      "XX"
    )
    
    position=0
    # Loop through drives
    for drive in "${DRIVE_LIST[@]}"; do
        # Capture smartctl output (-n standby skips the check if the drive is asleep)
        smartctl_output=$(smartctl -n standby -AH "/dev/disk/by-id/$drive")
        # Record activity and temperature only if the drive is awake
        if echo "$smartctl_output" | grep -q 'Device is in STANDBY mode'; then
            active=''
        else
            # SMART attribute 194 is the temperature; its raw value is column 10
            temp=$(echo "$smartctl_output" | awk '$1 == 194 {print $10; exit}')
            active=",active=1,temp_c=${temp}"
        fi
    
        # Write one point per drive to the telegraf database (InfluxDB line protocol)
        curl -i -XPOST "http://$INFLUX_IP:$INFLUX_PORT/write?db=telegraf" \
            --data-binary "hdd_spin,host=$HOSTNAME,id_serial=$drive position=$position${active}"
    
        position=$((position + 1))
    done
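
    To check what the script would send without touching InfluxDB, the line it posts can be built separately. This is only a sketch: `build_point` is a hypothetical helper and `ata-EXAMPLE_SERIAL` is a made-up drive ID.

```shell
#!/bin/bash
# Hypothetical dry-run helper mirroring the line-protocol point posted by the script.
# An empty temperature stands for a sleeping drive (position only, no active flag).
build_point() {
    local host=$1 drive=$2 position=$3 temp=$4
    local fields="position=$position"
    if [[ -n $temp ]]; then
        fields+=",active=1,temp_c=$temp"
    fi
    echo "hdd_spin,host=$host,id_serial=$drive $fields"
}

build_point NAS ata-EXAMPLE_SERIAL 0 34   # awake drive
build_point NAS ata-EXAMPLE_SERIAL 1 ""   # sleeping drive
```

    A sleeping drive still gets a point with its position, so the ribbon's row exists even when the drive is idle.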

     

    2. Create a new panel with the following query

    [Screenshot: Grafana query settings]

     

     

    3. Now the hacks start. The graph runs from 0 to (in my case, with 7 drives) -7, and we need a way to turn these lines into pretty ribbons. We'll graph the pos column along negative-y (so position 0 is at the top). Then, for every drive, we'll create a corresponding transform equal to the drive's negated position minus 1, and have Grafana fill the space between the two.

     

    Here are the first three transforms in my setup, in position order:

    Parity disk in position 0 transforms to -1 (0 - 1)

    1st Pool disk in position 1 transforms to -2 (1 * -2/1, a multiplier of -2)

    1st Array disk in position 2 transforms to -3 (2 * -3/2, a multiplier of -1.5)

     

    Grafana doesn't allow fractions, so you'll have to enter each multiplier as a decimal.

     

    The next entry in the sequence would be:

    Position 3 transforms to -4 (3 * -4/3), a multiplier of -1.33.

     

    [Screenshot: Grafana transform settings]
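
    The decimal multipliers for step 3 follow the pattern -(p + 1) / p for position p. A small sketch to print them, assuming 7 drives as in my setup (`fill_multiplier` and `DRIVE_COUNT` are hypothetical names):

```shell
#!/bin/bash
# Hypothetical helper: print the multiplier that maps position p to -(p + 1).
# Position 0 can't use a multiplier (0 * x = 0), so it uses "0 - 1" instead.
fill_multiplier() {
    local p=$1
    awk -v p="$p" 'BEGIN { printf "%.2f\n", -(p + 1) / p }'
}

DRIVE_COUNT=7   # assumption: 7 drives, positions 0 through 6
for ((p = 1; p < DRIVE_COUNT; p++)); do
    echo "position $p: multiply by $(fill_multiplier "$p")"
done
```

    Two decimals leave a small gap (3 * -1.33 is -3.99 rather than -4); more decimal places close it if the ribbon edges look uneven.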

     

    4. Now go to Overrides to alias the drive pos and temp fields

    [Screenshots: Overrides settings for the pos and temp fields]

     

    5. Now go to Panel to tweak the display.

    [Screenshots: Panel display settings]

     

    Create series overrides for each drive's position field (-pos), fill field (-fill) and temperature field (no suffix).

    Note the "Fill below to" setting in the -pos fields, which creates the "ribbons."

    [Screenshots: series override settings]

     

    That's it.

     

    To verify you haven't missed anything, a completed panel for 7 drives will have:

    • 7 Transforms
    • 14 Overrides
    • 21 Series Overrides

     

     

    Note that the Legend will sort alphabetically by serial ID, not position (unfortunately.)

    If you're lucky (or obsessive enough to reposition your drives alphabetically, ahem) they'll match.

     

    EDIT: 11/22/2020 - Updated instructions for version 2, which adds temperature.

  7. Updated the script to work with 6.9.0-beta30

     

    I don't know if it's the new kernel or unRAID itself, but the drive's diskstats now seem to increment even without access (maybe SMART polling?). Keying off the partition's diskstats seems to solve it. Note that this looks for reads/writes on the first partition only. If you have a multi-partition drive it won't work properly (and may sleep your drive while you're accessing it.)
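
    A minimal sketch of reading a partition's counters, not the script itself. It assumes the standard /proc/diskstats layout (device name in column 3, reads completed in column 4, writes completed in column 8); `partition_io_counts` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print "reads writes" completed for one partition.
# Assumption: standard /proc/diskstats column layout; the file path can be overridden.
partition_io_counts() {
    local partition=$1 stats_file=${2:-/proc/diskstats}
    awk -v part="$partition" '$3 == part { print $4, $8 }' "$stats_file"
}
```

    Calling `partition_io_counts sdb1` twice, some minutes apart, and comparing the two results shows whether the first partition saw any reads or writes in between.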

  8. 2 hours ago, ChatNoir said:

    I think @falconexe had a way to replace sdX by the disk S/N in his UUD topic

    Thanks. I don't see it in the dashboard (which instead shows a configurable "cache devices" setting) but I'll dig into the thread. Clever trick installing smartmontools in the container, I had been using HDDTemp docker.

     

  9. What's everyone doing to ensure persistent drive labels for reporting (Grafana, etc.)?

     

    My sdg and sdh swapped labels with sdi and sdj. I thought I could take advantage of hotswap to correct it - DON'T DO THIS (I'm rebuilding parity now because of it.)

     

    So is there a fix or do we just accept some level of randomness?

  10. I searched the thread and didn't find an answer so apologies if this question's been asked.

     

    I'm new to this plugin and have my first verification running after the initial check. It's been going for around 8 hours and I'm wondering how far it's progressed. Is there an indicator somewhere? The initial check showed a handy progress bar in the page under Tools.

     

    I'm also confused about whether "check" and "verify" are used interchangeably in the documentation. The help says "Use the Check command to verify files against a previously exported file", but the plugin can be configured to verify with "Save new hashing results to flash" (which I assume is the referenced "export") disabled, suggesting they're not interchangeable.

  11. 18 hours ago, NasOnABudget said:

    Is this due to a change in unraid or a change in their UPSes?

     

    I'd be interested to know if others who buy the CP1000PFCLCD can reliably expect this, because I've been poring over options trying to figure out what to get without unnecessarily overspending.

    Fancier options are plug-and-play but I have automatic restart working on my low-end Cyberpower. It requires the NUT plugin and some custom configuration. I wrote a how-to for my networked setup but the NUT plugin provides all the necessary files on unRAID so adapting it for a direct (USB) setup shouldn't be difficult.

  12. 3 hours ago, trurl said:

    This would be a setting in your BIOS

    Right, the BIOS has to be set to start automatically when powered but that won't be triggered unless the UPS (a) cuts power after the computer shuts down and (b) restores power when the outage ends.


    For example, if he'd set Turn off UPS to NO and the outage ended after the computer shut down but before the UPS was depleted the computer wouldn't know to boot because the outlet never lost power. (At least as I understand it.)

    3 hours ago, SirReal63 said:

    Good to know, thanks.  Under normal conditions prior to the UPS, it would not auto restart after a power off event, I would have to manually power it on.

    This doesn't make sense to me. The unRAID UPS setting doesn't change your BIOS.

  13. On 9/6/2020 at 9:29 AM, TexasUnraid said:

    After messing with it, it seems that the Montreal server is down or something. I tried Vancouver and it worked but in the past I saw significantly worse performance with it but should work for now. Will try Montreal again at some point.

     

    It is still really strange how binhex sabnzb worked with Montreal but qbittorrent did not. I guess that sab does not use port forwarding and that was causing the issue? Could be so many people have moved to Montreal there are no ports left?

    I had the same inaccessible webui problem. It's caused by an initialization failure either because port-forwarding fails on the server end or the container script that detects a successfully-forwarded port fails. 

     

    Either way, disabling port-forwarding fixed it:

    [Screenshot: container settings with port-forwarding disabled]

  14. My hot/cool:

    Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: parity temp=32 (settings are: hot=40, cool=35))

    and yours

    Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: parity temp=26 (settings are: hot=0, cool=6))

    are very different.

     

    Are you maybe setting absolute temperatures on the plugin page?

    They should be relative:

    [Screenshot: plugin temperature settings]

     

  15. 2 hours ago, bphillips330 said:

    Yeah, that is why I was thinking the /by-id/blahblahblah_part1 would avoid the sdX issue. Thanks. I will check out that page.

    I use this snippet in a user script to mount a USB drive by ID:

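    # Take the first by-id entry matching the drive (the whole disk, not a partition)
    # and resolve its symlink to the current device name (e.g. sdf)
    DISK_LINK=$(ls /dev/disk/by-id/usb-WD_My_Passport_25E2_5758313144393636* | head -1)
    THIS_DISK=$(basename "$(readlink -f "$DISK_LINK")")
    /usr/local/sbin/rc.unassigned mount "/dev/$THIS_DISK"
    if [[ $? -ne 0 ]]; then
      echo "Exiting due to ERROR."
      exit 1
    fi
    echo "SUCCESS"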

    And unmount:

     /usr/local/sbin/rc.unassigned umount "/dev/$THIS_DISK"

     

  16. 27 minutes ago, Spies said:

    They're all under 45c (which was my hot threshold).

    They all need to be "cool", which is determined by the Resume setting, not the Pause setting ("hot").

    30 minutes ago, Spies said:

    Why am I not seeing the testing portion in the log?

    Set debug logging to "Testing"

  17.  

    32 minutes ago, Spies said:

    Still happening for me, I've had to turn off overheat protection for the time being, I had upper temperature set to 45c

    Are you sure it's not user error? The latest update fixed it for me, thanks itimpi!

    Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: parity temp=32 (settings are: hot=40, cool=35))
    Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=36 (settings are: hot=40, cool=35))
    Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
    Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=35))
    Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=35))
    Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
    Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
    Sep  8 07:00:01 NAS parity.check.tuning.php: Resumed Non-Correcting Parity Check  (77.7% completed)  as drives now cooled down
    Sep  8 07:00:01 NAS root: Cache used space threshhold (75) not exceeded.  Used Space: 72.  Not moving files
    Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: written RESUME (COOL) record to  /boot/config/plugins/parity.check.tuning/parity.check.tuning.progress
    Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD start------
    Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: detected that mdcmd had been called from sh with command mdcmd check RESUME
    Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD end-------
    Sep  8 07:00:02 NAS kernel: mdcmd (236): check RESUME
    Sep  8 07:00:02 NAS kernel:
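
    The resume behaviour in that log can be sketched roughly as below. This is a re-creation from the log output, not the plugin's code: `classify_temp` and `can_resume` are hypothetical names, and treating a drive exactly at the hot threshold as "hot" is an assumption (the log never shows that case).

```shell
#!/bin/bash
# Hypothetical re-creation of the hot/warm/cool counts seen in the log.
# Assumption: temp >= hot counts as hot; the log only confirms temp <= cool is cool.
classify_temp() {
    local temp=$1 hot=$2 cool=$3
    if (( temp >= hot )); then
        echo hot
    elif (( temp > cool )); then
        echo warm
    else
        echo cool
    fi
}

# Resume only when every drive is classified as cool.
can_resume() {
    local hot=$1 cool=$2 temp
    shift 2
    for temp in "$@"; do
        [[ $(classify_temp "$temp" "$hot" "$cool") == cool ]] || return 1
    done
    return 0
}

can_resume 40 35 32 36 35 && echo "resume" || echo "keep paused"   # 06:55 readings
can_resume 40 35 33 35 35 && echo "resume" || echo "keep paused"   # 07:00 readings
```

    With the 06:55 readings disk1 at 36 is still "warm", so the check stays paused; at 07:00 all three drives are at or below 35 and it resumes, matching the log.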

     

  18. 14 minutes ago, JorgeB said:

    You can change that on Settings -> Disk Settings -> Tunable (poll_attributes):

     

    Default is 1800s (30 minutes)

    Great, thanks!

     

    EDIT: I found this detailed explanation for the 30 minute default. "Pretty large disruption in I/O flow" makes me nervous but I've run the 5 minute check for months now and haven't noticed any performance issues on my relatively weak system. Hmm.

     

    https://forums.unraid.net/bug-reports/prereleases/unraid-os-version-690-beta25-available-r990/page/2/?tab=comments#comment-9930

     

  19. I run the autofan plugin (which I've set to check HDD temps every 5 minutes) and the HDDTemp docker for Grafana.

     

    Comparing HDD temps reported by those to temps in unRAID dashboard, dashboard seems about 30 minutes behind. I'm wondering why that is and if it's necessary. Neither the plugin nor docker wake my sleeping drives.

     

    Below is the drive temp code from autofan (which uses hdparm and smartctl.)

    Note: I've heavily customized autofan so I'm not sure what parts of this code are the old version, which my customization was based on, and what parts I tweaked. I notice the current version doesn't pass --nocheck standby to smartctl (and maybe the old one didn't either.)

    function_get_highest_hd_temp() {
      HIGHEST_TEMP_HDD=0
      HIGHEST_TEMP_HDD_LABEL=''
      for DISK in "${HD[@]}"; do
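        # Skip drives in standby so reading the temperature doesn't wake them
        SLEEPING=$(hdparm -C "$DISK" | grep -c standby)
        if [[ $SLEEPING -eq 0 ]]; then
          if [[ $DISK == /dev/nvme[0-9] ]]; then
            # NVMe devices report a "Temperature:" line rather than SMART attribute rows
            CURRENT_TEMP_HDD=$(smartctl -A "$DISK" | awk '$1=="Temperature:" {print $2;exit}')
          else
            # Other drives: SMART attribute 190 or 194, raw value in column 10
            CURRENT_TEMP_HDD=$(smartctl --nocheck standby -A "$DISK" | awk '$1==190||$1==194 {print $10;exit}')
          fi
          # Track the hottest drive seen so far
          if [[ $HIGHEST_TEMP_HDD -le $CURRENT_TEMP_HDD ]]; then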
            HIGHEST_TEMP_HDD=$CURRENT_TEMP_HDD
            HIGHEST_TEMP_HDD_LABEL=$DISK
          fi
        fi
      done
    }

     

  20. Here are the relevant lines from the syslog. I can send the entire log after some cleanup if necessary.

    It looks like I was wrong about the difference between hot/cold settings but something's causing a failure to resume even when all drive temperatures reach "cool."

     

    Start of Parity check (difference of 3 between hot and cool):

    Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
    Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
    Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
    Sep  7 13:10:23 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
    Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
    Sep  7 13:10:31 NAS kernel: mdcmd (217): check Resume
    Sep  7 13:10:31 NAS kernel: md: recovery thread: check P ...
    Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:18:57 NAS autofan: Board temp is 46C, hottest disk is 40C (/dev/sdh), setting fan speed to: 165 (76% @ 1588rpm)
    Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:25:11 NAS autofan: Board temp is 45C, hottest disk is 42C (/dev/sdh), setting fan speed to: 195 (90% @ 1819rpm)
    Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
    Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
    Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
    Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

     

    Drives overheat, check is paused, drives begin cool-down:

    Sep  7 13:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:45:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 13:45:01 NAS parity.check.tuning.php: Paused Non-Correcting Parity Check  (39.6% completed) : Following drives overheated: 42 42
    Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: written PAUSE (HOT) record to  /boot/config/plugins/parity.check.tuning/parity.check.tuning.progress
    Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD start------
    Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: detected that mdcmd had been called from sh with command mdcmd nocheck PAUSE
    Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD end-------
    Sep  7 13:45:02 NAS kernel: mdcmd (218): nocheck PAUSE
    Sep  7 13:45:02 NAS kernel:
    Sep  7 13:45:02 NAS kernel: md: recovery thread: exit status: -4
    Sep  7 13:45:02 NAS parity.check.tuning.php: TESTING: Heat notifications disabled so Pause Following drives overheated: 42 42  not sent
    Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
    Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 13:50:08 NAS autofan: Board temp is 39C, hottest disk is 40C (/dev/sdh), setting fan speed to: 165 (76% @ 1605rpm)
    Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
    Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
    Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:00:01 NAS root: Cache used space threshhold (75) not exceeded.  Used Space: 72.  Not moving files
    Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
    Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
    Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
    Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
    Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
    Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:15:07 NAS autofan: Board temp is 41C, hottest disk is 38C (/dev/sdh), setting fan speed to: 135 (62% @ 1386rpm)
    Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:27:40 NAS autofan: Board temp is 40C, hottest disk is 36C (/dev/sdh), setting fan speed to: 105 (48% @ 1117rpm)
    Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
    Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
    Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
    Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
    Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

     

    All drives are now "cool" but the check is still paused:

    Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
    Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
    Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
    Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
    Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

     

    Mover-tuning settings re-saved without changing any values. I hoped this might jump-start it, but no:

    Sep  7 14:52:31 NAS ool www[2754]: /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php 'updatecron'
    Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
    Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
    Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
    Sep  7 14:52:31 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
    Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
    Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
    Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
    Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
    Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
    Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
    Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
    Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
    Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
    Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
    Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

     

    Mover-tuning settings updated to lower "cool" by 2 (the difference between hot and cool is now 5). The check still doesn't resume:

    Sep  7 15:02:52 NAS ool www[19347]: /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php 'updatecron'
    Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
    Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
    Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
    Sep  7 15:02:52 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
    Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
    Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
    Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
    Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=35))
    Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=35))
    Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
    Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
    Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
    Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
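
    To make the mismatch easier to see, here is a minimal sketch that recomputes the hot/warm/cool counts from the temperatures in the log above. It is not the plugin's code: the function names are made up, and the boundaries are inferred from the log (37 was counted as cool with cool=37, so cool is taken as "at or below"; the "at or above" rule for hot is an assumption, since no drive reaches 40 here).

```python
from collections import Counter


def classify(temp, hot, cool):
    """Bucket one drive temperature as hot, warm or cool.

    cool: temp <= cool (inferred from the log: 37 counted as cool when cool=37)
    hot:  temp >= hot  (assumption: no drive reaches the hot value in the log)
    """
    if temp >= hot:
        return "hot"
    if temp <= cool:
        return "cool"
    return "warm"


def summarize(temps, hot, cool):
    """Count drives per bucket, in the same order the log prints them."""
    counts = Counter(classify(t, hot, cool) for t in temps.values())
    return {name: counts.get(name, 0) for name in ("hot", "warm", "cool")}


# Readings copied from the log entries at 14:15 and 14:50.
at_1415 = {"parity": 37, "disk1": 34, "disk2": 38}
at_1450 = {"parity": 33, "disk1": 35, "disk2": 35}

print(summarize(at_1415, hot=40, cool=37))
print(summarize(at_1450, hot=40, cool=37))
print(summarize(at_1450, hot=40, cool=35))
```

    Under these inferred rules the counts agree with the DEBUG lines (warm=1 at 14:15, cool=3 from 14:50 onward), so the classification itself looks consistent. The odd part is only the "not cooled enough to resume" message that follows it.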

     

     

     

  21. On 9/5/2020 at 4:47 PM, robobub said:

    My parity check isn't resuming with all drives classified as cool. Any ideas?

    I haven't tested enough to be certain, but I think the check won't resume if the difference between the hot pause and cool resume settings is too small.

     

    With Hot pause: 2 below and Cool resume: 5 below (difference of 3), it didn't resume.

    With Hot pause: 2 below and Cool resume: 7 below (difference of 5), it did resume.

     

    But it's possible I misinterpreted and something else caused the resume failure.
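
    For reference, a small sketch of how the two "N below" settings translate into absolute thresholds and a gap. It assumes both offsets are measured down from a single reference temperature. The value 42 is purely hypothetical, and the function name is made up for illustration.

```python
def thresholds(reference, hot_below, cool_below):
    """Turn "N below" offsets into absolute hot/cool values and their gap."""
    hot = reference - hot_below
    cool = reference - cool_below
    return hot, cool, hot - cool


REFERENCE = 42  # hypothetical reference temperature, not taken from the plugin

# The two combinations tried above: cool resume at 5 below, then 7 below.
for cool_below in (5, 7):
    hot, cool, gap = thresholds(REFERENCE, hot_below=2, cool_below=cool_below)
    print(f"hot={hot} cool={cool} gap={gap}")
```

    The gap depends only on the two offsets, not on the reference value, so this only restates the two test cases in absolute terms. It doesn't show whether the gap is what actually decides the resume.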
