Posts posted by Pourko

  1. Over the years I have always known that, should such an occasion arise, I can always mount (ro) any of my data disks in a different linux computer.  Yesterday I actually needed to do just that, and to my surprise, I was not able to do it: 

     

    # mount -t xfs  -o ro  /dev/sdc1 /tmp/sdc1/
    mount: /tmp/sdc1: mount(2) system call failed: Function not implemented.
    
    # tail /var/log/syslog
    Jun 30 00:08:05 ToyBox kernel: XFS (sdc1): device supports 4096 byte sectors (not 512)
    
    # blockdev --report /dev/sdc*
    RO    RA   SSZ   BSZ   StartSec            Size   Device
    rw   256  4096  4096          0  16000900661248   /dev/sdc
    rw   256  4096  4096         64  16000900608000   /dev/sdc1

     

    Curiously, I noticed that the md devices are being reported as 512-byte sector devices, and maybe that played a role in how the XFS file system was created on them:

     

    # blockdev --report /dev/md*
    RO    RA   SSZ   BSZ   StartSec            Size   Device
    rw   256   512   512          0  16000900608000   /dev/md1
    rw   256   512   512          0  16000900608000   /dev/md2

     

    So, is that a bug or a feature?   I am wondering, how can such misrepresentation possibly be good for performance?

     

    And, are we no longer able to mount a data disk outside our unraid server?  (For me, that was a big selling point back when I first learned about Unraid.)  Any suggestions?
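
    One thing I may try next -- not verified yet on a 4Kn drive, and just a sketch reusing the device names from above -- is to wrap the partition in a loop device, which by default presents 512-byte logical sectors, and mount that read-only:

    # Hypothetical workaround: the loop device hides the 4096-byte logical sector size
    loopdev=$(losetup --find --show --read-only /dev/sdc1)
    mount -t xfs -o ro "$loopdev" /tmp/sdc1/
    # ...and when done:
    umount /tmp/sdc1/ && losetup -d "$loopdev"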
     

  2. 4 hours ago, NoobSpy said:

    It was a figure of speech. I dont think 10 years on UnRaid will even be relevant.

    You cant keep supporting people who use ancient equipment without growing stagnant and stale.

    I would support a GoFundme and strech goals just to get some often requested things rolling.

    By all means, go fund yourself.

  3. 1 hour ago, doron said:

    . and the jury's back in.

    Indeed, the case is as I said above: The state of one of my drives has just changed (i.e. drive spun up) while the i/o number (sum of 11 fields in /sys/block/sdX/stat) hasn't changed.

    You didn't need to bother adding those up -- just read the whole thing as one string, and then compare the two strings from before and after.  Same thing though.

     

    This thing keeps getting curiouser and curiouser.  :-)   It begs the question: why am I not seeing this on my server?  Can you maybe restart your server, but this time without starting emhttpd, and see if the spinup happens again?  Could something (maybe from the UI?) be sending some weird request to the disk that wakes up SAS disks?... Like a SMART info request, maybe?  I'm just speculating.

  4. 2 minutes ago, doron said:

    perhaps 10 minutes later, with no i/o happening, it may spin back up.

    Now this is something interesting, and we should get to the bottom of it!  I don't see that on my server, but then again, my UI is stock vanilla with no plugins, and even custom-crippled a bit. :-)  You should definitely try to prove the "no i/o happening" claim.  I suggest you make your monitoring script read the device's stat file and demonstrate that the readings before and after the disk spins up are exactly the same.  Now that would be really interesting!
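
    Just to be concrete about what I mean, here's a minimal sketch of that before/after check (the device name and the waiting period are placeholders):

    dev=sdc                              # placeholder device name
    before=$(cat /sys/block/$dev/stat)
    sleep 600                            # wait out the period in question
    after=$(cat /sys/block/$dev/stat)
    if [ "$before" = "$after" ] ;then
       echo "$dev: no i/o at all since the first reading"
    else
       echo "$dev: some i/o has occurred"
    fi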

  5. I understand what you are trying to do, and I applaud your efforts.

    27 minutes ago, doron said:

    part of the result will be that the UI will also reflect the correct state of these SAS drives (which currently it certainly does not).

    My point is, a disk can spin down for different reasons -- including deciding to spin down all by itself -- and it doesn't need to give the UI any explanation for that.  So, if the UI can't make heads or tails of it, then the problem is in the UI.  The UI should learn to properly sense the disk status and display it correctly.

    32 minutes ago, doron said:

    My comment re your script was that even with it, you may think you're spinning down a SAS drive

    I not only think I'm spinning down the disks, I know I am, as I'm watching their status refresh in real time in an SSH window -- basically the same thing that you are trying to watch inside a browser with colored balls, only I have extensively tested my "monitor" script to know that it reflects the real status correctly.

     

    Anyway, I'll be watching the progress of this discussion, and I'll chime in if I think that I have something more to contribute.

     

    Good luck!

     

  6. See, I have the feeling that you are not correctly identifying the problem.  The way I see it, the problem is not how to spin down disks; the problem is that some buggy scripts in the UI don't know how to properly query a disk without waking it up, and they don't know when to rightfully display a green ball (or whatever other color).  Personally, I rarely use the UI for anything, and on my server disks spin down when they are supposed to, and they stay spun down.  From reading the posts in this thread, I have the impression that you are trying to fix things kind of backwards, i.e., you take some info from the UI (that does not match reality) and try to make the disk status match that unreal info from the UI.  That is why I suggested that maybe you shouldn't bother doing it that way, and instead plead with the UI people to fix their UI, if the UI is that important to you.  I hope this explanation makes some sense. :-)

  7. 1 minute ago, doron said:

    you don't need to read i/o counters - you can tell the drive to automatically spin down after a certain amount of idle time.

    Right. But I have a bunch of disks that disregard that setting, which was the main reason I wrote my script.
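
    For reference, a minimal example of setting that drive-side standby timer on a SATA disk -- the device name is a placeholder, and, as I said, some of my disks simply ignore it:

    # Ask the drive itself to enter standby after ~60 minutes of idle time.
    # For -S, values 241..251 mean (value-240) * 30 minutes; 0 disables the timer.
    hdparm -S 242 /dev/sdX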

     

    Anyway, I was only trying to help.  For myself, I have a solution that has been working flawlessly on my server for years.  If you don't like it -- forget I posted it.

     

    Cheers.

     

  8. On 9/8/2020 at 7:13 AM, doron said:

    I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid.

    It doesn't seem wise to hitch your wagon to something that's been buggy for a very long time.  Especially when there's a very easy way to do this yourself -- just read /sys/block/sdX/stat directly, and when you notice that there's been no i/o activity for a certain period of time, then just go ahead and spin down the drive.  For example, I am attaching here my own little script that has been faithfully serving me for over five years now.  Just disable all spindown stuff in the UI, start my script from your "go" file, and forget about it.  (Note, the UI may also be buggy in the way it polls for SMART data, thus spinning up your disks, so you may want to look into that too.)

     

    #!/bin/bash
    # spind: disk spin-down daemon
    copy="Version 3.9 <c> 2020 by Pourko Balkanski"
    prog=spind
    
    ####################################################################
    MINUTES=${MINUTES:-60}  # the number of idle minutes before spindown
    ####################################################################
    
    idleTimeout=$(($MINUTES*60)) # in seconds
    loopDelay=61 # seconds
    
    kill $(pidof -x -o $$ $0) 2>/dev/null # our previous instances, if any
    [ "$1" = "-q" ] && exit 0 # Don't start a new daemon if called with -q
    
    renice 5 -p $$ >/dev/null  # renice self
    log () { logger -t $prog "$@" ;}
    log $copy
    
    # Make a list of the disks that could be spun down
    i=0
    for device in /dev/[sh]d[aaa-zzz] ;do
       if proto=$(smartctl -i $device | grep -iE ' sas| sata| ide') ;then
          ((i++))
          devName[$i]=$device
          cmdStat[$i]="cat /sys/block/$(basename $device)/stat"
          devLastStat[$i]=$(${cmdStat[$i]})
          devSecondsIdle[$i]=0
          devError[$i]=0  # We'll use this to flag disks that won't spin down
          cmdSpinStatus[$i]="hdparm -C $device"
          cmdStandby[$i]="hdparm -y $device"
          if grep -iq ' SAS' <<<$proto ;then
              # Switch from /dev/sdX to /dev/sgN
              devName[$i]=$(sg_map26 $device)
              cmdSpinStatus[$i]="sdparm --command=sense ${devName[$i]}"
              cmdStandby[$i]="sg_start --pc=3 ${devName[$i]}"
          fi
          theList+="${devName[$i]} "
       fi
    done
    devCount=$i
    
    if [ "$theList" = "" ] ;then
      log 'No supported disks found. Exiting.'
      exit 1
    fi
    log "Will spin down disks after $MINUTES minutes of idling."
    log "Monitoring: $theList"
    
    while :;do
       sleep $loopDelay
       for i in $(seq $devCount) ;do
          [ ${devError[$i]} -gt 2 ] && continue  # this disk has previously failed to spin down.
          devNewStat[$i]=$(${cmdStat[$i]})
          if [ "${devNewStat[$i]}" != "${devLastStat[$i]}" ] ; then
             # Some i/o activity has occurred since the last time we checked.
             devSecondsIdle[$i]=0
             devLastStat[$i]=${devNewStat[$i]}
          else # No new activity since we last checked...
              # ...So, let's check its spin status
              if ${cmdSpinStatus[$i]} | grep -iq standby ; then
                  devSecondsIdle[$i]=0
              else # it's currently spinning
                  let "devSecondsIdle[$i] += $loopDelay"
                  # Check if it's been idling for long enough...
                  if [ ${devSecondsIdle[$i]} -gt $idleTimeout ] ; then
                      # It is time to spin this one down!
                      log "spinning down ${devName[$i]} "
                      ${cmdStandby[$i]} >/dev/null 2>&1
                      devSecondsIdle[$i]=0
                      sleep 1 # no need to worry about race conditions here.
                      # Check if the drive actually spun down as a result of our command
                      if ${cmdSpinStatus[$i]} | grep -iq standby ;then
                         devError[$i]=0
                      else
                         ((devError[$i]++))
                         [ ${devError[$i]} -gt 2 ] && log "${devName[$i]} fails to spin down."
                      fi
                  fi
              fi
          fi
       done
    done &
    disown
    exit 0
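
    For example, starting it from the go file could look roughly like this -- the script's location here is hypothetical (keep it wherever you keep your custom stuff); the go file itself is /boot/config/go:

    # added near the end of /boot/config/go
    # (no trailing '&' needed -- the script backgrounds itself;
    #  invoking via bash avoids relying on the exec bit on the flash drive)
    MINUTES=45 bash /boot/custom/spind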

     

     

    spind-3.9.zip

  9. On 8/28/2020 at 2:05 PM, ccruzen said:

    No worries here. I think the world needs as many surprised monkeys in sweaters as we can give it.

    I could swear this was a selfie I took like 10 years ago. 😄

    As you see, I was young and good-looking back then.

    Now I'm just good-looking.

  10. On 8/26/2020 at 4:14 PM, ccruzen said:

    I was right there with you guys...

    @ccruzen

    LOL... I was about to ask whether you couldn't get your own avatar, when I found on archive.org that you've been using this one for years.  Oops!   :-)

    I'll get around to changing mine next week. (Unless you don't care?)

     

    Cheers!

     

    ccruzen.PNG

  11. See, guys, this whole thing boils down to knowledge.  Knowledge about which disk exactly is carrying the mismatched byte at a certain position.  Knowledge that can't be had with single parity, but can be had with dual parity.  It amazes me that some people would not want that knowledge.  Like, life is easier without that knowledge -- just assume that the parity is wrong and sync it.  Like, if they were suddenly given the knowledge that, say, disk#4 is carrying the corrupted byte at that point, then they would be stumped about what to do with that knowledge.  Let me ease the anxiety: you could always choose to do what you have been doing all along -- sync the corrupted byte onto the parity disks.  Personally, I would take the other option -- recover the corrupted byte from parity.

  12. 44 minutes ago, hawihoney said:

    ...Now check if that bit is part of non-allocated space, metadata, ..., or a file on that _data disk_. If this bit belongs to a file -->...

    Yes, I understand, but that is beyond the scope of the kernel driver. Tools in userspace could be developed to that effect.

     

    44 minutes ago, hawihoney said:

    this one puzzles me since over 10 years, when my Unraid server synced over 1400 errors to parity because of a bad cable and I didn't notice that.

    That is a perfect example to prove my point.  Thank you.

  13. 49 minutes ago, hawihoney said:

    During parity check a bit on one disk (parity or data) does not carry the expected value. IMHO the only information that helps here would be the block that bit belongs to on the corresponding data (!!!) drive. This block might belong to a file - but not neccessarily. But if it's file content I would like to know it.

    Parity protection knows nothing about file systems.  (At least it shouldn't.)  It just adds up the corresponding bits from all disks to see whether the result matches the bit on the parity disk -- that's all.

     

    It shouldn't even know about partitions, but that's a different story. :-)
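
    To make "adding corresponding bits" concrete, here is a toy check of one byte position against the P (xor) parity.  The device names and the offset are made up, and this ignores partition offsets, so it only illustrates the idea -- it's not how the driver actually addresses the disks:

    offset=123456789                         # some byte position (placeholder)
    dataDisks="/dev/sdb /dev/sdc /dev/sdd"   # placeholder data disks
    parityDisk=/dev/sde                      # placeholder P disk

    read_byte() { dd if="$1" bs=1 skip="$2" count=1 2>/dev/null | od -An -tu1 | tr -d ' ' ;}

    p=0
    for d in $dataDisks ;do
       p=$(( p ^ $(read_byte $d $offset) ))
    done
    if [ $p -eq $(read_byte $parityDisk $offset) ] ;then
       echo "position $offset: parity matches"
    else
       echo "position $offset: parity MISMATCH"
    fi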

  14. 22 hours ago, itimpi said:

    The commonest case of parity needing to be corrected is after unclean shutdowns

    Yes, in 99% of the cases it will indeed be found that a parity disk is the one carrying the wrong byte.  You would know that without needing to guess.

     

    You are a moderator, so you read a lot.  You must have seen hundreds of posts over the years where parity errors appear for no obvious reason, without any unclean shutdown.  When people start pulling their hair out, trying to guess which disk those errors may be coming from, they would all wish we had the feature I am suggesting. :-)

     

    Personally, this thing has been bugging me for over 10 years now.  I raised these questions back in 2013.  But back then we had only single parity, and with that you can only detect that there is a parity error at a given position -- you can't tell which disk it's on.  Now that we have dual parity, we could take full advantage of it and be able to identify the disk that's carrying the error.

  15. 12 hours ago, Energen said:

    So if you have a system of 10 disks, you're suggesting to run something like 100 different parity checks for each byte because you keep excluding one disk at a time?

     

    No, I am not suggesting that.  I did it like that because it was the only way to do it manually.  When done programmatically, there are better ways to identify the corrupted disk at a certain byte position.

     

    I see that there is some confusion here... When I say "single disk corruption", I am not referring to a whole corrupted disk; I am only talking about a certain byte position at which we have arrived and found a parity mismatch.  At the position of the next parity error, the corruption may or may not be on the same disk as before.  It doesn't matter; we are dealing with one byte position at a time.  Therefore, all disks may have corruptions in various places, and as long as the corruptions are not at the same byte position, you could recover all disks in one pass.  Also, note that we are not distinguishing between data and parity disks -- it works the same.

  16. 6 hours ago, limetech said:

    ...or say, adding ZFS support (or pick any other feature)?

    ZFS, or any other feature you pick, I can get on any other distro.  The only reason I am here is the md/unraid driver -- that's the one thing others don't have.

     

    Of course, I realize very well that my suggestion is not something trivial.  Some ideas need to cook in the back of your head for quite some time before they pop out all ready to code.  As long as we feel that this is something important that should eventually be implemented, that is all I am hoping for.

     

    6 hours ago, limetech said:

    I'll revisit this topic for 6.10 release

    Thank you!

  17. 6 hours ago, itimpi said:

    I would also be a bit concerned about what are the chance of a ‘false positive’ if in fact the corruption occurred on a parity drive and not on a data drive.   Could you be sure that whatever algorithm is used can detect this scenario and always be able to tell it apart from the case where the corrupt bits were on a data drive.   Taking action on a ‘false positive’ could result in perfectly good data then getting corrupted as a result.

    Actually, it doesn't matter which one the corrupted drive is -- data or parity -- it's all the same.  If it happens that the corruption is on a parity drive, the fix will just end up being the same as syncing that parity drive.

  18. Also, let me give a rough description of how the driver could do what I am suggesting:


    It starts a parity check on a dual-parity array, and at a certain byte position it arrives at a parity error.  At this point it could do some extra work, trying to determine whether this is a case of single-disk corruption.  One way to do that -- a slow and ugly way, but one that will do the job -- is this: it runs another parity calculation for the same byte position, but this time disregarding one disk, as though that disk were missing.  Then another calculation, but this time disregarding another disk, and so on, for each disk in the array, one by one.  If we are fortunate enough to find an outcome in which the parity calculation comes out correct, then that tells us right there that we have a case of single-disk corruption, and we know exactly which disk is carrying the wrong byte.


    At this point we will face the question what to do with this new knowledge.  The choices for action are not that many, really.


    You could either...
    DO: A) Do what we've always done before -- sync the parity disks;
    OR: B) Do nothing, just report it in syslog; (read-only parity check)
    OR: C) Do this new thing -- recover the corrupted byte from parity.


    That's all.


    Now that we have one way of doing it -- feasible, not necessarily optimal -- we can optimize it from there.
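
    To show what I mean by feasible, here is a deliberately slow sketch of that leave-one-out check at a single byte position.  It assumes the textbook RAID-6 style P/Q parity over GF(2^8) with generator 2 -- I don't claim the md/unraid driver's internal arithmetic matches this exactly -- and the device names, slot order and offset are placeholders, so treat it purely as an illustration:

    offset=123456789                          # byte position where the check found a mismatch (placeholder)
    dataDisks=(/dev/sdb /dev/sdc /dev/sdd)    # placeholder data disks, in slot order
    pDisk=/dev/sde                            # placeholder P (xor) parity disk
    qDisk=/dev/sdf                            # placeholder Q parity disk

    read_byte() { dd if="$1" bs=1 skip="$2" count=1 2>/dev/null | od -An -tu1 | tr -d ' ' ;}

    # Multiply in GF(2^8), using the usual RAID-6 polynomial 0x11d
    gf_mul() {
       local a=$1 b=$2 p=0
       while (( b > 0 )) ;do
          (( b & 1 )) && (( p ^= a ))
          (( a <<= 1 ))
          (( a & 0x100 )) && (( a ^= 0x11d ))
          (( b >>= 1 ))
       done
       echo $p
    }

    # g^i for slot i (g = 2)
    gf_pow2() { local r=1 k ;for (( k=0; k<$1; k++ )) ;do r=$(gf_mul $r 2) ;done ;echo $r ;}

    # Read the byte at $offset from every disk
    declare -a D
    for i in "${!dataDisks[@]}" ;do D[$i]=$(read_byte "${dataDisks[$i]}" $offset) ;done
    P=$(read_byte $pDisk $offset)
    Q=$(read_byte $qDisk $offset)

    # Leave one slot out at a time: rebuild its byte from P and the others,
    # then see whether that rebuilt byte also satisfies Q.
    for j in "${!dataDisks[@]}" ;do
       rebuilt=$P
       for i in "${!dataDisks[@]}" ;do
          [ $i -eq $j ] || (( rebuilt ^= D[$i] ))
       done
       q=0
       for i in "${!dataDisks[@]}" ;do
          b=${D[$i]} ;[ $i -eq $j ] && b=$rebuilt
          (( q ^= $(gf_mul $(gf_pow2 $i) $b) ))
       done
       if [ $q -eq $Q ] && [ $rebuilt -ne ${D[$j]} ] ;then
          echo "single-disk corruption: ${dataDisks[$j]} holds ${D[$j]} at $offset but should hold $rebuilt"
       fi
    done
    # If nothing is printed, the bad byte sits on P or Q themselves,
    # or more than one disk is off at this position.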
     

  19. For those of you who are not sure what exactly I am suggesting, let me explain a bit.  When a parity check runs into a mismatched byte at a certain byte position, in 99% of the cases the reasonable thing to do is to just sync the parity disks.  But there can be some situations, albeit rare, when only a single disk is mismatched at that byte position.  Double parity can be used to detect such cases, and it can detect exactly which disk is carrying the mismatched byte.  In those cases, we would be given the golden opportunity to recover the mismatched byte from parity, instead of propagating the mismatch onto the parity disks.


    In the thread that I quoted above, I talked about how my server (which I was carrying in the trunk of my car) got out of my sight for some period of time during a recent border crossing.  I can't help but be concerned that somebody may have taken some disks out of my server and tried to mount them somewhere in order to check for whatever it is they were checking for.  In such a case, they may have inadvertently (or otherwise) modified a few bytes here and there.  Now, if Unraid's parity check had the option to detect single-disk corruptions, then I would have a very good chance of restoring everything on all the disks just as it was before the incident.


    One can imagine various scenarios that can lead to single-disk corruption.  It can be a malicious script, or it can be simply a human error.  For another illustration, consider this example:


    I set out to write a little script that will spin up all disks. (Story is all made up, but it can easily be true).  For that, I will just read a few random bytes from each disk.  That will wake them up, right?  So I write the script (simplified) like this:

    disks="sdb sdd sdf sdm"
    for i in $disks ;do
       dd of=/dev/$i bs=1024 seek=$(($RANDOM*10240)) if=/dev/urandom count=1
    done
    

    Pleased with the result, I run that script.  Twenty seconds later, a terrifying realization strikes me: What I have just done is not reading, but actually writing random bytes to all the disks!  Horrified, I jump out of my chair, and yank the power cord off the wall.  Too late.  I have just corrupted not one, not two, but ALL my data disks.  What was I thinking?!  And, what can be done?  Well, if the feature that I'm suggesting here had been implemented, then in this situation I would be able to perfectly recover ALL my disks in just one pass.  Now wouldn't that be something?


    Thank you for considering my suggestion.
     

  20. I would like to request a feature whereby the parity check would be able to (optionally) do single-disk corruption detection when working on a dual-parity array.  That kind of parity check will likely come with a serious performance hit (hence optional), but it would be priceless in certain situations.


    A discussion on the matter occurred in this thread:


    https://forums.unraid.net/topic/2557-article-why-raid-5-stops-working-in-2009/?do=findComment&comment=882947


    This feature could even (for starters) be implemented as a report-only mode -- i.e., the driver would not have to make any decision about what to do with the wrong byte/sector it found; it would just report it to the syslog and move on.


    Thank you.
     
