nick5429

Posts posted by nick5429

  1. I seem to have developed intermittent parity errors (i.e., repeated non-correcting parity checks don't all report the same errors).

     

    I had an unclean shutdown a while back and it's possible I didn't let the parity recalc finish.  That was stupid, but is somewhat separate from my issue.

     

    /var/log/syslog.1:May  1 00:00:01 nickserver kernel: md: recovery thread checking parity...

    /var/log/syslog.1:May  1 02:26:37 nickserver kernel: md: parity incorrect: 1542896200  <--- this one appears "real"

    /var/log/syslog.1:May  1 02:40:30 nickserver kernel: md: parity incorrect: 1669823784  the others are all phantom

     

    /var/log/syslog:May  9 10:38:38 nickserver kernel: md: recovery thread checking parity...

    /var/log/syslog:May  9 14:15:23 nickserver kernel: md: parity incorrect: 1542896200

    /var/log/syslog:May  9 16:16:46 nickserver kernel: md: parity incorrect: 2290922496

    /var/log/syslog:May  9 17:05:06 nickserver kernel: md: parity incorrect: 2558346984

    /var/log/syslog:May  9 17:36:25 nickserver kernel: md: parity incorrect: 2676010000

    /var/log/syslog:May  9 17:49:42 nickserver kernel: md: parity incorrect: 2740517552

    /var/log/syslog:May  9 22:19:38 nickserver kernel: md: recovery thread checking parity...

    /var/log/syslog:May  9 22:44:34 nickserver kernel: md: parity incorrect: 281891928

    /var/log/syslog:May  9 22:49:24 nickserver kernel: md: parity incorrect: 333552304

    /var/log/syslog:May 10 00:08:53 nickserver kernel: md: parity incorrect: 1158023712

    /var/log/syslog:May 10 00:49:43 nickserver kernel: md: parity incorrect: 1542896200

    /var/log/syslog:May 10 00:50:10 nickserver kernel: md: parity incorrect: 1546624504

    /var/log/syslog:May 10 00:50:49 nickserver kernel: md: parity incorrect: 1552673536

    /var/log/syslog:May 10 02:31:30 nickserver kernel: md: parity incorrect: 2374474944

     

    /var/log/syslog:May 10 08:17:10 nickserver kernel: md: recovery thread checking parity...

    /var/log/syslog:May 10 08:40:56 nickserver kernel: md: parity incorrect: 277107080

    /var/log/syslog:May 10 10:43:11 nickserver kernel: md: parity incorrect: 1542896200

    /var/log/syslog:May 10 10:57:54 nickserver kernel: md: parity incorrect: 1676424632

    /var/log/syslog:May 10 12:12:00 nickserver kernel: md: parity incorrect: 2286487984

     

    Configuration:

    unRAID 4.7 on a full Slackware (13.1?) installation

    All drives are SATA and connected directly to motherboard headers.

    I have 3 1.5TB drives and 2 2TB drives:

    Status	Disk	Mounted	Device	Model/Serial	Temp	Reads	Writes	Errors	Size	Used	%Used	Free
    OK	parity		/dev/sde	9VT1_5YD517KW	31°C	38725618	2774174					
    OK	/dev/md1	/mnt/disk1	/dev/sda	SAMSUNG_HD154UI_S1Y6J1KS744713	*	33225543	415589		1.50T	1.50T	100%	1.38M
    OK	/dev/md2	/mnt/disk2	/dev/sdc	00Z_WD-WMAVU3394155	*	24331755	242123		1.50T	1.47T	99%	26.74G
    OK	/dev/md3	/mnt/disk3	/dev/sdb	SAMSUNG_HD154UI_S1Y6J1KS744712	*	30915273	381624		1.50T	915.74G	62%	584.52G
    OK	/dev/md4	/mnt/disk4	/dev/sdd	00P_WD-WCAZAD107336	31°C	46038453	1765660		2.00T	488.22G	25%	1.51T
    	 	 	 	 	 	 	 	Total:	6.50T	4.38T	67%	2.12T

     

    Apparently my system configuration has some sort of log rotation turned on, so my syslog [attached] doesn't show my last boot (~2 months ago).

     

    May  7 05:37:00 nickserver kernel: ------------[ cut here ]------------

    May  7 05:37:00 nickserver kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0xf5/0x175()

    May  7 05:37:00 nickserver kernel: Hardware name: MS-7576

    May  7 05:37:00 nickserver kernel: Modules linked in: md_mod xor dm_mod fglrx(P) [last unloaded: md_mod]

    May  7 05:37:00 nickserver kernel: Pid: 0, comm: swapper Tainted: P          2.6.32.9-unRAID #9

    May  7 05:37:00 nickserver kernel: Call Trace:

    May  7 05:37:00 nickserver kernel:  [<c102523f>] warn_slowpath_common+0x65/0x7c

    May  7 05:37:00 nickserver kernel:  [<c12f838c>] ? dev_watchdog+0xf5/0x175

    May  7 05:37:00 nickserver kernel:  [<c102528a>] warn_slowpath_fmt+0x24/0x27

    May  7 05:37:00 nickserver kernel:  [<c12f838c>] dev_watchdog+0xf5/0x175

    May  7 05:37:00 nickserver kernel:  [<c1031c1b>] ? insert_work+0x41/0x49

    May  7 05:37:00 nickserver kernel:  [<c1031f4e>] ? __queue_work+0x2a/0x2f

    May  7 05:37:00 nickserver kernel:  [<c102c983>] run_timer_softirq+0x112/0x166

    May  7 05:37:00 nickserver kernel:  [<c12f8297>] ? dev_watchdog+0x0/0x175

    May  7 05:37:00 nickserver kernel:  [<c1029269>] __do_softirq+0x79/0xee

    May  7 05:37:00 nickserver kernel:  [<c1029304>] do_softirq+0x26/0x2b

    May  7 05:37:00 nickserver kernel:  [<c10293e3>] irq_exit+0x29/0x2b

    May  7 05:37:00 nickserver kernel:  [<c10126cf>] smp_apic_timer_interrupt+0x6f/0x7d

    May  7 05:37:00 nickserver kernel:  [<c10031f6>] apic_timer_interrupt+0x2a/0x30

    May  7 05:37:00 nickserver kernel:  [<c100843f>] ? default_idle+0x2d/0x42

    May  7 05:37:00 nickserver kernel:  [<c100868c>] c1e_idle+0xcd/0xd2

    May  7 05:37:00 nickserver kernel:  [<c1001b66>] cpu_idle+0x3a/0x50

    May  7 05:37:00 nickserver kernel:  [<c1346fe7>] rest_init+0x53/0x55

    May  7 05:37:00 nickserver kernel:  [<c14fb79d>] start_kernel+0x27b/0x280

    May  7 05:37:00 nickserver kernel:  [<c14fb097>] i386_start_kernel+0x97/0x9e

    May  7 05:37:00 nickserver kernel: ---[ end trace 6f5f19d34dc73db0 ]---

     

    Since a large portion of the identified "bad" blocks are >1500000000, my inclination is to think the issue lies with one of the 2TB drives (or hopefully the SATA cables attached to them).  SMART reports attached.
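
A quick way to tell repeatable mismatches from one-off ones is to count how often each address shows up across all the logged checks. This is only a sketch, assuming the log lines keep the `md: parity incorrect: <address>` shape shown above; the function name `count_parity_addresses` is made up for illustration.

```shell
# Count how often each "parity incorrect" address appears across all checks.
# Reads syslog-style lines on stdin and prints "<count> <address>",
# most frequent address first.
count_parity_addresses() {
    grep 'md: parity incorrect:' \
        | sed 's/.*parity incorrect: *\([0-9][0-9]*\).*/\1/' \
        | sort | uniq -c | sort -k1,1nr -k2,2n \
        | awk '{print $1, $2}'
}

# Example (not run here):
#   cat /var/log/syslog.1 /var/log/syslog | count_parity_addresses
```

In the excerpts above, 1542896200 is the only address that appears in all four checks; every other address appears exactly once.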

     

    Aside: has anyone figured out a good way to determine which file a block maps to with reiserFS yet? ext2/3/4 has the 'debugfs' tool that can do it...

     

    Any thoughts other than 'replace the SATA cables on the two 2TB drives and try another non-correcting check'?

    smart.txt

    syslog.txt

  2. I've had some unexpected power outages a few times over the past week (no UPS yet), and Crashplan is causing some annoying behavior.

     

    Note: I've done a full Slackware install with unRAID 4.7, which may be the root of my 'user shares not starting' issue, but I haven't been able to confirm that.

     

    When my server comes back up after an unclean shutdown, it begins the parity check as expected, but does not auto-mount my user shares.  Is this typical behavior, or unique to me?

     

    My crashplan is configured to back up to /mnt/user/Crashplan.  When it comes up after a power failure, Crashplan auto-starts, but the user shares aren't mounted (not sure if the disk shares get auto-mounted or not).  So Crashplan decides to try to start making a brand new backup completely from scratch under the non-array-mounted directory /mnt/user/Crashplan, which rapidly fills the boot drive.

     

    What I'd like to do is have some mechanism of testing if the array and user shares have been properly configured before telling Crashplan to auto-start in my 'go' script.

     

    I found this snippet in some plugin thread, but unfortunately /proc/mdcmd already reports my array as 'STARTED' while it's doing the unclean-shutdown parity check:

    until grep -q -a "STARTED" /proc/mdcmd 2>/dev/null; do echo ">>>waiting for unraid array to start..."; sleep 5; done; echo ">>>STARTED."

     

    So it blows right through the wait, launches Crashplan even though the unraid array isn't really ready, and begins a new backup to an unmounted location.

     

    Any ideas on how to properly implement a wait?  I guess a quick fix would be something like

    if [ -d "/mnt/user/Crashplan" ]; then
        <start crashplan>
    fi

    but I'd prefer something that actually checks the array status, not just whether the directory exists. 

     

    Any ideas?
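
One possible wait, sketched under an assumption that should be checked first: that the user shares appear as their own entry in /proc/mounts once they are really mounted (`grep /mnt/user /proc/mounts` on a healthy boot would confirm it). The helper names `is_mounted` and `wait_for_mount` and the 600-second default timeout are hypothetical, and `<start crashplan>` stands in for whatever command launches CrashPlan.

```shell
# Return 0 if the given path is listed as a mount point.
# $1 = path to check, $2 = mounts table (defaults to /proc/mounts)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Poll every 5 seconds until the path is mounted.
# Returns 1 if the timeout (in seconds, default 600) expires first.
wait_for_mount() {
    local path=$1 timeout=${2:-600} waited=0
    until is_mounted "$path"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 5
        waited=$((waited + 5))
    done
}

# Example for the 'go' script (not run here):
#   if wait_for_mount /mnt/user 600; then
#       <start crashplan>
#   fi
```

Unlike the directory test, this does not pass when /mnt/user/Crashplan merely exists as an empty directory on the boot drive.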

  3. So I was emailing a bit with Crashplan support trying to get a few things clarified about the private data key options, and the CSR mentioned that in certain scenarios, incorrectly entering your private data key will cause your backup data to be deleted.  

     

    I asked him to clarify and elaborate. Response below.

     

    If you perform the Adoption procedure, the CrashPlan software will ask for the encryption key for the archive that you are adopting. If you mis-enter the key, then the archive will be cleared.

     

    Just thought you all might find this ... interesting.

  4. Look at the normalized values... they actually improved during the preclear.   (they are further from their failure thresholds, and it looks as if they start at 100 when fresh from the factory, as that is the "old value" for most of those that changed)

     

    Ignore the raw values... very few are meaningful to end users.  Your drive got better as it broke in during the preclear.

    Thanks, Joe! Helpful as always :)

  5. Just finished 2 (separate) rounds of preclear on a drive that will replace my parity, and just wanted some eyes more familiar with SMART stats to confirm that these deltas are nothing to be concerned about:

     

    1st preclear

    ** Changed attributes in files: /tmp/smart_start_sdb  /tmp/smart_finish_sdb
                    ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
          Raw_Read_Error_Rate =   116     100            6        ok          102990368
             Spin_Retry_Count =   100     100           97        near_thresh 0
             End-to-End_Error =   100     100           99        near_thresh 0
      Airflow_Temperature_Cel =    60      61           45        near_thresh 40
          Temperature_Celsius =    40      39            0        ok          40
       Hardware_ECC_Recovered =    37     100            0        ok          102990368
    No SMART attributes are FAILING_NOW

     

     

    2nd preclear

    Disk Temperature: 38C, Elapsed Time:  26:23:28
    ========================================================================1.13
    ==  ST2000DL003-9VT166    5YD517KW
    == Disk /dev/sdb has been successfully precleared
    == with a starting sector of 64
    ============================================================================
    ** Changed attributes in files: /tmp/smart_start_sdb  /tmp/smart_finish_sdb
                    ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
          Raw_Read_Error_Rate =   119     116            6        ok          211246080
             Spin_Retry_Count =   100     100           97        near_thresh 0
             End-to-End_Error =   100     100           99        near_thresh 0
      Airflow_Temperature_Cel =    62      64           45        near_thresh 38
          Temperature_Celsius =    38      36            0        ok          38
       Hardware_ECC_Recovered =    37      26            0        ok          211246080
    No SMART attributes are FAILING_NOW
    
    0 sectors were pending re-allocation before the start of the preclear.
    0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
    0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
    0 sectors are pending re-allocation at the end of the preclear,
        the number of sectors pending re-allocation did not change.
    0 sectors had been re-allocated before the start of the preclear.
    0 sectors are re-allocated at the end of the preclear,
        the number of sectors re-allocated did not change.
    

     

    The only ones that seem intuitively concerning to me are the Raw_Read_Error_Rate and Hardware_ECC_Recovered changes. Though this page seems to indicate this may be fine.  Thoughts?
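
The margin between each normalized value and its failure threshold can be read straight off the report. A rough sketch, assuming the column order printed by the preclear report above (NEW_VAL, OLD_VAL, FAILURE_THRESHOLD); `smart_margins` is a hypothetical helper name.

```shell
# Print each changed attribute with its old and new margin above the
# failure threshold. Reads preclear-style report lines on stdin:
#   NAME = NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
# Header and other non-attribute lines are skipped.
smart_margins() {
    awk '$2 == "=" && $3 ~ /^[0-9]+$/ {
        printf "%s old_margin=%d new_margin=%d\n", $1, $4 - $5, $3 - $5
    }'
}
```

Fed the second report, this gives Raw_Read_Error_Rate a margin of 113 (up from 110) and Hardware_ECC_Recovered a margin of 37 (up from 26), so both moved away from their thresholds during that run.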

  6. Is it possible to have my unraid samba shares visible ONLY to valid users? 

     

    For instance, I have a user 'nick' and a user 'streaming'.  'Nick' is a valid user on all shares (movies, photos, backup, personal, etc) and has read/write access to everything.  'Streaming' is a valid user only on 'tv' and 'movies', and only has read access to those two shares.  If 'streaming' tries to access any other shares, access is (correctly) denied.  I want those other shares to not show up as browseable for 'streaming' at all.

     

    I set up my tv-connected streaming device to log in as 'streaming', and I want ONLY the 'tv' and 'movies' shares to be visible to that user so that I don't have to scroll through a dozen different shares just to get to the two that are relevant to the device.  When 'nick' logs in, I want all shares to be visible and browseable.

     

    I've spent quite a bit of time over the past day trying to figure this out.  There are a few relevant samba config options, but none does quite what I'm looking for.  I've found about a dozen threads (on other forums/mailing lists) of people asking the same question, with no valid solutions offered.

     

    Setting "browseable=no" on the non-streaming shares would sort of accomplish this, but then 'nick' couldn't browse all the shares except by typing the name of the share or auto-mounting the shares on startup or something.  I don't want to do this.

     

    Setting "hide unreadable=yes" sounds like it does exactly what I want.  However, this only works on files/directories within a given share.  It does not hide the top-level share itself.

     

    Any ideas??

  7. The dm-mod tweak still gives me the said error (I've checked the files, they seem correct, used wget with pastebin's Download link).

    I'd forgotten that Makefiles can be extremely picky about using the correct whitespace (you must use tabs, rather than spaces, at the beginning of lines).  Tab characters don't seem to have been preserved in Pastebin when I copy/pasted.

     

    Try downloading those two files using wget from here instead.

     

    edit: also, the preclear script needs updating for Slackware 13.1.  See my post here for details.  Joe said he'd update the main script, but in the meantime I've made my locally edited version of the preclear script available here.  I make no promises whatsoever about the script, other than it worked for me :)
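
For the whitespace problem mentioned above, a quick check is to list lines that begin with spaces rather than a tab. This is a sketch only: `find_space_indented_lines` is a made-up name, and not every hit is necessarily an error, since make only insists on tabs for recipe lines.

```shell
# List lines that start with one or more spaces instead of a tab.
# $1 = path to the Makefile
# Prints "<line number>:<line>" for each hit; returns 0 if any were found.
find_space_indented_lines() {
    grep -n '^  *[^ ]' "$1"
}

# Example (not run here):
#   find_space_indented_lines /usr/src/linux/drivers/md/Makefile
```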

  8. I have not successfully booted into 2.6.32.9

    I'd strongly recommend getting this to work first, before trying to mess with the dm-mod tweak.

     

     

    I think you've answered your own question here :)

    On boot up I get a kernel panic something to the effect, "VFS: unable to mount ..."  

    Obviously due to the fact that it can't boot the root fs on the device.

     

    For those who have installed a slackware system with the AOC-SASLP-MV8, what have you done? Is there a configuration I'm missing? I tried adding modprobe scst and /sbin/modprobe mvsas to rc.modules to no avail.

     

    The only way I can boot into this kernel is if I enable built in support for AOC-SASLP-MV8

     

    If your boot drive is on the raid controller, then you will not be able to boot unless you have the raid controller's drivers built into the kernel.  A module won't work.  Think about it: the module is stored on the hard drive that you're trying to access in order to boot from.  But you can't access that hard drive until the module has been loaded!

     

    Another option is to create an initrd (initial ramdisk) which contains the relevant modules. If there's some legitimate reason that the mvsas driver should be built as a module rather than built into the kernel, then this is what you should do.  This isn't something I've ever had great success with (potentially just from lack of really trying) and I've generally avoided the process.  There should be a plethora of guides on how to do it available on Google.  Here's a site that explains what these are and why you might need them.

     

    The easiest way to work around all of this, though, (and what I chose to do) is to just put your boot drive somewhere that is easily and directly accessible without going through an external controller.  Directly connected to your motherboard (with the proper kernel drivers built-in), for instance.
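
To see whether a driver is currently configured as built-in or as a module, the kernel's .config can be inspected directly. A small sketch: `kernel_option_mode` is a hypothetical helper, and CONFIG_SCSI_MVSAS is assumed to be the option that controls the mvsas driver in this kernel tree.

```shell
# Report how a kernel option is set in a .config file.
# $1 = option name (e.g. CONFIG_SCSI_MVSAS), $2 = path to the .config
# Prints "built-in" (=y), "module" (=m), or "disabled" (anything else).
kernel_option_mode() {
    case "$(grep "^$1=" "$2" | cut -d= -f2)" in
        y) echo "built-in" ;;
        m) echo "module" ;;
        *) echo "disabled" ;;
    esac
}

# Example (not run here):
#   kernel_option_mode CONFIG_SCSI_MVSAS /usr/src/linux/.config
```

If the boot drive hangs off the controller and this prints "module", the kernel cannot reach its root filesystem without an initrd.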

  9. Thanks for your help nick. But moving Makefile and Kconfig into the md directory gives me errors while issuing make oldconfig:

    Strange. I just double-checked, and it still works fine for me.  

     

    Have you successfully compiled and booted into the kernel (and tested that the unRAID interface works) without this tweak for dm-mod?

     

    You're using kernel 2.6.32.9, right?

     

    Take a look at the bottom of that Kconfig file.  Make sure it includes the last two lines; it's easy to accidentally miss a line at the bottom of a large copy/paste:

    endmenu
    
    endif
    

     

    If that doesn't work, just replace them both with the originals that you copied over from the /unraid/ directory (from the wiki's instructions).  dm-mod might not be absolutely critical...
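
A quick way to confirm the paste was not truncated is to look at the last two non-blank lines of the file. A minimal sketch, assuming the replacement Kconfig really does end with `endmenu` followed by `endif` as described above; `kconfig_tail_ok` is a made-up helper name.

```shell
# Return 0 if the last two non-blank lines of the file start with
# "endmenu" and "endif" respectively, 1 otherwise.
# $1 = path to the Kconfig file
kconfig_tail_ok() {
    awk 'NF { prev = last; last = $1 }
         END { exit !(prev == "endmenu" && last == "endif") }' "$1"
}

# Example (not run here):
#   kconfig_tail_ok /usr/src/linux/drivers/md/Kconfig || echo "Kconfig looks truncated"
```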

  11. These are SATA drives.  How would I check to make sure I'm not in IDE emulation mode?  A quick bit of googling wasn't conclusive.

     

    Check in your BIOS for settings related to IDE for the SATA chipset.  You want to set the "mode" on the chipset to AHCI for the best, native SATA performance.

    The only potentially-relevant setting I could find in my BIOS was for a SATA RAID mode, which some sites say might implicitly enable AHCI; I left it off. However, hdparm -I says NCQ is supported/enabled for these drives, which implies to me that the drives aren't running in legacy IDE mode.
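
Besides hdparm, the queue depth the kernel is actually using for a disk is exposed in sysfs, which gives a second data point. A sketch only: `queue_depth` is a hypothetical helper, and the optional second argument exists purely so the function can be pointed at a different sysfs root.

```shell
# Print the queue depth the kernel is using for a disk.
# $1 = device name (e.g. sda), $2 = sysfs root (defaults to /sys)
queue_depth() {
    cat "${2:-/sys}/block/$1/device/queue_depth"
}

# Example (not run here):
#   for d in sda sdb sdc; do echo "$d: $(queue_depth "$d")"; done
```

A depth greater than 1 suggests commands are really being queued with NCQ; a depth of 1 means they are issued one at a time, whatever the drive itself advertises.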

  12. First.. some definitions.

     

    Parity errors occur when the number of bits set to "1" at the same bit position across the drives in the array is not even.  The errors you are seeing when pre-clearing drives have absolutely nothing to do with parity, as the drives are not yet assigned to the parity-protected array.

    After reading all the talk here on the board about 'parity errors', they were on my mind and I simply misspoke; thanks for being extra clear, though.

     

    The errors you are seeing are ICRC errors.  (checksum errors in communication with the disks)  That typically indicates problems in either the cables used, the disk controller ports used, the power supply, or the disks themselves.

    I haven't been able to reproduce these ICRC errors in standalone testing yet as I'm not sure where on the disk they occurred, but...

     

     

    As far as not telling you why the pre-clear was un-successful, well... it is...

     

    On step 10... Testing if the pre-clear was successful out4 = 00092 and out5 = 00092 were both the un-expected values.  Basically, the values read back from the drive were not as expected. 

    This "MBR preclear error" seems to stem simply from a different implementation of "echo" in my environment.  My version of echo wants "\0" preceding octal numbers, and has no idea what I'm talking about when given, for instance, "\252" in the script:

    root@nickserver:/usr/src/linux# echo -ne "\252"
    \252root@nickserver:/usr/src/linux#

     

    "Step 6"
      # set MBR signature in last two bytes in MBR
      # two byte MBR signature
      echo -ne "\252" | dd bs=1 count=1 seek=511 of=$theDisk
      echo -ne "\125" | dd bs=1 count=1 seek=510 of=$theDisk

     

    The script is expecting out4 = 00170 and out5 = 00085

    echo -ne "\252" | dd bs=1 count=1 seek=511 of=/dev/sdc   >& /dev/null
    echo -ne "\125" | dd bs=1 count=1 seek=510 of=/dev/sdc  >& /dev/null
    root@nickserver:~# dd bs=1 count=1 skip=511 if=/dev/sdc 2>/dev/null |sum|awk '{print $1}'
    00092
    root@nickserver:~# dd bs=1 count=1 skip=510 if=/dev/sdc 2>/dev/null |sum|awk '{print $1}'
    00092

     

    echo -ne "\0252" | dd bs=1 count=1 seek=511 of=/dev/sdc >& /dev/null
    echo -ne "\0125" | dd bs=1 count=1 seek=510 of=/dev/sdc  >& /dev/null
    root@nickserver:~# dd bs=1 count=1 skip=511 if=/dev/sdc 2>/dev/null |sum|awk '{print $1}' #out4
    00170
    root@nickserver:~# dd bs=1 count=1 skip=510 if=/dev/sdc 2>/dev/null |sum|awk '{print $1}' #out5
    00085
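
The 00092 read back in the failing case is decimal 92, the ASCII code for a backslash, which fits echo having written the escape sequence literally. One way to sidestep the difference between echo implementations is printf, whose `\ddd` octal escapes are specified by POSIX. A sketch, not the preclear script's own code; `write_mbr_signature` is a made-up name, and `conv=notrunc` is only there so the same command also works on an image file.

```shell
# Write the two-byte MBR signature: 0x55 at offset 510, 0xAA at offset 511.
# printf handles \ddd octal escapes consistently, unlike echo -e.
# $1 = target disk or image file
write_mbr_signature() {
    printf '\125\252' | dd bs=1 count=2 seek=510 conv=notrunc of="$1" 2>/dev/null
}

# Example (not run here; this overwrites bytes on the target):
#   write_mbr_signature /dev/sdc
```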

     

     

    From your other post I see you are using "experimental" drivers that nobody else in unRAID is using.   That, to me, indicates you are not a Linux newbie.  (it may also be a mistake in judgment, as support other than in very general terms is impossible... and non-existent from lime-technology)

     

    Because you are experienced enough to compile your own kernel I think you'll be able to look at the pre-clear shell script and see where the specific verification steps are performed checking for specific values.

    I'd like to think that's an accurate statement.  On the other hand, it's possible that I know just enough to be a danger to myself ;-)  Really though, thanks for the prod to 'go figure it out yourself'!  This wasn't an issue that could have reasonably been figured out by anyone without access to my system.

     

    Because you are using those drivers, it is impossible for me to easily tell if the drives involved are SATA or IDE.    If IDE, then it could easily be the cable used for the two disks.  It might be defective, or it might be an older 40-conductor cable instead of an 80-conductor cable.   You might have bundled the disk cables tightly to the noisy power cables.   If SATA, you might have the SATA controller in IDE emulation mode.

    These are SATA drives.  How would I check to make sure I'm not in IDE emulation mode?  A quick bit of googling wasn't conclusive.

     

     

    In any case, these same errors will only cause hair-loss if you do not resolve them NOW before you start using that set of hardware for an unRAID array.  It has nothing directly to do with the pre-clear script, but it does show how the pre-clear process will expose them.  Any drive that cannot be read back "correctly" is a problem.  You'll face constant random parity errors, and pull out your hair trying to resolve the issue.   :)

     

    The disks themselves are probably OK (even if they are not currently pre-cleared)  Once you resolve the CRC errors, you can attempt the pre-clear process on them again.

    The preclear script had a single CRC error that I haven't been able to repeat.  I think I'm going to go ahead and power cycle and run it again, to see what happens.

     

    Though if anyone has other ideas (particularly to try to reproduce the CRC error) I'd be open to trying it, as a 10 hour test cycle is going to be a little frustrating if it keeps failing at the end :)

     

    Thanks again for your help, Joe.

  13. I'm setting up my unRAID (Pro) server for the first time (and running on a full Slackware 13.1 installation).

     

    Both of my SATA Samsung 1.5TB HD154UI drives gave me results similar to this after 10.5 hours:

     

    ===========================================================================
    =                unRAID server Pre-Clear disk /dev/sdb
    =                       cycle 1 of 1
    = Disk Pre-Clear-Read completed                                 DONE
    = Step 1 of 10 - Copying zeros to first 2048k bytes             DONE
    = Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
    = Step 3 of 10 - Disk is now cleared from MBR onward.           DONE
    = Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4       DONE
    = Step 5 of 10 - Clearing MBR code area                         DONE
    = Step 6 of 10 - Setting MBR signature bytes                    DONE
    = Step 7 of 10 - Setting partition 1 to precleared state        DONE
    = Step 8 of 10 - Notifying kernel we changed the partitioning   DONE
    = Step 9 of 10 - Creating the /dev/disk/by* entries             DONE
    = Step 10 of 10 - Testing if the clear has been successful.     DONE
    =
    Disk Temperature: 32C, Elapsed Time:  10:32:36
    ============================================================================
    ==
    == SORRY: Disk /dev/sdb MBR could NOT be precleared
    ==
    == out4= 00092
    == out5= 00092
    ============================================================================
    1+0 records in
    1+0 records out
    512 bytes (512 B) copied, 0.000245285 s, 2.1 MB/s
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0000700 0000 0000 0000 003f 0000 7af1 aea8 0000
    0000720 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0000760 0000 0000 0000 0000 0000 0000 0000 5c5c
    0001000
    

     

    Each item is "DONE", but it fails with no indication of what the problem is or why, just "could NOT be precleared"....

     

    I see this in the syslog, but it seems odd that the parity errors would occur on both SATA drives at the exact same time

    Dec 14 01:55:48 nickserver kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x580000 action 0x6
    Dec 14 01:55:48 nickserver kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x1980000 action 0x6
    Dec 14 01:55:48 nickserver kernel: ata4.00: BMDMA stat 0x25
    Dec 14 01:55:48 nickserver kernel: ata4: SError: { 10B8B Dispar LinkSeq TrStaTrns }
    Dec 14 01:55:48 nickserver kernel: ata3.00: BMDMA stat 0x25
    Dec 14 01:55:48 nickserver kernel: ata4.00: failed command: WRITE DMA EXT
    Dec 14 01:55:48 nickserver kernel: ata4.00: cmd 35/00:00:68:53:f8/00:04:10:00:00/e0 tag 0 dma 524288 out
    Dec 14 01:55:48 nickserver kernel:          res 51/84:b3:b5:54:f8/84:02:10:00:00/e0 Emask 0x10 (ATA bus error)
    Dec 14 01:55:48 nickserver kernel: ata4.00: status: { DRDY ERR }
    Dec 14 01:55:48 nickserver kernel: ata4.00: error: { ICRC ABRT }
    Dec 14 01:55:48 nickserver kernel: ata3: SError: { 10B8B Dispar Handshk }
    Dec 14 01:55:48 nickserver kernel: ata3.00: failed command: WRITE DMA EXT
    Dec 14 01:55:48 nickserver kernel: ata3.00: cmd 35/00:00:98:df:d2/00:04:0e:00:00/e0 tag 0 dma 524288 out
    Dec 14 01:55:48 nickserver kernel:          res 51/84:61:37:e0:d2/84:03:0e:00:00/e0 Emask 0x10 (ATA bus error)
    Dec 14 01:55:48 nickserver kernel: ata3.00: status: { DRDY ERR }
    Dec 14 01:55:48 nickserver kernel: ata3.00: error: { ICRC ABRT }

     

    Thoughts?

     

    If I saw this in someone else's log, I might think it was due to an insufficient PSU.  I don't think that's my issue, though; I've got a 480W Antec power supply running 3 HDDs, a CD/DVD drive, a graphics card, and the motherboard/CPU -- that's it.

    syslog.txt

  14. Just to clarify, BRiT's instructions are extremely helpful, but ever so slightly incorrect/unclear.

     

    You should replace "modprobe -rw" with "rmmod    -w " (that's "rmmod<space><space><space><space>-w<space>", so that there are four spaces between "rmmod" and "-w", and two spaces between "-w" and "md-mod" once the original space is counted).

     

    Editing the binary in 'vim' worked fine for me.

     

    Just for reference, this post consolidates all the additional steps compared to the wiki that I had to take when installing unRAID 4.6 onto a full Slackware 13.1 distribution.
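
The padding matters because the replacement has to occupy exactly the same number of bytes as the original string inside the binary. As an alternative to editing by hand, here is a sketch that checks the lengths and patches a copy. It assumes GNU sed, which passes the other bytes of a binary through unchanged; `patch_same_length` is a hypothetical name, and the path to the binary is left as an argument.

```shell
old='modprobe -rw'
new='rmmod    -w '   # "rmmod", four spaces, "-w", one trailing space

# Patch a copy of a file, refusing to run if the two strings differ in length.
# $1 = original file, $2 = patched copy to create
patch_same_length() {
    [ "${#old}" -eq "${#new}" ] || return 1
    LC_ALL=C sed "s/$old/$new/g" "$1" > "$2"
}

# Example (not run here):
#   patch_same_length <path to binary> <path to binary>.patched
```

Comparing the two files with `wc -c` afterwards confirms the size did not change before the patched copy is swapped in.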

  15. I just installed unRAID 4.6 on a fresh installation of 32-bit Slackware 13.1, loosely following the directions at http://www.lime-technology.com/wiki/index.php?title=Installing_unRAID_on_a_full_Slackware_distro

     

    I didn't have any "legacy" info in my unRAID configuration, so I chose kernel options to conform to the Slackware defaults (for instance, using the 'experimental' PATA drivers in libata instead of the older ATA/IDE/etc option for PATA drives; this causes PATA drives to show up as /dev/sd* instead of /dev/hd*).

     

    Notable changes from the wiki instructions:

    • kernel version is linux-2.6.32.9
    • You additionally need to copy over the following files from the unRAID distribution into your full Slackware distribution:  /lib/libvolume_id.so.1.1.0 (and create a symlink to it from /lib/libvolume_id.so.1), /etc/exports-, /var/spool/cron/crontabs/root-
    • The names of a variety of kernel options have changed; hopefully you can figure it out.  I chose to disable "Device Drivers > ATA/ATAPI/MFM/RLL support" entirely, and enable the PATA drivers in "Serial ATA (prod) and Parallel ATA (experimental) drivers" instead, as explained above.  Here's my kernel config: http://pastebin.com/HTnU8nYLp.  Note that this is specific to my hardware, and is unlikely to work for you without modifications.
    • Various things (lilo comes to mind) kept complaining that module 'dm-mod' did not exist. The unRAID kernel config files in the devices/md directory disabled the option to create it.  To work around this: 1) build your kernel as specified in the wiki and make sure your system boots successfully; 2) replace /usr/src/linux/drivers/md/Makefile with this and /usr/src/linux/drivers/md/Kconfig with this; 3) 'make oldconfig' and enable dm-mod as a module, then 'make modules && make modules_install'; 4) add '/sbin/modprobe dm-mod' to /etc/rc.d/rc.modules. edit: it's probably better to download these two files (using wget) from my personal host here: http://www.nickmerryman.com/unraid/dmmod_2.6.32.9/ or from the attachments to this post (you'll have to rename the files)
    • Follow the instructions in this thread to work around an issue in the UI where emhttp tries to call modprobe with an unsupported flag.  I think the instructions there are slightly incorrect, or at least a bit unclear; you should replace "modprobe -rw" with "rmmod    -w " (that's "rmmod<space><space><space><space>-w<space>", so that there are four spaces between "rmmod" and "-w", and two spaces between "-w" and "md-mod" once the original space is counted).  Editing the binary in 'vim' worked fine for me.
    • Also, the preclear script needs updating for Slackware 13.1.  See my post here for details.  Joe said he'd update the main script, but in the meantime I've made my locally edited version of the preclear script available here.  I make no promises whatsoever about the script, other than it seems to have worked for me :)
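
The symlink step from the list above can be sketched as follows. `link_libvolume_id` is a made-up helper that takes the library directory as an argument (it would be /lib on the real system); the dm-mod commands are repeated from the list as comments only, since they need a configured kernel tree.

```shell
# Create the libvolume_id.so.1 symlink next to the copied library.
# $1 = library directory (/lib on the real system)
link_libvolume_id() {
    ln -sf libvolume_id.so.1.1.0 "$1/libvolume_id.so.1"
}

# dm-mod steps from the list above, for reference (not run here):
#   cd /usr/src/linux
#   make oldconfig                     # enable dm-mod as a module when prompted
#   make modules && make modules_install
#   echo '/sbin/modprobe dm-mod' >> /etc/rc.d/rc.modules
```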

     

    I think that's everything different from the wiki that I had to do.  My drives are currently 'preclearing', and the only annoyance I currently have is that the "flash" samba share that unRAID creates shares my /boot partition, rather than actually sharing the /flash directory as its name would imply.  And I have something set up wrong such that I don't get a pretty framebuffer on boot, but that's a) my own problem and b) not really a big deal.

     

    edit 12/13/2010: added preclear script details

    Kconfig.txt

    Makefile.txt