
HDRW

Posts posted by HDRW

  1. Strange result - kicked off the update to 6.12, came back later and rebooted, and the array came up as stopped (is this expected?).  Pressed [Start] and it didn't - it showed all the data drives had "missing or invalid format", or words to that effect.  All were XFS, and showed as such.

    Stopped the array, rebooted, started the array, and this time all drives came up OK.  I wonder what happened?

  2. 28 minutes ago, gfjardim said:

    Not about the same time, the first time it ran for about 6 hours, now it only read 8.3GiB.

     

    Please send me the output of this command: smartctl -a /dev/sdb

    OK, here it is:

     

    root@uServer1:~# smartctl -a /dev/sdb
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.98-Unraid] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Device Model:     TOSHIBA HDWQ140
    Serial Number:    X9GBK1LEFBJG
    LU WWN Device Id: 5 000039 99bc010c0
    Firmware Version: FJ1M
    User Capacity:    4,000,787,030,016 bytes [4.00 TB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    7200 rpm
    Form Factor:      3.5 inches
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   ATA8-ACS (minor revision not indicated)
    SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
    Local Time is:    Wed Feb 26 20:36:29 2020 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status:  (0x82) Offline data collection activity
                                            was completed without error.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                (  120) seconds.
    Offline data collection
    capabilities:                    (0x5b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            No Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 455) minutes.
    SCT capabilities:              (0x003d) SCT Status supported.
                                            SCT Error Recovery Control supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
      3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       7394
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       2
      5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
      9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       125
     10 Spin_Retry_Count        0x0033   100   100   030    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2
    191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
    193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       133
    194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       28 (Min/Max 11/42)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
    222 Loaded_Hours            0x0032   100   100   000    Old_age   Always       -       125
    223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
    224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
    226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       637
    240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

    SMART Error Log Version: 1
    No Errors Logged

    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error       00%        81         -
    # 2  Short offline       Completed without error       00%        44         -

    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
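For anyone wanting to check a report like this quickly, here is a minimal Python sketch that pulls the raw values out of the attribute table. The choice of "watched" IDs (5, 197, 198, 199) is my own assumption about which counters are usually looked at first, not something the preclear plugin defines, and the sample rows are copied from the output above.

```python
import re

# Sample rows copied from the smartctl attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       28 (Min/Max 11/42)
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw (leading integer only)
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Assumption: these IDs are the ones worth flagging when the raw value is non-zero.
WATCHED = {5, 197, 198, 199}


def parse_attributes(text):
    """Return {attribute_id: (name, raw_value)} for every table row found."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[int(m.group(1))] = (m.group(2), int(m.group(9)))
    return attrs


def nonzero_watched(attrs):
    """Return only the watched attributes whose raw value is not zero."""
    return {i: v for i, v in attrs.items() if i in WATCHED and v[1] != 0}


if __name__ == "__main__":
    print(nonzero_watched(parse_attributes(SAMPLE)) or "no watched counters raised")
```

On the rows above, none of the watched counters is raised, which fits the drive itself reporting no media or link CRC errors.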

     

    Cheers,

    Howard

     

  3. On 2/25/2020 at 4:42 PM, gfjardim said:

    Your hard drive returned input/output errors at Feb 22 22:15:01 and that caused the preclear session to fail. This could be the hard drive controller's fault or could be your USB3 dock controller's fault, but I'm confident it's your USB3 dock's fault, because hard drives usually present media errors and yours doesn't. Maybe it got hot, who knows.

     

    Try replacing it or try to add a small fan toward it.

    Thanks for your help.  I'm sure it's not a temperature problem - where it is makes this pretty-much impossible!

     

    I ran the Preclear again, this time starting with the Erase, which ran OK (taking about 24hrs) and then zeroing.  I last saw the log with less than 10% remaining of the Zeroing, at about 12:00 today.  When I came back for a look this evening, the drive had vanished from the Main screen, so I couldn't get to its own log, but the main log showed this (there wasn't anything before this):

    =====================

    Feb 26 12:26:19 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
    Feb 26 12:26:19 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 f7 e0 00 00 00 08 00 00 00
    Feb 26 12:26:19 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 16244736
    Feb 26 12:26:19 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 16242688
    Feb 26 12:26:19 uServer1 kernel: Buffer I/O error on dev sdb, logical block 2030336, async page read
    Feb 26 12:26:19 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 16242688
    Feb 26 12:26:19 uServer1 kernel: Buffer I/O error on dev sdb, logical block 2030336, async page read
    Feb 26 12:26:19 uServer1 rc.diskinfo[7387]: SIGHUP received, forcing refresh of disks info.
    Feb 26 12:26:19 uServer1 kernel: usb 8-2: new SuperSpeed Gen 1 USB device number 3 using xhci_hcd
    Feb 26 12:26:19 uServer1 kernel: usb 8-2: language id specifier not provided by device, defaulting to English
    Feb 26 12:26:19 uServer1 kernel: usb-storage 8-2:1.0: USB Mass Storage device detected
    Feb 26 12:26:19 uServer1 kernel: scsi host1: usb-storage 8-2:1.0
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd - read 8316256256 of 4000787030016.
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd command failed, exit code [1].
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 10+0 records in
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 9+0 records out
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 18874368 bytes (19 MB, 18 MiB) copied, 0.234822 s, 80.4 MB/s
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 1102+0 records in
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 1101+0 records out
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 2308964352 bytes (2.3 GB, 2.2 GiB) copied, 12.1095 s, 191 MB/s
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 2263+0 records in
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 2262+0 records out
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 4743757824 bytes (4.7 GB, 4.4 GiB) copied, 24.7798 s, 191 MB/s
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 3453+0 records in
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 3452+0 records out
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 7239368704 bytes (7.2 GB, 6.7 GiB) copied, 37.7141 s, 192 MB/s
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: dd: error reading '/dev/sdb': Input/output error
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 3964+1 records in
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 3964+1 records out
    Feb 26 12:26:21 uServer1 preclear_disk_0123456789000000005[1943]: Post-Read: dd output: 8314159104 bytes (8.3 GB, 7.7 GiB) copied, 43.4195 s, 191 MB/s
    Feb 26 12:26:23 uServer1 preclear_disk_0123456789000000005[1943]: error encountered, exiting...
    Feb 26 12:26:26 uServer1 kernel: mdcmd (72): spindown 2
    Feb 26 12:26:41 uServer1 kernel: usb 8-2: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
    Feb 26 12:29:58 uServer1 kernel: mdcmd (73): spindown 3
    Feb 26 12:37:00 uServer1 kernel: mdcmd (74): spindown 0
    Feb 26 13:49:30 uServer1 kernel: mdcmd (75): spindown 0
    Feb 26 14:14:41 uServer1 kernel: mdcmd (76): spindown 0
    Feb 26 14:14:53 uServer1 kernel: mdcmd (77): spindown 1
    Feb 26 19:37:16 uServer1 emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/disk_log sdc

    =====================

    So it looks like the failure occurred at about the end of Zeroing (given the time).
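A rough way to see how far each run actually got, using only figures from the logs: the second run's "Post-Read: dd - read 8316256256 of 4000787030016" line, and the first run's failing sector (6810562560) against the sector count the kernel reported (7814037168). This is just a sketch of the arithmetic; note that the "Post-Read" prefix suggests the second failure may have been early in the read-back rather than at the end of Zeroing.

```python
def percent_done(done, total):
    """Percentage of the device covered when the error hit."""
    return 100.0 * done / total


# Second run (Feb 26): byte counts from the Post-Read log line.
SECOND_RUN_READ = 8316256256
DISK_BYTES = 4000787030016

# First run (Feb 22): failing sector and total 512-byte sectors from the kernel log.
FIRST_RUN_SECTOR = 6810562560
DISK_SECTORS = 7814037168

if __name__ == "__main__":
    print(f"second run: {percent_done(SECOND_RUN_READ, DISK_BYTES):.2f}% read back")
    print(f"first run:  {percent_done(FIRST_RUN_SECTOR, DISK_SECTORS):.2f}% read back")
```

That works out to roughly 0.2% for the second run and roughly 87% for the first, so the two failures were not at the same position on the disk.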

     

    I rebooted and the drive reappeared, showing "precleared" but with the Format button inoperative, as before.

    So it does look like something is causing the USB/SATA dock to fail, but it got through erasing and only failed at about the same point during Zeroing.  I'm a bit lost for ideas!

     

    Cheers,

    Howard

  4. (I posted this in General originally, not realising it should be here - sorry!)

    I've got a brand-new Toshiba X300 "NAS"  4TB drive that I want to swap-in to replace a 2TB data disk in my array (the Parity drive is already 4TB) but I'm having trouble with the "Preclear" function, accessed by pressing "Start Preclear"  under "Unassigned Devices" on the Main page...

     

    As my hardware doesn't have any spare SATA ports, I have the new drive connected to a USB3 "Drive Dock", where the disk plugs into a slot in the top and connects to a SATA connector at the bottom of the slot.  Seems to work fine...

     

     I did a Preclear with the Pre-read set Off (it's a new disk, no SMART problems, zero power-on hours showing when I started using it, so I assumed pre-read was not going to do anything useful).

     

    The Zeroing went well, but during Post-Read something happened at some point after 75% - see below for the messages that popped up as it went:

     

    =================================================

    Preclear on 0123456789000000005: 22-02-2020 09:14
    Zeroing started on 0123456789000000005 (sdb)
    Zeroing started on 0123456789000000005 (sdb). Cycle 1 of 1.

     

    Preclear on 0123456789000000005: 22-02-2020 16:16
    Zeroing finished on 0123456789000000005 (sdb)
    Zeroing finished on 0123456789000000005 (sdb). Cycle 1 of 1.

     

    Preclear on 0123456789000000005: 22-02-2020 16:16
    Post-Read started on 0123456789000000005 (sdb)
    Post-Read started on 0123456789000000005 (sdb). Cycle 1 of 1.

     

    Preclear on 0123456789000000005: 22-02-2020 17:44
    Post-Read in progress on 0123456789000000005 (sdb)

     

    Post-Read in progress on 0123456789000000005 (sdb): 25% @ 181. Temp: 41 C. Cycle 1 of 1.

    Preclear on 0123456789000000005: 22-02-2020 19:21
    Post-Read in progress on 0123456789000000005 (sdb)

     

    Post-Read in progress on 0123456789000000005 (sdb): 50% @ 165. Temp: 40 C. Cycle 1 of 1.

    Preclear on 0123456789000000005: 22-02-2020 21:11
    Post-Read in progress on 0123456789000000005 (sdb)

     

    Post-Read in progress on 0123456789000000005 (sdb): 75% @ 136. Temp: 39 C. Cycle 1 of 1.

    Preclear on 0123456789000000005: 22-02-2020 22:15

     

    FAIL! Post-Read 0123456789000000005 (/dev/sdb) failed
    FAIL! Post-Read 0123456789000000005 (/dev/sdb) failed

    ===================================================

    What's going on?  The SMART data is still all OK, but the "Format" button doesn't operate, the only thing I can do is to try another Preclear, but given the time it takes I'd like to know why it failed rather than just "trying it again to see what happens"!

     

    The log file, from the Preclear starting, is as follows:

    ============

    Feb 22 09:14:14 uServer1 preclear_disk_0123456789000000005[15731]: Command: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh --notify 1 --frequency 4 --cycles 1 --skip-preread --no-prompt /dev/sdb
    Feb 22 09:14:21 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 09:14:21 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 09:14:21 uServer1 preclear_disk_0123456789000000005[15731]: Zeroing: dd if=/dev/zero of=/dev/sdb bs=2097152 seek=2097152 count=4000784932864 conv=notrunc iflag=count_bytes,nocache,fullb

    Feb 22 16:16:12 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 16:16:15 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 16:16:18 uServer1 preclear_disk_0123456789000000005[15731]: Post-Read: cmp /tmp/.preclear/sdb/fifo /dev/zero
    Feb 22 16:16:18 uServer1 preclear_disk_0123456789000000005[15731]: Post-Read: dd if=/dev/sdb of=/tmp/.preclear/sdb/fifo count=2096640 skip=512 conv=notrunc iflag=nocache,count_bytes,skip_bytes
    Feb 22 16:16:19 uServer1 preclear_disk_0123456789000000005[15731]: Post-Read: cmp /tmp/.preclear/sdb/fifo /dev/zero
    Feb 22 16:16:20 uServer1 preclear_disk_0123456789000000005[15731]: Post-Read: dd if=/dev/sdb of=/tmp/.preclear/sdb/fifo bs=2097152 skip=2097152 count=4000784932864 conv=notrunc iflag=nocache,count_bytes,skip_bytes
    Feb 22 22:15:01 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
    Feb 22 22:15:01 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 95 f0 f0 00 00 00 08 00 00 00
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 6810562560
    Feb 22 22:15:01 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
    Feb 22 22:15:01 uServer1 kernel: sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 95 f0 f8 00 00 00 08 00 00 00
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 6810564608
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 0
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 1860881286
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 672538168
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 7814037167
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 1372338735
    Feb 22 22:15:01 uServer1 kernel: print_req_error: I/O error, dev sdb, sector 6810562560
    Feb 22 22:15:01 uServer1 kernel: Buffer I/O error on dev sdb, logical block 851320320, async page read
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Write Protect is off
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Mode Sense: 03 00 00 00
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] No Caching mode page found
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Assuming drive cache: write through
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 22:15:03 uServer1 kernel: sdb: sdb1
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    Feb 22 22:15:03 uServer1 kernel: sd 1:0:0:0: [sdb] Attached SCSI disk
    Feb 22 22:15:04 uServer1 preclear_disk_0123456789000000005[15731]: Post-Read: dd output: dd: error reading '/dev/sdb': Input/output error
    Feb 22 22:15:04 uServer1 unassigned.devices: Adding disk '/dev/sdb1'...
    Feb 22 22:15:04 uServer1 unassigned.devices: Mount drive command: /sbin/mount -t precleared -o rw,auto,async,noatime,nodiratime '/dev/sdb1' '/mnt/disks/TOSHIBA_HDWQ140'
    Feb 22 22:15:04 uServer1 unassigned.devices: Mount of '/dev/sdb1' failed. Error message: mount: /mnt/disks/TOSHIBA_HDWQ140: unknown filesystem type 'precleared'.

    ===================

    I have no idea what UNKNOWN (0x2003) means, and it's weird that it suddenly threw errors on apparently random sectors at 22:15:01 - does this suggest anything?
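Part of those kernel lines can be decoded by hand. The sketch below (Python, my own helper names) decodes the 16-byte CDB: opcode 0x88 is READ(16), bytes 2-9 are the starting LBA and bytes 10-13 the number of blocks. It also maps hostbyte=0x07 to the Linux SCSI host code DID_ERROR, a generic transport-level error. I am not attempting to explain the UNKNOWN(0x2003) part here.

```python
# Subset of Linux SCSI host byte codes; 0x07 is the one seen in the log.
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x03: "DID_TIME_OUT",
    0x07: "DID_ERROR",
}


def decode_read16(cdb_hex):
    """Return (start_lba, block_count) for a READ(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # 8-byte big-endian LBA
    blocks = int.from_bytes(cdb[10:14], "big")  # 4-byte big-endian transfer length
    return lba, blocks


if __name__ == "__main__":
    # CDB copied from the Feb 22 22:15:01 log line.
    lba, blocks = decode_read16("88 00 00 00 00 01 95 f0 f0 00 00 00 08 00 00 00")
    print(f"READ(16) lba={lba} blocks={blocks} ({blocks * 512 // 1024} KiB)")
    print("hostbyte 0x07 ->", HOST_BYTES[0x07])
```

The decoded LBA matches the "I/O error, dev sdb, sector 6810562560" line, so that one is simply the read that dd had in flight. Of the other "random" sectors, 0 is the first sector and 7814037167 is the last one on the disk (7814037168 blocks minus one), which would be consistent with the device being re-probed after a reset, though that is a guess on my part.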

     

    Thanks for any help/advice!

     

    Cheers,

    Howard

     

  5. The best thing is the ability to have mixed drive-sizes, with specific drives being visible individually, so I can upgrade drives one at a time and keep the system running while it rebuilds the new, increased-size, drive.

     

    The thing I'd like to see added is more support for external drives, using FireWire, USB 3 etc. So I can have fast access to drives that aren't in the array but need to be network-accessible.

  6. Well I couldn't get eSATA to work (I suspect a power problem), so I went with a USB 3.0 PCI-Express card in the machine, and a USB 3.0 Docking station (where the drive plugs in from the top) which has its own power supply.

     

    It shows the drive correctly as 4TB and the preclear is now running along nicely at a tad over 200MB/sec, so I think I can confirm that my first USB-SATA adaptor is duff, and my USB 2 dock only handles 2TB!

     

    Thanks for all the advice, folks.

     

    (I can't see how to add "SOLVED" to the title, or I'd do that)

  7. Well after 48 hours it crapped out entirely, so I suspect the USB-SATA adaptor is faulty.

    Tried again using a USB  "dock" where the drive plugs into the top, and it went much faster (20MB/Sec) but it only showed the 4TB drive as 2TB - may be a limit of this dock, so I'm now waiting for a USB 3.0 - SATA adaptor to turn up.

     

    Incidentally, this server has an eSATA port - would that be expected to work with UnRAID?

     

    Cheers,

    Howard

  8. I'm going to upgrade my array by initially going to a 4TB Parity drive (from 2TB) and then upping some data disks.

    My server is full, with 4 SATA drives, so I thought I'd PreClear the 4TB disk using a USB-SATA adaptor (USB 2).

    Connected up, started the PreClear tool, and started it running with no changes to the parameters (Pre- and Post-Read set to happen).  That was yesterday...

     

    So far it's reading:  Pre-Read: 1% @ 0 MB/s (21:41:39)

     

    Nearly 22 hours to do 1%!  Is this the expected speed using USB 2, or is there something else happening?  I need this to finish well before Christmas, but at this rate it won't even have finished the Pre-Read by then...
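To put numbers on that, here is a back-of-envelope sketch in Python. The 4 TB figure is the nominal drive size, and the 30 MB/s "typical USB 2" rate is purely an assumption for comparison (USB 2.0 signals at 480 Mbit/s, and real adaptors deliver a good deal less than that), not a measurement from this setup.

```python
DISK_BYTES = 4_000_000_000_000        # nominal 4 TB
ELAPSED_S = 21 * 3600 + 41 * 60 + 39  # 21:41:39 from the Pre-Read status line
ASSUMED_USB2_MB_S = 30                # assumption: plausible sustained USB 2 rate


def observed_mb_per_s(fraction_done, size_bytes, elapsed_s):
    """Average throughput so far, in MB/s (1 MB = 10**6 bytes)."""
    return fraction_done * size_bytes / elapsed_s / 1e6


def hours_for_pass(size_bytes, mb_per_s):
    """Time for one full pass over the disk at a steady rate, in hours."""
    return size_bytes / (mb_per_s * 1e6) / 3600


if __name__ == "__main__":
    rate = observed_mb_per_s(0.01, DISK_BYTES, ELAPSED_S)
    print(f"observed: about {rate:.2f} MB/s")
    print(f"one pass at {ASSUMED_USB2_MB_S} MB/s: about "
          f"{hours_for_pass(DISK_BYTES, ASSUMED_USB2_MB_S):.0f} hours")
```

So the observed rate is around half a megabyte per second, far below anything USB 2 should manage, whereas a healthy USB 2 link would be expected to take on the order of a day and a half per pass.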

     

    The only other options I have are to use eSATA (don't have the cable for this) or to get a 30-day trial of UnRAID and fire up another machine with it, connect this drive internally, and use that for the preclear, which is quite a bit of hassle (I need to find a suitable machine first).

     

    Any thoughts/advice, please?

     

    Cheers,

    Howard

     

  9. I want to swap it in, replacing a drive that's showing some problems (3 reallocated sectors, 419 errors on the main page) and formatting it as XFS (Reiser is current on the existing drive).  Due to changing the format I don't want to follow the "replace a failed drive" procedure.

    So, to be clear, you have all the data from the current ReiserFS drive backed up elsewhere, and you want to put the new drive into that slot with a blank XFS format and copy the data back?

    Well this array *is* the backup for others, so I do have the originals, but I didn't want to copy over the network, but between disks on the UnRAID - it still took several hours using rsync.

    Yes, but I found last time I wanted to do this (replace a disk and change the format) that the instructions in that thread don't work with the current (6.2.4) version of UnRAID.

     

    I've found that the Wiki doesn't cover what I want to do (twice so far, one more to do at some point!) - to pre-clear a new disk, then format it using XFS, then to copy over the data from the Reiser-formatted disk, then swap the latter out for the former.  Seems that it is pretty tricky without it trying to recreate the Parity at least once, and if you're not careful, a couple of times.  There was a period this time when I was running without valid Parity, but probably for less time than it takes to recreate it.

     

    It's done this time, but it would be nice to know if there's a better way next time.

     

    Cheers,

    Howard

  10. OK, back again...

     

    Anyway, I think I'm now sorted out - I'm not going to touch the thing again for a good long time now!

     

    Well, for "good long time", read "about seven weeks"!  :-)

     

    I recently got a (rare) bargain on eBay - a Used 2TB WD Re "Enterprise" drive for a good price, and it turned out to be only a few months old, and SMART reports it had only 39 power-on hours!  (Wish they'd had more than one...).

     

    I want to swap it in, replacing a drive that's showing some problems (3 reallocated sectors, 419 errors on the main page) and formatting it as XFS (Reiser is current on the existing drive).  Due to changing the format I don't want to follow the "replace a failed drive" procedure.

     

    I connected it up, and ran the "Pre-Clear" add-on which took 9 hours.  It's now shown in "Unassigned" as Precleared, and it has "Format" to the left of the Temp column, but it doesn't have a button surround, and clicking it causes a green surround to appear while the button is held down, but nothing happens.

     

    Is this a bug in 6.2.4?  How do I format it? 

     

    I'd rather not add it to the array at this point in case it triggers a Parity Check which isn't necessary at this point, and puts unnecessary wear on the Parity disk - judging from my earlier experiences, it will be necessary to do a full Parity Check/generation when I swap the drive in anyway.

     

    Cheers,

     

    Howard

     

  11. (I know you didn't point me to those instructions, which is why my comment followed the quote from Johnnie's message  :) )

     

    Anyway, thanks - I've gone with number 3, a New Config without known-good Parity, placed the drives as I want them (new drive in, old drive out), and restarted, and it's now rebuilding Parity (another 15 hours!).

     

    I wasn't actually doing a cascade of changes of format, the primary objective was to replace a slow (5400rpm) drive with a fast (7200) one, and changing to XFS on the new one was - I thought - a "free" bonus!  I realise I could have just done the "failed drive replacement" procedure, but that would have continued with ReiserFS on the new disk, which seemed to be a wasted opportunity.

     

    I still don't understand why features come and go (especially go!) between versions - the absence of clearing while the array is started in 6.1 prompted me to update to 6.2, unaware of the problem with swapping slots.  It seems to me that RobJ's instructions can't work as laid out, since it seems to rely on features that don't all exist in any one UnRAID version!  I may post a warning on that thread.

     

    Thinking about it, the disk slots don't all have to be occupied, so I could have just unassigned Disk 1 after copying over, but then all my machines would have to have their access changed from Disk1 to Disk4, and having the physical slots not match the numbers in the software just feels wrong.

     

    Anyway, I think I'm now sorted out - I'm not going to touch the thing again for a good long time now!

     

    Thanks for all your help - I could probably have perpetrated a disaster without it...

     

    Cheers,

     

    Howard

     

  12. Changing disk slots only works with v6.1, at the moment it's not possible with v6.2 with single parity,  it will never be possible with dual parity.

    Arrrggh!  So why did you point me to those instructions?  I updated to 6.2 because 6.1 wouldn't do a Clear with the array started, and now I'm stuck!

     

    @garycase:  I haven't found any reference to the Global User Share Bug that you've mentioned, at least on page 26 of that thread, but I have both Disk and User shares turned Off in Global Share Settings, so hopefully that's OK.

    Steps I followed:

    1. Changed the default format to XFS

    2. Stopped the array and powered down, plugged-in the new drive.  Restarted.

    3. Per my thread:

    https://lime-technology.com/forum/index.php?topic=52948.msg508922#msg508922

    I updated to 6.2.1 as 6.1 wouldn't clear the new drive (it wouldn't do anything to it).

    4. After the update the new disk was assigned as Disk 4, and Cleared (about 15 hours).

    5. Formatted the new disk (as XFS).

    6. Following RobJ's item 8:  Accessed the server via SSH and ran:

    rsync -avPX  /mnt/disk1/ /mnt/disk4/

    which ran for many hours (at least 15).  I'd decided not to remove the source files - that seems like sawing through the branch you're sitting on...

    7. Following RobJ's item 9, ran:

    rsync -rcvPX  /mnt/disk1/ /mnt/disk4

    which ran without copying anything, so I think this proves a good copy.

    8. Following RobJ's 10 to 13, swapped over the drives between Disk 1 and Disk 4, and changed their formats to match what they are (Disk 4 now ReiserFS, Disk 1 now XFS).  Both now showing "Wrong" (see attached capture of the Main page at that time - I've sanitised the IP addresses).

    9. Now can't Start the array due to "Too many wrong and/or missing disks!" - can't do anything on Main except reboot or power it down.
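The checksum pass in step 7 can be illustrated with a small Python sketch that compares two directory trees by content. This is not how rsync does it internally (rsync uses its own checksum and also handles attributes, links and so on); SHA-256 and the helper names here are my own choices, and the paths in the usage note are just the ones from the steps above.

```python
import hashlib
import os


def file_digest(path, chunk=1 << 20):
    """SHA-256 of one file, read in 1 MiB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    out = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            out[os.path.relpath(full, root)] = file_digest(full)
    return out


def compare_trees(src, dst):
    """Return (missing_in_dst, differing) as sorted lists of relative paths."""
    a, b = tree_digests(src), tree_digests(dst)
    missing = sorted(set(a) - set(b))
    differing = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return missing, differing
```

Calling `compare_trees("/mnt/disk1", "/mnt/disk4")` and getting two empty lists back would correspond to the second rsync run finding nothing to copy. It reads every byte on both disks, so expect it to take about as long as the rsync checksum pass did.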

     

    How do I get my disks back?  As far as I know I have all the data still there (two copies of the copied data) so where do I go from here? 

    Should I go for a "New Config"?

    If so, what should I set for "Retain Configuration"?  (I suspect "Parity slots", and then add in the drives in their new slots, but I'd like confirmation as this seems terribly dangerous!)

     

    Finally:  Are the disk "slots" determined by physical location, or Disk ID?  I haven't physically swapped Disk 4 into the Disk 1 slot, but I will do - should I do it before any further recovery, or is physical location irrelevant?

     

    Cheers,

    Howard

    HDRW.zip

  13. OK, I've followed the instructions up to item 14, with the array stopped and the two drives swapped over (and their formats swapped as well), the disks I've swapped are shown as "Wrong", the Array Operation area shows "Too many wrong and/or missing disks!" and the Start button is greyed-out, so I'm a bit stuck as this doesn't accord with instruction 14.

     

    How do I get it going again?

     

    Cheers,

     

    Howard

     

  14. I'm changing one data drive for another, and changing the file system from ReiserFS to XFS.  I have cleared (15+ hours!) and formatted the new drive, so they're both available on the array.

     

    What is the quickest way to copy the entire contents of one drive to the other?  I can obviously do it over the network from a PC, using something like XXCOPY (sic) under Windows, but that seems a bit wasteful when they are both on the same machine.

     

    Is there a way to get UnRAID to copy everything over?  (I'm not a Linux person so if it involves using a command line, I'll need detailed instructions, please!  :-)

     

    Cheers,

     

    Howard

     

    (UnRAID 6.2.1 on an HP Microserver)

     

  15. I am looking to replace a slower (5400rpm) drive with a faster (7200) one, and also change the disk format from ReiserFS to XFS.  I understand I'll have to add a new disk and copy the data, as the format can't be changed in place.

     

    Trying to follow the "Add One or More New Disks" instructions in "UnRAID_Manual_6", it says:

     

        Stop the array.

        Power down the server.

        Install your new hard drive(s).

        Power up the unit.

        Start the array.

    Which I've done.  The new disk is shown as Unassigned, as expected.

     

    The instructions then say:

     

    "When you Start the array, the system will first format the new disk(s). When this operation finishes, all the data disks, including the new one(s), will be exported and be available for use. "

     

    but there's no sign of this happening - the new disk is just sitting there unassigned.  (Not sure what it means by "exported"!  And as it's unassigned, I can't see why it would be added to the array without me doing it).

     

    I looked around further, and found "UnRAID_6/Storage_Management" which, again under adding a new disk, says:

     

        Stop the array.

        Power down the server.

        Install your new hard drive(s).

        Power up the unit.

        Assign the new storage device to a disk slot using the unRAID webGui.

        Start the array.

    And then:

    "When you Start the array, the system will mount the disk and automatically begin to clear the disk which is required before it can be added to the array."

     

    Well I tried that, and after assigning the new disk to the array there was no "Start" button, only "Clear", implying that the array has to be down while this happens (many hours for 2TB - unacceptable for what is my main server).

     

    So I'm rather confused by two sets of instructions, neither of which work as they are written!

     

    Where do I go from here, please?

     

    This is UnRAID version 6.1.9 on an HP Microserver.

     

    Cheers,

     

    Howard

     

  16. OK, seems to be good news...

     

    The smell wasn't from the server, but from a dehumidifier that is on a timeswitch, and which was off when I visited the server (when I overrode the timeswitch to On, there was a rough, rumbling noise and a really strong smell from the dehumidifier!).

     

    I took the drives out of the server, disconnected the motherboard power connector, and connected it to a Power Supply Tester, powered on and all voltages got green lights.  I measured the 5V and 12V on a "spare" hard drive Molex connector, and both were within 0.1V of correct.

     

    So I reassembled it (you have to slide the motherboard out to get to the power connector) put the drives back in and powered it up.  unRAID came up, showing all drives, and started a Parity Check.

     

    It's currently at 83% complete, with an hour to go, so it looks like whatever stopped it working before was a glitch of some sort, and turning it off and back on again was what it needed!  ::)

     

    In the meantime I've found that the PSU is a fairly standard "1U" one and there are many available as replacements, in various ratings (150W seems to have been the original).

     

    So false alarm - thanks for the advice, and next time I'll try to be more sure of my facts before calling for help!

     

    Cheers,

     

    Howard

     

  17. Thanks for the suggestion, however it beat you to it - it stopped responding to anything on the web page - clicking to change tabs, clicking any buttons, were all ignored.  I went to the machine and noticed that "overheated power supply" smell - anyone who has smelled it will recognise it.

     

    I suspect one or more power supply capacitors may have blown (the hot weather accelerating that) and it may now have lost the 12V rail, hence not being able to spin up the disks, even to stop the array, which I tried some time ago.

     

    It could be that spinning up two disks manually was all it could manage when I did that, and that finally killed the failing components.

     

    So now I have some dismantling to do  :(  I'll start by removing the disks, then power it up and check the voltages, and proceed from there.

     

    Thanks again,

     

    Howard

     

  18. Earlier today I went to access a drive on my unRAID array, and it came back saying it couldn't find it (exact message varied by operating system).  PING returned replies with 0 ms latency.

     

    Going into the web page, on the Main tab all the drives were showing OK, but spun down ("Array Started" was the main status indication).  I clicked on one data drive and the parity drive, and they spun up.

    Checked Dashboard, no errors shown, temperatures around +37C (~98F) and no SMART warnings.

    Still can't get to the drives from any other machine, although Windows Explorer shows them present, and with the correct (approximately!) size and free space.

     

    Oh, and clicking on "Log" in the top-right of the Main screen produces an "about:blank" window with nothing in it, and at the bottom:  "Waiting for 192.(etc)..." and just sits there like that - for half an hour and counting!

     

    I'm about to try "turning it off and on again" as I'm out of ideas!

     

    Anyone seen this, or know why it may be happening?

     

    HP uServerl N36L, unRAID 6.1.9, running for 19+ days without problems.

  19. I'm about to reconfigure my array, currently:

    2TB(p) + 2TB + 2TB + 1TB

    replacing a slow-ish 2TB drive with a faster one, and then using the slow 2TB to replace the 1TB drive, so I initiated a Parity Check to make sure all was OK before I started messing about.

     

    After about 20 mins, there was a short power cut which brought the server down.  I connected up the UPS that had been sitting idle and connected the UNRAID server to it (my optician says my hindsight is 20/20! ::) )

     

    On power up, UNRAID seemed to continue the Parity Check where it left off - I was expecting it to start again.  Is this expected behaviour?

     

    I'm planning to use the "Replace a failed disk" procedure to swap in the faster 2TB drive, then the "Replace a single disk with a bigger one" to replace the 1TB with the slow 2TB one.  Does this seem reasonable? 

    Will UNRAID get confused by seeing a drive that used to be in another slot?  In which case, how do I stop this happening?

     

    This is running UNRAID 6.1.9, but it started out as Version 5 so the drives are all formatted as ReiserFS.  Is there any advantage to changing the format of the new drives to something else?  (Is it even possible?)  If so, what would be better? 

     

    Cheers,

     

    Howard

     

  20. I went to update from 6.1.8 to 6.1.9 this afternoon.  A pop-up window showed the progress from downloading the .ZIP and .MD5 files, then a series of "inflating", "creating", "deleting" entries, and finally "syncing - please wait".

     

    And there it's stayed for over 2 hours.  I don't know what syncing is in this case (surely not recreating the Parity disk?) so I have no idea how long I should wait!

     

    I can't access the machine using the web interface from other machines, and one of the three data disks is returning File Not Found to any DIR command, but the other two disks will display a directory listing properly.  It does respond to PING.

     

    Before I did the update I checked the Dashboard, and no drives were showing any problems.

     

    What to do now?

     

    Cheers,

     

    Howard

     
