Tom2000

  1. Found that the problem was the RAM. Upgrading from 1 GB to 2 GB solved it. Here is the error I got when upgrading to 6.5:

         tom@Hydra:/tmp$ cat unraid_error
         plugin: installing: https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer.plg
         plugin: downloading https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer.plg
         plugin: downloading: https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer.plg ... done
         plugin: downloading: https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer-6.5.1-x86_64.zip ... done
         plugin: downloading: https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer-6.5.1-x86_64.md5 ... done
         Archive: /tmp/unRAIDServer.zip
         creating: /tmp/unRAIDServer/EFI-/
         creating: /tmp/unRAIDServer/EFI-/boot/
         inflating: /tmp/unRAIDServer/EFI-/boot/bootx64.efi
         inflating: /tmp/unRAIDServer/EFI-/boot/ldlinux.e64
         inflating: /tmp/unRAIDServer/EFI-/boot/libcom32.c32
         inflating: /tmp/unRAIDServer/EFI-/boot/libutil.c32
         inflating: /tmp/unRAIDServer/EFI-/boot/mboot.c32
         inflating: /tmp/unRAIDServer/EFI-/boot/menu.c32
         inflating: /tmp/unRAIDServer/EFI-/boot/syslinux.cfg
         inflating: /tmp/unRAIDServer/bzfirmware
         inflating: /tmp/unRAIDServer/bzimage
         inflating: /tmp/unRAIDServer/bzmodules
         inflating: /tmp/unRAIDServer/bzroot
         unzip error 0
         plugin: run failed: /bin/bash retval: 1
  2. I haven't kept up with upgrades and realized that the version I am running is way behind. What is the best way to upgrade the system to the latest stable version? I read the wiki and the forum, but couldn't find a good answer yet. Thanks, --Tom
  3. Luckily, the difference in cables didn't fry the hard drive controller boards. I need to check the output voltages even though the connectors are exactly the same type. I thought the cables were standard across modular power supplies. Need to verify that.
  4. Thank you very much for your direction. I was able to bring my server back. The key is what you said: "Not all modular supplies use the same setup, cables that work on one may not work on another." After reading your reply, I unplugged everything (including the unRAID USB drive) and started over with the original power supply (OCZ 600W), with only the CPU/MB power cable connected, and verified that there was no power and no LED lit on the MB. I then switched to the new power supply (Seasonic 660W), using the new cables that came with it to connect the only IDE HD I have and all the cooling fans. Then the system was able to boot to the BIOS screen! That's a big improvement and a milestone. I found that two cooling fans had died. I then moved on to test the rest of the SATA drives with the new cables. All went well, so I plugged in everything, and the system booted up again. Without your prompt help, I wouldn't have been able to get the server back. I really appreciate your help. --Tom
  5. Hi, I can't find the topics for 4.7 support, so I posted my problem here. Please let me know if I need to move it to a proper area. I have been running 4.7 since it became available, without many issues. Two weeks ago there was a power outage. I found my server down when I got home, and it won't boot up again. I have 2 spare power supplies, but the server failed to boot with both of them. I ordered a new one and still can't boot the server. Upon testing, I found that the server will boot if I unplug all the power cables except the one for the CPU. Whenever I plug in any power cable for the hard drives or fans, the server won't boot. I am using a modular power supply, so I don't need to unplug the power cables at the disk end. Please help, since I have no access to the files on the server and I am afraid of losing all of them, which amount to several TB.

     My system:
         Motherboard: P5B-VM DO
         One stick of 1 GB RAM
         Two two-port SATA RAID cards
         9 disks including one parity: 8 SATA and 1 IDE

     Since I am using disk shares only, will I be able to read each individual disk by putting the disks in a USB enclosure one by one? Thanks, --Tom
  6. I suspected the drive first and then the SATA port. smartctl reported no errors on any drive, so it might be the port. I opened the case and found that the "problem" (not really bad) drive was connected to a 2-port SATA card. I moved its cable from the card to the motherboard port that had been connected to the failed drive I removed earlier. I then synced the array again and got no errors at all. The conclusion is that I have had a failed drive and a bad SATA card in the system from the beginning. Both need to be fixed. --Tom
  7. Attached is the smartctl scan report for all the drives. I am not sure where to look, so please let me know if there is any indication of disk errors. On the unRAID Main page, all the disks show green so far except the parity disk, which shows red. I am not sure whether that is because of the sync error. --Tom

     Attachment: smartctl_report_20121231.txt
  8. Hi, I have a system with 10 drives that has been running 4.7 for the past year. Recently I found a bad drive (disk6 turned red on the Main page), so I followed the instructions on the forum (http://lime-technology.com/forum/index.php?topic=2591.msg20919#msg20919) to remove the drive. When I started the parity sync, it was very slow, so I took a look at the syslog and found the following messages:

         Dec 30 23:33:00 Tower kernel: ata12: drained 32768 bytes to clear DRQ.
         Dec 30 23:33:00 Tower kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
         Dec 30 23:33:00 Tower kernel: ata12.00: failed command: READ DMA EXT
         Dec 30 23:33:00 Tower kernel: ata12.00: cmd 25/00:00:0f:d9:07/00:04:01:00:00/e0 tag 0 dma 524288 in
         Dec 30 23:33:00 Tower kernel: res ff/ff:ff:ff:ff:ff/ff:ff:ff:ff:ff/ff Emask 0x2 (HSM violation)
         Dec 30 23:33:00 Tower kernel: ata12.00: status: { Busy }
         Dec 30 23:33:00 Tower kernel: ata12.00: error: { ICRC UNC IDNF ABRT }
         Dec 30 23:33:00 Tower kernel: ata12: hard resetting link
         Dec 30 23:33:00 Tower kernel: ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
         Dec 30 23:33:00 Tower kernel: ata12.00: configured for UDMA/100
         Dec 30 23:33:00 Tower kernel: ata12: EH complete
         Dec 30 23:34:24 Tower kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
         Dec 30 23:34:24 Tower kernel: ata12.00: BMDMA2 stat 0x6d0009
         Dec 30 23:34:24 Tower kernel: ata12.00: failed command: READ DMA EXT
         Dec 30 23:34:24 Tower kernel: ata12.00: cmd 25/00:00:27:28:72/00:04:01:00:00/e0 tag 0 dma 524288 in
         Dec 30 23:34:24 Tower kernel: res 51/04:bf:27:28:72/00:00:00:00:00/e0 Emask 0x1 (device error)
         Dec 30 23:34:24 Tower kernel: ata12.00: status: { DRDY ERR }
         Dec 30 23:34:24 Tower kernel: ata12.00: error: { ABRT }
         Dec 30 23:34:24 Tower kernel: ata12.00: configured for UDMA/100
         Dec 30 23:34:24 Tower kernel: ata12: EH complete
         Dec 30 23:34:37 Tower kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
         Dec 30 23:34:37 Tower kernel: ata12.00: BMDMA2 stat 0x6d0009
         Dec 30 23:34:37 Tower kernel: ata12.00: failed command: READ DMA EXT
         Dec 30 23:34:37 Tower kernel: ata12.00: cmd 25/00:00:27:5e:82/00:04:01:00:00/e0 tag 0 dma 524288 in
         Dec 30 23:34:37 Tower kernel: res 51/04:6f:27:5e:82/00:00:00:00:00/e0 Emask 0x1 (device error)
         Dec 30 23:34:37 Tower kernel: ata12.00: status: { DRDY ERR }
         Dec 30 23:34:37 Tower kernel: ata12.00: error: { ABRT }
         Dec 30 23:34:37 Tower kernel: ata12.00: configured for UDMA/100
         Dec 30 23:34:37 Tower kernel: ata12: EH complete

     The complete syslog is attached. I had to stop the parity rebuild since it wouldn't go any further. Now the array is in an unprotected state. I'd appreciate it if anyone could help. Thanks, --Tom

     Attachment: syslog_20121230.txt
  9. Hi Joe, The preclear_disk.sh script finally completed one cycle, and the result is below. I suppose it is OK, right? Thanks, --Tom

         ===========================================================================
         = unRAID server Pre-Clear disk /dev/sdj
         = cycle 1 of 1
         = Disk Pre-Clear-Read completed DONE
         = Step 1 of 10 - Copying zeros to first 2048k bytes DONE
         = Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
         = Step 3 of 10 - Disk is now cleared from MBR onward. DONE
         = Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
         = Step 5 of 10 - Clearing MBR code area DONE
         = Step 6 of 10 - Setting MBR signature bytes DONE
         = Step 7 of 10 - Setting partition 1 to precleared state DONE
         = Step 8 of 10 - Notifying kernel we changed the partitioning DONE
         = Step 9 of 10 - Creating the /dev/disk/by* entries DONE
         = Step 10 of 10 - Testing if the clear has been successful. DONE
         = Disk Post-Clear-Read completed DONE
         Elapsed Time: 14:36:38
         ============================================================================
         ==
         == Disk /dev/sdj has been successfully precleared
         ==
         ============================================================================
         S.M.A.R.T. error count differences detected after pre-clear
         note, some 'raw' values may change, but not be an indication of a problem
         71c71
         < 190 Airflow_Temperature_Cel 0x0022 075 075 000 Old_age Always - 25 (Lifetime Min/Max 25/26)
         ---
         > 190 Airflow_Temperature_Cel 0x0022 072 072 000 Old_age Always - 28 (Lifetime Min/Max 25/28)
         77c77
         < 200 Multi_Zone_Error_Rate 0x000a 253 253 000 Old_age Always - 0
         ---
         > 200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0
         ============================================================================
  10. Hi Joe, Thanks again for the explanations. Those are good notes that I will keep for maintaining my unRAID server. I am still waiting for the preclear_disk.sh script to finish and will post an update later. --Tom
  11. Hi Joe, Thanks for the analysis. Please see the smartctl output below; it looks fine to me. I did hot-plug the external SATA cable. Thanks for pointing that out, since I did not know I am not supposed to do that. I am using PuTTY to connect to the unRAID server. Whenever I execute the command "preclear_disk.sh -t /dev/sdk", the session terminates right away. I think I will go ahead and stop the array, restart the server, and then run preclear_disk.sh again. Thanks, --Tom

         root@Tower:~# smartctl -d ata -a /dev/sdk
         smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen
         Home page is http://smartmontools.sourceforge.net/

         === START OF INFORMATION SECTION ===
         Device Model: SAMSUNG HD154UI
         Serial Number: S1Y6J1KS744099
         Firmware Version: 1AG01118
         User Capacity: 1,500,301,910,016 bytes
         Device is: In smartctl database [for details use: -P show]
         ATA Version is: 8
         ATA Standard is: ATA-8-ACS revision 3b
         Local Time is: Fri Aug 21 12:44:04 2009 Local time zone must be set--see zic m
         ==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
         SMART support is: Available - device has SMART capability.
         SMART support is: Enabled

         === START OF READ SMART DATA SECTION ===
         SMART overall-health self-assessment test result: PASSED

         General SMART Values:
         Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
         Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
         Total time to complete Offline data collection: (19393) seconds.
         Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
         SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
         Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
         Short self-test routine recommended polling time: ( 2) minutes.
         Extended self-test routine recommended polling time: ( 255) minutes.
         Conveyance self-test routine recommended polling time: ( 34) minutes.
         SCT capabilities: (0x003f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

         SMART Attributes Data Structure revision number: 16
         Vendor Specific SMART Attributes with Thresholds:
         ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
           1 Raw_Read_Error_Rate     0x000f 100   100   051    Pre-fail Always  -           0
           3 Spin_Up_Time            0x0007 071   071   011    Pre-fail Always  -           9640
           4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           4
           5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           0
           7 Seek_Error_Rate         0x000f 253   253   051    Pre-fail Always  -           0
           8 Seek_Time_Performance   0x0025 100   100   015    Pre-fail Offline -           0
           9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           10
          10 Spin_Retry_Count        0x0033 100   100   051    Pre-fail Always  -           0
          11 Calibration_Retry_Count 0x0012 100   100   000    Old_age  Always  -           0
          12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           4
          13 Read_Soft_Error_Rate    0x000e 100   100   000    Old_age  Always  -           0
         183 Unknown_Attribute       0x0032 100   100   000    Old_age  Always  -           0
         184 Unknown_Attribute       0x0033 100   100   000    Pre-fail Always  -           0
         187 Reported_Uncorrect      0x0032 100   100   000    Old_age  Always  -           0
         188 Unknown_Attribute       0x0032 100   100   000    Old_age  Always  -           0
         190 Airflow_Temperature_Cel 0x0022 075   075   000    Old_age  Always  -           25 (Lifetime Min/Max 25/26)
         194 Temperature_Celsius     0x0022 075   075   000    Old_age  Always  -           25 (Lifetime Min/Max 25/27)
         195 Hardware_ECC_Recovered  0x001a 100   100   000    Old_age  Always  -           223195953
         196 Reallocated_Event_Count 0x0032 100   100   000    Old_age  Always  -           0
         197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
         198 Offline_Uncorrectable   0x0030 100   100   000    Old_age  Offline -           0
         199 UDMA_CRC_Error_Count    0x003e 253   253   000    Old_age  Always  -           5
         200 Multi_Zone_Error_Rate   0x000a 253   253   000    Old_age  Always  -           0
         201 Soft_Read_Error_Rate    0x000a 100   100   000    Old_age  Always  -           0

         SMART Error Log Version: 1
         No Errors Logged

         SMART Self-test log structure revision number 1
         No self-tests have been logged. [To run self-tests, use: smartctl -t]

         SMART Selective self-test log data structure revision number 1
         SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
            1       0       0 Not_testing
            2       0       0 Not_testing
            3       0       0 Not_testing
            4       0       0 Not_testing
            5       0       0 Not_testing
         Selective self-test flags (0x0):
         After scanning selected spans, do NOT read-scan remainder of disk.
         If Selective self-test is pending on power-up, resume after 0 minute delay.
  12. Hi, I think this should be the right thread for my question. I just purchased two 1.5 TB Samsung SATA drives and tried to use the preclear_disk.sh script to prepare them. What I normally do is connect the HD to the external SATA port on the system and run the script one disk at a time. Unfortunately, both drives returned the same unsuccessful result, shown below.

         ===========================================================================
         = unRAID server Pre-Clear disk /dev/sdk
         = cycle 1 of 1
         = Disk Pre-Clear-Read completed DONE
         = Step 1 of 10 - Copying zeros to first 2048k bytes DONE
         = Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
         = Step 3 of 10 - Disk is now cleared from MBR onward. DONE
         = Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
         = Step 5 of 10 - Clearing MBR code area DONE
         = Step 6 of 10 - Setting MBR signature bytes DONE
         = Step 7 of 10 - Setting partition 1 to precleared state DONE
         = Step 8 of 10 - Notifying kernel we changed the partitioning DONE
         = Step 9 of 10 - Creating the /dev/disk/by* entries DONE
         = Step 10 of 10 - Testing if the clear has been successful. DONE
         = Elapsed Time: 7:56:59
         ============================================================================
         ==
         == SORRY: Disk /dev/sdk MBR could NOT be precleared
         ==
         ============================================================================
         0+0 records in
         0+0 records out
         0 bytes (0 B) copied, 2.8617e-05 s, 0.0 kB/s
         0000000

     The only difference is that when I ran the first HD, the syslog filled up with messages like the following, to about 600 MB. I then deleted the syslog and used the touch command to create a new one. I then ran the script on my second drive, and the syslog stayed at size 0.

         Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32563752
         Aug 20 03:37:23 Tower kernel: sd 9:0:0:0: [sdk] Result: hostbyte=0x04 driverbyte=0x00
         Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32579816
         Aug 20 03:37:23 Tower kernel: sd 9:0:0:0: [sdk] Result: hostbyte=0x04 driverbyte=0x00
         Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32595880
         Aug 20 03:37:23 Tower kernel: sd 9:0:0:0: [sdk] Result: hostbyte=0x04 driverbyte=0x00
         Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32611944

     Does a message such as "SORRY: Disk /dev/sdk MBR could NOT be precleared" imply that both of my HDs are defective? I would appreciate it if anyone could chime in on what I should do next. --Tom
  13. I am running 4.4 now. From my understanding, the way it works is that you plug in the included PCI-e card, which has two eSATA ports, and then connect one of the eSATA ports to the enclosure. How are the disk write and read speeds on unRAID? Cheers, --Tom
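
The syslog excerpts quoted in items 8 and 12 above can be hard to read when they run to hundreds of megabytes. Below is a minimal Python sketch, not a tool from unRAID or the forum, that tallies the two kernel message shapes shown in those posts: the `ataN.NN: error: { ... }` flag lines and the `end_request: I/O error, dev X, sector N` lines. The function name `summarize` and the regular expressions are assumptions made for illustration; other kernels may word these messages differently.

```python
import re
from collections import Counter

# Patterns for the two kernel message shapes quoted in the posts above.
ATA_ERROR = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: error: \{ ([A-Z ]+) \}")
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize(lines):
    """Return (error flags per ATA port, failed sectors per block device)."""
    flags = {}    # port name -> Counter of flags such as ABRT or ICRC
    sectors = {}  # device name -> list of sector numbers with I/O errors
    for line in lines:
        match = ATA_ERROR.search(line)
        if match:
            port, names = match.group(1), match.group(2).split()
            flags.setdefault(port, Counter()).update(names)
            continue
        match = IO_ERROR.search(line)
        if match:
            sectors.setdefault(match.group(1), []).append(int(match.group(2)))
    return flags, sectors


# A few lines copied from the excerpts above.
sample = [
    "Dec 30 23:33:00 Tower kernel: ata12.00: error: { ICRC UNC IDNF ABRT }",
    "Dec 30 23:34:24 Tower kernel: ata12.00: error: { ABRT }",
    "Dec 30 23:34:37 Tower kernel: ata12.00: error: { ABRT }",
    "Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32563752",
    "Aug 20 03:37:23 Tower kernel: end_request: I/O error, dev sdk, sector 32579816",
]

flags, sectors = summarize(sample)
print(dict(flags["ata12"]))
print(sectors["sdk"])
```

To run it against a saved log, something like `summarize(open("syslog_20121230.txt"))` should work, since the function only needs an iterable of lines. The counts show which port or device is producing errors; they do not by themselves say whether the drive, the cable, or the controller is at fault.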
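
For the question in items 7 and 11 about where to look in a smartctl report, here is a small Python sketch that reads the attribute table and lists the rows that usually deserve a second look. It is an illustration under stated assumptions, not the check the preclear script performs: the names `parse_attributes`, `findings`, and `WATCH` are hypothetical, and the choice of watched attributes (reallocated, pending, and uncorrectable sectors, plus CRC errors) is a common rule of thumb rather than anything stated in the posts. A normalized VALUE at or below a nonzero THRESH is what smartctl itself reports as a failing attribute.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    id: int
    name: str
    value: int   # normalized value
    worst: int
    thresh: int
    raw: str     # raw value, kept as text because it may carry extra notes


# Assumption: raw counters that are commonly watched for nonzero values.
WATCH = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def parse_attributes(text):
    """Parse rows of the 'Vendor Specific SMART Attributes' table."""
    attrs = []
    for line in text.splitlines():
        parts = line.split(None, 9)  # the tenth field keeps the whole raw value
        if len(parts) == 10 and parts[0].isdigit() and parts[2].startswith("0x"):
            attrs.append(Attr(int(parts[0]), parts[1], int(parts[3]),
                              int(parts[4]), int(parts[5]), parts[9]))
    return attrs


def findings(attrs):
    """List attributes at or below threshold, or watched with a nonzero raw count."""
    notes = []
    for a in attrs:
        if a.thresh > 0 and a.value <= a.thresh:
            notes.append(f"{a.name}: value {a.value} at or below threshold {a.thresh}")
        elif a.name in WATCH and a.raw.split()[0] != "0":
            notes.append(f"{a.name}: raw count {a.raw}")
    return notes


# Rows copied from the smartctl output and preclear diff quoted above.
report = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 253 253 000 Old_age Always - 5
200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 072 072 000 Old_age Always - 28 (Lifetime Min/Max 25/28)
"""

attrs = parse_attributes(report)
notes = findings(attrs)
print(notes)
```

On the rows above, the only line flagged is the UDMA_CRC_Error_Count raw value of 5. CRC errors are generally associated with the cable or link rather than the platters, so they are worth noting but are not on their own a sign of a failing disk.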