
JonathanM

Moderators
  • Posts

    16,165
  • Joined

  • Last visited

  • Days Won

    65

Everything posted by JonathanM

  1. I've solved my issue by bypassing it totally. :'( I hooked my dumb Back-UPS 500 to a Windows machine that also stays on 24/7. The Windows version of apcupsd sees it just fine, and the net option works OK on my unRaid box. I gave up on getting the simple signalling option working in unRaid, but I have a suspicion about what the problem could be if others want to keep hammering at it. The installation script allows only very limited editing of the .conf file for apcupsd, and I suspect manually changing it to the correct cable and UPS type might work.
  2. So far the only APC units I have seen posted to work (including yours) with unRaid are smart signaling units. Which is why I made the comment that perhaps simple mode is broken in our prepackaged distribution.
  3. Great... I was hoping changing the cable type would do something productive. Now I'm wondering if simple mode is broken in the apcupsd build that was prebuilt for unRaid, especially since the "documentation" suggests using "dumb" as the cable type. Is there a list somewhere of APC UPS models confirmed working with unRaid? Jonathan
  4. That is exactly the result I got when I changed to "simple". It seemed to communicate, but not update status. I'll be interested to know what happens when the UPS battery dies, to see if the status lines ever change. BTW, plugging one of those halogen work lights into the UPS makes a dandy fake load to accelerate the process. Just make sure you have enough watts to run the light. Perhaps a couple table lamps would be a more accurate load.
  5. If you follow the link the original author posted, he is using the equivalent of the APC 940-0020B cable, NOT a null modem cable. I haven't tested this yet on mine, but I suspect putting 940-0020B in the cable type for apcupsd should work.
  6. I'm having a similar problem, and I think I may be closer to a solution, having found this little nugget:

     # UPSCABLE <cable>
     # Defines the type of cable connecting the UPS to your computer.
     #
     # Possible generic choices for <cable> are:
     # simple, smart, ether, usb
     #
     # Or a specific cable model number may be used:
     # 940-0119A, 940-0127A, 940-0128A, 940-0020B,
     # 940-0020C, 940-0023A, 940-0024B, 940-0024C,
     # 940-1524C, 940-0024G, 940-0095A, 940-0095B,
     # 940-0095C, M-04-02-2000

     Notice, it doesn't say "dumb" is an option for cable type. Now, that is documentation for an OLD version of apcupsd, so maybe the newer version has that option, but I couldn't get it to accept it. I am currently running with the option "simple", and it loads apcupsd, but the status never shows power failure, it always shows "OK". I'm going to try actually putting the APC cable number in when I get a chance. Here is another link that discusses cable types: http://www.fatblokeracing.org/ApcupsdCableConfiguration.shtml Jonathan
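As a quick sanity check before editing the .conf by hand, the cable list quoted in that post can be turned into a small lookup. This is only a sketch: the accepted values are copied from the old documentation excerpt above, so a newer apcupsd build may accept others, and the function name `is_documented_cable` is made up for illustration.

```python
# Cable types copied from the old apcupsd documentation excerpt quoted in the post.
# Assumption: newer apcupsd versions may accept values that are not listed here.
GENERIC_CABLES = {"simple", "smart", "ether", "usb"}
MODEL_CABLES = {
    "940-0119A", "940-0127A", "940-0128A", "940-0020B",
    "940-0020C", "940-0023A", "940-0024B", "940-0024C",
    "940-1524C", "940-0024G", "940-0095A", "940-0095B",
    "940-0095C", "M-04-02-2000",
}


def is_documented_cable(value):
    """Return True if value appears in the quoted UPSCABLE list (hypothetical helper)."""
    cleaned = value.strip()
    return cleaned.lower() in GENERIC_CABLES or cleaned.upper() in MODEL_CABLES


for candidate in ("dumb", "simple", "940-0020B"):
    print(candidate, is_documented_cable(candidate))
```

Running it shows that "dumb" is not in the quoted list, while "simple" and "940-0020B" are.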
  7. I would like opinions on the cheapest UPS that will give feedback on the state of the power line to unRaid. Here is my situation. I have multiple large UPSes that keep my IT infrastructure up for about 1/2 hour, but they don't have monitoring capability that I can use. (Exide Powerware 9 Prestige EXT) What I would like to do is put a small cheap UPS next to the unRaid box that will detect when the power fails, provide a way to signal the other PCs on the net that power has failed, and start an orderly shutdown process. The UPS won't need to power anything, just provide information on the state of the power line to the network. If someone has a better option, I'm open to suggestions. Jonathan
  8. Take note of the serial number, make sure you remove the correct drive, put the new one in, and boot back up. The array will not start automatically; it will ask you to confirm that you want to rebuild the drive onto the new blank disk. Red means the virtual drive is available and part of the array, but the physical drive isn't being read or written to because it's been marked as failed. Right now all of that drive's data is being recreated on the fly from the rest of the drives, so if you have another failure before you replace and fully rebuild the disk, you will lose the data on both failed drives.
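To make the "recreated on the fly" point concrete, here is a minimal sketch that assumes simple XOR parity across the data disks. The three tiny "disks" and the helper name `xor_blocks` are invented for illustration; this is not unRaid's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of three small data disks.
data_disks = [b"\x11\x22\x33", b"\x0f\xf0\xaa", b"\x5a\xa5\x00"]
parity = xor_blocks(data_disks)

# One failed disk: its contents are the XOR of the parity and every surviving disk.
failed = 1
survivors = [disk for index, disk in enumerate(data_disks) if index != failed]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data_disks[failed])
```

With a second disk missing, the same XOR would yield only the combination of the two lost disks, not either one individually, which is why a second failure before the rebuild finishes costs the data on both.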
  9. One small (but possibly important) thing to add. It would be wise to plug the UPS into a switched outlet (power strip or similar) and switch the supply off to test it, rather than just pulling the plug out of the wall. Some UPSs don't handle being unplugged while under load very well, because they assume they will still have a valid ground reference connected when the power fails.
  10. I've already set up these three apps using the wiki article http://lime-technology.com/wiki/index.php?title=Install_Python_based_servers, but the stop button doesn't seem to work for me, so I'd like to use your .conf's to update my install to what would seem to be the currently accepted install practices for these apps. I briefly parsed through your scripts to see if I could just quickly replicate your start/stop buttons, but it looked a little more complicated than I wanted to manually work through. If I just put your .conf's in the unmenu package installer and run them, am I going to blow up my currently functioning install? Thanks, Jonathan unRaid 4.4.2 Pro with 10 data drives and a cache.
  11. Update on clearing 3 new 1.5 Seagates. In the original post, I had just unzipped a fresh install of 4.4.2, installed the smart libraries, and kicked off the script. It precleared 2 out of three disks successfully. I then pulled the USB stick, added the newest bubbaraid and enabled it, and booted bubbaraid. The disk assignments changed, and I suspect I may have given you status on the wrong drive. I looked at the current syslog, and it seems the disk that failed the preclear the first time through may be bad. When I originally ran the script, the drives were sdb, sdc, and sdd. Now they are sda, sdb, and sdc. sda seems to be sick. Jan 12 13:20:25 Tower preclear_disk-start[6674]: 1 Raw_Read_Error_Rate 0x000f 117 100 006 Pre-fail Always - 235560959 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 3 Spin_Up_Time 0x0003 094 094 000 Pre-fail Always - 0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 7 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 5 Reallocated_Sector_Ct 0x0033 095 095 036 Pre-fail Always - 220 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 220180 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 67 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 7 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 187 Reported_Uncorrect 0x0032 041 041 000 Old_age Always - 59 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 4295032833 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 189 High_Fly_Writes 0x003a 076 076 000 Old_age Always - 24 Jan 12 13:20:25 Tower 
preclear_disk-start[6674]: 190 Airflow_Temperature_Cel 0x0022 067 065 045 Old_age Always - 33 (Lifetime Min/Max 28/33) Jan 12 13:20:25 Tower preclear_disk-start[6674]: 195 Hardware_ECC_Recovered 0x001a 044 044 000 Old_age Always - 235560959 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 18 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 18 Jan 12 13:20:25 Tower preclear_disk-start[6674]: 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Error Log Version: 1 Jan 12 13:20:25 Tower preclear_disk-start[6674]: ATA Error Count: 65 (device log contains only the most recent five errors) Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICR = Command Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IFR = Features Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ISC = Sector Count Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ISN = Sector Number Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICL = Cylinder Low Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICH = Cylinder High Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IDH = Device/Head Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IDC = Device Command Register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IER = Error register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IST = Status register [HEX] Jan 12 13:20:25 Tower preclear_disk-start[6674]: Powered_Up_Time is measured from power on, and printed as Jan 12 13:20:25 Tower preclear_disk-start[6674]: DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, Jan 12 13:20:25 Tower preclear_disk-start[6674]: SS=sec, and sss=millisec. It "wraps" after 49.710 days. 
Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 65 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours) Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle. Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- -------------------- Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:26.454 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:26.442 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:26.243 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:25.911 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 64 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours) Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle. 
Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- -------------------- Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:26.442 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:26.243 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:25.911 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 63 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours) Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle. 
Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- -------------------- Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:20.729 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:20.395 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 62 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours) Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle. 
Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- -------------------- Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:20.729 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:20.395 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:15.421 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 61 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours) Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle. 
Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were: Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- -------------------- Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:15.421 IDENTIFY DEVICE Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:15.213 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:14.888 NOP [Abort queued commands] Jan 12 13:20:25 Tower preclear_disk-start[6674]: 35 00 00 ff ff ff ef 00 06:59:14.132 WRITE DMA EXT Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Self-test log structure revision number 1 Jan 12 13:20:25 Tower preclear_disk-start[6674]: No self-tests have been logged. 
[To run self-tests, use: smartctl -t] Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Selective self-test log data structure revision number 1 Jan 12 13:20:25 Tower preclear_disk-start[6674]: SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS Jan 12 13:20:25 Tower preclear_disk-start[6674]: 1 0 0 Not_testing Jan 12 13:20:25 Tower preclear_disk-start[6674]: 2 0 0 Not_testing Jan 12 13:20:25 Tower preclear_disk-start[6674]: 3 0 0 Not_testing Jan 12 13:20:25 Tower preclear_disk-start[6674]: 4 0 0 Not_testing Jan 12 13:20:25 Tower preclear_disk-start[6674]: 5 0 0 Not_testing Jan 12 13:20:25 Tower preclear_disk-start[6674]: Selective self-test flags (0x0): Jan 12 13:20:25 Tower preclear_disk-start[6674]: After scanning selected spans, do NOT read-scan remainder of disk. Jan 12 13:20:25 Tower preclear_disk-start[6674]: If Selective self-test is pending on power-up, resume after 0 minute delay. Jan 12 13:20:25 Tower preclear_disk-start[6674]: Jan 12 15:34:25 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:25 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:25 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:25 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error) Jan 12 15:34:25 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:25 Tower kernel: ata4.00: error: { UNC } Jan 12 15:34:25 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:25 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:25 Tower kernel: ata4: EH complete Jan 12 15:34:28 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:28 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:28 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:28 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media 
error) Jan 12 15:34:28 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:28 Tower kernel: ata4.00: error: { UNC } Jan 12 15:34:28 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:28 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:28 Tower kernel: ata4: EH complete Jan 12 15:34:31 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:31 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:31 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:31 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error) Jan 12 15:34:31 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:31 Tower kernel: ata4.00: error: { UNC } Jan 12 15:34:31 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:31 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:31 Tower kernel: ata4: EH complete Jan 12 15:34:34 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:34 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:34 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:34 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error) Jan 12 15:34:34 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:34 Tower kernel: ata4.00: error: { UNC } Jan 12 15:34:34 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:34 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:34 Tower kernel: ata4: EH complete Jan 12 15:34:37 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:37 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:37 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:37 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error) Jan 12 15:34:37 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:37 Tower kernel: ata4.00: error: { UNC } Jan 12 
15:34:37 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:37 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:37 Tower kernel: ata4: EH complete Jan 12 15:34:40 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Jan 12 15:34:40 Tower kernel: ata4.00: BMDMA stat 0x64 Jan 12 15:34:40 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in Jan 12 15:34:40 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error) Jan 12 15:34:40 Tower kernel: ata4.00: status: { DRDY ERR } Jan 12 15:34:40 Tower kernel: ata4.00: error: { UNC } Jan 12 15:34:40 Tower kernel: ata4.00: configured for UDMA/133 Jan 12 15:34:40 Tower kernel: ata4.01: configured for UDMA/133 Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor] Jan 12 15:34:40 Tower kernel: Descriptor sense data with sense descriptors (in hex): Jan 12 15:34:40 Tower kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 Jan 12 15:34:40 Tower kernel: 5d 58 8d a5 Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] ASC=0x11 ASCQ=0x4 Jan 12 15:34:40 Tower kernel: end_request: I/O error, dev sda, sector 1566084517 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760564 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760565 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760566 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760567 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760568 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760569 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760570 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760571 Jan 12 15:34:40 Tower kernel: Buffer I/O 
error on device sda, logical block 195760572 Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760573 Jan 12 15:34:40 Tower kernel: ata4: EH complete Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Write Protect is off Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Mode Sense: 00 3a 00 00 Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Jan 12 15:34:42 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 I assume the failures at 7 hours are when the preclear failed the first time.
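For anyone skimming a log like the one above, a rough sketch along these lines can pull out the SMART attributes that usually matter. The sample lines are copied from the syslog in this post. The choice of which attributes to watch is an assumption based on common practice, and `parse_smart_line` is a made-up helper name.

```python
# Sample attribute lines copied from the syslog above.
PREFIX = "Jan 12 13:20:25 Tower preclear_disk-start[6674]: "
LOG_LINES = [
    PREFIX + "5 Reallocated_Sector_Ct 0x0033 095 095 036 Pre-fail Always - 220",
    PREFIX + "187 Reported_Uncorrect 0x0032 041 041 000 Old_age Always - 59",
    PREFIX + "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 18",
    PREFIX + "198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 18",
    PREFIX + "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0",
]

# Assumption: these are the attributes worth flagging when their raw value is nonzero.
WATCHED = {
    "Reallocated_Sector_Ct", "Reported_Uncorrect",
    "Current_Pending_Sector", "Offline_Uncorrectable", "UDMA_CRC_Error_Count",
}


def parse_smart_line(line):
    """Return (attribute_name, raw_value), or None if the line is not an attribute row."""
    _, separator, payload = line.partition("]: ")
    fields = payload.split()
    if not separator or len(fields) < 10 or not fields[0].isdigit():
        return None
    return fields[1], int(fields[9])


parsed = [result for result in map(parse_smart_line, LOG_LINES) if result]
flagged = {name: raw for name, raw in parsed if name in WATCHED and raw > 0}
print(flagged)
```

On these lines it flags the reallocated, reported-uncorrectable, pending, and offline-uncorrectable counts and leaves the zero CRC error count alone, which fits the suspicion that sda itself is sick rather than its cabling.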
  12. I'm kicking off another set of 3 preclears on all 3 disks. We'll see in a couple days.
  13. root@Tower:/boot/scripts# preclear_disk.sh -t /dev/sdb
      Pre-Clear unRAID Disk
      ########################################################################
      Device Model: ST31500341AS
      Serial Number: 9VS0HE2T
      Firmware Version: CC1H
      User Capacity: 1,500,301,910,016 bytes

      Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
      255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
      Units = sectors of 1 * 512 = 512 bytes
      Disk identifier: 0x00000000

      Device Boot Start End Blocks Id System
      /dev/sdb1 63 2930277167 1465138552+ 0 Empty
      Partition 1 does not end on cylinder boundary.
      ########################################################################
      ============================================================================
      ==
      == DISK /dev/sdb IS PRECLEARED
      ==
      ============================================================================
      root@Tower:/boot/scripts#

      All I did was boot the server, install the smartctl libraries, install the preclear script, and kick it off in three different telnet windows on the three different drives. Two completed successfully; one didn't.
  14. Type:
        fdisk -l /dev/sdb
        dd if=/dev/sdb count=1 | od -x -A d
      Post the output of both commands. Joe L.

      Tower login: root
      Linux 2.6.27.7-unRAID.
      root@Tower:~# fdisk -l /dev/sdb

      Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
      255 heads, 63 sectors/track, 182401 cylinders
      Units = cylinders of 16065 * 512 = 8225280 bytes
      Disk identifier: 0x00000000

      Device Boot Start End Blocks Id System
      /dev/sdb1 1 182402 1465138552+ 0 Empty
      Partition 1 does not end on cylinder boundary.
      root@Tower:~# dd if=/dev/sdb count=1 | od -x -A d
      1+0 records in
      1+0 records out
      512 bytes (512 B) copied, 0.000298241 s, 1.7 MB/s
      0000000 0000 0000 0000 0000 0000 0000 0000 0000
      *
      0000448 0000 0000 0000 003f 0000 7af1 aea8 0000
      0000464 0000 0000 0000 0000 0000 0000 0000 0000
      *
      0000496 0000 0000 0000 0000 0000 0000 0000 aa55
      0000512
      root@Tower:~#
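The od dump in that post can be decoded by hand to check the partition entry. The sketch below assumes the standard MBR layout (first partition entry at byte 446, 16 bytes long) and that od -x printed each 16-bit word in little-endian host order, as on x86. The variable names are illustrative only.

```python
import struct

# The 16-bit words od -x printed for byte offsets 448..463, copied from the dump.
WORDS_AT_448 = ["0000", "0000", "0000", "003f", "0000", "7af1", "aea8", "0000"]
TOTAL_SECTORS = 2930277168  # "total 2930277168 sectors" from the preclear output


def words_to_bytes(words):
    """Undo od -x: each printed word is a little-endian 16-bit value."""
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)


raw = words_to_bytes(WORDS_AT_448)

# Partition entry 1 covers bytes 446..461. Bytes 446 and 447 sit inside the run of
# zero lines that od collapsed into "*", so they are zero.
entry = b"\x00\x00" + raw[:14]
status, chs_start, part_type, chs_end, lba_start, num_sectors = struct.unpack(
    "<B3sB3sII", entry
)

print("start sector:", lba_start)
print("sector count:", num_sectors)
print("partition reaches end of disk:", lba_start + num_sectors == TOTAL_SECTORS)
```

The decoded start sector is 63 and the partition runs to the last sector of the disk, which agrees with the 63 and 2930277167 that the preclear output shows for /dev/sdb1.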
  15. The first disk didn't complete successfully. What logs and/or other info do I need to look at to figure out why?
  16. This was the stock W2K command-line telnet. I normally use PuTTY, but this server is not on my home LAN.
  17. This is a fresh (as in rolled 1/2 hour before use) install of 4.4.2 with the only customizations being the download and install of the smartctl libraries, and download of the new york timezone file. Timezone is set to custom in the configuration. The date command as you specified returned 1231549459 as of 8:05 eastern.
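The epoch value in that post is easy to check against the wall clock. This is a small sketch that assumes Eastern Standard Time is a fixed UTC-5 offset, which holds in January when no daylight saving applies.

```python
from datetime import datetime, timedelta, timezone

EPOCH_SECONDS = 1231549459  # value reported in the post

# Assumption: EST as a fixed UTC-5 offset (no daylight saving in January).
EST = timezone(timedelta(hours=-5), "EST")

utc_time = datetime.fromtimestamp(EPOCH_SECONDS, tz=timezone.utc)
eastern_time = utc_time.astimezone(EST)

print(utc_time.isoformat())
print(eastern_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

The Eastern result lands at about 8:04 in the evening, which matches the "8:05 eastern" reading in the post, so the system clock itself appears to be set correctly.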
  18. I just kicked off three telnet preclear sessions on three new 1.5TB drives. Is the time display supposed to show the dashes?