JonathanM

Moderators
  • Posts: 16,171
  • Days Won: 65
Everything posted by JonathanM

  1. And here is the issue. UNRAID AUTOMATICALLY STARTS A CORRECTING PARITY CHECK AFTER A CRASH. It doesn't give you the option to evaluate the situation before it starts writing data to the parity disk.
  2. I have personally lost data because of a correcting parity check. I have tried to stress the importance of NOT writing ANYTHING to the disks if you suspect there is a problem, until you know the nature of the problem. I really wish there was a way to boot the array to a diagnostic mode in which there would be no writes of any sort, read only for all data. This topic has been debated to death in the past. Suffice it to say, in an ideal world where the disks and controllers behave as designed when an error is detected, a correcting parity check is the correct action. If something isn't behaving as it should, you can easily get into a situation where you really don't want to write to a drive without running further diagnostics.
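      To illustrate why a correcting check can make things worse, here is a minimal sketch. It assumes a simple single-parity XOR model with made-up two-byte "disks"; it is not taken from Unraid's actual code, and the function names are hypothetical.

      ```python
      from functools import reduce

      def parity(blocks):
          """XOR parity across equal-length blocks (assumed single-parity model)."""
          return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

      def rebuild(blocks, parity_block, missing):
          """Reconstruct the block at index `missing` from the others plus parity."""
          others = [b for i, b in enumerate(blocks) if i != missing]
          return parity(others + [parity_block])

      good = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
      p = parity(good)

      # Hypothetical fault: disk 1 silently returns one wrong byte.
      seen = [good[0], b"\x33\x45", good[2]]
      mismatch = parity(seen) != p  # any parity check notices this

      # Non-correcting check: parity is left alone, so disk 1 can still be
      # rebuilt to its original contents from the other disks plus parity.
      rebuilt_before = rebuild(seen, p, missing=1)

      # Correcting check: parity is overwritten to agree with the bad read,
      # so a later rebuild of disk 1 reproduces the corrupted data.
      p_corrected = parity(seen)
      rebuilt_after = rebuild(seen, p_corrected, missing=1)

      print(mismatch, rebuilt_before == good[1], rebuilt_after == good[1])
      ```

      In this toy model the only copy of the information needed to recover disk 1 lives in the parity block, which is exactly what the correcting pass writes over.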
  3. Just be sure you can only plug safe loads into it. You wouldn't want a power drill, vacuum cleaner, hair dryer, etc killing your UPS...

      This UPS can handle it. It's a Powerware Prestige 9 2KVA, and the output is 16 Amps. It has no problem with laser printers, etc. Yes, it will handle a Dyson vacuum cleaner.
  4. Heh, I ran the outlet in my study to an IEC male socket in the server room instead of the breaker box, and just used a standard PC power cord from the UPS to the IEC socket. That way all the battery backups are in one room with the server, and I have a battery backup protected outlet without the bulk and noise in the main area of the house.
  5. Just want to remind people thinking about doing this to remember to put your ethernet switch and router on battery backup as well.
  6. Depends. dnsmasq can provide several different services. If you don't enable conflicting overlapping services, you should be fine.
  7. The manual says it should be there, Legacy LAN as a boot option. Perhaps you may have to use the other network port?
  8. http://lime-technology.com/forum/index.php?board=31.0
  9. Many disk browsers access the files to determine file type and display thumbnails.
  10. Any location other than /boot/* or /mnt/* in a stock unraid install is on a RAM drive.
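      As a rough sketch of that rule, the check below treats anything strictly under /boot or /mnt as persistent and everything else as RAM-backed. It only encodes the statement above for a stock install; the function name is hypothetical and nothing here queries the real system.

      ```python
      import posixpath
      from pathlib import PurePosixPath

      # Roots that are NOT on the RAM drive, per the rule above (stock install).
      PERSISTENT_ROOTS = (PurePosixPath("/boot"), PurePosixPath("/mnt"))

      def survives_reboot(path: str) -> bool:
          """True if `path` lies strictly under /boot or /mnt."""
          # normpath collapses ".." so "/mnt/../tmp" is judged as "/tmp".
          p = PurePosixPath(posixpath.normpath(path))
          return any(root in p.parents for root in PERSISTENT_ROOTS)

      for candidate in ("/boot/config/go", "/mnt/disk1/movies", "/tmp/scratch", "/var/log/syslog"):
          print(candidate, survives_reboot(candidate))
      ```

      Anything the check reports as False would need to be copied somewhere under /boot or /mnt if it has to outlive a reboot.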
  11. If you are only using it for short periods, power consumption isn't a concern. Just throw together the cheapest box you can, use your castoff smaller hard drives, whatever gets you a current backup. Do it sooner rather than later. I deal with people on a regular basis who all say, "I was planning on backing it up, but I never got around to it." Data recovery is VERY expensive in some cases; you could put together a very nice 20TB unraid server for the average price of an expedited clean room recovery.
  12. So basically you create an account and add all of the storage resources you wish to make accessible from the interface (even Dropbox accounts) when you install the client on your computer

      I'm trying to be diplomatic here, but it really sounds like xlordnashx is a salesman from this sme place. He doesn't even listen to the valid point made here, which is that sme would have to have all your account details from your other existing accounts that you want to use. Sounds like a security nightmare waiting to happen. I'm genuinely curious here, xlordnashx, what unraid version are you currently running?
  13. If you use the official lime logo and unraid text, you may be violating trademark. Not saying you can't get them made, just that it's not nice, and may not be legal where you live. You really need to get Tom's permission before doing it.
  14. If you did add the preclear signature without clearing the disk, you would definitely corrupt any subsequent disk rebuilds until a full correcting parity check was run. Is the data on your other disks important to you?
  15. You can't just cut the power on the unraid server. I assume you're using a UPS in this scenario, which will do a clean powerdown when you cut power. Besides the fact that you are abusing the UPS this way, there is a bigger risk. Normally, in case of a power loss, you not only want to powerdown the unraid server in a clean way ASAP, but after that is done, also shut down the UPS. If you don't, and the power returns for a minute or a few seconds and then goes out AGAIN, you will be in the middle of starting up your unraid server again (because you're using 'resume on powerloss') and you can't do a nice powerdown since the UPS is probably empty or near empty... this is why you do NOT want to resume on powerloss. Just stay off. So, do not use the UPS as a handy on/off switch... it's an emergency device, treat it as such.

      While I fully agree with what you said regarding a UPS, that is not what was being recommended. Nowhere in the original post was a UPS mentioned, but it should have been. The timer use case would be directly on the server power cord, on the OUTPUT of the UPS, not the wall side of the UPS. I agree that a UPS should always be used with a server, and the original suggestion should have made clear which SIDE of the UPS the timer should be controlling. With a properly sized and controlled UPS, resume on powerloss can be applied with minimal risk.
  16. If you have a PCI ethernet card, I'd try that and turn off the motherboard ports. Realtek ethernet support seems to be spotty at best. Intel based cards usually work better.
  17. The link in this post works fine for me: http://lime-technology.com/forum/index.php?topic=2817.msg23246#msg23246 Which post are you referring to?
  18. Update on clearing 3 new 1.5 Seagates. In the original post, I had just unzipped a fresh install of 4.4.2, installed the smart libraries, and kicked off the script. It precleared 2 out of three disks successfully. I then pulled the USB stick, added the newest bubbaraid and enabled it, and booted bubbaraid. The disk assignments changed, and I suspect I may have given you status on the wrong drive. I looked at the current syslog, and it seems the disk that failed the preclear the first time through may be bad. When I originally ran the script, the drives were sdb, sdc, and sdd. Now they are sda, sdb, and sdc. sda seems to be sick.

      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 1 Raw_Read_Error_Rate 0x000f 117 100 006 Pre-fail Always - 235560959
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 3 Spin_Up_Time 0x0003 094 094 000 Pre-fail Always - 0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 7
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 5 Reallocated_Sector_Ct 0x0033 095 095 036 Pre-fail Always - 220
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 220180
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 67
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 7
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 187 Reported_Uncorrect 0x0032 041 041 000 Old_age Always - 59
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 4295032833
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 189 High_Fly_Writes 0x003a 076 076 000 Old_age Always - 24
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 190 Airflow_Temperature_Cel 0x0022 067 065 045 Old_age Always - 33 (Lifetime Min/Max 28/33)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 195 Hardware_ECC_Recovered 0x001a 044 044 000 Old_age Always - 235560959
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 18
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 18
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Error Log Version: 1
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ATA Error Count: 65 (device log contains only the most recent five errors)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICR = Command Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IFR = Features Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ISC = Sector Count Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ISN = Sector Number Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICL = Cylinder Low Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^ICH = Cylinder High Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IDH = Device/Head Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IDC = Device Command Register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IER = Error register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ^IST = Status register [HEX]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Powered_Up_Time is measured from power on, and printed as
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: SS=sec, and sss=millisec. It "wraps" after 49.710 days.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 65 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- --
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- --------------------
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:26.454 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:26.442 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:26.243 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:25.911 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 64 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- --
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- --------------------
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:26.442 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:26.243 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:25.911 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 63 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- --
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- --------------------
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:20.959 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:20.729 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:20.395 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 62 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- --
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- --------------------
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:20.936 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:20.729 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:20.395 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:15.421 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Error 61 occurred at disk power-on lifetime: 7 hours (0 days + 7 hours)
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: When the command that caused the error occurred, the device was active or idle.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After command completion occurred, registers were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ER ST SC SN CL CH DH
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- --
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 04 71 04 81 87 80 e0 Device Fault; Error: ABRT
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Commands leading to the command that caused the error were:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: -- -- -- -- -- -- -- -- ---------------- --------------------
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: a1 00 00 00 00 00 a0 00 07:00:15.534 IDENTIFY PACKET DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: ec 00 00 00 00 00 a0 00 07:00:15.421 IDENTIFY DEVICE
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 04 07:00:15.213 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 00 00 00 00 00 00 00 ff 07:00:14.888 NOP [Abort queued commands]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 35 00 00 ff ff ff ef 00 06:59:14.132 WRITE DMA EXT
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Self-test log structure revision number 1
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: No self-tests have been logged. [To run self-tests, use: smartctl -t]
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: SMART Selective self-test log data structure revision number 1
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 1 0 0 Not_testing
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 2 0 0 Not_testing
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 3 0 0 Not_testing
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 4 0 0 Not_testing
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: 5 0 0 Not_testing
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: Selective self-test flags (0x0):
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: After scanning selected spans, do NOT read-scan remainder of disk.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]: If Selective self-test is pending on power-up, resume after 0 minute delay.
      Jan 12 13:20:25 Tower preclear_disk-start[6674]:
      Jan 12 15:34:25 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:25 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:25 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:25 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:25 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:25 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:25 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:25 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:25 Tower kernel: ata4: EH complete
      Jan 12 15:34:28 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:28 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:28 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:28 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:28 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:28 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:28 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:28 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:28 Tower kernel: ata4: EH complete
      Jan 12 15:34:31 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:31 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:31 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:31 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:31 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:31 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:31 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:31 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:31 Tower kernel: ata4: EH complete
      Jan 12 15:34:34 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:34 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:34 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:34 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:34 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:34 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:34 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:34 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:34 Tower kernel: ata4: EH complete
      Jan 12 15:34:37 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:37 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:37 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:37 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:37 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:37 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:37 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:37 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:37 Tower kernel: ata4: EH complete
      Jan 12 15:34:40 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Jan 12 15:34:40 Tower kernel: ata4.00: BMDMA stat 0x64
      Jan 12 15:34:40 Tower kernel: ata4.00: cmd 25/00:00:00:8d:58/00:01:5d:00:00/e0 tag 0 dma 131072 in
      Jan 12 15:34:40 Tower kernel: res 51/40:00:a5:8d:58/40:00:5d:00:00/00 Emask 0x9 (media error)
      Jan 12 15:34:40 Tower kernel: ata4.00: status: { DRDY ERR }
      Jan 12 15:34:40 Tower kernel: ata4.00: error: { UNC }
      Jan 12 15:34:40 Tower kernel: ata4.00: configured for UDMA/133
      Jan 12 15:34:40 Tower kernel: ata4.01: configured for UDMA/133
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
      Jan 12 15:34:40 Tower kernel: Descriptor sense data with sense descriptors (in hex):
      Jan 12 15:34:40 Tower kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
      Jan 12 15:34:40 Tower kernel: 5d 58 8d a5
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] ASC=0x11 ASCQ=0x4
      Jan 12 15:34:40 Tower kernel: end_request: I/O error, dev sda, sector 1566084517
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760564
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760565
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760566
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760567
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760568
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760569
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760570
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760571
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760572
      Jan 12 15:34:40 Tower kernel: Buffer I/O error on device sda, logical block 195760573
      Jan 12 15:34:40 Tower kernel: ata4: EH complete
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Write Protect is off
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Mode Sense: 00 3a 00 00
      Jan 12 15:34:40 Tower kernel: sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
      Jan 12 15:34:42 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

      I assume the failures at 7 hours are when the preclear failed the first time.
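      For anyone wading through a syslog like this, a rough sketch of pulling the worrying SMART attributes out of the preclear lines might look like the following. The sample lines are copied from the log above; the watch list of attributes and the "nonzero raw value is suspicious" rule are my own assumptions, not anything the preclear script is known to apply.

      ```python
      import re

      # Attributes whose raw value is commonly expected to stay at zero on a
      # healthy disk. This watch list is an assumption, not from the script.
      WATCH = {
          "Reallocated_Sector_Ct",
          "Reported_Uncorrect",
          "Current_Pending_Sector",
          "Offline_Uncorrectable",
      }

      # Matches "<id> <name> <flag> <value> <worst> <thresh> <type> <updated>
      # <when_failed> <raw>" after the syslog "process[pid]:" prefix.
      ATTR_RE = re.compile(
          r"\]:\s+(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]{4}\s+"
          r"\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
      )

      def flag_attributes(lines):
          """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
          flagged = {}
          for line in lines:
              m = ATTR_RE.search(line)
              if m and m["name"] in WATCH and int(m["raw"]) > 0:
                  flagged[m["name"]] = int(m["raw"])
          return flagged

      PREFIX = "Jan 12 13:20:25 Tower preclear_disk-start[6674]: "
      sample = [
          PREFIX + "5 Reallocated_Sector_Ct 0x0033 095 095 036 Pre-fail Always - 220",
          PREFIX + "10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0",
          PREFIX + "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 18",
          PREFIX + "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0",
      ]
      flagged = flag_attributes(sample)
      print(flagged)
      ```

      On these four sample lines it reports the 220 reallocated sectors and the 18 pending sectors, and ignores the two attributes that are either unwatched or zero.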
  19. I'm kicking off another set of 3 preclears on all 3 disks. We'll see in a couple days.
  20. root@Tower:/boot/scripts# preclear_disk.sh -t /dev/sdb
      Pre-Clear unRAID Disk
      ########################################################################
      Device Model: ST31500341AS
      Serial Number: 9VS0HE2T
      Firmware Version: CC1H
      User Capacity: 1,500,301,910,016 bytes
      Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
      255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
      Units = sectors of 1 * 512 = 512 bytes
      Disk identifier: 0x00000000
      Device Boot Start End Blocks Id System
      /dev/sdb1 63 2930277167 1465138552+ 0 Empty
      Partition 1 does not end on cylinder boundary.
      ########################################################################
      ============================================================================
      ==
      == DISK /dev/sdb IS PRECLEARED
      ==
      ============================================================================
      root@Tower:/boot/scripts#

      All I did was boot the server, install the smartctl libraries, install the preclear script, and kick it off in three different telnet windows on the three different drives. 2 completed successfully, 1 didn't.
  21. Type:
      fdisk -l /dev/sdb
      dd if=/dev/sdb count=1 | od -x -A d
      Post the output of both commands. Joe L.

      Tower login: root
      Linux 2.6.27.7-unRAID.
      root@Tower:~# fdisk -l /dev/sdb
      Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
      255 heads, 63 sectors/track, 182401 cylinders
      Units = cylinders of 16065 * 512 = 8225280 bytes
      Disk identifier: 0x00000000
      Device Boot Start End Blocks Id System
      /dev/sdb1 1 182402 1465138552+ 0 Empty
      Partition 1 does not end on cylinder boundary.
      root@Tower:~# dd if=/dev/sdb count=1 | od -x -A d
      1+0 records in
      1+0 records out
      512 bytes (512 B) copied, 0.000298241 s, 1.7 MB/s
      0000000 0000 0000 0000 0000 0000 0000 0000 0000
      *
      0000448 0000 0000 0000 003f 0000 7af1 aea8 0000
      0000464 0000 0000 0000 0000 0000 0000 0000 0000
      *
      0000496 0000 0000 0000 0000 0000 0000 0000 aa55
      0000512
      root@Tower:~#
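      As a worked reading of that od dump, the sketch below rebuilds the 512-byte sector from the two non-zero rows and decodes the first MBR partition entry. It assumes od -x ran on a little-endian (x86) host, so each 16-bit word is byte-swapped relative to the disk, and it relies on the standard MBR layout (first partition entry at byte 446, signature at byte 510). The variable names are mine.

      ```python
      import struct

      # 16-bit words exactly as od -x printed them at offsets 448 and 496;
      # every other byte in the sector was zero (the "*" rows).
      words_448 = "0000 0000 0000 003f 0000 7af1 aea8 0000"
      words_496 = "0000 0000 0000 0000 0000 0000 0000 aa55"

      def words_to_bytes(words: str) -> bytes:
          """Undo od -x's little-endian word display to get on-disk byte order."""
          return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())

      sector = bytearray(512)
      sector[448:464] = words_to_bytes(words_448)
      sector[496:512] = words_to_bytes(words_496)

      # First partition entry: status, CHS start, type, CHS end, start LBA, size.
      status, chs_start, ptype, chs_end, start_lba, num_sectors = struct.unpack(
          "<B3sB3sII", sector[446:462]
      )
      signature = bytes(sector[510:512])

      print("partition type:", ptype)
      print("start sector:  ", start_lba)
      print("sector count:  ", num_sectors)
      print("end sector:    ", start_lba + num_sectors - 1)
      print("signature:     ", signature.hex())
      ```

      The decoded start sector of 63 plus the sector count lands exactly on the 2930277168 total sectors that fdisk reports for this drive model, with partition type 0 and the usual 55 aa boot signature.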
  22. The first disk didn't complete successfully. What logs and/or other info do I need to look at to figure out why?
  23. This was the stock w2k command line telnet. I normally use putty, but this server is not on my home lan.