January 21, 2009
Anyone know what the following errors mean? Thanks in advance.
Jan 20 22:29:10 Tower kernel: pci 0000:00:13.0: OHCI: BIOS handoff failed (BIOS bug?) 00000184
Jan 20 22:29:10 Tower kernel: ata1: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata1: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata1.00: HPA detected: current 2930275055, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata2: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata2: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata3: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata3: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata3.00: HPA detected: current 2930277168, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata4: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata4: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata6: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata6: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata6.00: HPA detected: current 1953523055, native 1953525168
Jan 20 22:29:10 Tower kernel: atiixp 0000:00:14.1: simplex device: DMA disabled
Jan 20 22:29:10 Tower kernel: ide1: DMA disabled
Jan 20 22:29:11 Tower kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
January 21, 2009
Looks bad. I see a reference to the "HPA". That is the "Host Protected Area". It is a small part of the disk drive that is reserved by some BIOSes. Some users have complained that the HPA will make certain drives in their array slightly smaller than other drives of the same size. This can be a PITA if you want to use the drive as parity, because unRAID will not allow a parity drive to be smaller (even by the tiniest amount) than the largest drive in the array. I believe there is a fix for the issue I am mentioning, but this does not appear to be your problem.
You need to post more information for us to be able to help very much.
1. Explain the situation that led up to this. (e.g., Has this motherboard worked previously and now is returning these errors? Did you recently change anything?)
2. Post the entire syslog.
3. Post details about your system (motherboard, memory, disk controllers, etc.).
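The parity rule described above amounts to a simple size comparison. The following is a minimal sketch, not unRAID's actual code: the function name `parity_is_large_enough` is hypothetical, and the two sector counts are the "current" values reported for ata1 and ata3 in the log excerpt from the first post. Which of those drives is the parity drive is an assumption made purely for illustration.

```python
def parity_is_large_enough(parity_sectors, data_sectors):
    """Return True if the parity drive is at least as large as every data drive."""
    return all(parity_sectors >= size for size in data_sectors)


# "current" sector counts from the HPA lines in the first post.
clipped = 2930275055  # ata1: the smaller visible size
full = 2930277168     # ata3: the larger visible size

# Assumption for illustration: the clipped drive is parity, the full one is data.
print(parity_is_large_enough(clipped, [full]))  # False

# Swapping the roles satisfies the rule.
print(parity_is_large_enough(full, [clipped]))  # True
```

Even a shortfall of a couple of thousand sectors fails the check, which matches the "even by the tiniest amount" wording above.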
January 21, 2009 Author
That's exactly what's going on - I have a 1.5TB parity drive, and it's telling me I can't install additional 1.5TB drives because they are larger than the parity drive. I'll post the syslog when I get home; I don't have access to the server from work.
The other issue I'm having (nowhere near as bad as this one) is that I can't view files on my flash when the server is online. I can see folders, but the files are hidden. All my settings for hidden/system files are set correctly, so I don't really know what the issue is.
January 21, 2009
Look here and here. It sounds like changing a BIOS setting to run SATA drives in SATA / AHCI mode, rather than PATA / IDE mode, may fix the problem.
January 21, 2009 Author
Attached is the complete syslog and a screen capture of the issue. All of my drives are already running in AHCI/SATA. Thanks.
January 21, 2009 Author
From reading the above threads, it seems the one constant is the Gigabyte board... which is what I have.
January 21, 2009
There are some disturbing lines in the syslog about hardware errors related to these drives on this controller (likely the MB). Since you've been running fine with them, however, I am assuming that is not really causing a problem. (I would do some research and look for a BIOS update anyway.)
Yes, it appears 3 of your drives have an HPA. I think that once the HPA is created, it is created. In order to delete it you have to do something special (see those other threads). So if you ever ran in non-SATA mode, it is possible that the HPAs were created then and have stayed around ever since. Or it could be that, with your MB, HPAs are created even in SATA mode.
All drive sizes should end in 552 (kind of a coincidence, but true for modern drives 500GB+, at least through the 1.5TB). (Only 2 drives are showing that way - maybe only 2 drives have HPA.)
You have a few different options:
1 - Do the parity swap procedure to replace a drive with a drive larger than parity. It basically rebuilds parity onto the new disk, and then rebuilds the data disk onto the existing parity disk. When you are done, the slightly larger disk will be parity and disk1 will be what is now the parity disk. This would likely be the "official" recommendation.
2 - Figure out how to get rid of the HPAs.
3 - Figure out how to get the HPA ONTO the new disk. It could be as easy as hooking it up to a motherboard port and going into BIOS. I'm just not sure what puts it there, so I can't be more specific.
January 21, 2009 Author
Thanks for the reply. I was running a few of the drives in IDE mode when the system was initially built, so it is possible that the HPA is there from that period. I'll pull my parity drive and try to pull the HPA off of it. Hopefully that will resolve the issue. I'll post back when it's done.
January 21, 2009
Are you in the middle of trying to recover data from a failed disk? If so, be very careful!
January 21, 2009
This software may let you reset the HPA.
http://hddguru.com/content/en/software/2007.07.20-HDD-Capacity-Restore-Tool/
I've never used it, but it looks like it might do what you are needing. If it blows up, or creates a black-hole, or causes a time-warp, you are on your own.
Joe L.
January 22, 2009
Quote:
Anyone know what the following errors mean? Thanks in advance.
Jan 20 22:29:10 Tower kernel: pci 0000:00:13.0: OHCI: BIOS handoff failed (BIOS bug?) 00000184
Jan 20 22:29:10 Tower kernel: ata1: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata1: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata1.00: HPA detected: current 2930275055, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata2: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata2: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata3: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata3: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata3.00: HPA detected: current 2930277168, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata4: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata4: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata6: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata6: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata6.00: HPA detected: current 1953523055, native 1953525168
Jan 20 22:29:10 Tower kernel: atiixp 0000:00:14.1: simplex device: DMA disabled
Jan 20 22:29:10 Tower kernel: ide1: DMA disabled
Jan 20 22:29:11 Tower kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended

Just a suggestion: it would be better to separate these lines, as I have above, because they are not related to each other, and running them together becomes misleading, especially because they have the same timestamp. I agree, it is far better to provide the entire syslog, so the lines are not viewed out of context.

pci 0000:00:13.0: OHCI: BIOS handoff failed (BIOS bug?) 00000184
This is a common message found with certain recent motherboards. It appears to be harmless, probably a small bug in your BIOS.
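On the suggestion about separating log lines: a run-together paste can be split back into one entry per line by breaking before each syslog timestamp. This is only a sketch. `split_entries` is a hypothetical helper, the pattern assumes the classic "Mon DD HH:MM:SS " syslog prefix seen in this thread, and splitting on a zero-width match needs Python 3.7 or later.

```python
import re

# Zero-width match just before each "Jan 20 22:29:10 "-style timestamp.
STAMP = re.compile(
    r"(?=(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
    r" [ \d]\d \d{2}:\d{2}:\d{2} )"
)


def split_entries(blob):
    """Split a run-together syslog paste into a list of single entries."""
    return [part.strip() for part in STAMP.split(blob) if part.strip()]


# Sample text copied from the thread, with one entry fused onto the previous one.
blob = (
    "Jan 20 22:29:10 Tower kernel: ata2: failed due to HW bug, retry pmp=0"
    "Jan 20 22:29:10 Tower kernel: ata3: softreset failed (device not ready) "
    "Jan 20 22:29:10 Tower kernel: ide1: DMA disabled"
)

entries = split_entries(blob)
for entry in entries:
    print(entry)
```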
Jan 20 22:29:10 Tower kernel: ata1: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata1: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata2: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata2: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata3: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata3: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata4: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata4: failed due to HW bug, retry pmp=0
Jan 20 22:29:10 Tower kernel: ata6: softreset failed (device not ready)
Jan 20 22:29:10 Tower kernel: ata6: failed due to HW bug, retry pmp=0
These messages were discussed elsewhere, and the conclusion was that they are harmless. They only occur with motherboards with the SB600 or SB700 chipsets. Please see http://lime-technology.com/forum/index.php?topic=2826.msg23935#msg23935

Jan 20 22:29:10 Tower kernel: ata1.00: HPA detected: current 2930275055, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata3.00: HPA detected: current 2930277168, native 18446744072344861488
Although you do have an HPA on one of the drives, these messages do not indicate it by themselves, because all of the Seagate 1.5TB drives show this message, with the garbage native size above. It appears to be a bug, either in the kernel driver that reads this native size or in the number returned by the Seagate firmware. Because the number returned is not equal to the current size, the kernel thinks there is an HPA on all of the 1.5TB drives. In your case, the 2 lines do show a difference of 2113 in the current sizes, which is typical of HPAs. And that does seem to be more common on Gigabyte boards. I'm afraid the way to fix this currently is to switch the two 1.5TB drives.
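The size arithmetic above can be checked with a few lines of Python. The sketch below is illustrative only: `hpa_report` is a hypothetical helper, and the regular expression assumes the exact "HPA detected: current N, native M" wording shown in this thread. The 32-bit reading at the end is just an observation from the numbers, not a confirmed diagnosis of the driver or the drive firmware.

```python
import re

LOG = """\
Jan 20 22:29:10 Tower kernel: ata1.00: HPA detected: current 2930275055, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata3.00: HPA detected: current 2930277168, native 18446744072344861488
Jan 20 22:29:10 Tower kernel: ata6.00: HPA detected: current 1953523055, native 1953525168
"""

HPA_RE = re.compile(r"(ata\d+\.\d+): HPA detected: current (\d+), native (\d+)")


def hpa_report(log_text):
    """Map each ATA device to its (current, native) sector counts."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in HPA_RE.finditer(log_text)
    }


sizes = hpa_report(LOG)

# Gap between the visible sizes of the two 1.5TB drives.
print(sizes["ata3.00"][0] - sizes["ata1.00"][0])  # 2113

# The third drive reports a sane native size, and the gap is the same.
print(sizes["ata6.00"][1] - sizes["ata6.00"][0])  # 2113

# Observation: the low 32 bits of the "garbage" native value equal
# ata3's current size, as if a 32-bit count had been sign-extended.
print(sizes["ata1.00"][1] & 0xFFFFFFFF)  # 2930277168
```

If that reading holds, ata3 is showing its full size and ata1 is the drive clipped by 2113 sectors, which would agree with the conclusion above that only one of the two 1.5TB drives really has an HPA.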
If you find a way to remove the HPA from your current parity drive, then you probably would still have to rebuild it, because it won't look like the same drive once you have changed its size.

Jan 20 22:29:10 Tower kernel: atiixp 0000:00:14.1: simplex device: DMA disabled
Jan 20 22:29:10 Tower kernel: ide1: DMA disabled
This corresponds to an IDE controller with 2 channels (2 drives each), IDE0 and IDE1, neither of which is in use, so this seems harmless. This set of lines did not appear in the last syslog you attached, but you also had a Seagate ST31000340NS attached here, a SATA drive. A change in the BIOS settings may have something to do with this change.

Jan 20 22:29:11 Tower kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
This one is from the extra stuff included with BubbaRAID, so you will have to ask him about it. It is not normal for unRAID, as unRAID does not use ext3.
January 22, 2009 Author
The application posted by Joe L. took care of the HPA issue.
Rob - You're absolutely right. I did have to rebuild my parity drive after removing the HPA, since it became recognized as a different drive, as well as one other drive that had the same issue. I'm going to get everything working right, then I'll post another log. Thanks for everyone's help!