System broken



I was saving a file to the tower and it hung.  Windows gave an error about not being able to close the file.

 

When I checked the tower it seemed to be OK, but I could not cd into the offending directory and syslog said:

 

Apr 15 20:09:58 Tower kernel: hdi: lost interrupt

 

I could also not remove the directory that had the broken file.

 

I tried to shut down and it hung. From another window I tried to reboot, and that hung too.

 

I then power cycled the machine and now I get the following errors upon rebooting:

 

Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2308504
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2308512
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2308520
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2308528
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2308928
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2314960
Apr 15 20:07:58 Tower kernel: md0: parity incorrect: 2314968

 

Ideas?

 

 


Well, it did come back up and started a parity check.  It then hung during the parity check.  I've now removed the hard drive and am running unprotected.

 

The drive has never had any issues until now. It is about 4 months old.

 

I took the drive out, rebooted, deleted the file I had the original trouble with, shut down, put the drive back in, and rebooted.  It is now rebuilding the server, and will be done in 1682 minutes!  Wow, that is a long time!

 

Hopefully all will work out in the end.

 

 

 

 


Well 1600+ minutes later the tower did finally finish the rebuild.  Is this a typical time frame to rebuild a 320GB drive?

 

On the bright side there were no DMA errors during that entire rebuild time.  That is amazing considering the issues I had prior to upgrading to the new version.  Appears that the new version has made significant improvements for me.

 

 


lovingHDTV,

 

1600+ minutes is about four times longer than it takes my three Terabyte unRaid array to do a parity check.

 

Typically, it takes my array around 360 minutes or less now with the new OS.

 

I would say that you definitely have some kind of problem.

 

Regards,

TCIII


Well 1600+ minutes later the tower did finally finish the rebuild.  Is this a typical time frame to rebuild a 320GB drive?

No, that is definitely not typical... unless perhaps you have an old, slow disk in there.
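For a rough sense of scale, here is a back-of-the-envelope sketch of the average rate that rebuild implies. It assumes the 1682-minute estimate quoted earlier in the thread and the 320073 MB drive size that the kernel reports for hdi; the function name is made up for illustration, and the real rebuild rate will not have been constant.

```python
# Average rebuild throughput implied by figures quoted in this thread.
DRIVE_MB = 320073        # drive size as reported in the kernel log (MB)
REBUILD_MINUTES = 1682   # rebuild estimate shown by the server


def rebuild_rate_mb_per_s(size_mb: float, minutes: float) -> float:
    """Average MB/s needed to process size_mb in the given number of minutes."""
    return size_mb / (minutes * 60)


rate = rebuild_rate_mb_per_s(DRIVE_MB, REBUILD_MINUTES)
print(f"{rate:.2f} MB/s")  # prints "3.17 MB/s"
```

An average of roughly 3 MB/s is far below what a drive negotiated at UDMA(100) should sustain, which is consistent with something being wrong.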

 

On the bright side there were no DMA errors during that entire rebuild time.  That is amazing considering the issues I had prior to upgrading to the new version.  Appears that the new version has made significant improvements for me.

Glad to hear that, but perhaps what's happening is there are still DMA errors occurring, just not hanging the system.

 

At the present time, I am out of town until Thursday, Apr. 20.  When I get back, I'll be able to give you some things to try to test the health of your system.


I stopped the array and rebooted.

 

Now I see this in the /var/log/messages file:

 

Apr 19 00:26:00 Tower kernel:    ide4: BM-DMA at 0xbc00-0xbc07, BIOS settings: hdi:pio, hdj:pio
Apr 19 00:26:00 Tower kernel:    ide5: BM-DMA at 0xbc08-0xbc0f, BIOS settings: hdk:pio, hdl:pio
Apr 19 00:26:00 Tower kernel: hda: 625142448 sectors (320073 MB) w/8192KiB Cache, CHS=38913/255/63, UDMA(100)
Apr 19 00:26:00 Tower kernel: hdc: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=14593/255/63, UDMA(100)
Apr 19 00:26:00 Tower kernel: hdi: 625142448 sectors (320073 MB) w/8192KiB Cache, CHS=38913/255/63, UDMA(100)

 

Does this mean that the BIOS is trying to set this drive to PIO mode, but the kernel overrode it this time to UDMA 100?
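To make that comparison easier to repeat after each reboot, here is a minimal sketch that pulls both values out of the log lines: the mode listed under "BIOS settings" and the mode the kernel prints at the end of each drive line. The regular expressions are assumptions based only on the lines pasted above, so other kernel versions may format these messages differently. It only shows the two values side by side; it does not prove which one the drive is actually using.

```python
import re

# Patterns inferred from the log lines in this post (an assumption, not a spec).
BIOS_RE = re.compile(r"BIOS settings: (\w+):(\w+), (\w+):(\w+)")
DISK_RE = re.compile(r"kernel: (hd[a-z]): .*sectors.*,\s*([A-Za-z]+(?:\(\d+\))?)\s*$")

SAMPLE = [
    "Apr 19 00:26:00 Tower kernel:    ide4: BM-DMA at 0xbc00-0xbc07, BIOS settings: hdi:pio, hdj:pio",
    "Apr 19 00:26:00 Tower kernel: hda: 625142448 sectors (320073 MB) w/8192KiB Cache, CHS=38913/255/63, UDMA(100)",
    "Apr 19 00:26:00 Tower kernel: hdi: 625142448 sectors (320073 MB) w/8192KiB Cache, CHS=38913/255/63, UDMA(100)",
]


def transfer_modes(log_lines):
    """Return (bios, kernel) dicts mapping drive name to the reported mode."""
    bios, kernel = {}, {}
    for line in log_lines:
        m = BIOS_RE.search(line)
        if m:
            bios[m.group(1)] = m.group(2)
            bios[m.group(3)] = m.group(4)
            continue
        m = DISK_RE.search(line)
        if m:
            kernel[m.group(1)] = m.group(2)
    return bios, kernel


bios, kernel = transfer_modes(SAMPLE)
for drive in sorted(kernel):
    print(f"{drive}: BIOS={bios.get(drive, 'n/a')} kernel={kernel[drive]}")
```

To run it against the real log, read the lines from /var/log/messages instead of SAMPLE.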

 

I ran my bitanalyzer (a program my brother wrote for me that reads/writes a file and reports the throughput) and I am getting better performance than before.  Before, I was getting ~10Mb/s for writes; now I am getting 14Mb/s.  Reads were ~80Mb/s.  So I think this means that I'm truly running UDMA again.
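For anyone who wants to run a similar check, here is a minimal sketch of that kind of test. It is not the bitanalyzer program itself, just an assumed equivalent: it writes a file of a given size, reads it back, and reports both rates in megabytes per second. The function name and default sizes are made up for illustration. Point the path at a file on the mounted share to test the tower; the demo below only uses a local temporary directory.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def measure_throughput(path, size_mb=16, chunk_mb=1):
    """Write size_mb of data to path, read it back, return (write, read) in MB/s."""
    chunk = os.urandom(chunk_mb * MB)
    chunks = size_mb // chunk_mb

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before stopping the clock
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_mb * MB)
            if not data:
                break
            total += len(data)
    read_s = time.perf_counter() - start

    mb = total / MB
    return mb / write_s, mb / read_s


# Demo against a local temporary file; replace the path with one on the share.
with tempfile.TemporaryDirectory() as tmp:
    write_rate, read_rate = measure_throughput(os.path.join(tmp, "probe.bin"), size_mb=8)
    print(f"write: {write_rate:.1f} MB/s, read: {read_rate:.1f} MB/s")
```

The read figure is likely to be inflated because the file was just written and may still be in cache, so a test file larger than the machine's RAM gives a more honest number.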

 

 

 
