[SOLVED] unRAID network blowing up



I recently moved my devices around at home and reconfigured my network, and now my unRAID 4.6 server appears to blow up whenever I start copying files to it.

The system boots and I can ping it all day, but as soon as I start moving data to it I lose the connection to the server and the copy fails. Sometimes the connection returns and other times it does not. This was never an issue in the past, and the only real change is a new gigabit switch. All my other network connections seem to stay solid, but I am also not moving 20 GB files between the other devices.

I am just not sure where this issue could be creeping in from, since everything worked great until I moved my equipment around, and the overall configuration is still essentially the same.

I thought for a second it was because of a parity check, but I stopped it and the copy still fails. It makes zero sense, so I am looking for some assistance. I cannot even attach the log because the system is frozen. The light for the unRAID port on the switch is flickering extremely fast, while the rest flash normally when they transmit data. Something is certainly wrong with the unRAID connection, but I do not understand what. Does anyone have any thoughts?

The system is locked up to the point where even the console does not work; the keyboard attached directly to the server gets no response. A hard reset seems to be the only fix, and I have already tried that once.

 

Link to comment

A new switch with a port light that is "flickering really fast" indicates either a lot of traffic on that port, a cable picking up a lot of noise, a defective cable, or a defective port on the new switch.

Try moving the cable to a different port first.  Then, possibly try a different switch.

Link to comment

Yeah, but that is the crazy thing, none of this should make the server completely hang where the system does not even accept input from the keyboard connected directly to the server.  It seems the entire system is frozen.

 

One thing I did find is a user with a similar issue and he said he had a bad drive.  Can a bad drive cause the system to hang like this?  I know looking at the disk after a hard reset they all show healthy.

 

Because I am not able to get any logs, I was thinking about a hard reset so I can access the system again.  If there is a drive issue, would the log indicate anything right after boot?

 

If I could make any suggestion at all, it would be that instead of overwriting the log file, the system would copy the old logs into a new folder at boot and then start a new log.  That way, when the server crashes and hangs like this, we could still get to the logs after rebooting and hopefully find a cause.  The folder would simply be stamped with the date and time of boot.  I have no idea what it would involve, but I think it would be a great enhancement if it could be done.

Link to comment

Yeah, but that is the crazy thing, none of this should make the server completely hang where the system does not even accept input from the keyboard connected directly to the server.  It seems the entire system is frozen.

 

Yes, but if the system is too busy responding to interrupts on the network port ....?

 

One thing I did find is a user with a similar issue and he said he had a bad drive.  Can a bad drive cause the system to hang like this?  I know looking at the disk after a hard reset they all show healthy.

 

Yes, I'm sure that there are hard drive/cabling faults which could cause the system to become unresponsive.

 

Because I am not able to get any logs, I was thinking about a hard reset so I can access the system again.  If there is a drive issue, would the log indicate anything right after boot?

 

If I could make any suggestion at all, it would be that instead of overwriting the log file, the system would copy the old logs into a new folder at boot and then start a new log.  That way, when the server crashes and hangs like this, we could still get to the logs after rebooting and hopefully find a cause.  The folder would simply be stamped with the date and time of boot.  I have no idea what it would involve, but I think it would be a great enhancement if it could be done.

 

The syslog is created on a RAM disk, so it's not possible to recover it after a hard reset.  It is, indeed, copied to the flash drive as the system is shut down, but of course this doesn't happen if you have to force a system reset.  If the syslog were written directly to your flash drive, it would seriously shorten the life of the device.

 

Your best bet for gathering more info is to reboot the system, run the command: 'tail -f /var/log/syslog' on the console, and then provoke the failure again.  The last messages from the syslog should appear on your console, if that is still able to display.
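For anyone who would rather script this than watch the console, here is a minimal Python sketch of the same idea as 'tail -f': poll the log and hand back only the complete lines added since the last look. The function names read_new_lines and follow are made up for illustration, and it assumes Python is available on the machine reading the log, which a stock unRAID install may not have.

```python
import os
import time

def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended since offset."""
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank (truncated or replaced), so start again from the top.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only whole lines; a partial last line is left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end

def follow(path, interval=1.0):
    """Print new lines forever, starting at the current end of the file."""
    offset = os.path.getsize(path)
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            print(line, flush=True)
        time.sleep(interval)
```

Calling follow("/var/log/syslog") behaves roughly like the tail command above. The print call could be swapped for something that sends each line to another machine, so the last messages survive a freeze, but that part is left out here.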

Link to comment

FWIW, I had two problems that came up intermittently when I moved large amounts of data to the unRAID server:

 

1. A faulty gigabit switch.  It worked fine with light loads, but with heavier loads I would randomly lose connection to computers, and connections might or might not come back without a reboot.  For now, I've stopped buying D-Link switches; I had to replace two faulty switches to fix my problem.

 

2. Realtek NIC on motherboards and discrete NIC cards.  Replaced both with Intel.  The Realtek worked fine until I started doing really large transfers; wonder if there's a thermal problem with them.

 

Yesterday I successfully created a 300 GB backup file on the unRAID server without a problem (ultimately I'll configure my backup software to generate a series of much smaller files --- One huge file that took nearly 90 minutes to create is a large gamble for me)

 

Link to comment

Well, this is something interesting.  I just unplugged the network cable and left it for a bit and then the console became active again...

 

Finally, I plugged the cable back in and things remained up, so I remoted in and was able to pull the syslog down without a hard reboot.  It is now attached.  I see some red Emask errors, which I can only assume mean something bad.

Also, how do I exit the tail command now that I was able to pull the log, or will I still need it?  Let me know if I should provoke another failure and capture it.

 

Thanks.

syslog-2011-03-20.txt

Link to comment

Just to add a little more information, just took a peek at the smart history page in unRAID and saw the following,

 

Disk 5: *ERROR* - Current_Pending_Sector it is now 134 (error threshold is 5)

 

 

Could this be as simple as a bad cable or one that came loose during the move, or is it more than likely a bad drive?  I don't really want to do anything for fear of making it worse, so I will wait for direction.  Thanks.

 

 

Got this about my Disk 5,

Statistics for /dev/sdf 00MVWB0_WD-WMAZ20082367

smartctl -a -d ata /dev/sdf
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZ20082367
Firmware Version: 50.0AB50
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Mar 20 12:39:10 2011 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  0)  The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (37200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (  5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   162   161   021    Pre-fail  Always       -       6875
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       125
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       6690
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       11
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   129   129   000    Old_age   Always       -       215086
194 Temperature_Celsius     0x0022   123   115   000    Old_age   Always       -       27
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       134
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   189   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      6690         -
# 2  Short offline       Completed without error       00%      6656         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 

Link to comment

Also, how do I exit the tail command now that I was able to pull the log

Control-C will exit the command.

 

Sectors pending re-allocation are not usually caused by a bad cable.  (unless it is a power cable and the sectors could not be written properly)  They are the disk's inability to read specific sectors after multiple attempts.

 

Those sectors have been marked for re-allocation when they are next written.  They are a sign of a disk that is starting to fail, although most disks have several thousand spare sectors.  You'll know when they either get re-written in place, or get re-allocated when they are next written.

 

Joe L.
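To keep an eye on the pending-sector count without reading the whole report each time, something like the following Python sketch could pull the raw values out of saved 'smartctl -a' output. The function names are hypothetical, the regular expression assumes the attribute table layout shown in the report earlier in this thread, and the default threshold of 5 is simply the figure from the unRAID message quoted above, not a smartmontools setting.

```python
import re

# Matches one row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ATTR_ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)

def parse_smart_attributes(report):
    """Return {attribute_name: raw_value} for every attribute row found."""
    attrs = {}
    for line in report.splitlines():
        m = ATTR_ROW.match(line)
        if m:
            attrs[m.group("name")] = int(m.group("raw"))
    return attrs

def pending_sector_warning(attrs, threshold=5):
    """Return a warning string if pending sectors exceed threshold, else None."""
    pending = attrs.get("Current_Pending_Sector", 0)
    if pending > threshold:
        return f"Current_Pending_Sector is {pending} (threshold {threshold})"
    return None

# Three rows copied from the report posted in this thread.
sample = """\
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      134
199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0
"""
attrs = parse_smart_attributes(sample)
print(pending_sector_warning(attrs))
```

Comparing the parsed Reallocated_Sector_Ct and Current_Pending_Sector values between two reports taken some time apart is one way to see whether the pending sectors were re-written in place or re-allocated.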

Link to comment

OK, the system locked up completely again, but the screen was off and I couldn't bring it back to life to see the tail of the log.  Obviously the health of this system is bad.  I am curious: as a test, can I just remove disk 5 from the server and power it back on?  I want to see whether that disk is the cause.  I figure that if I bring the server up and everything starts working again, I will get a new drive in there ASAP.  I assume I would then just power down, add the new disk, say it is disk 5, and parity will rebuild it, just like with any other drive failure.  The key for me is to get the thing running so I can test and make sure that is what is causing my problem.

Drive 5 will need to be replaced anyway because of the bad sectors, but I don't want to wait for the new drive just to test whether it was causing my issue.

 

Thoughts?

Link to comment

Apparently removing the failing disk was a horrible idea.  Now I am seeing this at boot,

 

Model/Serial

65R_WD-WCAVY2217816

65R_WD-WCAVY2136899

00P_WD-WMAVU0856021

00P_WD-WMAVU0526394 <-- was old disk in this slot

      WDC WD15EADS-00P_WD-WMAVU0526394 <-- current disk in this slot

75J7B0_WD-WMATV2054515 <-- was old disk in this slot

      WDC WD1001FALS-7_WD-WMATV2054515 <-- current disk in this slot

 

00MVWB0_WD-WMAZ20082367

00M_WD-WCAZA1264935

00M_WD-WCAZA1253619

 

 

What in the world?  All I did was remove 1 disk and now the whole system is blowing up.  Going to go shut down and wait for assistance, this is crazy.

 

The only other thing I will mention is that I put 4.7 on my flash drive since I had everything shut down.  I figured it couldn't hurt, but maybe that is what caused this.  Now that I look at it again, the serial number is right, but it is identifying the disk differently.  Holy smokes, I hope this is an easy fix.  Or should I return to 4.6?

Link to comment

You're changing too many things at once. 4.7 is more rigorous about the drive format it allows; an HPA could cause the problem you're seeing. Revert to 4.6.

 

Un-assigning and disconnecting the suspect drive was not a bad idea. A failing drive can cause unexpected system-wide behaviors.

Link to comment

I'm not fully conversant with the syslogs produced by unRAID, but there are a few things which puzzle me:

 

1) It appears that your machine was booted at 18:35, yet it appears that the drives in your array did not get mounted until 18:59.  However, this might be explained by ntp synchronisation - I wonder whether you have a problem with your clock?  There is also a subsequent report 'Clocksource tsc unstable'

 

2) At 19:04, ntp reported 'no servers reachable'

 

3) About 6 minutes later, it appears that your network interface is timed out by a watchdog.

 

4) An error is reported for ata14 when the 'CHECK POWER MODE' command fails.  The drive then becomes ready again and the network interface comes back up, all within one second.  As I read it, ata14 is your sdh, a 2TB WD20EARS.

 

 

Just a wild guess, but with clock, network and drive faults being reported, I wonder whether it is your psu which is at fault.

 

Edit:

 

Just to add that ata14.00 does not appear to be the drive reporting the Current_Pending_Sector errors, since the reported firmware version is different.

Link to comment

Here is a good one.

 

So, I took the new switch out of the equation. It is a TRENDnet TEG-S80G and it gets really great reviews, so I am surprised it is the culprit; maybe this unit is bad.  I plugged the server directly into my D-Link gigabit switch instead.  I believe it is this one: DGS-2208 8-Port 10/100/1000 Desktop Switch http://www.dlink.com/products/?pid=495.  This is also the switch my desktop system connects to.

The two switches are simply linked to each other by another network cable.

Once I connected to the other switch, I ran the same transfer again and everything went through without a hitch.  I have no idea what to make of it.  Do the D-Link and TRENDnet not play nicely together?  Is it because they are both green devices?  I also have a 10/100 D-Link switch connected to the TRENDnet for some other devices, so I don't know whether that has an effect too.  All the connections seem to work fine; it is only when I start to transmit large amounts of data that I have this issue.  I copied some smaller files with no issue at all.

The TRENDnet has a 5-year warranty so I may try to get it replaced, but I am really unsure whether that is truly the issue; it is just really strange.  If the switch were bad, I would expect it to fail for all transfers, and the fact that some work and others don't is what is so weird.

 

I have attached another syslog just in case it tells us anything new.

 

1) It appears that your machine was booted at 18:35, yet it appears that the drives in your array did not get mounted until 18:59.  However, this might be explained by ntp synchronisation - I wonder whether you have a problem with your clock?  There is also a subsequent report 'Clocksource tsc unstable'

I turned on the server, and because the drive is missing I had to tell it to start.  I was in the middle of making dinner, so I didn't start it right away.

2) At 19:04, ntp reported 'no servers reachable'

Not sure...

3) About 6 minutes later, it appears that your network interface is timed out by a watchdog.

Watchdog?  not sure what that would be

4) An error is reported for ata14 when the 'CHECK POWER MODE' command fails.  The drive then becomes ready again and the network interface comes back up, all within one second.  As I read it, ata14 is your sdh, a 2TB WD20EARS.

This I think is when the transfer broke it again, but this time the system recovered on its own.  Maybe the failing drive caused bigger issues.

 

 

I plan to RMA the bad drive tomorrow, and maybe pick up a replacement from Micro Center so that I am not without a drive for a week.  Or should I put the failing drive back in the array until the new drive comes, since it has not actually failed yet and just has some sectors going bad and being reallocated?

syslog-2011-03-20.txt

Link to comment

Picked up a new drive today.  Went with WD as the Seagates make me nervous.

 

I was thinking about using the advanced format and skipping the jumper, but it appears I have the HPA issue that comes with the gigabyte boards and I know 4.7 will bark at me.  Therefore, I was thinking that if I get rid of those errors now then I can go to 4.7, preclear the new drive and then add it to the array by tomorrow night to start the rebuild.  I know the jumpered vs no jumper yields no benefit, but it is just one of those things.  If I can do it I will.  I want to get to 4.7 anyway so it is deal with the HPA issues now or deal with them later.  It appears I have 2 drives affected.

 

Mar 20 18:35:25 Tower kernel: usb 3-3: configuration #1 chosen from 1 choice
Mar 20 18:35:25 Tower kernel: generic-usb 0003:0764:0501.0001: hiddev96,hidraw0: USB HID v1.10 Device [CPS UPS CP850AVRLCD ] on usb-0000:00:12.0-3/input0
Mar 20 18:35:25 Tower kernel: ata5: softreset failed (device not ready)
Mar 20 18:35:25 Tower kernel: ata5: applying SB600 PMP SRST workaround and retrying
Mar 20 18:35:25 Tower kernel: ata2: softreset failed (device not ready)
Mar 20 18:35:25 Tower kernel: ata2: applying SB600 PMP SRST workaround and retrying
Mar 20 18:35:25 Tower kernel: ata3: softreset failed (device not ready)
Mar 20 18:35:25 Tower kernel: ata3: applying SB600 PMP SRST workaround and retrying
Mar 20 18:35:25 Tower kernel: ata1: softreset failed (device not ready)
Mar 20 18:35:25 Tower kernel: ata1: applying SB600 PMP SRST workaround and retrying
Mar 20 18:35:25 Tower kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 20 18:35:25 Tower kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 20 18:35:25 Tower kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 20 18:35:25 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 20 18:35:25 Tower kernel: ata2.00: ATA-8: WDC WD20EADS-65R6B0, 01.00A01, max UDMA/133
Mar 20 18:35:25 Tower kernel: ata2.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 20 18:35:25 Tower kernel: ata1.00: ATA-8: WDC WD20EADS-65R6B0, 01.00A01, max UDMA/133
Mar 20 18:35:25 Tower kernel: ata5.00: HPA detected: current 1953523055, native 1953525168
Mar 20 18:35:25 Tower kernel: ata1.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 20 18:35:25 Tower kernel: ata5.00: ATA-8: WDC WD1001FALS-75J7B0, 05.00K05, max UDMA/133
Mar 20 18:35:25 Tower kernel: ata5.00: 1953523055 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 20 18:35:25 Tower kernel: ata2.00: configured for UDMA/133
Mar 20 18:35:25 Tower kernel: ata1.00: configured for UDMA/133
Mar 20 18:35:25 Tower kernel: ata5.00: configured for UDMA/133
Mar 20 18:35:25 Tower kernel: ata3.00: ATA-8: WDC WD15EADS-00P8B0, 01.00A01, max UDMA/133
Mar 20 18:35:25 Tower kernel: ata3.00: 2930277168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 20 18:35:25 Tower kernel: ata3.00: configured for UDMA/133
Mar 20 18:35:25 Tower kernel: scsi 1:0:0:0: Direct-Access ATA WDC WD20EADS-65R 01.0 PQ: 0 ANSI: 5
Mar 20 18:35:25 Tower kernel: sd 1:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Mar 20 18:35:25 Tower kernel: scsi 2:0:0:0: Direct-Access ATA WDC WD20EADS-65R 01.0 PQ: 0 ANSI: 5
Mar 20 18:35:25 Tower kernel: scsi 3:0:0:0: Direct-Access ATA WDC WD15EADS-00P 01.0 PQ: 0 ANSI: 5
Mar 20 18:35:25 Tower kernel: sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Mar 20 18:35:25 Tower kernel: sd 3:0:0:0: [sdc] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
Mar 20 18:35:25 Tower kernel: sd 3:0:0:0: [sdc] Write Protect is off
Mar 20 18:35:25 Tower kernel: sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Mar 20 18:35:25 Tower kernel: sd 2:0:0:0: [sdb] Write Protect is off
Mar 20 18:35:25 Tower kernel: sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Mar 20 18:35:25 Tower kernel: sd 1:0:0:0: [sda] Write Protect is off
Mar 20 18:35:25 Tower kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 20 18:35:25 Tower kernel: sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 20 18:35:25 Tower kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 20 18:35:25 Tower kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 20 18:35:25 Tower kernel: sdb:
Mar 20 18:35:25 Tower kernel: sdc:
Mar 20 18:35:25 Tower kernel: sda: sdb1
Mar 20 18:35:25 Tower kernel: sd 2:0:0:0: [sdb] Attached SCSI disk
Mar 20 18:35:25 Tower kernel: sda1
Mar 20 18:35:25 Tower kernel: sd 1:0:0:0: [sda] Attached SCSI disk
Mar 20 18:35:25 Tower kernel: sdc1
Mar 20 18:35:25 Tower kernel: sd 3:0:0:0: [sdc] Attached SCSI disk
Mar 20 18:35:25 Tower kernel: ata4: softreset failed (device not ready)
Mar 20 18:35:25 Tower kernel: ata4: applying SB600 PMP SRST workaround and retrying
Mar 20 18:35:25 Tower kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 20 18:35:25 Tower kernel: ata4.00: HPA detected: current 2930275055, native 2930277168
Mar 20 18:35:25 Tower kernel: ata4.00: ATA-8: WDC WD15EADS-00P8B0, 01.00A01, max UDMA/133
Mar 20 18:35:25 Tower kernel: ata4.00: 2930275055 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 20 18:35:25 Tower kernel: ata4.00: configured for UDMA/133
Mar 20 18:35:25 Tower kernel: scsi 4:0:0:0: Direct-Access ATA WDC WD15EADS-00P 01.0 PQ: 0 ANSI: 5
Mar 20 18:35:25 Tower kernel: sd 4:0:0:0: [sdd] 2930275055 512-byte logical blocks: (1.50 TB/1.36 TiB)
Mar 20 18:35:25 Tower kernel: sd 4:0:0:0: [sdd] Write Protect is off
Mar 20 18:35:25 Tower kernel: scsi 5:0:0:0: Direct-Access ATA WDC WD1001FALS-7 05.0 PQ: 0 ANSI: 5
Mar 20 18:35:25 Tower kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Mar 20 18:35:25 Tower kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 20 18:35:25 Tower kernel: sdd:
Mar 20 18:35:25 Tower kernel: sd 5:0:0:0: [sde] 1953523055 512-byte logical blocks: (1.00 TB/931 GiB)
Mar 20 18:35:25 Tower kernel: sd 5:0:0:0: [sde] Write Protect is off
Mar 20 18:35:25 Tower kernel: sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
Mar 20 18:35:25 Tower kernel: sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 20 18:35:25 Tower kernel: sde: sde1
Mar 20 18:35:25 Tower kernel: sd 5:0:0:0: [sde] Attached SCSI disk
Mar 20 18:35:25 Tower kernel: sdd1
Mar 20 18:35:25 Tower kernel: sd 4:0:0:0: [sdd] Attached SCSI disk
Mar 20 18:35:25 Tower kernel: atiixp 0000:00:14.1: IDE controller (0x1002:0x439c rev 0x00)
Mar 20 18:35:25 Tower kernel: ATIIXP_IDE 0000:00:14.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar 20 18:35:25 Tower kernel: atiixp 0000:00:14.1: not 100%% native mode: will probe irqs later
Mar 20 18:35:25 Tower kernel: ide0: BM-DMA at 0xfa00-0xfa07
Mar 20 18:35:25 Tower kernel: atiixp 0000:00:14.1: simplex device: DMA disabled
Mar 20 18:35:25 Tower kernel: ide1: DMA disabled
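For a quick way to spot which drives are affected, a small Python sketch like this one can scan a saved syslog for the kernel's "HPA detected" lines and work out how many sectors are hidden. The function name find_hpa is made up for illustration, and the 512-byte sector size is taken from the "512-byte logical blocks" lines in the same log.

```python
import re

# Kernel message format as it appears in the syslog excerpt above.
HPA_LINE = re.compile(r"(ata\d+\.\d+): HPA detected: current (\d+), native (\d+)")
SECTOR_BYTES = 512  # logical block size reported by the sd lines in this log

def find_hpa(syslog_text):
    """Return a list of (port, current, native, hidden_sectors) tuples."""
    results = []
    for m in HPA_LINE.finditer(syslog_text):
        current, native = int(m.group(2)), int(m.group(3))
        results.append((m.group(1), current, native, native - current))
    return results

# The two HPA lines from the log posted above.
sample = """\
Mar 20 18:35:25 Tower kernel: ata5.00: HPA detected: current 1953523055, native 1953525168
Mar 20 18:35:25 Tower kernel: ata4.00: HPA detected: current 2930275055, native 2930277168
"""
for port, current, native, hidden in find_hpa(sample):
    print(port, hidden, "sectors hidden,", hidden * SECTOR_BYTES, "bytes")
```

On the two lines above this reports 2113 hidden sectors on each of ata5.00 and ata4.00, which is 1,081,856 bytes (roughly 1 MiB) per drive.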

 

 

Link to comment

Thanks, I had not yet seen that thread.  However, after reading a few others, I learned that I need to get away from Gigabyte altogether.  From the looks of things, my mobo does not support disabling HPA, so I plan to replace it.  Now the question is which mobo will work well in my system.

I am actually looking at an MSI board at the moment because it is available locally, which matters if something turns out not to work with it, most importantly my Supermicro AOC-SAT2-MV8.

 

First I need to get the new drive into place and data re-built, then maybe by the weekend, the new board can go in and I will deal with the HPA that appears on my 2 drives.

Link to comment

An HPA can only be accessed or set once per power-up, or something like that, and that Gigabyte motherboard will keep accessing at least one HPA, meaning the drive will refuse to let you remove it. Also, the Gigabyte motherboard will just keep creating the damn things if you do remove them. So, get rid of the motherboard first.

 

Peter

Link to comment
