DISK_DSBL


Hi

 

I've been nervously keeping an eye on my unRAID box (following an upgrade from 4.4.2 to 4.5 and a, presumably unrelated, problem powering down which resulted in 3 sync errors when I powered back on), and tonight noticed that disc 9 is showing as disabled.

 

I've powered down for now and will check the cables on Thursday when I have the day off (and will also order a new drive just in case) but wondered if the attached syslog and/or SMART report can shed any light. I don't know enough to be able to interpret the information so any help would be greatly appreciated.

 

Thanks!

Link to comment

I'm not sure I can help, because the SMART report and the syslog both look fine, with no errors.  But I'm also not sure this is the right syslog, because your system clock is wrong (it says January 13).  It is using the latest unRAID, v4.5, so it must be recent, but there is no evidence of a disabled drive in it.  Disk 9 is correctly mounted.

 

I do have one unrelated comment, about a performance issue.  You have your parity drive, a fast Seagate 1.5TB, connected to one of the 3 SiI3114 PCI cards, which will limit your write performance significantly, partly because it is sharing the PCI bus with so many other drives, and partly because the older Silicon Image chipsets limit drives to UDMA/100.  I would move the fastest drives to the fastest SATA ports.  In your case, you have an empty SATA port on the motherboard.  That is the ideal port to connect the parity drive to.
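
To put rough numbers on the shared-bus point, here is a back-of-envelope sketch. It assumes a standard 32-bit/33 MHz PCI bus with a theoretical peak of about 133 MB/s (real-world throughput is lower), and it uses the count of 8 drives on the SiI3114 controllers from the port listing further down the thread. The function name is just for illustration.

```shell
#!/bin/bash
# Back-of-envelope only: best-case bandwidth per drive on a shared PCI bus.
# ASSUMPTION: standard 32-bit/33 MHz PCI, theoretical peak ~133 MB/s.
PCI_BUS_MB_S=133

# per_drive_share N -> MB/s available to each of N drives active at once
per_drive_share() {
    echo $(( PCI_BUS_MB_S / $1 ))
}

echo "1 drive active:  $(per_drive_share 1) MB/s (further capped at 100 by UDMA/100)"
echo "8 drives active: $(per_drive_share 8) MB/s each"
```

During a parity check or rebuild every drive is read at once, so the second figure is the relevant one; a parity drive on a port that is not on the PCI bus avoids that ceiling for writes.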

 

You can set the server clock on the Settings page of the unRAID Web Management pages.

Link to comment

I have to say the support from the forum here is first-rate. Thanks to everyone who contributes.

 

Curious that there's nothing in the syslog. I was sure I picked the right one as I saved it without the date reference so I'd avoid picking the wrong one! Oh well, I'll try again! I also noted the wrong system date but assumed not relevant so didn't bother changing it.

 

As far as the parity is concerned, again, I was sure I connected this to the MB, not the PCI card(s), although perhaps I got mixed up when I upgraded the drive - another thing to check. I have 7 available connections on the MB; the 8th 'free' one (as you quite rightly pointed out) is broken, hence not in use. I only have 4 drives connected to the 8 (7 available - another breakage!) slots on the 2 SATA PCI cards (the 3rd PCI card being video) and these are (at least I thought they were!) 3 x 500GB drives and 1 x 1TB drive.

 

As an aside, if the server doesn't power down properly (it seems to have a process still running, a point which you noted in the 4.5 announcement thread), could that result in a sync issue and be the reason the disk is disabled, rather than there being a physical issue with the disc and/or cable? How would I identify the specific process which is preventing the powerdown, and how do I then kill it? The 'top' command gives me what appears to be lots and lots of processes and I think I kill these by using the relevant reference number (is that right?) but I'm lost, to be honest. I'll search the wiki tonight to see what I can learn ;-)

 

I'm tempted to roll-back to 4.4.2 while I try to sort this out. I don't know enough to conclude whether the upgrade to 4.5 is of any relevance or purely coincidence (most likely the latter) but it would make me feel a little more comfortable as I had few issues before.

 

Looks like I have a bit of work to do...

 

Thanks once again for your help.

 

Simon

 

Link to comment

For any disk that is showing "Unmounting" you can log in via telnet and type (example here for disk1)

lsof /mnt/disk1

 

The lsof shows a list of open files.

 

Basically, it can be a file that is open by a process, or a process that has a directory on a disk "busy" as it is the current working directory.

lsof will show both.

 

As far as stopping the array, if you logged in and changed directory to a disk, then it is your log-in session keeping the disk busy.  If you started an add-on process, it is probably that process keeping a disk busy.  You need to stop it.
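
Put together, the check might look like the sketch below. It uses /mnt/disk1 as in the example above; `busy_pids` is a hypothetical helper name, and `lsof -t` is the standard option that prints only process IDs. The loop only echoes what it would stop, so nothing is killed until you swap in the real `kill`.

```shell
#!/bin/bash
# Sketch: list what is holding a disk busy, then decide what to stop.
# /mnt/disk1 is the example mount point from the post above.

# busy_pids MOUNTPOINT -> one PID per line for processes with files open there
busy_pids() {
    lsof -t "$1" 2>/dev/null | sort -un
}

disk=/mnt/disk1

# Full listing: command, PID, user and the open file or working directory
lsof "$disk" || echo "nothing open on $disk (or lsof unavailable)"

for pid in $(busy_pids "$disk"); do
    echo "would stop PID $pid"    # replace echo with: kill "$pid"
done
```

If your own telnet session shows up in the list, `cd /` first; killing that PID would just log you out.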

Link to comment

Hi there

 

Disc 9 was "missing" on the next boot so replaced the SATA cable and it's now back. However, unRAID thinks I've replaced the disc (screenshot). Is my only option to select "Start will bring the array on-line, start a Data-Rebuild, and then expand the file system (if possible)" ?

 

Thanks

Link to comment

If it is exactly the same disk, and if all other disks are present and working, you can use the "Trust my Parity" procedure as described in the wiki.

 

http://lime-technology.com/wiki/index.php?title=Make_unRAID_Trust_the_Parity_Drive,_Avoid_Rebuilding_Parity_Unnecessarily

 

Joe L.

Link to comment
You have your parity drive, a fast Seagate 1.5TB, connected to one of the 3 SiI3114 PCI cards ...

As far as the parity is concerned, again, I was sure I connected this to the MB not the pci card(s) although perhaps I got mixed up when I upgraded the drive - another thing to check. (I have 7 available connections on the MB the 8th 'free' (as you quite rightly pointed out) is broken, hence not in use).

 

So your motherboard appears to have 8 SATA ports, probably in 2 groups of 4.  However, your syslog shows the setup like this:

 

nVidia nForce onboard disk controller:
  Port #1:  Samsung 1TB
  Port #2:  Samsung 1TB
  Port #3:  (empty)
  Port #4:  Samsung 1TB

SiI3114 chipset #1: (on PCI slot 5:6)
  Port #1:  (empty)
  Port #2:  (empty)
  Port #3:  Samsung 500GB
  Port #4:  Samsung 500GB

SiI3114 chipset #2: (on PCI slot 5:7)
  Port #1:  Samsung 1TB
  Port #2:  (empty)
  Port #3:  Samsung 500GB
  Port #4:  Hitachi 1TB

SiI3114 chipset #3: (on PCI slot 5:10)
  Port #1:  (empty)
  Port #2:  Seagate 1.5TB  (parity)
  Port #3:  Samsung 1TB
  Port #4:  WD 250GB

 

So it would appear your motherboard maker provided 4 SATA ports through the onboard nForce controller, then added 4 more by embedding a SiI3114 controller, exactly like the 2 cards you added.  In the syslog, I cannot tell them apart.  All 3 Silicon Image chipsets are on the PCI bus, so there is no advantage to connecting to the 4 SiI3114 ports on the motherboard.  Only your 4 nForce ports are fast SATA II ports, and not sharing the PCI bus.  If possible, you want the parity drive connected to one of the 4 nForce ports, currently handling 3 Samsung 1TB drives.

Link to comment

Spot on. (Foolishly) I hadn't really considered the on-board chipset on the 2nd bank of SATA, but hey, I'm learning now.....

 

When (if?!) I get the server acting normally (it froze last night while I was trying to perform the 'trust parity' procedure, so I gave up for the night... it was late!) I'll swap the position of the parity drive with one on the nForce ports.

 

Thanks very much

 

Link to comment

 

Again, another lockup a few seconds after booting up - can't get a syslog. Any ideas - PSU?? (only a 500W Zalman at the mo - 11 discs).

 

???

Link to comment

 

If you can get the powerdown package installed, you can give the server the three-finger salute (Ctrl-Alt-Del) to capture a syslog and restart.

Link to comment

 

Memory is the first suspect, then power supply is a possibility. 

 

Make sure the memory voltage, timing, and clock speed are set for your specific RAM strips. (The MB tries to get it correct; sometimes it does, other times it does not.)

Run several cycles of a memory test.

 

What specific 500W Zalman model?  Many do not have a single 12 volt rail.  Often, 1/3 of the capacity is allocated for the CPU and MB, 1/3 for the PCI cards, and only 1/3 to the hard disks.  I've even seen that split where the SATA connectors were on one rail and the Molex connectors on another, so if you are only using the SATA power connectors you get just half of the capacity available to the disks.
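
As a rough illustration of why the rail split matters, the sketch below compares one third of a 500W supply with the 12 V demand if all 11 drives spin up together. The 2.0 A per-drive spin-up figure is an assumption for illustration only (check each drive's datasheet), and the one-third split is the typical case described above, not a measured value for this particular Zalman.

```shell
#!/bin/bash
# Rough check of 12 V headroom at spin-up. All figures are illustrative.
PSU_WATTS=500          # from the thread
DRIVES=11              # from the thread
SPINUP_AMPS_X10=20     # ASSUMPTION: ~2.0 A per drive on 12 V at spin-up

# disk_rail_watts TOTAL -> watts if only one third of capacity feeds the disks
disk_rail_watts() { echo $(( $1 / 3 )); }

# spinup_watts DRIVES AMPS_X10 -> 12 V draw if every drive spins up together
spinup_watts() { echo $(( $1 * $2 * 12 / 10 )); }

echo "disk rail budget: $(disk_rail_watts "$PSU_WATTS") W"
echo "spin-up demand:   $(spinup_watts "$DRIVES" "$SPINUP_AMPS_X10") W"
```

If the demand figure comes out above the budget under your own numbers, that points towards a supply with a single large 12 V rail, though it does not by itself prove the PSU is the cause of a lockup.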

 

Joe L.

 

Link to comment

You lot reply quicker than I can type!

 

Right... I read on another thread about "spurious" IRQ7 on nForce MBs. I had noticed this "spurious" entry in my syslog, so I reserved the slot in the BIOS and rebooted (having also taken the opportunity to swap the parity drive's SATA cable). The server booted and did not lock up this time. Coincidence or otherwise, I was then able to perform the "trust my parity" method and the system is now performing a parity check. 160 (corrected) sync errors in the first 20 seconds (but no more yet) and a long way to go. Fingers crossed.

 

Think I might treat myself to a new PSU for Xmas anyway (having just treated myself to a C200 popcorn hour to replace my A100 and also a WDTV live for the bedroom).

 

Don't tell the wife !!

 

:D

 

*** UPDATE ***

 

Parity check completed. All errors (160, mentioned above) corrected and all seems OK now.

Link to comment
