removing BAD disk from array



Hi guys,

I had to go out of town for work, and of course a disk failed while I'm away :P

I can connect to my system remotely through VPN, so no worries there, but I can't replace the disk with a new one until I'm back.

So I copied the contents of the failed disk to my main machine, and now I would like to remove the failed disk from the array.

I'd like to check with you guys whether the procedure below is correct:

 

Stop the array

Unassign the bad disk

Reboot the server

Telnet in

Run initconfig (to rebuild the array with only 9 disks, so that parity is correct)

Wait until parity has synced for the new 9-disk array

Copy my content from disk 10 back to the array (I have enough space for that)

 

When I get home, I'll just preclear a new disk and add it to the array :P

From reading the wiki, I think this is the way to do it, but I would like to be sure :P

I don't want to lose the 9 other data drives :)

 

 

 

 

Looks good to me, but you do not need to reboot the server after unassigning the bad disk.

(It won't hurt, and it will clean out the syslog, but unless you are running out of memory, it is not needed.)


Thanks, Joe, for confirming.

Parity is rebuilding.

These were the error messages about disk 10 in the logs:

 

Jan 18 12:02:03 p5bplus unmenu[1658]: Unrecognized state, drive sdl, assuming not spinning: drive state is: unknown

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] Unhandled error code

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] Result: hostbyte=0x04 driverbyte=0x00

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 20 00

Jan 18 12:02:03 p5bplus kernel: end_request: I/O error, dev sdl, sector 0

Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 0

Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 1

Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 2

Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 3

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] Unhandled error code

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] Result: hostbyte=0x04 driverbyte=0x00

Jan 18 12:02:03 p5bplus kernel: sd 11:0:0:0: [sdl] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 08 00

Jan 18 12:02:03 p5bplus kernel: end_request: I/O error, dev sdl, sector 0

Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 0

They are still showing up in my logs.

Is there anything I can do to tell the OS that the disk can be disabled?
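
To get a feel for how noisy a disk is, the kernel lines above can be summarized per device. This is only a rough sketch: the regular expression is written against the `end_request: I/O error` lines quoted in this post, the function name `summarize_io_errors` is made up for illustration, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "end_request: I/O error, dev sdl, sector 0"
# (pattern assumed from the log excerpt quoted in this thread)
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize_io_errors(lines):
    """Return {device: (error_count, sorted list of distinct failing sectors)}."""
    counts = Counter()
    sectors = {}
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            dev, sector = match.group(1), int(match.group(2))
            counts[dev] += 1
            sectors.setdefault(dev, set()).add(sector)
    return {dev: (counts[dev], sorted(sectors[dev])) for dev in counts}


# Three lines copied from the excerpt above; only two are end_request lines.
sample = [
    "Jan 18 12:02:03 p5bplus kernel: end_request: I/O error, dev sdl, sector 0",
    "Jan 18 12:02:03 p5bplus kernel: Buffer I/O error on device sdl, logical block 0",
    "Jan 18 12:02:03 p5bplus kernel: end_request: I/O error, dev sdl, sector 0",
]
print(summarize_io_errors(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example `summarize_io_errors(open("syslog.txt"))` with the attached syslog saved locally.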

 


OK, parity has been rebuilt on 9 drives :)

All is well.

I rebooted the server after the parity rebuild and ran smartctl on the bad disk, and it said PASSED.

See the attached txt file.

So I thought, let's try a preclear, since I'm not near the computer anyway. Preclear is running now, and the syslog is filling up with these:

 

Jan 18 20:17:24 p5bplus kernel: ata11.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0

Jan 18 20:17:24 p5bplus kernel: ata11.00: irq_stat 0x48000000

Jan 18 20:17:24 p5bplus kernel: ata11.00: failed command: READ FPDMA QUEUED

Jan 18 20:17:24 p5bplus kernel: ata11.00: cmd 60/00:00:00:b3:87/02:00:16:00:00/40 tag 0 ncq 262144 in

Jan 18 20:17:24 p5bplus kernel:          res 41/40:00:ab:b3:87/54:00:16:00:00/40 Emask 0x409 (media error) <F>

Jan 18 20:17:24 p5bplus kernel: ata11.00: status: { DRDY ERR }

Jan 18 20:17:24 p5bplus kernel: ata11.00: error: { UNC }

Jan 18 20:17:24 p5bplus kernel: ata11.00: configured for UDMA/133

Jan 18 20:17:24 p5bplus kernel: sd 11:0:0:0: [sdl] Unhandled sense code

Jan 18 20:17:24 p5bplus kernel: sd 11:0:0:0: [sdl] Result: hostbyte=0x00 driverbyte=0x08

Jan 18 20:17:24 p5bplus kernel: sd 11:0:0:0: [sdl] Sense Key : 0x3 [current] [descriptor]

Jan 18 20:17:24 p5bplus kernel: Descriptor sense data with sense descriptors (in hex):

Jan 18 20:17:24 p5bplus kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00

Jan 18 20:17:24 p5bplus kernel:        16 87 b3 ab

Jan 18 20:17:24 p5bplus kernel: sd 11:0:0:0: [sdl] ASC=0x11 ASCQ=0x4

Jan 18 20:17:24 p5bplus kernel: sd 11:0:0:0: [sdl] CDB: cdb[0]=0x28: 28 00 16 87 b3 00 00 02 00 00

Jan 18 20:17:24 p5bplus kernel: end_request: I/O error, dev sdl, sector 377992107

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249013

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249014

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249015

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249016

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249017

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249018

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249019

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249020

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249021

Jan 18 20:17:24 p5bplus kernel: Buffer I/O error on device sdl, logical block 47249022

Jan 18 20:17:24 p5bplus kernel: ata11: EH complete

Jan 18 20:17:27 p5bplus kernel: ata11.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0

Jan 18 20:17:27 p5bplus kernel: ata11.00: irq_stat 0x48000000

Jan 18 20:17:27 p5bplus kernel: ata11.00: failed command: READ FPDMA QUEUED

Jan 18 20:17:27 p5bplus kernel: ata11.00: cmd 60/08:00:a8:b3:87/00:00:16:00:00/40 tag 0 ncq 4096 in

Jan 18 20:17:27 p5bplus kernel:          res 41/40:00:ab:b3:87/54:00:16:00:00/40 Emask 0x409 (media error) <F>

Jan 18 20:17:27 p5bplus kernel: ata11.00: status: { DRDY ERR }

Jan 18 20:17:27 p5bplus kernel: ata11.00: error: { UNC }

No clue what it means.

The disk is a WD EADS 1TB on the JMicron eSATA port.

JMicron is set to AHCI.

Guess the disk is a goner?

SMART_status_Info_for_sdl.txt

syslog.txt
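
As a way of reading the log excerpt above: the last four sense bytes (`16 87 b3 ab`) appear to be the failing sector address in hex, and they line up with both the `end_request` sector and the first `Buffer I/O error` block number. The sketch below just redoes that arithmetic. The 512-byte sector size and 4096-byte block size are assumptions (they are not stated in the log), and the helper names are made up for illustration.

```python
SECTOR_SIZE = 512   # bytes per sector (assumed, not stated in the log)
BLOCK_SIZE = 4096   # bytes per buffer-cache block (assumed)


def lba_from_sense_bytes(hex_bytes):
    """Convert space-separated big-endian hex bytes into a sector number."""
    return int(hex_bytes.replace(" ", ""), 16)


def block_for_sector(sector):
    """Return the logical block that contains the given sector."""
    return sector * SECTOR_SIZE // BLOCK_SIZE


lba = lba_from_sense_bytes("16 87 b3 ab")
print(lba)                    # 377992107, same as "dev sdl, sector 377992107"
print(block_for_sector(lba))  # 47249013, same as the first "logical block" line
```

So all of those lines seem to describe the same spot on the disk: one unreadable sector (`UNC`, reported as a media error) that the kernel retries and then reports once per affected block.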


The disk has 44 sectors pending re-allocation.

 

197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      44

 

The errors were reported because those sectors could not be read.  They will be re-allocated the next time they are written.  Most disks have several thousand spare sectors to use when re-allocating bad sectors.

 

Joe L.
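
For anyone who wants to pull that number out of a saved SMART report, here is a minimal sketch that splits one attribute row into named fields. It assumes the ten-column layout of the row quoted above, with a plain integer in the last column; the function name `parse_smart_attribute` and the field names are illustrative, not part of smartctl.

```python
def parse_smart_attribute(line):
    """Split one SMART attribute row into named fields (ten columns assumed)."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flag": fields[2],
        "value": int(fields[3]),    # normalized value
        "worst": int(fields[4]),
        "thresh": int(fields[5]),   # failure threshold for the normalized value
        "type": fields[6],
        "updated": fields[7],
        "when_failed": fields[8],
        "raw_value": int(fields[9]),
    }


# The row quoted in the post above.
line = "197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      44"
attr = parse_smart_attribute(line)
print(attr["name"], attr["raw_value"])        # Current_Pending_Sector 44
print(attr["value"] > attr["thresh"])         # True
```

The normalized value (200) is still above the threshold (000), which is presumably why the overall health check can say PASSED even with 44 sectors pending in the raw column.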
