Error when adding more than 15 drives



Has anyone had problems adding more than 15 drives? I keep getting the following error in the syslog:

 

May 15 07:13:00 Tower2 kernel: ReiserFS: md12: checking transaction log (md12)
May 15 07:13:00 Tower2 kernel: BUG: unable to handle kernel paging request at 6d614e8e
May 15 07:13:00 Tower2 kernel: IP: [<f82e525c>] xor_block+0x76/0x84 [md_mod]
May 15 07:13:00 Tower2 kernel: *pdpt = 0000000002de0001 *pde = 0000000000000000
May 15 07:13:00 Tower2 kernel: Oops: 0000 [#1] SMP
May 15 07:13:00 Tower2 kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/host0/target0:4:0/0:4:0:0/block/sda/stat
May 15 07:13:00 Tower2 kernel: Modules linked in: md_mod ata_piix sata_promise e1000 sata_sil24 libata
May 15 07:13:00 Tower2 kernel:
May 15 07:13:00 Tower2 kernel: Pid: 1906, comm: unraidd Not tainted (2.6.29.1-unRAID #2)
May 15 07:13:00 Tower2 kernel: EIP: 0060:[<f82e525c>] EFLAGS: 00010202 CPU: 0
May 15 07:13:00 Tower2 kernel: EIP is at xor_block+0x76/0x84 [md_mod]
May 15 07:13:00 Tower2 kernel: EAX: 00001000 EBX: 6d614e76 ECX: c2890000 EDX: c28fb000
May 15 07:13:00 Tower2 kernel: ESI: c2893000 EDI: c28fb000 EBP: f63cbefc ESP: f63cbedc
May 15 07:13:00 Tower2 kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
May 15 07:13:00 Tower2 kernel: Process unraidd (pid: 1906, ti=f63ca000 task=f6fff740 task.ti=f63ca000)
May 15 07:13:00 Tower2 kernel: Stack:
May 15 07:13:00 Tower2 kernel:  c2892000 c2893000 c2868000 6d614e76 00001000 00000001 c28f11e0 c28f13a0
May 15 07:13:00 Tower2 kernel:  f63cbf3c f82e7474 00000001 00000005 00000011 00001000 00001000 f5aab000
May 15 07:13:00 Tower2 kernel:  c28fb000 c2890000 c2892000 c2893000 c2868000 c28f1910 00000000 c28f11e0
May 15 07:13:00 Tower2 kernel: Call Trace:
May 15 07:13:00 Tower2 kernel:  [<f82e7474>] ? compute_parity+0x101/0x2cc [md_mod]
May 15 07:13:00 Tower2 kernel:  [<f82e7ff0>] ? handle_stripe+0x8cc/0xc5a [md_mod]
May 15 07:13:00 Tower2 kernel:  [<c0344f57>] ? schedule+0x5e3/0x63c
May 15 07:13:00 Tower2 kernel:  [<f82e8804>] ? unraidd+0x9e/0xbc [md_mod]
May 15 07:13:00 Tower2 kernel:  [<f82e8766>] ? unraidd+0x0/0xbc [md_mod]
May 15 07:13:00 Tower2 kernel:  [<c01301bc>] ? kthread+0x3b/0x63
May 15 07:13:00 Tower2 kernel:  [<c0130181>] ? kthread+0x0/0x63
May 15 07:13:00 Tower2 kernel:  [<c01035db>] ? kernel_thread_helper+0x7/0x10
May 15 07:13:00 Tower2 kernel: Code: f8 8b 71 0c 89 45 ec 75 13 56 8b 45 f0 89 d1 53 8b 5d ec 89 fa ff 53 14 5a 59 eb 15 ff 71 10 89 d1 8b 45 f0 89 fa 56 53 8b 5d ec <ff> 53 18 83 c4 0c 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 57 31 ff 56
May 15 07:13:00 Tower2 kernel: EIP: [<f82e525c>] xor_block+0x76/0x84 [md_mod] SS:ESP 0068:f63cbedc
May 15 07:13:00 Tower2 kernel: ---[ end trace 6c2b49c6389e0999 ]---

 

When this happens, the main web page hangs on "mounting". Any ideas? I'm using version 4.5-beta6.

Thanks


Joe, thanks for the suggestion.

I ran reiserfsck and it found no corruption on disk12. Putting the array back to 15 disks eliminates the error. It only occurs when I try to expand the system past 15 drives.
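
For reference, the check looks something like this (assuming disk12 maps to /dev/md12, as shown in the syslog above; --check is the default, read-only test and does not write to the filesystem):

  reiserfsck --check /dev/md12    # read-only consistency check of disk12's filesystem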

  • 3 weeks later...

Same issue here.

 

The array was great on 4.4final with 16 drives + cache for ~3 months, and after upgrading to 4.5beta6 the initial migration was flawless. However, upon adding a 17th drive the system freezes.

 

- downgraded to 4.5beta4 (initial 20 drive support)... no change

- added 17th drive, assigned to disk 18 or 19... no change

- added 17th drive to different slots on norco-4020 (on both another aoc-sat2-mv8 & x7sbe controller)... no change

- added 17th drive using 3 different wd5000abys (500GB) and a seagate 320GB... no change

- format of USB key, fresh install of 4.4beta6 (erasing unmenu, dir_cache, etc)... no change

- temp disable parity and add 1 wd5000abys... good, array completes mounting (adding parity back will freeze)

- temp disable 2 existing data drives (wd10eacs) and added 2 wd5000abys... good, array completes mounting (adding another drive will freeze)

 

I can see the syslog and it looks similar to ALR's, except without the md12 issue but with a segmentation error instead. Capturing it is proving elusive.

I don't think you, or anyone else, needs to go through hardware tests.

It reads as if there is an issue in the driver - an array that is out of bounds somewhere.


I get this segfault as well when adding a drive, but I already have 16 data drives and 1 parity installed and working. It's when I add the 17th data drive that it borks. It gets all the way through clearing the new disk, tries to mount the drives, and segfaults. I sent the info to Tom.

 

:(

  • 3 weeks later...

I think the only suggestion that can be made right now is to wait for Tom to correct the issue.

 

Okay. I just bought 4 more drives and will wait to add them. Kinda sucks as drive prices fall rapidly, and I've been avoiding buying new stuff until I have the time to use/install it. Still, I am really looking forward to this new feature being fully implemented.


Although you must wait until you can assign them, I'd install them and use the preclear_disk.sh script to test them and ensure they are working properly. Then, once Tom fixes the emhttp process, you can assign them. Hopefully that won't be too long, but you'll be ready and you'll be able to quickly add them.

 

Joe L.


Sounds like a great idea in theory. I'll have to overcome a little phobia of getting to know what to do with these scripts first. Tomorrow is a holiday in Quebec, so I'll try to get things started then.

The preclear_disk script is a doddle to use.  It doesn't require any installation - you just copy it to your flash drive share, log in to unRAID's console (or telnet in), and execute the script with the target drive as the single parameter.  It even has safety checks built in to keep you from clearing a drive that's already in use.
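
For example, from the console or a telnet session it amounts to something like this (assuming the flash share shows up at /boot on the server, as it normally does; /dev/sdX is just a placeholder for the un-assigned drive you want to clear):

  cd /boot                          # the flash drive share, where preclear_disk.sh was copied
  sh ./preclear_disk.sh /dev/sdX    # single parameter = the target drive to clear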


I think the only suggestion that can be made right now is to wait for Tom to correct the issue.

 

Hi all,

It's been a long time since we've heard anything from Tom - did anybody get any info on whether he is aware of the 17+ drives problem and/or working on it?

Isn't there a possibility for a small hotfix?

Thanks, Guzzi


 

The preclear_disk script is a doddle to use.  It doesn't require any installation - you just copy it to your flash drive share, log in to unRAID's console (or telnet in), and execute the script with the target drive as the single parameter.  It even has safety checks built in to keep you from clearing a drive that's already in use.

 

I also discovered (and this may already be known) that if you open more than one telnet session, you can preclear several drives at one time (one per session).
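
In other words, something like this, with each command running in its own telnet session (device names are placeholders for two different un-assigned drives):

  sh /boot/preclear_disk.sh /dev/sdX    # telnet session #1
  sh /boot/preclear_disk.sh /dev/sdY    # telnet session #2, clearing a second drive in parallel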

