unRAID Server Release 5.0-rc16c Available



Noticed a syslog message today, 'kernel: Disabling IRQ #16'. Looking at the syslog, it's:

Jul 15 10:06:41 husky kernel: mdcmd (144): spindown 3 (Routine)
Jul 15 10:25:56 husky kernel: irq 16: nobody cared (try booting with the "irqpoll" option) (Errors)
Jul 15 10:25:56 husky kernel: Pid: 0, comm: swapper/0 Tainted: G           O 3.9.6p-unRAID #23 (Errors)
Jul 15 10:25:56 husky kernel: Call Trace: (Errors)
Jul 15 10:25:56 husky kernel:  [<c105f89e>] __report_bad_irq+0x29/0xb4 (Errors)
Jul 15 10:25:56 husky kernel:  [<c105fa60>] note_interrupt+0x137/0x1ac (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d6e1>] ? cpuidle_wrap_enter+0x2f/0x82 (Errors)
Jul 15 10:25:56 husky kernel:  [<c105dfd0>] handle_irq_event_percpu+0x109/0x11a (Errors)
Jul 15 10:25:56 husky kernel:  [<c105ff24>] ? handle_percpu_irq+0x3b/0x3b (Errors)
Jul 15 10:25:56 husky kernel:  [<c105e006>] handle_irq_event+0x25/0x3c (Errors)
Jul 15 10:25:56 husky kernel:  [<c105ff24>] ? handle_percpu_irq+0x3b/0x3b (Errors)
Jul 15 10:25:56 husky kernel:  [<c105ff91>] handle_fasteoi_irq+0x6d/0xab (Errors)
Jul 15 10:25:56 husky kernel:  <IRQ>  [<c100362e>] ? do_IRQ+0x37/0x9b
Jul 15 10:25:56 husky kernel:  [<c1403d2c>] ? common_interrupt+0x2c/0x31 (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d6e1>] ? cpuidle_wrap_enter+0x2f/0x82 (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d746>] ? cpuidle_enter_tk+0x12/0x14 (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d1d1>] ? disable_cpuidle+0xf/0xf (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d1ef>] ? cpuidle_enter_state+0xc/0x38 (Errors)
Jul 15 10:25:56 husky kernel:  [<c135d605>] ? cpuidle_idle_call+0x73/0x9b (Errors)
Jul 15 10:25:56 husky kernel:  [<c1008802>] ? cpu_idle+0x46/0x6f (Errors)
Jul 15 10:25:56 husky kernel:  [<c13f6b98>] ? rest_init+0x58/0x5a (Errors)
Jul 15 10:25:56 husky kernel:  [<c1574a0d>] ? start_kernel+0x2ad/0x2b3 (Errors)
Jul 15 10:25:56 husky kernel:  [<c15745db>] ? repair_env_string+0x53/0x53 (Errors)
Jul 15 10:25:56 husky kernel:  [<c15742a3>] ? i386_start_kernel+0x79/0x7d (Errors)
Jul 15 10:25:56 husky kernel: handlers:
Jul 15 10:25:56 husky kernel: [<c1321628>] usb_hcd_irq (Drive related)
Jul 15 10:25:56 husky kernel: [<f848009d>] mvs_interrupt [mvsas] (Drive related)
Jul 15 10:25:56 husky kernel: Disabling IRQ #16
Jul 15 11:19:43 husky kernel: mdcmd (145): spindown 3 (Routine)

 

anyone seen this before?

 

The only USB device I have connected is the UPS... just checked, and unMENU can still poll it just fine.

I did have a USB mouse attached but disconnected it, since I didn't even have a keyboard attached anymore... Reading through the forums, some people say this could be the culprit.

 

Regardless, looking through the 'pci devices' I see two entries for IRQ 16:

 

00:1a.0 USB Controller: Intel Corporation Unknown device 1e2d (rev 04) (prog-if 20 [EHCI])
    Subsystem: Micro-Star International Co., Ltd. Unknown device 7752
    Flags: bus master, medium devsel, latency 0, IRQ 16
    Memory at f7f13000 (32-bit, non-prefetchable) [size=1K]
    Capabilities: [50] Power Management version 2
    Capabilities: [58] Debug port: BAR=1 offset=00a0
    Capabilities: [98] PCIe advanced features
    Kernel driver in use: ehci-pci

01:00.0 RAID bus controller: Marvell Technology Group Ltd. MV64460/64461/64462 System Controller, Revision B (rev 01)
    Subsystem: Marvell Technology Group Ltd. Unknown device 6480
    Flags: bus master, fast devsel, latency 0, IRQ 16
    I/O ports at e000 [size=128]
    Memory at f7e40000 (64-bit, non-prefetchable) [size=64K]
    Expansion ROM at f7e00000 [disabled] [size=256K]
    Capabilities: [48] Power Management version 2
    Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
    Capabilities: [e0] Express Legacy Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Kernel driver in use: mvsas
    Kernel modules: mvsas

 

so far everything appears fine.. thinking about restarting the box just in case... will wait to hear back from Tom first.
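
For anyone who wants to pull the same information out of a saved syslog, here is a minimal Python sketch. It assumes the kernel message layout seen in the excerpt above: an `irq N: nobody cared` line, later a `handlers:` line followed by one `[<address>] name [module]` line per handler, and finally `Disabling IRQ #N`. The function name `find_disabled_irqs` is made up for illustration, and the trailing `(Errors)` / `(Drive related)` tags are simply ignored.

```python
import re

# Patterns assume the syslog layout shown in the excerpt above.
NOBODY_CARED = re.compile(r"kernel: irq (\d+): nobody cared")
HANDLER = re.compile(r"kernel: \[<[0-9a-f]+>\] (\S+)(?: \[(\w+)\])?")
DISABLING = re.compile(r"kernel: Disabling IRQ #(\d+)")


def find_disabled_irqs(lines):
    """Return {irq: [handler names]} for each IRQ the kernel disabled.

    Hypothetical helper: it only understands the message layout above.
    """
    result = {}
    handlers = None  # None until a "handlers:" line has been seen
    for line in lines:
        if NOBODY_CARED.search(line):
            handlers = None  # new report; the call trace comes first
        elif line.rstrip().endswith("kernel: handlers:"):
            handlers = []
        elif handlers is not None:
            done = DISABLING.search(line)
            if done:
                result[int(done.group(1))] = handlers
                handlers = None
                continue
            match = HANDLER.search(line)
            if match:
                name, module = match.groups()
                handlers.append(f"{name} [{module}]" if module else name)
    return result


# Lines copied from the excerpt above (call trace left out for brevity).
sample = """\
Jul 15 10:25:56 husky kernel: irq 16: nobody cared (try booting with the "irqpoll" option) (Errors)
Jul 15 10:25:56 husky kernel: handlers:
Jul 15 10:25:56 husky kernel: [<c1321628>] usb_hcd_irq (Drive related)
Jul 15 10:25:56 husky kernel: [<f848009d>] mvs_interrupt [mvsas] (Drive related)
Jul 15 10:25:56 husky kernel: Disabling IRQ #16
""".splitlines()

print(find_disabled_irqs(sample))
```

On the sample lines this reports IRQ 16 with the two handlers `usb_hcd_irq` and `mvs_interrupt [mvsas]`, i.e. the USB controller and the Marvell controller that share the line.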

Link to comment
noticed a syslog message today, 'kernel: Disabling IRQ #16',

...

so far everything appears fine.. thinking about restarting the box just in case... will wait to hear back from Tom first.

 

Here is a whole thread on my "IRQ #16 disabled" issue. As with your system, IRQ 16 is shared by multiple devices in mine. Simply switching my display input disabled IRQ #16, which killed my Marvell-based SATA controller and the I/O to all drives attached to it.

 

lime-technology.com/forum/index.php?topic=17823.0

 

Don't think this is an RC16c issue.

Link to comment

Wish unRAID could warn me when an IRQ is shared between critical devices and plug-and-play devices.. that can lead to disastrous problems from something so harmless.. someone just unplugs a mouse/keyboard, switches video, whatever, and half their array drops out.

 

# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:         12          0          0          0   IO-APIC-edge      timer
  1:          1          0          2          0   IO-APIC-edge      i8042
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0          3          0   IO-APIC-edge      i8042
 16:      89633       5892    1499972       4725   IO-APIC-fasteoi   ehci_hcd:usb1, mvsas
 17:       9885       2247      80028       1955   IO-APIC-fasteoi   mvsas
 23:        504        359       2885        326   IO-APIC-fasteoi   ehci_hcd:usb2
 42:          0          0          0          0   PCI-MSI-edge      xhci_hcd
 43:    3676852    1674574  121771747    2378309   PCI-MSI-edge      eth0
 44:     120789      67740     263688      62826   PCI-MSI-edge      ahci
NMI:          0          0          0          0   Non-maskable interrupts
LOC:    5408092    3146394    4848004    1784953   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          2          0          0          0   IRQ work interrupts
RTR:          3          0          0          0   APIC ICR read retries
RES:      62805      92017      19523      13617   Rescheduling interrupts
CAL:      17599      14659      20829      16744   Function call interrupts
TLB:       2286       2085       3125       2369   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:       2153       2153       2153       2153   Machine check polls
ERR:          0
MIS:          0

 

Guess the moral of the story is: just don't touch anything connected to the unRAID box once it's running..
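
A rough idea of what such a warning could look like, as a minimal Python sketch. This is not an unRAID feature; it just reads text in the `/proc/interrupts` layout pasted above and lists every numbered IRQ that has more than one handler attached. The function name `shared_irqs` is made up, and the parsing assumes one counter column per CPU, a single controller-type column, and comma-separated handler names, as in the output above.

```python
def shared_irqs(text):
    """Return {irq: [handlers]} for numbered IRQs with more than one handler.

    Assumes the /proc/interrupts layout shown above: a header row naming the
    CPUs, then "<irq>: <one count per CPU> <controller type> <handlers>",
    with multiple handlers separated by commas.
    """
    lines = text.strip().splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip summary rows such as NMI, LOC, ERR
        # Split off the per-CPU counters and the controller-type column;
        # whatever remains is the handler list.
        fields = rest.split(None, cpu_count + 1)
        if len(fields) < cpu_count + 2:
            continue  # no handler registered on this line
        handlers = [h.strip() for h in fields[-1].split(",")]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


# A few rows copied from the output above.
sample = """\
           CPU0       CPU1       CPU2       CPU3
 16:      89633       5892    1499972       4725   IO-APIC-fasteoi   ehci_hcd:usb1, mvsas
 17:       9885       2247      80028       1955   IO-APIC-fasteoi   mvsas
 23:        504        359       2885        326   IO-APIC-fasteoi   ehci_hcd:usb2
 43:    3676852    1674574  121771747    2378309   PCI-MSI-edge      eth0
"""

for irq, handlers in shared_irqs(sample).items():
    print(f"IRQ {irq} is shared by: {', '.join(handlers)}")
```

On the rows above, only IRQ 16 is flagged, shared by `ehci_hcd:usb1` and `mvsas`. On a live box the same function could be fed the contents of `/proc/interrupts`, provided Python is available there.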

Link to comment

Tom, when (in which release) did the unRAID code change to add fuse_remember="330" to the share.cfg file? Is it only written when clicking "Apply" on the SMB settings page?

Doing so (clicking "Apply" on the SMB settings page) crashed emhttp on my system: services restarted and shares/disks were accessible, but the WebGui was inaccessible. I don't see any statement in the release notes about needing to click "Apply" to have this additional value set in the share.cfg file.

Please capture system log.  The presence or absence of this setting should not cause any crashes.

Syslog attached for you (managed to grab it before my ungraceful restart). When I hit Apply in the SMB settings, I was not aware it would restart SMB, as I did not change any settings/values. I just wanted to see if it would write this new entry as it did for that other person in that post.

 

I had an ISO extracting on the cache drive at the time. When I looked at my ssh session tailing the syslog, I noticed that umount was not succeeding (not sure what was being unmounted in order to restart SMB), so I quickly canceled the ISO extract, which had bombed due to the SMB restart, and exited the app that was doing the extract from a Windows client. The syslog entries proceeded after I exited the app. (Just giving you some history to apply while reading the attached syslog; to be clear, that app was not installed on unRAID, nor do I have any apps installed on unRAID.) Once the SMB restart triggered by clicking Apply had completed, the WebGui was no longer available; I tried from several different clients. ssh and the disk/user shares via SMB were available; I didn't have time to check the Mac clients via AFP.

 

I understand that the presence or absence of this setting should not cause a crash, but emhttp definitely did die when I clicked "Apply" in the SMB settings (whether or not it was due to writing this value). The server was not being taxed: all drives were spun down except the cache drive, where the source ISO resided, and a Windows client was extracting the ISO into the same folder as the ISO. Nothing else was going on; all Macs were asleep.

 

Secondly, how does one reboot gracefully via the command line? I need to right now.

 

Stop all plugins.

None to stop

 

Stop all network services.

What is the command to do that?

 

Unmount the user share file system.

What is the command to do that?

 

Unmount all the data disks.

What is the command to do that?

 

Stop the md-driver.

What is the command to do that?

 

Reboot.  Or stop any I/O to the server, reboot and cancel the subsequent parity check.

Luckily I do know the shutdown and restart commands, and that's all I could do, then cancel the parity check upon restart. If I am not mistaken, there should be a 'sync' somewhere in the steps above (unless it's a part of one of those comments above). But it would be nice to know how to do this gracefully via the command line, or maybe you could supply a script we could keep in the root of the flash drive? Quite a few of the options in the WebGui can be executed from the command line, but not a graceful shutdown/reboot as the WebGui would do it (is it top secret?).

 

 

Loaded a VM with the Basic version running Beta12a (I wanted to go pretty far back) and clicked Apply on every setting in the WebGui.

Loaded another VM with the Basic version running RC16c and clicked Apply on every setting in the WebGui.

 

Results of differences:

 

ident.cfg

New value(s) in RC16c

SYS_MODEL=""

SYS_SLOTS=""

 

No idea what they are used for.

One would only see these new entries if you clicked 'Apply' in "Identification" section

 

disk.cfg

old value removed from Beta12c

startState=""

and replaced in RC16c with

startArray="no"

 

Understood.

 

 

network.cfg

New value(s) in RC16c

BONDING="no"

BONDING_MODE="1"

 

Understood.

One would only see these new entries if you clicked 'Apply' in "Network Setting" section

 

 

share.cfg

New value(s) in RC16c

fuse_remember="330"

 

Not understood, as I thought it's the NFS settings that receive this new parameter, in the 'NFS' section of the WebGui (unless the NFS/AFP/SMB settings all get written to share.cfg, in which case it would be understood; looking at it now, it seems like they are).

 

 

One would only see these new entries after clicking 'Apply' in the respective section, even without making any change to a setting. Not sure how to say this, but it is an unwelcome surprise that these entries are not added immediately upon upgrading to (and booting) the version they stem from.
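
The comparison above was done by hand. A minimal Python sketch of the same idea follows, assuming each .cfg file is made of plain `KEY="value"` lines like the entries listed above. The function names `parse_cfg` and `diff_cfg` are made up for illustration, and the sample strings are just the disk.cfg values mentioned in this post, not complete files.

```python
def parse_cfg(text):
    """Parse KEY="value" lines into a dict (assumed .cfg layout)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def diff_cfg(old_text, new_text):
    """Return (added, removed, changed) settings between two .cfg texts."""
    old, new = parse_cfg(old_text), parse_cfg(new_text)
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return added, removed, changed


# Illustrative fragments only, built from the disk.cfg values listed above.
old_disk_cfg = 'startState=""\n'
new_disk_cfg = 'startArray="no"\n'

added, removed, changed = diff_cfg(old_disk_cfg, new_disk_cfg)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

For the disk.cfg fragments this shows `startArray` as added and `startState` as removed, matching the manual comparison. Reading the real files from two flash drive copies and passing their contents to `diff_cfg` would cover the other .cfg files the same way.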

syslogsmbsettingsclickedapply.txt

Link to comment

Hi all,

 

A small issue I've just noticed - the system won't power off on shutdown - I'm running rc16b now but I did not see anything new in rc16c release notes that may change this behaviour. It used to work with earlier releases - at least with those I've tried (I had rc12a before)

 

MB is an older Intel DP35DP / E2160 / 5x1.5T HDD + 3 TB parity.

 

Link to comment

Update to rc16c and let us know if it still happens.

Link to comment

I'll say that rc16c seemed to resolve my Transport Endpoint Not Connected issues with Plex. I was able to recently add my entire music library (approx 9,000 tracks) and scan it on Plex with nary an issue, when previously even scanning a few TV shows at a time (maybe 200 episodes?) would cause the issue to happen.

Link to comment


Performed the update to RC16c - no problems noticed so far, but it has the same behaviour: on shutdown the system doesn't power off. Could be something ACPI/APIC related (had no time to check further). It certainly did work up to 12/12a, and with all the previous betas I've been using during the last couple of years (as far as I remember).

 

Forgot to mention earlier - stock unRaid/no addons.

 

Link to comment


Some people also have problems with ACPI stuff since RC12a, see here: http://lime-technology.com/forum/index.php?topic=28055.msg248671#msg248671 I know sleep is not supported but still... It seems that something related has changed from 12a on :-(

Link to comment

I know sleep is not supported but still... It seems that something related has changed from 12a on :-(

 

S3 sleep in the past was problematic because of "buggy" motherboard bios.  I didn't want to get into a situation of having to write all kinds of specialized modules to support every motherboard out there.  These days the situation is much better and I agree that S3 sleep support should be "standard".

 

Should be...

Link to comment

One fairly minor point...

running rc15a

I used to rely on the PC 'beep' when unRAID came up, but now I've had to add it as a command in the go script. Can this be reinstated?

 

Not yet upgraded to 16c, so if anybody can verify this is the case therein, I will upgrade soon enough anyway. Thanks.

 

m

Link to comment

...

Not yet upgraded to 16c, so if anybody can verify this is the case therein, I will upgrade soon enough anyway. Thanks.

...

 

As a general rule of thumb you shouldn't be running an rc unless it's the latest, and people aren't going to pay a lot of attention if you're not.

 

Join the party... upgrade :)

Link to comment

OK - I'm on rc16c. It still does not beep on boot as 4.7 used to, which is very useful in headless operation in determining when the box is "up".

 

Having been running V5 since beta4, I no longer remember what 4.x used to do.  However, I think that the definition of "up" is a little uncertain.  Is it when disk shares are available, user shares are available, the emhttp interface is available, or all "startup activity" has completed?  Termination of the 'go' script does not necessarily coincide with any of these.

 

I know that, in my system, some of my addons are active long before the emhttp interface starts responding.  I can access unMENU before emhttp.  I believe that this occurs because of the way that the events interface works, and the fact that I run a number of addons.

 

I've manually added beep;beep;beep at the end of my go script for now.

 

This may well be the best solution.  Add the beeps where they match your definition of "up".

This could be in the 'go' script, or it could be on one of the events.

Link to comment

It still does not beep on boot as 4.7 used to, which is very useful in headless operation in determining when the box is "up".

 

The beep isn't actually the system coming up but is the network daemon detecting a change (e.g. connecting). I always found I had to wait for a bit until emhttp & unmenu were actually started.

 

I've confirmed this by unplugging & reconnecting the ethernet cable and it beeps on both events.

Link to comment

Should the Cache Drive also conform to these instructions?

 

Click on each disk link on the Main page and examine the Partition format field. If you see "MBR: error", or "MBR: unknown" for any disk, do not Start the array; instead post your finding in the Forum announcement thread for this release. If everything looks ok, click Start to bring the array on-line.

 

After upgrading from 5rc8a to 5rc16c, my cache drive says:  Partition format: unknown

Link to comment

I thought Tom said he fixed that.

 

I guess it's not totally fixed. 1 share is still displaying incorrectly, despite being restricted to 1 drive. Manually running the mover subtracts the cache size from the total, but it's still including the free space of all drives.

 

When I posted the above it was happening to a few of my shares. Now it's isolated to just the 1. I'll keep an eye on it while I wait for the next rc/final release. Thanks.

Link to comment