Unraid

Hardware Error MCE


Oddly, this started after I began transferring a few files directly from an rclone mount. Not sure if it was just a coincidence.

 

Quote

Jan 15 22:23:21 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:23:21 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:23:21 Backup kernel: CMCI storm detected: switching to poll mode
Jan 15 22:24:25 Backup kernel: mce_notify_irq: 15 callbacks suppressed
Jan 15 22:24:25 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:24:35 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:26:48 Backup kernel: mce_notify_irq: 2 callbacks suppressed
Jan 15 22:26:48 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:27:00 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:29:00 Backup kernel: mce_notify_irq: 2 callbacks suppressed
Jan 15 22:29:00 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:34:00 Backup kernel: CMCI storm subsided: switching to interrupt mode
Jan 15 22:37:07 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:37:54 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:38:20 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:38:29 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:39:20 Backup kernel: CMCI storm detected: switching to poll mode
Jan 15 22:39:21 Backup kernel: mce_notify_irq: 18 callbacks suppressed
Jan 15 22:39:21 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:39:22 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:40:26 Backup kernel: mce_notify_irq: 1 callbacks suppressed
Jan 15 22:40:26 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:40:40 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:43:14 Backup kernel: mce_notify_irq: 1 callbacks suppressed
Jan 15 22:43:14 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:46:30 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:50:00 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:53:10 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:54:32 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:54:37 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:55:42 Backup kernel: mce_notify_irq: 5 callbacks suppressed
Jan 15 22:55:42 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:55:52 Backup login[32403]: ROOT LOGIN on '/dev/pts/0'
Jan 15 22:56:00 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:56:43 Backup kernel: mce_notify_irq: 1 callbacks suppressed
Jan 15 22:56:43 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 15 22:56:54 Backup kernel: mce: [Hardware Error]: Machine check events logged
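For anyone wanting to gauge how often this is happening, a syslog excerpt like the one above can be tallied with a short script. This is only a rough sketch: the function name `summarize_mce` and the idea of passing the syslog path on the command line are assumptions for illustration, and it only matches the three message types seen in the quote.

```python
import re
import sys

# Matches the rate-limit notices, e.g. "mce_notify_irq: 15 callbacks suppressed".
SUPPRESSED_RE = re.compile(r"mce_notify_irq: (\d+) callbacks suppressed")


def summarize_mce(lines):
    """Count MCE-related kernel messages in an iterable of syslog lines."""
    summary = {"logged": 0, "suppressed": 0, "storms": 0}
    for line in lines:
        # Only kernel messages are of interest; skip login and other daemons.
        _, sep, msg = line.partition(" kernel: ")
        if not sep:
            continue
        if "Machine check events logged" in msg:
            summary["logged"] += 1
        elif "CMCI storm detected" in msg:
            summary["storms"] += 1
        else:
            match = SUPPRESSED_RE.search(msg)
            if match:
                summary["suppressed"] += int(match.group(1))
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python mce_summary.py <path-to-syslog>
    with open(sys.argv[1], errors="replace") as handle:
        print(summarize_mce(handle))
```

The "suppressed" total matters because the kernel rate-limits these notices, so the number of "Machine check events logged" lines alone understates how many events occurred.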



 

backup-diagnostics-20190115-2259.zip

  • Author

Anyone?

  • Community Expert

See if /var/log/mcelog has anything interesting, also if the board has a system event log, there might be some more info there.

  • Author

Thanks johnnie.black, always willing to help. Much appreciated. I have mcelog installed from the Nerd Pack, but it has never worked, and I know a lot of people who say the same thing. The system event log only shows sys_fan4, 3, 2, 1 and cpu2, cpu1 fan events, all "Lower Critical going low", asserted or deasserted. Probably due to the fans I'm using.


I did notice this

Quote

70 | 01/10/2019 01:47:57 | 34 | AC Lost | Power Supply | Power Supply Input Lost or Out of Range - Asserted

But realized that was when I shut down the server gracefully and had to pull it. I didn't open it, just did some rearranging in the rack. I'm doing a memtest right now. 

  • Author

One pass ran with no errors. Not sure how many passes are sufficient in this case?

  • Community Expert

If you're using ECC RAM, no errors will show in memtest, since they are corrected before memtest sees them.
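Since corrected ECC errors won't show up in memtest, one other place they may be visible is the kernel's EDAC counters in sysfs. The sketch below assumes an EDAC driver is loaded for the memory controller, which may not be the case on every board or Unraid release; if the directory is missing it simply returns nothing. The function name `read_edac_counts` is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} per mcN directory.

    "ce" is the corrected-error count and "ue" the uncorrected-error count.
    An empty dict means no EDAC memory controllers were found (assumption:
    the EDAC driver is not loaded or the platform is unsupported).
    """
    results = {}
    root = Path(base)
    if not root.is_dir():
        return results
    for controller in sorted(root.glob("mc[0-9]*")):
        counts = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                counts[key] = int((controller / filename).read_text().strip())
            except (OSError, ValueError):
                counts[key] = None  # file missing or unreadable
        results[controller.name] = counts
    return results


if __name__ == "__main__":
    for name, counts in read_edac_counts().items():
        print(name, counts)
```

A non-zero "ce" value that keeps growing would point toward memory as the source of the machine check events, though it would not by itself say which DIMM is involved.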

  • Author

Just started happening, so I'm not sure where to look here. 

  • 2 weeks later...
  • Author

Upgraded to the latest RC version available, and the hardware errors are still continuing. Here are the latest syslog messages, which I haven't seen before.

 

Quote

Jan 25 06:00:14 Backup kernel: Uhhuh. NMI received for unknown reason 21 on CPU 0.
Jan 25 06:00:14 Backup kernel: Do you have a strange power saving mode enabled?
Jan 25 06:00:14 Backup kernel: Dazed and confused, but trying to continue
Jan 25 06:00:14 Backup kernel: DMAR: DRHD: handling fault status reg 2
Jan 25 06:00:14 Backup kernel: DMAR: [DMA Read] Request device [02:00.0] fault addr ff0cf000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:14 Backup kernel: DMAR: [DMA Read] Request device [03:00.0] fault addr fed22000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:14 Backup kernel: DMAR: DRHD: handling fault status reg 202
Jan 25 06:00:14 Backup kernel: DMAR: [DMA Read] Request device [03:00.0] fault addr fed13000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:14 Backup kernel: DMAR: DRHD: handling fault status reg 302
Jan 25 06:00:14 Backup kernel: DMAR: [DMA Read] Request device [03:00.0] fault addr fed14000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:15 Backup kernel: mce_notify_irq: 58 callbacks suppressed
Jan 25 06:00:15 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 25 06:00:16 Backup kernel: mce: [Hardware Error]: Machine check events logged
Jan 25 06:00:22 Backup kernel: dmar_fault: 8424 callbacks suppressed
Jan 25 06:00:22 Backup kernel: DMAR: DRHD: handling fault status reg 402
Jan 25 06:00:22 Backup kernel: DMAR: [DMA Read] Request device [02:00.0] fault addr fea3a000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:23 Backup kernel: DMAR: DRHD: handling fault status reg 502
Jan 25 06:00:23 Backup kernel: DMAR: [DMA Read] Request device [03:00.0] fault addr fec10000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:23 Backup kernel: DMAR: DRHD: handling fault status reg 602
Jan 25 06:00:23 Backup kernel: DMAR: [DMA Read] Request device [03:00.0] fault addr fe7df000 [fault reason 06] PTE Read access is not set
Jan 25 06:00:23 Backup kernel: DMAR: DRHD: handling fault status reg 702
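The DMAR lines name the PCI device making each faulting request (02:00.0 and 03:00.0 in the quote above), so counting faults per device is a quick way to see which ones to look up with `lspci`. A minimal sketch, assuming the message format stays as quoted; `count_dmar_faults` is a hypothetical helper name.

```python
import re
import sys
from collections import Counter

# Matches e.g. "DMAR: [DMA Read] Request device [02:00.0] fault addr ff0cf000"
DMAR_RE = re.compile(
    r"DMAR: \[DMA (Read|Write)\] Request device \[([0-9a-fA-F:.]+)\] "
    r"fault addr ([0-9a-fA-F]+)"
)


def count_dmar_faults(lines):
    """Return a Counter mapping PCI address -> number of DMAR fault lines."""
    per_device = Counter()
    for line in lines:
        match = DMAR_RE.search(line)
        if match:
            per_device[match.group(2)] += 1
    return per_device


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python dmar_faults.py <path-to-syslog>
    with open(sys.argv[1], errors="replace") as handle:
        for device, count in count_dmar_faults(handle).most_common():
            print(device, count)
```

Note that the kernel also rate-limits these messages ("dmar_fault: 8424 callbacks suppressed"), so the counts are a lower bound rather than the true number of faults.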

 

backup-diagnostics-20190125-0754.zip

Edited by slimshizn

  • Author

Guess I can just live with the MCE logs for now. Any limetech/unraid admins have any ideas here? 

Archived

This topic is now archived and is closed to further replies.
