SnickySnacks


Posts posted by SnickySnacks

  1. On 12/13/2019 at 5:07 AM, TJOPTJOP said:

    What I also do not understand is how those DIMMs can be broken. ECC DIMMs are error corrected, right?


    Also, this is exactly what the log is telling you.

    A "CE memory read error" or "CE memory scrubbing error" is a "Correctable Error (CE)". It would be worse if you were getting "Uncorrectable Errors (UE)".

    My concern is that you've merely turned off the memory scrubbing and whatnot, hiding the errors rather than fixing them.
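    If ECC is still enabled, the kernel's EDAC counters are an easy way to see whether corrected errors keep piling up. These sysfs paths are standard Linux; this assumes the EDAC driver for your memory controller is loaded:

    grep . /sys/devices/system/edac/mc/mc*/ce_count
    grep . /sys/devices/system/edac/mc/mc*/ue_count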

  2. Just to be clear, are you now running with ECC off? That seems like a bad idea.

    Doesn't that mean that instead of seeing the errors, it's just going to be failing silently and potentially corrupting memory? (Especially if you weren't seeing the errors in Memtest in the first place.)

    Quote

    So, I decided to install all the RAM modules again, turn off ECC checking in my BIOS, and run a full Memtest. Any suggestions on how many passes I need to confirm whether my RAM is good or bad? I think that one pass will take 12+ hours.

  3. It will be something like:

    1) Copy everything off the existing licensed USB onto a PC
    2) Delete everything off the licensed USB
    3) Copy everything from the trial USB to the licensed USB
    4) Delete the trial key from the licensed USB
    5) Copy the licensed Unraid key file from the PC backup onto the licensed USB


    If you do it this way you should still have your trial USB and a backup copy of your licensed USB in case anything goes wrong.
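    In shell terms it's roughly the following (a sketch only; the mount points are assumptions, and the key file names under config/ may differ on your sticks):

    # Assumed mount points: adjust to wherever the sticks show up on your PC
    LICENSED=/mnt/licensed   # the old, licensed stick
    TRIAL=/mnt/trial         # the new stick currently running the trial
    BACKUP="$HOME/unraid-usb-backup"

    mkdir -p "$BACKUP"
    cp -a "$LICENSED"/. "$BACKUP"/                   # 1) back it all up to the PC
    rm -rf "$LICENSED"/*                             # 2) wipe the licensed stick
    cp -a "$TRIAL"/. "$LICENSED"/                    # 3) copy the trial stick over
    rm -f "$LICENSED"/config/Trial.key               # 4) remove the trial key
    cp "$BACKUP"/config/*.key "$LICENSED"/config/    # 5) restore the licensed key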

  4. It's my preference to stop them on Unraid since I don't have to manage individual computers (or guests!).

    If you notice any new files that need to be added to the veto list, please let the community know.
    At this point my shares are all read-only as a ransomware preventative measure, except for one share that I stage files to, so nothing can really create files on the array anymore anyway.
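    For anyone searching later, the veto list goes in Samba config (on Unraid, Settings -> SMB -> SMB Extras); the patterns here are only illustrative, substitute whatever list you've collected:

    [global]
       # patterns are delimited by '/' and accept * and ? wildcards
       veto files = /.DS_Store/._*/Thumbs.db/*.crypt/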

  5. 5 hours ago, Zonediver said:

    That's normal behavior.

    After "every" boot/reboot you need to push the spin-down button only once and all is fine - until the next boot/reboot.


    Eh? Not that normal.

    I rarely if ever hit the spin-down button, and my disks always spin down normally after a reboot (once folder caching is done doing its thing).


    Flam3h:
    How sure are you that the drives aren't spinning down eventually? Fix Common Problems runs 10 minutes after your system comes up, and you can see in the log that the system issues a spin-down ~15 minutes after that, which generates a (likely harmless) error on an NVMe drive:



    Oct 14 01:26:33 Tower emhttpd: shcmd (138): /usr/sbin/hdparm -y /dev/nvme0n1
    Oct 14 01:26:33 Tower root:  HDIO_DRIVE_CMD(standby) failed: Inappropriate ioctl for device
    Oct 14 01:26:33 Tower root: 
    Oct 14 01:26:33 Tower root: /dev/nvme0n1:
    Oct 14 01:26:33 Tower root:  issuing standby command
    Oct 14 01:26:33 Tower emhttpd: shcmd (138): exit status: 25


    Other than that, everything seems normal in the log.
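    If you want to double-check whether a given disk has actually spun down, hdparm can report the power state of ATA drives (the NVMe error above is just hdparm being pointed at a device that doesn't speak ATA; adjust the device name as needed):

    hdparm -C /dev/sdb    # prints "drive state is: active/idle" or "standby"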

  6. At the time those diagnostics were created, were any CPUs showing 100% load? If so, which ones?

    Also, have you tried booting in safe mode and seeing if this occurs with no plugins/dockers/vms loaded?

    There does seem to be some corruption on one of your disks:

    Oct  7 23:56:33 Homebase kernel: BTRFS critical (device sdj1): corrupt leaf: root=5 block=1953586397184 slot=84, bad key order, prev (288230376157862467 96 4) current (6150723 96 5)
    ### [PREVIOUS LINE REPEATED 4 TIMES] ###



    Should probably run a check on that one (see the command after the trace below), as it looks like it eventually causes a kernel fault:

    Oct  8 02:02:49 Homebase kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
    Oct  8 02:02:49 Homebase kernel: PGD 4ad0b1067 P4D 4ad0b1067 PUD 4ad0b0067 PMD 0 
    Oct  8 02:02:49 Homebase kernel: Oops: 0000 [#1] SMP NOPTI
    Oct  8 02:02:49 Homebase kernel: CPU: 15 PID: 1848 Comm: fstrim Tainted: P           O      4.19.56-Unraid #1
    Oct  8 02:02:49 Homebase kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Taichi, BIOS P2.10 09/09/2019
    Oct  8 02:02:49 Homebase kernel: RIP: 0010:btrfs_trim_fs+0x166/0x369
    Oct  8 02:02:49 Homebase kernel: Code: 00 00 48 c7 44 24 38 00 00 00 00 49 8b 45 10 48 c7 44 24 40 00 00 00 00 48 c7 44 24 30 00 00 00 00 48 89 44 24 20 48 8b 43 68 <48> 8b 80 80 00 00 00 48 8b 80 f8 03 00 00 48 8b 80 a8 01 00 00 0f
    Oct  8 02:02:49 Homebase kernel: RSP: 0018:ffffc9001294fc90 EFLAGS: 00010297
    Oct  8 02:02:49 Homebase kernel: RAX: 0000000000000000 RBX: ffff888f5db68200 RCX: ffff888fbf604878
    Oct  8 02:02:49 Homebase kernel: RDX: ffff888cac98de80 RSI: ffff888f5d718c00 RDI: ffff888fbf604858
    Oct  8 02:02:49 Homebase kernel: RBP: 0000000000000000 R08: ffff888f5911fa70 R09: ffff888f5911fa68
    Oct  8 02:02:49 Homebase kernel: R10: ffffea0022918ec0 R11: ffff888ffe9e0b80 R12: ffff888fbfafe000
    Oct  8 02:02:49 Homebase kernel: R13: ffffc9001294fd20 R14: 0000000000000000 R15: 0000000000000000
    Oct  8 02:02:49 Homebase kernel: FS:  000014b7fa3ac780(0000) GS:ffff888ffe9c0000(0000) knlGS:0000000000000000
    Oct  8 02:02:49 Homebase kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct  8 02:02:49 Homebase kernel: CR2: 0000000000000080 CR3: 00000001a6aa2000 CR4: 0000000000340ee0
    Oct  8 02:02:49 Homebase kernel: Call Trace:
    Oct  8 02:02:49 Homebase kernel: ? dput.part.6+0x24/0xf6
    Oct  8 02:02:49 Homebase kernel: btrfs_ioctl_fitrim.isra.7+0xfe/0x135
    Oct  8 02:02:49 Homebase kernel: btrfs_ioctl+0x4f6/0x28ad
    Oct  8 02:02:49 Homebase kernel: ? queue_var_show+0x12/0x15
    Oct  8 02:02:49 Homebase kernel: ? _copy_to_user+0x22/0x28
    Oct  8 02:02:49 Homebase kernel: ? cp_new_stat+0x14b/0x17a
    Oct  8 02:02:49 Homebase kernel: ? vfs_ioctl+0x19/0x26
    Oct  8 02:02:49 Homebase kernel: vfs_ioctl+0x19/0x26
    Oct  8 02:02:49 Homebase kernel: do_vfs_ioctl+0x526/0x54e
    Oct  8 02:02:49 Homebase kernel: ? __se_sys_newfstat+0x3c/0x5f
    Oct  8 02:02:49 Homebase kernel: ksys_ioctl+0x39/0x58
    Oct  8 02:02:49 Homebase kernel: __x64_sys_ioctl+0x11/0x14
    Oct  8 02:02:49 Homebase kernel: do_syscall_64+0x57/0xf2
    Oct  8 02:02:49 Homebase kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    Oct  8 02:02:49 Homebase kernel: RIP: 0033:0x14b7fa4de397
    Oct  8 02:02:49 Homebase kernel: Code: 00 00 90 48 8b 05 f9 2a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 2a 0d 00 f7 d8 64 89 01 48
    Oct  8 02:02:49 Homebase kernel: RSP: 002b:00007ffc52c9f358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    Oct  8 02:02:49 Homebase kernel: RAX: ffffffffffffffda RBX: 00007ffc52c9f4b0 RCX: 000014b7fa4de397
    Oct  8 02:02:49 Homebase kernel: RDX: 00007ffc52c9f360 RSI: 00000000c0185879 RDI: 0000000000000003
    Oct  8 02:02:49 Homebase kernel: RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000415fd0
    Oct  8 02:02:49 Homebase kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000415740
    Oct  8 02:02:49 Homebase kernel: R13: 00000000004156c0 R14: 0000000000415740 R15: 000014b7fa3ac6b0
    Oct  8 02:02:49 Homebase kernel: Modules linked in: veth xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost tap macvlan xt_nat ipt_MASQUERADE iptable_nat nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs dm_crypt algif_skcipher af_alg dm_mod dax md_mod bonding edac_mce_amd kvm_amd nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) drm_kms_helper btusb btrtl btbcm drm kvm btintel igb bluetooth agpgart syscopyarea sysfillrect crct10dif_pclmul sysimgblt fb_sys_fops crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_piix4 i2c_algo_bit pcbc i2c_core aesni_intel aes_x86_64 crypto_simd wmi_bmof mxm_wmi ahci ecdh_generic cryptd ccp libahci glue_helper wmi button pcc_cpufreq acpi_cpufreq
    Oct  8 02:02:49 Homebase kernel: CR2: 0000000000000080
    Oct  8 02:02:49 Homebase kernel: ---[ end trace 9bdd9e618dc0d9c2 ]---
    Oct  8 02:02:49 Homebase kernel: RIP: 0010:btrfs_trim_fs+0x166/0x369
    Oct  8 02:02:49 Homebase kernel: Code: 00 00 48 c7 44 24 38 00 00 00 00 49 8b 45 10 48 c7 44 24 40 00 00 00 00 48 c7 44 24 30 00 00 00 00 48 89 44 24 20 48 8b 43 68 <48> 8b 80 80 00 00 00 48 8b 80 f8 03 00 00 48 8b 80 a8 01 00 00 0f
    Oct  8 02:02:49 Homebase kernel: RSP: 0018:ffffc9001294fc90 EFLAGS: 00010297
    Oct  8 02:02:49 Homebase kernel: RAX: 0000000000000000 RBX: ffff888f5db68200 RCX: ffff888fbf604878
    Oct  8 02:02:49 Homebase kernel: RDX: ffff888cac98de80 RSI: ffff888f5d718c00 RDI: ffff888fbf604858
    Oct  8 02:02:49 Homebase kernel: RBP: 0000000000000000 R08: ffff888f5911fa70 R09: ffff888f5911fa68
    Oct  8 02:02:49 Homebase kernel: R10: ffffea0022918ec0 R11: ffff888ffe9e0b80 R12: ffff888fbfafe000
    Oct  8 02:02:49 Homebase kernel: R13: ffffc9001294fd20 R14: 0000000000000000 R15: 0000000000000000
    Oct  8 02:02:49 Homebase kernel: FS:  000014b7fa3ac780(0000) GS:ffff888ffe9c0000(0000) knlGS:0000000000000000
    Oct  8 02:02:49 Homebase kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct  8 02:02:49 Homebase kernel: CR2: 0000000000000080 CR3: 00000001a6aa2000 CR4: 0000000000340ee0
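    For the corruption itself, a read-only check is the safe first step. The device name comes from the log above; the filesystem has to be unmounted (array stopped) while it runs, and don't jump to --repair without a backup:

    btrfs check --readonly /dev/sdj1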


  7. I was contemplating the same thing a while back.
    A lot of it will depend on what kind of drives you are running.

    WD Red NAS drives pull less than 2A peak:

    https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-red-hdd/data-sheet-western-digital-wd-red-hdd-2879-800002.pdf

    While older WD Blue drives could use up to 3A, modern ones are also sub 2A:
    https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-blue-hdd/data-sheet-wd-blue-pc-hard-drives-2879-771436.pdf

    It would seem to me that, even accounting for motherboard/CPU/etc., you should easily be able to handle 30+ drives before you need to think about expanding your PSU (a Seasonic Titan Prime 1000 should have 83A on the 12V rail). If you're running 7200 RPM drives or something this may be closer to 20, but 10 drives should be no problem at all.
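    Back-of-the-envelope, using the datasheet peaks above: 83A on the 12V rail divided by ~2A peak per drive is about 41 drives with nothing else attached. Set aside 20-25A for the motherboard, CPU, and fans and you land right around 30. And since the ~2A figure is the spin-up peak rather than steady-state draw, staggered spin-up buys even more headroom.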

    I am running 16 drives, I think, on a Seasonic 860 (71A) with no issues that I've seen. 

    And, unless you are trying to absolutely max out your storage capacity, at some point you'll likely be replacing smaller drives with larger ones as you expand. Doesn't make as much sense to add four 3TB drives when two 6TB or one 12TB will do and likely be the same price or cheaper (parity, etc, I know....)

  8. I ended up cheaping out a bit.
    Here's what I went with:

    CPU: Intel i3-9100

    Motherboard: Gigabyte C246-WU4

    RAM: Crucial CT16G4WFD8266 16GBx4


    Turns out Xeons are really expensive. Still cost around $800 for this setup, fully half of which was for the RAM.
    I decided the PCIe lanes weren't as big of a deal as I was making them out to be. I settled for x8 + x8 slots plus 8 native SATA ports, which will be enough to cover all 24 drives with two M1015s (or equivalent) and the onboard ports.

    I'm not sure going from 16GB RAM to 64GB will really do anything for me, but in the back of my mind I'm hoping it will help with Folder Caching or Crashplan or something.


    Pros:

    Parity check single-core CPU usage is closer to 30% vs the 100% that was occurring before

    Parity check speeds up to 115MB/sec vs 70MB/sec (so far, may get faster after the 2TB disks)

    I was able to set my tunables back to the default, rather than the (very low) values I was using to prevent CPU stalls.

    Cons:

    The motherboard has DisplayPort connectors on it. I'm old and grumpy and don't own a single DisplayPort monitor or adapter; I didn't even realize there was another option besides VGA/DVI/HDMI. Had to pull out an ancient, broken video card as I really have no spares. :(
    Unraid won't boot unless I boot into GUI mode. I don't really care that much, so I'll leave it until I get the display situation worked out. Worst case, if I can't fix it, I'll just set GUI mode as the default so I don't have to hook up a keyboard/monitor to reboot it. Could be related to the video card or something.

    Edit: Getting the onboard video card working instead of the old, broken one I was using seems to let it boot up properly now. yay.

  9. On 7/29/2019 at 1:23 PM, johnnie.black said:

    There can be, depending on the number of disks connected, but there isn't a performance hit just because you're using one. You can easily calculate the max available bandwidth; it just depends on whether it's SAS/SAS2/SAS3 and linked with single or dual link to the HBA.


    Yes, that is what I said: when running "many drives". (I suppose that could be misinterpreted, but I meant "when there is a numerically large number of drives".)

    I was looking at your testing thread earlier.


    I have 14 drives right now and it takes forever to get through (60MB/s last I checked), with 3 on the M1015 and 11 on the expander.

    I had assumed it was due to my CPU, since that was getting pegged at 100% usage and stalling, but...

    Now I'm wondering if I did something silly like plugging my M1015 into a x4 slot or something, because I really should be getting better speeds than this.
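    One quick way to rule out the slot question: lspci reports both the slot's maximum and the negotiated link width for the HBA. The bus address below is just an example; grab the real one from the first command:

    lspci | grep -i sas                                # find the HBA's bus address
    lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'     # x8 in LnkCap but x4 in LnkSta means the slot is the bottleneck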

    More things to think about...

  10. That's pretty cool, but at the end the only thing going through my head was:
    "Man, I really hope he is also going to set up an offsite backup"

    I'm also a bit curious how much data he actually ended up with.

    I'd be pretty surprised if those drives in the bins were actually all full.

  11. It's been a while but my thinking goes like this:

    Last I checked there's a very real performance hit when running many drives off an expander (rough numbers below).

    Plus I feel like M1015s are probably easier/cheaper to get than RES2SV240s.
    I'm not planning to replace it right now, since the expander is already paid for, but I'd like the option to run 3x M1015s for full bandwidth should my expander ever fail (running 2x M1015s with the remaining drives off the motherboard is also an option).
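    Rough numbers, assuming a single SAS2 link between the HBA and expander: 4 lanes x 6Gb/s = 24Gb/s raw, or roughly 2.2GB/s usable after encoding overhead. Spread across 20 spinning drives during a parity check, that's only ~110MB/s per drive, below what modern drives sustain on their outer tracks, so the link rather than the drives becomes the ceiling. A dual link doubles it.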

    The motherboard/CPU I've been using (just a low-end consumer board and Phenom II CPU I picked up from Microcenter) was what I had in my Unraid test build when I was seeing if it would work for what I wanted. I migrated it into the 4224 so I wouldn't have to spend money on new hardware. I've been meaning to upgrade to something a bit better for years, and the problems I've been having with parity checks mean now is probably a good time. Even though I have no plans to change the M1015 right now, it's always in the back of my mind that it, or the expander, could fail in the future.

    The Norco should fit a full ATX board, so no worries there.

    I'm a bit curious if one can actually run all the PCIe slots on an X11SCA-F with an Intel Xeon E. I thought those topped out at 16 lanes, but the board claims to support 8/8/4/1. I'm guessing the 4/1 must run through the PCH or something.

  12. Using this post to see if anyone has recommendations and to "think out loud" while I work on this.

    Currently running an AMD Phenom II X4 processor, which chokes and dies (CPU stalls) when running dual parity checks, likely due to its lack of AVX instructions. This locks up the UI/terminal when it happens. I've managed to reduce the occurrences by lowering md_stripes or something, but parity checks are very slow and will never be optimal with this setup.

    I figure this is as good a reason as any to upgrade to some real hardware.

    I run very few dockers and have no real plans to run VMs on the server (and if I do it's very unlikely they will need video card passthrough or anything as I have a dedicated gaming PC), but there's no kill like overkill so here is what I'm thinking:


    I'm not sure I have any real need for multi-processor support.

    ECC RAM. I'd prefer a configuration that allows me to start with 32GB and move to 64GB (or more) later.

    IPMI. Being able to manage the computer remotely would be great.

    Built-in graphics or the ability to run headless. Right now I have a...Diamond Stealth video card from the 90s crammed into my Unraid box because the mobo won't boot without a graphics card. I have no need for a monitor on the system if there is IPMI support, though.

    No BIOS update needed to boot the recommended processor. I have no spare processors sitting around, so the motherboard must either ship with a BIOS that boots it or be one of those magic boards that can somehow update without a processor.

    SATA ports don't matter. Running M1015+RES2SV240. The only change I'd make in the future is to move to multiple M1015s (or whatever is the new hotness) rather than using the expander. Currently running 12+2 drives in a Norco 4224.

    M.2 and internal USB would be nice.

    Intel gigabit ethernet LAN port.

    I remember back in the day some Supermicro boards (X9SCM?) had issues running multiple M1015s and wouldn't POST. Is that still a thing?

    Would the current version of that Supermicro board be fine? The X11SCM-F with probably a Xeon E-2174G or E-2176G? Or is there another direction I should be considering?

    This isn't really anything fancy or powerful, but trying to work out what I need is tough because there are so many options in server hardware out there that I'm a bit overwhelmed, to be honest. -_-

    EDIT: It appears the X11SCM-F only has one PCIe slot, so that's probably not the way to go. Hmmm.

  13. I haven't tested this or anything, as I don't have this issue.

    But, given that the error seems to be generated by the Tower (host) kernel, it's likely you'd need to set it for Unraid itself, not for each VM individually.


    What you'd need to do to test, now that I'm looking at it, is edit this file on your USB:

    /syslinux/syslinux.cfg

    and either create a new entry or add this to the append line of an existing one:

    clocksource=hpet


    So like

    label unRAID OS
      menu default
      kernel /bzimage
      append initrd=/bzroot


    would become

    label unRAID OS
      menu default
      kernel /bzimage
      append initrd=/bzroot clocksource=hpet


    It might work, do nothing, or fail to boot, but it's simple enough to undo either way.


    It's worth a try, at least.
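    If it does boot, it's easy to confirm which source actually took; these sysfs paths are standard Linux:

    cat /sys/devices/system/clocksource/clocksource0/available_clocksource
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource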


  14. Rather than replacing the motherboard/CPU, which seems like a rather drastic solution, is it possible to change the clock source in the go file itself?

    It looks like Linux should allow customization of which clock source the kernel uses. Having it start with hpet or jiffies instead of tsc might be an option?


    see https://www.kernel.org/doc/html/v4.10/admin-guide/kernel-parameters.html


    Is it possible to explicitly set clocksource=hpet for the people having this issue?
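    For what it's worth, the kernel also lets you switch the clock source at runtime through sysfs, so a single line in the go file might be enough to test it without touching syslinux.cfg (assuming hpet shows up in the available_clocksource list on the affected boards):

    # added to /boot/config/go
    echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource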