John_M

Community Answers

  1. Yes, I believe you're right about the BTRFS errors. You can clear them with btrfs dev stats -z /mnt/cache (assuming that's the correct mount point) then you'll find it easy to notice if any more occur.
  2. The reason I said the RAM was a slightly odd choice is that it's specced at DDR4-3600 but the maximum your CPU can run at is 3200 MT/s. That figure is derated by your particular configuration - you have two DIMMs per channel and they are dual-rank. Because that's a lot of physical chips connected across the bus the recommended maximum speed for a 3000-series CPU is 2666 MT/s. So you might have paid more for faster RAM when slightly slower RAM would suffice. 2133 MT/s is fine. I was asking why you weren't running it at 2666 MT/s and if the reason is that you had decided to slow it down in an attempt to avoid the errors. The BTRFS errors I pointed out are real errors - hardware errors, I believe, and they are present in both sets of diagnostics - i.e. before you start the VM. Check the timestamp. The system runs fine before you start the VM but since the VM makes heavy use of the cache pool the problems begin when you start the VM. On the question about power delivery, the Pro 4 series has weaker VRMs than more gaming-orientated motherboards and the B450 series was designed for the 2000-series of CPUs. Your 3950X is rather more power hungry than the top CPU in the 2000 series (the 2700X). I don't think it's an issue in your case but it is why I asked what your VM was doing, thinking you might be using it to thrash the CPU. If you're happy with the RAM the next thing to address is the NVMe cache.
  3. The L3 cache is on the same chiplet as the L2 cache, L1 cache and CPU cores. The Infinity Fabric is not involved.
  4. I'd go back and re-test the RAM. It isn't clear from your post whether your problem started before or after re-seating the RAM but unless you're sure it's good you're wasting your time doing anything else. The RAM modules themselves seem to be a slightly odd choice. Being DDR4-3600 they're not the best match for your CPU but you're clocking them at 2133 MT/s so at least that's within spec. You should be able to run them at 2666 MT/s - have you reduced the speed to see if it would fix the problem? ASRock's website has been down for a couple of days now so I can't check on the specifics of your motherboard, but is the B450 Pro 4's CPU power delivery really adequate for the 3950X? What's special about the VM you use to reproduce the problem? Note that there are pre-existing errors on your NVMe cache before you start the VM:

       Sep 25 17:09:48 Jarvis kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 38166417, rd 30944114, flush 3780, corrupt 102327, gen 246

     I can't tell whether subsequent errors reported in the log are "new" or simply manifestations of the pre-existing errors.
  5. The .local domain is a special case. It's a pseudo-top level domain that has been set aside for use by the multicast DNS (mDNS) service, originally developed by Apple as Rendezvous and more recently known as Bonjour, with the aim of providing "zero-configuration networking". Apple released the protocol to the open source community in a series of RFCs and it has been adopted and incorporated into other operating systems, e.g. the avahi daemon provides mDNS for Linux. Instead of querying a centralised DNS server, mDNS requests are multicast to all hosts on the local network and if an mDNS-supporting host recognises its own name it responds with its IP address, using a similar data structure to that used by a conventional DNS server. For a particular container to support mDNS it would need to be built with avahi included.
  6. I think the problem with your request is that the system doesn't know how big a file is going to be until it has finished writing it. That's the reason you have to specify for each share the minimum free space as being at least as big as the largest file you intend to store in it, otherwise you risk filling up a disk and the write failing.
  7. There's a plugin called Enhanced Log Viewer that opens at the end of the log. I appreciate that this isn't quite what you're asking for but it might make life easier for you in the meantime. The plugin doesn't replace the built-in log viewer but it is available from the Tools page as an alternative. It has other features too, such as customisable colours. I find it very useful.
  8. That's a feature that could easily be implemented in the Dynamix File Manager plugin. It already allows chown and chmod operations. The ability to twiddle the immutability bit in the GUI would be very useful, especially if the file's icon were to change to indicate that it has been set.
  9. The mcelog command doesn't support AMD processors and the resulting error message, which appears once in the syslog at boot-up, causes confusion and anxiety. For a list of supported CPUs, type mcelog --help. To test whether the current CPU is supported, run mcelog --is-cpu-supported, which prints nothing and returns zero if the CPU is on the supported list, or prints the error message and returns a non-zero code if it is not. So, to suppress the error message, first call "mcelog --is-cpu-supported" with the error message redirected to /dev/null and test the return code. If it is zero then call mcelog again with the appropriate options. If it is non-zero, check that the edac_mce_amd module is loaded instead. See here:
  10. The mcelog error message is a red herring and is unrelated to the OP's problem. The message itself explains the situation, namely that "mcelog does not support this processor". The solution is to "use the edac_mce_amd module instead". To find which processors are supported by mcelog, type the following:

        root@Pusok:~# mcelog --help
        Usage: mcelog [options] [mcelogdevice]
        Decode machine check error records from current kernel.
        ...
        --help Display this message.
        Valid CPUs: generic p6old core2 k8 p4 dunnington xeon74xx xeon7400 xeon5500 xeon5200 xeon5000 xeon5100 xeon3100 xeon3200 core_i7 core_i5 core_i3 nehalem westmere xeon71xx xeon7100 tulsa intel xeon75xx xeon7500 xeon7200 xeon7100 sandybridge sandybridge-ep ivybridge ivybridge-ep ivybridge-ex haswell haswell-ep haswell-ex broadwell broadwell-d broadwell-ep broadwell-ex knightslanding knightsmill xeon-v2 xeon-v3 xeon-v4 atom skylake skylake_server cascadelake_server kabylake denverton icelake_server icelake-d snowridge cometlake tigerlake rocketlake alderlake lakefield sapphirerapids_server

      and to check whether the edac_mce_amd module is loaded:

        root@Pusok:~# lsmod | grep mce
        edac_mce_amd 32768 0

      This confusion would be avoided if mcelog was only run after first checking for a compatible CPU by invoking it with the --is-cpu-supported option. This is the result with an AMD CPU:

        root@Pusok:~# mcelog --is-cpu-supported
        mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
        root@Pusok:~#

      A bash script could easily suppress the error message and check the return code for a non-zero value. For comparison, here's the result with an Intel Ivybridge CPU:

        root@Northolt:~# mcelog --is-cpu-supported
        root@Northolt:~#

      Finally, just to confirm that the Intel server doesn't have the edac_mce_amd module loaded:

        root@Northolt:~# lsmod | grep mce
        root@Northolt:~#

      EDIT: I've submitted a feature request to get rid of the error message.
  11. In the unique situation where you have one data disk and one parity disk and they are both the same size, their contents are identical because that's how even parity works. You could probably assign either as either and it would be ok. However, you might want to adopt a more cautious approach. Here's what I'd do. I'd choose one of them and temporarily disconnect the other (so if things go wrong you at least have a second chance). Do a new config and allocate your chosen drive as disk 1, no parity. Start the array and check that your files are ok. Once you're happy you can shut down, reconnect the other drive, add it as parity and let it rebuild.
  12. You need to identify the Super I/O chip, which is separate from the main chipset. It won't be made by AMD. Likely manufacturers include ITE and Nuvoton. You need a driver for that chip and it needs to be loaded at boot time. Have you tried running the sensors-detect command, as @bastl did a few messages up on this page? It might need the id of the chip to be overridden and you might need to do some web searching once you've determined what chip your motherboard actually uses. For example, I have a Gigabyte X370 motherboard that uses an ITE chip. All I had to do was add modprobe it87 to my /boot/config/go file. YMMV, of course.
  13. That is the current version. More information is needed about your problem though. I haven't experienced it for five years.
  14. The Unraid GUI uses eth0, so temporarily unplug the 10G NIC and set the gigabit NIC to eth0. As you're using a static IP address you also need to configure a default gateway - usually that would be your Internet router. An easier way to do it would be to use a DHCP-allocated address because the DHCP server built into your router can automatically set the default gateway for you. You should be able to reserve a particular IP address for your Unraid server by configuring the DHCP server in your router, then you have a situation that's functionally very similar to having a static IP address, without the hassle of configuring it manually. If you repeat that for all your 1G devices you centralise all the administration in one place, your router. However you achieve it, that should restore your connection to the Internet and allow you to use the webGUI. Once that's working, set the 10 gigabit NIC to eth1 and give it a static IP address that's compatible with your other 10G devices. When you connect to your server for file transfers refer to it by this IP address. I suspect that when you swapped over eth0 and eth1 you got muddled with the static IP addresses and forgot to swap them at the same time.
  15. It looks as though the size reported for your disk is bigger than the actual size available. That's 14 TB. But dd runs out of space before 1 TB is written. Are you using a disk controller that I've never heard of? Your diagnostics should reveal more.
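
On answer 4: one way to tell "new" BTRFS errors from pre-existing ones is to compare the counters in two such kernel log lines. The following is only a minimal sketch. The function names parse_bdev_errs and new_errors are made up for illustration, and the pattern assumes the exact line layout quoted in that answer.

```python
import re

# The kernel log line quoted in answer 4.
LOG_LINE = (
    "Sep 25 17:09:48 Jarvis kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme0n1p1 errs: wr 38166417, rd 30944114, "
    "flush 3780, corrupt 102327, gen 246"
)

# Assumes the field order shown above; other layouts will not match.
PATTERN = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_bdev_errs(line):
    """Return (device, counters) for a 'bdev ... errs:' line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "dev"
    }
    return match.group("dev"), counters


def new_errors(before, after):
    """Return only the counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if after[name] > before[name]
    }
```

Parsing the line from before the VM starts and a later one, then passing both counter dicts to new_errors, gives an empty dict if nothing has got worse. Clearing the counters first, as in answer 1, makes the comparison unnecessary.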
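
On answers 9 and 10: the check-before-run logic could look something like the sketch below. The answers suggest a bash script; this is the same idea in Python, not the actual Unraid start-up code. Only the --is-cpu-supported option and the edac_mce_amd module name come from the answers. The function names, the returned strings and the use of /proc/modules (the file that lsmod reads) are assumptions of the sketch.

```python
import subprocess


def cpu_supported_by_mcelog(mcelog="mcelog"):
    """True if `mcelog --is-cpu-supported` exits with status 0.

    Output is discarded, so the error message never reaches the log.
    """
    try:
        result = subprocess.run(
            [mcelog, "--is-cpu-supported"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # mcelog is missing or cannot be executed.
        return False
    return result.returncode == 0


def module_listed(name, modules_text):
    """True if `name` is the first field of any line (lsmod / /proc/modules style)."""
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )


def mce_advice(mcelog="mcelog", modules_path="/proc/modules"):
    """Decide what to do about machine check logging (hypothetical helper)."""
    if cpu_supported_by_mcelog(mcelog):
        return "run mcelog"
    try:
        with open(modules_path) as handle:
            modules_text = handle.read()
    except OSError:
        modules_text = ""
    if module_listed("edac_mce_amd", modules_text):
        return "edac_mce_amd loaded"
    return "load edac_mce_amd"
```

Calling mce_advice() on the AMD server described in answer 10 should return "edac_mce_amd loaded", and on the Intel one "run mcelog". Which options to pass to mcelog after that is left open, as it is in the answer.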
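
On answer 11: the claim that a single data disk and its parity disk hold identical contents follows directly from even (XOR) parity. This is a toy illustration on a few bytes, not Unraid's implementation, and the function name even_parity is made up.

```python
from functools import reduce


def even_parity(*disks):
    """Byte-wise XOR across equally sized 'disks' (bytes objects)."""
    if not disks:
        raise ValueError("need at least one disk")
    if len({len(disk) for disk in disks}) != 1:
        raise ValueError("all disks must be the same size")
    # zip(*disks) yields one tuple of bytes per offset; XOR them together.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


disk1 = bytes([0x00, 0xFF, 0x5A])
disk2 = bytes([0x0F, 0xF0, 0xA5])

# One data disk: the parity is a byte-for-byte copy of it.
parity_single = even_parity(disk1)

# Two data disks: XOR-ing the parity with one disk rebuilds the other.
parity_pair = even_parity(disk1, disk2)
rebuilt_disk1 = even_parity(parity_pair, disk2)
```

With only one data disk there is nothing to XOR against, so parity_single equals disk1. That is why either drive could in principle be assigned as data in that particular case.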