Everything posted by RobJ

  1. There are a couple of differences between the 6.1.9 and 6.3.2 lspci reports.

     00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) [1002:4383] (rev 40)
             Subsystem: ASUSTeK Computer Inc. Device [1043:8436]
             Kernel driver in use: vfio-pci

     For the Azalia audio, 6.1.9 has the vfio-pci driver assigned; 6.3.2 has nothing assigned. I don't have VM experience yet, but I assume that's for audio passthrough, probably not working in your 6.3.2 VM. When you move to 6.3, you may need to rework your VMs.

     For 6.1.9:

     00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) [1002:5a14] (rev 02)
             Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) [1002:5a14]
     00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU) [1002:5a23]
             Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU) [1002:5a23]
     00:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B) [1002:5a16]
             Kernel driver in use: pcieport
     00:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port A) [1002:5a1d]

     For 6.3.2:

     00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RD9x0/RX980 Host Bridge [1002:5a14] (rev 02)
             Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) [1002:5a14]
     00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD/ATI] RD890S/RD990 I/O Memory Management Unit (IOMMU) [1002:5a23]
             Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU) [1002:5a23]
     00:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GFX port 0) [1002:5a16]
             Kernel driver in use: pcieport
     00:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 5) [1002:5a1d]

     These changes may not look significant at first, but they represent changes in the main AMD motherboard support. In 6.1.9, the RD890 modules were for the RD800 series AMD chipsets. In 6.3.2, the same modules appear to have been enhanced to also cover the RD900 series, and renamed to RD9X0. If they were modified to cover more chipsets, then there is a possibility that mistakes were made, and compatibility for your board was harmed. All you can do is check that you have the latest BIOS for your board, and wait for a newer kernel with fixed support, *if* this is the cause of the problem (and it may not be). I would guess there are thousands of other users with your board, or one similar enough, complaining too (if this is the problem). What's odd is that you have a 990FX chipset, so you would think the RD9X0 support would be better, but the RD890 modules did support the 990.

     One other difference: in 6.3.2 you used an MTU of 9000, in 6.1.9 an MTU of 1500. That could conceivably cause communication difficulties if not everything in the network path supports 9000.
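     If you want to compare the two reports yourself, a diff of the two captures shows every changed line at once. A minimal sketch, assuming you saved the two lspci outputs as lspci-619.txt and lspci-632.txt (placeholder names):

       # Show only the lines that differ between the two lspci captures
       diff lspci-619.txt lspci-632.txt

       # Check the current MTU on the server's interface (eth0 assumed), and
       # drop it back to the standard 1500 if anything in the path can't do 9000
       ip link show eth0
       ip link set eth0 mtu 1500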
  2. I'd be very interested in seeing test results at various values. In the Dynamix Plugins section of the Upgrading to UnRAID v6 guide, I added some comments and recommendations for CacheDirs, including the following: CacheDirs modifies vm.vfs_cache_pressure, a system parameter governing how aggressively the file and folder directory entries are kept in a cache. The Linux system default is '100', which is considered a "fair" value. Lower values like '1', '10', or '50' will keep more dir entries cached; '200' would allow them to be more easily dropped, resulting in more drive spinups. The most aggressive would be '0', but unfortunately it may introduce a small risk of out-of-memory conditions, when other memory needs cannot be satisfied because dir entries are hogging it! By default, CacheDirs sets it to '10', which is a good value for most users. If you set it to '100', then it will remain the same as the Linux default value. If you wish to change it, add -p # (that's a lowercase p and a number) to the User defined options field on the Folder Caching settings page. For example, to more aggressively protect your cached dir entries, enter -p 1 in the options field. To avoid any possible side effects, add the parameter -p 100, which will restore it to the system default.
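     If anyone wants to check or experiment with the value directly, outside of CacheDirs, the standard sysctl interface works. A minimal sketch (the change is temporary and reverts on reboot):

       # Show the current value (Linux default is 100; CacheDirs sets 10)
       sysctl vm.vfs_cache_pressure

       # Set a more protective value for testing
       sysctl vm.vfs_cache_pressure=1

       # The same knob is also visible through /proc
       cat /proc/sys/vm/vfs_cache_pressure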
  3. I took a look and, like eschultz, did not find any real issues; the system looks like it's working fine. The earlier diagnostics showed a small network issue, but it's not in the latest, and it wouldn't have caused your issues anyway. The syslog does end with the following, don't know what that's about:

     Mar 26 17:38:01 BlackBoX crond[1609]: exit status 127 from user root /usr/local/emhttp/plugins/dynamix.system.stats/scripts/sa1 1 1 &> /dev/null

     The system looks far better after your cleanup; I don't know how you operated at all with so many really old 2012 and 2013 packages installed! For example, the earlier diagnostics showed that Python 2.7.5 was installed, then later uninstalled completely, then 2.7.10 was installed and used to compile and set up DenyHosts, then 2.7.10 was completely uninstalled, and Python 2.7.13 was installed! And you removed some old stuff from your go file. I would also remove the hosts and nameserver lines; I don't think they are needed any more, and the hosts line was wrong, an IP with no host name. I would also consider removing UnMenu; I can't see what it's loading from /boot/packages, but I have to assume it's very old! Probably unnecessary at best, and it may be causing conflicts. If you'd like to submit a 6.1.9 diagnostics, from a working system, we can compare and see if any differences appear that may be relevant.
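     For what it's worth, exit status 127 from a shell normally means "command not found", so my first check would be whether that sa1 script actually exists and is executable. A minimal sketch, using the path straight from your syslog:

       # Does the script the cron job is calling actually exist?
       ls -l /usr/local/emhttp/plugins/dynamix.system.stats/scripts/sa1

       # Run it by hand, without the &> /dev/null, to see the real error
       /usr/local/emhttp/plugins/dynamix.system.stats/scripts/sa1 1 1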
  4. There are quite a few old-timers here. Not me of course, I'm quite young, but at one time I programmed Commodore Business Machines' CBM series for small businesses, prior to IBM's first 8088-based lemon of a computer, and well before that advanced 80386. And I'm sure it was in a past life that I wrote punch-card programs in Fortran and COBOL. As to your hardware, if you are going to run multiple VMs, especially full-blown desktops, I'm not sure there is such a thing as overkill in hardware. They are hungry beasts!
  5. Not sure if it's necessary, but I'm wondering if it's a good idea to reset the BIOS back to defaults when you switch CPUs, just to make sure it reconfigures itself for the new CPU.
  6. I think everyone is happy with the recent release schedules, and the way we're keeping up with security patching, and I certainly don't want to change that. I don't in any way want to say what Tom should do, so this is just pure speculation, just one idea of what could happen: if Tom were to give Eric more build responsibility, then on the first (or some other day) of each month, or every other month, Eric could ask Tom if he could build the next stable release using a given set of CVE patches, and possibly a small set of newer features or enhancements already in the dev-track release. That way, users could expect a stable release on schedule. At times, the answer might be that no stable release was warranted, no security issues to fix and nothing to add, and he could notify us accordingly.

     That IS a problem, with no easy answer. We saw it recently with the attempts by users to return to 6.2: Dockers not backward compatible. I'm not sure it should stop this though; it just means that users will be more likely to stick with whatever track they select. However, I'd say that most of the time, *someone* has come up with a set of steps to follow.

     I like the LTS strategy, but I didn't want to say so, because I didn't want to limit Tom to any particular way of doing it. But the rationale is very similar.

     That's one of the possibilities of course, but I don't personally like it. I think there are probably a number of happy users still on 6.2.4, running VMs. To me, it's a philosophical choice users can make: run with what has been better tested and works fine for them, even if not as 'modern' or feature-rich, or keep up with the latest tech, knowing there's a risk of more issues. If you need support for something that's only available in a certain release, you want that release no matter what it takes. Once you have the feature support you need, you can stick there, until the stable releases catch up with you, and then you can stay on the stable track.
  7. I'm probably missing something obvious, so feel free to snicker, but it looks like you're asking to remove command 1 and command 2 and replace them with command 3, *but* command 1 looks identical to command 3?
  8. Ooooo... can I ask for things too? None are vital. The lsmod list is randomly ordered, so that when I compare 2 diagnostics, they look very different, yet are usually identical. Would it be possible to change lsmod to lsmod | sort? Another non-vital but useful snapshot: add 2 top listings, one at the beginning of diagnostic data collection, call it top1.txt, and the other at the end of the collecting, call it top2.txt. It would give us 2 quick snapshots of what's running and how much memory each is using. Not needed much of the time, but really useful in certain situations. There's a sketch of what I mean below.
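     A minimal sketch of what I'm suggesting for the collection script (top1.txt and top2.txt as above; top -bn1 is one batch-mode iteration):

       # Sorted module list, so two diagnostics can be diffed directly
       lsmod | sort > lsmod.txt

       # Snapshot of processes and memory use at the start of collection
       top -bn1 > top1.txt

       # ... the rest of the diagnostic data collection runs here ...

       # Second snapshot at the end, for before/after comparison
       top -bn1 > top2.txt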
  9. I don't know, for a specific card. But some cards have their own BIOS setup screen, or jumpers, a way to make significant changes in the PCI configuration of the card. I have no idea what your card has. Perhaps Tom will check on that, can't speak for him. But remember, it's not clear what the issue is, may not be the driver.
  10. Uh oh! Now I'm feeling pressured! As you've probably noticed, I'm easily and constantly side-tracked! And I have a bunch of little projects I'm either working on, or wanted to work on, plus other projects I don't want to work on, but my relatives do want me working on! I'll try to put a priority on it though. But the first draft won't be a step by step, but rather a summary of what has to be done. I have some reservations about the use of unBALANCE, the more I've thought about it. It's doable, but there are special issues that can come up, and I don't know what happens then. I first need to create a post in the unBALANCE thread with some questions I have, as to what happens in certain cases. The only simple case (I think!) is the case where the user doesn't use includes or excludes, all shares exist on all drives. Any other case is going to have extra issues and steps.
  11. This is not actually a feature request (something to add to the unRAID distro), but more of a request for an improvement in the unRAID release program. I doubt that Tom will like it(!), even if he recognizes its usefulness, because it clearly adds more work for him and his staff. But I think it's worth discussing.

      Increasingly, I've been thinking that we may be getting too close to the bleeding edge. We have strongly competing interests, driving interests, that can't be fully reconciled. We want the long-tested stability of a base system, the NAS, the rock-solid foundation of our data storage. But we also want the latest technologies - the latest Docker features, the latest KVM/QEMU tech, full Ryzen support, etc. - and that means the latest kernels. In my view, it's impossible to do both well. We're trying now, staying not far behind the latest kernels, but we're seeing more and more issues that seem related to less-tested additions and changes to the kernel. For example, a whole set of HP machines unsupported, numerous call traces, and other instabilities that I really don't think would be present in older but well-patched kernel tracks. But some users are now anxiously awaiting kernels 4.10 and 4.11!

      So what I'm proposing for consideration is moving to two release tracks, a stable track and a 'bleeding edge' track (someone else can come up with a better name). Currently, 6.2.4 would be the stable release, and 6.3.2 would be the leading-edge release. 6.1.9 was a great release, considered very stable; it was based on kernel 4.1.18, the 18th patched release of Linux kernel 4.1. 6.2.4 was a great stable release, based on kernel 4.4.30, the 30th patch release of 4.4. We're currently on 6.3.2, using kernel 4.9.10. What I would like to see is the stable release stay farther back, and only move to a new kernel track when it reaches a 15th or 20th patched release (just my idea). We can still backport easy features, cosmetic features, and CVEs. Then Tom can be free to move ahead with whatever he wants to add into the latest 'leading edge' releases, with a little less concern for the effects, because there's always a safer alternative. A nice side effect is that the stable releases won't need betas, just an occasional RC or 2 when the move up is larger than usual; practically everything added to them has already been tested in the newest releases. Small non-critical features can be added to both tracks, but larger or more risky features always go in the risky track, not the stable one, and are only added to the stable releases when clearly stable themselves.

      I know this is more work, and that may make it impractical. I wonder if Tom could concentrate on the latest versions, while Eric maintains and builds the stable releases. Finding ways to automate the build processes could help too. But this could keep 2 very different sets of unRAID users happy: those who want stability first and foremost, and those who want the latest tech. Naturally, there are many who want both, but I personally think it can't be done, not done well.
  12. There's a FAQ entry for that. Let me know if it needs improvement.
  13. The card or driver is broken, not working. Here are the relevant syslog sections. It looks like memory region conflicts (possibly with itself?!?). Both Microsoft and Linux kernel devs have gotten good about detecting and working around hardware 'quirks', and perhaps Microsoft has done a better job here. You can try reconfiguring the card, to see if it will improve its PCI declarations. Or perhaps Tom (@limetech) will have an idea, once he sees this. It could also be useful to see @ezhik's syslog, to compare the same sections.
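      If you want to look at the card's memory regions yourself, a minimal sketch (replace 01:00.0 with the card's actual bus address from lspci):

        # Show the card's BARs (memory regions) and whether they were assigned
        lspci -vv -s 01:00.0

        # Kernel messages about BAR or resource conflicts during PCI setup
        dmesg | grep -iE 'BAR|resource|conflict'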
  14. And if you add that, might be nice to have a timed logout, based on an activity timer (e.g. no user activity for 30 minutes - log the session out). I doubt many users would use it, but useful for special circumstances, shared machines.
  15. I wasn't sure what you meant, because drives with only 3000 hours are really young. Drives with 30000 hours are still in their prime. I don't think of drives as getting older until they are well over 50000 hours. I believe I've seen drives with over 70000 hours still going strong, no significant issues. Users tend to replace drives because of size issues more than age issues, I think. They outgrow them after a while.
  16. Del, I see you have only been here a few days, so I'll go easy on you! But you have seriously jumped too quickly to some very wrong assumptions about Lime Technology! As a long-time user, and I think I can speak for many other veteran users, your conclusions are totally different from ours. To have drawn some of your conclusions, you could not possibly have taken the time to review the numerous Release Notes for the close to a hundred-odd releases we have had. Tom has always been very responsive to users' needs. As you said yourself, the 'threat landscape' has changed recently, in the last 2 years I think, and that's when unRAID began closely following the CVEs. I just checked, and it was early to mid 2014 when they began, specifically for security reasons, adding patches and patched programs to the distro. Before that, they had been just staying current with Slackware.

      Yes, there are frank discussions now and then, but I wouldn't draw too many conclusions from that. They're based on a history of respect.

      Tom and Jon aren't around here that often, a bit more at the moment because of some issues that have cropped up. But they can be gone for weeks at times, and they only monitor a subset of the boards. In a way, we don't need them! Things run pretty smoothly around here. If you do need a response from them, it's best to email them directly at support. Otherwise, it could be a week or 2, if they even check the board you posted on. Also, I've never seen them duck a hard question. If there's no response, they either haven't seen it yet, or others have answered. Or they missed it; Tom does seem to skim-read at times. I did find your question above, and I have to say that's not a quick one. To fully answer it is more like writing a white paper! They're busy, give them time.
  17. You may have an Internet connection, but you are almost certainly behind a NAT router, not directly connected to the Internet. Normally, that NAT router is the only machine that faces the Internet, has a direct connection, and is under constant attack by numerous bots roaming the IPs of the Internet. You only have a local IP, for your local network. Only the router has your true IP, the one seen on the Internet. When your browser or NTP service (or any other Internet need you may have) needs to see something on the Internet, it makes a connection to an Internet server, and your router notes that connection and allows that server to respond, using the associated ports of your connection. The router will route those responses back to your machine, and not to any other. The outside bots and servers cannot attack or connect to your machine, because they can't even see it, and they don't know your local IP. The only contact that outside machines can have with your machine is strictly through connections your machine initiates first, through your router.

      Now if you *did* want to put your server directly on the Internet, most routers have a setting where they can put any machine into a 'DMZ', a special unprotected zone, which means the Internet is directly connected to any machine you choose! The router won't block any Internet traffic then, but will allow all of it to come through to you. But I would strongly advise you to first disconnect ALL of your drives, and back up your boot drive, because you will be very rapidly attacked! Never use the DMZ unless you have a lot of security experience!
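      You can actually see the two different addresses yourself. A minimal sketch (ifconfig.me is just one of several what-is-my-IP services):

        # Your local (private) address, handed out by the router
        ip -4 addr show

        # The public address the rest of the Internet sees - the router's
        curl -s https://ifconfig.me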
  18. It's a script, not a plugin. So you can put it anywhere, and run it from there. The root of your boot drive (/boot) is the common place.
  19. That's part of the new Preclear statistics function, apparently not working for you. Preclear has been updated, try updating it to the latest and see if it still happens. If it does, post again in the Preclear plugin support thread.
  20. The SMART reports look OK, no obvious major issues, but a few are a little suspicious, showing some wear and age. The parity drive is one I suspect may be close to having issues, because its seek error rate has dropped below average. The very old Seagates with over 60000 hours could also be wearing out, although their SMART numbers look OK. Several of the WD Reds seem to have a very long test time, which *might* be a problem. What I recommend is to try the Drive Performance Testing tool: download the diskspeed.sh script to your flash drive and run it. Defaults are fine.
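      If you want to check the attribute I mentioned yourself, a minimal sketch (replace /dev/sdb with the parity drive's actual device):

        # Full SMART report for the drive
        smartctl -a /dev/sdb

        # Just the seek error rate attribute
        smartctl -A /dev/sdb | grep -i seek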
  21. Earlier, you executed the command as /dskt, indicating it was in the root of the file system, but that error says it's not there any more. Did you reboot? If so, it's gone, because the root file system lives in RAM and anything stored there disappears on reboot. You'll need to download another copy, this time to the flash drive (/boot), which does persist across reboots.
  22. The parity check ran for 12 hours, between 9:30 and 9:30. That works out to 4 hours per terabyte (12 hours over what must be 3TB drives), which is too slow. It's normally 2 to 3 hours per terabyte: 2 or less for fast drives, and 3 for very slow drives. It's time to check the SMART reports for all of the drives, plus run a drive performance test using the disk_speed script. The syslog does not show any relevant errors at all. Instead of the syslog, we usually want the diagnostics, which include the syslog plus all of the SMART reports and a lot more.
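      As a quick first pass before the full drive performance test, hdparm's built-in read test gives a rough per-drive number. A minimal sketch (run on an idle array, replacing /dev/sdX for each drive in turn):

        # Rough cached and buffered sequential read speeds
        hdparm -tT /dev/sdX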
  23. Also, you have IDE emulation turned on for your onboard SATA drives. When you next boot, go into the BIOS settings and look for the SATA mode, and change it to a native SATA mode, preferably AHCI if available, anything but IDE emulation mode. It should be slightly faster, and a little safer. You have Disk 4 and Disk 5 on the same IDE channel, at limited speed.
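      After changing the setting, you can confirm from within unRAID that the controller came up in AHCI mode rather than IDE emulation. A minimal sketch:

        # In AHCI mode the controller shows as "SATA controller ... AHCI";
        # in emulation mode it appears as "IDE interface"
        lspci | grep -iE 'sata|ide'

        # The ahci driver should also announce itself during boot
        dmesg | grep -i ahci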
  24. This looks like it might be an old issue returned, involving extended attributes and AFP. I'm guessing you have accessed this drive over AFP at some point? Read the discussions and fixes found in these threads:
      • A "mover" issue? (Mover uses rsync too)
      • Running mover does not successfully move items
      • Extended Attributes Fix