DarkMain

Members
  • Posts

    35
  • Joined

  • Last visited

Converted

  • Gender
    Undisclosed

  1. 5 days and there have been no more crashes. Looks like the macvlan to ipvlan change was the fix. Cheers.
  2. I did see that in the patch notes, but a couple of things stopped me from changing it. 1 - Unless the setting was changed when I updated to 6.12, it's been macvlan for ages and never been a problem. I figured (perhaps incorrectly) that since it hasn't caused a problem in the past, why change it? 2 - The "help" says "The ipvlan type is best when connection to the physical network is not needed.". Maybe I'm interpreting that incorrectly, but my containers are a combination of host/bridge, and one is on br0 and has its own IP address. I was worried that by changing it to ipvlan I might break something, so I just left it. I'll give it a shot though and see how it goes.
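For anyone following along on a plain Docker host rather than Unraid's Settings page, the macvlan-vs-ipvlan difference the post above is weighing can be sketched with a custom network. This is an illustrative configuration only; the network name, interface, and subnet are assumptions to be adapted to your LAN:

```shell
# Hypothetical example: create an ipvlan (L2) network on a stock Docker host.
# Containers on it get their own IP on the LAN but share the parent NIC's
# MAC address -- unlike macvlan, where each container presents its own MAC
# (the behavior implicated in the macvlan call-trace crashes).
docker network create -d ipvlan \
  --subnet 192.168.1.0/24 \
  --gateway 192.168.1.1 \
  -o parent=eth0 \
  my_ipvlan
```

A container attached with `--network my_ipvlan` is still directly reachable on the physical network, which is why a br0-style container with its own IP generally keeps working after the switch.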
  3. So I rebuilt the docker image last night and when I got home from work today the server had crashed again. I've attached the new syslog. syslog-127.0.0.1.log
  4. Just out of curiosity, what in the log gave you that answer? And for my own peace of mind... are all the following messages OK? Dec 4 20:51:00 Tower kernel: BTRFS error (device loop2: state EA): parent transid verify failed on logical 335855616 mirror 2 wanted 5033596 found 5032512
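The "parent transid verify failed" line above can be decoded mechanically: btrfs expected metadata from generation ("transid") 5033596 but read an older copy from 5032512, i.e. the on-disk metadata is stale, which usually means dropped writes (bad cable/controller, power loss) rather than plain bit rot. A small sketch, using the exact log line from the post as embedded sample data:

```shell
# Extract the "wanted" vs "found" generation numbers from a
# "parent transid verify failed" kernel message. A found value behind
# wanted means btrfs read metadata older than its superblock expects.
line='BTRFS error (device loop2: state EA): parent transid verify failed on logical 335855616 mirror 2 wanted 5033596 found 5032512'

wanted=$(echo "$line" | grep -o 'wanted [0-9]*' | awk '{print $2}')
found=$(echo "$line"  | grep -o 'found [0-9]*'  | awk '{print $2}')
echo "metadata is $((wanted - found)) generations behind"
```

A gap of over a thousand generations, as here, points at a chunk of writes that never reached the disk, not a one-off flipped bit.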
  5. K, here's the syslog. Looks like it might be something to do with unassigned drives and a btrfs file system? Note: These drives were all working perfectly fine before the update. syslog-127.0.0.1.log
  6. Yesterday I updated to 6.12.5 from 6.11. Everything seemed to go fine, but when I got home from work the server had crashed. I reset it and it was running fine... I watched a movie from Plex and then went to bed, then today when I woke up it had crashed again. There have been no major changes to the system. Hardware is the same, and plugins were updated before the OS update, but that's it. I have attached the diagnostics to this post. Syslog server was not enabled during the first 2 crashes but it's on now (however, it doesn't seem to be writing anything to the local syslog folder). tower-diagnostics-20231203-1532.zip
  7. Update: The 870 EVO seems to be working fine as well (network speed inside the VM is lower than expected, but that's another issue to figure out). Strange thing is, I took one of the older SSDs that wasn't working properly in UnRaid, put it into a Windows machine, and I cannot for the life of me recreate the issue. So the problem has been 'solved', but I still don't quite understand why it was happening in the first place.
  8. Got the VM set up on the 8TB drive and I'm not able to recreate the issue using this drive either. This time it's set up as an unassigned drive (rather than a cache pool). It's formatted as btrfs; however, I'm pretty sure some of the SSDs I tested were also btrfs, so I don't think it's a file system thing (although I can't be 100% on that). Looks like I'm off to the store tomorrow to pick up an 870 EVO and see how that goes.
  9. Cheers, I'll update the thread in a couple of days with the results.
  10. K, so I have tested it with the IronWolf and I was unable to recreate the issue with my usual go-to method of making a VM and then copying a file to the running VM. This method has been pretty much a guarantee to reproduce the problem, so it's good news that it's not happening. I also tested some older VMs running on the IronWolf and they ran much better than on the SSDs. That got me thinking... All of the problems have been when using SSDs (I assumed even a bad SSD would be better than a mechanical HDD, so I never bothered testing with them). Right now I'm making the IronWolf into the parity drive, and once that's done (in about 18 hours) I'm going to use the old 8TB Seagate Barracuda (which, as you pointed out, is an SMR drive) and run the tests again. If I can recreate the issue, I can chalk it up to poor-performing drives; however, if the issue still isn't present with the 8TB, I can probably say it's an SSD-only issue... If that's the case, my next step will be to go and buy a "high performance" SSD and test that. Can I get a recommendation from you for an SSD that SHOULD work well in UnRaid? I keep seeing the 870 EVO and MX500 popping up as recommended drives but want to double-check. Cheers.
  11. I'll give that a shot and let you know how it goes. It's eventually going to be the new parity (and I have a 2nd one to replace the older drives in the array). I'm actually dealing with another issue right now... In the last few days a lot of my drives have started giving me UDMA CRC error counts. It's actually been since I put the new LSI 9201-16i 6Gbps 16P SAS HBA card in, so I'm going to have to remove that and put all the drives back onto the motherboard. I had to stop the parity rebuild as drive 5 was getting new error counts every few mins. Not having much luck with the system lately. I guess after, I dunno, 10+ years (when was UnRaid 3 released?) of no issues, they were bound to catch up to me.
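UDMA CRC errors support the cabling/HBA suspicion above: SMART attribute 199 counts transfers that failed the checksum on the link between drive and controller, so the same count climbing on many drives at once points at the shared controller or cables, not the disks. A minimal sketch of reading that attribute, with a sample `smartctl -A` line embedded so it runs anywhere (on a live system you would run `smartctl -A /dev/sdX` per drive):

```shell
# Pull the raw value of SMART attribute 199 (UDMA_CRC_Error_Count) from
# smartctl -A style output. The raw value is the last field on the line.
# The sample line below is illustrative data, not from the poster's system.
sample='199 UDMA_CRC_Error_Count   0x003e   200   200   000    Old_age   Always       -       57'

crc=$(echo "$sample" | awk '$1 == 199 {print $NF}')
echo "CRC errors on link: $crc"
```

Unlike reallocated sectors, this counter never resets; what matters is whether it keeps increasing after the cabling or HBA is swapped.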
  12. We're not talking about slow performance though. We are talking about a complete freeze in I/O operations. If I try copying a single large file, it will copy, let's say, 2 or 3GB, and then the performance will literally drop to nothing for 10+ seconds (and I get a whole bunch of CPU_IOWAIT errors, which makes sense, as my understanding is that error means the processor is waiting on the drives). Then the speed will jump up again, then back to nothing, yo-yoing up and down until the transfer is finished. It doesn't matter how the drives are used. Cache pool, unassigned drive, pass-through to a VM... it's always the same. It's a bad analogy, but think of it like a CPU that's throttling because of poor cooling. It gets too hot, so the CPU throttles and the performance drops... Because the performance has dropped, the CPU cools down; because the CPU has cooled down, the performance goes up again, but then it gets too hot and throttles... It's kinda like that, but much more aggressive. It doesn't seem to matter how the copy is initiated. Network transfer, Krusader in Docker... even the mover script has exhibited this behavior. The SSDs were previously used as Windows boot drives and they were fine then, so even if they aren't performance drives they should NOT be acting this way. It's not normal and I have never seen this behavior in a drive before. It's really driving me insane, and the fact that a brand new install of UnRaid exhibits the same behavior on two completely different (albeit old) systems makes it even harder for me to try and narrow down what's causing the problem.
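The poster's reading of CPU_IOWAIT is right: iowait is the share of time the CPU sits idle while a block I/O request is outstanding, so a stalled drive shows up as an iowait spike even though the CPU has nothing wrong with it. It comes from field 6 of the `cpu` line in `/proc/stat`, measured as a delta between two samples. A small sketch using two embedded sample lines (the jiffy counts are made up for illustration; live values come from `cat /proc/stat`):

```shell
# /proc/stat "cpu" fields: user nice system idle iowait irq softirq ...
# iowait percentage over an interval = delta(iowait) / delta(all fields).
s1='cpu 1000 0 500 8000 100 0 0 0 0 0'
s2='cpu 1100 0 550 8200 600 0 0 0 0 0'

iow1=$(echo "$s1" | awk '{print $6}')
iow2=$(echo "$s2" | awk '{print $6}')
tot1=$(echo "$s1" | awk '{t=0; for(i=2;i<=NF;i++) t+=$i; print t}')
tot2=$(echo "$s2" | awk '{t=0; for(i=2;i<=NF;i++) t+=$i; print t}')
pct=$(( 100 * (iow2 - iow1) / (tot2 - tot1) ))
echo "iowait over interval: ${pct}%"
```

Sustained values like this during a transfer, followed by near-zero, match the yo-yo pattern described: the queue drains, throughput jumps, the device stalls again.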
  13. As mentioned previously, the SSDs are NOT part of the array.
  14. Any idea why the SSDs would be going that slow?
  15. It's dropping to below 10MB/s at times, and Glances is giving me a "CPU_IOWAIT" error. I know that SMR drives are slower, but they shouldn't be dropping that low. (I'm actually in the process of swapping the parity to one that's not SMR, but got an error, hence the rebuild before swapping the drives.) It's not just the array though. It's ALL the drives that are exhibiting this behavior, even the SSDs in cache pools or unassigned devices. I'm kinda all out of ideas. It's getting bad enough that I've considered just getting a new server and starting again, but considering the issue followed me to completely different hardware on a brand new install of UnRaid, I don't want to spend a heap of money on a new computer just to find it still has problems.
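When every drive looks slow, a per-device sequential-write check helps separate a genuinely slow disk from a system-wide bottleneck (controller, filesystem, shares layer). A minimal sketch with `dd`; the target path is an example, point it at a file on the pool or drive under test. `conv=fdatasync` forces a flush before `dd` reports, so the figure reflects the device rather than RAM cache:

```shell
# Write 32 MiB of zeros and flush to the device before reporting speed.
# Writing through a file on the mount avoids destroying any data; never
# point of= at a raw /dev/sdX device that holds a filesystem.
out=/tmp/ddtest.bin
dd if=/dev/zero of="$out" bs=1M count=32 conv=fdatasync 2>&1
size=$(stat -c %s "$out" 2>/dev/null || stat -f %z "$out")
echo "wrote $size bytes"
rm -f "$out"
```

Running the same command against each cache pool, unassigned drive, and an array share gives comparable numbers; if they all collapse together, the drives themselves are unlikely to be the cause.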