Magmanthe

Everything posted by Magmanthe

  1. Ahh, yeah, now I see what you mean.. I think I read that section kind of fast and wrongly. I thought BOTH got disabled when using the M2_2-slot... Not (like you say) that either the SATA1 port is disabled IF it's a SATA M.2, or PCI_E6 is disabled IF it's an NVMe M.2.. Got it.. Thanks for pointing that out.. Anywho, the whole thing is still kind of mysterious.. Like WHY would UNRAID eject the drive if it's in the M2_1-slot, while it performs totally normally if the drive is in the M2_2-slot... Very strange indeed. But it is working now, and as long as I note it down in my Notes I will remember this the next time I poke around in there. Thanks again.. //Magmanthe
  2. Well, not according to the MSI manual..
     Source: https://download.msi.com/archive/mnu_exe/mb/E7B79v3.0.pdf
     On page 32/110 (the M2 page) it says:
     - SATA1 port will be unavailable when installing SATA M.2 SSD in M2_2 slot.
     - PCI_E6 slot will be unavailable when installing PCIe M.2 SSD in M2_2 slot.
     - M2_1 slot only supports PCIe mode.
     //Magmanthe
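The slot-sharing rules from the manual passage quoted above can be written down as a small lookup. This is only a sketch in Python: the function name `disabled_by_m2` and its argument values are made up for illustration, and it encodes nothing beyond the three sentences quoted from the manual.

```python
# Sketch of the M.2 slot-sharing rules quoted from the MSI manual.
# The function name and the "sata"/"pcie" labels are hypothetical.

def disabled_by_m2(slot, drive_type):
    """Return the set of connectors the quoted manual text says become unavailable.

    slot: "M2_1" or "M2_2"
    drive_type: "sata" or "pcie" (an NVMe drive counts as "pcie")
    """
    if drive_type not in ("sata", "pcie"):
        raise ValueError(f"unknown drive type: {drive_type}")
    if slot == "M2_1":
        if drive_type != "pcie":
            raise ValueError("M2_1 slot only supports PCIe mode")
        # The quoted passage lists no trade-off for M2_1.
        return set()
    if slot == "M2_2":
        # Either one or the other is lost, never both at once.
        return {"SATA1"} if drive_type == "sata" else {"PCI_E6"}
    raise ValueError(f"unknown slot: {slot}")


if __name__ == "__main__":
    for slot, kind in [("M2_1", "pcie"), ("M2_2", "pcie"), ("M2_2", "sata")]:
        print(slot, kind, "->", sorted(disabled_by_m2(slot, kind)))
```

Read this way, an NVMe drive in M2_2 costs only the PCI_E6 slot, and the SATA1 port stays usable.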
  3. So just a quick update. The plan was to get a new NVMe (probably a Samsung one). To prepare for this, I removed the small EVO 250GB from the M2_1-slot. Then I installed the "old" Corsair Force MP510 in the M2_2-slot (so that the "main slot" would be free and clear for when the new NVMe arrived in the mail). Put the server back in its room and started it up. Yes, on the first boot it didn't understand much of what was happening (due to removing one drive and adding another), but after stopping the array, assigning the Corsair as the cache and then starting the array, I did a full server reboot. Once the server came back up, I was fully expecting it to eject the drive (as shown in the diags earlier in the thread) and was prepared to stop the array, remove/add the drive and start it again. But to my surprise, now it just worked, totally fine and normal. No ejection of the drive, no I/O error, no nothing.. I was very surprised. So I tried some more reboots and it still holds strong. Very strange indeed. So for now, I guess I don't have to get a new NVMe.. (Just need to remember that the SATA1-port and PCIe6-slot are now disabled because I'm using the M2_2-slot..) Again, thanks for the help, @JorgeB... Much appreciated //Magmanthe
  4. Happy Weekend! Tried with a new (well, used) NVMe. Turned off the server, removed the 1TB Corsair Force MP510 (which did work on the "old" HW), installed a 250GB Samsung 960 EVO, and turned the server back on. After boot I took out a diagnostics zip. The Samsung EVO was discovered as unassigned. I stopped the array, set the EVO as cache drive and started the array. Had to format the drive (as it was not XFS) and then it just kind of worked. I then installed 3 random apps/dockers (because that data lives on the cache drive) and then restarted the server. Now UNRAID boots totally normally and detects and treats the 250GB EVO as a normal cache drive. No ejection or anything. After boot I took out another diagnostics zip (for comparison, but that might not be needed). Just based on my experiences here, it does seem that some combination of, I guess, the MB, Unraid and that specific Corsair series of NVMe drives just doesn't play nice together? Or? //magmanthe magnas-diagnostics-after_NVME_setCache.zip magnas-diagnostics-firstBoot after nvme-swap.zip
  5. Hi, Done. Attached are the diags from right after a reboot, nothing else. //magmanthe magnas-diagnostics-20210609-1931.zip
  6. Hi, Thanks for the quick reply, however I have to report NoJoy on this. Added the line to the Syslinux config and rebooted, and the same thing happens when it boots back up. New picture taken from the attached monitor. Regarding BIOS: I upgraded the BIOS to the latest from MSI (released 31st of May, I believe) before booting into Unraid for the first time, so that is already covered. However, something peculiar: if I stop the array, unassign/remove the cache from the pool dropdown list, re-add the NVMe to the cache pool, and start the array, it does seem to work normally again. All dockers (that write to the cache) function (opening KRUSADER, I can browse the cache drive). So I mean, yeah, it works, if I take the time to stop/remove/add/start the array, I guess... 🤔 Any clues? //Magmanthe
  7. Hey.. So long time no hear, but there's been a bunch of stuff going on. First I tried my other HBA to see if it works, but that had some kind of major incompatibility (I think with the mobo), as no matter which PCIe-slot I stuck it into, nothing happened. I got no life in it, nothing in BIOS, nothing… Put it back in my computer, and it works perfectly again. So I thought, let's just skip the HBA and do SATA, but this also caused some issues. It seems that some of the SATA-ports on the mobo were dead/not working, and that meant I could not attach all the drives. So I thought fuck it.. The HW is around 6-7 years old and it was around the time of my birthday, so I'll gift myself some new parts.. Got a new CPU, motherboard and RAM, and I also ordered a new HBA.
     - CPU - Ryzen 5 3600
     - MB - MSI x470 Gaming Plus Max
     - RAM - Crucial Ballistix 32GB
     - HBA - LSI 9211-8i
     After I got all the parts I put it together and started the server. The HBA is detected, and all drives are also detected through the HBA-card, so that's good. However, with all the back and forth with the previous HW and the multiple attempts at rebuilding parity, which probably failed due to the old faulty HBA-card, the data on Disk 1 is gone. Once the server started with the new HW, it finished up a parity rebuild, but Disk 1 was now empty. Kind of sucks, but such is life.. But now there is 1 new problem, which has to do specifically with the M.2 drive I use as cache drive. But I'm making a new post for that, as this one can be closed. Link to new post. I would like to thank trurl and JorgeB for the initial help and support with the previous HW-issue regarding the HBA-card.. Thanks a bunch. //magmanthe
  8. Hey, so I had some issues with my old UNRAID HW.
     TLDR: I had some issues with an HBA as well as old HW, so I got some new HW, and now I have some problems with the cache drive.
     I upgraded the hardware to the following:
     - CPU - Ryzen 5 3600
     - MB - MSI x470 Gaming Plus
     - RAM - Crucial Ballistix 32GB
     - HBA - LSI 9211-8i
     - plus my old NVMe from the previous system - Corsair Force MP510 1TB
     This was also the cache drive in my "old" Unraid server. So, long story short, I started the new server, it boots nicely, and I can log in (checked from the WebGui). However, there is a problem with the NVMe cache drive. The MB BIOS sees the Force MP510 drive fine. Unraid also sees it, but it sort of "ejects it"? I dunno 🤷‍♂️.. Also, if I press the "Mount" button (see picture) nothing happens: it starts to mount (with the spinny-circle) for like 0.5 seconds before it turns back into the Mount button.
     Attaching:
     - pic from the server (taken just after auto-download of the keyfile and log-in)
     - Unraid WebGui screenshot
     - logfile (without GO-file due to cleartext username and PW)
     The cache drive (NVMe) worked fine before the HW-upgrade. I am using the M2_1-slot on the MB. It also has another slot, M2_2. I have NOT yet tried to switch it around, I'm trying here first. Reason: using the M2_2-slot will disable SATA1 and one of the PCIe-slots, and I'm trying to avoid this. Thanks for any help and input from you all. //Magmanthe magnas-diagnostics-20210608-1613.zip
  9. Well, would you look at that... So that would account for why the disk(s) sometimes "fall out" of the array, if I'm understanding that correctly.. But would that also account for the lock-up of the system, making the WebGui totally unresponsive, unreachable and unpingable on the network? I mean, the Unraid OS runs off a USB stick, and that is directly connected to the I/O on the motherboard.. The HBA only deals with the disks, correct? I mean, if I can replace the HBA, that is by far the cheaper option compared to replacing the MB/CPU/RAM setup... UPDATE: When I bought the HBA I actually got two cards, 1 for the server and 1 that I use in my PC. In the server there is an LSI SAS 9207-8i and in the PC there is a Dell PERC H310. I can try to just swap over the Dell PERC card and see if the server stabilizes. Also, is there any way to find out if the whole HBA-card is fubar, or if maybe it's localized to 1 of the 2 "ports" or connections on the card? //Magmanthe
  10. In Root-flash there is no syslog. There is a folder called "logs" and there is a syslog in there, attached. //magmanthe syslog.7z
  11. Yeah and it contains very little useful information like I've said thrice now.. But if you can find anything useful please let me know //Magmanthe syslog-127.0.0.1.log
  12. Well, I only have the one from the diagnostics zip and the one that's in the appdata, but that syslog tells nothing. It just says the server is up!
      May 7 11:31:23 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
      May 7 12:19:16 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
      May 7 12:19:22 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
      May 9 09:26:52 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
      May 9 09:26:57 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
      Unless there is a third syslog somewhere? Anywho.. I restarted the server this morning, and now I'm pretty sure that with all the rebuilding, the locking up of the system and the hard resets (power switch), it's generally not having a good time. Is there ANY way to get some kind of wireframe/folder structure fished out (from the parity disk or something)? I know the data might be gone, but if I could get some kind of overview of the root-folder structure, that would help a lot. I do still have the old MB/CPU/RAM from my first Unraid box lying around somewhere. I'll see if I can swap some parts around, but without knowing what is failing it might not work. Also, I don't think there is an M.2 slot on that old stuff, so I cannot move the cache over to test. //Magmanthe
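To check whether a syslog like the one quoted above holds anything besides start-up messages, a short script can tally the matching lines per day and collect everything else. A minimal sketch in Python, assuming the `Mon D HH:MM:SS host message` layout seen in the pasted lines; the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# "May 7 11:31:23 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock"
LINE_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+(?P<msg>.*)$"
)


def summarize(lines, needle="Server is up!"):
    """Count lines containing `needle` per day; return (counts, other_lines)."""
    per_day = Counter()
    other = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match and needle in match.group("msg"):
            per_day[f"{match.group('month')} {int(match.group('day'))}"] += 1
        else:
            # Anything that is not a plain start-up message is worth a look.
            other.append(line)
    return dict(per_day), other


# The five lines pasted in the post above.
SAMPLE = [
    "May 7 11:31:23 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock",
    "May 7 12:19:16 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock",
    "May 7 12:19:22 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock",
    "May 9 09:26:52 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock",
    "May 9 09:26:57 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock",
]

if __name__ == "__main__":
    per_day, other = summarize(SAMPLE)
    print(per_day)
    print(other)
```

For the five pasted lines this gives three matches on May 7, two on May 9, and no other lines, which fits the observation that this particular log carries no error information.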
  13. The previous ZIP I added should contain any errors, because the syslog feature is active. However, there are a million folders and files in there, so I don't know where or what to look for. Seeing that the HW in there now is from ~2014, it is getting old, but it's not like it's ancient. I'm thinking that IF there is a HW-error it should be in either the MB, CPU or RAM? But have you heard of other HW-components causing these kinds (or other kinds) of problems? (PSU, LSI-card, GPU or other?)
      - CPU - i7-4790K
      - MB - ASUS Z97-DELUXE
      - RAM - HyperX Fury 4 x 8GB DDR3
      - LSISAS2308
      As you can see, the HW in there now is not amazing, but it is good enough for my use case of NAS/fileserver/NextCloud/etc. It was actually my daily-driver PC up until January of last year, when I bought new PC-HW and this system got "demoted" to Unraid system... Dang, this was annoying AF... //Magmanthe
  14. Hey again, Data has been copied out to spare drives (for the most part). I stopped the array, removed Parity 1 and Disk 1 (6TB), started the array, stopped the array, then added only Disk 1 back in (and left the Parity 1 disk out), and started a rebuild of the 6TB drive. (I did this around 9-10ish this morning, Saturday.) Now at around 12 (3 hours later) I came back to check, and the system is all locked up (again). The WebGui does not respond and I'm back to the picture from earlier in the thread (posted Thursday at 05:06 PM). Does any1 have a clue WTF is going on? A software error? Some of the components (i.e. hardware) that are broken? Where do I continue with the troubleshooting? //Magmanthe
  15. I'm guessing UD means Unassigned Devices? As you can see, in the system I have:
      - 1 x 6TB
      - 2 x 4TB
      - 1 x 1TB
      I also have 2 x 4TB (that were intended as replacements for the 2 x 4TB in the system, as they are old and have like 50k hours on them). So for now the plan is to copy out the data from the 2 x 4TB and the 1TB (this is the most crucial data). The 6TB only has TV, so nothing important. Once that is done I will try any and all mitigations to fix the issue. I am somewhat unsure how, but I'll try rebuilding first, and if that does not work, maybe some other ideas.. The last resort is just wiping everything and starting a new build, which would suck due to all the nice customization I've done (hours of tinkering and following SpaceInvader guides).. My suspicion is that it might be some HW-error that is not being picked up by the syslog (MB gone haywire, PSU or cables fucking things up? HBA-card?).. Once (like 15 years ago) I had an MB in my first computer where the transistors on it started bulging from the printboard, so I've seen MBs just crapping out before.. //Magmanthe
  16. Yeah, bunch of stuff happening now.. (all bad).. I know UNRAID is not a backup solution per se.. but compared to what I used to have (everything on a disk in my computer), with the data in the server "protected" by 2 parity disks, I did consider it somewhat more secure than having it in my PC. However, with what is happening now, I'm copying out as much data as I can, just in case... Attaching the latest syslog (without go-file). Woke up this morning, and there was no change, so I cut the power to the server and then turned it back on. During boot it came to the login screen, and after that it reported lots of XFS errors. I tried restarting again, and the same happened again. Restarted again, and this time I chose to start in safe mode without any plugins. It started, but on the array screen every disk was missing from the overview 😱 I pressed the reboot button (in the GUI) and this time it started "somewhat normally". However, the Parity 1 disk is still not valid and my Disk 1 is being emulated. My Disk 1 says "unmountable: not mounted". So before I do anything else, I'm now copying out the data to some external drives.. //Magmanthe magnas-diagnostics-20210507-1717.zip
  17. So I'm back with an update.. I came back from work today to check the rebuild progress from yesterday, and lo and behold, the "unresponsiveness" is back. The system is all locked up, both from the WebGui and from the directly connected mouse/keyboard/screen. However, instead of hard-resetting and turning the system off/on right now, I will leave it on until tomorrow morning (Friday), just in case the actual parity and disk rebuild is still going on in the background. From experience I know it should take around 1 day and ~3-5 hours, and I started it at around 22 yesterday.. So around midnight tonight (if it is still going) it will be done. However, it does not seem that the RAM sticks are at fault, as I now have 4 fully functional sticks from what I tested yesterday. Any particular BIOS settings that might interfere? (I remember reading about AMD and C-states that might make the system unresponsive, but I'm running Intel so it shouldn't be that.) Other HW-issues, like an MB error perhaps? I turned CPU graphics on in the BIOS because of the locally connected screen via HDMI (or maybe it was DP, but anyway, directly to the I/O of the MB), with a mouse/keyboard. The screen there is also "stuck" on the log-in screen (as you can see), and after the system locked up (sometime during the night, or during the daytime when I was at work) that too became unresponsive.. Any help is appreciated //Magmanthe
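The rebuild estimate in the post above is plain date arithmetic: a start at around 22:00 plus 1 day and 3 to 5 hours. A small Python sketch follows; the start date of 5 May 2021 is an assumption (the post only says "yesterday"), and `rebuild_window` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def rebuild_window(start, min_hours=27, max_hours=29):
    """Return (earliest, latest) finish time for a rebuild started at `start`.

    The defaults encode "1 day and ~3-5 hours", i.e. 27 to 29 hours.
    """
    return start + timedelta(hours=min_hours), start + timedelta(hours=max_hours)


# Assumed start: around 22:00 on the evening before the post was written.
start = datetime(2021, 5, 5, 22, 0)
earliest, latest = rebuild_window(start)

if __name__ == "__main__":
    print("earliest finish:", earliest)
    print("latest finish:  ", latest)
```

Under that assumed start, the window falls between 01:00 and 03:00 on the second night, so a little after the "around midnight" in the post, which supports leaving the server untouched until the following morning.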
  18. Rebuild in progress.. Now we play the waiting game. Thanks a bunch for the input and help, @trurl, much appreciated...
  19. Yeah, no, I don't take them physically out of the server.. I remove them in software from the array when it is stopped, then start the array. Then I stop the array again and add them back in. Okay, so that's how I have been doing it before too, so I know how to do it.. However, earlier it was always only 1 disk that was "lost" (disabled), and it was also not a parity disk, so rebuilding it was not an issue. Now it is a parity disk and a regular one. Do I build them back into the array at the same time, or take 1 disk at a time? //Magmanthe
  20. Hi again, So I've now run MemTest on all 4 sticks, and wouldn't you know... 1 of them was really bad. 3 passed fine without any errors, but the last is garbage, as you can see from the pix.. Luckily I did have a spare, so no biggie.. So thanks for pointing me in that direction, much appreciated.. Didn't think of that at all.. However, when starting the server now, the Parity 1 disk and Disk 1 are still marked as disabled for some reason.. Anything else I can do? //Magmanthe
  21. Okay. But for all intents and purposes there is nothing wrong with doing it, as long as the include and exclude rules don't overlap. But let me ask you then: if audiobooks is set to include Disk 3, and Disk 3 is full, what happens when I shove in MORE audiobooks? Which disk does it begin to put that data onto? Chronologically the first disk? The next disk in line after Disk 3? None, as I set include Disk 3, and thus, if it is full, it just discards the data? //Magmanthe
  22. Quick update.. I know to check the sticks separately, but I did a quick check with all 4 (just as a sanity check), and it spit out errors.. So there is for sure something wrong with one (or more), so I'll pull the sticks and check them one by one.. Thanks for the tips, didn't even consider it to be a RAM error... //Magmanthe
  23. Yeah, I am aware of the include/exclude, but that is also a bit of my OCD: I am forcing specific data onto specific disks without any possible spill-over. So I will respectfully disagree: there are scenarios where you use both include and exclude. But this is also beside the point. Disk inclusion or exclusion for certain shares does not have anything to do with a locked-up system with disks falling out of the array? (assumption) No, I have not run memtest.. Maybe I'll give it a whirl and see what it spits out. I'll come back in a few days after MemTest is finished. //Magmanthe