Johann

Members
  • Posts

    74
Posts posted by Johann

  1. Here are the problems right now. Disk 2 is disabled. Disk 3 is rebuilt, but I'm uncertain whether the data is correct, since there weren't any writes to that disk for a while even though the other disks were being read. Parity 2 has dropped itself from the array and is currently "unassigned".

  2. I found this similar situation, but the thread didn't really identify the fix:

    Looks like the rebuild thinks it is continuing and is reading from the drives, but it's not writing to the rebuilding drive. I'll let it run its course. I think the next step, before doing anything else, would be to wait for it to finish "rebuilding" and then run xfs_repair -v /dev/mdX on each disk in maintenance mode (from https://docs.unraid.net/unraid-os/manual/storage-management/#xfs-and-reiserfs) to verify that all the filesystems are working. Once all the filesystems are confirmed good to go, I think I would keep disk 2 disabled, since the data should be intact on that drive, though I'm not sure why it had read errors. But is the next step rebuilding disk 3 on top of itself? The disk it is replacing is still in unassigned drives with its data intact and untouched.

     

    What would be the best course of action here? I'm going to wait for a response before I break everything...
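
For anyone reading along, here is a minimal sketch of the "check each disk" step as a script that only builds the command lines and does not run anything. The `/dev/mdX` naming is taken from the post above and may differ between Unraid versions. Adding `-n` (xfs_repair's no-modify mode) for a first read-only pass is my assumption, not something the post specifies, and the helper name and disk numbers are hypothetical.

```python
def xfs_check_commands(disk_numbers, modify=False):
    """Build one xfs_repair command per array disk.

    With modify=False the commands include -n (no-modify mode), so the
    filesystem is only inspected. Nothing is executed here.
    """
    commands = []
    for n in disk_numbers:
        cmd = ["xfs_repair"]
        if not modify:
            cmd.append("-n")          # assumption: check-only first pass
        cmd += ["-v", f"/dev/md{n}"]  # device naming as written in the post
        commands.append(cmd)
    return commands


# Hypothetical disk numbers, purely for illustration.
commands = xfs_check_commands([1, 3])
for cmd in commands:
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before typing anything into a console while the array is in maintenance mode.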

     

  3. Wow, I'm scared at the moment. I'm trying to replace my remaining 12TB with 18TB. I unassigned a 12TB and assigned a precleared 18TB, all fine, no issues, and the rebuild started. 4TB in, I got a notification that disk 2 has read errors and is disabled, with 2048 errors. OK, I'll let it rebuild and replace that drive afterwards; maybe it's dying. I woke up today and parity 2 has 563,197,701 errors, increasing by the second?! What is going on? I attached my diagnostics. In the years I've been using Unraid, I never had issues until the last few weeks! The current rebuild has 5 hrs left. It looks like the syslog is full because of "Mar  2 09:37:00 Toblerone kernel: md: disk29 read error, sector=30864609368". I just increased the log size to get more info. The temp on parity 2 has a star, I don't know why, but it's still marked green, "normal operation"?

    toblerone-diagnostics-20240302-0932.zip
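
When the syslog is flooded like this, counting the errors per disk can be easier than scrolling. Below is a minimal sketch that tallies lines matching the format quoted above. The first sample line is the one from the post; the second sector value and the unrelated line are made up for illustration, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in the post above.
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Return a Counter mapping md disk number -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "Mar  2 09:37:00 Toblerone kernel: md: disk29 read error, sector=30864609368",
    # Illustrative line, not from the real log:
    "Mar  2 09:37:00 Toblerone kernel: md: disk29 read error, sector=30864609376",
    "Mar  2 09:37:01 Toblerone kernel: some unrelated message",
]
errors = count_read_errors(sample)
print(errors)
```

On a real system the lines would come from the syslog file instead of the `sample` list.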

  4. I thought the backplane of my 4U case might not be getting enough power, so I took out some drives I was preclearing to get back to what it was before. The "Power-on or device reset occurred" error still persists. I recently switched HBAs from the 2008 to the 9500, so I switched it back along with the cables, and the error is still there. I'm really lost and don't know what's wrong.

  5. I don't think it's overheating; it gets pretty good airflow from the fans in the 4U case. I'll definitely have to try reseating it, but I have to finish rebuilding an existing drive first, only a few more hrs. Just looking to learn: where can I see that it's having problems communicating with some disks?

  6. 6 hours ago, MAM59 said:

    Turn on FLOW CONTROL on BOTH sides!

    Your Unraid is set to "Receive only", your Switch to "none". This will very likely end up in lost packets, timeouts and retransmissions.

    grafik.png.d7f14f5079342b2a05f13d8dbda2d068.png

     

    Thanks for the response! I just turned it on. I saw this post, where he says that having flow control on causes issues?

     

  7. 6 hours ago, Mainfrezzer said:

    The drops are caused by the macvtap interface.

    I can recreate them on different machines at will. The big issue is that you cannot disable macvtap if you don't need it. The versions where just eth0 is present work without an issue.
     

     

     

    https://forums.unraid.net/bug-reports/stable-releases/6126-macvtap-causes-consistent-package-loss-r2836/

    If you wanna contribute to it^^ 


    Edit: For the funsies, here is the result from my main pc, as before, completely different hardware, completely different cables.
     

     



    Here's a comparison with an older Unraid version
     

     

    Wow, thanks for the videos! This is definitely it. I have bridging disabled because I use macvlan, following the instructions here: https://docs.unraid.net/unraid-os/release-notes/6.12.4/. I like macvlan because it assigns a different MAC address. Is disabling bridging still the recommended way to use macvlan? I see the same behavior in the RX drops as you showed in the videos.
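
To compare RX drops across interfaces without eyeballing screenshots, here is a minimal sketch that reads the per-interface receive drop counter from text in the Linux `/proc/net/dev` layout. The sample text and its numbers are invented for illustration, and the function name is hypothetical.

```python
def rx_drops(proc_net_dev_text):
    """Map interface name -> receive 'drop' counter.

    In /proc/net/dev the first two lines are headers; on each interface
    line the receive columns are bytes, packets, errs, drop, ...
    """
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and len(fields) >= 4:
            drops[name.strip()] = int(fields[3])
    return drops


# Invented sample in the /proc/net/dev layout.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1234      10    0    0    0     0          0         0     1234      10    0    0    0     0       0          0
  eth0:  999999    5000    0  512    0     0          0       100   888888    4000    0    0    0     0       0          0
"""
drops = rx_drops(SAMPLE)
print(drops)
```

Running it twice a few minutes apart on the real file and diffing the results would show which interface is actually accumulating drops.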

  8. 6 hours ago, MAM59 said:

    usually drops at 10G mean: BAD CABLE!!!

    Make sure you have got a real 10G cable. Most of them sadly are "raw cables", which means the cable is fine, but the plugs are not capable of 10G.

    As a last resort the speed will drop to 5 or even 2,5 G to compensate the transmission errors.

     

    BTW: it has NOTHING to do with power efficiency. The link speed is the same for 2.5, 5 and 10G. Just the Usage/Pause times are different. So it is not wrong or uncommon that a switch reports a 10G link while the real used speed is lower. Make sure that Flow Control is turned on and working, else you will notice a lot of retransmissions and slowdowns.

     

    BTW2: if possible, avoid twisted pair 10G completely and go safe with fiber or direct connect SFP+

     

    BTW3: you HAVE a 10G connection, read your list correctly! the 10000 comes before 2500 and 5000. And this is just an offered list, the picked speed is 10000 as you can see below.

    grafik.png.ed759a956cd93ce1da7dd1bcdbaffdbc.png

    Ah yes, I didn't read the list correctly! I'll try a new cable and see if it fixes it. I did have an X540-T2 which worked well, but I did an upgrade and there was 10Gb on the motherboard, and I wanted to simplify and have more PCIe slots available. I agree with you, and I also prefer SFP+, much simpler and cheaper! Thank you!

  9. 7 hours ago, SimonF said:

    Eth tool is showing 10Gb

     

    image.png

     

    image.png

     

    Not sure why you are getting drops. There are comments about turning off power efficiency settings for windows, not found an option on linux yet.

    Ah yes, thank you. Right after I posted, I noticed that. I deleted the post because I just didn't see it at first for some reason; I assumed it listed the speeds in order. Thank you for pointing that out.

  10. I just upgraded to the Asus Z790 ProArt Creator, which has a 10Gb NIC built in. I have it connected via RJ45 at the board into an SFP+ to RJ45 adapter. On the Ubiquiti switch, it negotiates at 10Gb, and the webui main page says 10000. Using ethtool, however, I see that the Ubiquiti switch is asking for 10000, but the Asus motherboard only advertises up to 5000. I tried a different 10Gb switch which has RJ45 ports and ran ethtool again, and the same thing shows up, so I don't think it's any fault of the switches or cables? I included the diagnostics and the screenshot below. Any help is greatly appreciated! I get occasional dropouts when accessing the web UI and also over 500k read drops.

    Screenshot 2024-02-17 at 1.20.47 AM.png

    toblerone-diagnostics-20240217-0124.zip
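
Since the confusion here came from reading the link mode list by eye, below is a minimal sketch that pulls the negotiated speed and the advertised modes out of ethtool-style text. The sample is an illustrative excerpt in ethtool's usual layout, not the output from this server, and the function name is hypothetical. Note that the modes are not necessarily listed in order of speed, so the last entry is not the fastest.

```python
import re


def parse_ethtool(text):
    """Extract negotiated speed and advertised link modes (both in Mb/s)."""
    speed = None
    advertised = []
    in_advertised = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Advertised link modes:"):
            in_advertised = True
            stripped = stripped.split(":", 1)[1].strip()
        elif ":" in stripped:
            # Any other "Key: value" line ends the advertised-modes block.
            in_advertised = False
        if in_advertised:
            advertised += [int(m) for m in re.findall(r"(\d+)base", stripped)]
        match = re.match(r"Speed:\s*(\d+)Mb/s", stripped)
        if match:
            speed = int(match.group(1))
    return {"speed_mbps": speed, "advertised_mbps": advertised}


# Illustrative excerpt only; real output has more fields.
SAMPLE = """\
Settings for eth0:
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
                            2500baseT/Full
                            5000baseT/Full
    Advertised pause frame use: No
    Speed: 10000Mb/s
    Duplex: Full
"""
result = parse_ethtool(SAMPLE)
print(result["speed_mbps"], max(result["advertised_mbps"]))
```

Taking `max()` of the advertised list avoids assuming the last line is the highest speed.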

  11. Today, many of my services were down, with only some being up. I wasn't able to access the web interface, and when logging in over KVM, I couldn't get into the shell because of a login timeout error after X seconds. I was able to ping the server IP, but I wasn't able to SSH in either. On my PiKVM I did a shutdown, which I could see being initiated on the KVM output, but after 300 sec it says unclean shutdown. I have attached the diagnostics in case someone is able to help me out. I suspect it might be from running out of resources, but I'm not sure. Thank you in advance!

     

    I hope the unclean shutdown diagnostics are anonymous...

    toblerone-diagnostics-20231213-0951.zip
