Posts posted by Vr2Io

  1. 4 hours ago, casperse said:

    I got message that even at 30% load the fans will be LOUD!

    I think so; I tested that too. I replaced those fans with some cheap PWM fans (1200 rpm max) and currently have an over-temperature issue, but my rack doesn't run 24x7.

     

    4 hours ago, casperse said:

    Do you have alternative fans to recommend?

    I would prefer replacement fans at around 3000 rpm; Noctua or Silent Wings Pro 4 would be my first choice.

     

    One problem you need to address: those fans are thinner than the bundled fans, so if the fan is mounted to the housing, a big gap will exist between the fan and the fan plate. You may need to mount the fan to the plate for better performance.

     

     https://www.bequiet.com/en/casefans/3705 ( it has a speed switch, so I think it should have a wide speed adjustment range )

     

    image.thumb.png.d2e8f1f579530948075ae21adadb88f4.png

  2. 20 minutes ago, casperse said:

    I will try to take a picture of the Fan when I get home (Should be some sticker with data on them)

    I enlarged the photo and can confirm it is 3 A, so please don't plug it directly into the motherboard.

     

    image.png.3d9fd93f44df97230001024f4182d319.png

     

    20 minutes ago, casperse said:

    Just got a strange message from the vendor, claiming that I should be able to control all fans by a cable between the backplane and the MB using the CHA_FAN pin?

    It is possible if the backplane / fan controller supports connecting to a motherboard fan header for control. External or third-party fan controllers usually won't support high-power fans.

     

    20 minutes ago, casperse said:

    I might end up swapping them all with Noctua fans like many others....

    Keep in mind that airflow / pressure will drop a lot, even with the Noctua industrial type.
     

     

  3. AMD 3000G / A320 chipset may not support ECC memory.

     

    The table below ( not for A320 ) also indicates that ECC support differs by CPU and chipset combination.

     

    image.png.b210a683d244ffb5f66d18e5699df0f9.png

     

    I believe it is more likely a motherboard problem ( or some other unknown cause ) than memory.

    Could you try copying some files to /tmp ( RAM disk ) and then monitor whether the file content / hash changes over time ?
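A minimal Python sketch of that check, assuming a test file has already been copied to /tmp. The function names, the SHA-256 choice and the path in the usage note are illustrative, not something from the thread.

```python
import hashlib
import time


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def watch_file(path, rounds, interval_s):
    """Hash `path` `rounds` times in total, `interval_s` seconds apart.

    Returns a list of (round_index, digest) for every reading that differs
    from the first one. An empty list means the content stayed stable.
    """
    baseline = file_digest(path)
    mismatches = []
    for i in range(1, rounds):
        time.sleep(interval_s)
        digest = file_digest(path)
        if digest != baseline:
            mismatches.append((i, digest))
    return mismatches
```

For example, `watch_file("/tmp/testfile.bin", rounds=24, interval_s=3600)` would re-hash a (hypothetical) test file once an hour for a day; any entry in the returned list means the bytes changed while the file sat in RAM.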

     

    A memory problem also doesn't make sense here:

    - If it were a memory problem, the system would crash, not just corrupt files.

    - If files on the array disks were being corrupted, I would expect system files / files on the USB to be corrupted too, and as a result the system would also crash.

     

    I had a strange experience with a newly built platform: all tests were fine and it ran well, but when I installed an NVMe drive, files copied to it were corrupted immediately, regardless of which NVMe I used ( no error log at all, not even PCIe errors, and no problem on SATA disks ). The problem suddenly disappeared after a few days of troubleshooting. In the end I RMA'd the motherboard and the problem never happened again.

     

     

  4. On 1/7/2024 at 11:13 AM, MACGoof said:

    It ran fine in safe mode for ~9hrs today.

     

    2 hours ago, MACGoof said:

    I ran memtest this weekend and it passed.


    Tested with some other ram I have and it crashed then as well.

     

    Did you use Unraid's built-in memtest ? You must get a solid memtest pass first.

     

    I checked your diagnostics and the memory configuration has a problem: there is no reason for a memory stick to run at 3200 MT/s at just 1.2 V. ( Please tune the BIOS memory settings, or simply disable XMP so the memory clocks down. )

     

        Configured Memory Speed: 3200 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
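A rough Python sketch for pulling these two fields out of `dmidecode --type memory` text, assuming the output uses the field names shown above. The `flag_dimms` rule only mirrors the rule of thumb in this post; the `speed_limit_mts` threshold is a placeholder you choose, not a spec value.

```python
import re


def parse_dimms(dmidecode_text):
    """Return (speed_mts, voltage_v) for each block that reports both fields."""
    dimms = []
    # dmidecode separates each "Memory Device" record with a blank line.
    for block in re.split(r"\n\s*\n", dmidecode_text):
        speed = re.search(r"Configured Memory Speed:\s*(\d+)\s*MT/s", block)
        volt = re.search(r"Configured Voltage:\s*([\d.]+)\s*V", block)
        if speed and volt:  # empty slots report "Unknown" and are skipped
            dimms.append((int(speed.group(1)), float(volt.group(1))))
    return dimms


def flag_dimms(dimms, speed_limit_mts, voltage_v=1.2):
    """Return DIMMs running faster than the chosen limit at or below voltage_v."""
    return [d for d in dimms if d[0] > speed_limit_mts and d[1] <= voltage_v]
```

With the values quoted above, `flag_dimms(parse_dimms(text), speed_limit_mts=2666)` would list the 3200 MT/s / 1.2 V stick as worth a second look in the BIOS.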

     

     

    2 hours ago, MACGoof said:

    What are normal temps for the drives? Most are ~35 C but my parity drive can get ~45 C. Just trying to think of anything here. 

    This is not a problem.

  5. 7 hours ago, zali01 said:

    Question 1: Was the RAM test that was done through the ASUS BIOS an inferior test? Why would that test pass and the unraid memtest detect errors/fail?

    That depends on how robust each memtest program's testing is; different software uses different test algorithms.

     

    Unraid has updated the built-in memtest86+ version since 6.12.5. The old version was unreliable, and I always used a third-party memtest program.

     

    Anyway, it all depends on which test program is used, not on whether it comes from the BIOS, Unraid or anything else.

     

    image.png.ee897608b80b403141986336ce3b0329.png

     

    https://github.com/memtest86plus/memtest86plus?tab=readme-ov-file#memory-testing-philosophy ( your BIOS memtest comes from PassMark Software; they are a competitor )

     

    image.png.c7bceb46eecb0ed0e7a7ba652ddc4bae.png

     

     

  6. If the memory tests fine, I believe some other issue is causing the corruption.

     

    As mentioned, if a static file just sits in storage, there isn't much reason for it to become corrupted. Could you set it to read-only and monitor whether its content changes ?

     

    Please ensure no one copies / backs up to the same destination, and that nothing is copied from a user share to a disk share or vice versa.

  7. I agree a 24-hour memory test is already enough; in fact, I usually test for only several hours or several passes. I have never had ECC memory.

     

    The problem is that not much troubleshooting info has been provided, so I don't know which direction to suggest for a fix.

     

    When you download a file to Unraid, what are the hashes at the source and the destination ? How long does it take to become corrupted ? If it is a static file, it is also hard to imagine why it would become corrupted later.

     

    Please run a test with rsync: copy some files to /mnt/user/test1/

     

    then

     

    rsync -ah /mnt/user/test1/ /mnt/user/test2/

     

    Does it succeed without errors ?
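Once the rsync run has finished, both copies can be re-read from storage and compared by hash. This is a small Python sketch of that comparison; the function names and the SHA-256 choice are assumptions for illustration, and the directories are the test1/test2 paths from above.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h = hashlib.sha256()
            with open(p, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[p.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(src, dst):
    """Return (missing_in_dst, extra_in_dst, content_mismatch) as sorted lists."""
    a, b = tree_digests(src), tree_digests(dst)
    missing = sorted(set(a) - set(b))
    extra = sorted(set(b) - set(a))
    mismatch = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return missing, extra, mismatch
```

Calling `compare_trees("/mnt/user/test1", "/mnt/user/test2")` right after the copy, and again some hours later, would show whether files differ immediately or only go bad over time.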

     

     

  8. None of the above is the reason for file corruption. Are you accessing the files over the network ? Please check the network side too.

     

    Locally ( on Unraid ), you can hash a file, e.g. b3sum <file>, and keep track of whether the hash result changes between runs.

  9. 1 hour ago, casperse said:

    Anyway I am trying to figure out how to control the case fans? (THEY ARE VERY VERY LOUD !!!)

     

    Those server fans may draw 1.5 A to 3 A; don't plug them directly into the motherboard.

    For me, I use PWM fans ( 4-wire, but most bundled server fans won't be the PWM type ) and split the TACH & PWM ( speed control ) wires to one of the motherboard fan headers ( 2 wires ). Power ( Molex ) is connected directly to the PSU. That way, speed adjustment is synced across all 4 fans.
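A bit of arithmetic shows why the power has to come from the PSU. In the Python sketch below, the 12 V supply is the usual fan voltage, and the 1 A header limit is only an assumption (a common rating, but check the board manual for the real figure).

```python
FAN_VOLTAGE_V = 12.0  # typical PC / server fan supply voltage


def fan_power_w(current_a, voltage_v=FAN_VOLTAGE_V):
    """Electrical power drawn by one fan, in watts (P = V * I)."""
    return current_a * voltage_v


def header_overloaded(fan_currents_a, header_limit_a=1.0):
    """True if the summed fan current exceeds the header's rated current.

    header_limit_a=1.0 is an assumed rating, not a value from the thread.
    """
    return sum(fan_currents_a) > header_limit_a
```

A single 3 A fan works out to `fan_power_w(3.0)` = 36 W, and `header_overloaded([3.0])` is true under the assumed 1 A limit, so only the TACH and PWM signal wires should go to the header.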

     

    Below is a sample of what those splitters look like.

     

    image.png.29114eeb26e5221aa08887b028bc32e2.png

     

    image.thumb.png.b2230fe67ee4ba88d5b3fde82f583a78.png

  10. I have had a similar experience: BIOS POST would sometimes get stuck and not boot. Resetting several times usually solved it; if not, I had to power off and reseat some PCIe cards, which also fixed it. Because the problem was random and intermittent, I assumed it was a motherboard issue, and since I didn't have another motherboard supporting the same CPU, I lived with it for a long time.

     

    But one day I decided to track it down ..... and finally identified that the problem came from the LSI 9201-16. I also remember another platform once having a similar problem with this HBA. That means I had wrongly blamed everything on an unstable motherboard and relegated it to the test bed.

  11. You have many ARR applications and Plex; could you disable those and monitor whether it still crashes ? If it still crashes, you may need to disable the Docker service. The aim is to identify whether the problem comes from Docker.

     

    Your system temperature is in the 60s ( °C ); is this true, or is the wrong sensor selected ? ( It doesn't make sense for the CPU temperature to be only in the 70s at 50% load while the system temperature is in the 60s. )

  12. 24 minutes ago, hansolo77 said:

    If I wake up tomorrow and the system has crashed again, how can I recover a log?

    Please enable syslog mirroring to flash, so you will have the previous log even after the system reboots.

     

    25 minutes ago, hansolo77 said:

    The only change in hardware is a new drive. The system has been very stable for a long time. I moved into my own apartment back in June and haven’t had a single issue. 

    Got it.
     

    26 minutes ago, hansolo77 said:

    Could the errors be caused by the drives starting to fail from usage?

    Could be; SSDs have a write limit, but a general check shows the cache pool NVMe drives are in a good state even though they have written a lot.

     

    Percentage Used:                    13%
    Data Units Read:                    392,859,154 [201 TB]
    Data Units Written:                 1,081,961,255 [553 TB]

     

    Percentage Used:                    13%
    Data Units Read:                    392,129,855 [200 TB]
    Data Units Written:                 1,079,967,300 [552 TB]
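The bracketed TB figures follow from how NVMe SMART counts data: one data unit is 1000 blocks of 512 bytes. A short Python sketch of the conversion (the function name is just illustrative):

```python
NVME_DATA_UNIT_BYTES = 512 * 1000  # one NVMe SMART data unit = 512,000 bytes


def data_units_to_tb(units):
    """Convert an NVMe 'Data Units Read/Written' counter to decimal terabytes."""
    return units * NVME_DATA_UNIT_BYTES / 1e12
```

For the first drive, `data_units_to_tb(1_081_961_255)` comes to about 553.96, which matches the 553 TB shown in the log above.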

     

  13. Call traces were found in the previous log. Did you perform a memory test to prove system stability ?

     

    1 hour ago, zali01 said:

    it seems some folks who had other unrelated issues had some success with using an earlier version of Unraid. Is this something I should try? 

     

    Yes .... some posts say newer Intel platforms tend to be unstable with the current Unraid, but I can't verify that because I am still on 9th gen.

  14. The log has no errors and is from a fresh boot, so no crash info has been provided.

     

    It seems the system instability has lasted a long time ( according to your old post ). Some thoughts below:

    - 4 memory sticks running at 2666 MHz may be a problem; please run a memory test or clock them down to make sure they are stable enough.

    - With a GPU, 10G NIC, expander and lots of spinning disks, please ensure the PSU can power them well.

     

    Could you test the whole system without the spinning disks, or with only a minimal number ? Best to remove the HBA and expander and then test the remaining components ...... once everything checks out, add them back.

  15. From the description, the memory sticks are good and just can't run as a dual-stick setup on the current motherboard; that's why I suggested clocking the memory down. Those problems are quite common, and RMA'ing the sticks won't help much.

  16. 14 hours ago, Oublieux said:

    This might sound silly, but how does this plugin handle deleted files? I had this plugin build a database of hashes, but deleted some files afterwards. When running a verification, the plugins note that there are problems because files are now missing. Is this expected behavior?

    You need to perform the export again.
