omartian

Members
  • Posts

    105

Posts posted by omartian

  1. 4 hours ago, Tigerherz said:

    I had similar problems.

    It was a problem with spin down disks.

    I set spin down in disksettings to never

    and clear stats on the mainpage.

    I think my controller has a problem with spin down.

     

    Do you disable spin down for parity checks or at all times?  If your disks are spinning all the time, isn't that bad for longevity?

     

    Also, if you got an error this way, do you run a correcting check afterwards?

  2. On 6/1/2021 at 2:50 AM, itimpi said:

    It has just occurred to me that if you have the Parity Check Tuning plugin installed then you might be able to investigate this far more rapidly using its Tools -> Parity Problem Assistant feature.  That feature was developed for exactly your scenario, but I have never had any feedback on how useful it turns out to be in practice, so I would be interested to get some (plus any suggestions for making it more useful).

     

    Which sectors should I point it to?  I have a hard time

    So I tried using this plugin, but I'm getting an error message.

     

    Based on the syslog that Jorge highlighted above, it looks like the errors happen between sectors 18795850000 and 25097678000.

     

    When I try to start it, I get the error message:  "end point too large:  The end has been set to more than the size of the disk."  I entered the numbers above as the start and end points because it asks for sector numbers.  Do I need to adjust them somehow?
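As a rough sanity check on the numbers above, the sketch below converts sector numbers to byte offsets and compares them with a disk size. It assumes the syslog counts 512-byte sectors and uses a hypothetical 14 TB disk; the helper names are illustrative only, and the unit the plugin actually expects is not confirmed anywhere in this thread.

```python
SECTOR_BYTES = 512  # assumption: syslog sector numbers are 512-byte units


def sector_to_bytes(sector: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Byte offset at which the given sector starts."""
    return sector * sector_bytes


def fits_on_disk(sector: int, disk_bytes: int,
                 sector_bytes: int = SECTOR_BYTES) -> bool:
    """True if the sector starts inside a disk of disk_bytes bytes."""
    return 0 <= sector_to_bytes(sector, sector_bytes) < disk_bytes


if __name__ == "__main__":
    disk_bytes = 14 * 10**12  # hypothetical 14 TB disk, not from the diagnostics
    for sector in (18795850000, 25097678000):
        offset = sector_to_bytes(sector)
        print(f"sector {sector}: {offset / 10**12:.2f} TB, "
              f"{100 * offset / disk_bytes:.1f}% of disk, "
              f"fits={fits_on_disk(sector, disk_bytes)}")
```

Under these assumptions both sectors land inside a 14 TB disk, so it may be worth checking which disk size and which unit the plugin is comparing against.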

  3. 14 minutes ago, JorgeB said:

    It means parity doesn't match the one calculated from the array's data devices, but with data corruption the problem can be anywhere: the data could have been written corrupt, parity could be wrong, or just the calculation at that time could be wrong.

     

    Thanks for all of your help, Jorge.  I'll keep tinkering and run a correcting check.

  4. 38 minutes ago, JorgeB said:

    If there are still errors on consecutive checks you basically need to rule out the hardware involved. RAM is still a good candidate even without memtest finding errors, but it could also be the board/CPU or a disk. I would start by using just one DIMM at a time since it's the easiest thing to rule out.

     

    Ok. One dimm at a time on a non correcting check. 

     

    Do these sync errors mean that the media files on the data disk are no longer valid, or just that there is a discrepancy with the parity?

     

    I'm wondering if I already fixed this issue but, since I never ran a correcting check, the same sync errors keep popping up. If I run a correcting check now, I'm hoping the next non-correcting check will be clean.

     

    I wish unraid made it easier to isolate the issue. Too many variables....

  5. On 6/1/2021 at 4:18 AM, JorgeB said:

    #1 reason for unexpected sync errors is RAM related, if just downclocking didn't fix it it could be a bad DIMM.

     

    Memtest is currently on pass 9 and has been running for about 20 hrs w/0 errors.  

     

    At this point, should i run a correcting check, or is there anything else i should be doing?

  6. 18 minutes ago, itimpi said:

    It has just occurred to me that if you have the Parity Check Tuning plugin installed then you might be able to investigate this far more rapidly using its Tools -> Parity Problem Assistant feature.  That feature was developed for exactly your scenario, but I have never had any feedback on how useful it turns out to be in practice, so I would be interested to get some (plus any suggestions for making it more useful).

     

    Will check out that plugin. Thank you.

     

    Weird. Could have sworn I downloaded the 2 error after a full scan. 

     

    Anything else you can make out of those bad sectors?

  7. On 5/27/2021 at 1:43 PM, JorgeB said:

    First check after fixing the problem can still find errors, but after that it should always be 0 errors, which is the only acceptable number of sync errors.

     

    So I just ran two parity checks.  After switching off XMP on my RAM, I ran a non-correcting check and only got 2 errors, which I thought was weird since I was expecting the 9 from before.

     

    I attached that diagnostic.

     

    I then went to my server and re-seated all my sata cables and my sata-sas adapter (LSI SAS 9207-8i SATA/SAS 6Gb/s PCI-E 3.0 Host Bus Adapter IT Mode SAS9207-8i US).

     

    I then decided to run another non-correcting check and received 6 errors.  That diagnostic is attached below.

     

    I noticed that when the process is running, i get about 65% of the way through w/0 errors.  It seems like when i get to disk 6 + 7 (or maybe just 7), the parity errors occur.  Do you think it might be that connector or that disk based on the diagnostics?
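One way to reason about errors that only appear past a certain point: a parity check reads each data disk only up to that disk's own size, so at a given position only disks larger than that position are still being read. The sketch below lists them. The disk names and sizes are made up for illustration and are not taken from the diagnostics.

```python
def disks_at_position(disk_sizes_tb: dict[str, float], parity_size_tb: float,
                      percent_done: float) -> list[str]:
    """Return the data disks that still hold sectors at the given check position."""
    if not 0 <= percent_done <= 100:
        raise ValueError("percent_done must be between 0 and 100")
    position_tb = parity_size_tb * percent_done / 100
    return sorted(name for name, size in disk_sizes_tb.items()
                  if size > position_tb)


if __name__ == "__main__":
    # Hypothetical array layout, for illustration only.
    sizes = {"disk1": 4, "disk2": 8, "disk6": 14, "disk7": 14}
    print(disks_at_position(sizes, parity_size_tb=14, percent_done=65))
```

With a layout like this, only the largest disks (plus parity itself) could be involved in errors found after the 65% mark, which narrows down which cables and ports to look at first.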

     

      

    2 errors noncorrecting check.zip 6 errors noncorrecting check.zip
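Comparing the sectors flagged by the two non-correcting checks may help separate a persistent mismatch from a transient one: a sector flagged in both runs suggests something is actually stored wrong, while sectors that change between runs hint at errors made while reading or calculating. A minimal sketch, with made-up sector numbers standing in for the values from the two attached diagnostics:

```python
def compare_runs(first: set[int], second: set[int]) -> dict[str, list[int]]:
    """Split flagged sectors into those seen in both runs and in only one."""
    return {
        "both": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


if __name__ == "__main__":
    # Hypothetical sector numbers, not taken from the real syslogs.
    run_a = {1000, 2000}
    run_b = {2000, 3000, 4000}
    for label, sectors in compare_runs(run_a, run_b).items():
        print(f"{label}: {sectors}")
```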

  8. 10 hours ago, JorgeB said:

    You're overclocking the RAM, Ryzen with overclocked RAM is known to corrupt data resulting in sync errors, see here.

    Checked in the BIOS and XMP was enabled for my RAM.  I disabled the XMP profile, which took speeds from 3200 to 2100 MHz.  Hopefully that won't affect my Plex server performance.  Will try doing a correcting check now, hoping for only 9 errors.

  9. 7 hours ago, JorgeB said:

    You're overclocking the RAM, Ryzen with overclocked RAM is known to corrupt data resulting in sync errors, see here.

     

    That's strange. I updated the BIOS a few months ago but don't recall manually setting an overclock.

     

    I'll take a look and see. Hoping that's the issue. 

     

    How would I identify what the 9 errors are?  Should I click write corrections to parity for my next check?
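The sectors behind the sync errors are normally recorded in the syslog inside the diagnostics zip. The sketch below pulls them out with a regular expression. The sample lines and the message shape ("... incorrect, sector=N") are assumptions modeled on typical Unraid logs and may differ between versions, so the pattern may need adjusting.

```python
import re

# Assumed message shape; adjust the pattern if your syslog differs.
SECTOR_RE = re.compile(r"(?:incorrect|corrected), sector=(\d+)")


def parity_error_sectors(lines):
    """Return the sector numbers from lines that look like parity error reports."""
    sectors = []
    for line in lines:
        match = SECTOR_RE.search(line)
        if match:
            sectors.append(int(match.group(1)))
    return sectors


if __name__ == "__main__":
    # Hypothetical sample lines, not copied from a real syslog.
    sample = [
        "May 26 03:10:01 tower kernel: md: recovery thread: P incorrect, sector=18795850000",
        "May 26 03:11:00 tower kernel: some unrelated message",
        "May 26 05:42:17 tower kernel: md: recovery thread: P incorrect, sector=25097678000",
    ]
    print(parity_error_sectors(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example `parity_error_sectors(open("syslog.txt"))`, where the file name is whatever the syslog is called in your diagnostics.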

  10. Hi Everyone-

     

    Ran my monthly parity check (for the first time where i unchecked "write corrections to parity") and came up with 9 errors.  

     

    Last 2 months, i had 1 error each and the month before that I had 9. 

     

    Don't recall any unclean shutdowns in the last 30 days.  I occasionally get a message about once a week from unraid stating, "the connection to your UPS has been restored" for my APC UPS.  Haven't had power go out in months.  

     

    When i look under the main tab, all of my disks have 0 errors.

     

    Is this something to be worried about?  Any idea how i identify which files/disks are involved?  

     

    Attached is my diagnostics.

     

    I'm currently on 6.8.3 and haven't updated to 6.9.2

     

    The other issue is my krusader and plex docker programs won't let me update them. It states "version not available".  I've restarted both dockers and clicked the check for updates box.  Any idea? Are the 2 problems somehow related?

     

    Thanks in advance.  

     

    nasgard-diagnostics-20210526-1506.zip

  11. On 2/3/2021 at 8:54 PM, Hoopster said:

    If you click on a disk name from the Main screen, you can download the SMART report, see the SMART History for the disk and run additional tests.

     

    Ran a second parity test.  Took a little bit longer but no errors. Will chalk it up to the ups error I received during 1st parity check. If I get errors in the future, I'll run smart reports on the discs. Thank you. 

  12. 8 hours ago, Energen said:

    Easiest thing to do is to check all your SMART values for your drives, run some tests on them.

     

    Parity errors don't necessarily mean any drives are failing, only that at some point there was an error -- power failures, bad copies, etc.. anything could cause it. 

    Is the smart history under settings somewhere or is it an application I need to install?

  13. Hi everyone-

     

    I've been using Unraid for over a year now.  Things have been going pretty well and I have never had any parity check errors.  Last night a parity check was scheduled and I thought nothing of it.

     

    This morning when I got the email, I was surprised to see that there were 9 errors.

     

    My diagnostic file is attached.  i did get 2 ups alerts w/in the last day, but i get those from time to time even when the power doesn't go out.  

     

    Also, under parity check, i have "write corrections to parity" checked.  is this a problem?  Is one of my discs failing?

    Screenshot (9).png

    en.zip

  14. Hi Guys-

     

    Recently upgraded the firmware for my Aorus B450 Pro WiFi to version f60e.  Had an issue where my unit wouldn't boot to BIOS.  It turned out that I could only get into the BIOS when my SAS controller was unplugged.  I changed the PCIe settings in the BIOS and got my server up and running again.

     

    Did a pre-clear of a 14 tb drive which took approximately 4 days and noticed i would get emails in the morning w/the following error code:

     

    fstrim: /etc/libvirt: FITRIM ioctl failed: Input/output error
    fstrim: /var/lib/docker: FITRIM ioctl failed: Input/output error

     

    Would also get this code:

     

    **** Unable to write to cache ****
    **** Unable to write to Docker Image ****
    **** unRaid's built in FTP server is running ****

     

    I've never precleared a disk before so i attributed it to the preclear process.  It took 4 days for the 14 tb drive but i got the ok from preclear that the disk was good.  I haven't received the "fstrim" error message since monday (did the preclear over the weekend), but still get the cache/dockerimage/ftp server messages the last few mornings.  

     

    Also, another thing I've noticed is that my dockers (plex and binhex-krusader) run fine, but if I stop them, I can't restart them.  I get an "execution error, error code 403" message and both dockers refuse to start.  If I reboot my system or stop/start the array, they come back up, but I've never had this issue before.

     

    Attached is my diagnostics file. 

     

    Do you think the SAS controller could be behind one or both of the error messages?

     

    Any and all help is much appreciated.  

    nasgard-diagnostics-20210107-0840.zip

  15. Hi Everyone-

     

    Was having some random shutdowns w/my unraid server running latest version, so i decided it was a bright idea to update the bios on my aorus pro wifi b450 mobo.  I was running version f50 and decided to update it to f60e.  

     

    I updated the bios via usb and got a confirmation that it went through.  The pc restarted but then when the gigabyte title screen popped up, I couldn't enter the bios screen when i would hit "delete, f12, end, f9".  Same thing would occur w/a power cycle. Just a black screen w/a blinking cursor. 

     

    I decided to unplug everything other than gpu, psu, cpu, and ram and i was able to get to the bios screen and it's saying that i'm updated to f60e.  

     

    Turns out, when i try to plug in my sas controller (LSI SAS 9207-8i), the same thing occurs. 

     

    I went back to the bios after unplugging the sas controller and made sure my boot order was topped by the unraid boot disk.  CSM is on. i made sure xmp was on for my ram.  

     

    Not really sure what to do now. This was perfectly fine for the last year but I just had to go and meddle w/it.

     

    Any suggestions would be much appreciated.  

  16. On 8/22/2020 at 4:25 PM, trurl said:

    Nothing apparent in those. 2 FAQs that might be relevant:

     

    So the same thing happened twice within the last week.  No errors after the 1st parity check.  2nd parity check occurring now.

     

    I have an Aorus B450 Pro WiFi-CF mobo.  I could upgrade the mobo BIOS to version f51 (I'm on f50).

     

    i also have an amd ryzen 5 2600 cpu.  can't determine what the bios is for that from the main dashboard in unraid. 

     

    Do i need to update bios for cpu and mobo?  Do i need to disable c6 and adjust power supply control?

     

     

  17. Hi guys-

     

    My Unraid server has been running w/o issues for the last few months.  I set a file transfer going last night via Krusader, and when I checked on it this morning, my server was unresponsive and unreachable.  I hooked up a monitor to my tower and didn't get any video.

     

    Tried to Ctrl-Alt-f1 (all the way through f7) and still nothing so had to do a manual power down.  my server is back up and running and i ran a diagnostic.  Can someone take a look at my diagnostic and let me know if there is any cause for concern?  

     

    Hoping this is just a freak thing.  

    nasgard-diagnostics-20200822-1550.zip

  18. 1 hour ago, kolla said:

    Have you solved this problem? I have been having similar issues accessing my shared folders from my Oppo203 in nfs. I have about 1400 shared folders, all of which are nobody/users. In smb all of them show up, but in nfs only about 73 show up. Sometimes this number increases but never to the full 1400 folders.

    I'm running the latest unraid stable version and I'm pretty sure this started happening from about 2-3 release updates ago. Before that I had no trouble seeing all my shared folders via nfs..

    any suggestions on how to debug this?

    Never sorted it out. Just started using smb instead. It bums me out bc I'd prefer nfs.  Let me know if you sort it out. 
