Everything posted by itimpi

  1. That is always an excellent sign as it normally means that the repair process was 100% successful.
  2. It might be worth asking in the UD plugin support thread. Can you ‘ping’ that address from the Unraid server? As far as I know, that is how UD determines whether the server is online.
  3. You are likely to get better informed feedback if you post your system’s diagnostics zip file.
  4. I can’t see any obvious reason why that would not work; I do something similar myself, using NoMachine to access a Windows machine (rather than a VM) on my LAN, routed through the Unraid WireGuard tunnel. I think you need to provide more information on how you have WireGuard set up. You also need to make sure that your remote client is on a different subnet from the home LAN.
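For illustration, a client-side peer definition of roughly this shape would route home-LAN traffic through the tunnel. Every address, key placeholder, and hostname below is a made-up example, not a recommendation for any specific setup:

```
# Hypothetical client-side wg0.conf sketch (all values are examples)
[Interface]
PrivateKey = <client private key>
Address = 10.253.0.2/32            # tunnel address (example)

[Peer]
PublicKey = <server public key>
Endpoint = my.unraid.example:51820
# Route both the tunnel subnet and the home LAN through the tunnel.
# The client's own local network should use a different subnet
# (e.g. 192.168.5.x) so these routes do not clash with it.
AllowedIPs = 10.253.0.0/24, 192.168.1.0/24
```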
  5. I think this would all be handled for you if you use the Unassigned Devices plugin to mount the Azure share. It would handle both saving the credentials and remounting the share on reboot. It should be easy enough to try.
  6. Thanks, I can see that you have set the critical disk threshold to 0, which explains some of the unexpected values I was seeing in the logs around the critical value. I need to cater for that special case to tidy things up, and also for the warning threshold being set to 0, as that is also a legitimate case. Regarding the fact that your disks spin down when the parity check is paused (so that their temperature cannot be read): I am looking at some logic that will create and then delete a file on such disks (which should spin them up) and then wait for the next monitor point before deciding whether the temperatures (which should now be readable) allow a resume. Whether this will work as I hope and help in your case I am not certain, but it feels worth trying. It is definitely an edge case, as most people do not spin their disks down so aggressively, but with energy prices rising rapidly it may become more common.
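The spin-up idea described above could be sketched roughly as follows. This is a hypothetical illustration, not the plugin's actual code; the mount point is an assumed example:

```shell
#!/bin/sh
# Hypothetical sketch: create and then delete a small file on a
# spun-down disk, which should force it to spin up so that its
# temperature can be read at the next monitor point.
spin_up_disk() {
    probe="$1/.parity-tuning-spinup"
    touch "$probe" && sync    # the write forces the disk to spin up
    rm -f "$probe"            # tidy up straight away
}

spin_up_disk /tmp    # demo against /tmp; real use would target /mnt/diskN
```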
  7. That sounds as if the parity check tuning plugin has tried to write a status file to the flash drive and that write has failed. This suggests that either the flash drive dropped offline or the flash drive has problems.
  8. I am confused - according to the diagnostics you have started the array in normal mode.
  9. That is by design. For security reasons, files on the flash drive cannot have their ‘execute’ bit set. If you want to execute script files from the flash then you have to precede the file name with ‘sh’ (or ‘bash’).
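A minimal demonstration of the workaround above. The demo uses a throwaway script in /tmp standing in for one on the flash; on a real server the script would live somewhere under /boot:

```shell
#!/bin/sh
# The flash drive cannot carry the execute bit, so run scripts
# through the shell instead of executing them directly.
cat > /tmp/demo.sh <<'EOF'
echo "hello from a flash script"
EOF

# ./demo.sh (direct execution) would fail on the flash with
# "Permission denied"; invoking it via sh works regardless:
sh /tmp/demo.sh
```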
  10. I am not sure there is a bug per se here, as I think it is a side-effect of the disks spinning down, which the plugin is not currently designed to handle in the middle of a check (where the check has not yet got beyond the drive size). Looking at the log in the diagnostics, it seems all the array drives had spun down (and thus were assumed to be cool); this is why some of the drives had a temperature logged as ‘*C’ at that point. That was when the plugin decided the drives had cooled down enough and restarted the array operation, which caused the drives to spin up so their temperature could be read again. There is also something going on that I cannot fathom, in that the disks are reported as having exceeded the ‘critical’ value, not merely the warning level. It is as if the plugin is getting a value of 0C for that value from the Unraid configuration files. I would like to see the dynamix.cfg file from your system to see what is set there.
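The special-case handling being discussed (a threshold of 0 apparently being treated as a real 0C limit) could be addressed with a fallback along these lines. This is a hypothetical sketch, not the plugin's actual code, and the default value is an assumed example:

```shell
#!/bin/sh
# Hypothetical sketch: treat a per-disk threshold of 0 as "unset" and
# fall back to a global default instead of comparing temperatures
# against 0C.
effective_threshold() {
    disk_value="$1"       # per-disk setting (0 = unset)
    global_default="$2"   # system-wide default
    if [ "$disk_value" -eq 0 ]; then
        echo "$global_default"
    else
        echo "$disk_value"
    fi
}

effective_threshold 0 55     # prints 55 (0 falls back to the default)
effective_threshold 60 55    # prints 60 (per-disk value wins)
```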
  11. Definitely a bad idea. It might be possible, if you have a UPS between the smart switch and Unraid, to switch off the power supply to the UPS and have that trigger Unraid to shut down tidily. After you have allowed enough time to be sure Unraid has shut down, you then re-enable the power to the UPS to keep it charged.
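The UPS-triggered shutdown could be sketched like this. It assumes an apcupsd-monitored UPS, whose `apcaccess` tool prints lines such as "STATUS : ONBATT" when mains power is lost; the decision logic here is an illustration, not a tested recipe:

```shell
#!/bin/sh
# Hypothetical sketch: decide whether a tidy shutdown is needed from
# the UPS status text.
on_battery() {
    printf '%s\n' "$1" | grep -q '^STATUS *: *ONBATT'
}

status="STATUS : ONBATT"    # in real use: status=$(apcaccess)
if on_battery "$status"; then
    echo "would run: /sbin/poweroff"   # real use: a tidy shutdown here
fi
```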
  12. It is slightly more complicated than that under the covers. Initially the Unraid built-in parity check code writes a history record, but with less detail than the plugin can provide; the plugin then later updates that record with additional information. If you look just after the check finishes, you may see the information written by the built-in code if the plugin has not yet gotten around to updating the record.
  13. I have reproduced this and it IS a plugin bug; I will get a fix out. I think it is a typo introduced in a recent update that is causing this. BTW: the Parity History entry is from the plugin as well, so it is not surprising they agree.
  14. I have now pushed the release that allows you to manually configure the plugin's monitor task timeouts. Since you already have the plugin installed, start by using the Defaults button to get the entries into the stored parity.check.tuning.cfg file in the plugins folder on the flash. Then set the settings in the GUI as you want them and press Apply to update that file. At this point you can manually edit the file to play with the monitor task frequency. The entries of interest are:
      • parityTuningMonitorHeat="7": the monitor frequency (in minutes) when you have enabled the plugin option to check temperatures. This is the delay that could occur before the plugin even detects that you have started a parity check; I suspect changing it would not make much difference to overall behaviour, but you are welcome to try.
      • parityTuningMonitorBusy="6": the monitor frequency the plugin uses after it has detected that an array operation is active. I think this is the setting most likely to reduce the chances of temperature overshoot for you, and I would be interested to know whether reducing the value helps.
    Let me know how things go. I would be interested in seeing diagnostics with at least Debug logging enabled, regardless of the outcome of your tests.
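The hand-edit described above could look like this. The real file lives in the plugins folder on the flash (the exact path is assumed); the demo edits a throwaway copy in /tmp so nothing real is touched:

```shell
#!/bin/sh
# Sketch of editing the plugin's cfg file to experiment with a
# shorter busy-monitor interval.
cfg=/tmp/parity.check.tuning.cfg
cat > "$cfg" <<'EOF'
parityTuningMonitorHeat="7"
parityTuningMonitorBusy="6"
EOF

# Drop the busy-monitor interval from 6 minutes to 3 for experimentation
sed -i 's/^parityTuningMonitorBusy=.*/parityTuningMonitorBusy="3"/' "$cfg"
grep parityTuningMonitorBusy "$cfg"
```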
  15. I also note that the docker.img file is explicitly set to be on disk1. Normally you want it on the cache to get better performance from your docker containers. It is also set to 100GB, which seems excessive; the default of 20GB is enough for most people unless you run a lot of docker containers. Is there a reason for such a large size? If you find you are filling it up while docker containers are running, you may have a container mis-configured to write into the docker.img file when it should be mapped to Unraid storage external to the container.
  16. You are likely to get better informed feedback if you post your system’s diagnostics zip file.
  17. OK, good to know the basic mechanism is working properly now. I can post a release that allows you to adjust the intervals at which the monitor task runs, if you want to check whether shorter intervals give you better control.
  18. Settings are easy, as they are stored on the Unraid flash drive. If you delete the existing docker.img file so that Unraid creates a new empty one when docker restarts, you can then use Apps->Previous Apps and tick off the containers to be reinstalled; they will be put back with their settings intact. The above unfortunately does nothing for any databases the containers created in their appdata folders.
  19. If the power supply was faulty (or you used the wrong cables with a modular supply) then you could burn out the electronics on all the drives. Hopefully that is not what has happened in your case but your symptoms suggest it is a possibility.
  20. Each docker container will have its own unique requirements around permissions, so there is no easy fix. The only way that is guaranteed to work is to wipe the existing data for the containers and then start them up as though they were freshly installed, so they can rebuild their appdata content.
  21. No, the plugin sets up cron jobs with the frequency chosen according to the current plugin settings. The values I have chosen are basically arbitrary ones that seemed appropriate. It is possible, however, that the setting you mention DOES affect the frequency at which the temperature values I read in the plugin are updated, but it is not something I have looked into.
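As an illustration of the cron mechanism mentioned above, the scheduled entries are of roughly this shape (the command name is a placeholder, not the plugin's real invocation, and the intervals are the defaults discussed elsewhere in this thread):

```
# Hypothetical crontab sketch; 'monitor-task' is a placeholder name
*/7 * * * * monitor-task    # heat-monitoring frequency (7 minutes)
*/6 * * * * monitor-task    # busy frequency once an operation is active
```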
  22. It is under your Account settings for the forum.
  23. From what I can see in the diagnostics, this is due to the speed at which the drives heat up between the temperature-monitoring checks, so the drive temps overshoot before the excessive heat is detected. At the moment that check runs only every 7 minutes when temperature monitoring is active (I did not want it to be too frequent, to keep the cost of monitoring down); maybe I need to consider a more frequent check. I could easily introduce settings into the plugin's .cfg file that control this interval and that are not exposed in the GUI but could be changed by editing the file directly, to allow experimentation with different values. The downside is that more frequent checks would result in more log entries for everything except Basic mode logging, unless I revisit what gets logged at the different levels.
  24. The problem is that at the moment I have no logic that would keep the drive spinning, so something like that would take some research (although I have some ideas). I think it is worth waiting to see if it really would be needed, as it adds complications. If it were done, it would need to take into account whether the check has passed the size of the disk, as in that case you would not want to keep it spinning.
  25. @Quejo Thanks, your last diagnostics showed me where the plugin was losing track of the fact that a pause had been done because the drives overheated, and I have pushed an update that corrects that. The question is whether there is any other lurking bug in this area; if so, another run with Testing logging active should show me where. I have slightly improved my logging in this area to help identify problems. Note that if you have not set the option to run Manual checks in increments, it should not matter whether the pause/resume caused by drives getting hot falls inside the increment time window when you actually do a Manual check.