Alby24 Posted July 9, 2021 (edited July 9, 2021)

Hi, I'm currently turning the system on and off using a Raspberry Pi connected to the motherboard's pins, so that I can emulate a press of the power button remotely. I believe this procedure doesn't let the array stop properly. If the mover is running when I press the power button, the system shuts down cleanly with no errors and no data loss, but the file that was being transferred from the cache pool to the array stays on the cache forever. In other words, the mover ignores the file from then on, even though it is still on the cache drive.

What is the proper way to cleanly shut down the system without breaking the mover?
itimpi Posted July 9, 2021

Normally it is by issuing the 'shutdown' command from the GUI or the command line. If trying to do it via simulating pressing the power button then you must simulate a short press (just a few seconds) which should also initiate the shutdown sequence. What you must NOT do is simulate pressing the power button for a long time (e.g. 30 seconds) as that will force the BIOS to do an immediate cut of the power.
Alby24 (Author) Posted July 9, 2021 (edited July 9, 2021)

23 minutes ago, itimpi said:
    Normally it is by issuing the 'shutdown' command from the GUI or the command line. If trying to do it via simulating pressing the power button then you must simulate a short press (just a few seconds) which should also initiate the shutdown sequence. What you must NOT do is simulate pressing the power button for a long time (e.g. 30 seconds) as that will force the BIOS to do an immediate cut of the power.

Yes, a short press is what I am already simulating, but that seems to trigger the issue. How is the mover supposed to behave if the system shuts down while it is running? Could someone else try to reproduce this?
itimpi Posted July 9, 2021

4 minutes ago, Alby24 said:
    Yes, a short press is what I am already simulating, but that seems to trigger the issue. How is the mover supposed to behave if the system shuts down while it is running? Could someone else try to reproduce this?

Have you tried a short press on the real power button to see if that works? You need to get the Pi out of the loop at this stage and make sure the press is being captured, and if not then look into why. If it seems to be captured but the system does not shut down tidily then it is probably worth raising a bug report.
Alby24 (Author) Posted July 9, 2021 (edited July 9, 2021)

2 minutes ago, itimpi said:
    Have you tried a short press on the real power button to see if that works? You need to make sure that is being captured.

It is being captured, because the system does turn off. Anyway, I haven't tried to reproduce it with the real power button yet, but tomorrow I definitely will.
itimpi Posted July 9, 2021

Just now, Alby24 said:
    It is being captured, because the system does turn off.

You need to make sure that it is being captured as a "short" press to initiate the Unraid shutdown sequence. If it is being treated as a "long" press then that is captured by the BIOS and does an immediate poweroff, bypassing Unraid.
Alby24 (Author) Posted July 9, 2021 (edited July 9, 2021)

This is the code that simulates pressing the button (it drives a relay):

    gpio mode 0 out
    sleep 0.500
    gpio write 0 0
    sleep 0.250
    gpio write 0 1

As you can see, the signal is asserted for less than a second. In fact, I have never had problems with unclean shutdowns; it is just this mover "thing".
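The press sequence above can be wrapped in a small function so that the press duration is explicit and easy to keep short. This is only a minimal sketch: it assumes the same WiringPi `gpio` utility and pin 0 as the post, and it assumes that writing 0 closes the relay (the post does not say which level is active). The names `GPIO_CMD`, `PIN`, `SETTLE_SECONDS` and `PRESS_SECONDS` are illustrative, not part of any tool.

```shell
#!/bin/sh
# Sketch: pulse a relay wired across the motherboard power-button pins.
# Assumptions (not confirmed by the thread): writing 0 closes the relay,
# and `sleep` accepts fractional seconds (GNU coreutils does).
GPIO_CMD="${GPIO_CMD:-gpio}"              # WiringPi command-line utility
PIN="${PIN:-0}"                           # WiringPi pin number used in the post
SETTLE_SECONDS="${SETTLE_SECONDS:-0.5}"   # pause after configuring the pin
PRESS_SECONDS="${PRESS_SECONDS:-0.25}"    # how long the "button" is held

press_power_button() {
    "$GPIO_CMD" mode "$PIN" out     # configure the pin as an output
    sleep "$SETTLE_SECONDS"
    "$GPIO_CMD" write "$PIN" 0      # assumed: relay closed, button pressed
    sleep "$PRESS_SECONDS"          # keep this short so it is not a long press
    "$GPIO_CMD" write "$PIN" 1      # assumed: relay open, button released
}
```

Calling `press_power_button` reproduces the same order of `gpio` commands as the original snippet; only the delays are configurable.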
itimpi Posted July 9, 2021

11 minutes ago, Alby24 said:
    This is the code that simulates pressing the button (it drives a relay):

        gpio mode 0 out
        sleep 0.500
        gpio write 0 0
        sleep 0.250
        gpio write 0 1

    As you can see, the signal is asserted for less than a second. In fact, I have never had problems with unclean shutdowns; it is just this mover "thing".

It sounds like a bug in the shutdown code, then, where the mover process is being forcibly aborted, leaving behind an incomplete file on the array, rather than shutting down tidily and making sure that any partially copied file on the array is removed. Because mover never overwrites existing files on the array, this would give the symptom you describe of the file getting 'stuck' on the cache.
Alby24 (Author) Posted July 10, 2021

14 hours ago, itimpi said:
    It sounds like a bug in the shutdown code, then, where the mover process is being forcibly aborted, leaving behind an incomplete file on the array, rather than shutting down tidily and making sure that any partially copied file on the array is removed. Because mover never overwrites existing files on the array, this would give the symptom you describe of the file getting 'stuck' on the cache.

I managed to reproduce the problem by pressing the power button manually. Steps:

1. Transfer a large file (mine was 271 GB) to a share that uses the cache pool.
2. Invoke the mover, or wait for its schedule; it doesn't make any difference.
3. Press the power button; the system shuts down.
4. Turn the system on again.
5. The file that was being transferred is stuck on the cache drive and the mover will ignore it from now on.
6. A copy of the file is present on the array too (presumably damaged, even though it shows the correct size, as there is no way 271 GB were copied in a few minutes).

So yes, it sounds like a bug in the shutdown/mover code. Could someone else try to reproduce it?
Alby24 (Author) Posted July 10, 2021

Alright, maybe I have understood what is happening. This time unRaid noticed that something was off and showed this notification:

    Automatic unRaid Non-Correcting Parity Check will be started
    Unclean shutdown detected

I believe this happened every other time too, but I just ignored it because I didn't realize the mover was running. Probably unRaid will now wait for the parity check to end, realize that the file was only partially moved, and eventually invoke the mover again. Still, I need the parity check to complete before I can verify this, so... see you tomorrow.
Squid Posted July 10, 2021

AFAIK, a powerdown will stop mover. However, it will probably only manage to stop it cleanly after the current transfer has finished. If (with everything else going on during a shutdown) mover takes longer than the shutdown timeout, then the system will forcibly kill the process. This will result in a partial file being on the array (in the case of useCache:yes), or a partial file being on the cache pool (useCache:prefer). Because of how Unraid handles multiple identically named files on the array / cache, in the latter case it will appear that the file is corrupted / incomplete (even though the source file isn't).
itimpi Posted July 10, 2021

@Squid I agree with your description of what is happening. It sounds like a bug in mover: mover should have a way of being told that it needs to close down, and it should tidily remove the (incomplete) target file before exiting.
Alby24 (Author) Posted July 10, 2021

So basically unRaid uses COW in the case of useCache:yes and the source file cannot be corrupted, am I correct? I will still wait for this parity check to end, to see whether unRaid can solve the issue automatically.

I believe more tests need to be done: does the same behaviour occur when the shutdown is triggered from the GUI or with the "powerdown" command in a terminal? (I read somewhere that "powerdown" is a cleaner way to stop the system.)

Thanks for your help, guys.
itimpi Posted July 10, 2021

35 minutes ago, Alby24 said:
    So basically unRaid uses COW in the case of useCache:yes and the source file cannot be corrupted, am I correct?

UnRaid is not using COW, just a simple copy that looks like it is being aborted prematurely. I would expect the copy that appears to be ‘stuck’ on the cache to be fine.
Alby24 (Author) Posted July 11, 2021

Hello again. The parity check completed successfully (finding 0 errors), but the mover is still ignoring those files and so they are "stuck" on the cache drive. Today I will make another attempt by shutting down the system via the GUI button.

What now? Can we officially call it a bug?
JorgeB Posted July 11, 2021

39 minutes ago, Alby24 said:
    but the mover is still ignoring those files and so they are "stuck" on the cache drive.

You just need to delete the partial files and run the mover again. It can be considered a bug, but it is very much a corner case. Or do you plan to regularly shut down the server while the mover is running?
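To find the partial files mentioned above, one option is to list every file that exists both on the cache pool and on an array disk. The following is a rough sketch, not an Unraid tool: `find_duplicates` is a hypothetical helper, and the mount points in the usage line assume the usual Unraid layout (`/mnt/cache` plus `/mnt/disk1`, `/mnt/disk2`, and so on). It only prints candidates and deletes nothing; file names containing newlines are not handled.

```shell
#!/bin/sh
# Sketch: print "cache path <TAB> array path" for every file that is
# present at the same relative path on the cache and on an array disk.
find_duplicates() {
    cache_root="$1"     # first argument: cache pool root
    shift               # remaining arguments: array disk roots
    ( cd "$cache_root" && find . -type f ) | while IFS= read -r rel; do
        rel="${rel#./}"                     # strip the leading "./" from find
        for disk in "$@"; do
            if [ -f "$disk/$rel" ]; then    # same relative path on this disk
                printf '%s\t%s\n' "$cache_root/$rel" "$disk/$rel"
            fi
        done
    done
}

# Example (assumed mount points):
#   find_duplicates /mnt/cache /mnt/disk[0-9]*
```

For a share set to useCache:yes, the thread suggests the array-side copy is the incomplete one, so that is the copy to inspect before deleting anything and running the mover again.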
Alby24 (Author) Posted July 11, 2021

10 minutes ago, JorgeB said:
    Or do you plan to regularly shut down the server while the mover is running?

I do not plan to do that, but it will inevitably happen sporadically, since both the mover and the shutdown are scheduled.
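Since both jobs are scheduled, one possible workaround is to have the scheduled shutdown wait until the mover has finished. The sketch below is an assumption-heavy illustration, not an Unraid feature: it guesses that a running mover can be detected with `pgrep -f` on the pattern `/usr/local/sbin/mover`, and the names `MOVER_PATTERN`, `MAX_WAIT` and `POLL` are made up for this example. `POLL` and `MAX_WAIT` must be whole seconds.

```shell
#!/bin/sh
# Sketch: wait for the mover to finish (up to a limit) before shutting down.
MOVER_PATTERN="${MOVER_PATTERN:-/usr/local/sbin/mover}"  # assumed process pattern
MAX_WAIT="${MAX_WAIT:-3600}"    # give up after this many seconds
POLL="${POLL:-30}"              # seconds between checks

mover_running() {
    # Assumption: the mover shows up in the process list under this pattern.
    pgrep -f "$MOVER_PATTERN" >/dev/null 2>&1
}

wait_for_mover() {
    waited=0
    while mover_running; do
        if [ "$waited" -ge "$MAX_WAIT" ]; then
            return 1            # still running after MAX_WAIT seconds
        fi
        sleep "$POLL"
        waited=$((waited + POLL))
    done
    return 0                    # mover is not running
}

# Example: only shut down once the mover is idle.
#   wait_for_mover && powerdown
```

If the wait times out, the function returns non-zero and the example line skips the shutdown, which avoids interrupting a transfer at the cost of leaving the server on.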
JorgeB Posted July 11, 2021

Feel free to create a bug report. I just don't see LT fixing this anytime soon, since there are much more serious issues yet to be fixed.
Alby24 (Author) Posted July 11, 2021

7 minutes ago, JorgeB said:
    I just don't see LT fixing this anytime soon, since there are much more serious issues yet to be fixed.

Yes, I totally understand that. I will create a bug report and keep my fingers crossed. Thanks for the help.
chansearrington Posted March 10, 2022

@Alby24 did this bug ever get resolved?
Alby24 (Author) Posted March 11, 2022

13 hours ago, chansearrington said:
    @Alby24 did this bug ever get resolved?

Of course not, sir. I posted a bug report here, but I haven't had a single sign of life from the dev team in 8 months.