November 26, 2024
I've been rsyncing files to one of my backup servers and it has crashed a few times. The user script says it's rsyncing, yet the drives are powered down. I did see this:
Nov 25 11:34:30 Tardis root: Kill pid: 7969
Nov 25 19:55:13 Tardis root: Kill pid: 602
Do the diagnostics say what's causing these crashes? I know Shinobi crashes and loses the videos it records, so I'm going to make a new Unraid server just for Shinobi. But are these PID kills stopping rsync while the script still says it's syncing?
tardis-diagnostics-20241125-1959.zip
November 26, 2024
We may need to see the script and how you called rsync. Not everything rsync does is a file copy; it may be touching ACLs or checking other things, as in a "dry run". Reviewing the diag file.
November 26, 2024
The syslog reveals some potentially relevant entries:

Rsync Process Termination: The system logged two "Kill pid" entries:
Nov 25 11:34:30 Tardis root: Kill pid: 7969
Nov 25 19:55:13 Tardis root: Kill pid: 602
These indicate that processes, potentially related to rsync, were terminated. The cause could be resource exhaustion, conflicts, or unresponsive tasks.

Kernel Errors:
igb: probe of 0000:08:00.0 failed with error -5 suggests an issue initializing a network adapter.
Bluetooth: hci0: FW download error recovery failed (-19) shows a problem with Bluetooth firmware, although this may not be directly related to rsync or Shinobi.

Recommendations for Rsync and Shinobi:

Rsync Troubleshooting:
Check Resource Limits: Ensure there's enough memory and CPU for rsync. Running out of system resources can cause process termination.
Verbose Logs: Add the -v or --progress flag to your rsync command to capture more details about what it is doing when the system kills it.
Disk Spin-down: If drives are spinning down during an active rsync process, adjust the power management settings to prevent this.

Shinobi-Specific Errors: No specific errors related to Shinobi were found in this portion of the logs. However, Shinobi crashes might indicate compatibility or resource allocation issues. Running Shinobi on a dedicated server as you plan could help isolate and resolve these problems.

Further Actions:
Inspect syslog-previous.txt: This file may contain logs from earlier sessions that could help identify prior crashes or additional details about process termination.
Dedicated Syslog Server: Set up remote logging to capture all events, even during crashes.

Previous boot errors:

Observations:
Repeated Network and Bluetooth Errors: The same errors from the current syslog regarding the network adapter (igb) and Bluetooth firmware recovery (hci0) are present. These might not be directly causing crashes but indicate hardware or driver-level issues.
*Ignore Bluetooth...?
nginx reverse proxy? Multiple similar errors with nginx...
Quote: [error] 12530#12530: connect() failed (111: Connection refused) while connecting to upstream
*Which could be killing rsync, depending on how you have them talking...

Killed Processes: Many "Kill pid" entries are logged.
Quote: Nov 25 08:43:05 Tardis root: Kill pid: 19388
The system is actively terminating processes. This can occur due to:
Out-of-memory (OOM) situations.
Hung or unresponsive tasks.

Array Management Errors: Errors related to mdcmd (Unraid array management).
Quote: error: mdcmd, 4184: Invalid argument (22): write: nocheck cancel retval: -1
*Run check disk and repair / scrubs...

My Recommendations:
Investigate Resource Limits: Check memory usage during rsync operations. Add swap space or adjust rsync parameters to minimize resource demands. Use htop or similar tools to monitor which processes consume the most resources.
Shinobi Isolation: Since Shinobi appears to crash and lose recordings, your plan to isolate it on a dedicated Unraid server is wise.
Network Adapter (igb) and Bluetooth: Ensure drivers for your hardware are up to date. Disable Bluetooth in the BIOS/UEFI if unused.
Rsync and Disk Spin-down: Prevent drives from spinning down during rsync by adjusting the disk settings in Unraid.
Persistent Logging: Configure a remote syslog server to capture full logs during crashes for deeper analysis.

Run memtest and HD tests... smartmon? Repair, file system scrubs.
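One quick way to check whether the "Kill pid" entries line up with out-of-memory events is to count both in the syslog. This is only a sketch: the function name scan_kills is made up for illustration, the default path /var/log/syslog is an assumption about where the log lives, and the OOM patterns are the usual kernel wording, not something confirmed from this diagnostics file.

```shell
#!/bin/bash
# Hypothetical helper: count "Kill pid" lines and kernel OOM messages in a syslog file.
# Usage (assumed log path): scan_kills /var/log/syslog
scan_kills() {
    logfile="${1:-/var/log/syslog}"
    # Lines like: "Nov 25 11:34:30 Tardis root: Kill pid: 7969"
    kills=$(grep -c 'Kill pid:' "$logfile")
    # Typical kernel wording when the OOM killer fires (case-insensitive match)
    ooms=$(grep -ciE 'out of memory|oom-killer' "$logfile")
    echo "kill_pid_entries=$kills oom_entries=$ooms"
}
```

If oom_entries stays at 0 while kill_pid_entries keeps growing, the kills are probably coming from something other than memory exhaustion, such as a script stopping its own child processes.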
November 26, 2024 Author
@bmartino1 I've never been able to fix those igb and Bluetooth errors. I've googled and I've asked for help on here, and nothing has helped. The igb, I guess that's the 10Gb fiber NIC?

As for the rsync command, I use:
rsync -avzue ssh --exclude-from='/boot/config/plugins/user.scripts/scripts/exclude1.txt' -s --stats --numeric-ids --progress '/mnt/user/Videos/' root@$serverip:'/mnt/user/Videos/' --delete

As for memory, it currently shows 82% used of the 64 GB, and I restarted the rsync.

I'm not sure how you adjust the power level for disks for spin-down. When I look at the user script log it says it's copying a file, but it's not, as it has been hung for at least 15 minutes. The spin-downs are set to 15 minutes of no activity, so both servers shut down their hard drives after 15 minutes, yet the user script says it's still running. So User Scripts/rsync says it's still running when it no longer is.

I don't know how to do half the things you mentioned. I don't know how to get that igb driver, so I need more help there. I can disable the Bluetooth. If the igb is the onboard network card, I'm not using it for transfers; I use it as a management port. There is a new motherboard BIOS update I can try.

OK, so I enabled the syslog server. It doesn't offer full capture, unless that's what it already does. I'm not sure how to "Add swap space or adjust rsync parameters to minimize resource demands."

For the hard drives, do you mean run scrub and repair? And what is smartmon?

Here is a screen capture from while I rsync. I took 3 screenshots of htop. Does it show anything? It doesn't help me see much. I wish it looked like Windows Task Manager, but I guess it's the best there is.
November 26, 2024 Author
Also, you mention:
error: mdcmd, 4184: Invalid argument (22): write: nocheck cancel retval: -1
Does it say which hard drive? And when you said array, is that the actual array or any of the other pools?

What's also frustrating me is Unassigned Devices. When I unmount a device and pull out the drive, it stays ON and I get this hostbyte warning. It's driving me mad, as I have to reboot Unraid just to put the drive back in, even when I unmounted it and even spun down an SSD.

And on both servers I get this error that I've never found help for; Google never helped and I don't know how to fix it:
Interface "tunl0" added. Warning: no bandwidth limit has been set.
Does that also cause the issues I have?

Edited November 26, 2024 by comet424
November 26, 2024 Author
So rsync shut down, paused, whatever you want to call it. I have htop running and I was able to sort and scroll. It looks like rsync is in there multiple times; I guess each time it crashes and I re-run it, the old ones go idle. Here is a screenshot. Sometimes the red D turns to a green R, whatever D and R mean.

Edited November 26, 2024 by comet424
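On the D and R question: those letters are the standard Linux process states that htop and ps show. D is uninterruptible sleep (usually waiting on disk or network I/O) and R is running or runnable. A rough sketch for listing every rsync process with a plain-English state follows; describe_state and list_rsync are hypothetical names made up for this example, not Unraid or htop commands.

```shell
#!/bin/bash
# Hypothetical helper: translate the first letter of a ps/htop state code.
describe_state() {
    case "$1" in
        D*) echo "uninterruptible sleep (usually waiting on disk or network I/O)" ;;
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep (waiting for an event)" ;;
        T*) echo "stopped" ;;
        Z*) echo "zombie (exited, not yet reaped by its parent)" ;;
        *)  echo "unknown" ;;
    esac
}

# Hypothetical helper: list every rsync process with its PID, state, and command line.
# The [r]sync pattern keeps grep from matching its own command line.
list_rsync() {
    ps -eo pid,stat,args | grep '[r]sync' | while read -r pid stat rest; do
        echo "$pid $stat ($(describe_state "$stat")) $rest"
    done
}
```

Running list_rsync while the transfer looks hung shows whether old rsync processes are still around. A process that sits in D for a long time is typically stuck waiting on a disk or the network rather than working.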
November 26, 2024 Author
@bmartino1 Here is a new diagnostic with more kill PIDs. I'm currently doing a scrub/repair on all my array disks. I set up the syslog server; it says it saves to my local share, but I saw nothing there. Here is the diagnostic:
tardis-diagnostics-20241125-2318.zip
November 26, 2024
A lot going on here; I have yet to fully read all of your comments.

You can blacklist the Bluetooth driver via modprobe to remove the error, since it is not used, and you can later use Bluetooth with VM passthrough if you vfio-bind it. Bluetooth only works in GUI mode with a few extra Slackware packages installed. At best it is used to connect a Bluetooth keyboard and mouse. The application is called bluez:
https://slackware.pkgs.org/15.0/slackware-x86_64/bluez-5.63-x86_64-2.txz.html
*It is best to ignore Bluetooth altogether. I mentioned it because it was an error in the logs. If you want to use it, it is better to vfio-bind and then blacklist Bluetooth, as Unraid doesn't ship the third-party applications to use Bluetooth because it doesn't need them. So ignore Bluetooth.

The igb NIC is more of a concern here, as there are driver-based plugins, other kernel-level settings, and GRUB boot-side options that could potentially fix it. I assume this is the onboard igb... thanks for confirming the onboard NIC. Yes, you should try to stay on the latest BIOS; this can be harder on Linux. Intel ME and other firmware for onboard NICs get updated with BIOS updates.

WOW! More than 80% of your RAM is in use. This could be a reason for the kills and failures, as rsync is RAM intensive; like Unraid, it sends the data to RAM for processing.

Your htop doesn't help much, as there are too many variables. When I last looked, it was more the networking failure and the remote rsync that were the problem. Until you fix the networking issue this is going to keep happening.

The power levels are a reference to power management. It's about making sure disks are not spinning down or going into sleep states, which would cause the kill or break the file copy, as explained in the initial post.

Thanks for your rsync command outline. Let's workshop it a bit to use other options and a log:
rsync -avz --partial --exclude-from='/boot/config/plugins/user.scripts/scripts/exclude1.txt' --stats --numeric-ids --progress '/mnt/user/Videos/' root@$serverip:'/mnt/user/Videos/' --delete --log-file='/path/to/rsync.log'

Explanation of Options:
-a: Archive mode; preserves permissions, timestamps, symbolic links, etc.
-v: Verbose output; provides detailed information about the transfer process.
-z: Compresses file data during the transfer to reduce bandwidth usage.
--partial: Keeps partially transferred files to allow resumption if interrupted.
--exclude-from='/boot/config/plugins/user.scripts/scripts/exclude1.txt': Excludes files and directories specified in the given file.
--stats: Provides detailed statistics about the transfer after completion.
--numeric-ids: Transfers numeric user and group IDs without mapping them to user names.
--progress: Displays progress information during the transfer.
--delete: Deletes files in the destination directory that are not present in the source directory.
--log-file='/path/to/rsync.log': Logs the transfer details and errors to the specified file.

Additional Recommendations:
Monitor System Resources: Ensure your system has sufficient memory and processing power, as rsync can be resource-intensive.
Review Log Files: Regularly check the log file (/path/to/rsync.log) for any errors or warnings that may require attention.
Test the Command: Before executing the command on critical data, consider running it with the --dry-run option to simulate the operation and verify its behavior without making actual changes.

By implementing these adjustments, you can improve the efficiency and reliability of your rsync operations while maintaining comprehensive logs for troubleshooting purposes.

smartmon (smartmontools) is the application that runs the SMART checks and read/write tests on the SSD/HDD. And yes, when I said HD tools I meant run scrubs, file system repair checks, and the other HD tools built into Unraid.

htop is essentially the Windows Task Manager...
You can see the performance at the top and the details of each application listed below, and you can use search and the other menu items to manage things. Review a basic Linux essentials video on YouTube.

Edited November 26, 2024 by bmartino1 data...
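To make the --dry-run and --log-file advice above easier to repeat, the command can be wrapped in a small function. This is a sketch only: run_backup and the RSYNC_BIN override are hypothetical names introduced for illustration, the paths are placeholders, and the --exclude-from option from the original command is left out for brevity.

```shell
#!/bin/bash
# Hypothetical wrapper around the rsync options discussed above.
# Arguments: source dir, destination, log file, and "yes" for a dry run.
# RSYNC_BIN is an assumed override, only useful for testing the wrapper.
run_backup() {
    src="$1"; dest="$2"; log="$3"; dry_run="${4:-no}"
    rsync_bin="${RSYNC_BIN:-rsync}"
    # Reuse the positional parameters as the rsync argument list
    set -- -avz --partial --stats --numeric-ids --progress --delete "--log-file=$log"
    if [ "$dry_run" = "yes" ]; then
        # --dry-run reports what would be copied or deleted without changing anything
        set -- "$@" --dry-run
    fi
    "$rsync_bin" "$@" "$src" "$dest"
}
```

For example, run_backup '/mnt/user/Videos/' "root@$serverip:/mnt/user/Videos/" '/path/to/rsync.log' yes would simulate the transfer first; dropping the final yes runs it for real. Because --delete is included, checking the dry-run output before the real run is worth the extra minute.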
November 29, 2024 Author
@bmartino1 So I upgraded to the latest motherboard BIOS for the board. I forgot to disable the Bluetooth, but you mentioned it's not a big deal. The igb errors/warnings are still there, but I don't use that NIC to copy; I just use it to be able to WOL the server when needed, or as a backdoor to get in when the 10Gb isn't working. I changed my rsync to the one you modified. I haven't had a chance to watch the YouTube video yet. Currently rsyncing, so we shall see how it goes.