s986 Posted June 9, 2024
A parity check starts every time I reboot, please help. Diagnostics attached.
s986-diagnostics-20240609-1816.zip
JorgeB Posted June 9, 2024
The 2 minutes currently set as the shutdown timeout are not enough. Instead of shutting down, stop the array, time how long it takes, add 30 seconds to that, and set it as the new shutdown timeout. I also see a ZFS crash, so there may be an issue with the pool.
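If you want to spot that crash yourself, a minimal sketch (run from the console or a terminal session; the exact messages on your system will differ) is to search the live syslog for kernel call traces mentioning zfs:

# Search the current syslog for kernel call traces and ZFS-related messages
grep -iE "call trace|zfs" /var/log/syslog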
s986 (Author) Posted June 9, 2024
1 hour ago, JorgeB said: The 2 minutes currently set as the shutdown timeout are not enough…
Hi, thank you @JorgeB for replying. I timed the array stop and it took 56 seconds, while the shutdown timeout is 120 seconds. Also, I'm not sure what you mean by seeing a ZFS crash, or how to identify any issue with the pool.
JorgeB Posted June 9, 2024
2 hours ago, s986 said: while the shutdown timeout is 120 seconds
Yes, but that wasn't enough for the previous shutdown:
Jun 9 17:30:30 s986 shutdown[11409]: shutting down for system reboot
...
Jun 9 17:32:32 s986 root: Active pids left on /dev/md*
Jun 9 17:32:32 s986 root: Generating diagnostics...
It looks like the 120 seconds were almost enough, since the disks had already been unmounted. The ZFS crash happened between those timestamps (you can see a call trace mentioning zfs), and that could be part of the reason it took longer to stop everything. I still recommend increasing the timeout by 60 seconds; it may be enough to prevent the unclean shutdowns. You still need to check whether there continue to be call traces during shutdown: just check the syslog-previous after a reboot, or post new diagnostics.
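After the next reboot, something like this sketch should show whether the previous boot still logged call traces during shutdown. The file comes from inside the diagnostics zip; the path and name used here (logs/syslog-previous.txt) are assumptions, adjust them to whatever is actually in your zip:

# Search the previous boot's log, extracted from the diagnostics zip,
# for unclean-shutdown markers and ZFS call traces
grep -iE "call trace|active pids|zfs" logs/syslog-previous.txt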
s986 (Author) Posted June 10, 2024
16 hours ago, JorgeB said: I still recommend increasing the timeout by 60 seconds…
Hi, I increased the timeout to 200 seconds and tried rebooting, but it did not reboot and I had to power cycle. Diagnostics after the reboot are attached.
s986-diagnostics-20240610-1816.zip
JorgeB Posted June 10, 2024
There's no syslog-previous this time, so I cannot see what happened.
s986 (Author) Posted June 10, 2024
40 minutes ago, JorgeB said: There's no syslog-previous this time, so I cannot see what happened.
Here is the syslog. It is rebooting now and there has been no new entry in the syslog for 6 minutes.
s986-syslog-20240610-1513.zip
JorgeB Posted June 10, 2024
The ZFS-related crash is there again. Try starting the array without the pool assigned to see if there's still a problem.
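It can also be worth checking the pool itself once it is mounted; a rough sketch, assuming the pool is named cache (substitute your actual pool name):

# Show the pool's health, device states and any recorded errors
zpool status -v cache
# Optionally start a scrub to verify every block's checksum
zpool scrub cache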
s986 (Author) Posted June 16, 2024
I noticed that in the dashboard, under system memory, ZFS is always at 100%. Here is the system log showing these errors. I had drives with health problems; when I looked at the drive errors, it was old age. I have now removed those drives and kept only the necessary ones, until I buy new drives to expand the capacity. Thanks.
tower-syslog-20240616-0905.zip
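A sketch of one way to check a single drive's SMART report from the console, in case it helps; /dev/sdX is a placeholder for the actual device:

# Overall SMART health verdict plus the raw attribute table for one drive
smartctl -H -A /dev/sdX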
JorgeB Posted June 16, 2024
1 hour ago, s986 said: ZFS is always at 100%
This is normal after a ZFS pool is used. Did you try
On 6/10/2024 at 6:35 PM, JorgeB said: starting the array without the pool assigned to see if there's still a problem.
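On the memory point: that dashboard figure largely reflects the ZFS ARC read cache, which grows towards its target by design. If you want to inspect or temporarily cap it from the console, a minimal sketch (the 4 GiB value is only an example, pick one that suits your RAM):

# Show the current ARC size and its maximum target, in bytes
grep -E "^(size|c_max)" /proc/spl/kstat/zfs/arcstats
# Cap the ARC at 4 GiB until the next reboot (value in bytes)
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max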
s986 (Author) Posted June 16, 2024
1 hour ago, JorgeB said: This is normal after a ZFS pool is used. Did you try…
I set the cache pool to unassigned and rebooted; the array started and the parity check did not start. Log attached.
tower-syslog-20240616-1133.zip
JorgeB Posted June 16, 2024
So the problem could be the ZFS filesystem on that pool. I would suggest backing up the pool and re-formatting it.
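As a rough sketch of the backup step, assuming the pool is mounted at /mnt/cache and you have room on an array disk (both paths are examples, adjust them to your setup):

# Copy everything off the pool before re-formatting it
rsync -avh --progress /mnt/cache/ /mnt/disk1/cache-backup/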
s986 (Author) Posted June 16, 2024
17 minutes ago, JorgeB said: I would suggest backing up the pool and re-formatting it.
Reformat to the same filesystem, ZFS?
JorgeB Posted June 16, 2024
You can do that, and if it happens again in the near future there's likely an underlying hardware issue, like bad RAM.