brent3000 Posted October 13, 2020

Hi all,

I'm hoping to get some help, or to find out whether there is a log I can use to find what's causing the issues with my Unraid server. Recently I have noticed that the system randomly goes offline, and the only way to bring it back online is a power cycle.

When does it happen? When I'm transferring to the share drives (using the cache). During a data transfer of approx. 100GB the system just goes offline; even the connected monitor goes blank and there is no response from a keyboard either. The switch it's attached to shows the ports online, but I'm unable to ping the unit or access it via the browser.

When I reboot the system it all comes back online fine: the parity check passes with no issues and transfers work again.

The first time it happened I thought I had just done something funky with the transfer, or that the system was unstable in general (but it had been running for a solid week of normal use: reading, updates, dockers, etc.). Then when I transferred another bunch of data it went offline again. Is there a log or something I can dig into a bit further after the reboot to find what may be causing it? The log file itself (from the menu) seems to show only the current logs, not the history from the previous boot.
JorgeB Posted October 13, 2020

Try this and then post that log after a crash.
brent3000 Posted October 19, 2020 (edited)

OK, so it locked up again. It was only working with the cache drive at the time, so I thought it might be an issue with the drive writes, but the share I was using was cache-only. See attached.

syslog-192.168.2.1 - Copy.log

Edited October 19, 2020 by brent3000: Found more info
JorgeB Posted October 19, 2020

Oct 16 18:14:16 GLaDOS kernel: ata1.00: configured for UDMA/133
Oct 16 18:14:16 GLaDOS kernel: ata1: EH complete
Oct 16 18:16:59 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0x10000 SErr 0x400100 action 0x6 frozen
Oct 16 18:16:59 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Oct 16 18:16:59 GLaDOS kernel: ata1: SError: { UnrecovData Handshk }
Oct 16 18:16:59 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 16 18:16:59 GLaDOS kernel: ata1.00: cmd 61/00:80:c0:d4:af/0a:00:12:00:00/40 tag 16 ncq dma 1310720 ou
Oct 16 18:16:59 GLaDOS kernel: res 40/00:80:c0:d4:af/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
Oct 16 18:16:59 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 16 18:16:59 GLaDOS kernel: ata1: hard resetting link

Check/replace the cables on this drive. If you don't know how to find it, please post complete diags.
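Editor's note: when a saved syslog is long, it can help to count how often each ATA port reports this kind of link error before deciding which cable to look at. The following is only a minimal sketch in Python, assuming the log has been copied to a machine with Python installed (stock Unraid may not include it). The function name and the list of marker strings are assumptions chosen from the excerpt above, not an official list of kernel messages.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "Oct 16 18:16:59 GLaDOS kernel: ata1.00: exception Emask 0x10 ..."
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Substrings taken from the excerpt in this thread (an assumption, not exhaustive).
ERROR_MARKERS = ("exception Emask", "interface fatal error", "hard resetting link")


def count_ata_errors(lines):
    """Return a Counter of error-marker hits per ATA port (for example 'ata1')."""
    hits = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if match and any(marker in match.group(2) for marker in ERROR_MARKERS):
            hits[match.group(1)] += 1
    return hits


# Sample lines copied from the log posted above.
sample = [
    "Oct 16 18:14:16 GLaDOS kernel: ata1: EH complete",
    "Oct 16 18:16:59 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0x10000 SErr 0x400100 action 0x6 frozen",
    "Oct 16 18:16:59 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error",
    "Oct 16 18:16:59 GLaDOS kernel: ata1: hard resetting link",
]
print(dict(count_ata_errors(sample)))
```

To run it against a real file, pass an open file object instead of the sample list, for example `count_ata_errors(open("syslog.log", errors="replace"))`, where the file name is a placeholder.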
trurl Posted October 19, 2020

1 hour ago, JorgeB said:
please post complete diags.

You should post them anyway, since they would give more context and might even contain additional information that points to the issue causing the lockups.

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
brent3000 Posted October 20, 2020

See attached.

Regarding the cables, I have actually replaced them before (this is the third set) and it happened with all three. I changed the SATA cables for all my drives from normal, to ultra-thin, to shorter thin ones due to case limits.

Also, is ATA1 referencing the port on the mobo or a specific slot? Because that seems to be one of the cache drives (I have two in RAID).

glados-diagnostics-20201020-1307.zip
JorgeB Posted October 20, 2020

4 hours ago, brent3000 said:
cuz that seems to be one of the cache drives

In this case it's SATA port1, currently connected to cache1.
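Editor's note: one way to check for yourself which disk sits behind a given ataN is to look at where /sys/block/sdX points, since on typical Linux kernels the resolved path of a SATA disk contains the ataN component. The sketch below assumes that layout and assumes Python is available on the box (it may not be on stock Unraid); the function names and the example path, including its PCI address, are made up for illustration.

```python
import os
import re


def ata_port_from_sysfs_path(path):
    """Extract 'ataN' from a resolved /sys/block/<dev> path, or None if absent."""
    match = re.search(r"/(ata\d+)/", path)
    return match.group(1) if match else None


def map_block_devices(sys_block="/sys/block"):
    """Return {device name: 'ataN'} for every sd* device found on this machine."""
    mapping = {}
    if not os.path.isdir(sys_block):
        return mapping  # not Linux, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        if name.startswith("sd"):
            resolved = os.path.realpath(os.path.join(sys_block, name))
            port = ata_port_from_sysfs_path(resolved)
            if port:
                mapping[name] = port
    return mapping


# Illustrative path only; the PCI address and host numbers are invented.
example = "/sys/devices/pci0000:00/0000:00:17.0/ata1/host0/target0:0:0/0:0:0:0/block/sdb"
print(ata_port_from_sysfs_path(example))
print(map_block_devices())
```

USB sticks and other non-SATA disks have no ataN component in their path, so they are simply left out of the mapping.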
brent3000 Posted October 20, 2020

1 hour ago, JorgeB said:
In this case it's SATA port1, currently connected to cache1.

So is that definitely the issue? Why would it cause the whole system to lock up rather than just drop the SATA connection? Where would I see the port or this info in the UI (or is it only in the logs)?

The system is set up as follows:

Mobo -> SATA cable -> HDD backplane -> hot-swap port -> SSD (I have a DS380 case)

I have already replaced the SATA cables in the whole unit, so I don't think that's the issue. If I switch cache 1 and cache 2 in the bays, cache 2 should start reporting the errors (meaning it's not the SSD and the fault is isolated to the connections along that specific chain).

Then switching the backplane ports over (port 1 to port 2, 2 to 1, etc.) would show whether the fault is between the mobo and the cable or between the backplane and the drive.

Does that sound like a good diagnosis?

Also, was this identified from the initial log or from the diagnostics report? (In other words, if it happens again and I want to check, which file do I look in to see the issue?)
JorgeB Posted October 20, 2020

26 minutes ago, brent3000 said:
So is that deff the issue?

It's an issue that should be fixed.
JorgeB Posted October 20, 2020

27 minutes ago, brent3000 said:
if I switch Cache 1 and Cache 2 in the bays that would mean that cache 2 should start reporting errors (meaning its the not the SSD and its isolated to the connections along that specific chain)

Yes, swap cables/bays with another device.
brent3000 Posted October 20, 2020

OK, I'll start some testing, but should it be crashing the system the way it does? If it's the cache drive in RAID, should it not just fail that drive?

It may be a while until I reply with some test results, but I'll try some data dumping to see if I can make it trigger any sooner, haha.
JorgeB Posted October 20, 2020

ATA errors can cause timeouts that make the system unresponsive for several minutes, so that it sometimes appears to have crashed.
brent3000 Posted October 20, 2020

So, a quick one: I swapped the drives over in the hot-swap bays and it's still showing the same error. Can someone confirm whether it is SATA port 1 or SATA port 0? The log shows ATA1, but there is a reference to SAT0.PRT0.

Oct 20 21:07:47 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0x1e000000 SErr 0x400100 action 0x6 frozen
Oct 20 21:07:47 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Oct 20 21:07:47 GLaDOS kernel: ata1: SError: { UnrecovData Handshk }
Oct 20 21:07:47 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 20 21:07:47 GLaDOS kernel: ata1.00: cmd 61/00:c8:c0:16:f1/0a:00:13:00:00/40 tag 25 ncq dma 1310720 ou
Oct 20 21:07:47 GLaDOS kernel: res 40/00:c8:c0:16:f1/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Oct 20 21:07:47 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 20 21:07:47 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 20 21:07:47 GLaDOS kernel: ata1.00: cmd 61/00:d0:c0:20:f1/0a:00:13:00:00/40 tag 26 ncq dma 1310720 ou
Oct 20 21:07:47 GLaDOS kernel: res 40/00:c8:c0:16:f1/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Oct 20 21:07:47 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 20 21:07:47 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 20 21:07:47 GLaDOS kernel: ata1.00: cmd 61/00:d8:c0:2a:f1/0a:00:13:00:00/40 tag 27 ncq dma 1310720 ou
Oct 20 21:07:47 GLaDOS kernel: res 40/00:c8:c0:16:f1/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Oct 20 21:07:47 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 20 21:07:47 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 20 21:07:47 GLaDOS kernel: ata1.00: cmd 61/00:e0:c0:34:f1/0a:00:13:00:00/40 tag 28 ncq dma 1310720 ou
Oct 20 21:07:47 GLaDOS kernel: res 40/00:c8:c0:16:f1/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Oct 20 21:07:47 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 20 21:07:47 GLaDOS kernel: ata1: hard resetting link
Oct 20 21:07:47 GLaDOS kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 20 21:07:47 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 20 21:07:47 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Oct 20 21:07:47 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 20 21:07:47 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)

My next step is to switch ports 1 and 2 (aka 0 and 1) over on the backplane to see whether the errors stay the same or move (which should let me remove the backplane from the list of suspects and then focus on the cable/mobo port).
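Editor's note: the SErr value in the exception line is a bitmask, and the "SError: { ... }" line is the kernel's own decoding of it. A small sketch of that decoding is below, in Python. The bit-to-name table is reproduced from memory of the Linux libata driver and should be treated as an assumption; only the two names that appear in this thread (UnrecovData and Handshk) are confirmed by the log itself.

```python
# Bit position -> short name, as printed by the kernel in "SError: { ... }".
# Table recalled from the libata driver; treat entries not seen in this
# thread's logs as unverified.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


# SErr value from the exception lines in this thread.
print(decode_serror(0x400100))
```

For 0x400100 this yields UnrecovData and Handshk, which matches what the kernel printed. Both point at the link between controller and drive (cable, backplane, connector or port) rather than at the data on the disk, which is consistent with the advice to check cabling.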
JorgeB Posted October 20, 2020

ATA1 is the first MB port; some boards call it port0, others port1.
brent3000 Posted October 25, 2020

So I did an initial swap of the cables to the HDD bay and the error is still the same, so I'm going to change the ports over now. Quick question, though: as I have two cache drives, would the error change in any way to indicate which actual drive is causing the issue, or would it only show the SATA port error?

Oct 25 20:16:17 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0xe000 SErr 0x400100 action 0x6 frozen
Oct 25 20:16:17 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Oct 25 20:16:17 GLaDOS kernel: ata1: SError: { UnrecovData Handshk }
Oct 25 20:16:17 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 20:16:17 GLaDOS kernel: ata1.00: cmd 61/00:68:40:3d:42/0a:00:1e:00:00/40 tag 13 ncq dma 1310720 ou
Oct 25 20:16:17 GLaDOS kernel: res 40/00:68:40:3d:42/00:00:1e:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 20:16:17 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 20:16:17 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 20:16:17 GLaDOS kernel: ata1.00: cmd 61/00:70:40:47:42/0a:00:1e:00:00/40 tag 14 ncq dma 1310720 ou
Oct 25 20:16:17 GLaDOS kernel: res 40/00:68:40:3d:42/00:00:1e:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 20:16:17 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 20:16:17 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 20:16:17 GLaDOS kernel: ata1.00: cmd 61/00:78:40:51:42/0a:00:1e:00:00/40 tag 15 ncq dma 1310720 ou
Oct 25 20:16:17 GLaDOS kernel: res 40/00:68:40:3d:42/00:00:1e:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 20:16:17 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 20:16:17 GLaDOS kernel: ata1: hard resetting link
Oct 25 20:16:17 GLaDOS kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 25 20:16:17 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 25 20:16:17 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Oct 25 20:16:17 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 25 20:16:17 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Oct 25 20:16:17 GLaDOS kernel: ata1.00: configured for UDMA/133
Oct 25 20:16:17 GLaDOS kernel: ata1: EH complete

I was also wondering whether it's possible to go from a RAID-type setup on the cache drives to non-RAID without having to re-install or change anything in the docker files, VMs, etc.
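Editor's note: the SAct value in the exception line can be read as a bitmask of the NCQ command tags that were in flight when the link error hit, which is why each exception is followed by several "failed command" entries. The sketch below, in Python, only illustrates that reading; the function name is made up, and the interpretation is checked solely against the tag numbers printed in the logs in this thread.

```python
def active_tags(sact):
    """Return the NCQ tag numbers (0-31) whose bits are set in an SAct value."""
    return [bit for bit in range(32) if sact & (1 << bit)]


# SAct values copied from the exception lines posted in this thread.
for sact in (0x10000, 0x1e000000, 0xe000):
    print(hex(sact), active_tags(sact))
```

For example, 0xe000 gives tags 13, 14 and 15, the same tags shown on the three failed WRITE FPDMA QUEUED commands in the log above. The tags identify queued commands on the one device behind ata1, so they do not say anything about the second cache drive.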
brent3000 Posted October 25, 2020

I also changed the cables over and the same error codes seem to come through. Any ideas? Was there anything else in the logs? Or is it looking like a mobo fault? I can't say I've ever had a port cause me issues on a board before.

Oct 25 21:22:12 GLaDOS root: Fix Common Problems: Other Warning: Background notifications not enabled
Oct 25 21:25:09 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0x1e0 SErr 0x400100 action 0x6 frozen
Oct 25 21:25:09 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Oct 25 21:25:09 GLaDOS kernel: ata1: SError: { UnrecovData Handshk }
Oct 25 21:25:09 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 21:25:09 GLaDOS kernel: ata1.00: cmd 61/00:28:40:8e:34/0a:00:24:00:00/40 tag 5 ncq dma 1310720 ou
Oct 25 21:25:09 GLaDOS kernel: res 40/00:28:40:8e:34/00:00:24:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 21:25:09 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 21:25:09 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 21:25:09 GLaDOS kernel: ata1.00: cmd 61/80:30:40:98:34/09:00:24:00:00/40 tag 6 ncq dma 1245184 ou
Oct 25 21:25:09 GLaDOS kernel: res 40/00:28:40:8e:34/00:00:24:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 21:25:09 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 21:25:09 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 21:25:09 GLaDOS kernel: ata1.00: cmd 61/80:38:c0:a1:34/09:00:24:00:00/40 tag 7 ncq dma 1245184 ou
Oct 25 21:25:09 GLaDOS kernel: res 40/00:28:40:8e:34/00:00:24:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 21:25:09 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 21:25:09 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 25 21:25:09 GLaDOS kernel: ata1.00: cmd 61/80:40:40:ab:34/09:00:24:00:00/40 tag 8 ncq dma 1245184 ou
Oct 25 21:25:09 GLaDOS kernel: res 40/00:28:40:8e:34/00:00:24:00:00/40 Emask 0x10 (ATA bus error)
Oct 25 21:25:09 GLaDOS kernel: ata1.00: status: { DRDY }
Oct 25 21:25:09 GLaDOS kernel: ata1: hard resetting link
Oct 25 21:25:10 GLaDOS kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 25 21:25:10 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 25 21:25:10 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Oct 25 21:25:10 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Oct 25 21:25:10 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Oct 25 21:25:10 GLaDOS kernel: ata1.00: configured for UDMA/133
Oct 25 21:25:10 GLaDOS kernel: ata1: EH complete
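Editor's note: the "cmd 61/..." field on each failed command can be unpacked to see how large the write was and where on the disk it was aimed. The sketch below, in Python, assumes the field order used by the Linux libata error handler (command / feature:nsect:lbal:lbam:lbah / the five matching high-order bytes / device) and the NCQ convention that the sector count travels in the feature bytes while the tag sits in the upper five bits of nsect. It also assumes 512-byte logical sectors. These are assumptions from memory, but they reproduce the tag and "ncq dma" byte counts printed in the logs in this thread. The function name is made up.

```python
def decode_ncq_cmd(taskfile, sector_size=512):
    """Decode a libata 'cmd' field for an NCQ read/write into its parts.

    taskfile looks like '61/00:80:c0:d4:af/0a:00:12:00:00/40'.
    A count field of 0 (which means 65536 sectors) is not handled here.
    """
    command, low, high, _device = taskfile.split("/")
    feature, nsect, lbal, lbam, lbah = (int(byte, 16) for byte in low.split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(byte, 16) for byte in high.split(":")
    )
    sectors = (hob_feature << 8) | feature
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "command": int(command, 16),  # 0x61 is WRITE FPDMA QUEUED
        "tag": nsect >> 3,
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "lba": lba,
    }


# Two cmd fields copied from the logs in this thread.
print(decode_ncq_cmd("61/00:80:c0:d4:af/0a:00:12:00:00/40"))
print(decode_ncq_cmd("61/80:30:40:98:34/09:00:24:00:00/40"))
```

The decoded sizes are ordinary large sequential writes of a little over 1 MiB each, which fits the observation that the errors show up during big transfers to the cache.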
brent3000 Posted October 25, 2020

Attached are the two latest log packs. Is there anything else I should try, or should I verify by splitting the cache into single drives and testing the drives based on the ports?

glados-diagnostics-20201025-2150.zip
syslog-192.168.2.1.log
JorgeB Posted October 25, 2020

If you haven't yet, connect a different device to that port; if the errors persist it's likely a board problem.
brent3000 Posted December 10, 2020

Sorry to bring this one back, but getting the RMA board processed took a bit longer than normal.

So I ended up getting the board switched and moved everything around; however, a test just now gets me this, the exact same error:

Dec 10 14:29:28 GLaDOS kernel: ata1.00: exception Emask 0x10 SAct 0x20000000 SErr 0x400100 action 0x6 frozen
Dec 10 14:29:28 GLaDOS kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Dec 10 14:29:28 GLaDOS kernel: ata1: SError: { UnrecovData Handshk }
Dec 10 14:29:28 GLaDOS kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Dec 10 14:29:28 GLaDOS kernel: ata1.00: cmd 61/00:e8:40:74:17/0a:00:1f:00:00/40 tag 29 ncq dma 1310720 ou
Dec 10 14:29:28 GLaDOS kernel: res 40/00:e8:40:74:17/00:00:1f:00:00/40 Emask 0x10 (ATA bus error)
Dec 10 14:29:28 GLaDOS kernel: ata1.00: status: { DRDY }
Dec 10 14:29:28 GLaDOS kernel: ata1: hard resetting link
Dec 10 14:29:28 GLaDOS kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 10 14:29:28 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Dec 10 14:29:28 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Dec 10 14:29:28 GLaDOS kernel: ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.SAT0.PRT0._GTF.DSSP], AE_NOT_FOUND (20180810/psargs-330)
Dec 10 14:29:28 GLaDOS kernel: ACPI Error: Method parse/execution failed \_SB.PCI0.SAT0.PRT0._GTF, AE_NOT_FOUND (20180810/psparse-514)
Dec 10 14:29:28 GLaDOS kernel: ata1.00: configured for UDMA/133
Dec 10 14:29:28 GLaDOS kernel: ata1: EH complete
Dec 10 14:30:12 GLaDOS ntpd[1809]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized

There is one final thing I had thought of and ruled out (but I don't think I did enough switching to rule it out fully): I'm using a Silverstone DS380 (which has a backplane installed). Would it be possible that the fault is with the backplane?

Tests I have done (note the cache drives are in Bay 1, SATA0 and Bay 2, SATA1):

When I swapped the cache drives around, because they are running in a 'RAID'-type format the same data was always hitting both drives (aka ATA0 and ATA1 still received the same traffic). That was also the case when I switched them back and forth on the physical backplane.

My TL;DR question, to avoid ripping out the system and moving cables around: can I turn off RAID on the cache drives, do some testing without RAID (write tests to each drive individually), find out whether the backplane is the issue, and then simply re-enable RAID?
JorgeB Posted December 10, 2020

If it's a raid1 pool you can remove one of the devices, then test with the remaining one.
brent3000 Posted December 23, 2020

I'm back after a bunch more testing and generally pulling my hair out, with kind of a new issue now.

The system still crashes, but after the mobo replacement and some other changes (cables moved/changed, etc.) it now just goes offline and the logs show nothing, whereas before they listed the SATA port fault. I'm not sure if I'm missing something, so I'm asking for some help on this one.

Attached is the report, and below is the running syslog. I have posted two sections showing that doing the same thing (a data dump onto one of the drives) just takes it offline.

Dec 23 20:57:11 GLaDOS unassigned.devices: Don't spin down device '/dev/sdc'.
Dec 23 20:57:11 GLaDOS unassigned.devices: Removing SMB share 'Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R'
Dec 23 20:57:11 GLaDOS unassigned.devices: Unmounting disk 'Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R'...
Dec 23 20:57:11 GLaDOS unassigned.devices: Unmounting '/dev/sdc1'...
Dec 23 20:57:11 GLaDOS unassigned.devices: Unmount cmd: /sbin/umount '/dev/sdc1' 2>&1
Dec 23 20:57:11 GLaDOS kernel: XFS (sdc1): Unmounting Filesystem
Dec 23 20:57:11 GLaDOS unassigned.devices: Successfully unmounted '/dev/sdc1'
Dec 23 20:57:11 GLaDOS unassigned.devices: Disk with serial 'Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R', mountpoint 'Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R' removed successfully.
Dec 23 20:57:11 GLaDOS emhttpd: shcmd (119): /etc/rc.d/rc.samba stop
Dec 23 20:57:11 GLaDOS emhttpd: shcmd (120): rm -f /etc/avahi/services/smb.service
Dec 23 20:57:11 GLaDOS emhttpd: Stopping mover...
Dec 23 20:57:11 GLaDOS emhttpd: shcmd (123): /usr/local/sbin/mover stop
Dec 23 20:57:11 GLaDOS root: mover: not running
Dec 23 20:57:11 GLaDOS emhttpd: Sync filesystems...
Dec 23 20:57:11 GLaDOS emhttpd: shcmd (124): sync

Dec 23 21:01:46 GLaDOS cache_dirs: Arguments=-l off
Dec 23 21:01:46 GLaDOS cache_dirs: Max Scan Secs=10, Min Scan Secs=1
Dec 23 21:01:46 GLaDOS cache_dirs: Scan Type=adaptive
Dec 23 21:01:46 GLaDOS cache_dirs: Min Scan Depth=4
Dec 23 21:01:46 GLaDOS cache_dirs: Max Scan Depth=none
Dec 23 21:01:46 GLaDOS cache_dirs: Use Command='find -noleaf'
Dec 23 21:01:46 GLaDOS cache_dirs: ---------- Caching Directories ---------------
Dec 23 21:01:51 GLaDOS dnsmasq[7446]: reading /etc/resolv.conf
Dec 23 21:01:51 GLaDOS dnsmasq[7446]: using nameserver 192.168.1.1#53
Dec 23 21:01:51 GLaDOS dnsmasq[7446]: read /etc/hosts - 2 addresses
Dec 23 21:01:51 GLaDOS dnsmasq[7446]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Dec 23 21:01:51 GLaDOS dnsmasq-dhcp[7446]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Dec 23 21:01:51 GLaDOS kernel: virbr0: port 1(virbr0-nic) entered disabled state
Dec 23 21:01:51 GLaDOS kernel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
Dec 23 21:02:05 GLaDOS unassigned.devices: Adding disk '/dev/sdb1'...
Dec 23 21:02:05 GLaDOS unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime,discard '/dev/sdb1' '/mnt/disks/Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R'
Dec 23 21:02:05 GLaDOS kernel: XFS (sdb1): Mounting V5 Filesystem
Dec 23 21:02:05 GLaDOS kernel: XFS (sdb1): Ending clean mount
Dec 23 21:02:05 GLaDOS unassigned.devices: Successfully mounted '/dev/sdb1' on '/mnt/disks/Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R'.
Dec 23 21:02:05 GLaDOS unassigned.devices: Adding SMB share 'Seagate_BarraCuda_120_SSD_ZA500CM10003_7QV03Q1R'.
Dec 23 21:02:05 GLaDOS unassigned.devices: Don't spin down device '/dev/sdb'.
Dec 23 21:06:55 GLaDOS ntpd[1806]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Dec 23 21:11:00 GLaDOS root: Fix Common Problems Version 2020.12.19
Dec 23 21:11:13 GLaDOS root: Fix Common Problems: Other Warning: Background notifications not enabled

Dec 23 21:33:20 GLaDOS cache_dirs: Arguments=-l off
Dec 23 21:33:20 GLaDOS cache_dirs: Max Scan Secs=10, Min Scan Secs=1
Dec 23 21:33:20 GLaDOS cache_dirs: Scan Type=adaptive
Dec 23 21:33:20 GLaDOS cache_dirs: Min Scan Depth=4
Dec 23 21:33:20 GLaDOS cache_dirs: Max Scan Depth=none
Dec 23 21:33:20 GLaDOS cache_dirs: Use Command='find -noleaf'
Dec 23 21:33:20 GLaDOS cache_dirs: ---------- Caching Directories ---------------

glados-diagnostics-20201223-2158.zip
JorgeB Posted December 23, 2020

I don't see any issues in the snippets posted, or in the syslog.
brent3000 Posted December 23, 2020

Well, that's not good.

I assume it's partly because it's normally a headless system, but is there a reason the display just shows a black screen on a crash? I assume Unraid doesn't have an MS BSOD-type system for on-screen errors?
JorgeB Posted December 24, 2020

One thing you can try is to boot the server in safe mode with all dockers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
brent3000 Posted December 26, 2020

OK, so for a good part of the day I have been ramming data onto the system to try to stress it with both transfers and HandBrake encoding (trying to do a bunch of things to hit the drive).

Still, after a good day of hitting it hard, it randomly dropped off out of nowhere. What's odd is the time it happened: around 2am (or close to it, as I don't have a timestamp), which is around when the mover is normally due to kick off.

Any thoughts? There were no other actions happening apart from a file transfer (100GB at the time), but over the day I had been hitting it with around 2TB of data transactions to test some recent changes (HW replacements).

Dec 27 00:33:42 GLaDOS kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethe106026: link becomes ready
Dec 27 00:33:42 GLaDOS kernel: docker0: port 1(vethe106026) entered blocking state
Dec 27 00:33:42 GLaDOS kernel: docker0: port 1(vethe106026) entered forwarding state
Dec 27 00:51:27 GLaDOS kernel: veth80a2f59: renamed from eth0
Dec 27 00:51:27 GLaDOS kernel: docker0: port 1(vethe106026) entered disabled state
Dec 27 00:51:27 GLaDOS kernel: docker0: port 1(vethe106026) entered disabled state
Dec 27 00:51:27 GLaDOS kernel: device vethe106026 left promiscuous mode
Dec 27 00:51:27 GLaDOS kernel: docker0: port 1(vethe106026) entered disabled state
Dec 27 01:43:15 GLaDOS crond[1825]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Dec 27 01:45:40 GLaDOS emhttpd: shcmd (240): /usr/local/sbin/mover &> /dev/null &
Dec 27 02:09:10 GLaDOS cache_dirs: Arguments=-l off
Dec 27 02:09:10 GLaDOS cache_dirs: Max Scan Secs=10, Min Scan Secs=1
Dec 27 02:09:10 GLaDOS cache_dirs: Scan Type=adaptive
Dec 27 02:09:10 GLaDOS cache_dirs: Min Scan Depth=4
Dec 27 02:09:10 GLaDOS cache_dirs: Max Scan Depth=none
Dec 27 02:09:10 GLaDOS cache_dirs: Use Command='find -noleaf'
Dec 27 02:09:10 GLaDOS cache_dirs: ---------- Caching Directories ---------------
Dec 27 02:09:10 GLaDOS cache_dirs: Movies
Dec 27 02:09:10 GLaDOS cache_dirs: Music
Dec 27 02:09:10 GLaDOS cache_dirs: Network Apps
Dec 27 02:09:10 GLaDOS cache_dirs: TV Shows
Dec 27 02:09:10 GLaDOS cache_dirs: appdata
Dec 27 02:09:10 GLaDOS cache_dirs: backups
Dec 27 02:09:10 GLaDOS cache_dirs: domains
Dec 27 02:09:10 GLaDOS cache_dirs: iso archives
Dec 27 02:09:10 GLaDOS cache_dirs: isos
Dec 27 02:09:10 GLaDOS cache_dirs: system
Dec 27 02:09:10 GLaDOS cache_dirs: torrent
Dec 27 02:09:10 GLaDOS cache_dirs: zCacheStore
Dec 27 02:09:10 GLaDOS cache_dirs: zPublic
Dec 27 02:09:10 GLaDOS cache_dirs: zTehPurge
Dec 27 02:09:10 GLaDOS cache_dirs: ----------------------------------------------
Dec 27 02:09:10 GLaDOS cache_dirs: Setting Included dirs:
Dec 27 02:09:10 GLaDOS cache_dirs: Setting Excluded dirs:
Dec 27 02:09:10 GLaDOS cache_dirs: min_disk_idle_before_restarting_scan_sec=60
Dec 27 02:09:10 GLaDOS cache_dirs: scan_timeout_sec_idle=150
Dec 27 02:09:10 GLaDOS cache_dirs: scan_timeout_sec_busy=30
Dec 27 02:09:10 GLaDOS cache_dirs: scan_timeout_sec_stable=30
Dec 27 02:09:10 GLaDOS cache_dirs: frequency_of_full_depth_scan_sec=604800

glados-diagnostics-20201227-0209.zip