
JorgeB

Moderators
  • Posts

    67,645
  • Joined

  • Last visited

  • Days Won

    707

Everything posted by JorgeB

  1. It's a known btrfs bug with an odd number of devices in raid1, but only the reported free space is wrong. You can still use the full capacity, and the reported free space becomes less wrong as the pool fills up.
  2. Problem appears to start with various Nvidia related crashes.
  3. May 7 10:23:26 Isuserver kernel: mdcmd (36): check nocorrect
     The automatic parity check after an unclean shutdown is always non-correcting. The second one was correcting, so parity is corrected now.
  4. Please post the diagnostics: Tools -> Diagnostics
  5. There's an optical drive generating a lot of timeout errors. These can make the server unresponsive, so you should fix it (by replacing cables) or disconnect it if new cables don't help.
     May 9 21:50:40 Tower kernel: sr 3:0:0:0: [sr0] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=21s
     May 9 21:50:40 Tower kernel: sr 3:0:0:0: [sr0] tag#12 Sense Key : 0x4 [current]
     May 9 21:50:40 Tower kernel: sr 3:0:0:0: [sr0] tag#12 ASC=0x3e ASCQ=0x2
     May 9 21:50:40 Tower kernel: sr 3:0:0:0: [sr0] tag#12 CDB: opcode=0x28 28 00 00 00 72 5e 00 00 03 00
     May 9 21:50:40 Tower kernel: blk_update_request: I/O error, dev sr0, sector 117112 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
     May 9 21:50:57 Tower kernel: sr 3:0:0:0: [sr0] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=16s
     May 9 21:50:57 Tower kernel: sr 3:0:0:0: [sr0] tag#12 Sense Key : 0x4 [current]
     May 9 21:50:57 Tower kernel: sr 3:0:0:0: [sr0] tag#12 ASC=0x3e ASCQ=0x2
     May 9 21:50:57 Tower kernel: sr 3:0:0:0: [sr0] tag#12 CDB: opcode=0x28 28 00 00 00 72 5e 00 00 03 00
     May 9 21:50:57 Tower kernel: blk_update_request: I/O error, dev sr0, sector 117112 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
  6. That will happen if your cache is a single device and those shares have some files (or folders) on cache.
  7. It's fine if it's a standard backplane; if it's a NetApp or similar, the disk identification can change.
  8. It dropped again. Try swapping NVMe slots and see if the problem stays with the slot or follows the device.
  9. Hmm:
     May 10 18:41:58 Tower root: mkdir: cannot create directory '/mnt/user/domains': No space left on device
     Not seeing a reason for the not enough space error though.
  10. You need to remove that or nothing will be done.
  11. Unless errors are expected it should be non-correcting, because in some rare cases, if there are disk problems, a correcting check can wrongly update parity and corrupt it.
  12. The second one is showing corruption errors. Unless they are old, that suggests a hardware problem, like bad RAM. NVMe devices dropping is usually a BIOS/kernel issue, but it could also be a bad device, though that's unlikely. This sometimes helps:
      Some NVMe devices have issues with power states on Linux. Try this: on the main GUI page click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right) and add this to your default boot option, after "append initrd=/bzroot":
      nvme_core.default_ps_max_latency_us=0
      e.g.:
      append initrd=/bzroot nvme_core.default_ps_max_latency_us=0
      Reboot and see if it makes a difference.
  13. CPU temps are not a stock feature, you should post instead on the support thread of the plugin you're using.
  14. It won't hurt, just make sure scheduled checks are non-correcting.
  15. If it doesn't find any errors now I would assume the problem is fixed, then wait until the next scheduled check.
  16. I would think mentioning the disk is failing makes it clear that it should be replaced, but yeah, it has a lot of pending sectors and it's generating media errors.
  17. That log shows HBA problems:
      May 8 11:06:21 magnas kernel: mpt2sas_cm0: SAS host is non-operational !!!!
      May 8 11:06:22 magnas kernel: mpt2sas_cm0: SAS host is non-operational !!!!
      May 8 11:06:23 magnas kernel: mpt2sas_cm0: SAS host is non-operational !!!!
  18. One of the NVMe devices dropped offline, see here for better pool monitoring.
  19. Looks more like a connection problem, start by replacing the cables.
  20. Tools -> Diagnostics
  21. Disks look fine, so it can be power or cable related; cables can go bad even without being touched. Both disks have several UDMA CRC errors, so unless those errors are old, bad SATA cables would be the prime suspect.
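
The kernel messages quoted in items 5 and 17 are the kind of thing that is easy to miss in a long syslog. As a rough sketch only (not a tool from these posts), the Python below counts the two message types shown there: block-layer "I/O error, dev ..." lines per device and "SAS host is non-operational" lines per controller. The function name `summarize` and the two regular expressions are assumptions, written to match only the excerpts quoted above; other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled only on the log excerpts quoted in items 5 and 17.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")
HBA_DOWN = re.compile(r"(\w+): SAS host is non-operational")


def summarize(lines):
    """Count I/O errors per block device and HBA-down messages per controller."""
    io_errors = Counter()
    hba_down = Counter()
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            io_errors[match.group(1)] += 1
            continue
        match = HBA_DOWN.search(line)
        if match:
            hba_down[match.group(1)] += 1
    return io_errors, hba_down


if __name__ == "__main__":
    # Lines copied from the posts above.
    sample = [
        "May 9 21:50:40 Tower kernel: blk_update_request: I/O error, dev sr0, "
        "sector 117112 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0",
        "May 9 21:50:57 Tower kernel: blk_update_request: I/O error, dev sr0, "
        "sector 117112 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0",
        "May 8 11:06:21 magnas kernel: mpt2sas_cm0: SAS host is non-operational !!!!",
    ]
    io_errors, hba_down = summarize(sample)
    print(dict(io_errors), dict(hba_down))
```

A repeated count against a single device (here `sr0`) points at that device or its cabling, which is the reasoning used in item 5.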
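
For the boot option change described in item 12, the edit itself is made in the GUI, but the intended result can be illustrated with a small sketch. The helper below (a hypothetical name, `add_kernel_param`, not part of Unraid) takes one `append` line and adds the parameter from the post only if it is not already there, so running it twice does not duplicate the entry.

```python
# Parameter quoted in item 12.
PARAM = "nvme_core.default_ps_max_latency_us=0"


def add_kernel_param(append_line, param=PARAM):
    """Return the append line with param added at the end, unless already present."""
    if param in append_line.split():
        return append_line
    return append_line.rstrip() + " " + param


if __name__ == "__main__":
    line = "append initrd=/bzroot"
    updated = add_kernel_param(line)
    print(updated)
    # Applying it again leaves the line unchanged.
    print(add_kernel_param(updated) == updated)
```

The check is done on whitespace-separated tokens rather than a substring match, so a different value for the same option would not be mistaken for this one.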
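
Regarding item 1, it may help to see what "full capacity" works out to for a raid1 pool. The sketch below is general reasoning about two-copy mirroring, not something stated in the post: it assumes every block is stored on two different devices, so usable space is limited both by half the total and by what the other devices can mirror for the largest one. The function name `raid1_usable` and the example sizes are illustrative assumptions.

```python
def raid1_usable(device_sizes):
    """Estimate usable capacity when each block is kept on two different devices."""
    total = sum(device_sizes)
    largest = max(device_sizes)
    # Half the raw space, unless the largest device cannot be fully mirrored
    # by the remaining devices combined.
    return min(total / 2, total - largest)


if __name__ == "__main__":
    # Three equal devices (sizes in GB, hypothetical values).
    print(raid1_usable([1000, 1000, 1000]))
    # One large device with two small ones.
    print(raid1_usable([4000, 1000, 1000]))
```

Under that assumption, three equal devices give half of the combined raw size, even though the free space reported for the pool may say otherwise.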