VM won't start after update to 6.3.0 and restart


stottle


Balance failed and there are other errors.

 

Here's a snippet from the syslog:

Feb 15 18:30:52 Tower2 emhttp: shcmd (147): set -o pipefail ; /sbin/btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache |& logger &
Feb 15 18:30:52 Tower2 emhttp: shcmd (148): sync
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1937353211904 flags 1
Feb 15 18:30:52 Tower2 emhttp: shcmd (149): mkdir /mnt/user0
Feb 15 18:30:52 Tower2 emhttp: shcmd (150): /usr/local/sbin/shfs /mnt/user0 -disks 62 -o noatime,big_writes,allow_other   |& logger
Feb 15 18:30:52 Tower2 emhttp: shcmd (151): mkdir /mnt/user
Feb 15 18:30:52 Tower2 emhttp: shcmd (152): /usr/local/sbin/shfs /mnt/user -disks 63 2048000000 -o noatime,big_writes,allow_other  -o remember=0  |& logger
Feb 15 18:30:52 Tower2 emhttp: shcmd (153): cat - > /boot/config/plugins/dynamix/mover.cron <<< "# Generated mover schedule:#01240 3 * * * /usr/local/sbin/mover |& logger#012"
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1936279470080 flags 1
Feb 15 18:30:52 Tower2 emhttp: shcmd (154): /usr/local/sbin/update_cron &> /dev/null
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1935205728256 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1934131986432 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1933058244608 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1931984502784 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1930910760960 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1929837019136 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1928763277312 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1927689535488 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1926615793664 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1925542051840 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1924468310016 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1923394568192 flags 1
Feb 15 18:30:52 Tower2 emhttp: Starting services...
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1922320826368 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1921247084544 flags 1
Feb 15 18:30:52 Tower2 kernel: ata8.00: exception Emask 0x10 SAct 0x70000 SErr 0x400000 action 0x6 frozen
Feb 15 18:30:52 Tower2 kernel: ata8.00: irq_stat 0x08000000, interface fatal error

 

Diagnostics attached

tower2-diagnostics-20170216-1754.zip


These are not errors; this is a normal balance operation:

 

Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1935205728256 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1934131986432 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1933058244608 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1931984502784 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1930910760960 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1929837019136 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1928763277312 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1927689535488 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1926615793664 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1925542051840 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1924468310016 flags 1
Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1923394568192 flags 1

 

These are hardware errors. ATA8 is your cache2, and it still has issues. If you have already replaced both cables and are using a different SATA port, then it's probably a bad SSD:

 

Feb 15 18:30:52 Tower2 kernel: ata8.00: exception Emask 0x10 SAct 0x70000 SErr 0x400000 action 0x6 frozen
Feb 15 18:30:52 Tower2 kernel: ata8.00: irq_stat 0x08000000, interface fatal error
Feb 15 18:30:52 Tower2 kernel: ata8: SError: { Handshk }
Feb 15 18:30:52 Tower2 kernel: ata8.00: failed command: WRITE FPDMA QUEUED
Feb 15 18:30:52 Tower2 kernel: ata8.00: cmd 61/80:80:40:c4:00/00:00:00:00:00/40 tag 16 ncq dma 65536 out
Feb 15 18:30:52 Tower2 kernel:         res 40/00:80:40:c4:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 15 18:30:52 Tower2 kernel: ata8.00: status: { DRDY }
Feb 15 18:30:52 Tower2 kernel: ata8.00: failed command: WRITE FPDMA QUEUED
Feb 15 18:30:52 Tower2 kernel: ata8.00: cmd 61/80:88:c0:c4:00/00:00:00:00:00/40 tag 17 ncq dma 65536 out
Feb 15 18:30:52 Tower2 kernel:         res 40/00:80:40:c4:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 15 18:30:52 Tower2 kernel: ata8.00: status: { DRDY }
Feb 15 18:30:52 Tower2 kernel: ata8.00: failed command: WRITE FPDMA QUEUED
Feb 15 18:30:52 Tower2 kernel: ata8.00: cmd 61/80:90:40:c5:00/00:00:00:00:00/40 tag 18 ncq dma 65536 out
Feb 15 18:30:52 Tower2 kernel:         res 40/00:80:40:c4:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 15 18:30:52 Tower2 kernel: ata8.00: status: { DRDY }
Feb 15 18:30:52 Tower2 kernel: ata8: hard resetting link
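If you want to check the rest of the syslog for the same kind of link errors, something like the following could help. It is only a rough sketch: the function name `count_ata_errors` and the list of marker strings are my own choices (the markers are copied from the messages in this thread and are not an exhaustive list of libata errors), and the sample lines are taken from the excerpt above.

```python
import re
from collections import Counter

# Matches kernel messages for an ATA port or device, e.g. "ata8: ..." or "ata8.00: ...".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Marker substrings copied from the log in this thread (illustrative, not exhaustive).
SUSPECT = (
    "interface fatal error",
    "failed command",
    "hard resetting link",
    "softreset failed",
    "reset failed, giving up",
)

def count_ata_errors(lines):
    """Return {port: Counter({marker: occurrences})} for suspicious ATA messages."""
    counts = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # not an ATA kernel message (e.g. a BTRFS info line)
        port, message = match.groups()
        for marker in SUSPECT:
            if marker in message:
                counts.setdefault(port, Counter())[marker] += 1
    return counts

sample = [
    "Feb 15 18:30:52 Tower2 kernel: ata8.00: irq_stat 0x08000000, interface fatal error",
    "Feb 15 18:30:52 Tower2 kernel: ata8.00: failed command: WRITE FPDMA QUEUED",
    "Feb 15 18:30:52 Tower2 kernel: ata8: hard resetting link",
    "Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1921247084544 flags 1",
]
result = count_ata_errors(sample)
print(result)
```

To run it against a whole log, pass the open syslog file from the diagnostics zip instead of `sample`. A port that keeps showing up here after cable and port swaps points at the drive itself.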

 

A little after this it dropped offline and made btrfs crash:

 

Feb 15 18:31:02 Tower2 kernel: ata8: softreset failed (1st FIS failed)

Feb 15 18:31:02 Tower2 kernel: ata8: hard resetting link

Feb 15 18:31:12 Tower2 kernel: ata8: softreset failed (1st FIS failed)

Feb 15 18:31:12 Tower2 kernel: ata8: hard resetting link

Feb 15 18:31:47 Tower2 kernel: ata8: softreset failed (1st FIS failed)

Feb 15 18:31:47 Tower2 kernel: ata8: limiting SATA link speed to 3.0 Gbps

Feb 15 18:31:47 Tower2 kernel: ata8: hard resetting link

Feb 15 18:31:52 Tower2 kernel: ata8: softreset failed (1st FIS failed)

Feb 15 18:31:52 Tower2 kernel: ata8: reset failed, giving up

Feb 15 18:31:52 Tower2 kernel: ata8.00: disabled

Feb 15 18:31:52 Tower2 kernel: ata8: EH complete

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#21 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#21 CDB: opcode=0x2a 2a 00 00 00 c5 40 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50496

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#22 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#22 CDB: opcode=0x2a 2a 00 00 00 c4 c0 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50368

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#23 CDB: opcode=0x2a 2a 00 00 00 c4 40 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50240

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#24 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#24 CDB: opcode=0x2a 2a 00 00 00 c5 c0 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50624

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#25 CDB: opcode=0x2a 2a 00 00 00 c6 40 00 02 00 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50752

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 5, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#26 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#26 CDB: opcode=0x2a 2a 00 00 00 c4 40 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50240

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#27 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#27 CDB: opcode=0x2a 2a 00 00 00 c4 c0 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50368

Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#28 CDB: opcode=0x2a 2a 00 00 00 c5 40 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50496

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#29 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#29 CDB: opcode=0x2a 2a 00 00 00 c5 c0 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50624

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Feb 15 18:31:52 Tower2 kernel: sd 8:0:0:0: [sdh] tag#30 CDB: opcode=0x2a 2a 00 00 00 c6 40 00 00 80 00

Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50752

Feb 15 18:31:52 Tower2 kernel: BTRFS: error (device sdb1) in write_all_supers:3741: errno=-5 IO failure (errors while submitting device barriers.)

Feb 15 18:31:52 Tower2 kernel: BTRFS info (device sdb1): forced readonly

Feb 15 18:31:52 Tower2 kernel: BTRFS warning (device sdb1): Skipping commit of aborted transaction.

Feb 15 18:31:52 Tower2 kernel: ------------[ cut here ]------------

Feb 15 18:31:52 Tower2 kernel: WARNING: CPU: 0 PID: 11044 at fs/btrfs/transaction.c:1850 cleanup_transaction+0x8c/0x238

Feb 15 18:31:52 Tower2 kernel: BTRFS: Transaction aborted (error -5)

Feb 15 18:31:52 Tower2 kernel: Modules linked in: md_mod x86_pkg_temp_thermal coretemp kvm_intel kvm mxm_wmi ahci e1000e ptp i2c_i801 i2c_smbus i2c_core pps_core libahci video wmi backlight [last unloaded: md_mod]

Feb 15 18:31:52 Tower2 kernel: CPU: 0 PID: 11044 Comm: btrfs Not tainted 4.9.8-unRAID #1

Feb 15 18:31:52 Tower2 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z170 OC Formula, BIOS P2.10 01/25/2016

Feb 15 18:31:52 Tower2 kernel: ffffc9000776b988 ffffffff813a34fa ffffc9000776b9d8 ffffffff8196add8

Feb 15 18:31:52 Tower2 kernel: ffffc9000776b9c8 ffffffff8104d04c 0000073a0776ba40 ffff8808601b4000

Feb 15 18:31:52 Tower2 kernel: ffff88085e731800 ffff8808601ac000 00000000fffffffb 0000000000000000

Feb 15 18:31:52 Tower2 kernel: Call Trace:

Feb 15 18:31:52 Tower2 kernel: [<ffffffff813a34fa>] dump_stack+0x61/0x7e

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8104d04c>] __warn+0xb8/0xd3

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8104d0ad>] warn_slowpath_fmt+0x46/0x4e

Feb 15 18:31:52 Tower2 kernel: [<ffffffff812edc8e>] cleanup_transaction+0x8c/0x238

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8107c00f>] ? wake_up_bit+0x25/0x25

Feb 15 18:31:52 Tower2 kernel: [<ffffffff812ef9f6>] btrfs_commit_transaction.part.11+0x912/0x927

Feb 15 18:31:52 Tower2 kernel: [<ffffffff812efa51>] btrfs_commit_transaction+0x46/0x4d

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81337ff1>] prepare_to_merge+0x1db/0x1f5

Feb 15 18:31:52 Tower2 kernel: [<ffffffff813388a3>] relocate_block_group+0x216/0x500

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81338ccc>] btrfs_relocate_block_group+0x13f/0x26b

Feb 15 18:31:52 Tower2 kernel: BTRFS: error (device loop0) in write_all_supers:3741: errno=-5 IO failure (errors while submitting device barriers.)

Feb 15 18:31:52 Tower2 kernel: BTRFS info (device loop0): forced readonly

Feb 15 18:31:52 Tower2 kernel: ------------[ cut here ]------------

Feb 15 18:31:52 Tower2 kernel: WARNING: CPU: 5 PID: 11193 at fs/btrfs/tree-log.c:2951 btrfs_sync_log+0x7b3/0x9a1

Feb 15 18:31:52 Tower2 root: ERROR: error during balancing '/mnt/cache': Read-only file system

Feb 15 18:31:52 Tower2 kernel: BTRFS: Transaction aborted (error -5)

Feb 15 18:31:52 Tower2 kernel: Modules linked in: md_mod x86_pkg_temp_thermal coretemp kvm_intel kvm mxm_wmi ahci e1000e ptp i2c_i801 i2c_smbus i2c_core pps_core libahci video wmi backlight [last unloaded: md_mod]

Feb 15 18:31:52 Tower2 root: There may be more info in syslog - try dmesg | tail

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81312704>] btrfs_relocate_chunk.isra.16+0x43/0xb8

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81314311>] btrfs_balance+0xf3d/0xfd5

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8113b4ae>] ? mntput+0x28/0x2a

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8131bc64>] btrfs_ioctl_balance+0x24a/0x2c8

Feb 15 18:31:52 Tower2 kernel: [<ffffffff813212a5>] btrfs_ioctl+0x12e3/0x1f13

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8111d96c>] ? mem_cgroup_commit_charge+0x9f/0xc0

Feb 15 18:31:52 Tower2 kernel: [<ffffffff810d2533>] ? lru_cache_add_active_or_unevictable+0x31/0x9d

Feb 15 18:31:52 Tower2 kernel: [<ffffffff810edcc3>] ? handle_mm_fault+0x65d/0xf96

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8112fe66>] vfs_ioctl+0x13/0x2f

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81130396>] do_vfs_ioctl+0x49c/0x50a

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8104231e>] ? __do_page_fault+0x350/0x3ed

Feb 15 18:31:52 Tower2 kernel: [<ffffffff81130442>] SyS_ioctl+0x3e/0x5c

Feb 15 18:31:52 Tower2 kernel: [<ffffffff8167d1b7>] entry_SYSCALL_64_fastpath+0x1a/0xa9

Feb 15 18:31:52 Tower2 kernel: CPU: 5 PID: 11193 Comm: dockerd Not tainted 4.9.8-unRAID #1

Feb 15 18:31:52 Tower2 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z170 OC Formula, BIOS P2.10 01/25/2016

Feb 15 18:31:52 Tower2 kernel: ffffc900087ebc98 ffffffff813a34fa ffffc900087ebce8 ffffffff8196c9be

Feb 15 18:31:52 Tower2 kernel: ---[ end trace 5af4771217443ba2 ]---
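The "errs: wr N, rd N, ..." lines above are per-device error counters that btrfs prints each time one changes, so the last line for a device shows how many errors have been counted so far. As a rough sketch (the function name `latest_btrfs_errors` is made up for illustration, and the regular expression assumes the exact message format shown in this log), the latest counters per device could be pulled out like this:

```python
import re

# Assumes the message format seen in this thread:
# "BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0"
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device \S+\): bdev (\S+) errs: "
    r"wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)
FIELDS = ("wr", "rd", "flush", "corrupt", "gen")

def latest_btrfs_errors(lines):
    """Return {block_device: {counter_name: value}} using the last line seen per device."""
    latest = {}
    for line in lines:
        match = BTRFS_ERRS.search(line)
        if match:
            values = [int(v) for v in match.groups()[1:]]
            latest[match.group(1)] = dict(zip(FIELDS, values))
    return latest

sample = [
    "Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0",
    "Feb 15 18:31:52 Tower2 kernel: blk_update_request: I/O error, dev sdh, sector 50368",
    "Feb 15 18:31:52 Tower2 kernel: BTRFS error (device sdb1): bdev /dev/sdh1 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0",
]
errors = latest_btrfs_errors(sample)
print(errors)
```

In this log only /dev/sdh1 reports write errors, which fits the picture of one pool member dropping offline rather than the filesystem failing on its own.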

 

Your libvirt image is corrupt, which may explain the VM issues:

 

Feb 15 18:31:52 Tower2 root: truncate: cannot open '/mnt/cache/system/libvirt/libvirt.img' for writing: Read-only file system

 

The Docker image is also corrupt:

 

Feb 15 18:31:52 Tower2 root: truncate: cannot open '/mnt/cache/docker.img' for writing: Read-only file system

 

 

You'll need to solve the SSD errors first or the other problems will reappear. In the meantime, disable both the Docker and VM services.
