VM pauses under load



Hi, 

I have a big problem. My VMs pause under heavy load or during data transfers.

 

I read in another thread that this can happen when the cache drive is full. But my cache drives have 60 GB left.

I have 2 x 240 GB (a pool of two devices).

I checked the folder via terminal:

root@Tower:/mnt# du -sh cache/
151G	cache/
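
Side note: `du` adds up file sizes, which may not match what btrfs has actually allocated on the pool, so it could be worth cross-checking the free space at the filesystem level. A minimal sketch, assuming a POSIX `df` and `awk`; `free_gib` is a made-up helper name, not something Unraid ships.

```shell
# Print the free space, in whole GiB, of the filesystem holding a path,
# as reported by df. free_gib is a hypothetical helper name.
free_gib() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Usage:
#   free_gib /mnt/cache
# btrfs's own view of allocated vs. used space (needs btrfs-progs, run as root):
#   btrfs filesystem usage /mnt/cache
```

If the two views disagree a lot, the btrfs output is probably the one to trust for a multi-device pool.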

And this always shows up in the log when the VMs pause:

Sep 19 20:56:46 Tower kernel: loop: Write error at byte offset 456073216, length 4096.
Sep 19 20:56:46 Tower kernel: print_req_error: I/O error, dev loop2, sector 890752
Sep 19 20:56:46 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 31, rd 0, flush 0, corrupt 0, gen 0
Sep 19 20:56:46 Tower kernel: loop: Write error at byte offset 457043968, length 4096.
Sep 19 20:56:46 Tower kernel: print_req_error: I/O error, dev loop2, sector 892544
Sep 19 20:56:46 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 32, rd 0, flush 0, corrupt 0, gen 0
Sep 19 20:56:46 Tower kernel: loop: Write error at byte offset 457048064, length 4096.
Sep 19 20:56:46 Tower kernel: print_req_error: I/O error, dev loop2, sector 892672
Sep 19 20:56:46 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 33, rd 0, flush 0, corrupt 0, gen 0
Sep 19 20:56:46 Tower kernel: loop: Write error at byte offset 457404416, length 4096.
Sep 19 20:56:46 Tower kernel: print_req_error: I/O error, dev loop2, sector 893312
Sep 19 20:56:46 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 34, rd 0, flush 0, corrupt 0, gen 0
Sep 19 20:56:46 Tower kernel: loop: Write error at byte offset 467271680, length 4096.
Sep 19 20:56:46 Tower kernel: print_req_error: I/O error, dev loop2, sector 912640
Sep 19 20:56:46 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 35, rd 0, flush 0, corrupt 0, gen 0
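
The errors above name `loop2`, which is a loop device backed by an image file rather than the cache pool itself. A rough sketch for condensing these lines and finding the backing file; `summarize_btrfs_errs` is a made-up helper name and the log path in the usage comment is an assumption.

```shell
# Print the latest write-error counter per device from BTRFS "errs:" log
# lines read on stdin. summarize_btrfs_errs is a hypothetical helper name.
summarize_btrfs_errs() {
  grep -o 'bdev [^ ]* errs: wr [0-9]*' |
    awk '{ wr[$2] = $5 } END { for (d in wr) print d, wr[d] }'
}

# Usage (log path is an assumption; adjust to where your syslog lives):
#   summarize_btrfs_errs < /var/log/syslog
# To see which file backs the loop device named in the errors:
#   losetup /dev/loop2
```

Which image sits behind `loop2` on this server is not visible from the log; `losetup` should show it.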

 

Thanks for your help!

Edited by hans-peter123

Yes, thank you, that's it!

 

But now I have another problem. I changed a share from "cache only" to "cache yes" and ran the mover to free up space.

 

But now the log is being spammed with messages like this:

Quote

Sep 20 12:44:22 Tower move: move: file /mnt/cache/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Caches/com.plexapp.agents.lastfm/HTTP.system/a7/efc9c924b561a6fa249fdab2fa129778a75573_attributes
Sep 20 12:44:22 Tower move: move: create_parent: /mnt/cache/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Caches/com.plexapp.agents.lastfm/HTTP.system/a7 error: Read-only file system
Sep 20 12:44:22 Tower move: move: file /mnt/cache/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Caches/com.plexapp.agents.lastfm/HTTP.system/a7/ad4141c5f568f4b58f8d968b4ea22ff263349d.content
Sep 20 12:44:22 Tower move: move: create_parent: /mnt/cache/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Caches/com.plexapp.agents.lastfm/HTTP.system/a7 error: Read-only file system
Sep 20 12:44:22 Tower move: move: file /mnt/cache/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Caches/com.plexapp.agents.lastfm/HTTP.system/a7/ad4141c5f568f4b58f8d968b4ea22ff263349d_attributes

So I changed it back... Is the cache pool in read-only mode now?!

Quote

root@Tower:~# btrfs balance start -dusage=75 /mnt/cache

ERROR: error during balancing '/mnt/cache': Read-only file system

There may be more info in syslog - try dmesg | tail

Even after a reboot :(
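
To confirm the read-only state, the mount options can be checked directly. A small sketch, assuming the usual `/proc/mounts` layout (device, mount point, type, options); `mount_mode` is a made-up helper name.

```shell
# Print "ro" or "rw" for a mount point, given /proc/mounts-style lines on
# stdin. mount_mode is a hypothetical helper name.
mount_mode() {
  awk -v mp="$1" '$2 == mp {
    n = split($4, opts, ",")
    mode = "rw"
    for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
    print mode
  }'
}

# Usage:
#   mount_mode /mnt/cache < /proc/mounts
# The first BTRFS message after mounting often hints at why it went read-only:
#   dmesg | grep -i btrfs | head -n 20
```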

 

Also, the array doesn't start correctly. The WebGUI shows "Array starting . Starting services..." and Docker and the VMs aren't online. I can only access shares that aren't cached or cache related.

 

tower-diagnostics-20190920-1103.zip


After I turned off Docker and the VMs, the array now started completely. But I still can't access some files on the cache.

 

I wanted to move a vdisk image to the array, then these three log messages came up:

Quote

Sep 20 13:40:28 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 4096
Sep 20 13:40:28 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 4096
Sep 20 13:40:28 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 16384
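
One thing that stands out in the address above: it is just over 2^62, far beyond anything a 2 x 240 GB pool would normally use. The arithmetic below only shows that clearing bit 62 leaves a small, 4096-aligned value; reading that as a single flipped bit (for example from a memory fault) is a guess, not a diagnosis.

```shell
# Logical address copied from the log lines above.
addr=4611686019636789248

# Subtract 2^62 to see what remains once bit 62 is cleared.
echo $(( addr - (1 << 62) ))          # prints 1209401344

# The remainder is a multiple of 4096, the length in the first two messages.
echo $(( (addr - (1 << 62)) % 4096 )) # prints 0
```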

 

tower-diagnostics-20190920-1140.zip


I have one more question. I have a very important file (a VM image). When I try to move it, this error always shows up:

Quote

cp: vdisk1.img: failed to get extents info: Input/output error

And in the log:

Quote

Sep 20 13:52:34 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 4096
Sep 20 13:52:34 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 4096
Sep 20 13:52:34 Tower kernel: BTRFS critical (device sdg1): unable to find logical 4611686019636789248 length 16384

Is it possible to rescue the file? I thought I had redundancy!? Or does that not help with filesystem corruption?
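
One thing that might be worth trying is a block-by-block copy that keeps going past unreadable spots instead of aborting like `cp` does. A minimal sketch, assuming GNU `dd`; `salvage_copy` is a made-up helper name and both paths in the usage comment are placeholders. Blocks that can't be read are replaced with zeros, so the copied image may still be damaged, and the output is padded up to a multiple of the block size.

```shell
# Copy a file in 4096-byte blocks, continuing after read errors.
# Unreadable blocks are zero-filled (conv=noerror,sync).
# salvage_copy is a hypothetical helper name.
salvage_copy() {
  # $1 = source file, $2 = destination on a healthy disk
  dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Usage (both paths are placeholders, not taken from this server):
#   salvage_copy /mnt/cache/path/to/vdisk1.img /mnt/disk1/rescue/vdisk1.img
```

Whether this gets any further than `cp` depends on where the damage is; if it fails the same way, btrfs-progs also has a `btrfs restore` command meant for pulling files off a damaged, unmounted filesystem.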

