ThanasisPolitis Posted August 31, 2023 My server is not responding on any service (SMB or HTTPS). The dashboard says the CPU is at 100% on all cores, but htop shows a different picture. Any ideas?
JorgeB Posted August 31, 2023 Dashboard graph includes i/o wait.
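To see how much of that "CPU" load is really I/O wait, one option is to read the kernel counters directly. This is only a minimal sketch: the function name `iowait_percent` is made up for illustration, and it reports the average since boot rather than the current moment, so treat the number as a rough indicator.

```shell
# Print the share of CPU time spent in iowait since boot, as a percentage.
# Takes an optional path to a stat file (defaults to /proc/stat).
iowait_percent() {
    stat_file="${1:-/proc/stat}"
    awk '$1 == "cpu" {
        # Aggregate line layout: cpu user nice system idle iowait irq softirq steal ...
        total = 0
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        # Field 6 is iowait
        if (total > 0) printf "%.1f\n", 100 * $6 / total
    }' "$stat_file"
}
# Typical use:
#   iowait_percent
```

In htop, the same time shows up only if "Detailed CPU time" is enabled in the display options, which may be why the two views disagree.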
ThanasisPolitis Posted August 31, 2023 2 minutes ago, JorgeB said: Dashboard graph includes i/o wait.
Thank you for the reply! It seems that my cache drives have SMART errors, and I can see that in the logs too:
I/O error, dev loop2, sector 27962672 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
BTRFS error (device loop2: state EA): bdev /dev/loop2 errs: wr 20, rd 1, flush 1, corrupt 1, gen 0
JorgeB Posted August 31, 2023 You should post the diagnostics.
ThanasisPolitis Posted August 31, 2023 Diagnostics attached: echidna-diagnostics-20230831-1030.zip
JorgeB Posted August 31, 2023 There are ATA errors on both cache devices, and one of them had previous write errors. Run a correcting scrub and check/replace the cables to rule that out. It's strange that both are having errors at the same time, so see if they share something like a power splitter.
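For anyone following along from the command line, the wr/rd/flush/corrupt/gen counters in the BTRFS log line quoted earlier are the ones `btrfs device stats` reports per device. The sketch below only filters that output down to the counters that are not zero; the function name `nonzero_btrfs_stats` and the mount point `/mnt/cache` are assumptions for illustration, so adjust them to your pool.

```shell
# Read `btrfs device stats` output on stdin and print only the counters
# whose value is not zero. Each input line looks like:
#   [/dev/sdb1].write_io_errs    20
nonzero_btrfs_stats() {
    awk 'NF >= 2 && $2 != 0 { print $1, $2 }'
}
# Typical use (the mount point /mnt/cache is an assumption):
#   btrfs device stats /mnt/cache | nonzero_btrfs_stats
#   btrfs scrub start -B /mnt/cache   # -B stays in the foreground until the scrub finishes
```

A scrub started without `-r` (read-only) is a correcting one: on a mirror it can rewrite a bad copy from the good one. The counters are cumulative, so they stay nonzero after a clean scrub until they are reset.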
ThanasisPolitis Posted August 31, 2023 (edited) Thank you for the help! Scrub completed, no errors found:
UUID: 5815d461-6c5e-4322-852c-c860344b7dd0
Scrub started: Thu Aug 31 13:41:51 2023
Status: finished
Duration: 0:16:43
Total to scrub: 273.53GiB
Rate: 278.98MiB/s
Error summary: no errors found
Both cache drives were bought brand new in March 2023. SMART errors started to show up a week or so ago:
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
197 Current_Pending_Sector  -O--CK   100   100   050    -    8
Edit: Drive power comes directly from the PSU (no splitters). The SATA data cables look visually OK and are fully plugged in on both the motherboard and the drives.
JorgeB Posted August 31, 2023 Run an extended SMART test on cache1, the one with pending sectors.
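The extended test can also be started and checked with `smartctl` from a terminal. The sketch below pulls the raw pending-sector count out of an attribute table like the one posted above; the function name `pending_sectors` is hypothetical and `/dev/sdX` is a placeholder for the cache device, not a real path.

```shell
# Read a smartctl attribute table on stdin and print the raw value of
# attribute 197 (Current_Pending_Sector). The raw value is assumed to be
# the last column, as in the table posted earlier in the thread.
pending_sectors() {
    awk '$1 == 197 { print $NF }'
}
# Typical use (/dev/sdX is a placeholder):
#   smartctl -t long /dev/sdX               # start the extended self-test
#   smartctl -A /dev/sdX | pending_sectors  # current pending-sector count
#   smartctl -l selftest /dev/sdX           # read the result once the test is done
```

A count that keeps growing between runs is more worrying than a small, stable one.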
ThanasisPolitis Posted August 31, 2023 (edited) Both of the cache SSDs are 512GB SPCC Solid State Disks. I have run a command to check what's waiting for disk I/O:
watch -n 1 "(ps aux | awk '\$8 ~ /D/ { print \$0 }')"
ThanasisPolitis Posted August 31, 2023 5 hours ago, JorgeB said: Run an extended SMART test on cache1, the one with pending sectors.
Extended SMART test report attached for the Cache 1 drive: SPCC_Solid_State_Disk_AA230130S3051210586-20230831-2110.txt
ThanasisPolitis Posted August 31, 2023 5 hours ago, JorgeB said: Run an extended SMART test on cache1, the one with pending sectors.
Extended SMART test report attached for the Cache 2 drive: SPCC_Solid_State_Disk_AA230130S3051213830-20230831-2133.txt
JorgeB Posted August 31, 2023 SMART test passed, so it's OK for now. Try rebooting to see if it helps with the wait.
Solution ThanasisPolitis Posted September 4, 2023 Sorry for the late reply, been quite busy... I rebooted the server, still no go; the cache seems to be bottlenecking the CPU. I decided to convert the cache from a BTRFS mirror to a ZFS mirror, and the performance difference is astonishing! CPU is averaging 3% with 32 Docker containers running! Before, Plex players on the local network would freeze at least 5 times when watching a movie (in direct stream, not transcoded), and now it is smooth as butter! I will keep an eye on it, but it seems that BTRFS was putting high stress on IOPS and that was bottlenecking the CPU... Thank you very much for all your help and support with this!
deblex Posted November 9, 2023 On 9/4/2023 at 10:29 PM, ThanasisPolitis said: Decided to convert the cache from BTRFS mirror to ZFS mirror and the performance difference is astonishing! [...]
I've been having the same issue lately and was wondering how to convert my btrfs cache drive to the zfs file system to see if it solves my issue. Would you have any guide on how to do it, and is there any impact on Docker containers / VMs? Thank you for your help.
ThanasisPolitis Posted November 9, 2023 2 minutes ago, deblex said: Been having the same issue lately and was wondering how to convert my btrfs cache drive to zfs file system [...]
Yes, my friend, this is what I had to do to get over that issue. It seems that it did work (partially), but in the end I got some second-hand data-centre-grade Intel SSDs, which now host my Docker, VM and System shares, and I keep the cache just for... well, cache. You can convert the current cache from BTRFS to ZFS without losing anything; however, you have to move the files to the array first and then back to the cache once it is on ZFS. This guide helped me massively: https://www.youtube.com/watch?v=vXF8au5o9Tw
Regards, Thanasis
DanielPT Posted November 9, 2023 So you can format a BTRFS cache to ZFS even though the array is BTRFS? I use my cache drives for my appdata and downloads, but I'm having high iowait. I use the mover to move from cache to array, so can you mix the two? Thanks!
JorgeB Posted November 9, 2023 5 minutes ago, DanielPT said: So you could format a BTRFS to ZFS even though the array is BTRFS?
You can.
DanielPT Posted November 9, 2023 5 hours ago, JorgeB said: You can.
Thanks. Do you think it would help me with my high iowait?
JorgeB Posted November 9, 2023 It does for some users; I can't say if it's going to help you.