WEHA
Posts posted by WEHA
-
2 hours ago, Kilrah said:
So you're now copying parity data onto that 6TB drive?
Looks like it's an SMR drive so... you're just going to have to be very patient.
You might be able to gain some time by pausing the operation, waiting an hour or so, then resuming, and repeating every time it falls to negligible speeds.
I mean, OK, it's an SMR drive, but this bad? Is this not a sequential write?
It was pretty much like this from the beginning, but I can give that a try.
-
I had a bad data disk (it even got disabled), so I replaced it.
I only have a larger disk available, so Unraid wants to do a parity swap. I get that it can take a while, but I'm seeing 5-6 MB/s... for 3TB.
Can someone point out how to get the ball rolling faster? At this rate it will take more than 5 days.
Diag attached.
-
I have write errors on a cache drive and now my VMs & Dockers are not responding (only opnsense still is).
mount | grep cache
/dev/nvme2n1p1 on /mnt/cache type btrfs (ro,noatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/)
I'm on holiday so I don't want to stop the array to remove the disk.
Is there a way to tell unraid to stop using that disk?
I saw this, but it does not seem to be the right syntax for Unraid: echo 1 > /sys/block/nvme0n1/device/delete
Sep 14 07:23:20 Tower kernel: BTRFS error (device nvme2n1p1): error writing primary super block to device 2
Label: none  uuid: 987c4458-3b7c-4bbe-af87-c2f8bdde7c60
Total devices 2 FS bytes used 724.88GiB
devid 1 size 931.51GiB used 903.54GiB path /dev/nvme2n1p1
devid 2 size 0 used 0 path /dev/nvme0n1p1 MISSING
Side note: with errors like these, I would expect to at least see an error when looking at the array itself.
Would it be a good idea to do the following?
btrfs device remove /dev/nvme0n1p1 /mnt/cache
How can I remount rw?
Thanks!
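For reference, a minimal shell sketch of what the remount and device removal asked about above might look like, assuming standard btrfs-progs behavior and the device names from this post; on Unraid, pool changes are normally done from the GUI with the array stopped, so treat this as untested:

```shell
# The filesystem flipped to read-only after the write errors; try to
# remount it read-write (btrfs may refuse until the error state clears).
mount -o remount,rw /mnt/cache

# Confirm which member btrfs considers failed/missing.
btrfs filesystem show /mnt/cache

# Remove the failed member from the mounted pool. Use the keyword
# "missing" instead if the device is no longer visible at all.
btrfs device remove /dev/nvme0n1p1 /mnt/cache
```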
-
7 minutes ago, JorgeB said:
The trial is to be able to see and send the new GUID to support, you cannot use a trial with an existing config.
You should be able to start a trial on an unknown USB stick to get started with your backup config.
Now you are just hijacking the systems of people who already have the headache of a non-working server.
EDIT: for like a (few) day(s) or something, linked to an account so you can track abuse.
-
1 minute ago, JorgeB said:
You should contact support.
Yes, helpful, thank you... as I already mentioned, I have already requested it.
There seems to be a flaw in the documentation: it says you need to request a trial, but that's not possible when you have a backup of your configuration.
-
Starting a trial is also not a solution, as the configuration backup refers to an activated license, so the trial is invalid...
I would like to use my purchased product, please?!
-
Another stick bites the dust; not sure why, as these are practically unused sticks...
New stick installed, no license key obviously, and the very convenient 1-year block on requesting a new license is very helpful.
I requested it via e-mail, but who knows how long that is going to take.
I click Free Trial but there is no procedure to get the trial license, only how to install a stick... yes, thank you, I already did that.
I'm assuming this should be available on the machine itself, but it does not have internet because the firewall is a VM on the same machine...
Either way, I only get "fix error", which just opens the messages with the options "purchase key" and "redeem activation code".
So... now what?
-
3 minutes ago, JorgeB said:
Not because it's COW; all my btrfs data is COW. But if it's just the docker image, there might be another reason for the corruption, and it may not be hardware.
Very well, thanks for your input.
-
6 hours ago, JorgeB said:
The vdisk might be corrupt, no way of knowing.
Well yes, not via btrfs, but I have no issues with the VM: no errors in the event log and full backups are working.
That's why I believe the vdisk is fine.
It's just weird to me that only the docker image is affected, and it was on a COW share.
But if you're confident that there is no issue with this scenario, then OK.
-
4 hours ago, JorgeB said:
Enabling NOCOW turns off data checksum, so those can't be checked or fixed.
I meant COW by "enabling".
So the system share had COW and the vdisk had NOCOW,
but the docker image was corrupt and the vdisk image was not.
-
1 hour ago, JorgeB said:
That shouldn't be a problem.
It's strange that it's only the docker file and not the VM file... could it be related to NOCOW / COW?
I enabled this for the system share (and thus the docker image); the vdisk has NOCOW.
Thank you for assisting
-
On 11/24/2020 at 3:42 PM, JorgeB said:
The only option I see is doing it the manual way, i.e., copy/move all the data somewhere else; any files that can't be copied due to an I/O error are corrupt. Note that some of the corruption might not be in files but in metadata, but it should still show that info in the log.
I moved everything off; 2 files remained: 1 vdisk file and the docker img.
The docker image could not be moved due to an I/O error, so I removed it and recreated it on another pool.
I reran the scrub and now no errors are detected.
Is this related to the docker image being set as xfs on a btrfs pool?
I set it to xfs to be sure the bug that causes excessive disk I/O is gone.
SMART does not show any errors on the disk, so can I be sure this was software corruption and not caused by a hardware (HDD) defect?
-
Same story, though I do see "callbacks suppressed":
[203355.213783] BTRFS error (device sde1): unable to fixup (regular) error at logical 1342354677760 on dev /dev/sde1
[203436.360164] scrub_handle_errored_block: 8 callbacks suppressed
[203436.360209] btrfs_dev_stat_print_on_error: 8 callbacks suppressed
[203436.360212] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 93, gen 0
[203436.360214] scrub_handle_errored_block: 8 callbacks suppressed
[203436.360215] BTRFS error (device sde1): unable to fixup (regular) error at logical 1348826648576 on dev /dev/sde1
[203439.353192] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 94, gen 0
[203439.353195] BTRFS error (device sde1): unable to fixup (regular) error at logical 1349298642944 on dev /dev/sde1
[203440.426170] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 95, gen 0
[203440.426174] BTRFS error (device sde1): unable to fixup (regular) error at logical 1349556105216 on dev /dev/sde1
[203441.204687] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 96, gen 0
[203441.204690] BTRFS error (device sde1): unable to fixup (regular) error at logical 1349681184768 on dev /dev/sde1
-
-
It's copied from the syslog file in nano, so I would think that is the full syslog?
There are warnings from before the scrub though:
root@Tower:/var/log# cat syslog | grep "BTRFS warning"
Nov 23 03:59:25 Tower kernel: BTRFS warning (device sde1): csum failed root 5 ino 182291 off 1765621760 csum 0xd488241c expected csum 0xdbe78a4e mirror 1
Nov 23 03:59:25 Tower kernel: BTRFS warning (device sde1): csum failed root 5 ino 182291 off 1765621760 csum 0xd488241c expected csum 0xdbe78a4e mirror 1
Nov 23 20:40:23 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 281 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
Nov 23 20:40:23 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 281 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
Nov 24 04:03:17 Tower kernel: BTRFS warning (device sde1): csum failed root 5 ino 182291 off 4379881472 csum 0x1616fb61 expected csum 0xcbd3dbb1 mirror 2
Nov 24 09:19:19 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 282 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
Nov 24 09:19:19 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 282 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
Nov 24 09:22:05 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 283 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
Nov 24 09:22:06 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 283 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
-
Syslog does not show files:
Nov 24 13:01:51 Tower kernel: BTRFS info (device sde1): scrub: started on devid 1
Nov 24 13:01:51 Tower kernel: BTRFS info (device sde1): scrub: started on devid 2
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413978710016 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413913239552 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413913341952 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413913444352 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413915201536 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413915303936 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413915406336 on dev /dev/sdk1
Nov 24 13:03:22 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413916004352 on dev /dev/sdk1
Nov 24 13:03:23 Tower kernel: BTRFS error (device sde1): fixed up error at logical 1413978824704 on dev /dev/sdk1
Nov 24 13:03:23 Tower kernel: BTRFS error (device sde1): unable to fixup (regular) error at logical 1413979930624 on dev /dev/sdk1
-
14 minutes ago, JorgeB said:
Yes, that means there's corrupt data and the balance will abort. You can run a scrub to find out the corrupt file(s), then delete them or restore from backup; it's also a good idea to run memtest.
*sigh* ... how do I get a list of files?
I'm running scrub and this is the status already:
Error summary: csum=35
Corrected: 4
Uncorrectable: 31
Unverified: 0
These are software errors, correct?
SMART does not indicate a problem, and this is also a new disk.
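One way to answer the "list of files" question above, sketched here as an assumption based on standard btrfs-progs (the mount point /mnt/cache and the kernel log format are taken from this thread):

```shell
# Collect the logical byte addresses of uncorrectable blocks from the kernel log.
dmesg | grep -o 'error at logical [0-9]*' | awk '{print $4}' | sort -u > /tmp/bad_logicals

# Resolve each logical address back to the owning file path(s).
while read -r logical; do
    btrfs inspect-internal logical-resolve "$logical" /mnt/cache
done < /tmp/bad_logicals
```

Addresses that belong to metadata rather than file data will not resolve to a path.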
-
Attached
It seems like this is the culprit?
Nov 24 09:22:05 Tower kernel: BTRFS warning (device sde1): csum failed root -9 ino 283 off 951992320 csum 0x47d58bec expected csum 0x56997f79 mirror 1
-
25 minutes ago, JorgeB said:
It doesn't, unless something goes wrong.
I tried converting twice; it remains in the same state as posted earlier.
It starts, and after about 30 seconds or so it goes back to "no balance".
-
On 11/22/2020 at 10:01 AM, JorgeB said:
Due to a current btrfs bug you need to run the balance to single (or any other profile) twice.
Could you confirm for me that converting from single to RAID 1 does not lose data? (It's not stated in the FAQ nor in the Unraid GUI.)
I just expanded a cache pool from 1 to 2 disks and Unraid made it single. (I believe this is the default, according to the FAQ.)
So this is the current state (2 data profiles; related to the btrfs bug?):
Data, RAID1: total=42.00GiB, used=24.68GiB
Data, single: total=1.18TiB, used=1.16TiB
System, DUP: total=8.00MiB, used=176.00KiB
Metadata, DUP: total=2.00GiB, used=1.69GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
I have enough space available, so nothing will happen to my data, right?
What would happen if there was not enough space?
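For context, the conversion itself is a balance with convert filters; this is a sketch assuming the pool mount point is /mnt/cache. Convert allocates new RAID1 chunks and copies extents into them before freeing the old single chunks, so with enough free space the existing data is preserved; without enough space the balance simply aborts with ENOSPC rather than touching the data.

```shell
# Convert data and metadata to RAID1 (per the btrfs bug noted in this
# thread, the balance may need to be run twice to fully converge).
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache

# Check progress / verify that only RAID1 profiles remain for data.
btrfs balance status /mnt/cache
btrfs filesystem df /mnt/cache
```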
-
1 hour ago, JorgeB said:
Due to a current btrfs bug you need to run the balance to single (or any other profile) twice.
That works, thanks!
-
I'm trying to create a JBOD cache pool in 6.9beta35.
I don't know if this is a bug or I'm just doing it wrong so...
From what I understand from the post below, I have to set it to single mode.
When I do "convert to single mode" (it's a 14TB and an 8TB disk), the GUI says it's 16TB.
I also see the same write speeds on both disks, giving the impression it's RAID 1.
Balance status:
Data, RAID1: total=4.00GiB, used=2.97GiB
Data, single: total=1.00GiB, used=0.00B
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=1.00GiB, used=3.94MiB
GlobalReserve, single: total=3.78MiB, used=16.00KiB
If I execute "perform full balance", it just reverts to RAID 1 status.
Can anyone tell me what I'm doing wrong, or should I post this as a bug in the beta?
Maybe I have to jump through a few hoops, like removing one disk -> single mode -> add disk?
thanks!
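For comparison, this is roughly the manual equivalent of the GUI's "convert to single mode" (an assumption; the mount point /mnt/cache is hypothetical here, and only the data profile is converted, leaving metadata redundant):

```shell
# Convert the data profile to single so the capacities add up (14TB + 8TB).
# Per the btrfs bug mentioned elsewhere in this thread, the convert may
# need to be run twice before "btrfs filesystem df" stops showing RAID1 data.
btrfs balance start -dconvert=single /mnt/cache

# Verify: Data should now report "single", and total size should be the sum.
btrfs filesystem usage /mnt/cache
```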
-
When making some changes it's sometimes necessary, or preferable, to start the Docker / VM manager without auto-start enabled.
That way you can start whichever docker or VM you want.
So what I'm asking is: add a 3rd option to the Enable Docker / Enable VMs dropdown, like "Yes, no auto-start".
-
On 11/3/2020 at 1:33 PM, JorgeB said:
Fragmentation yes; increased write amplification not so good, since it can reduce the SSD's life.
So I read about the "bug" that causes many writes to SSDs, especially EVOs...
Mine are rated for 1200TBW and are at around 1500TBW now (in 2 years' time).
In the new beta there is a solution, but there are also issues.
My thought is: can I upgrade to the new beta, recreate the cache (on new drives) with the new partition layout, and revert back to 6.8.3 if the need arises?
Parity swap VERY slow (5MB/s)
in General Support
Posted
I guess this was one of those disks; I put in a non-SMR drive and it's going 120-150MB/s...
I knew SMR was no good, but THIS bad... wow