Unraid OS version 6.9.0-beta30 available

wickedathletes · October 8, 2020

7 minutes ago, limetech said:
Thank you. Please open terminal, type this command and post the output:
hexdump -C -n 512 /dev/sdh
(that is non-destructive, it's dumping the first 512 bytes of the device, aka, the MBR - Master Boot Record)

I assume in a "broken" state? Or on beta25, as it is right now?

limetech · October 8, 2020

On beta25 is fine. Just make sure the problematic device is still 'sdh' - if not, substitute that device id.

wickedathletes · October 8, 2020

55 minutes ago, limetech said:

On beta25 is fine. Just make sure the problematic device is still 'sdh' - if not, substitute that device id.

00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001c0 02 00 00 ff ff ff 01 00 00 00 ff ff ff ff 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200

limetech · October 8, 2020

1 hour ago, wickedathletes said:

00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001c0 02 00 00 ff ff ff 01 00 00 00 ff ff ff ff 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200

Thanks, yes that explains it. Probably, with -beta25 and even earlier releases, if you click on the device from Main and then look at 'Partition format' it will say 'GPT: 4KiB-aligned (factory erased)' - that (factory erased) shouldn't be there. Will fix for next release.
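For anyone following along, the first partition entry in that dump can be decoded by hand. Below is a minimal Python sketch, assuming the standard MBR layout (four 16-byte partition entries starting at offset 0x1BE). The bytes are transcribed from the hexdump above, and the name `parse_mbr_entry` is purely illustrative. It only decodes the fields; it does not reproduce whatever check Unraid itself applies.

```python
import struct

# First partition entry (offsets 0x1BE-0x1CD), transcribed from the dump above.
# Offsets 0x1BE and 0x1BF lie in the all-zero region hexdump collapsed into '*'.
ENTRY = bytes.fromhex("00 00 02 00 00 ff ff ff 01 00 00 00 ff ff ff ff")


def parse_mbr_entry(raw: bytes) -> dict:
    """Decode one 16-byte MBR partition table entry (little-endian fields)."""
    if len(raw) != 16:
        raise ValueError("an MBR partition entry is exactly 16 bytes")
    # status, CHS of first sector, type, CHS of last sector, start LBA, sector count
    status, chs_first, ptype, chs_last, lba_start, sectors = struct.unpack(
        "<B3sB3sII", raw
    )
    return {
        "status": status,
        "chs_first": chs_first.hex(),
        "type": ptype,
        "chs_last": chs_last.hex(),
        "lba_start": lba_start,
        "sectors": sectors,
    }


if __name__ == "__main__":
    for name, value in parse_mbr_entry(ENTRY).items():
        print(f"{name:10} {value}")
```

For this dump the entry decodes to partition type 0x00, a starting LBA of 1 and a sector count of 0xFFFFFFFF. In a conventional GPT protective MBR the type byte would be 0xEE, whereas 0x00 normally marks an unused entry.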

JorgeB · October 8, 2020

5 hours ago, limetech said:

Maybe the problem you were seeing had to do with caching where somehow it was still returning cached "correct" data instead of fetching your corrupted data.

Sorry, missed this part earlier. No, that wasn't the problem: the checksum error would always be logged in the syslog, and if I tried to copy the file locally I would always get an i/o error, but with SMB (and aio read enabled) it would still copy successfully despite the checksum error.

BTW, just did a test on beta30 with aio enabled, and it's behaving correctly, i.e., a file with just a single corrupt sector will fail to copy. Tried 20 times and always got this:

[image: image.png]

At the same time a checksum error is logged (previously, only this part happened):

[image: image.png]

So that issue appears to be fixed.

limetech · October 8, 2020

5 minutes ago, JorgeB said:

So that issue appears to be fixed.

That's good, but evidently not much performance improvement, and in fact decreases performance for some controllers?

JorgeB · October 8, 2020

3 minutes ago, limetech said:

in fact decreases performance for some controllers?

Possibly, maybe better to leave it disabled for now, I will since I don't see any performance advantage.

limetech · October 8, 2020

14 minutes ago, JorgeB said:

Possibly, maybe better to leave it disabled for now, I will since I don't see any performance advantage.

You think it's best to turn aio off for read but leave on for write by default? That's how it's been but I would add to smb.conf as:

aio read size = 0
aio write size = 1

trypowercycle · October 9, 2020

8 hours ago, limetech said:

Yes put user defined settings/overrides in config/smb-extra.conf

There were some changes made to Samba aio, I think starting with Samba v4.12. In reviewing this, I found that the 'man smb.conf' page changed (pasted below); you can see the Samba defaults are now to enable aio for both read and write. Further, the value '4096' doesn't matter: 0 just means disabled and non-zero means enabled. Interestingly, your card seems to perform better with read aio off.

aio max threads (G)

The integer parameter specifies the maximum number of threads each smbd process will create when doing parallel asynchronous IO calls. If the number of outstanding calls is greater than this number the requests will not be refused but go onto a queue and will be scheduled in turn as outstanding requests complete.

Related command: aio read size

Related command: aio write size

Default: aio max threads = 100

aio read size (S)

If this integer parameter is set to a non-zero value, Samba will read from files asynchronously when the request size is bigger than this value. Note that it happens only for non-chained and non-chaining reads and when not using write cache.

The only reasonable values for this parameter are 0 (no async I/O) and 1 (always do async I/O).

Related command: write cache size

Related command: aio write size

Default: aio read size = 1

Example: aio read size = 0 # Always do reads synchronously

aio write behind (S)

If Samba has been built with asynchronous I/O support, Samba will not wait until write requests are finished before returning the result to the client for files listed in this parameter. Instead, Samba will immediately return that the write request has been finished successfully, no matter if the operation will succeed or not. This might speed up clients without aio support, but is really dangerous, because data could be lost and files could be damaged.

The syntax is identical to the veto files parameter.

Default: aio write behind =

Example: aio write behind = /*.tmp/

aio write size (S)

If this integer parameter is set to a non-zero value, Samba will write to files asynchronously when the request size is bigger than this value. Note that it happens only for non-chained and non-chaining reads and when not using write cache.

The only reasonable values for this parameter are 0 (no async I/O) and 1 (always do async I/O).

Compared to aio read size this parameter has a smaller effect, most writes should end up in the file system cache. Writes that require space allocation might benefit most from going asynchronous.

Related command: write cache size

Related command: aio read size

Default: aio write size = 1

Example: aio write size = 0 # Always do writes synchronously

Thanks for the writeup and sending that over.

For what it's worth I took the line back out for the aio write size to let it default to on. I tested with aio write off and on and it didn't seem to make a difference.

I put aio read into the extras file as you suggested and it is working as expected, so all is good with that.
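To make the "0 means disabled, non-zero means enabled" rule concrete, here is a rough Python sketch that reads smb.conf-style text and reports whether async reads and writes would be on. The defaults are taken from the man page excerpt quoted above. The helper name `effective_aio` is made up for illustration, and the parsing is deliberately simplified: it ignores which section a setting appears in and is not how Samba itself loads its configuration.

```python
# Defaults as listed in the man page excerpt quoted above.
DEFAULTS = {"aio read size": 1, "aio write size": 1}


def effective_aio(conf_text: str) -> dict:
    """Report whether async read/write would be enabled for the given text."""
    values = dict(DEFAULTS)
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines and section headers such as [global].
        if not line or line[0] in "#;[":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Parameter names are matched case- and whitespace-insensitively.
        key = " ".join(key.lower().split())
        tokens = value.split()
        if key in values and tokens:
            values[key] = int(tokens[0])
    return {
        "read": values["aio read size"] != 0,
        "write": values["aio write size"] != 0,
    }


if __name__ == "__main__":
    # Only the read override is present; write falls back to its default.
    print(effective_aio("aio read size = 0\n"))
```

With only `aio read size = 0` in the extras file, as described above, reads are synchronous while writes stay on the asynchronous default.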

I guess if other people have the same controller they will just have to do the same. I'd assume it is a fairly common controller. Perhaps the fix could be documented somewhere so others can find it. It is strange that this particular controller performs better with it off though...

Edited October 9, 2020 by trypowercycle

JorgeB · October 9, 2020

7 hours ago, limetech said:

You think it's best to turn aio off for read but leave on for write by default?

I honestly don't know, never did much testing on that. Time permitting, I'll try doing some over the weekend, with different workloads and a couple of controllers, to see if there's any clear difference one way or the other.

wickedathletes · October 9, 2020

15 hours ago, limetech said:

Thanks, yes that explains it. Probably, with -beta25 and even earlier releases, if you click on the device from Main and then look at 'Partition format' it will say 'GPT: 4KiB-aligned (factory erased)' - that (factory erased) shouldn't be there. Will fix for next release.

thank you and as always, you guys are the best!

NNate · October 9, 2020

I installed Beta 30 last night (coming from the stable). I'm seeing my log file full of avahi errors (I get them every minute). It makes it very difficult to find meaningful information when this buries everything.

Oct 9 08:52:01 Server avahi-daemon[11275]: Record [Ricoh\032Color\032Laser\032Printer\032\064\032Server._ipp._tcp.local#011IN#011TXT "txtvers=1" "qtotal=1" "rp=printers/RicohSPC250DN" "ty=RICOH SP C250DN PS" "adminurl=https://Server.local:631/printers/RicohSPC250DN" "note=Office" "priori
Oct 9 08:52:01 Server avahi-daemon[11275]: Record [AirPrint\032RicohSPC250DN\032\064\032Server._ipp._tcp.local#011IN#011TXT "txtvers=1" "qtotal=1" "Transparent=T" "URF=none" "rp=printers/RicohSPC250DN" "note=Ricoh Color Laser Printer" "product=(GPL Ghostscript)" "printer-state=3" "printer-type=0
Oct 9 08:52:01 Server avahi-daemon[11275]: Record [_printer._tcp.local#011IN#011PTR Ricoh\032Color\032Laser\032Printer\032\064\032Server._printer._tcp.local ; ttl=4500] not fitting in legacy unicast packet, dropping.
Oct 9 08:52:01 Server avahi-daemon[11275]: Record [Ricoh\032Color\032Laser\032Printer\032\064\032Server._printer._tcp.local#011IN#011TXT ; ttl=4500] not fitting in legacy unicast packet, dropping.
Oct 9 08:52:01 Server avahi-daemon[11275]: Record [Ricoh\032Color\032Laser\032Printer\032\064\032Server._printer._tcp.local#011IN#011SRV 0 0 0 Server.local ; ttl=120] not fitting in legacy unicast packet, dropping.

I'm running a CUPS Docker that has AirPrint, but I never saw these messages until 6.9 Beta. I'd love to hide them or get rid of them so I can better find meaningful info in the logs.
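Until the messages themselves are dealt with, one stopgap is to filter them out when reading the log. The Python sketch below drops the avahi "Record [...]" lines shown above and keeps everything else. The regular expression is based only on the sample lines in this post, and the function name is illustrative. This hides the lines while viewing; it does not stop avahi from logging them.

```python
import re

# Matches the avahi-daemon "Record [...]" lines shown in the sample above,
# including the ones that were cut off before the trailing message text.
AVAHI_RECORD = re.compile(r"avahi-daemon\[\d+\]: Record \[")


def drop_avahi_records(lines):
    """Return the log lines that are not avahi 'Record [' messages."""
    return [line for line in lines if not AVAHI_RECORD.search(line)]


if __name__ == "__main__":
    sample = [
        "Oct 9 08:52:01 Server avahi-daemon[11275]: Record [_printer._tcp.local#011IN#011PTR x ; ttl=4500] not fitting in legacy unicast packet, dropping.",
        "Oct 9 08:52:05 Server kernel: example line that should be kept",
    ]
    for line in drop_avahi_records(sample):
        print(line)
```

In practice the input would be the lines of the syslog file (its location on a given server is an assumption to check), for example `drop_avahi_records(open(path))`.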

JorgeB · October 9, 2020

19 hours ago, JorgeB said:

So that issue appears to be fixed.

Tested also at work with different computers and it does appear to be fixed, but it was recent, v6.9-beta1 still has the issue, you get an i/o error about once every 10 times but just retry and it happily continues to copy the corrupt file.

Also, have some doubts if this was really fixed or is just pure luck that it works correctly currently, I can't find any reference to this bug, when I google the only result I get is my original report here in the forum, but maybe I'm using the wrong keywords, not even sure this was a btrfs issue or Samba, I would assume Samba.

limetech · October 9, 2020

41 minutes ago, JorgeB said:

really fixed or is just pure luck that it works correctly currently,

You have just described how almost all software functions 🤣

limetech · October 9, 2020

7 hours ago, wickedathletes said:

thank you and as always, you guys are the best!

The issue you are seeing with -beta30 is being caused by that particular hard drive having a "pre-cleared" signature in the MBR, yet, I'm assuming that when you boot -beta25 (or earlier), it mounts ok and there are files on that disk - is that correct?

If so, this should be an "impossible" state to get into - I'm wondering if you can recall and give me a little background info on that particular drive. That is, did you at one time in the past run 'pre-clear' on that drive?

jdiggity81 · October 9, 2020

I take it that means with the work around you built in that the lsi 9200-16e will work until they make the corrections?

wickedathletes · October 10, 2020

6 hours ago, limetech said:

The issue you are seeing with -beta30 is being caused by that particular hard drive having a "pre-cleared" signature in the MBR, yet, I'm assuming that when you boot -beta25 (or earlier), it mounts ok and there are files on that disk - is that correct?

If so, this should be an "impossible" state to get into - I'm wondering if you can recall and give me a little background info on that particular drive. That is, did you at one time in the past run 'pre-clear' on that drive?

This drive is functioning 100% fine in beta25 and earlier. On beta29/beta30 it won't mount.

I honestly don't recall. I would say I have pre-cleared 90% of my drives, but there was a time when I didn't have the ability to pre-clear due to a drive failure and no extra bays to use (aka I was under the gun). It's possible this was the drive that was not pre-cleared, but I doubt it. The drive is roughly 3 years old.

Is there a way out of this state, or with this drive am I stuck on beta25 for eternity haha?

JorgeB · October 10, 2020

8 hours ago, jdiggity81 said:

I take it that means with the work around you built in that the lsi 9200-16e will work until they make the corrections?

Yes, they should work now.

itimpi · October 10, 2020

9 hours ago, jdiggity81 said:

I take it that means with the work around you built in that the lsi 9200-16e will work until they make the corrections?

Strange that the 'e' model needs this whereas the 'i' model (which I have) does not. You would have thought they would be identical except for the connector being external rather than internal.

SimonF · October 10, 2020

9 hours ago, jdiggity81 said:

I take it that means with the work around you built in that the lsi 9200-16e will work until they make the corrections?

Mine is working on beta 30.

07:00.0 "Serial Attached SCSI controller" "Broadcom / LSI" "SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor]" -r02 "Broadcom / LSI" "9201-16e 6Gb/s SAS/SATA PCIe x8 External HBA"

Interstellar · October 10, 2020

b25 to b30 here 24 hours ago, all fine.

Only issue is something changing the ownership of /etc from root:root to something else, causing ssh to fail. Must be a plugin...!
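A quick way to confirm that kind of ownership change is to look at the numeric owner and group of the directory. This is a small Python sketch with illustrative function names; it only reports what it finds and changes nothing.

```python
import os


def owner_ids(path: str) -> tuple:
    """Return (uid, gid) of the given path."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def is_root_owned(path: str) -> bool:
    """True when both owner and group are root (uid 0, gid 0)."""
    return owner_ids(path) == (0, 0)


if __name__ == "__main__":
    uid, gid = owner_ids("/etc")
    print(f"/etc is owned by uid={uid} gid={gid}; root:root={is_root_owned('/etc')}")
```

Running it before and after installing or updating a plugin would be one way to narrow down which one changes the ownership.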

limetech · October 10, 2020

15 hours ago, wickedathletes said:

This drive is functioning 100% fine in beta25 and earlier. On beta29/beta30 it won't mount.

I honestly don't recall. I would say I have pre-cleared 90% of my drives, but there was a time when I didn't have the ability to pre-clear due to a drive failure and no extra bays to use (aka I was under the gun). It's possible this was the drive that was not pre-cleared, but I doubt it. The drive is roughly 3 years old.

Is there a way out of this state, or with this drive am I stuck on beta25 for eternity haha?

Is it possible that this drive was pre-cleared and then used to rebuild a disabled disk?

No you won't be stuck on beta25 🤪

Dava2k7 · October 11, 2020

Hi Limetech

Don't suppose there are any updates on the VM problems I'm having? I uploaded my diagnostics file a few days back. I can see it's been downloaded, not sure by whom. Is there any chance the problem will be sorted soon? This would be greatly appreciated. I had lost my VM for 9 months prior to the 3 days it was working, and now I have no VM again. I got so excited about it that I made some purchases that I now can't use to their full ability, or at all, due to having no VM. Really need some closure on what's going on. Thank you.

Edited October 11, 2020 by Dava2k7

JorgeB · October 11, 2020

On 10/9/2020 at 12:06 AM, limetech said:

You think it's best to turn aio off for read but leave on for write by default?

Did some Samba aio enable/disable tests. No time to do them using many different controllers, but I wanted to at least use a couple, so I tested on an xfs formatted array with reconstruct write enabled, connected to an LSI 3008, and on a 3 SSD btrfs raid0 pool connected to the Intel SATA ports. I used robocopy to copy two folders to/from an NVMe device in my Win10 desktop: first a folder with 6 large files totaling around 26GB, then one with 25k small to medium files totaling 25GB. I tried to remove RAM cache from the equation as much as possible.

I only ran each test once, an average of 3 runs would be more accurate but didn’t have the time, these are the speeds reported by robocopy after the transfers were done:

[image: image.png]

I was only going to be using user shares for testing but because of the very low write speed for small files I decided to repeat each test using disk shares:

[image: image.png]

Not what I was testing here, but still interesting results: shfs has a very large overhead with small files, especially for writes. That's not something I usually do in my normal usage, but perhaps it's one of the reasons people with Time Machine backups are seeing very low speeds? I believe those use lots of small files.

As for Samba aio, I don't see any advantage in having aio enabled; if anything it appears to be generally a little slower. Add to that the fact that it apparently performs much worse for some, and that I still don't trust that the btrfs issue is really fixed and won't come back in future releases, and I would leave it disabled by default. Of course different hardware/workloads can return different results, but if anyone wants to enable it, it's really easy using the Samba extra options.

wickedathletes · October 11, 2020

23 hours ago, limetech said:

Is it possible that this drive was pre-cleared and then used to rebuild a disabled disk?

No you won't be stuck on beta25 🤪

that is definitely possible, but asking my brain to think back 3 years is a big task hahaha. It would have been one of 2 things, a rebuild of a disabled disk or a replacement of a 4TB disk as 3 years ago my server only had room for 8 drives.
