Making a disc "off limits"



Hey there,

 

I wasn't 100% sure where to put this as I feel it has to be a plugin. 

 

I've had an Unraid server running for quite a while and everything is fine, except for one disk that seems to be very stubborn. SMART is fine, I took it back to the store where they said it's fine, and I even sent it to the factory, where it was returned as "fine".

 

Yet every once in a while it just decides to become unavailable, and it only comes back after a few resets (and then I have to rebuild the data, as it became "emulated"). Now I can think of a million reasons why it does this, but that is not why I'm here. Is there a way/plugin to just make a disk off limits? I don't want any write or read activity on it (unless specifically asked for) and just want it to go to standby.

 

Though the caveat is: I don't want to remove it from the array (or, more importantly, I want it to stay in the parity calculations).

 

Is there any way to achieve this? I know it's not a real solution, but at least I could slowly debug without worrying too much that it suddenly dies permanently (OK, in that case I do have the 2 parity disks).

Link to comment
2 minutes ago, trurl said:

If the disk is fine then you should be able to fix whatever is causing it to become disabled.

 

Or, why not remove it? If it is disabled and you only have single parity then you have no parity protection for the rest of the array.

 

 

I guess I miswrote something: the disk itself isn't a parity, I just don't want to lose data if it really dies someday. And it's too big to "just" replace. 

 

As for why it happens, I'm not sure yet, but I just want to avoid using it until I find out.

Just now, trurl said:

Next time it happens, go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.

Good idea, will do. 

Link to comment
Just now, djmulder said:

I guess I miswrote something: the disk itself isn't a parity

I don't think there was any misunderstanding, at least on my part, because I never thought it was parity.

 

Maybe the misunderstanding is yours.

 

Parity by itself doesn't protect anything. Parity requires all the other disks in order to recover the data for a failed, missing, or otherwise disabled disk.

 

So, if that disk is already disabled, then parity can't help you if one of your other disks fails.
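
To make the point about parity concrete, here is a minimal sketch of single-parity recovery, assuming plain XOR parity over equal-sized blocks. The tiny "disks" and the helper names `xor_blocks` and `rebuild_missing` are hypothetical and only for illustration; this is not Unraid's actual code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(parity, surviving_disks):
    """Rebuild the one missing data disk from parity plus ALL surviving data disks."""
    return xor_blocks([parity, *surviving_disks])

# Hypothetical 4-byte "disks"
disk1 = b"\x01\x02\x03\x04"
disk2 = b"\x10\x20\x30\x40"   # the disk that keeps getting disabled
disk3 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([disk1, disk2, disk3])

# disk2 is disabled: it can be emulated only because disk1 and disk3 are readable.
emulated_disk2 = rebuild_missing(parity, [disk1, disk3])
assert emulated_disk2 == disk2

# If disk3 fails as well, parity and disk1 alone do not give back disk2.
assert rebuild_missing(parity, [disk1]) != disk2
```

With single parity, one unknown can be solved for; a second missing disk leaves two unknowns and the data cannot be recovered.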

Link to comment

The only way to guarantee it won't be used is to remove it.

 

If one of the other disks becomes disabled, then Unraid will have to use that disk and all the others in order to emulate the disabled disk.

 

If some data for one of the other disks can't be read, Unraid will have to use that disk and all the others in order to get the data that can't be read from that other disk then try to write it back to that other disk so it can be read in the future, or so it can be disabled if it can't be written.
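
The read-error case described above can be sketched roughly as follows, again assuming plain XOR parity. The `Disk` class, the `bad` flag, and `read_with_repair` are hypothetical names made up for this illustration, not Unraid internals; the point is only that the "stubborn" disk gets read even though nobody asked for a file on it.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class ReadError(Exception):
    pass

class Disk:
    def __init__(self, name, block, bad=False):
        self.name = name
        self.block = block
        self.bad = bad      # simulates an unreadable sector
        self.reads = 0      # counts how often this disk was read

    def read(self):
        self.reads += 1
        if self.bad:
            raise ReadError(self.name)
        return self.block

    def write(self, block):
        self.block = block
        self.bad = False    # assumption: the rewrite makes the sector readable again

def read_with_repair(target, others, parity):
    """Read from target; on error, rebuild from parity + all other disks and write back."""
    try:
        return target.read()
    except ReadError:
        rebuilt = xor_blocks([parity.read()] + [d.read() for d in others])
        target.write(rebuilt)
        return rebuilt

disk1 = Disk("disk1", b"\x01\x02")
disk2 = Disk("disk2", b"\x10\x20")   # the disk we would like to leave alone
disk3 = Disk("disk3", b"\xaa\xbb")
parity = Disk("parity", xor_blocks([disk1.block, disk2.block, disk3.block]))

disk3.bad = True                      # a read error on a completely different disk
data = read_with_repair(disk3, [disk1, disk2], parity)
assert data == b"\xaa\xbb"
assert disk2.reads == 1               # disk2 had to be read anyway
```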

 

Fix whatever problem is making the disk become disabled.

 

Have you run an extended SMART test on the disk?

Link to comment
6 minutes ago, trurl said:

The only way to guarantee it won't be used is to remove it.

 

If one of the other disks becomes disabled, then Unraid will have to use that disk and all the others in order to emulate the disabled disk.

 

If some data for one of the other disks can't be read, Unraid will have to use that disk and all the others in order to get the data that can't be read from that other disk then try to write it back to that other disk so it can be read in the future, or so it can be disabled if it can't be written.

 

Fix whatever problem is making the disk become disabled.

 

Have you run an extended SMART test on the disk?

Hmm such a shame. Is there a way to "access" the selection process of where it writes? Codewise I mean, I wouldn't mind diving into the plugin system for this, shouldn't be too hard to override the return value of chosen disk.

 

I did extended SMART, again I even went as far as sending it to the factory. 

 

I have 2 theories:

- Either it overheats and the temperature is just reported wrong (it's a SATA disk on a SAS controller, so I can imagine that; by the way, I ran the extended SMART test without the SAS controller);

- A mate of mine told me that disks become unreliable at a certain fill %, so I'm thinking this might be a factor too.

Edited by djmulder
Link to comment
2 minutes ago, djmulder said:

Is there a way to "access" the selection process of where it writes? Codewise I mean, I wouldn't mind diving into the plugin system for this, shouldn't be too hard to override the return value of chosen disk.

This is way way below the level of "plugins", and anything you did would just be breaking the builtin parity protection.

Link to comment
4 minutes ago, djmulder said:

Is there a way to "access" the selection process of where it writes? Codewise I mean, I wouldn't mind diving into the plugin system for this, shouldn't be too hard to override the return value of chosen disk.

And since you don't know why it doesn't work, how could you code around the problem?

Link to comment
5 minutes ago, djmulder said:

Is there a way to "access" the selection process of where it writes?

If you just mean some way to make sure the disk isn't chosen for writing new files for user shares, just exclude it from all user shares in Global Share Settings. But that won't guarantee it won't be accessed if needed

13 minutes ago, trurl said:

If some data for one of the other disks can't be read, Unraid will have to use that disk and all the others in order to get the data that can't be read from that other disk then try to write it back to that other disk so it can be read in the future, or so it can be disabled if it can't be written.

 

 

Link to comment
8 minutes ago, trurl said:

This is way way below the level of "plugins", and anything you did would just be breaking the builtin parity protection.

Hmm that would be weird and I don't think it would work like that (at least: I hope) as selecting which disk to target for a file-write/read would be completely unrelated to the parity. The parity update would come after the write.

7 minutes ago, trurl said:

And since you don't know why it doesn't work, how could you code around the problem?

Well, yes and no. I do want to fix it, but I also want stuff to keep working and not break.

Edited by djmulder
Link to comment
1 minute ago, djmulder said:

selecting which disk to target for a file-write/read would be completely unrelated to the parity.

yes, unrelated to parity

 

3 minutes ago, trurl said:

If you just mean some way to make sure the disk isn't chosen for writing new files for user shares, just exclude it from all user shares in Global Share Settings. But that won't guarantee it won't be accessed if needed

 

Link to comment
5 minutes ago, trurl said:

If you just mean some way to make sure the disk isn't chosen for writing new files for user shares, just exclude it from all user shares in Global Share Settings. But that won't guarantee it won't be accessed if needed

Hmm, good idea. I'll turn this around (to still have access in case I need a file from it): I'll exclude it for all the apps that use Unraid for writing.

Link to comment
Just now, djmulder said:

Hmm, good idea. I'll turn this around (to still have access in case I need a file from it): I'll exclude it for all the apps that use Unraid for writing.

If you don't exclude it from user shares in Global Share Settings, then it will still be read because all disks not excluded in Global Share Settings are included when reading user shares.

 

It will still be read even if you exclude a disk for writing by specific apps (normally your apps would access user shares and not disks).

 

It will still be read even if you exclude the disk for specific user shares.

 

That is because most of the settings for a specific user share just control how new files are written to the user share. So if you exclude the disk for a specific user share, that will make it not choose that disk when writing new files to that user share. But the disk will still be read when reading that user share if the disk contains files for that user share.

 

User Shares are simply the top level folders on cache and array. If a disk contains a top level folder for a user share, it is part of the user share, unless it is excluded from all user shares in Global Share Settings.
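
The difference between per-share exclusion and the Global Share Settings exclusion can be modelled with a small sketch. This is a simplified model of the behaviour described above, not Unraid's implementation: `Share`, `read_disks`, and `write_candidates` are hypothetical names, and allocation method and split level are ignored.

```python
from dataclasses import dataclass, field

@dataclass
class Share:
    name: str
    excluded_disks: set = field(default_factory=set)  # per-share exclusion

def read_disks(share, disk_folders, globally_excluded):
    """Disks consulted when READING the share: every disk that holds a top-level
    folder with the share's name, unless the disk is excluded globally."""
    return sorted(
        disk for disk, folders in disk_folders.items()
        if share.name in folders and disk not in globally_excluded
    )

def write_candidates(share, disk_folders, globally_excluded):
    """Disks eligible for NEW files: both exclusions apply."""
    return sorted(
        disk for disk in disk_folders
        if disk not in globally_excluded and disk not in share.excluded_disks
    )

# Hypothetical layout: which top-level folders exist on which array disk
disk_folders = {"disk1": {"public"}, "disk2": {"public"}, "disk3": set()}
public = Share("public", excluded_disks={"disk2"})

# Per-share exclusion only: disk2 gets no new files but is still read.
assert write_candidates(public, disk_folders, set()) == ["disk1", "disk3"]
assert read_disks(public, disk_folders, set()) == ["disk1", "disk2"]

# Global exclusion: disk2 drops out of user-share reads as well.
assert read_disks(public, disk_folders, {"disk2"}) == ["disk1"]
```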

 

Link to comment
33 minutes ago, djmulder said:

I did extended SMART, again I even went as far as sending it to the factory. 

If it isn't the disk, it's the connection, cable, controller, or possibly power.

 

Have you tried swapping the way it's connected with another disk to see if the problem goes to the other disk?

Link to comment
14 minutes ago, trurl said:

If you don't exclude it from user shares in Global Share Settings, then it will still be read because all disks not excluded in Global Share Settings are included when reading user shares.

 

It will still be read even if you exclude a disk for writing by specific apps (normally your apps would access user shares and not disks).

 

It will still be read even if you exclude the disk for specific user shares.

 

That is because most of the settings for a specific user share just control how new files are written to the user share. So if you exclude the disk for a specific user share, that will make it not choose that disk when writing new files to that user share. But the disk will still be read when reading that user share if the disk contains files for that user share.

 

User Shares are simply the top level folders on cache and array. If a disk contains a top level folder for a user share, it is part of the user share, unless it is excluded from all user shares in Global Share Settings.

 

Hmm so what happens if I remove them all from global? 

say I do that and then make a share "public" and a share "internal" where internal is the same as public except that one disk.

I do understand what you're saying, so basically the "public" share I have now is physically there on the array. But then what would the global settings add vs not using them?

 

6 minutes ago, trurl said:

If it isn't the disk, it's the connection, cable, controller, or possibly power.

Well, the issue with the disk was already there with my previous NAS: a completely different physical device (it was a Drobo), even a different house.

 

Ergo, I'm grasping at straws while still trying to keep everything running. That's why I wanted a plan that lets me analyse what is going on.

Edited by djmulder
Link to comment
9 minutes ago, djmulder said:

Hmm so what happens if I remove them all from global? 

say I do that and then make a share "public" and a share "internal" where internal is the same as public except that one disk.

I do understand what you're saying, so basically the "public" share I have now is physically there on the array. But then what would the global settings add vs not using them?

Sorry, not able to make any sense of that. Do you understand what I said here about how user shares work?

22 minutes ago, trurl said:

User Shares are simply the top level folders on cache and array. If a disk contains a top level folder for a user share, it is part of the user share

Maybe let's break it down

 

9 minutes ago, djmulder said:

if I remove them all from global

What does "them" refer to here?

 

9 minutes ago, djmulder said:

make a share "public" and a share "internal" where internal is the same as public except that one disk.

What is the name of this "public" share? What is the name of this "internal" share? It is the share name that defines a share, and any disk with a top level folder the same as that share name is part of that user share.

 

So, you could have a "public" share and an "internal" share and you could exclude that disk from the "public" share, but these shares would be different shares in every other possible way since they would have different names and those names correspond to all top level folders on any disk by that name.

Link to comment
11 minutes ago, trurl said:

Sorry, not able to make any sense of that. Do you understand what I said here about how user shares work?

Maybe let's break it down

Ah sorry, I'm not a native English speaker and I tend to literally write down what I'm thinking xD

 

12 minutes ago, trurl said:

What does "them" refer to here?

"them" refers to all the disks

 

13 minutes ago, trurl said:

What is the name of this "public" share? What is the name of this "internal" share? It is the share name that defines a share, and any disk with a top level folder the same as that share name is part of that user share.

Right now I have a share named "public"; physically on the system there's a /mnt/user/public folder. (Yes, yes, /mnt/user isn't part of the name; I'm just using the full path to be exact, and I know about /mnt/user0.)

 

What I'd want is: have 2 shares, named "public" and "internal" to replace that 1 share.

1 pointing to /mnt/user/public (all disks)

1 pointing to /mnt/user/public (all minus disk 2)

 

I do understand what you're saying, and that /mnt/user is the whole array.

Link to comment
1 minute ago, djmulder said:

I have a share named "public"

So do I, or actually, it is named "Public"

 

2 minutes ago, djmulder said:

What I'd want is: have 2 shares, named "public" and "internal" to replace that 1 share.

1 pointing to /mnt/user/public (all disks)

1 pointing to /mnt/user/public (all minus disk 2)

A user share named "internal" is at /mnt/user/internal. Any top level folder named "internal" on cache, or on any array disk not excluded in Global Share Settings, is part of the "internal" user share. So, /mnt/cache/internal, /mnt/disk1/internal, /mnt/disk2/internal ... are all part of the user share named "internal".

 

A user share named "public" is at /mnt/user/public. Any top level folder named "public" on cache, or on any array disk not excluded in Global Share Settings, is part of the "public" user share. So, /mnt/cache/public, /mnt/disk1/public, /mnt/disk2/public, ... are all part of the user share named "public".

 

So, as you can see, there is no way to have a user share named "internal" that refers to /mnt/cache/public, /mnt/disk1/public, (skip /mnt/disk2/public), ...
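
The name-to-folder mapping described above can be written down as a tiny sketch. `share_members` is a hypothetical helper, and it only lists candidate paths (a folder must actually exist on a disk to be part of the share); the `/mnt/...` layout is taken from the paths mentioned in this thread.

```python
from pathlib import PurePosixPath

def share_members(share_name, mounts, globally_excluded=()):
    """Candidate top-level folders that make up /mnt/user/<share_name>.

    mounts: names such as "cache", "disk1", "disk2"
    globally_excluded: array disks excluded in Global Share Settings
    """
    return [
        str(PurePosixPath("/mnt") / mount / share_name)
        for mount in mounts
        if mount not in globally_excluded
    ]

mounts = ["cache", "disk1", "disk2"]

# The share name alone decides which folders belong to the share.
assert share_members("internal", mounts) == [
    "/mnt/cache/internal", "/mnt/disk1/internal", "/mnt/disk2/internal",
]

# Global exclusion is the only knob that takes disk2 out of a share.
assert share_members("public", mounts, globally_excluded={"disk2"}) == [
    "/mnt/cache/public", "/mnt/disk1/public",
]
```

Nothing in this mapping lets a share called "internal" resolve to folders named "public".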

 

And anyway, I suspect that disk that you want to avoid accessing already contains several top level folders, and so already contains parts of several user shares. Each of those top level folders on that disk is part of the user share with the same name as that top level folder.

Link to comment

To summarize, there is no way to guarantee the disk won't be read while it is still part of the parity array, because the parity calculation must read all disks when necessary.

 

The disk can be excluded for reading or writing all user shares by excluding it in Global Share Settings. Except, it still might be read as required by the parity calculation when reading other disks.

 

And anyway, what did you think would happen with the disk during a parity check?
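
As a rough illustration of that last question, here is a sketch of what a parity check has to do, assuming plain XOR parity. `parity_check` and the sample data are hypothetical; the sketch only shows that share settings play no role and every data disk is read.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def parity_check(data_disks, parity):
    """Recompute parity from every data disk and compare it with the stored parity.

    Returns (ok, disks_read) so the caller can see which disks were touched.
    """
    disks_read = []
    blocks = []
    for name in sorted(data_disks):
        blocks.append(data_disks[name])   # stands in for reading the whole disk
        disks_read.append(name)
    return xor_blocks(blocks) == parity, disks_read

data_disks = {"disk1": b"\x01\x02", "disk2": b"\x10\x20", "disk3": b"\xaa\xbb"}
parity = xor_blocks(list(data_disks.values()))

ok, disks_read = parity_check(data_disks, parity)
assert ok
assert "disk2" in disks_read   # the excluded disk is read like any other
```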

Link to comment
