Strange Preclear Errors


Rich

Hi All,

 

I've just added another drive to my array, and during the preclear I noticed the errors below in the syslog (this is just a section). Disk14 is the drive being precleared.

Normally I would have just assumed I'd received a dodgy drive and would RMA it, but each error is on the same block and they are a minute apart. Is it just me, or is that a bit odd?

 

I would have thought they would be at random times and on random blocks?

 

Is this likely to mean anything, or should I just ignore it and RMA?

 

Thank you,

 

Rich

 

 

Feb  7 17:31:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:32:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:33:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:34:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:35:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:36:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:37:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:38:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:39:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:40:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:41:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:42:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:43:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:44:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
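
For anyone who wants to check the "same block, one minute apart" pattern on their own syslog rather than by eye, here is a minimal Python sketch. It uses four of the lines above as sample input. The year (2017) is an assumption, since syslog timestamps don't include one, and the helper names `parse_errors` and `summarise` are made up for illustration.

```python
import re
from datetime import datetime

# Four lines copied from the syslog excerpt above.
LOG = """\
Feb  7 17:38:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:39:01 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:40:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
Feb  7 17:41:02 unRAID kernel: Buffer I/O error on dev md14, logical block 1465130608, async page read
"""

PATTERN = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel: "
    r"Buffer I/O error on dev (\w+), logical block (\d+)"
)

def parse_errors(log_text, year=2017):
    """Return (timestamp, device, block) for each buffer I/O error line.

    The year is an ASSUMPTION: syslog lines do not carry one.
    """
    events = []
    for line in log_text.splitlines():
        m = PATTERN.match(line)
        if m:
            # Collapse the double space syslog uses for single-digit days.
            stamp_text = " ".join(m.group(1).split())
            stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
            events.append((stamp, m.group(2), int(m.group(3))))
    return events

def summarise(events):
    """Return the distinct (device, block) pairs and the gaps in seconds."""
    targets = {(dev, block) for _, dev, block in events}
    gaps = [
        (later[0] - earlier[0]).total_seconds()
        for earlier, later in zip(events, events[1:])
    ]
    return targets, gaps

targets, gaps = summarise(parse_errors(LOG))
print("distinct (device, block) pairs:", targets)
print("seconds between errors:", gaps)
```

A single (device, block) pair with gaps of about 60 seconds would suggest something is re-reading the same spot on a schedule, rather than the clear hitting new bad sectors as it moves across the disk. That is only a hint about where to look, not a diagnosis.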

Link to comment

I don't understand. You can't pre-clear an md device. An md device is a parity-protected member of the array. You can only pre-clear unassigned sd devices.

Link to comment

I installed the new drive, booted up, selected the new drive in the drop down for Disk 14 and started the array, agreeing that doing so would start a preclear.

 

That's the right way to add a drive, isn't it? Or are 'preclear' and 'clear' different things?

 

It has a little blue square next to it in the GUI for 'New Device'

Link to comment

Did it actually say it would start a preclear? On the forum preclear has always meant clearing a disk prior to (pre) adding it, either by a script or more recently a plugin. I assume if the webUI said anything at all it would just call it clear, but as I said, I haven't used that.

 

Older versions of unRAID would take the array offline to clear a disk, so preclear was born. Now that unRAID keeps the array online while clearing a disk, preclear is no longer needed for that purpose, but I think most people still use it to burn in a disk so it gets a good testing before it is trusted in the array.

Link to comment

Also, unRAID won't even clear a disk unless it is added to a new data slot in an array that already has parity, since the purpose of clearing is to ensure an added disk won't invalidate parity. If you are just using a disk to rebuild an existing disk, it won't get cleared at all, so the only testing it would get is the writing done by the rebuild, which doesn't even read the data back and so isn't much of a test.
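
To illustrate why a cleared disk can be added without invalidating parity, here is a toy Python sketch. It assumes single parity computed as a byte-wise XOR across the data disks, which is a simplification for illustration. The four-byte "disks" and the `xor_parity` helper are invented for the example.

```python
from functools import reduce

def xor_parity(disks):
    """Byte-wise XOR across equally sized 'disks' (toy single-parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

# Two existing data disks, four bytes each (made-up contents).
disk1 = bytes([0x12, 0xA0, 0xFF, 0x00])
disk2 = bytes([0x34, 0x0B, 0x0F, 0x7E])
parity_before = xor_parity([disk1, disk2])

# A cleared disk is all zeroes, and x XOR 0 == x, so parity is unchanged.
cleared = bytes(4)
parity_with_cleared = xor_parity([disk1, disk2, cleared])

# A disk with even one non-zero byte would change the parity for that position.
uncleared = bytes([0x01, 0x00, 0x00, 0x00])
parity_with_uncleared = xor_parity([disk1, disk2, uncleared])

print("cleared disk keeps parity valid:  ", parity_with_cleared == parity_before)
print("uncleared disk keeps parity valid:", parity_with_uncleared == parity_before)
```

In this model the all-zero disk leaves the existing parity untouched, while any leftover data on the new disk would make it wrong, which is why the clear has to finish before the disk joins the array.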

 

Have you never precleared or otherwise tested any of your disks?

 

Do you have Notifications set up so you can be warned if any of your disks begin to show problems?

Link to comment

Now I know the difference, thanks  :) and sorry for my naivety.

 

I have never precleared a drive with a script. I have always let unRAID clear it itself, even when that meant keeping the array offline for a while.

 

Yeah I've got notifications set up to email me when errors occur, although no errors are showing on the GUI for Disk14, so no notifications.

 

The only errors are the ones in the syslog (which are still occurring every minute and on the same block).

[Attached image: Capture.PNG]

Link to comment

I must confess I have never made use of unRAID 6.2's ability to clear disks. Nevertheless, I'm still puzzled by the fact that the error message is referring to an md device, since that would involve parity. The clearing has to happen on the raw sd device before the md is created, surely?

If unRAID is merely clearing the disk by writing zeroes to a new disk then there would be no need to write to parity.  It is only when the clear has finished and the disk starts being used as part of the data array that any writes have to involve parity.
Link to comment

We know that, but why is the message reporting that it is clearing an md device instead of an sd device?
Link to comment

Thanks for confirming that the parity disks are spun down. For a moment I thought that maybe this clearing process was doing something crazy, like reading the new disk and updating parity to match  :o

 

Yes, it would be irritating indeed if the disk turns out to be faulty - which is why many people still use the old pre-clear method to give the disk a good testing, as trurl explained.

 

Link to comment

I assume that is a detail of the way LimeTech have implemented the clear function.
Link to comment

From diagnostics, system/vars.txt

[disk14] => Array
        (
            [idx] => 14
            [name] => disk14
            [device] => sdg
            [id] => WDC_WD60EZRZ-00GZ5B1_WD-WX21D36PPPEF
            [rotational] => 1
            [size] => 5860522532
            [status] => DISK_NEW
            [temp] => 27
            [numReads] => 236
            [numWrites] => 4554156
            [numErrors] => 0
            [format] => GPT: 4K-aligned
            [type] => Data
            [comment] => 
            [color] => blue-on
            [exportable] => no
            [fsStatus] => -
            [fsColor] => grey-off
            [fsError] => 
            [fsType] => auto
            [fsSize] => 0
            [fsFree] => 0
            [spindownDelay] => -1
            [spinupGroup] => host1
            [deviceSb] => md14
            [idSb] => WDC_WD60EZRZ-00GZ5B1_WD-WX21D36PPPEF
            [sizeSb] => 5860522532
        )

I'm guessing [status] => DISK_NEW means it hasn't been added to the array yet, even though [deviceSb] => md14.
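
As a rough way to pull the interesting fields out of that dump and relate them to the syslog error, here is a small Python sketch. The excerpt is trimmed from the vars.txt output above. Two things are ASSUMPTIONS not confirmed anywhere in this thread: that [size] is in 1 KiB units, and that the kernel's "logical block" is 4 KiB. If either is wrong, the arithmetic means nothing. The `parse_fields` helper is made up for illustration.

```python
import re

# Trimmed from the system/vars.txt excerpt above.
VARS_EXCERPT = """\
[disk14] => Array
        (
            [name] => disk14
            [device] => sdg
            [size] => 5860522532
            [status] => DISK_NEW
            [numErrors] => 0
            [deviceSb] => md14
        )
"""

FIELD = re.compile(r"^\s*\[(\w+)\] => (.*)$")

def parse_fields(text):
    """Collect '[key] => value' pairs, skipping the 'Array' header line."""
    fields = {}
    for line in text.splitlines():
        m = FIELD.match(line)
        if m and m.group(2) != "Array":
            fields[m.group(1)] = m.group(2).strip()
    return fields

disk = parse_fields(VARS_EXCERPT)

# ASSUMPTIONS (not stated in the thread): size is in 1 KiB units,
# and a kernel "logical block" is 4 KiB.
SIZE_UNIT_BYTES = 1024
LOGICAL_BLOCK_BYTES = 4096
ERROR_BLOCK = 1465130608  # from the syslog lines in the first post

total_blocks = int(disk["size"]) * SIZE_UNIT_BYTES // LOGICAL_BLOCK_BYTES
blocks_from_end = total_blocks - ERROR_BLOCK

print("status:", disk["status"], "| device:", disk["device"], "->", disk["deviceSb"])
print("total 4 KiB blocks (under the assumptions):", total_blocks)
print("failing block is this many blocks from the end:", blocks_from_end)
```

Under those assumptions the failing block would sit very close to the end of the device rather than wherever the clear currently is, which might be worth comparing against the progress percentage. Treat that as something to verify, not as a conclusion.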

 

What happens after the clear is finished? Does the array get restarted so the disk becomes [status] => DISK_OK?

 

Of course the problem is the clear will never finish since it seems to be stuck for some reason.

 

Also, a lot of these in syslog

Feb  7 16:41:28 unRAID root: error: plugins/advanced.buttons/AdvancedButtons.php: wrong csrf_token

which has been discussed on the release thread but I'm not convinced has been fully explained. Are all your plugins up-to-date?

 

Can you stop the array? If so, I think I would go to Disk Settings and set it to not autostart, then reboot and see where we are with the disk assignments and what unRAID thinks it wants to do if you start. Since parity should still be valid for the array without the disk we should be able to easily get going again without the disk.

 

I have not had any issues with 6.3.0 myself, but I have never let unRAID clear a disk this way. And there are some reports of other issues people are having with the new version.

 

Maybe we could try a preclear on the new disk instead of letting unRAID clear it.

 

Link to comment

I genuinely can't remember what happens after the clear and how the drive gets added, sorry. It's been a long time since I've added a drive and last time it required the array to be offline the entire time.

The percentage counter is moving in the corner of the GUI, so it doesn't look like it's stuck?

 

Yeah, all plugins are up to date. I haven't done anything about the plugin errors, as the plugins are all working and I wanted to give them a bit of time to be updated to support the unRAID change before considering deleting them.

 

Sure, I'll have to search for how to do it, but I'm happy to stop it and preclear instead.

 

Is it OK to use this plugin?

https://raw.githubusercontent.com/gfjardim/unRAID-plugins/master/plugins/preclear.disk.plg

Link to comment
