Preclear plugin



Currently swapping out some old drives on my server for newer larger ones. I am intending to sell my old drives and have in the past done a three loop pre-clear on any drives I sell to give them a stress test to hopefully give the buyer a little peace of mind. 
 
I am currently pre-clearing six drives simultaneously and all is going well, apart from one drive that can't seem to get past the first zeroing. 
 
I've attached the log. It gets to the end of the zeroing loop. The dd process hangs, gets killed and starts from the beginning of the zeroing again. A dodgy drive perhaps? The pre-clearing isn't failing and SMART doesn't show any issues. 
preclear_disk_WD-WMC4N1495101_9736.txt
You're using an old version of the plugin, which suffered from a bug with those exact symptoms.

Sent from my SM-G985F using Tapatalk

Link to comment

After preclearing an Easystore 8TB still in its USB enclosure, I noticed the log being spammed by a repetitive message. I believe this started after I clicked the red X to "stop preclear." Note that I have to click that X every time, even after I'm alerted that the preclear is finished. Here is the message. It added about 1 MB to my syslog.
 

2021-01-16T23:00:03-05:00 Tower preclear_disk_375347564C4C5743[7 /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 478: /tmp/.preclear/sdg/dd_output_complete: No such file or directory
2021-01-16T23:00:03-05:00 Tower preclear_disk_375347564C4C5743[7 /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: /tmp/.preclear/sdg/dd_output_complete: No such file or directory
2021-01-16T23:00:03-05:00 Tower preclear_disk_375347564C4C5743[7 /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: [: -gt: unary operator expected

 

Link to comment
2 hours ago, ChillZwix said:

Hello, 

I'm trying to preclear 3 new disks, but it keeps failing. 

The first time I ran preclear on all 3, it failed at the post-read, so I tried just one disk, but now it failed at the pre-read. Can someone please help me out? 

lucifer-diagnostics-20210124-2043.zip 358.55 kB · 0 downloads

Your root filesystem is getting full and the script stopped to prevent any possible problems. 

 

Try to reboot and then start it again.
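
To see how full the root filesystem is before starting another run, something like the following could be used. It is only a minimal sketch and not part of the plugin: `usage_percent` is a made-up helper, and the fill level at which the script stops is not stated in this thread.

```python
import shutil


def usage_percent(path: str = "/") -> float:
    """Return how full the filesystem holding `path` is, in percent."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total


print(f"/ is {usage_percent('/'):.1f}% full")
```

If the number is close to 100, freeing space or rebooting (as suggested above) comes before retrying the preclear.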

Link to comment

My opinion about this plugin or the idea behind it:

 

The first part is clear: It writes zeros to the disk to preclear it, so it can be used directly in an Unraid array. The second part is: Verify that those zeros are still there after they were written.

 

I don't think the second part is really useful. It may sound logical, but ultimately it misses a simple point: A new disk already has zeros on it. So writing zeros to a disk which already has zeros on it verifies nothing, because writing only zeros means demagnetizing all sectors which are already demagnetized. Conclusion: You don't know if a sector could really contain data, as you never tested writing / magnetizing.

 

I think this would be better:

dd if=/dev/random of=/dev/sdX bs=1M
dd if=/dev/sdX of=/dev/null bs=1M

 

This creates pseudo-random data on each sector and after that it reads every sector.

 

Why is this sufficient or even better?

 

Because every sector on the disk has an ECC (Error Correcting Code) to verify the content of a sector:

1604700949_2021-01-2614_44_02.png.f5af6f98f399fd27474e32d1b244b9b8.png

 

And this ECC is verified on every read. This means if you read data from a disk, the firmware checks if the ECC fits the sector data and if not, it:

A) tries to re-read the sector or

B) rebuilds the data through the ECC (yes, it's some kind of parity, too) or

C) marks the sector as defective, which can be seen through SMART.

 

And because we are really writing data to the disk, we know, thanks to ECC, if the sector could be magnetized.

 

After executing those two commands you can add the disk to the array. Unraid will now clear it (writing zeros again).

 

This means we have three processes and each needs 1 hour per TB (for HDDs with an average of 150 MB/s). This gives us a total of 30 hours for a 10TB disk to verify and clear it. This is faster than the current method and, I think, the more reliable method.
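
As a rough check on that estimate, the time for one full pass is simply capacity divided by sustained throughput. The sketch below is only arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant 150 MB/s, which real disks do not hold across the whole platter; `pass_hours` is a name made up for illustration.

```python
def pass_hours(capacity_tb: float, mb_per_s: float) -> float:
    """Hours for one full sequential pass, assuming constant throughput."""
    capacity_bytes = capacity_tb * 1e12   # decimal TB, as drive vendors count
    bytes_per_s = mb_per_s * 1e6          # decimal MB/s
    return capacity_bytes / bytes_per_s / 3600


one_pass = pass_hours(10, 150)
print(f"one pass:     {one_pass:.1f} h")
print(f"three passes: {3 * one_pass:.1f} h")
```

Under these assumptions one pass over 10 TB takes about 18.5 hours, so three passes come to roughly 55 hours rather than 30. One hour per TB would need about 278 MB/s sustained.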

 

What's your opinion about that?

Link to comment
32 minutes ago, mgutt said:

What's your opinion about that?

I think that the preclear function already has very limited usefulness, other than a few niche applications like prepping a drive and leaving it on the shelf for rapid expansion.

 

What you are speaking of is expanding the testing portion to be more thorough and useful. I submit that preclear is the WRONG place for this.

 

I would much rather see the preclear operation pared back to ONLY a single pass of writing and confirming zeroes, then applying the secret sauce.

 

Please consider writing a plugin specifically for testing disks, with various levels of thoroughness. I would much rather get behind an unraid disk test suite than continue to hear about people saying preclear is needed when replacing an existing disk. Better to separate the functions to what they really should be, preclear for the very VERY few times it's warranted, and optional thorough disk testing for every disk that is introduced to the array.

 

You could get quite fancy and write a mostly non-destructive test that includes writes, by reading a sector, flipping all the bits, writing the sector, reading to confirm, then flipping them back, then reading them again. For extra points flush the disk cache after every operation.

 

It would take days on a large drive, but you could be very sure the drive was good.
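
The read, flip, write, confirm, flip back sequence described above could look roughly like this. It is a sketch under stated assumptions, not an existing tool: `flip_test` and `invert` are made-up names, it is meant to be tried on an ordinary file standing in for a disk, and `os.fsync` only asks the OS to push data out. Reads straight after a write may still be served from the page cache, so a real test would also need direct I/O and a drive cache flush.

```python
import os


def invert(block: bytes) -> bytes:
    """Flip every bit in the block."""
    return bytes(b ^ 0xFF for b in block)


def flip_test(path: str, block_size: int = 4096) -> list:
    """Return offsets of blocks that failed; the original data is restored."""
    bad = []
    with open(path, "r+b", buffering=0) as dev:
        offset = 0
        while True:
            dev.seek(offset)
            original = dev.read(block_size)
            if not original:          # end of file / device
                break
            ok = True
            # First write the inverted block, then put the original back.
            for pattern in (invert(original), original):
                dev.seek(offset)
                dev.write(pattern)
                os.fsync(dev.fileno())
                dev.seek(offset)
                if dev.read(len(pattern)) != pattern:
                    ok = False
            if not ok:
                bad.append(offset)
            offset += len(original)
    return bad
```

Each block is written twice and read three times, which is where the "days on a large drive" estimate comes from.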

Link to comment
4 hours ago, jonathanm said:

What you are speaking of is expanding the testing portion to be more thorough and useful. I submit that preclear is the WRONG place for this.

 

But it's the main reason why people are using it (as far as I know).

 

4 hours ago, jonathanm said:

I would much rather see the preclear operation pared back to ONLY a single pass of writing and confirming zeroes

 

If you extend your array, Unraid will clear it. Execute this and you have your confirmation (as the firmware compares the ECC):

dd if=/dev/sdX of=/dev/null bs=1M

 

We only need a plugin that adds a button for this or which automatically executes it after a disk was cleared through Unraid.

Link to comment




Your post wrongly assumes that every new drive is empty. Many USB drives that people shuck come formatted with NTFS from the factory. So no, not every new drive is wiped clean at the factory. You also can't guarantee that the disk doesn't become magnetized if it is exposed to a strong magnetic field prior to installation. And you wrongly assume every drive added is a new, empty drive, not a drive previously used in FreeNAS, Openvault etc.

Another wrong assumption is that every bit written to the disk is a zero. Look at this topic and see the many zeroing problems that occur because of defective RAM. I'm pretty sure Unraid doesn't verify the drives it clears, so this can lead to errors in the parity check if you have a bad RAM stick.

As it is now, you can use Erase and Clear Disk to write randomized data to it. It reads the disk prior to writing, but if you think it's important to random-fill the drive prior to the first (surface) read, it can be easily adapted.

It's not written to be a test tool, but in many cases this script behaves like one. I look at the statistics sent and I see loads of disks that have increased pending and reallocated sectors after the preclear, and many disks that fail preclear because of uncorrectable sectors and other hardware problems, so I know it can be a valuable tool.

Obviously, if you think the plugin is redundant you can always skip its use and rely on the built-in Unraid functionality.


Link to comment



 



You're wrong again. You need code to run this command, code to stop it, code to detect if a sync command is issued and pause the operation to prevent a Docker container from hanging on stop, code to collect/compare SMART attributes, code to send notifications etc. So believe me when I say it's not just a button.


Link to comment
6 hours ago, gfjardim said:

Your post wrongly assumes that every new drive is empty.

 

Ok, most of them. And in this case the pre-read, clearing and post-read don't really verify anything.

 

6 hours ago, gfjardim said:

problems in zeroing that occur because of defective RAM

 

I read that, but I don't think it's the task of this plugin to test the RAM. Of course it's a nice side effect.

 

6 hours ago, gfjardim said:

if you think the plugin is redundant

 

I don't think so. I only think it could be optimized. To cover everything (pre-clearing, non-empty disks, defective RAM, really writing data), it could create XXX MB of random data, build its CRC, write the data to the disk and compare the CRC by reading the data again. This would replace the pre-read and post-read, so it would take exactly the same amount of time. The final step would be the pre-clearing (which could even be optional if someone leaves this to Unraid).
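
One possible reading of that proposal, as a minimal sketch: generate a chunk of random data, take its CRC32, write it, read it back and compare. The names are made up, nothing here is taken from the plugin, and it is written for an ordinary file standing in for the disk (`os.path.getsize` reports 0 for block devices). It overwrites whatever is in the target.

```python
import os
import zlib


def write_and_check(path: str, chunk_size: int = 1 << 20) -> list:
    """Overwrite `path` chunk by chunk with random data (DESTRUCTIVE).

    Returns the offsets whose read-back CRC32 differs from the CRC32
    of the data that was written.
    """
    mismatches = []
    size = os.path.getsize(path)
    with open(path, "r+b", buffering=0) as dev:
        for offset in range(0, size, chunk_size):
            length = min(chunk_size, size - offset)
            data = os.urandom(length)
            expected = zlib.crc32(data)
            dev.seek(offset)
            dev.write(data)
            os.fsync(dev.fileno())
            dev.seek(offset)
            if zlib.crc32(dev.read(length)) != expected:
                mismatches.append(offset)
    return mismatches
```

Because each chunk is read back right after it is written, the OS page cache can answer the read, so the data may never actually come from the platters unless caching is bypassed.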

Link to comment

I just bought two Western Digital Easystore 12 TB HDDs. I haven't taken them out of the external enclosure yet because I am out of SATA ports on my motherboard. I have them plugged into the back of my motherboard as regular USB 3.0 devices. Previously, I have successfully precleared drives with them directly plugged into the SATA port and a USB 3.0 port so I know this is something that can be done. I currently have one drive part of the array as a USB device and everything has been working fine.

 

I am still on the trial version of unRAID 6.8.3. I'm running Preclear Disk version 2021.01.03, Unassigned Devices version 2021.01.16b, Unassigned Devices Plus version 2021.01.24.

 

I load up the Preclear plugin and on the rightmost column Preclear status for both drives shows up as "Disk mounted" so I click "Start Multiple Sessions."

 

The first issue is that the box that says "Select Disks: Preclears Disks" is a gray font and if I click on the box for the drop down menu, then the 2 drives are also gray. I tried this with only one disk plugged in and it's the same gray font but now with only one drive showing. I can't recall exactly but I think on my previous runs, they were a black font.

 

1554791495_unRAIDPreClearIssue2021-01-2819_44_31.thumb.png.c5d3beb92781c3e82ffd569fe9820675.png

 

I then click the "START" button, which turns into 3 white dots that light up gradually like a loading animation. The button stays like that for about 30 minutes without anything happening or any progress being shown. I have tried this with both the gfjardim - 1.0.20 script and the Joe L. - 1.20 script, with the same results.

 

214892743_unRAIDPreClearnotstarting2021-01-2819_44_31.thumb.png.60a62e5cfe22882f2668c8d92e622116.png

Link to comment

Don't pre-clear the WD drives in the enclosure. I know from experience it won't work and you will have to do it again when you add them to the array. Unless you plan to keep the drives in the enclosures while they are in the array that is.

 

If you plan to run them off USB in the array, I would highly recommend figuring out a way to get some more SATA ports (time for an HBA, it sounds like). USB is not nearly as reliable as SATA, and it only takes a drive dropping out for a second to force you to rebuild the entire disk.

 

You can get HBAs on eBay for less than $50, last I checked.

 

As far as the issues with the plugin itself, I can't help you there.

Link to comment
10 hours ago, mondama13 said:

I just bought two Western Digital Easystore 12 TB HDDs. I haven't taken them out of the external enclosure yet because I am out of SATA ports on my motherboard. I have them plugged into the back of my motherboard as regular USB 3.0 devices. Previously, I have successfully precleared drives with them directly plugged into the SATA port and a USB 3.0 port so I know this is something that can be done. I currently have one drive part of the array as a USB device and everything has been working fine.

 

I am still on the trial version of unRAID 6.8.3. I'm running Preclear Disk version 2021.01.03, Unassigned Devices version 2021.01.16b, Unassigned Devices Plus version 2021.01.24.

 

I load up the Preclear plugin and on the rightmost column Preclear status for both drives shows up as "Disk mounted" so I click "Start Multiple Sessions."

 

 

If it's marked as mounted, then it's mounted. Depending on the settings in Unassigned Devices, it auto-mounts USB disks when they are plugged in.

Link to comment
On 1/26/2021 at 2:58 PM, mgutt said:

I don't think the second part is really useful. It may sound logical, but ultimately it misses a simple point: A new disk already has zeros on it. So writing zeros to a disk which already has zeros on it verifies nothing, because writing only zeros means demagnetizing all sectors which are already demagnetized. Conclusion: You don't know if a sector could really contain data, as you never tested writing / magnetizing.

 

Your assumptions on magnetic media are wrong. The information on a disk is not coded into magnetised / demagnetised spots, it is coded into transitions of magnetisation with opposite polarities. The read heads detect magnetic flux changes, not the magnetisation itself. Also what is written are not the bits that the disk driver hands over to the drive. The data coming from the driver gets re-encoded in a way that optimises several parameters, e.g. number of transitions for clock recovery, influence on neighbouring spots, error correction. How it is done on a given drive is the secret sauce of its manufacturer, but whatever bit patterns you send to the drive, what ends up on the disk is something different and the spots on the disk platter will always get magnetised with one polarity or the other.

For disk surface testing it does therefore not matter much, whether you write zeroes or random data. For stressing the mechanical elements and forcing early failures of marginal drives it doesn't matter either. For pre-clearing a disk you obviously need to write zeroes.

There are some scenarios, e.g. a controller circuit on the drive having issues or a fake SSD with less capacity than advertised, where using a pseudo-random sequence would be far superior to using zeroes. But in that case you would need reproducible pseudo-random sequences so that you only need to store the seed between write / read and not the sequence itself. Your proposal to create a block of random data and repeatedly write / read that block would not detect issues with addressing. I do however think that those scenarios are beyond the intention of this plugin. There are already tools for that kind of tests.

 

Pre-clearing was very important when Unraid was not able to clear a disk while keeping the array available. This has changed, but the plugin can still be used to stress-test a drive before adding it to the array. 

Link to comment
7 hours ago, tstor said:

it is coded into transitions of magnetisation with opposite polarities

Ok, I read this article. Now I understand the principle.

 

7 hours ago, tstor said:

For disk surface testing it does therefore not matter much, whether you write zeroes or random data.

But if you write zeros everywhere, it should use only one polarity on the complete surface.

 

7 hours ago, tstor said:

repeatedly write / read that block would not detect issues with addressing

Why not? I would create the block, calculate its CRC hash, write the block, read the block and compare the CRC before/after.

 

 

Link to comment
40 minutes ago, mgutt said:

Why not? I would create the block, calculate its CRC hash, write the block, read the block and compare the CRC before/after.

 

 

 

This approach was needed before, when disk controllers lacked advanced firmware functionality. Like you said, if a block is read and it doesn't check against its ECC, the firmware tries to recover the data from the ECC, and if that fails, it will mark the sector as a pending sector. When you write the sector again, the firmware will store the value in a spare sector designated to take its place, and mark the old one as a reallocated sector. That's why you don't need fancy random patterns to test modern disks.

 

That's why we read the disk in the pre-read, to force the disk to detect defective sectors and mark them as pending sectors. After that we write to the disk to force the firmware to map new spare sectors to the pending sector's place and mark them as reallocated sectors.
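
The effect described here shows up as a change in a few SMART counters between the start and the end of a run. The following is a small illustration of that comparison only, not the plugin's actual code: the helper name is invented, the attribute names are the ones smartctl commonly prints, and the numbers are made up.

```python
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def smart_changes(before: dict, after: dict) -> dict:
    """Return {attribute: (old, new)} for watched attributes whose raw value changed."""
    return {
        name: (before.get(name, 0), after.get(name, 0))
        for name in WATCHED
        if before.get(name, 0) != after.get(name, 0)
    }


# Made-up raw values, for illustration only.
before = {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 8, "Offline_Uncorrectable": 0}
after = {"Reallocated_Sector_Ct": 8, "Current_Pending_Sector": 0, "Offline_Uncorrectable": 0}
print(smart_changes(before, after))
```

In this invented case the pending sectors found by the read were remapped by the write, so the pending count drops while the reallocated count rises.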

Link to comment
On 1/30/2021 at 1:16 AM, mgutt said:

But if you write zeros everywhere, it should use only one polarity on the complete surface.

 

No, the data you write is encoded for two reasons: guarantee a minimum of transitions for clock recovery (see RLL codes, https://en.wikipedia.org/wiki/Run-length_limited) and modern error recovery algorithms (https://en.wikipedia.org/wiki/Low-density_parity-check_code, https://web.archive.org/web/20161213104211/http://www.marvell.com/storage/assets/Marvell_88i9422_Soleil_pb_FINAL.pdf)

In other words, there will always be a lot of flux reversals / polarity changes regardless of what exactly you write.

 

On 1/30/2021 at 1:16 AM, mgutt said:

I would create the block, calculate its CRC hash, write the block, read the block and compare the CRC before/after.

 

Yes, but assume you have an address line defect in a higher address line (or a fake SSD with only 128 GB instead of the expected 2 TB). You would write a block, e.g. 1 GB, and successfully read it back from there. All your tests would be successful. During your tests you would never notice that, when you believe you are writing into the second 128 GB, you actually overwrite the first. Only when you write the whole disk first and then start reading back would you detect this kind of defect / cheating. Since you are unlikely to have enough RAM to keep a copy of the written random data for the whole disk, you need a pseudo-random sequence as a source for the written data. When you read back you can then just use the same generator and seed and generate the sequence again for comparison with the read data.
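
The difference between the two strategies can be shown on a toy model. Everything below is hypothetical and only illustrates the argument: `AliasedDevice` simulates a device that advertises more capacity than it stores and wraps addresses around, and Python's `random.Random` with a fixed seed stands in for the reproducible pseudo-random generator.

```python
import random


class AliasedDevice:
    """Toy device advertising `advertised` bytes but storing only `real`.

    Writes past the real capacity wrap around, like a fake SSD or a
    stuck high address line. Purely illustrative.
    """

    def __init__(self, advertised: int, real: int):
        self.advertised = advertised
        self.store = bytearray(real)

    def write(self, offset: int, data: bytes) -> None:
        for i, b in enumerate(data):
            self.store[(offset + i) % len(self.store)] = b

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.store[(offset + i) % len(self.store)] for i in range(length))


def blocks(seed: int, count: int, block_size: int):
    """Yield `count` reproducible pseudo-random blocks for a given seed."""
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.getrandbits(8 * block_size).to_bytes(block_size, "little")


def per_block_check(dev, seed: int, block_size: int) -> int:
    """Write one block, read it straight back, move on. Returns failed blocks."""
    bad = 0
    count = dev.advertised // block_size
    for n, data in enumerate(blocks(seed, count, block_size)):
        dev.write(n * block_size, data)
        bad += dev.read(n * block_size, block_size) != data
    return bad


def full_pass_check(dev, seed: int, block_size: int) -> int:
    """Write the whole device, then regenerate the sequence and compare."""
    count = dev.advertised // block_size
    for n, data in enumerate(blocks(seed, count, block_size)):
        dev.write(n * block_size, data)
    bad = 0
    for n, data in enumerate(blocks(seed, count, block_size)):
        bad += dev.read(n * block_size, block_size) != data
    return bad


fake = AliasedDevice(advertised=4096, real=1024)
print("per-block failures:", per_block_check(fake, seed=42, block_size=256))
print("full-pass failures:", full_pass_check(fake, seed=42, block_size=256))
```

On this toy device the per-block check reports no failures, while the full pass flags the 12 of 16 blocks that were silently overwritten by later writes. Only the seed has to be remembered between the write and the read phase.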

Link to comment

 Hello guys,

 

I am testing three 1TB drives to see if they're still healthy enough to be used in an Unraid array. I'm preclearing them with the Preclear plugin. Post-read verification failed on a few runs. To speed up the process, I skipped the pre-read after the first time. I already checked the SATA cables (all new) and swapped disk positions. I'm also wondering if the power delivery to the drive cage is sufficient, but I don't think a problem there is very likely. The current pending sectors returned to 0 after the clearing cycles. After the second preclear cycle I started another round of extended SMART tests on the drives, which exited with some errors.

 

Memory is fine. (Ran Memtest for 13 hours)

 

A summary on the tests:
 

+-------+----------------+----------------+----------------+----------------+---------+
| Drive |      S/N       |   Preclear 1   |   Preclear 2   |   Preclear 3   | SMART 3 |
+-------+----------------+----------------+----------------+----------------+---------+
|     1 | JP2940J8046DTV | Pass           | Post-Read Fail | Pass           | Pass    |
|     2 | S1Y5J90SC63261 | Post-Read Fail | Post-Read Fail | Post-Read Fail | Pass    |
|     3 | S246J9BZ616699 | Disappeared    | Post-Read Fail | Post-Read Fail | Fail    |
+-------+----------------+----------------+----------------+----------------+---------+

 

SMART Reports (after 3rd Preclear run)

S246J9BZ616699  -  Error

JP2940J8046DTV - Passed

S1Y5J90SC63261 - Passed

 

I'm still new to Unraid and SMART reports and would like to understand better what these results mean.

I'd appreciate your help a lot

Edited by adrifromhh
Link to comment

Hi all. Similar to Sknxk, I am getting the following errors on UNRAID 6.8.3 with preclear plugin 2021.01.03.

 

This is a brand new Unraid build on an old NAS.

 

Feb 14 00:51:38 Tower preclear_disk_MN5210F32Z4XXX[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 498: 0 * 100 /         0 : division by 0 (error token is "0 ")
Feb 14 06:48:46 Tower preclear_disk_MN5210F32Z4W8K[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: /tmp/.preclear/sdc/dd_output_complete: No such file or directory
Feb 14 06:48:46 Tower preclear_disk_MN5210F32Z4XXX[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: [: -gt: unary operator expected
Feb 14 06:48:46 Tower preclear_disk_MN5210F32Z4XXX[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 478: /tmp/.preclear/sdc/dd_output_complete: No such file or directory
Feb 14 06:48:46 Tower preclear_disk_MN5210F32Z4XXX[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: /tmp/.preclear/sdc/dd_output_complete: No such file or directory
Feb 14 06:48:46 Tower preclear_disk_MN5210F32Z4XXX[29721]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: [: -gt: unary operator expected

 

Link to comment

Hi.  Was the preclear plugin updated and stopped working in the new version for some reason?  I just went to preclear a disk

Pre-read - Done

Zeroing - Done

 

And that's it. The post-read never occurs and the plugin is just back to asking me if I want to pre-read. Here is the log.

 

Feb 14 17:41:44 UnRAID preclear_disk_5PJHBRLE[18094]: Zeroing: dd if=/dev/zero of=/dev/sdg bs=2097152 seek=2097152 count=12000136527872 conv=notrunc iflag=count_bytes,nocache,fullblock oflag=seek_bytes
Feb 15 05:20:38 UnRAID preclear_disk_5PJHBRLE[18094]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 475: /tmp/.preclear/sdg/dd_output_complete: No such file or directory
Feb 15 15:42:54 UnRAID preclear_disk_5PJHBRLE[18094]: Post-Read: cmp /tmp/.preclear/sdg/fifo /dev/zero
Feb 15 15:42:54 UnRAID preclear_disk_5PJHBRLE[18094]: Post-Read: dd if=/dev/sdg of=/tmp/.preclear/sdg/fifo count=2096640 skip=512 iflag=nocache,count_bytes,skip_bytes
Feb 15 15:42:55 UnRAID preclear_disk_5PJHBRLE[18094]: Post-Read: cmp /tmp/.preclear/sdg/fifo /dev/zero
Feb 15 15:42:55 UnRAID preclear_disk_5PJHBRLE[18094]: Post-Read: dd if=/dev/sdg of=/tmp/.preclear/sdg/fifo bs=2097152 skip=2097152 count=12000136527872 iflag=nocache,count_bytes,skip_bytes
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory
Feb 15 15:44:41 UnRAID preclear_disk_5PJHBRLE[18094]: cat: /tmp/.preclear/sdg/smart_cycle_initial_start: No such file or directory

 

 

/tmp/.preclear/sdg/smart_cycle_initial_start does in fact exist.  Why is it saying no such file exists?

Link to comment

Hey guys,

 

I'm currently running preclear on 2 disks. The log is getting flooded with messages like these:

 

Feb 15 21:53:13 Tower preclear_disk_ZDH1NDE7[5373]: tput: unknown terminal "screen"

 

I started the preclear via the Unassigned Devices plugin. In the GUI everything appears fine, and the Unassigned Devices section shows the preclear progress correctly, but the GUI can no longer show me the syslog, due to "out of RAM".

 

Not a big deal, but thought I'd report it.

Link to comment