Unassigned Devices Preclear - a utility to preclear disks before adding them to the array


dlandon


13 minutes ago, comet424 said:

@dlandon

 

Were there any other tests you wanted me to try on my backup server to find out why the preclear wouldn't resume after the reboot?

I tried it a couple of times and got the same result: it won't resume. It did resume that one time, but only on one of the hard drives.

I don't remember you answering this post:

Were sdc and sde the only drives being precleared?

Link to comment

@dlandon Oh, I didn't see your reply. I don't get email notifications for replies, so I just keep the browser window open and refresh to check for them.

But yes, the two 16TB drives are the only two drives being precleared. I start the preclear again and reboot, and the resume option isn't there.

There was just that one time I mentioned, where one 16TB drive saved its resume state so I could resume, but the other 16TB drive made me restart from scratch.

The other two disks in the server are my array disks.

Sorry I missed your reply.

Link to comment

I'm currently doing a preclear on a 12TB drive that I started yesterday, but in the middle of the night Unraid crashed, I guess. There's no monitor attached, so I hit the reset button. Afterwards there was no resume option and no syslog, so maybe preclear can't resume after a power outage or a crash.

I then started a new 3-cycle preclear, ran it for about 2 minutes, and rebooted, and then I could resume. It's almost hit and miss. I'm wondering if I should just downgrade my backup server to 11.5, since it seems more stable than 12.2. I already downgraded my main server from 12.2 to 11.5, and that cleared up a lot of the issues I was having. I don't think 12.2 is ready to be called stable yet.

Link to comment
4 hours ago, comet424 said:

I'm currently doing a preclear on a 12TB drive that I started yesterday, but in the middle of the night Unraid crashed, I guess. There's no monitor attached, so I hit the reset button. Afterwards there was no resume option and no syslog, so maybe preclear can't resume after a power outage or a crash.

I then started a new 3-cycle preclear, ran it for about 2 minutes, and rebooted, and then I could resume. It's almost hit and miss. I'm wondering if I should just downgrade my backup server to 11.5, since it seems more stable than 12.2. I already downgraded my main server from 12.2 to 11.5, and that cleared up a lot of the issues I was having. I don't think 12.2 is ready to be called stable yet.

The preclear status file used to resume a preclear is written to the flash drive so it will be available after a reboot.  When the server is shutting down, an event stops all preclears and causes the preclear status of each disk to be written to the flash drive.  The preclear status will not be written to the flash drive if the server crashes.  To minimize flash writes, the preclear does not continuously write its current status to the flash.
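
The behavior described above (state saved only on a clean stop, never during normal running) can be illustrated with a toy model. This is only a sketch and not the plugin's code: the class `ResumableJob`, the `.resume.json` file name, and the JSON layout are all made up for illustration, and a temporary directory stands in for the flash drive.

```python
import json
import tempfile
from pathlib import Path


class ResumableJob:
    """Toy model of a long-running job that persists state only when told to stop."""

    def __init__(self, serial, total_blocks, status_dir):
        self.serial = serial
        self.total_blocks = total_blocks
        self.done_blocks = 0
        # Hypothetical file name; the real status file format is not shown in this thread.
        self.status_file = Path(status_dir) / f"{serial}.resume.json"

    def work(self, blocks):
        # Progress is tracked in memory only, so no writes hit the "flash" here.
        self.done_blocks = min(self.total_blocks, self.done_blocks + blocks)

    def stop(self):
        # Stands in for the shutdown event: the only place state is written out.
        self.status_file.write_text(json.dumps({"done_blocks": self.done_blocks}))

    def try_resume(self):
        # After a restart, resuming is possible only if a status file exists.
        if not self.status_file.exists():
            return False
        self.done_blocks = json.loads(self.status_file.read_text())["done_blocks"]
        return True


def simulate(clean_shutdown):
    """Return True if the job can resume after the simulated restart."""
    with tempfile.TemporaryDirectory() as tmp:
        job = ResumableJob("DISK1", total_blocks=1000, status_dir=tmp)
        job.work(250)
        if clean_shutdown:
            job.stop()  # clean shutdown: status is saved
        # On a crash, stop() never runs, so nothing was saved.
        restarted = ResumableJob("DISK1", total_blocks=1000, status_dir=tmp)
        return restarted.try_resume()


if __name__ == "__main__":
    print("clean shutdown ->", simulate(True))
    print("crash          ->", simulate(False))
```

The trade-off in this model is the same one described in the post: fewer writes to the flash in exchange for losing the resume point when the stop step never runs.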

Link to comment

Ah OK, good to know. But sometimes a reboot doesn't save the preclear resume status to the flash drive either, as you've seen in the syslogs. Is it a coin flip whether it saves to the flash correctly? It hasn't been working reliably: with those two 16TB drives, one saved and the other didn't, so I had to restart it. It's probably something in 12.2, as I never had a problem with resuming on 11.5.

Once these finish their 3 cycles, I'm probably just going to downgrade back to 11.5. Too many issues with 12.2 so far. I'll wait a few months for the next fix and try it then.

Nothing's perfect, but I do appreciate Unraid and the great programmers like yourself making these great plugins 🙂

Link to comment
15 minutes ago, comet424 said:

Ah OK, good to know. But sometimes a reboot doesn't save the preclear resume status to the flash drive either, as you've seen in the syslogs. Is it a coin flip whether it saves to the flash correctly? It hasn't been working reliably: with those two 16TB drives, one saved and the other didn't, so I had to restart it. It's probably something in 12.2, as I never had a problem with resuming on 11.5.

I am going to do some testing with two preclears running at once when the server is shut down.  I suspect there may be an issue with more than one preclear running on shutdown.  I saw in one of your logs that the shutdown did pause two disks.

Link to comment

Yes, and when it came back up, one would resume where the other didn't. I'm not sure if it's because they're 16TB drives. I'm now preclearing a 12TB drive, so that's going to take a week, and I've used the two 16TB drives in the array for my backup server. I'll retest the reboot on the 12TB drive in a week. I figured maybe it's due to the 16TB drives, but I'm not sure. So far I usually run preclears on my backup server, since I don't bother touching it, so it can keep working while my main server reboots, etc.

 

Link to comment
  • 3 weeks later...

I'm not understanding the "no space left on device" error.....  This is an 8TB disk I'm trying to preclear.

 

 

Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100274+0 records out
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 210289819648 bytes (210 GB, 196 GiB) copied, 1705.49 s, 123 MB/s
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100283+0 records in
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100283+0 records out
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 210308694016 bytes (210 GB, 196 GiB) copied, 1721.06 s, 122 MB/s
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100284+0 records in
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100284+0 records out
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 210310791168 bytes (210 GB, 196 GiB) copied, 1894.07 s, 111 MB/s
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100284+0 records in
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100284+0 records out
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 210310791168 bytes (210 GB, 196 GiB) copied, 1894.07 s, 111 MB/s
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: dd: error writing '/dev/sdc': No space left on device
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100285+0 records in
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 100284+0 records out
Jul 29 08:09:12 preclear_disk_Z840SENE_13892: Zeroing: dd output: 210310791168 bytes (210 GB, 196 GiB) copied, 1894.07 s, 111 MB/s
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: Zeroing: zeroing the disk failed!
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Error:
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: S.M.A.R.T.:
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: S.M.A.R.T.: ATTRIBUTE               INITIAL NOW STATUS
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Reallocated_Sector_Ct   0       -
Jul 29 08:09:13 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Power_On_Hours          49905   -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Runtime_Bad_Block       0       -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: End-to-End_Error        0       -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Reported_Uncorrect      0       -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Airflow_Temperature_Cel 31      -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Current_Pending_Sector  0       -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: Offline_Uncorrectable   0       -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: UDMA_CRC_Error_Count    76      -
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: S.M.A.R.T.: 
Jul 29 08:09:14 preclear_disk_Z840SENE_13892: error encountered, exiting ...

Link to comment
On 7/29/2023 at 11:29 AM, tucansam said:

I'm not understanding the "no space left on device" error.....  This is an 8TB disk I'm trying to preclear.

The preclear writes to the disk in chunks, and there was no room for UD to write the chunk.  If you look at the SMART report, there are 76 CRC errors, which might indicate a problem.  Check your disk cables.
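
To put the numbers from the posted log in perspective, here is a small sketch that reads the last dd lines and works out the chunk size and how far the zeroing got. The helper name `summarize_dd` is made up for illustration, and the 8TB capacity is assumed to be a nominal 8 * 10**12 bytes. It only does the arithmetic and does not diagnose the cause.

```python
import re

DD_BYTES = re.compile(r"(\d+) bytes \(")
DD_RECORDS_OUT = re.compile(r"(\d+)\+\d+ records out")


def summarize_dd(log_lines, disk_bytes):
    """Return (bytes_written, block_size, fraction_of_disk) from dd output lines."""
    bytes_written = 0
    records_out = 0
    for line in log_lines:
        m = DD_BYTES.search(line)
        if m:
            bytes_written = int(m.group(1))
        m = DD_RECORDS_OUT.search(line)
        if m:
            records_out = int(m.group(1))
    # Average bytes per completed output record, i.e. the chunk size dd was using.
    block_size = bytes_written // records_out if records_out else 0
    return bytes_written, block_size, bytes_written / disk_bytes


# Final dd lines from the log above, with the syslog prefix trimmed.
LOG = [
    "dd output: dd: error writing '/dev/sdc': No space left on device",
    "dd output: 100285+0 records in",
    "dd output: 100284+0 records out",
    "dd output: 210310791168 bytes (210 GB, 196 GiB) copied, 1894.07 s, 111 MB/s",
]

# Assumption: "8TB" means a nominal 8 * 10**12 bytes.
written, block, fraction = summarize_dd(LOG, 8 * 10**12)
print(f"{written} bytes in {block}-byte blocks = {fraction:.1%} of the disk")
# prints: 210310791168 bytes in 2097152-byte blocks = 2.6% of the disk
```

So dd was writing 2 MiB chunks and stopped after roughly 210 GB, a small fraction of the drive's nominal capacity.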

Link to comment
  • 2 weeks later...

Hi.

 

I have 3 unraid systems and I just updated unassigned.devices.preclear on 2 of them , and those two are now showing serious UI issues. 

  • The Dashboard page has headings but no content under the headings.
  • The Main page looks fine, but the button to Stop the array doesn't work. The page refreshes and I see the same device display, the Stop button is still showing, and there are no status messages indicating the array is stopping.
  • The Shares page has headings but no content.
  • The User page looks normal.
  • The Settings page looks fine. I didn't visit all the settings pages, but the ones I did visit showed expected content.
  • The Plugins page has headings but no content.
  • The Docker page has headings but no content.
  • I don't run VMs so I didn't check that page.
  • The Apps page has the side menu, the search box, and a single heading Updating Content, but no content.
  • The Stats page has no content.
  • The Tools page looks okay. I was able to generate the diagnostics (attached).

 

I am able to run the webterminal. I can also login via ssh.

 

I think the dockers are running - I see processes I associate with my dockers in the ps output.

 

I am seeing some messages like this in the logs:

Aug  8 15:52:01 flint-un nginx: 2023/08/08 15:52:01 [error] 8140#8140: *1894 open() "/usr/local/emhttp/plugins/unassigned.devices.preclear/assets/javascript.js" failed (2: No such file or directory) while sending to client, client: 192.168.52.67, server: , request: "GET /plugins/unassigned.devices.preclear/assets/javascript.js?v=autov_fileDoesntExist HTTP/2.0", host: "flint-un.pc.kntc.ca", referrer: "https://flint-un.pc.kntc.ca/Main"

 

The contents of /usr/local/emhttp/plugins/unassigned.devices.preclear/assets is:

root@flint-un:/usr/local/emhttp/plugins/unassigned.devices.preclear/assets# ls -l
total 56
-rwxr-xr-x 1 root root  5099 Jun 21  2017 arrive.min.js*
-rwxr-xr-x 1 root root  4492 Jan  2  2022 sweetalert2.css*
-rwxr-xr-x 1 root root 40887 Mar 14  2022 sweetalert2.js*
root@flint-un:/usr/local/emhttp/plugins/unassigned.devices.preclear/assets#

No javascript.js file here, but the one server that I haven't updated the unassigned.devices.preclear plugin on does have this file.
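
For anyone checking their own syslog for the same symptom, the missing-file paths can be pulled out of nginx error lines like the one above. This is a minimal sketch: the function name `missing_files` is made up, and the pattern only matches the "open() ... failed (2: No such file or directory)" form shown in this post.

```python
import re

# Matches the nginx error form quoted in the post above.
MISSING = re.compile(r'open\(\) "([^"]+)" failed \(2: No such file or directory\)')


def missing_files(log_lines):
    """Collect the unique file paths nginx reported as missing, in first-seen order."""
    seen = []
    for line in log_lines:
        m = MISSING.search(line)
        if m and m.group(1) not in seen:
            seen.append(m.group(1))
    return seen


# Excerpt of the log line from the post, repeated to show de-duplication.
SAMPLE = [
    'nginx: 2023/08/08 15:52:01 [error] 8140#8140: *1894 open() '
    '"/usr/local/emhttp/plugins/unassigned.devices.preclear/assets/javascript.js" '
    'failed (2: No such file or directory) while sending to client',
] * 2

for path in missing_files(SAMPLE):
    print(path)
```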

 

So something in the plugin update process appears to have trashed that file. How can I get the needed file? I'll need an out-of-band method to get it because the Plugins page isn't showing the list of plugins, so there's no update ability through the GUI at this time.

 

Thanks for any help!

 

flint-un-diagnostics-20230808-1553.zip

Link to comment
51 minutes ago, MrChip said:

So something in the plugin update process appears to have trashed that file. How can I get the needed file? I'll need an out-of-band method to get it because the Plugins page isn't showing the list of plugins, so there's no update ability through the GUI at this time.

I released an update to preclear that was missing some files.  I reverted the update as soon as I tried to update and had the issue.  I've fixed that and released the update that works.

Link to comment
15 minutes ago, dlandon said:

I released an update to preclear that was missing some files.  I reverted the update as soon as I tried to update and had the issue.  I've fixed that and released the update that works.

 

Thank you, that got things back to normal. Much appreciated.

 

Deleting unassigned.devices.preclear.plg appears to have removed the plugin from my servers (which I expected). I tried re-installing the plugin through Community Applications, but it went back into the problem state. 

Link to comment
Just now, MrChip said:

Deleting unassigned.devices.preclear.plg appears to have removed the plugin from my servers (which I expected). I tried re-installing the plugin through Community Applications, but it went back into the problem state. 

You should remove the '/flash/config/plugins/unassigned.devices.preclear/*' files so the plugin will download the files fresh.  I think you're re-installing the same corrupted package.
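
The cleanup step above could be scripted along these lines. This is a cautious sketch, not an official procedure: the function name `clear_plugin_cache` is made up, it only touches regular files directly inside the directory you pass in, and it defaults to a dry run so the list can be reviewed before anything is deleted. The demo runs against a temporary directory with made-up file names, not a real flash drive.

```python
import tempfile
from pathlib import Path


def clear_plugin_cache(plugin_dir, dry_run=True):
    """Delete the regular files directly inside plugin_dir and return their paths.

    With dry_run=True (the default) nothing is deleted.
    """
    directory = Path(plugin_dir)
    targets = sorted(p for p in directory.glob("*") if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


# Demo on a throwaway directory; the file names here are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("package-a.txz", "package-b.txz"):
        (Path(tmp) / name).write_text("placeholder")
    print("would remove:", [p.name for p in clear_plugin_cache(tmp)])
    clear_plugin_cache(tmp, dry_run=False)
    print("left behind:", [p.name for p in Path(tmp).iterdir()])
```

On a server, the directory to pass in would be the one named in the reply above; confirm the path on your own system first, then run once with the default dry run before passing `dry_run=False`.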

Link to comment

Hi, I bought a new 20TB drive that I've pre-cleared 3 times, and it has failed in Post Read every time. I've run MemTest86 with no errors. I've changed the SATA cable and power cord, and pre-clear still failed. I'm here to verify whether I got a bad drive and need to return it, or whether something else can be gathered from the logs. 

 

I've attached the most recent SMART Log after the most recent failure tonight. 

 

Thanks in advance for help!

tower-smart-20230808-2141.zip

Link to comment
9 hours ago, Mojo Ryzen said:

I'm here to verify if I got a bad drive and need to return it or if something else can be gathered from the logs. 

The SMART log looks fine.  It looks like the read and write timeout at the end of large disks is not long enough.  I'm making some adjustments to see if this problem can be fixed.  I'm currently testing and will release once that test passes.  That will be today.  You can do another run on the updated preclear to see if it solves this issue for you.

Please post the preclear log for the disk so I can confirm whether it is a timeout.

Link to comment
41 minutes ago, dlandon said:

The SMART log looks fine.  It looks like the read and write timeout at the end of large disks is not long enough.  I'm making some adjustments to see if this problem can be fixed.  I'm currently testing and will release once that test passes.  That will be today.  You can do another run on the updated preclear to see if it solves this issue for you.

Please post the preclear log for the disk so I can confirm whether it is a timeout.

 

@dlandon I saw the update to the plugin this morning. Thanks so much for the quick turnaround. I'll report back in a few days with the results!

Link to comment
On 8/9/2023 at 7:03 AM, dlandon said:

The SMART log looks fine.  It looks like the read and write timeout at the end of large disks is not long enough.  I'm making some adjustments to see if this problem can be fixed.  I'm currently testing and will release once that test passes.  That will be today.  You can do another run on the updated preclear to see if it solves this issue for you.

Please post the preclear log for the disk so I can confirm whether it is a timeout.

 

On 8/9/2023 at 7:52 AM, dlandon said:

That's not the update.  I've got one more to release today.  Wait for today's update.

@dlandon I waited for the second update and ran pre-clear on my 20TB drive and the process errored in post-read verification again. Here are the SMART report and preclear logs from the most recent run. 

tower-smart-20230812-0042.zip preclear_disk_ZVT7PPG9_31998.txt

Link to comment
7 hours ago, Mojo Ryzen said:

I waited for the second update and ran pre-clear on my 20TB drive and the process errored in post-read verification again.

I'll be testing a different approach to post reads and will have an updated preclear in a day or two.  If you are willing, I'd like you to give it a try.

Link to comment
