Re: preclear_disk.sh - a new utility to burn-in and pre-clear disks for quick add



However, the new drive does not spin down. Perhaps this means that no spin-down command is sent to an unassigned drive, rather than indicating that the drive is being accessed?

Correct. Drives not assigned to the array have not had any spin-down commands issued to them.  You can issue commands to have them spin themselves down on their own, but those commands work on some drives and not others.  That is why unRAID does its own timing for assigned drives.
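
For anyone wanting to try the "spin themselves down" route on an unassigned drive, here is a rough sketch using `hdparm -S`, which sets the drive's own standby timer. The helper names and `/dev/sdX` are made up for illustration, and as noted above some drives simply ignore the setting. The function only prints the command so it can be checked before running it.

```shell
#!/bin/bash
# Convert a spin-down delay in minutes to the value expected by `hdparm -S`.
# Encoding per hdparm(8): 1..240 are multiples of 5 seconds (up to 20 minutes),
# 241..251 are 1..11 units of 30 minutes.
hdparm_standby_value() {
    local minutes=$1
    if [ "$minutes" -le 0 ]; then
        echo 0                                   # 0 disables the drive's own timer
    elif [ "$minutes" -le 20 ]; then
        echo $(( minutes * 12 ))                 # 60 s / 5 s = 12 steps per minute
    else
        local units=$(( (minutes + 29) / 30 ))   # round up to whole 30-minute units
        if [ "$units" -gt 11 ]; then
            units=11                             # 5.5 hours is the largest value in this range
        fi
        echo $(( 240 + units ))
    fi
}

# Print (not run) the hdparm command for a device and delay.
# The device name is a placeholder supplied by the caller.
spin_down_command() {
    local dev=$1 minutes=$2
    echo "hdparm -S $(hdparm_standby_value "$minutes") $dev"
}
```

Example: `spin_down_command /dev/sdX 15` prints `hdparm -S 180 /dev/sdX`. To request an immediate spin-down instead of setting a timer, `hdparm -y /dev/sdX` is the usual command.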

I have now rebooted the unRAID server.  The new drive has now become sde but there are no other changes.

The device name may change from one boot to another because names are assigned as the drives are recognized by the Linux kernel.  Most of the time it will identify the drives in the same order, but adding a new drive or a new disk controller will change the "sdX" assignments.  Fortunately, unRAID uses the physical disk controller ports on the PCI bus to identify drives.  It does not care if a drive is /dev/sda now and /dev/sdb when you reboot.
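
If you want to confirm for yourself which physical drive is currently behind an "sdX" name before running anything destructive, one option is to look at the persistent names under `/dev/disk/by-id`, which normally include the model and serial number. This is only a sketch; the function name is made up, and it assumes your system populates that directory.

```shell
#!/bin/bash
# List each persistent name in a by-id style directory together with the
# kernel device it currently points to. The directory defaults to
# /dev/disk/by-id; partition links (names ending in -partN) are skipped.
list_disks_by_id() {
    local dir=${1:-/dev/disk/by-id}
    local link
    for link in "$dir"/*; do
        [ -L "$link" ] || continue                     # only symlinks are of interest
        case "$link" in *-part[0-9]*) continue ;; esac # skip partition entries
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

Running `list_disks_by_id` with no argument prints one line per drive, so you can match the serial number printed on the drive label to today's /dev/sdX name.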

You probably have a lot of memory being used by other processes on your server.

 

Memory Info

(from /usr/bin/free)

             total       used       free     shared    buffers     cached
Mem:       3943140     583048    3360092          0      64204     311224
-/+ buffers/cache:     207620    3735520
Swap:            0          0          0

There goes that theory.  You have a lot of free memory.  It leaves some other hardware interaction perhaps.

Spin-up groups might be involved if they are on the same disk controller.

Spin up groups are turned on (I will try turning off), but each of my 5 drives is reporting a different host.

Probably not that... 

 

I'd suggest you set the cache pressure to 200, to allow the cache to be re-used, and also use the options on the pre-clear command to have it use smaller buffers.

 

It sounds as if the cache drive, disk2, and the drive being cleared all share the same spinup group.

Okay, I will try some experiments in those areas.

 

As things stand, I'm frightened to add the new drive to the array in case formatting/preparing it will write to the wrong drive.  Also, I'm not comfortable about playing with drive assignments, in case I lose data.

Unless you see "unformatted" on a drive you know is formatted, you'll not be formatting a drive with data.  "Unformatted" can be misleading, though, as it simply indicates the drive could not be mounted as a reiserfs file-system.  Seek assistance on the forum if a data drive shows as unformatted.  (Look in your syslog; you'll probably see that the file-system could not be mounted and a file-system check is in order.)

 

As far as assigning data drives on your array, you can swap disks around among the same set of disk controller ports.  If you use different ports on the disk controllers, you'll need to use the "Devices" page to assign the disks back to their respective logical slots in the array.  Take a screen-shot of the "Devices" page so you'll know where each of the disks is assigned.

 

If you see a "red" indicator adjacent to a drive, it indicates the drive was disabled because a "write" to it failed.  When that happens, unRAID simulates the disabled drive by using parity and all the other data drives.  Do not be fooled into thinking it is working because you can still read and write to it.  Seek guidance on the forum.

It could be a bad drive, or a cable may have come loose.  Before you do anything, capture and post a syslog.  (Instructions are under troubleshooting in the wiki.)

 

Joe L

Link to comment
Unless you see "unformatted" on a drive you know is formatted, you'll not be formatting a drive with data.

 

That should certainly be true if there is nothing 'funny' going on below the application layer.  However, in this case there is something strange going on and, if it is occurring at a low level in the system code, you cannot make any guarantees based on what you know is happening at the application level.

 

Anyway, as an experiment, I stopped the array and then started the preclear.  I then restarted the array and, to my relief, there were no increasing counts on disk2.  In this state I have left the preclear running - now, after 12 hours, it is 75% through step 2.  The parity, disk1 and disk2 drives are all spun down.  It has even been possible to play movies from either data drive.

 

When the preclear completes, my plan is to assign the new, 2TB, drive as parity and rebuild.  Then I will preclear the old parity drive (1TB) before assigning it as disk3.

 

I will be keeping a close eye on unexpected disk activity throughout the rest of this exercise.

Link to comment

There is definitely something odd going on.  I'm running the preclear on the old parity drive now, prior to configuring it as an additional data drive.  The preclear activity had reached the Post-Read when the family attempted to watch a movie (this time on disk1).  The movie would only play for a few seconds and then just stop.

 

I found that the Post-Read was reporting 85MB/s.  I stopped the array, whereupon the Post-Read speed went up to 106MB/s.

 

I restarted the array and the Post-Read speed remained over 100MB/s, and the movie would now play okay.

 

In case it's relevant, I'm currently running unRAID 4.6-rc3.

Link to comment

Sounds like you are running low on memory for the disk buffer cache.

 

Adjust the cache_pressure to a higher number... 200 perhaps, and also consider using the parameters to preclear_disk.sh to limit its use of memory.
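
A possible way to make that change, assuming "cache_pressure" here refers to the kernel's `vm.vfs_cache_pressure` tunable (default 100). The function name is made up, and the optional second argument exists only so it can be tried on a scratch file first. The preclear_disk.sh memory options are not shown; check the script's own usage text for those.

```shell
#!/bin/bash
# Show and change the kernel's vfs_cache_pressure setting.
# $1 is the new value; $2 is the file to write to, defaulting to the
# real /proc entry (writing there needs root).
set_cache_pressure() {
    local value=$1
    local knob=${2:-/proc/sys/vm/vfs_cache_pressure}
    echo "old: $(cat "$knob")"
    echo "$value" > "$knob"
    echo "new: $(cat "$knob")"
}
```

Usage would be `set_cache_pressure 200` from a root shell. The setting does not survive a reboot, so it has to be reapplied after each boot if you want to keep it.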

 

Joe L.

Link to comment
Okay, thanks - I will give those a try ... but with 4GB available, I'm surprised.

 

However, with this info, it does look as though something is being greedy!:

 

Memory Info

(from /usr/bin/free)

             total       used       free     shared    buffers     cached
Mem:       3943140    3570284     372856          0     115252    3294348
-/+ buffers/cache:     160684    3782456
Swap:            0          0          0

Link to comment

Well...  If you think of it a tiny bit.  Both your movie AND the disk being cleared are using the same 4Gig of memory to buffer the movie you are playing (probably over 4 Gig) and the disk you are clearing (certainly WAY over 4 Gig... probably 2000Gig)
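
To put numbers on that: in the `free` output quoted in this thread, almost all of the "used" figure is buffer/cache, not memory held by programs. A small sketch that does the arithmetic (the function name is made up, and it assumes the older `free` layout with separate "buffers" and "cached" columns, as shown above):

```shell
#!/bin/bash
# Read `free` output (older layout with separate buffers/cached columns)
# on stdin and split "used" into program memory versus reclaimable
# buffer/cache. All values are in KB, as printed by free.
summarize_free() {
    awk '$1 == "Mem:" {
        used = $3; unused = $4; buffers = $6; cached = $7
        printf "used by programs: %d KB\n", used - buffers - cached
        printf "buffers+cache:    %d KB\n", buffers + cached
        printf "available:        %d KB\n", unused + buffers + cached
    }'
}
```

Fed the second listing in this thread (`free | summarize_free` on a live system), it reports 160684 KB used by programs and 3409600 KB of buffers and cache, which matches the "-/+ buffers/cache" line that `free` itself prints.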

 

Joe L.

Link to comment

Joe -

I put a new disc in my server and went to pre-clear it, but it said it could not be done as no partition table was found. I added the disc to the array, thinking I would let unRAID format it for me; I did not realise it was going to clear the disc as well. Can I now take the disc out of the array and pre-clear it, or have I just screwed up big time?

 

Link to comment

You will need to un-assign it before the pre-clear script can be used against it.
Link to comment

Question:

 

Here is 1 of my 5 pre-cleared drives that completed. I'm looking through this data and I'm not sure what I should be looking for that would cause concern.... Can anyone help explain what I should be looking for that would cause a concern?

 

How should I paste this so it isn't all fubar???

 

[pre]===========================================================================
=                unRAID server Pre-Clear disk /dev/sda
=                      cycle 1 of 1
= Disk Pre-Clear-Read completed                                 DONE
= Step 1 of 10 - Copying zeros to first 2048k bytes             DONE
= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
= Step 3 of 10 - Disk is now cleared from MBR onward.           DONE
= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4       DONE
= Step 5 of 10 - Clearing MBR code area                         DONE
= Step 6 of 10 - Setting MBR signature bytes                    DONE
= Step 7 of 10 - Setting partition 1 to precleared state        DONE
= Step 8 of 10 - Notifying kernel we changed the partitioning   DONE
= Step 9 of 10 - Creating the /dev/disk/by* entries             DONE
= Step 10 of 10 - Testing if the clear has been successful.     DONE
= Disk Post-Clear-Read completed                                DONE
Disk Temperature: 26C, Elapsed Time:  31:59:43
============================================================================
==
== Disk /dev/sda has been successfully precleared
==
============================================================================
S.M.A.R.T. error count differences detected after pre-clear
note, some 'raw' values may change, but not be an indication of a problem
54c54
<   1 Raw_Read_Error_Rate     0x000f  100  100  006    Pre-fail  Always  -      9460
---
>   1 Raw_Read_Error_Rate     0x000f  113  099  006    Pre-fail  Always  -      55767436
58c58
<   7 Seek_Error_Rate         0x000f  100  253  030    Pre-fail  Always  -      15
---
>   7 Seek_Error_Rate         0x000f  100  253  030    Pre-fail  Always  -      331500
64c64
< 188 Unknown_Attribute       0x0032  100  253  000    Old_age   Always  -      0
---
> 188 Unknown_Attribute       0x0032  100  100  000    Old_age   Always  -      0
66,67c66,67
< 190 Airflow_Temperature_Cel 0x0022  077  072  045    Old_age   Always  -      23 (Lifetime Min/Max 23/23)
< 195 Hardware_ECC_Recovered  0x001a  100  100  000    Old_age   Always
---
> 190 Airflow_Temperature_Cel 0x0022  074  072  045    Old_age   Always  -      26 (Lifetime Min/Max 23/27)
> 195 Hardware_ECC_Recovered  0x001a  052  049  000    Old_age   Always
70,73c70,73
< 199 UDMA_CRC_Error_Count    0x003e  200  253  000    Old_age   Always  -      0
< 240 Head_Flying_Hours       0x0000  100  253  000    Old_age   Offline -      209895051755530
< 241 Unknown_Attribute       0x0000  100  253  000    Old_age   Offline -      0
< 242 Unknown_Attribute       0x0000  100  253  000    Old_age   Offline -      1600
---
> 199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age   Always  -      0
> 240 Head_Flying_Hours       0x0000  100  253  000    Old_age   Offline -      196228465819690
> 241 Unknown_Attribute       0x0000  100  253  000    Old_age   Offline -      4174963960
> 242 Unknown_Attribute       0x0000  100  253  000    Old_age   Offline -      1036252430[/pre]

Link to comment

You are looking for any attribute that says FAILING_NOW.

You are looking for re-allocated sectors, or sectors pending re-allocation (in the RAW column)

No other RAW column is meaningful to anyone but the manufacturer, with the possible exception of temperature.

You are looking for any other attribute where the current value is nearing the "threshold" column.
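
The checks above can be roughed out as a filter over `smartctl -A` output. This is only a sketch: the function name is made up, the "nearing" margin of 10 points is an arbitrary assumption, and it relies on the usual ten-column attribute table with the attribute names smartctl normally prints (`Reallocated_Sector_Ct`, `Current_Pending_Sector`).

```shell
#!/bin/bash
# Scan `smartctl -A` attribute lines on stdin and print a line for each
# of the conditions listed above. $1 is how close VALUE may get to THRESH
# before a warning is printed (an arbitrary default of 10 is assumed).
check_smart_attributes() {
    local margin=${1:-10}
    awk -v margin="$margin" '
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            name = $2; value = $4 + 0; thresh = $6 + 0; raw = $10 + 0
            if ($9 == "FAILING_NOW")
                print "FAIL: " name " is FAILING_NOW"
            if ((name == "Reallocated_Sector_Ct" || name == "Current_Pending_Sector") && raw > 0)
                print "WARN: " name " raw value is " raw
            # Attributes with a threshold of 0 never fail, so skip them here.
            if (thresh > 0 && value - thresh <= margin)
                print "WARN: " name " value " value " is near threshold " thresh
        }'
}
```

Typical use would be `smartctl -A /dev/sdX | check_smart_attributes`. No output means none of the three conditions matched; it says nothing about the other raw values, which as noted are only meaningful to the manufacturer.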

 

Joe L.

Link to comment

I just ran the pre-clear script on the two brand new Seagate 5900 rpm drives from the Newegg BF sale.

 

My preclear logs have six tables of what I think is SMART data.

 

Showing read error rates and seek error rates.  I'm kinda worried actually because the numbers aren't tiny or anything, and they are brand new drives.

 

If I post the log can someone advise me what to do?

 

Thanks :)

 

 

Link to comment

Hi Joe, can I use your script on a new build with (6) disks? I guess what I'm asking, what is the best way to pre-clear all my disks at once. I've been reading and reading so sorry for asking this question that I'm sure has been asked many times, but many posts are old and want to make sure I use the latest method.

Thanks,

Tom

Link to comment

I also used pre-clear for the first time today. What I did was open a PuTTY window for every pre-clear job, so I pre-cleared two drives at the same time. I don't know whether this is the way to do it. Maybe some advice from a pro user?

Link to comment

Thanks, I did find Joe's procedure from July 09 http://lime-technology.com/forum/index.php?topic=4043.msg35774#msg35774

Link to comment

I searched some more and found this, http://lime-technology.com/forum/index.php?topic=4043.msg35774#msg35774 I should be all set. I will have the rest of my parts this Tuesday and hopefully boot up that night.

Link to comment

Question:

 

Here is 1 of my 5 pre-cleared drives that completed. I'm looking through this data and I'm not sure what I should be looking for that would cause concern.... Can anyone help explain what I should be looking for that would cause a concern?

You are looking for any attribute that says FAILING_NOW.

You are looking for re-allocated sectors, or sectors pending re-allocation (in the RAW column)

No other RAW column is meaningful to anyone but the manufacturer, with the possible exception of temperature.

You are looking for any other attribute where the current value is nearing the "threshold" column.

 

Joe L.

 

Another one that some users have reported issues with is the load_cycle_count (LCC).  It seems that on some disks / disk controllers the LCC will get very large very quickly.  I recommend monitoring that in relation to the disk's age.
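
One rough way to relate LCC to disk age is to divide it by the power-on hours from the same `smartctl -A` report. This is a sketch only; the function name is made up, it assumes the attributes are reported as `Load_Cycle_Count` and `Power_On_Hours` with a plain number in the raw column, and what counts as "too high" is left to the reader.

```shell
#!/bin/bash
# Read `smartctl -A` output on stdin and report load cycles per power-on hour.
# Assumes the raw value is the 10th column and starts with a plain number.
load_cycles_per_hour() {
    awk '
        $2 == "Power_On_Hours"   { hours  = $10 + 0 }
        $2 == "Load_Cycle_Count" { cycles = $10 + 0 }
        END {
            if (hours > 0)
                printf "%d load cycles over %d hours = %.1f per hour\n", cycles, hours, cycles / hours
            else
                print "Power_On_Hours not reported or zero"
        }'
}
```

Usage would be `smartctl -A /dev/sdX | load_cycles_per_hour`, repeated every so often to see whether the rate is climbing.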

 

I plan to add some sort of LCC warning to the Smart view in myMain to warn users that the LCC is getting large.

Link to comment

@Tom899 & MvL:

You can preclear up to 6 disks simultaneously (according to Joe; I myself have only tried 2 simultaneously).

Also, whenever you use a remote connection to preclear, it would be very useful to use 'screen'. Otherwise, if the remote session (e.g. PuTTY) ends, your preclear will stop and you would have to start from scratch (I imagine it will be more annoying if it happens while preclearing many HDs simultaneously).
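
A sketch of what that could look like for several disks, assuming `screen` is installed on the server. The function name, the session names and the `/boot/preclear_disk.sh` path are assumptions for illustration; adjust them to wherever you put the script. It only prints the commands so you can check the device names before running anything.

```shell
#!/bin/bash
# Print one detached `screen` session command per disk, for review before running.
# PRECLEAR_SCRIPT can override the assumed script location.
preclear_in_screen_commands() {
    local script=${PRECLEAR_SCRIPT:-/boot/preclear_disk.sh}
    local dev
    for dev in "$@"; do
        # -dmS starts a named session detached; the name is derived from the device.
        echo "screen -dmS preclear_$(basename "$dev") $script $dev"
    done
}
```

For example, `preclear_in_screen_commands /dev/sdX /dev/sdY` prints two commands. After running them, `screen -ls` lists the sessions and `screen -r preclear_sdX` attaches to one, which you will need to do if the script asks for confirmation before it starts. Ctrl-A then D detaches again and leaves the preclear running.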

Link to comment

Thanks for that information. I do have 6 disks including my cache drive. How about if, for the preclear, I log in directly on the unRAID server instead of using telnet over the network? Would this be a more robust way to preclear all these drives?

Link to comment
