Unassigned Devices Preclear - a utility to preclear disks before adding them to the array


dlandon


18 minutes ago, mathomas3 said:

Hot plug events and device designations are changing after UD Preclear is installed, probably because a hot plug event is initiated by UD Preclear.

Feb 20 03:46:15 Tower root: plugin: unassigned.devices.preclear.plg installed
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH72808CLAR8000_VJGR8M7X_35000cca261288218 was (sdaf) is now (sdj)
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH72808CLAR8000_VJHA50UX_35000cca2614ad854 was (sdd) is now (sdk)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdy problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH728080AL520_X_VLHZMZWY_35000cca2607016cc was (sdy) is now (sdh)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdaf problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdd problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdt problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdz problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH728080AL4200_2EG9U3ZR_35000cca23b11d6ac was (sdp) is now (sde)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdu problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH728080AL520_X_VLJ03AAY_35000cca26070ee14 was (sdt) is now (sdb)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdf problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH728080AL4200_2EGAXXRR_35000cca23b13e130 was (sdn) is now (sdc)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdn problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH72808CLAR8000_VJH9X6DX_35000cca2614a62e8 was (sdz) is now (sdi)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdp problem getting id
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdh
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdj
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdk
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdb
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdi
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdg
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdq
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sdc
Feb 20 03:46:39 Tower  emhttpd: read SMART /dev/sde
Feb 20 03:46:39 Tower kernel: emhttpd[2720]: segfault at 674 ip 000055d29292a9d4 sp 00007ffcc4755040 error 4 in emhttpd[55d292918000+21000]
Feb 20 03:46:39 Tower kernel: Code: 8e 27 01 00 48 89 45 f8 48 8d 05 72 27 01 00 48 89 45 f0 e9 79 01 00 00 8b 45 ec 89 c7 e8 89 b1 ff ff 48 89 45 d8 48 8b 45 d8 <8b> 80 74 06 00 00 85 c0 0f 94 c0 0f b6 c0 89 45 d4 48 8b 45 e0 48
Feb 20 03:46:42 Tower kernel: sd 2:0:5:0: Mode parameters changed
Feb 20 03:46:44 Tower kernel: sd 2:0:16:0: Mode parameters changed

Unraid is having trouble with the hot plug devices and the device designations are changing.
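To see at a glance which drives were renamed, the "tagged device ... was (...) is now (...)" messages can be pulled out of the syslog. This is only a minimal sketch: the regular expression is written against the lines quoted above, and the function name `renamed_devices` is made up for illustration.

```python
import re

# Matches the "tagged device <id> was (<old>) is now (<new>)" part of the
# emhttpd messages quoted above. The pattern is an assumption based on
# those lines only, not on any documented log format.
TAGGED = re.compile(r"tagged device (\S+) was \((\w+)\) is now \((\w+)\)")


def renamed_devices(log_text):
    """Return {device_id: (old_name, new_name)} for each renamed device."""
    return {m.group(1): (m.group(2), m.group(3))
            for m in TAGGED.finditer(log_text)}


# Three lines copied from the log above.
sample = """\
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH72808CLAR8000_VJGR8M7X_35000cca261288218 was (sdaf) is now (sdj)
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdy problem getting id
Feb 20 03:46:39 Tower  emhttpd: error: hotplug_devices, 1730: No such file or directory (2): Error: tagged device HUH728080AL520_X_VLHZMZWY_35000cca2607016cc was (sdy) is now (sdh)
"""

for dev_id, (old, new) in renamed_devices(sample).items():
    print(f"{dev_id}: {old} -> {new}")
```

Every drive that shows up here changed its sdX designation in the same second, which is the pattern being described.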

 

How are these devices connected?

Screenshot 2023-02-20 065451.png

4 minutes ago, mathomas3 said:

I recently built the server into its current configuration. The relevant hardware is listed below, but basically it's a 1U server with a SAS card connected to a DAS that has two controller cards (in case one fails).

 

Dell R430 chassis

LSI SAS2308 SAS controller

HP DAS

It looks like the way the device serial numbers are presented is confusing Unraid and/or UD. If you look at those serial numbers, the endings are all the same.

19 minutes ago, dlandon said:
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdaf problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdd problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdt problem getting id
Feb 20 03:46:39 Tower  emhttpd: device /dev/sdz problem getting id

This means Unraid is seeing the same devices twice. You likely have the HBA connected to dual controllers (or dual expanders) on the enclosure, and Unraid does not support SAS multipath. Connect a single cable from the HBA to the enclosure and reboot.
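One way to confirm that the same disk is visible over two paths is to group the block devices by WWN and look for any WWN that appears under more than one name. The sketch below assumes input shaped like the output of `lsblk -dn -o NAME,WWN`. The sample text is hypothetical: it reuses device names from the log, but the WWN values are illustrative, and `duplicate_paths` is a made-up name.

```python
from collections import defaultdict


def duplicate_paths(lsblk_text):
    """Map each WWN to its device names, keeping only WWNs seen more than once."""
    by_wwn = defaultdict(list)
    for line in lsblk_text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip devices that report no WWN
            name, wwn = parts
            by_wwn[wwn].append(name)
    return {wwn: names for wwn, names in by_wwn.items() if len(names) > 1}


# Hypothetical sample in the shape of `lsblk -dn -o NAME,WWN` output.
sample = """\
sdj  0x5000cca261288218
sdaf 0x5000cca261288218
sdk  0x5000cca2614ad854
sdd  0x5000cca2614ad854
sdg  0x5000c500a1b2c3d4
"""

for wwn, names in duplicate_paths(sample).items():
    print(wwn, "is visible as", ", ".join(names))
```

With a single cable from the HBA to the enclosure, this should report nothing.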

 

 

24 minutes ago, JorgeB said:

This means Unraid is seeing the same devices twice. You likely have the HBA connected to dual controllers (or dual expanders) on the enclosure, and Unraid does not support SAS multipath. Connect a single cable from the HBA to the enclosure and reboot.

 

 

I did as suggested, and the drive names/IDs have returned to what I would call normal and no longer have 4 digit names...

I reinstalled the plugin, and it now shows only the two drives and identifies them properly...

dlandon, should I still be able to see the two drives on the UD page? They are currently missing after installing the plugin again.

28 minutes ago, JorgeB said:

This means Unraid is seeing the same devices twice. You likely have the HBA connected to dual controllers (or dual expanders) on the enclosure, and Unraid does not support SAS multipath. Connect a single cable from the HBA to the enclosure and reboot.

 

 

Thank you for that tip... been running it this way for a while and I have never come across this before... it's these little bits of knowledge that make you a god among us mortals :)

3 minutes ago, mathomas3 said:

I did as suggested, and the drive names/IDs have returned to what I would call normal and no longer have 4 digit names...

I reinstalled the plugin, and it now shows only the two drives and identifies them properly...

dlandon, should I still be able to see the two drives on the UD page? They are currently missing after installing the plugin again.

Yes.  Post diagnostics and a complete screen shot of the UD page.

Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 128636+0 records out
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 269769244672 bytes (270 GB, 251 GiB) copied, 1638.25 s, 165 MB/s
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 128637+0 records in
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 128637+0 records out
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 269771341824 bytes (270 GB, 251 GiB) copied, 1744.74 s, 155 MB/s
Mar 01 17:12:27 preclear_disk_W300DV27_15534: dd process hung at 269769244672, killing ...
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: zeroing the disk started 2 of 5 retries...
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Continuing disk write on byte 269767147520
Mar 01 17:16:56 preclear_disk_W300DV27_15534: Zeroing: dd output: 
Mar 01 17:16:56 preclear_disk_W300DV27_15534: dd process hung at 0, killing ...
Mar 01 17:16:56 preclear_disk_W300DV27_15534: Zeroing: zeroing the disk started 3 of 5 retries...
Mar 01 17:16:56 preclear_disk_W300DV27_15534: Zeroing: emptying the MBR.

Here's the end of my preclear log. As you can see, the process seems to work until it tries to zero the disk. For some reason the dd output hangs at 0, and the process is timed out and killed. I'm running the whole process and the pre-read completed, so I'm not sure what's going on here. Is there anything I can do to fix this? I'm kind of in a hurry to use this drive, and I have been using it in a Linux box for years; I just thought preclearing was a requirement, so I did it. I just want to get on with my file copying.

Anyone?
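The numbers in the log can be checked with a little arithmetic. The sketch below, using only lines copied from the log above, works out the dd block size and how far the retry rewound before resuming. The regular expressions are assumptions fitted to those lines, not a documented format of the preclear script.

```python
import re

# Four lines copied from the preclear log above.
log = """\
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 128636+0 records out
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Zeroing: dd output: 269769244672 bytes (270 GB, 251 GiB) copied, 1638.25 s, 165 MB/s
Mar 01 17:12:27 preclear_disk_W300DV27_15534: dd process hung at 269769244672, killing ...
Mar 01 17:12:27 preclear_disk_W300DV27_15534: Continuing disk write on byte 269767147520
"""

records = int(re.search(r"(\d+)\+0 records out", log).group(1))
copied = int(re.search(r"(\d+) bytes .* copied", log).group(1))
hung_at = int(re.search(r"hung at (\d+)", log).group(1))
resume_at = int(re.search(r"write on byte (\d+)", log).group(1))

# Bytes written divided by the record count gives the dd block size.
block_size = copied // records
# Distance between the hang point and the resume point, in blocks.
rewind_blocks = (hung_at - resume_at) // block_size

print("block size:", block_size, "bytes")
print("rewound before resuming:", rewind_blocks, "block(s)")
```

The first attempt was writing in 2 MiB blocks, and the second attempt resumed one block before the point where dd stopped responding. The third attempt logged an empty dd output and "hung at 0", which reads as dd never reporting any progress after the restart, although that interpretation is a guess from the log alone.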

On 3/1/2023 at 8:12 PM, couzin2000 said:

Here's the end of my preclear log. As you can see, the process seems to work until it tries to zero the disk. For some reason the dd output hangs at 0, and the process is timed out and killed. I'm running the whole process and the pre-read completed, so I'm not sure what's going on here. Is there anything I can do to fix this? I'm kind of in a hurry to use this drive, and I have been using it in a Linux box for years; I just thought preclearing was a requirement, so I did it. I just want to get on with my file copying.

Anyone?

 

I know this is a couple of days later, but I would run a SMART short self-test to see if it passes.
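For anyone who prefers the command line to the GUI, a short self-test can be started and read back with `smartctl` from smartmontools. The sketch below only builds the command lines. `/dev/sdX` is a placeholder for the actual device, and the helper name `smart_short_test_commands` is made up for illustration.

```python
import subprocess


def smart_short_test_commands(device):
    """Return the smartctl command lines for a short self-test on `device`."""
    return {
        # Starts the short self-test; the drive runs it in the background.
        "start": ["smartctl", "-t", "short", device],
        # Shows the self-test log; check it once the test has finished.
        "results": ["smartctl", "-l", "selftest", device],
        # Prints the drive's overall health assessment.
        "health": ["smartctl", "-H", device],
    }


def run(argv):
    """Run one command and return its text output (needs root and smartmontools)."""
    return subprocess.run(argv, capture_output=True, text=True).stdout


# "/dev/sdX" is a placeholder; substitute the real device before running.
for step, argv in smart_short_test_commands("/dev/sdX").items():
    print(step + ":", " ".join(argv))
```

A short test usually takes a couple of minutes, so the results command is only meaningful after the test has completed.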

But I think your drive may have died. 


Just wanted to drop by and thank the dev. I had a power failure 10 hours into a 20TB preclear. Thanks to this plugin, I was able to pause the preclear and let the Unraid server shut down when my UPS batteries hit the low-percentage mark.

Upon power restoration I booted up and was able to resume the preclear.

Thanks for saving me 10 hours of wasted time.

  • 2 weeks later...

Trying to preclear a new drive, seems to be stuck on "Starting ...". 

 

I've noticed a couple others in this thread with the same symptom, but don't see the solution.

 

I don't see anything in the logs at all - just that I started the preclear operation and then eventually cancelled it (I let it run all night - it still didn't start).

 

EDIT: I removed the preclear addon and then re-installed it and now it is working again.

Edited by autumnwalker

So I am preclearing a 6TB disk I had lying around, and the speeds are abysmal (2-3 MB/s). I notice a lot of these blocks in the log:

 

Apr 30 11:40:31 Tower kernel: ata8.00: exception Emask 0x0 SAct 0x80e00000 SErr 0x0 action 0x0
Apr 30 11:40:31 Tower kernel: ata8.00: irq_stat 0x40000008
Apr 30 11:40:31 Tower kernel: ata8.00: failed command: READ FPDMA QUEUED
Apr 30 11:40:31 Tower kernel: ata8.00: cmd 60/40:a8:00:94:73/05:00:00:00:00/40 tag 21 ncq dma 688128 in
Apr 30 11:40:31 Tower kernel:         res 41/40:00:00:94:73/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Apr 30 11:40:31 Tower kernel: ata8.00: status: { DRDY ERR }
Apr 30 11:40:31 Tower kernel: ata8.00: error: { UNC }
Apr 30 11:40:31 Tower kernel: ata8.00: configured for UDMA/133
Apr 30 11:40:31 Tower kernel: sd 9:0:0:0: [sdo] tag#21 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=11s
Apr 30 11:40:31 Tower kernel: sd 9:0:0:0: [sdo] tag#21 Sense Key : 0x3 [current] 
Apr 30 11:40:31 Tower kernel: sd 9:0:0:0: [sdo] tag#21 ASC=0x11 ASCQ=0x4 
Apr 30 11:40:31 Tower kernel: sd 9:0:0:0: [sdo] tag#21 CDB: opcode=0x88 88 00 00 00 00 00 00 73 94 00 00 00 05 40 00 00
Apr 30 11:40:31 Tower kernel: I/O error, dev sdo, sector 7574528 op 0x0:(READ) flags 0x84700 phys_seg 168 prio class 0
Apr 30 11:40:31 Tower kernel: ata8: EH complete

 

Is this due to a failing drive, and does it also explain the speeds being in the toilet?
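The failing request can be decoded from the CDB line in that block. Opcode 0x88 is the SCSI READ(16) command, which carries an 8-byte big-endian LBA in bytes 2-9 and a 4-byte transfer length in bytes 10-13. A minimal sketch follows; it assumes 512-byte logical sectors when converting the length to bytes, and `decode_read16` is a made-up name.

```python
def decode_read16(cdb_hex):
    """Return (lba, block_count) from a READ(16) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # starting logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # number of blocks requested
    return lba, blocks


# CDB bytes copied from the kernel log above.
lba, blocks = decode_read16("88 00 00 00 00 00 00 73 94 00 00 00 05 40 00 00")

# Assumption: 512-byte logical sectors.
print("start LBA:", lba)
print("blocks:", blocks, "=", blocks * 512, "bytes")
```

The decoded LBA matches the "sector 7574528" in the I/O error line, and the length matches the "ncq dma 688128" in the failed command. Together with sense key 0x3 (medium error in SCSI terms), the "(media error)" result, and the UNC flag, this is consistent with the drive failing to read that region of the platter. Repeated retries on unreadable sectors would also be one plausible reason for the very low speeds.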
