Data drive swap


Hello,

I am upgrading all my drives in one of my unRAID servers. I'm running 6.12.6 on both.

First, I swapped out the parity drive from a 4TB SATA drive to a 10TB SAS drive following this documentation. After about 20 hours, everything was fine and every drive showed green.

Next, I swapped out Disk1, which was also a 4TB SATA drive, for another 10TB SAS drive following this documentation. However, step 9 in the documentation says "Put a check in the Yes, I'm sure checkbox (next to the information indicating the drive will be rebuilt), and click the Start button", and that checkbox was not there. I did see the checkbox when swapping the parity drive. So I clicked the Start button anyway, and again, about 20 hours later, it was done. But then I had a red X by the drive, and it said "Device contents emulated".

So I stopped the array, then started it back up. It still showed the same thing. Then I stopped the array again and restarted the server. The server came back up, but the drive still says disabled and contents emulated. So I stopped the array again, unassigned the disk, started the array, stopped the array, and assigned the disk back, and now it says it's rebuilding the data. But this time there is no timer showing how long it's been going and how much longer it estimates it will take.

Is there something I'm missing? I thought the steps were pretty straightforward, apart from the "Yes, I'm sure" checkbox not being there.

Link to comment

I forgot to mention that the disk showed 3.9 TB used and 6.1 TB available after the initial rebuild, and it still shows that now as it's rebuilding again. All drives in the array use the XFS file system, and there are no SMART errors or disk errors on any of the drives.

I'm fairly new to unRAID, so if I've left out info please let me know and I'll update. I appreciate any help I can get, and the time you spend helping me.

Link to comment

Once a disk gets disabled (red ‘x’) for any reason, rebuilding it is the way to clear this status.

 

You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread. It is always a good idea when asking questions to supply your diagnostics so we can see details of your system, how you have things configured, and the current syslog.

 

The syslog in the diagnostics is the RAM copy and only shows what happened since the reboot. It could be worth enabling the syslog server to get a log that survives a reboot, so we can see what happened prior to the reboot. The mirror to flash option is the easiest to set up, but if you are worried about excessive wear on the flash drive you can put your server’s address into the Remote Server field.

Link to comment
11 minutes ago, trurl said:

You are reading old documentation. The current documentation is available from the links at top and bottom of the forum, and from the 'manual' link in lower right corner of your Unraid webUI.


I appreciate you pointing that out. I have found the current documentation regarding replacing a disk to increase capacity. My steps were pretty similar:

  • Parity check was run and was/is valid.
  • Stopped the array
  • unassigned the disk
  • Started the array
  • clicked the red x to forget the disk
  • Stopped the array
  • Shutdown (this isn't in the documentation, but did it for good measure. I hope that didn't mess with anything)
  • removed old 4 TB drive
  • installed new 10 TB drive
  • Powered on the server
  • Logged in and stopped the array
  • Formatted the disk
  • assigned the new disk in place of the old one
  • Started array, and it started rebuilding.

About 20 hours later it was done, but Disk1 (which was just upgraded from 4 TB to 10 TB) shows "device is disabled, contents emulated". The "Size", "Used" and "Free" columns all showed correctly, with 10 TB, 3.9 TB and 6.1 TB respectively. Is this normal behavior? Was the initial 20+ hours the pre-clear, and now the next 20 hours is rebuilding data?
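
A quick back-of-the-envelope check of the 20-hour figure, assuming decimal units (1 TB = 10^12 bytes) and a pass that writes the entire 10 TB disk. The function name is just for illustration:

```python
def average_throughput_mb_s(capacity_tb: float, hours: float) -> float:
    """Average write speed in MB/s needed to cover capacity_tb in the given hours.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 MB = 1e6 bytes.
    """
    total_bytes = capacity_tb * 1e12
    seconds = hours * 3600
    return total_bytes / seconds / 1e6


# 10 TB written in about 20 hours
print(round(average_throughput_mb_s(10, 20), 1))  # 138.9
```

So a single 20-hour pass over a 10 TB disk works out to roughly 139 MB/s on average.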

Link to comment

The fact the drive is disabled suggests a write to it failed. You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread. It is always a good idea when asking questions to supply your diagnostics so we can see details of your system, how you have things configured, and the current syslog.

 

I am a bit worried by the fact you mentioned a format, as that is not part of the normal process of replacing a drive. Users have been known to lose the contents of the emulated drive by accidentally formatting it, and end up rebuilding an empty drive. However, the figures you quote seem about right, so maybe this is not what you did. I am not sure what you used to do the unneeded format, as the rebuild process would wipe away any format anyway.

Link to comment
9 hours ago, itimpi said:

The fact the drive is disabled suggests a write to it failed. You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread. It is always a good idea when asking questions to supply your diagnostics so we can see details of your system, how you have things configured, and the current syslog.

 

Post #4 has my diagnostics attached, unless you're asking me to post a new one?

Link to comment
16 hours ago, surface said:
  • Parity check was run and was/is valid.
  • Stopped the array
  • unassigned the disk
  • Started the array
  • clicked the red x to forget the disk
  • Stopped the array
  • Shutdown (this isn't in the documentation, but did it for good measure. I hope that didn't mess with anything)
  • removed old 4 TB drive
  • installed new 10 TB drive
  • Powered on the server
  • Logged in and stopped the array
  • Formatted the disk
  • assigned the new disk in place of the old one
  • Started array, and it started rebuilding.

Basically, all you have to do is assign the new disk to the same slot as the disk it is replacing, and start the array to begin the rebuild. All the rest isn't really necessary.

 

I am concerned that you mention "format" in the middle of all this, though. Format is never part of a rebuild. It sounds as if you didn't format the disk in the array, so you should be OK, but it is totally pointless to format a disk that is going to have every bit overwritten during the rebuild.

 

Does the rebuilding disk show all of the data you expect?

Link to comment

I just noticed 64 errors that I hadn't seen before. I also found this in the syslog:
 

Jan 25 14:35:34 RKNAS02 kernel: critical target error, dev sdj, sector 19532742384 op 0x1:(WRITE) flags 0x0 phys_seg 64 prio class 2
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742320
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742328
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742336
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742344
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742352
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742360
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742368
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742376
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742384
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742392
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742400
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742408
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742416
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742424
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742432
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742440
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742448
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742456
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742464
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742472
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742480
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742488
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742496
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742504
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742512
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742520
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742528
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742536
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742544
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742552
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742560
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742568
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742576
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742584
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742592
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742600
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742608
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742616
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742624
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742632
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742640
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742648
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742656
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742664
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742672
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742680
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742688
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742696
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742704
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742712
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742720
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742728
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742736
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742744
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742752
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742760
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742768
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742776
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742784
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742792
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742800
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742808
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742816
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742824
Jan 25 14:35:34 RKNAS02 kernel: md: recovery thread: exit status: -4


Maybe this drive is bad?
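
Rather than counting lines like these by eye, a short script can summarize them. This is only a sketch: the regular expression is written against the line format shown in the excerpt above, and the function name is made up for illustration. On the excerpt above it should report 64 write errors for disk1, from sector 19532742320 to 19532742824, which matches the 64 errors shown in the GUI.

```python
import re

# Matches md-driver lines in the format shown in the excerpt above, e.g.
#   "... kernel: md: disk1 write error, sector=19532742320"
WRITE_ERROR = re.compile(r"md: (disk\d+) write error, sector=(\d+)")


def summarize_write_errors(syslog_text):
    """Return {disk: (error_count, lowest_sector, highest_sector)}."""
    sectors_by_disk = {}
    for line in syslog_text.splitlines():
        match = WRITE_ERROR.search(line)
        if match:
            disk, sector = match.group(1), int(match.group(2))
            sectors_by_disk.setdefault(disk, []).append(sector)
    return {
        disk: (len(sectors), min(sectors), max(sectors))
        for disk, sectors in sectors_by_disk.items()
    }


# Two lines copied from the excerpt, plus the non-matching final line
sample = """\
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742320
Jan 25 14:35:34 RKNAS02 kernel: md: disk1 write error, sector=19532742328
Jan 25 14:35:34 RKNAS02 kernel: md: recovery thread: exit status: -4
"""
print(summarize_write_errors(sample))
# {'disk1': (2, 19532742320, 19532742328)}
```

To run it on a real log, read the syslog file from the diagnostics zip into a string and pass it to the function.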

Link to comment

Just as an FYI, this same issue is happening on my nas02 (the one we've been talking about) as well as nas03. About 15 hours ago I started a long SMART test on both of the 10TB SAS disks that are having issues on nas02 and nas03. They're almost done: nas03 is at 100% but looks like it's still running, and nas02 is at 98%. Both drives are 10TB Seagate SAS drives. They were purchased used and came from the same seller.
Link to comment
  • 2 weeks later...

Here is what I did to fix this. I'm not saying it will work for everyone, or even that it is correct. I stopped the array, unassigned the disk, started the array, stopped the array again, and assigned the disk. Then I went to Tools, then New Config, and chose to preserve current assignments. Then I started the array again, and my disk was accepted. The weird thing is I didn't see this step in the documentation, so I'm not sure this is the correct way of accomplishing this, but it worked for me.

Link to comment
3 minutes ago, surface said:

Here is what I did to fix this. I'm not saying it will work for everyone, or even that it is correct. I stopped the array, unassigned the disk, started the array, stopped the array again, and assigned the disk. Then I went to Tools, then New Config, and chose to preserve current assignments. Then I started the array again, and my disk was accepted. The weird thing is I didn't see this step in the documentation, so I'm not sure this is the correct way of accomplishing this, but it worked for me.

This approach will lose any updates made to the drive since it was disabled, so you can have data loss. It is normally only a last-ditch attempt after everything else has failed.

 

The correct approach is covered here in the online documentation, accessible via the Manual link at the bottom of the Unraid GUI. In addition, every forum page has a DOCS link at the top and a Documentation link at the bottom. The Unraid OS->Manual section covers most aspects of the current Unraid release.

Link to comment

I appreciate all of your (the mods') help. I had 10 more drives to upgrade and I didn't want to potentially do this 10 more times. So after it worked, I opted to take the data loss, changed the 10 drives all at once, then restored all my files from backup. All 35TB of it.

 

Again, I appreciate all of you, including the time you take and the work you do to help the community.

Link to comment
