Swap 2 data slots issue


Triglia118


Dear friends, I recently upgraded to Unraid 6.1.3: everything is OK. After some testing I decided to improve my server, adding some disks and setting them up better. All done, all OK.

Yesterday I decided to swap the DISK 5 and DISK 6 slots so that they mirror the disks' physical positions in the server. I followed these instructions, which I found somewhere in the documentation:

 

1) Stop the server: OK

2) Unassign DISK 5 and DISK 6: ERROR disks missing, as expected

3) Swap the assignments, putting disk 5 in the disk 6 slot and the other way round: ERROR, different assignment than before, as expected

4) Start the server: OK all slots with green light .... BUT

 

Disk 5 and disk 6 were shown as unmountable (or something like that), and the server asked to format them?!

What? They are full of data!

Of course I reverted everything to the way it was before, and all was OK again. What did I do wrong?

Thank you for your help and sorry for my poor English.

Link to comment

A New Config is completely safe PROVIDING you don't assign any disks incorrectly !!

 

Before doing it, you want to be CERTAIN that you have good parity => I'd run a correcting parity check and confirm that there are no sync errors corrected ... if there are, run it again afterwards to be sure there aren't any residual issues.

 

Once you KNOW you have good parity, be sure you know which disk is your parity disk (best to simply save a screen shot of the Web GUI that shows the disk assignments)  and which disk(s) are assigned as cache drives (if any).

 

Then you simply do a New Config on the Utils menu; and then assign all the disks in the slots you want [Doesn't matter what goes in which slot except for the parity and cache units].    You can then check the "Parity is already valid" box and Start your array, and you're ready to go.    Note that UnRAID will start a parity check when you start the array (to confirm your parity is good) ... but you can cancel that if you want.
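To illustrate why the data slot order does not matter for the parity disk, here is a minimal sketch. It assumes single parity computed as a bytewise XOR across the data disks; the toy disk contents and the helper name `xor_parity` are made up for illustration and are not part of unRAID.

```python
from functools import reduce

def xor_parity(disks):
    """Return the bytewise XOR of equally sized disk images (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

# Toy 4-byte "disks"; real disks are just much longer byte strings.
disk5 = bytes([0x12, 0x34, 0x56, 0x78])
disk6 = bytes([0xAB, 0xCD, 0xEF, 0x01])
disk7 = bytes([0x00, 0xFF, 0x10, 0x20])

before = xor_parity([disk5, disk6, disk7])
after = xor_parity([disk6, disk5, disk7])  # slots 5 and 6 swapped

# XOR is commutative, so swapping two data slots leaves the parity unchanged.
print(before == after)
```

Under that assumption, a parity disk that was valid before the swap is still valid afterwards, which is why the "Parity is already valid" box can be checked.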

 

Link to comment

@ garycase

Thank you, all done. The Unraid server is up and running, with all data preserved. Just one question: Unraid is now doing the new parity check, and it is correcting the whole parity disk (the parity disk is assigned correctly, as before), as if it were all wrong. But before proceeding I checked the parity and it was OK. So why does it now seem to be all wrong? Is this normal? I only switched the slot positions of 2 data drives, so I was expecting the parity to be OK...

 

Just so you know: all disks are green, no errors, it is just rebuilding the parity disk.

Link to comment

Hopefully you definitely assigned the correct disk as parity !!  If not, you're now writing parity to a disk that had data (too late to do anything about it at this point).

 

Did you check the "Parity is already valid" box before you started the array?    If not, it's simply doing a new parity sync.    You can tell if you did this for sure by looking at the Web GUI => is it doing a parity SYNC or a parity CHECK ??

 

Link to comment

I'm sure that I assigned the parity disk to the correct slot, and I did check "Parity is already valid". The odd thing is that a lot of errors were found during the parity check. I didn't expect that. Before swapping the two data disks I ran a parity check without any errors. Anyway, no data loss and a newly created parity disk... I just don't understand why.

Link to comment

It's perplexing why you'd see errors on the parity check, but as long as you absolutely assigned the correct disk it should all be okay.    Does the Web GUI show any read errors from the data disks during the check?  (the errors column)    If so, you may have a SATA cable that's not seated securely, and this could explain the errors you're seeing.

 

If that's the case, I'd reseat the cables securely, do another New Config, and not check the "Parity is already valid" box => letting UnRAID do a new parity sync.    If that's not the case, then all is probably fine ... just chalk it up as one of life's little mysteries.    In any event, I'd run ANOTHER parity check after the first one finishes, to confirm you don't get any more sync errors.

 

Link to comment

First of all, thank you for your help and for your patience in reading my poor English. I followed your suggestion and performed another parity check: success, no errors. During the previous parity check I got a lot of corrections on the parity disk, but no errors in the errors column at all.

This morning I downloaded the syslog and found some strange things. It seems that I have some problems reading the USB flash stick. Some lines like:

Sep 25 09:53:13 Unraid101 kernel: read_file: error 2 opening /boot/config/super.dat
Sep 25 09:53:13 Unraid101 kernel: md: could not read superblock from /boot/config/super.dat
Sep 25 09:53:13 Unraid101 kernel: md: initializing superblock

and

Sep 25 09:53:23 Unraid101 emhttp: Start OK!
Sep 25 09:53:24 Unraid101 emhttp: unclean shutdown detected
Sep 25 09:53:24 Unraid101 kernel: mdcmd (46): check CORRECT
Sep 25 09:53:24 Unraid101 kernel: md: recovery thread woken up ...
Sep 25 09:53:24 Unraid101 kernel: md: recovery thread checking parity...
Sep 25 09:53:25 Unraid101 kernel: md: using 1536k window, over a total of 976762552 blocks.
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21048
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21056
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21064
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21072

 

Anyway, I attach the complete syslog. If you have some spare time, could you suggest what I should check to avoid future problems?

Thank you

 

 

syslog-2015-09-25.zip

Link to comment
  • 2 weeks later...

Dear friends, thank you for your help. After replacing the USB stick and migrating the licence, all is OK. So I think that all the issues I experienced during the server hardware upgrade were due to problems reading the USB boot stick. I don't understand why this would cause parity errors, but I can live with this mystery.

Link to comment

First of all, thank you for your help and for your patience in reading my poor English. I followed your suggestion and performed another parity check: success, no errors. During the previous parity check I got a lot of corrections on the parity disk, but no errors in the errors column at all.

This morning I downloaded the syslog and found some strange things. It seems that I have some problems reading the USB flash stick. Some lines like:

Sep 25 09:53:13 Unraid101 kernel: read_file: error 2 opening /boot/config/super.dat
Sep 25 09:53:13 Unraid101 kernel: md: could not read superblock from /boot/config/super.dat
Sep 25 09:53:13 Unraid101 kernel: md: initializing superblock

These lines are completely normal for a New Config, same as for a brand new install with no super.dat file yet.  The New Config tool clears all assignments just by deleting the current super.dat file, and that results in the lines above.  Those lines by the way are REALLY old, were in syslogs from the very first unRAID versions I ever saw, so could probably use some cleanup, to avoid this confusion.  Your flash drive was fine, no issues that I could see.
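As a quick cross-check of that reading, the sketch below looks up what error number 2 means. It assumes the number printed in the `read_file: error 2` message is a standard Linux errno value, which the log itself does not state.

```python
import errno
import os

code = 2  # the number from "read_file: error 2 opening /boot/config/super.dat"

# Symbolic name and human-readable description of the error number.
print(errno.errorcode[code])
print(os.strerror(code))
```

Errno 2 is ENOENT ("No such file or directory"), which fits a super.dat file that was deleted by New Config rather than a flash drive that could not be read.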

 

Sep 25 09:53:23 Unraid101 emhttp: Start OK!
Sep 25 09:53:24 Unraid101 emhttp: unclean shutdown detected
Sep 25 09:53:24 Unraid101 kernel: mdcmd (46): check CORRECT
Sep 25 09:53:24 Unraid101 kernel: md: recovery thread woken up ...
Sep 25 09:53:24 Unraid101 kernel: md: recovery thread checking parity...
Sep 25 09:53:25 Unraid101 kernel: md: using 1536k window, over a total of 976762552 blocks.
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21048
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21056
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21064
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21072

This was more worrying, may indicate a rare unRAID bug.  You booted the system with the existing assignments without issue, no unclean shutdown detected, and the array automatically started.  Then you stopped the array, and after a bit, executed New Config, at which point the expected messages about missing super.dat appeared.  You assigned the parity drive, then the Cache drive, then Disks 1 through 7 in order, except the assignments for Disk 5 and Disk 6 were swapped (all others were identical to their previous assignments).  Then you started the array.

At first, it was fine, drives are mounted, User Shares set up, and Serviio started, then it indicates "unclean shutdown detected", which is impossible here!  The array had already been started without issue, then stopped, and now restarted.  The first 21000 sectors are perfect, but then it begins to find clusters of parity sectors needing correction.  As far as I can tell, these are early, in the file system portions of the drive.

It would be interesting to know the final count of corrections.  Tom, if you see this, would it be possible to add the count of corrections to the final summary in the syslog?  Triglia118, can you recall if the corrections were all at the beginning?  You quickly started another parity check, with no corrections showing.  It ran less than a minute quicker than the first run, so there could not have been too many corrections in the first run.
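Until such a count appears in the syslog summary, the corrections that were logged can be tallied by hand. The following is a rough sketch only: the function name `summarize_corrections` is hypothetical, the 512-byte sector size is an assumption, and it counts only the "correcting parity" lines actually present in the log, which may be fewer than the real number of corrections if the driver limits how many it logs.

```python
import re

# Sample taken from the syslog excerpt quoted above.
SAMPLE = """\
Sep 25 09:53:24 Unraid101 kernel: md: recovery thread checking parity...
Sep 25 09:53:25 Unraid101 kernel: md: using 1536k window, over a total of 976762552 blocks.
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21048
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21056
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21064
Sep 25 09:53:25 Unraid101 kernel: md: correcting parity, sector=21072
"""

CORRECTION = re.compile(r"md: correcting parity, sector=(\d+)")

def summarize_corrections(log_text, sector_bytes=512):
    """Count logged parity corrections and report where they start and end."""
    sectors = [int(m.group(1)) for m in CORRECTION.finditer(log_text)]
    if not sectors:
        return {"count": 0, "first_sector": None,
                "last_sector": None, "first_offset_mib": None}
    return {
        "count": len(sectors),
        "first_sector": min(sectors),
        "last_sector": max(sectors),
        # Offset of the first correction from the start of the disk, in MiB
        # (assumes sector_bytes is the real sector size).
        "first_offset_mib": min(sectors) * sector_bytes / 2**20,
    }

summary = summarize_corrections(SAMPLE)
print(summary)
```

To run it on a full log, pass the contents of the syslog file instead of `SAMPLE`. Comparing `first_sector` and `last_sector` gives a rough idea of whether the corrections were confined to the start of the disk.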

 

The one thing that is a possible concern is that you are running the S3 sleep plugin.  Is there any chance it kicked in during this session?  If so, it could have caused some strangeness on waking, left certain things in an uncertain state.  I did notice that you changed some settings for that plugin.

 

A side note, you are running the Mover every hour on the hour, which is a little unusual, but fine if that's the way you want it.

Link to comment

RobJ, thank you for your deep analysis of my log. I'll try to answer your questions...

...

Those lines by the way are REALLY old, were in syslogs from the very first unRAID versions I ever saw, so could probably use some cleanup, to avoid this confusion.  Your flash drive was fine, no issues that I could see.

What do you mean by "REALLY old"? I took the latest available log file, and those lines are from the day I upgraded the server.

 

...

Triglia118, can you recall if the corrections were all at the beginning?  You quickly started another parity check, with no corrections showing.  It ran less than a minute quicker than the first run, so there could not have been too many corrections in the first run.

Sorry, I cannot remember the exact number of corrections. There were many of them, and not only at the beginning of the process. Is there any other log file that records these events?

 

...

The one thing that is a possible concern is that you are running the S3 sleep plugin.  Is there any chance it kicked in during this session?

No, this is not the case. While the process was running I was at work, checking on it now and then. I was able to access the server every time, so I don't think it went to sleep.

There is also another thing about S3 sleep. I had some problems, described here:

http://lime-technology.com/forum/index.php?topic=36543.msg411213#msg411213
http://lime-technology.com/forum/index.php?topic=36543.msg413486#msg413486

Don't know if this is relevant.

 

...

A side note, you are running the Mover every hour on the hour, which is a little unusual, but fine if that's the way you want it.

My server is not usually ON. I switch it ON when I need it and leave it ON for as long as needed. This is why I run the Mover every hour: to avoid switching the server off before the Mover has done its job. Running it frequently avoids this problem.

 

 

Link to comment

Those lines by the way are REALLY old, were in syslogs from the very first unRAID versions I ever saw, so could probably use some cleanup, to avoid this confusion.

What do you mean by "REALLY old"? I took the latest available log file, and those lines are from the day I upgraded the server.

I'm sorry, I wasn't clear, "REALLY old" did not refer to your syslogs, but to those 3 lines.  They have appeared for many years in unRAID syslogs.  Here's an excerpt from a syslog from 2007!

Aug  3 23:32:51 unRaid emhttp[1112]: shcmd (16): modprobe md-mod super=/boot/config/super.dat slots=8,0,8,16,8,32,8,48,33,0,33,64,34,0,34,64,56,0,56,64,57,0,57,64,0,0,0,0,0,0,0,0 >>/var/log/go 2>&1
Aug  3 23:32:51 unRaid kernel: [  121.524962] md: unRAID driver 0.92.0 installed
Aug  3 23:32:51 unRaid kernel: [  121.866746] md: xor using function: pII_mmx (3930.400 MB/sec)
Aug  3 23:32:51 unRaid kernel: [  121.867691] md: reading superblock from /boot/config/super.dat
Aug  3 23:32:51 unRaid kernel: [  121.867722] read_file: error 2 opening /boot/config/super.dat
Aug  3 23:32:51 unRaid kernel: [  121.867725] md: could not read superblock from /boot/config/super.dat
Aug  3 23:32:51 unRaid kernel: [  121.867728] md: warning! initializing superblock
Aug  3 23:32:51 unRaid kernel: [  121.868169] md0: import [8,0] (sda) ST3500641A      3PM0FSRX offset: 63 size: 488386552
Aug  3 23:32:51 unRaid kernel: [  121.868175] md0: new disk
Aug  3 23:32:51 unRaid kernel: [  121.868273] md1: import [8,16] (sdb) ST3500641A      3PM0GCY1 offset: 63 size: 488386552
Aug  3 23:32:51 unRaid kernel: [  121.868277] md1: new disk

 

Sorry, I cannot remember the exact number of corrections. There were many of them, and not only at the beginning of the process. Is there any other log file that records these events?

I believe the count is in vars.txt in the diagnostics, but it's only the last count.  You have done another parity check, so the count will be zero.

Link to comment
