Tried to shrink array and now I can't start the array or power off the server


Hi guys,

 

I'm currently on unRAID 6.5.2 on a Dell PowerEdge T20 (Intel® Xeon® CPU E3-1231 v3 @ 3.40GHz, 16 GB single-bit ECC) with a total of 6 HDDs. I'm trying to consolidate the data onto fewer disks, so I followed the guide to shrink the array while maintaining parity.

The "Clear Drive Then Remove Drive" Method
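
For context, the idea behind clearing a drive before removing it can be sketched in a few lines of Python. This assumes single parity computed as a plain byte-wise XOR across the data disks; the tiny "disks" and the function name below are made up for illustration and are not taken from the guide. A disk that contains only zeros contributes nothing to the XOR, so parity comes out the same with or without it.

```python
from functools import reduce

def xor_parity(disks):
    """Byte-wise XOR across equally sized disk images (illustrative helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

# Hypothetical three-byte "disks", for illustration only.
disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
disk4 = bytes(3)  # a cleared disk: every byte is zero

with_disk4 = xor_parity([disk1, disk2, disk4])
without_disk4 = xor_parity([disk1, disk2])

# XOR with zero changes nothing, so both parities are identical.
print(with_disk4 == without_disk4)
```

This is also why a disk that was not completely zeroed cannot simply be dropped from the array without rebuilding parity.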

 

 

The disks I wish to remove are Disk 1 and Disk 4 (unraid01.png). I started with Disk 4 since it was unused, and clearing it overnight went pretty well (I guess?). The parity drive went CRAZY hot (66 °C!), but the script reported the disk as OK and cleared in the browser. I then went on to step 13 in the guide, and here's where the trouble started: I can't unassign Disk 4. When I click the drop-down menu and choose "no device", the page just reloads (in both Firefox and Chrome). There's no way around the problem, and there's no Step 14 on that page. I can switch to "Dashboard" and look at the parity status, and it checks out OK: "Parity is valid...Last checked Sun 22 Apr, 87 days ago..." (unraid02.png)

 

Well, I thought that sucked, so I tried to reboot/shut down the system, but it won't reboot or shut down. I even tried from SSH (unraid04.png), but the system just won't. I can still browse the GUI, though. And as you can see (unraid03.png), the button to start the array and the other buttons that used to be there are gone. So now I have no clue what to do. I don't want to do a Norwegian reboot (Swedish joke ;) ) by holding the power button for a couple of seconds and forcing a parity check after an unclean reboot.

 

Thanks in advance!

 

 

 

unraid01.png

unraid02.png

unraid03.png

narum-diagnostics-20180718-1210.zip

unraid04.png

Link to comment

You have a problem with the disk that is named md4.  See this section of your syslog file:

 

Jul 17 20:15:32 Narum emhttpd: req (38): cmd=/plugins/user.scripts/startScript.sh&arg1=/tmp/user.scripts/tmpScripts/clear_an_array_drive/script&arg2=&csrf_token=****************
Jul 17 20:15:32 Narum emhttpd: cmd: /usr/local/emhttp/plugins/user.scripts/startScript.sh /tmp/user.scripts/tmpScripts/clear_an_array_drive/script 
Jul 17 20:16:32 Narum clear_array_drive: Clear an unRAID array data drive  v1.4
Jul 17 20:16:32 Narum clear_array_drive: Unmounting Disk 4  (command: umount /mnt/disk4 ) ...
Jul 17 20:16:32 Narum clear_array_drive: Clearing Disk 4  (command: dd bs=1M if=/dev/zero of=/dev/md4  status=progress ) ...
Jul 17 20:16:36 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1
Jul 17 20:16:36 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Jul 17 20:16:36 Narum kernel: REISERFS (device md4): Remounting filesystem read-only
Jul 17 20:16:44 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1
Jul 17 20:16:44 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Jul 17 20:16:54 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1
Jul 17 20:16:54 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Jul 17 20:17:04 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1
Jul 17 20:17:04 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Jul 17 20:17:14 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1
Jul 17 20:17:14 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Jul 17 20:17:25 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1

The REISERFS error has to do with the file system on that disk. (There are many hundreds of similar lines in the remainder of your syslog.) As I recall, md4 is also known as disk4, and that one is this physical disk: HGST_HTS545050A7E380_130729TE85A13R2ZNKKK. I am not sure what you should do at this point...
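
A rough way to see how many of these errors there are, and which device they point at, is to count them. Below is a minimal Python sketch, assuming the syslog lines have the same shape as the excerpt above. The function name and the regular expression are made up for illustration; the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: REISERFS error (device md4): vs-5150 search_by_key: ..."
REISERFS_ERROR = re.compile(r"kernel: REISERFS error \(device (\w+)\)")

def count_reiserfs_errors(lines):
    """Return a Counter mapping device name -> number of REISERFS error lines."""
    counts = Counter()
    for line in lines:
        match = REISERFS_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    "Jul 17 20:16:36 Narum kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one -1",
    "Jul 17 20:16:36 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?",
    "Jul 17 20:16:36 Narum kernel: REISERFS (device md4): Remounting filesystem read-only",
    "Jul 17 20:16:44 Narum kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 0. Fsck?",
]

print(count_reiserfs_errors(sample))
```

To run it against a whole syslog, pass an open file object instead of the sample list, since the function only needs an iterable of lines.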

Link to comment
25 minutes ago, Frank1940 said:

You have a problem with the disk that is named md4.  See this section of your syslog file:

 


The REISERFS error has to do with the file system on that disk. (There are many hundreds of similar lines in the remainder of your syslog.) As I recall, md4 is also known as disk4, and that one is this physical disk: HGST_HTS545050A7E380_130729TE85A13R2ZNKKK. I am not sure what you should do at this point...

 

Ahhh ok! I forgot to mention that I did (as described in the guide) change the filesystem on that disk and format it, as that was the fastest way to erase the data. I don't mind if the disk is wasted as long as the array makes it :D I'm really tempted to just do an unclean reboot right now since I really miss the services running on the machine haha!

 

Should I try to just shut the server down and then remove the drive?

 

Link to comment

If you do this, you will have to set a new configuration and allow parity to be rebuilt. You already have the link to how to do this. Just use the "Remove Drives Then Rebuild Parity" Method.

 

Another thing, you should be looking at why your parity drive is getting so hot.  That is not good for it...

Link to comment
13 minutes ago, Frank1940 said:

If you do this, you will have to set a new configuration and allow parity to be rebuilt. You already have the link to how to do this. Just use the "Remove Drives Then Rebuild Parity" Method.

 

Another thing, you should be looking at why your parity drive is getting so hot.  That is not good for it...

 

Ok! I will try that! I guess it's because it's crazy hot here in Sweden now, and it's hard not to have about 30 °C indoors. Then I removed the side of the server chassis, and the airflow got f*cked while I zeroed the disk. It's usually at about 53-55 °C (Seagate IronWolf operating temp).

 

Thanks for the help!

Link to comment

Never take the side of a case off of a server to help cool it. That guarantees that you will have very little air flow over your hard disks. You actually want the temperatures to be in the mid 40's. On my servers, I have a fan installed in every case opening for a fan and they all blow out. (They should only blow in if they are at the front of the hard drives.) Of course, we will hit 35C several times during the summer, but most places have Air Conditioning (AC), as 30C is a fairly typical high for us. My air conditioning set-point is about 25C, but the home office where one of the servers lives is always a degree or two above that; the reduction in humidity is the real benefit of AC for most folks. (I was outside in Arizona once when it was 40C and was not uncomfortable. BUT I was just sitting in a chair, in the shade, drinking lots and lots of water and the relative humidity was about 10%. I was actually sweating like crazy but my skin was dry...)

Edited by Frank1940
Link to comment
2 hours ago, iamnypz said:

 

Ok! I will try that! I guess it's because it's crazy hot here in Sweden now, and it's hard not to have about 30 °C indoors. Then I removed the side of the server chassis, and the airflow got f*cked while I zeroed the disk. It's usually at about 53-55 °C (Seagate IronWolf operating temp).

 

Thanks for the help!

 

I also have serious disk temp issues, so I'm only playing with some 2.5" drives right now until the weather makes it safe to toy with a file server. And the hot weather seems set to stay for a while longer in Sweden.

 

Be very, very, very careful with your drives. 66°C is a warranty-killing temperature. If you can't keep the drives cool, then it's better to let the server rest.

Link to comment
7 minutes ago, pwm said:

 

I also have serious disk temp issues, so I'm only playing with some 2.5" drives right now until the weather makes it safe to toy with a file server. And the hot weather seems set to stay for a while longer in Sweden.

 

Be very, very, very careful with your drives. 66°C is a warranty-killing temperature. If you can't keep the drives cool, then it's better to let the server rest.

 

I've never seen anything like it, but it must be because I (against my better judgment) opened the side of the server. Usually when there's a lot of writing to the parity drive (in the winter, that is, at about 19C indoors) the drive temp is about 51-53C. The other non-Seagate disks are at 40-ish C, so it's Seagate-specific, I would say. But thanks for the heads-up anyway! Yeah, it seems like a never-ending summer up north. Been a while since :D Hard times though, since no one has AC installed

Link to comment
10 minutes ago, iamnypz said:

Been a while since :D Hard times though, since no one has AC installed

 

I know. I have failed to get any real work done for a long time now. It's just too hot in front of the computer to focus on work while at home. And fans and AC units are sold out in most of Sweden. But at least the work office is cool.

Link to comment

I have seen disk temps upwards of 60C too. My WD Red Pro always runs quite a bit hotter than the IronWolves (IronWolfs?)

Happened yesterday during a parity rebuild and really not that much can be done about it. I just try to minimize cpu and disk usage during the day in the summer. Unfortunately, since my server sounds much like a vacuum cleaner it cannot stay anywhere in the house, and thus the unventilated, completely unshaded shed is the only place to put it.

As I am a renter, there's not much in the way of modifications I can do to the shed. I do intend to try and remove a small gable piece and put a fan there, but as I write this, the temp in the shed is high 40's.

So, you aren't the only person who's temperature limited and it is really annoying but sometimes there's just not much that can be done.

Del

 

As a side note, after hitting 50C yesterday, a still plugged-in but effectively retired WD Green 2TB with 9 years of power-on time is now giving current pending sector count errors, and they are increasing by the minute (16 to 120 in 8 hours). Of course, it's far too hot to go out to the shed to remove it...

Edited by Delarius
Link to comment
22 minutes ago, ken-ji said:

You guys have crazy temps...

summer's over here and we had temps of 35C in the shade  with humidity of 60%

Disk temps are about 35C - 55C (during parity checks) - no AC - power bills are already too expensive

 

Yes, I know Japan has had really killer temperatures :(

 

The problem we have in Sweden is that I live at almost the same latitude as Reykjavik in Iceland, or midway between Fairbanks and Anchorage in Alaska. So +31°C (+88°F) outside is way outside of normal temperatures; our houses are optimized to stay comfortably warm when it's -30°C (-22°F) outside. Normally only office buildings have AC.

 

My Seagate 2.5" 5TB USB drive did reach 52°C when doing a slow, sequential, write of 30 MB/s over USB 2 (the world needs a new R-Pi generation with more bandwidth).

Link to comment
20 minutes ago, pwm said:

 

Yes, I know Japan has had really killer temperatures :(

 

The problem we have in Sweden is that I live at almost the same latitude as Reykjavik in Iceland, or midway between Fairbanks and Anchorage in Alaska. So +31°C (+88°F) outside is way outside of normal temperatures; our houses are optimized to stay comfortably warm when it's -30°C (-22°F) outside. Normally only office buildings have AC.

 

My Seagate 2.5" 5TB USB drive did reach 52°C when doing a slow, sequential, write of 30 MB/s over USB 2 (the world needs a new R-Pi generation with more bandwidth).

I'm actually in the funny country called the Philippines - where we are now experiencing lots of rain (storms one after the next) and uncomfortably sticky amounts of humidity...

Link to comment
3 minutes ago, ken-ji said:

I'm actually in the funny country called the Philippines - where we are now experiencing lots of rain (storms one after the next) and uncomfortably sticky amounts of humidity...

 

Sorry for mixing up the countries.

 

I very much dislike humidity. Dry heat/cold is ok. Not so much heat transfer.

Link to comment
20 hours ago, pwm said:

 

I also have serious disk temp issues, so I'm only playing with some 2.5" drives right now until the weather makes it safe to toy with a file server. And the hot weather seems set to stay for a while longer in Sweden.

 

Be very, very, very careful with your drives. 66°C is a warranty-killing temperature. If you can't keep the drives cool, then it's better to let the server rest.

 

FYI, I found this today while doing some research. So 66C isn't BANANAS, but high - yes :D

 

I did a parity rebuild tonight with the server case closed, on the same parity drive and without Disk 4, and as far as I know the temperature stayed < 58C, so the airflow (and thus me) was to blame the first time.

 

Quote

With our newer model drives the maximum temperature is now at 60 degrees Celsius.

The operating temperature range for most Seagate hard drives is 5 to 50 degrees Celsius. A normal PC case should provide adequate cooling.

http://knowledge.seagate.com/articles/en_US/FAQ/193771en
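
To put the temperatures mentioned in this thread next to the limits quoted above, here is a minimal Python sketch. The thresholds are taken from the Seagate quote (5 to 50 °C operating range, 60 °C maximum for newer models). The function name and the category labels are made up for illustration, and other drive models may have different limits.

```python
OPERATING_MIN_C = 5     # lower end of the quoted operating range
OPERATING_MAX_C = 50    # upper end of the quoted operating range
NEWER_MODEL_MAX_C = 60  # quoted maximum for newer model drives

def classify_drive_temp(temp_c):
    """Place a drive temperature (in °C) relative to the quoted limits."""
    if temp_c < OPERATING_MIN_C:
        return "below operating range"
    if temp_c <= OPERATING_MAX_C:
        return "within operating range"
    if temp_c <= NEWER_MODEL_MAX_C:
        return "above operating range, within newer-model maximum"
    return "above quoted maximum"

# Temperatures mentioned in this thread.
for temp in (40, 53, 58, 66):
    print(temp, classify_drive_temp(temp))
```

By these numbers the 66 °C reading during the clear is over the quoted maximum, while the < 58C seen during the rebuild with the case closed is under it.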

Link to comment
