unRAID Server Release 5.0-rc11 Available


limetech


Where in the webGui can you see whether it's using all cores? I have a dual core and would like to know if unRAID is using both. I'm also thinking of buying a quad core. I also don't see any write cache setting in my BIOS. Perhaps my board just doesn't have this setting or issue.

Install htop, telnet in, and type "htop". The write cache comments are directed to HP MicroServer users.

Link to comment

or, don't install anything and type

top

then press the "1" key to show individual CPU stats rather than combined statistics.
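
If you just want to confirm how many CPUs the kernel detected (as opposed to how busy each one is), a quick sketch like this should work from the telnet prompt. It assumes a Linux /proc/cpuinfo with one "processor" line per logical CPU; cpu_count is just a name made up for this example.

```shell
# Count the logical CPUs the kernel sees.
# Assumption: /proc/cpuinfo lists one line starting with "processor" per CPU.
# An alternate file can be passed as $1 (handy for testing).
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Usage: cpu_count
```

A dual core should report 2. This only tells you the cores were detected; use top or htop to see the per-core load.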

 

 

Link to comment

Disk5 spinning in SF GUI,  not in unmenu GUI

 

What is this? Images and syslog attached. I am on RC11.

 

//Peter

This has absolutely nothing to do with unRAID, as both are user add-ons.  Take this question to the customizations forum.  It does not belong in this thread.

 

Joe L.

Link to comment

Something that turned out rather well I think is a new way of reporting user share object "duplicates": they are no longer reported in the system log.  This will limit system log growth due to normal activity.  Instead of reporting duplicates in the system log, there is a new "Location" column added to the 'indexer' webGui feature (that is, the application that executes when you click the little file folder icon next to a disk or user share).

This will make it much easier for most users to determine where the duplicates are located, and where the files that are not duplicates are as well...

 

Unfortunately, if you no longer log the files in the syslog, it will be nearly impossible to determine whether any exist unless you have either very few files or a small number of directories.  It would take days to manually traverse the GUI looking in the "Location" field for them on my server, as I probably have thousands of directories.

 

Please consider logging the duplicates as well as showing their location, or provide a utility to list them all on one page or in one file, to make detection easier.

 

Joe L.

 

Agreed.  This utility will be in the next release; I just didn't want to take the time to write it before releasing -rc11.  When you click the utility it will open a window and start listing all the files on the selected user share that have duplicates, along with the disks they are on.  I wanted to get rid of the duplicate file logging in syslog because that could cause syslog to grow unbounded just with "normal" use.
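
In the meantime, here is a rough command-line sketch of the same idea. It is not the planned utility. It assumes the data disks are mounted as separate directories (for example /mnt/disk1, /mnt/disk2) and treats any relative path that exists on more than one disk as a duplicate; list_duplicates is a hypothetical name.

```shell
# Print every relative file path that exists on more than one of the
# given disk mount points. Assumption: each argument is the root of one
# data disk, and file names do not contain newlines.
list_duplicates() {
    for disk in "$@"; do
        # Run find from inside the disk so paths are relative to its root.
        ( cd "$disk" 2>/dev/null && find . -type f )
    done | sort | uniq -d
}

# Usage: list_duplicates /mnt/disk* > /boot/duplicates.txt
```

This only lists the duplicated paths, not which disks hold them; checking each reported path on the individual disks would show that.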

Link to comment

Where in the webGui can you see whether it's using all cores? I have a dual core and would like to know if unRAID is using both. I'm also thinking of buying a quad core. I also don't see any write cache setting in my BIOS. Perhaps my board just doesn't have this setting or issue.

Install htop, telnet in, and type "htop". The write cache comments are directed to HP MicroServer users.

 

Thanks!  I did a search and there are several downloads for htop.  Which one are you using for unraid?  I didn't find it on Limetech site either.  Link please?  :)

Link to comment

or, don't install anything and type

top

then press the "1" key to show individual CPU stats rather than combined statistics.

Aha! I didn't know you could do that. Still, htop makes it easier to see what's going on. And it's pretty!

 

YD4RH5U.png

 

You can install htop via unmenu's package manager.

Link to comment

...they are no longer reported in the system log.  This will limit system log growth...

You can make your life a little easier, and stop worrying about runaway syslogs crashing the system, if you just add this one line to /etc/fstab

tmpfs  /var/log  tmpfs  size=32m  0  0

I would not limit the size like that. Since tmpfs has a default maximum size of half the system RAM, I would leave off the "size" option entirely.

By default, a tmpfs partition has its maximum size set to half your total RAM, but this can be customized (by using "size=XX"). Note that the actual memory/swap consumption depends on how much you fill it up, as tmpfs partitions do not consume any memory until it is actually needed.

A change like this was suggested 4 years ago here:

http://lime-technology.com/forum/index.php?topic=3352.msg29874#msg29874
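
For reference, here is a rough way to see what that default limit works out to on a given box. It assumes the default of half of RAM described above and reads MemTotal from /proc/meminfo (MemTotal is a little less than installed RAM, so treat the number as approximate); default_tmpfs_mib is a hypothetical helper name.

```shell
# Print the approximate default tmpfs size limit in MiB, i.e. half of
# MemTotal. /proc/meminfo reports MemTotal in kB.
# An alternate meminfo file can be passed as $1 (handy for testing).
default_tmpfs_mib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage: default_tmpfs_mib
```

Once /var/log is on tmpfs, "df -h /var/log" shows the limit actually in effect and how much of it is used.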

 

Joe L.

Link to comment

please provide the correct link for htop, I'd like to try it.

 

http://lmgtfy.com/?q=unmenu+unraid

 

That didn't help.  I know how to google, geez.  I found that page yesterday, but if you look at the bottom, there are a bunch of different options.  Which one works best with unRAID?  I was thinking it would be the Slackware one, but that line doesn't have a link.

 

• GoboLinux: In GoboLinux you can fetch and compile htop by typing: Compile htop

You can also download the GoboLinux binary package.

• Debian: In Debian you can fetch htop by typing: apt-get install htop

You can also download the binary packages from the Debian webpage.

Thanks to Eugene Lyubimkin and Bartosz Fenski.

• Fedora: htop is part of Fedora Extras; you can fetch it by typing: yum install htop

Thanks to Dawid Gajownik.

• RedHat: You can find RPM's for htop at DAG.

Thanks to Dag Wieers. Also, you can find RHEL packages at EPEL (thanks to Josh Stone for the tip).

 

• Slackware: htop is part of Slackware. You can find it in the ap/ section.

Thanks to Patrick Volkerding for including it, and to Fred Broders for earlier packages.

• Gentoo: In Gentoo Linux you can emerge the sys-process/htop package by typing: emerge sys-process/htop

Thanks to Wolfram Schlich.

• AltLinux: here are the latest RPMs for AltLinux.

Thanks to Ilya Evseev.

• OpenSuSE: htop is included in the OpenSuSE build service.

Thanks to Timo Hoenig.

• Mandriva: you can find RPMs for Mandriva (formerly Mandrake and Conectiva) and other RPM-based distributions at RpmFind.

• KateOS: packages are available at the KateOS Pkgportal. You may enable community repositories in /etc/updateservers or using KatePKG.

Thanks to Piotr Pelzowski.

• Zenwalk: Htop is installed by default straight from the Zenwalk GNU/Linux ISO CDs and updates are available via its package manager, netpkg.

Thanks to Michael Verret.

 

Link to comment

For top alternatives, see alternative to "top" for system monitoring?

 

These are also listed in the UnRAID Add Ons wiki page.

 

The pages and links are old though, so may not link to the most recent versions.

 

Exactly, the links are old.  And for Slackware, it just says to find it in the ap/ section.  I don't know what that means.  That is why I was asking for the download link to the current version that works best with unRAID.

 

I also do not use unMENU at all, so I don't want an unMENU package.  I just want the htop install file itself and I'll install it via the go file.

Link to comment

These are completely off-topic in the release thread. Please continue it in the customizations forum.

 

Easiest way to determine the link unMENU uses is to look in the .conf file unMENU uses. 

http://code.google.com/p/unraid-unmenu/source/browse/trunk/htop-unmenu-package.conf

It will have the link and any special installation instructions that might be needed for a given package.

Link to comment

Possible  issue with RC11.

 

It looks like unRAID dropped 1 channel on an AOC-SASLP-MV8. It red-balled one drive (drive 15).

 

Instead of emulating the missing drive as it should, it ran into trouble. It is getting thousands of read errors on the other 3 drives on the same channel.  It is doing hundreds of parity corrections now. Would those parity corrections not invalidate the parity for the rebuild of the lost drive?

 

I am not onsite to perform any physical  maintenance.

 

The webGui went unresponsive at about this time. I tried to remote into the console and hard reboot it.  The console is spammed with thousands of ReiserFS errors for disk 15. I gave it a hard reboot command and lost interactivity, and it continued to spam ReiserFS errors. After 12 hours in this state, it never shut down. I hard rebooted it by cutting power remotely. It came back up and auto-started. It is continuing to make parity corrections with the red-balled drive offline and inserted (and it now shows it made thousands of writes to 2 of the other 3 drives on the same channel). It is also spamming the log and console. I cannot access the array at all. At about this point it went unresponsive.

 

Once I get a syslog, the question will be what died:

The driver for the controller, taking out the drive.

The controller itself.

The drive, which in turn took out the controller.

 

I'll try to get some syslogs ASAP. At this time, I have lost connection to the server.

 

I am not sure what direction to go in from here.

I would assume I should have the red-balled disk pulled before powering it back up, and then see if the other 3 drives still have issues?

Link to comment

Why do you think this is a problem with RC11? It looks like a hardware issue, as you say yourself.

Link to comment

 

You might want to set the "oom_score_adj" value to -1000 in /proc/$PID/oom_score_adj. Otherwise, because it is the process that has been idle the longest, it is the prime target for the kernel OOM process killer.

One user did something like this in their config/go script after emhttp was invoked to keep it from being killed when memory was exhausted.

pgrep -f "/usr/local/sbin/emhttp" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done

 

It may not hang or die, but it sure gets killed a lot because it is the most idle.
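
To check that the adjustment actually took effect, something like this sketch can read the value back. It assumes a Linux /proc with a per-process oom_score_adj file; show_oom_adj is a made-up helper name, not part of unRAID.

```shell
# Print "PID value" for each PID given, reading /proc/PID/oom_score_adj.
# PIDs that no longer exist (or are unreadable) are silently skipped.
show_oom_adj() {
    for pid in "$@"; do
        if [ -r "/proc/$pid/oom_score_adj" ]; then
            printf '%s %s\n' "$pid" "$(cat "/proc/$pid/oom_score_adj")"
        fi
    done
}

# Usage: show_oom_adj $(pgrep -f "/usr/local/sbin/emhttp")
```

After the go script line above has run, each emhttp PID should show -1000.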

 

Joe L.

 

I had the emhttp issue (webGui not loading) with RC10. I implemented "oom_score_adj" and it seems much better. Once or twice a week the webGui still doesn't load, but clearing the cache for tower in Firefox corrects it. I assumed that was an issue with Firefox.

Link to comment

Why do you think this is a problem with RC11? It looks like a hardware issue, as you say yourself.

I think I can speak for him on this, I definitely wouldn't want unraid to write data to the parity drive based on read errors from multiple disks. If there are multiple failed drives, for whatever reason, unraid should gracefully give up and take the array offline pending intervention. The parity system is only able to handle a single drive failure. If another disk fails, writes to the parity drive shouldn't be happening at all.
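
To make the single-failure limit concrete, here is a toy sketch. It assumes plain XOR parity over one byte per disk; the byte values are made up and this is not unRAID's actual code.

```shell
# Toy model: one byte per data disk; parity is the XOR of all data bytes.
d1=$(( 0xA5 )); d2=$(( 0x3C )); d3=$(( 0x0F ))
parity=$(( d1 ^ d2 ^ d3 ))

# Single failure: d2 is missing, so rebuild it from parity plus the
# two surviving disks.
rebuilt=$(( parity ^ d1 ^ d3 ))

# Second failure: d3 now returns a bad read (one bit flipped here), so
# the same calculation produces a wrong value for d2.
bad_d3=$(( d3 ^ 0x10 ))
rebuilt_bad=$(( parity ^ d1 ^ bad_d3 ))

printf 'original d2=0x%02X  rebuilt=0x%02X  rebuilt with bad d3=0x%02X\n' \
    "$d2" "$rebuilt" "$rebuilt_bad"
```

With one unknown the equation has exactly one solution; with a second disk returning garbage, whatever gets computed (or written back) is built on bad data.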
Link to comment
I gave it a hard reboot command and lost interactivity, and it continued to spam ReiserFS errors. After 12 hours in this state, it never shut down. I hard rebooted it by cutting power remotely. It came back up and auto-started. It is continuing to make parity corrections with the red-balled drive offline and inserted (and it now shows it made thousands of writes to 2 of the other 3 drives on the same channel). It is also spamming the log and console. I cannot access the array at all. At about this point it went unresponsive.

 

Are you sure that the failed drive is still red-balled and offline after the reboot?  This sounds very similar to the problem I had with rc8a and for which a fix was implemented in rc10, here.

Link to comment

Why do you think this is a problem with RC11? It looks like a hardware issue, as you say yourself.

I think I can speak for him on this, I definitely wouldn't want unraid to write data to the parity drive based on read errors from multiple disks. If there are multiple failed drives, for whatever reason, unraid should gracefully give up and take the array offline pending intervention. The parity system is only able to handle a single drive failure. If another disk fails, writes to the parity drive shouldn't be happening at all.

 

You are reading my thoughts.

Until I can get my syslogs, I cannot verify what is happening and I cannot give Tom any useful help.

 

 

I gave it a hard reboot command and lost interactivity, and it continued to spam ReiserFS errors. After 12 hours in this state, it never shut down. I hard rebooted it by cutting power remotely. It came back up and auto-started. It is continuing to make parity corrections with the red-balled drive offline and inserted (and it now shows it made thousands of writes to 2 of the other 3 drives on the same channel). It is also spamming the log and console. I cannot access the array at all. At about this point it went unresponsive.

 

Are you sure that the failed drive is still red-balled and offline after the reboot?  This sounds very similar to the problem I had with rc8a and for which a fix was implemented in rc10, here.

After reboot, 15 is still offline and I think I lost 16 also. I started copying the files from the emulated 15 to another server in case I cannot run recovery.  It started giving me ReiserFS errors for drive 15, which is not plugged in (emulated errors?), and the array and server went unresponsive again.

GUI is not responsive.

 

 

Why do you think this is a problem with RC11? It looks like a hardware issue, as you say yourself.

So I lost 2-4 drives at once, next to each other on the same SAS channel, just days after a parity check (and upgrade).  Possible but odd.

I lost a single channel of a SAS card that has had no previous errors.  Possible but odd. I'd expect the whole card to go down.

Driver error/unRAID error...

 

I had this happen twice before with my M1015s and it turned out to be a driver issue.

 

My concern is whether parity is being updated from read errors while in a degraded state. Is that going to hurt recovery attempts later?

 

Right now unraid is up from reboot #3 and completely unresponsive via webgui.

 

Link to comment

Something that turned out rather well I think is a new way of reporting user share object "duplicates": they are no longer reported in the system log.  This will limit system log growth due to normal activity.  Instead of reporting duplicates in the system log, there is a new "Location" column added to the 'indexer' webGui feature (that is, the application that executes when you click the little file folder icon next to a disk or user share).

This will make it much easier for most users to determine where the duplicates are located, and where the files that are not duplicates are as well...

 

Unfortunately, if you no longer log the files in the syslog, it will be nearly impossible to determine whether any exist unless you have either very few files or a small number of directories.  It would take days to manually traverse the GUI looking in the "Location" field for them on my server, as I probably have thousands of directories.

 

Please consider logging the duplicates as well as showing their location, or provide a utility to list them all on one page or in one file, to make detection easier.

 

Joe L.

+1

 

Link to comment
