SMART parameter tracking database



Version 0.1.06 is now posted.  Link is in OP.

 

Now includes both executable (smarthistory) and php code (smarthistory.php)

 

//* 0.1.06  - Cleaned up code to suppress warnings from PHP when PHP warnings are enabled

//*           Changed default size for graphs

//*           Substitute static image when graphs would show data has never changed

//*           Changed logic to give alerts on all static thresholds when report=ALL

//*             regardless of whether a delta-threshold was met

 

And I added more comments to the source code ;)

 

//* ToDo List:

//* -------------------------------------------

//* Implement more options in program config

//* User-config to exclude drives

//* Smarts to know which models of drive can give smart data w/o spinning up

//* Limits on length of time to keep data

//* Let users define colors

//* Add ANSI codes for color terminal output

//* Combine the token and program config files

Link to comment

I'm using the Roadsend PHP compiler, pcc.

 

The problem is the static linking: no matter what language you are using, if you statically link a program written in a high-level language, you get tons of baggage.

 

Thanks, this is a big help, I did not know about this tool. I have a new respect for PHP. ;-)

I'm aware of the static linking baggage issue. I design secure chroots on many of our servers for applications.

I end up making static versions of binaries all the time to alleviate the issue of shared libraries in chroots.

 

Link to comment

I just installed for the first time.  Couple comments / issues:

 

1 - The structure should support capturing more than one (or two) samples per day.  I could see utility in running it before and after a parity check.  I could see running it every hour DURING a parity check (as this is the time when SMART attributes have the greatest chance of being updated).  I could also see starting a parity check and then running smarthistory after 5 minutes.  If there were problems developing, you'd get notified.  Keeping this history could be very helpful even if options to display / graph it all are not immediately available.

 

2 - There are a bunch of zero-length files with a .LOCK extension in the smarthistory directory after the program ends.  Is this normal?

Link to comment
1 - The structure should support capturing more than one (or two) samples per day.

That would require a MAJOR change, both to data and the program.  Smarthistory is principally intended for looking at months of trending data over time.

Running smartctl every 5 minutes during a parity check is a very *bad* idea.

2 - There are a bunch of zero-length files with a .LOCK extension in the smarthistory directory after the program ends.  Is this normal?

Yes.  The flatfile database code uses those for file lock semaphores.
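For readers unfamiliar with lock-file semaphores, here is a minimal shell sketch of the general idea. It is not smarthistory's actual code (which is PHP and is not shown in this thread). The function name with_lock and the file names are made up for illustration, and it assumes the flock(1) utility from util-linux is installed. It shows why a zero-length .LOCK file can remain after the program ends: the lock lives on an open file descriptor, not on the file's contents.

```shell
#!/bin/sh
# Illustration only: a lock-file semaphore built on flock(1) from util-linux.
# with_lock is a hypothetical helper, not part of smarthistory.
with_lock() {
    datafile="$1"; shift
    (
        flock 9 || exit 1      # wait until we hold the exclusive lock on fd 9
        "$@"                   # run the command while the lock is held
    ) 9>"$datafile.LOCK"       # the lock file is created empty and is left behind
}
```

A call such as `with_lock /tmp/sda.db sh -c 'echo row >> /tmp/sda.db'` would append to the data file while holding the lock and leave an empty /tmp/sda.db.LOCK next to it.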

Link to comment

bubbaQ, this is a great tool.  Please take my comment as just a suggestion.

 

Might want to reread my last post.  I didn't suggest running smartctl every 5 minutes.

Link to comment

Understood...  I just have to watch out for "mission creep."  An application that looked for short-term smart parameter changes for use when there is hard-core disk activity might be useful... I just don't think that is in the mission of smarthistory.

 

You can do it with smarthistory however by overriding the -dailydata parameter as long as you don't cross midnight.

 

   smarthistory -dailydata LAST -devices /dev/sdx   (this will write the current (LAST) values to the database)

 

Start parity check.

 

Then in a few minutes you can run

 

   smarthistory -dailydata FIRST -devices /dev/sdx  (this will compare the written data with the current live data)

 

and you can re-run this command as many times as you like (as long as you don't cross midnight) and it will always compare the latest (LIVE) data to the data saved just before you started.
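The two steps above could be wrapped in a small script. This is only a sketch: the function names and the SMARTHISTORY and BASELINE_DAY variables are invented for illustration. The only smarthistory options used are the -dailydata and -devices ones quoted above. The date check is an added guard for the "don't cross midnight" caveat.

```shell
#!/bin/sh
# Sketch only: wraps the two smarthistory invocations quoted above.
# SMARTHISTORY, BASELINE_DAY and both function names are hypothetical.
SMARTHISTORY="${SMARTHISTORY:-smarthistory}"

# Write the current (LAST) values to the database and remember today's date.
save_baseline() {
    dev="$1"
    BASELINE_DAY=$(date +%Y-%m-%d)
    "$SMARTHISTORY" -dailydata LAST -devices "$dev"
}

# Compare live data with the saved baseline; refuse once midnight has passed.
compare_to_baseline() {
    dev="$1"
    if [ "$(date +%Y-%m-%d)" != "$BASELINE_DAY" ]; then
        echo "baseline was saved on $BASELINE_DAY; midnight has passed" >&2
        return 1
    fi
    "$SMARTHISTORY" -dailydata FIRST -devices "$dev"
}
```

Typical use would be `save_baseline /dev/sdx`, then starting the parity check, then calling `compare_to_baseline /dev/sdx` as often as needed from the same shell session.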

Link to comment

1.  I think that there might be an issue using this with version 5.36 of smartctl.  I ran it on a sleeping array using 4.3.3 and got errors indicating it had failed the health check.  Lines showed up in the history file with blank values.  The drives were all spun down, but I told smarthistory to spin them up (and it did).

 

2.  Is there a way to get smarthistory to create the HTML for the graphs WITHOUT actually doing a data collection?

Link to comment
  • 5 months later...

Hi,

 

I am trying to start this tool from the go script, without success.

I tried to include this in the following two ways:

 

1. cd /boot/smarthistory; smarthistory

2. cd /boot/smarthistory

    smarthistory

 

It seems it is not started (when I check a history file that has not been updated for days, it still is not updated).

But if I do it manually via a telnet session, cd to the directory and run smarthistory it's ok.

 

What am I doing wrong?  ???

 

And two more questions:

- How can I check whether it is started/loaded in some more sophisticated way than checking the history files?

- If it is invoked as "smarthistory", does it schedule itself automatically to check for SMART data once a day when a disk spins up?

 

Link to comment

OK, my problem with starting the tool has been solved.

 

So now my main question: how can I schedule it to record SMART params every day?

When I run it manually it saves the record to the HDD files, but it does not seem to do so automatically every day.

I understood that it schedules itself automatically; is that right? Or should I schedule it manually?

 

Thank you!

Link to comment
  • 1 month later...

Hi

 

just installed and get these errors: I have unRAID Server Pro 4.4.2

 

VAULT login: root

Linux 2.6.27.7-unRAID.

root@VAULT:~# /boot/smart/smarthistory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

smartctl: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

No alerts from Smarthistory.

1 device(s) active, 4 sleeping, 1 did not return SMART data.

root@VAULT:~#

 

 

 

 

Link to comment


The support library needed for the "smartctl" program was accidentally not included in your version of unRAID. It is back in 4.5-beta6.

 

You can easily install it in your version though,

 

Paste the line below at a command prompt.  It is a long line, starting with an opening "(" and ending at the closing paren ")".

(mkdir /boot/packages;cd /boot/packages;wget http://slackware.cs.utah.edu/pub/slackware/slackware-12.0/slackware/a/cxxlibs-6.0.8-i486-4.tgz; installpkg cxxlibs-6.0.8-i486-4.tgz;)

Or, type the 4 commands in turn

 

mkdir /boot/packages

cd /boot/packages

wget http://slackware.cs.utah.edu/pub/slackware/slackware-12.0/slackware/a/cxxlibs-6.0.8-i486-4.tgz

installpkg /boot/packages/cxxlibs-6.0.8-i486-4.tgz

 

Either will create a /boot/packages directory, change directory to it, download the missing library, and then install it.

 

If you reboot you'll need to re-run the last line to re-install the package.  No need to download it again, as you already did that once.
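One way to avoid re-running the install by hand after every reboot might be to call it from a startup script (for example the go script mentioned earlier in this thread). The sketch below is an assumption, not something Joe L. posted: the function name and the PKG and INSTALLPKG variables are invented, and only the package path and the installpkg command come from the instructions above.

```shell
#!/bin/sh
# Sketch: reinstall the C++ support library at boot if the saved package exists.
# PKG and INSTALLPKG are hypothetical override hooks; defaults match the post above.
PKG="${PKG:-/boot/packages/cxxlibs-6.0.8-i486-4.tgz}"
INSTALLPKG="${INSTALLPKG:-installpkg}"

reinstall_cxxlibs() {
    if [ -f "$PKG" ]; then
        "$INSTALLPKG" "$PKG"
    else
        # Nothing to install: the download step has to be repeated first.
        echo "missing $PKG; download it again first" >&2
        return 1
    fi
}
```

Calling `reinstall_cxxlibs` before the first smarthistory run should make smartctl usable again after a reboot, provided the package file is still on the flash drive.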

 

Joe L.

Link to comment
  • 1 month later...

Has anyone ever seen an issue where a call to smarthistory can corrupt the view of the booting usb drive?  I've got an OCZ that after smarthistory is run the /boot filesystem is missing most entries including smarthistory and unmenu and I get error messages like this in the log:

 

unmenu[1504]: gawk: drivedb.lib.awk:1: fatal: can't read sourcefile `drivedb.lib.awk' (Input/output error)

 

At this point, unmenu has freaked out and delivers corrupt pages. I can't shut down cleanly because I can't write logs or even find configuration files to put everything away.  I thought it was hardware or something and ended up going through cpu/motherboard setups and even had a 5 day run of a 3 drive unregistered array with a different usb key that worked like a champ.

 

So tonight, I go back to my registered array and went to smarthistory and got a first report but when I went back to unmenu I got the error above and bogus pages from unmenu and the drive contents were lost (and changed as time went on).

 

After getting it shut down I removed the usb key and put it on my windows box where everything looks fine.

 

Has anyone ever seen this?  Surely the OCZ doesn't support SMART so it should ignore/reject the request.  Anyone have any ideas?

 

Rob

Link to comment

You are the first one to report that behavior.  The USB flash drive should ignore the SMART request, but who knows....

 

Are you sure it is not a file-system corruption error?  Running chkdsk on Windows might detect a problem with the smarthistory file.

 

If it really is the smartctl call that triggers corruption, you can always code the smarthistory.php to not include the flash drive.

 

Joe L.

Link to comment
  • 1 month later...

No, it does not schedule itself.... you have to do that with cron.

 

Thanks for the great extension bubbaQ. Added to daily cron, now it works flawless.

 

Sorry if it's been posted before but could someone please provide an example of how to do this?

 

I used Joe's .sh example below to create a job to check parity monthly and call it from my go file

 

#!/bin/sh
crontab -l >/tmp/crontab
grep -q "/root/mdcmd check" /tmp/crontab 1>/dev/null 2>&1
if [ "$?" = "1" ]
then
    echo "# check parity on the first of every month at midnight:" >>/tmp/crontab
    echo "0 0 1 * * /root/mdcmd check 1>/dev/null 2>&1" >>/tmp/crontab
    cp /tmp/crontab /var/spool/cron/crontabs/root-
    crontab /tmp/crontab
fi

Link to comment


Scripts placed in the /etc/cron.daily folder are executed once a day at 4:40 AM.  If this is as good a time as any for you, just copy the "button" script in unmenu's folder to /etc/cron.daily and make it executable.

 

Two lines, like this should do it:

cp /boot/unmenu/50-unmenu_user_script_smarthist-graph  /etc/cron.daily; 

chmod 755 /etc/cron.daily/50-unmenu_user_script_smarthist-graph
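If a different time of day is wanted, the append-if-missing crontab approach from the parity-check script quoted earlier in this thread could be adapted. The sketch below is hedged: the function name, the 06:00 schedule, and the /boot/smarthistory install path are all assumptions for illustration, so adjust them to your own setup. It reads an existing crontab on standard input and prints it back with one daily smarthistory entry added, unless one is already there.

```shell
#!/bin/sh
# Sketch: print a crontab with a daily smarthistory job added if none exists.
# add_smarthistory_job, the 06:00 time and /boot/smarthistory are assumptions.
add_smarthistory_job() {
    job='0 6 * * * cd /boot/smarthistory && ./smarthistory 1>/dev/null 2>&1'
    current=$(cat)                      # existing crontab text from stdin
    if [ -n "$current" ]; then
        printf '%s\n' "$current"        # keep existing entries unchanged
    fi
    if ! printf '%s\n' "$current" | grep -q "smarthistory"; then
        echo "# record SMART data every day at 06:00:"
        printf '%s\n' "$job"
    fi
}
```

It could be used as `crontab -l | add_smarthistory_job >/tmp/crontab && crontab /tmp/crontab`. Running it a second time leaves the crontab unchanged.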

Link to comment

 


Awesome thanks, that's perfect.

 

 

Link to comment

 


Er, on second thought: I currently have a monthly parity check scheduled to run at midnight, which probably won't finish by 4:40 AM. Would the daily smarthistory report at 4:40 AM cause a problem during the monthly parity check?

 

 

 

 
