queeg Posted March 2, 2010
Has anyone tried HDSentinel for hard drive health tests? I just came across it today, and they have a free command-line version for Linux that I downloaded and ran on unRAID. It's telling me one of my drives has a health rating of 40%. I'm looking for reviews of this tool to see how reliable it is. I also downloaded the Windows version, and it's really interesting so far. http://www.hdsentinel.com/download_hard_disk_sentinel.php
Joe L. Posted March 2, 2010
Looks very interesting... Can you run smartctl -a -d ata /dev/??? on the disk that reported a 40% rating? I'd love to see what they consider in their evaluation.
WeeboTech Posted March 2, 2010
What a great tool. We should look into this more and post SMART logs for the drives with lower reported lifetimes to gather stats. It could help with automating checks later on. This looks like a great tool to use in a cron monitor.
queeg Posted March 2, 2010
Below is the output of HDSentinel. Attached is the smartctl output for /dev/sdb.

root@Queeg:~# /mnt/user/Queeg/downloads/hdSentinel/HDSentinel
Hard Disk Sentinel for LINUX console 0.03 © 2008-2009 [email protected]
Start with -r [reportfile] to save data to report, -h for help
Examining hard disk configuration ...

HDD Device 0: /dev/sda
HDD Model ID : ST3500320AS
HDD Serial No: 9QM0EAHW
HDD Revision : SD15
HDD Size : 476940 MB
Interface : S-ATA
Temperature : 29 °C
Health : 100 %
Performance : 100 %
Power on time: 279 days, 12 hours
Est. lifetime: more than 1000 days

HDD Device 1: /dev/sdb
HDD Model ID : ST3500320AS
HDD Serial No: 9QM26B35
HDD Revision : SD15
HDD Size : 476940 MB
Interface : S-ATA
Temperature : 30 °C
Health : 40 %
Performance : 100 %
Power on time: 443 days, 17 hours
Est. lifetime: 221 days

HDD Device 2: /dev/sdc
HDD Model ID : ST3500320AS
HDD Serial No: 5QM01S58
HDD Revision : SD04
HDD Size : 476940 MB
Interface : S-ATA
Temperature : 30 °C
Health : 97 %
Performance : 100 %
Power on time: 490 days, 5 hours
Est. lifetime: more than 1000 days

HDD Device 3: /dev/sdd
HDD Model ID : Lexar JD FireFly
HDD Serial No: ?
HDD Revision : 1100
HDD Size : 1911 MB
Interface : SCSI
Temperature : Unknown °C
Health : Unknown %
Performance : Unknown %
Power on time:
Est. lifetime:

root@Queeg:~# cp /mnt/user/Queeg/downloads/hdSentinel/HDSentinel /boot
root@Queeg:~#

Attachment: smartctl.txt
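For anyone wanting to try the cron-monitor idea mentioned earlier in the thread, here is a minimal sketch of how the console report could be parsed and checked. It assumes the tool prints one field per line, as it does in a terminal; the function names and the 50% threshold are hypothetical choices for illustration, not anything HDSentinel defines.

```python
import re

# Two device blocks taken from the report in this post, trimmed to the
# fields used here and laid out one field per line.
SAMPLE = """\
HDD Device 0: /dev/sda
HDD Model ID : ST3500320AS
Health : 100 %
Est. lifetime: more than 1000 days
HDD Device 1: /dev/sdb
HDD Model ID : ST3500320AS
Health : 40 %
Est. lifetime: 221 days
"""


def parse_report(text):
    """Split console output into one dict per 'HDD Device N:' block."""
    devices = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = re.match(r"HDD Device\s+(\d+):\s*(\S+)", line)
        if header:
            current = {"index": int(header.group(1)), "device": header.group(2)}
            devices.append(current)
            continue
        if current is None or ":" not in line:
            continue  # banner lines before the first device block
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return devices


def health_percent(device):
    """Return the health as an int, or None when it is reported as 'Unknown'."""
    match = re.match(r"(\d+)\s*%", device.get("Health", ""))
    return int(match.group(1)) if match else None


def low_health(devices, threshold=50):
    """List (device, health) pairs whose health is below the threshold."""
    flagged = []
    for device in devices:
        health = health_percent(device)
        if health is not None and health < threshold:
            flagged.append((device["device"], health))
    return flagged


print(low_health(parse_report(SAMPLE)))
```

A cron job could run the tool, feed its output to `parse_report`, and send a mail whenever `low_health` returns anything. Devices that report "Unknown" health, such as the flash drive, are skipped rather than flagged.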
WeeboTech Posted March 2, 2010
Thanks for posting. It's very interesting to see the results. After reviewing them, I wonder how they came up with the 40% health and a target of 221 days. There must be a threshold somewhere that they use to calculate this.
queeg Posted March 2, 2010
I ran it a minute ago and now the Est. lifetime is 220 days. It has picked a specific date: Friday, October 8, 2010.
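As a quick sanity check on that date, the arithmetic can be reproduced with Python's standard library. This assumes the tool was run on the day of the post (March 2, 2010), which the thread does not state explicitly.

```python
from datetime import date, timedelta

run_date = date(2010, 3, 2)       # assumed: the day the post was made
estimated_days = 220              # "Est. lifetime" from the second run

predicted = run_date + timedelta(days=estimated_days)

# Build the label by hand so the day is not zero-padded.
label = f"{predicted:%A, %B} {predicted.day}, {predicted.year}"
print(label)
```

Under that assumption, 220 days from March 2 lands on the same Friday the tool reported, so the "Est. lifetime" figure looks like a plain day count to a fixed date.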
WeeboTech Posted March 2, 2010
queeg wrote: "I ran it a minute ago and now the Est. lifetime is 220 days. It's picked a specific date. Friday, October 8, 2010"
It's a conspiracy... it's destined to fail a few days after the warranty expires. j/k
Romir Posted March 2, 2010
I used the Windows version to monitor head-parking totals on my netbook domain server. The heads were parking 6 times a minute, 60,000 times a week, so I ended up disabling that with smartctl. Power usage went up by 1 W to ... 6 W total, with gigabit and 2 GB of RAM!
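The weekly figure follows directly from the per-minute rate; a two-line check, assuming the drive parks at that rate around the clock:

```python
parks_per_minute = 6
minutes_per_week = 60 * 24 * 7           # 10,080 minutes in a week

parks_per_week = parks_per_minute * minutes_per_week
print(parks_per_week)                    # about 60,000, as quoted
```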
prostuff1 Posted March 6, 2010
For grins and giggles I went ahead and ran this tool on my server just a little while ago. Here is the output:

root@Tower:~# /boot/hdsentinel
Hard Disk Sentinel for LINUX console 0.03 (c) 2008-2009 [email protected]
Start with -r [reportfile] to save data to report, -h for help
Examining hard disk configuration ...

HDD Device 0: /dev/sda
HDD Model ID : LEXAR JD FIREFLY
HDD Serial No: ?
HDD Revision : 1100
HDD Size : 967 MB
Interface : SCSI
Temperature : Unknown °C
Health : Unknown %
Performance : Unknown %
Power on time:
Est. lifetime:

HDD Device 1: /dev/sdb
HDD Model ID : ST31500341AS
HDD Serial No: 6VS04GWN
HDD Revision : CC3G
HDD Size : 1430799 MB
Interface : S-ATA II
Temperature : 24 °C
Health : 79 %
Performance : 100 %
Power on time: 373 days, 7 hours
Est. lifetime: 906 days

HDD Device 2: /dev/sdc
HDD Model ID : ST31000333AS
HDD Serial No: 6TE0G93C
HDD Revision : CC1F
HDD Size : 953870 MB
Interface : S-ATA II
Temperature : 31 °C
Health : 99 %
Performance : 100 %
Power on time: 36 days, 6 hours
Est. lifetime: more than 1000 days

HDD Device 3: /dev/sdd
HDD Model ID : Hitachi HDS722020ALA330
HDD Serial No: JK1121YAGABMMS
HDD Revision : JKAOA20N
HDD Size : 1907729 MB
Interface : S-ATA II
Temperature : 30 °C
Health : 100 %
Performance : 100 %
Power on time: 97 days, 5 hours
Est. lifetime: more than 1000 days

HDD Device 4: /dev/sde
HDD Model ID : ST3750640AS
HDD Serial No: 5QD5ELLF
HDD Revision : 3.AAE
HDD Size : 715405 MB
Interface : S-ATA II
Temperature : 35 °C
Health : 100 %
Performance : 100 %
Power on time: 428 days, 12 hours
Est. lifetime: more than 1000 days

HDD Device 5: /dev/sdf
HDD Model ID : WDC WD5000AAKS-00TMA0
HDD Serial No: WD-WCAPW2595673
HDD Revision : 12.01C01
HDD Size : 476940 MB
Interface : S-ATA II
Temperature : 26 °C
Health : 100 %
Performance : 100 %
Power on time: 852 days, 10 hours
Est. lifetime: more than 972 days

HDD Device 6: /dev/sdg
HDD Model ID : WDC WD5000AAKS-00TMA0
HDD Serial No: WD-WCAPW2132942
HDD Revision : 12.01C01
HDD Size : 476940 MB
Interface : S-ATA II
Temperature : 23 °C
Health : 100 %
Performance : 100 %
Power on time: 763 days, 23 hours
Est. lifetime: more than 1000 days

HDD Device 7: /dev/sdh
HDD Model ID : SAMSUNG HD753LJ
HDD Serial No: S13UJ1MQ330294
HDD Revision : 1AA01110
HDD Size : 774090 MB
Interface : S-ATA II
Temperature : 17 °C
Health : 97 %
Performance : 100 %
Power on time: 600 days, 1 hours
Est. lifetime: more than 1000 days

HDD Device 8: /dev/sdi
HDD Model ID : WDC WD1600JD-40HBC0
HDD Serial No: WD-WCAL94036151
HDD Revision : 21.02J21
HDD Size : 178861 MB
Interface : S-ATA
Temperature : 31 °C
Health : 100 %
Performance : 100 %
Power on time: 1363 days, 20 hours
Est. lifetime: more than 461 days

HDD Device 9: /dev/sdj
HDD Model ID : Hitachi HDS722020ALA330
HDD Serial No: JK1131YAG93KBV
HDD Revision : JKAOA20N
HDD Size : 1907729 MB
Interface : S-ATA II
Temperature : 33 °C
Health : 100 %
Performance : 100 %
Power on time: 97 days, 5 hours
Est. lifetime: more than 1000 days

It looks like it considers all my drives good.
prostuff1 Posted March 6, 2010
For another comparison, I have another tower (TestTower) that I have been messing with, moving very, very old drives in and out of it. I ran HDSentinel on that tower and here is the output I got back:

root@TestTower:~# /boot/hdsentinel
Hard Disk Sentinel for LINUX console 0.03 (c) 2008-2009 [email protected]
Start with -r [reportfile] to save data to report, -h for help
Examining hard disk configuration ...

HDD Device 0: /dev/hda
HDD Model ID : WDC WD300EB-75CPF0
HDD Serial No: WD-WMAATA036162
HDD Revision : 06.04G06
HDD Size : 33550 MB
Interface : IDE/ATA
Temperature : Unknown °C
Health : 93 %
Performance : 100 %
Power on time: 419 days, 3 hours
Est. lifetime: more than 1000 days

HDD Device 1: /dev/hdb
HDD Model ID : ST360015A
HDD Serial No: 3KC21QHW
HDD Revision : 3.33
HDD Size : 57242 MB
Interface : IDE/ATA
Temperature : 35 °C
Health : 100 %
Performance : 100 %
Power on time: 919 days, 8 hours
Est. lifetime: more than 905 days

HDD Device 2: /dev/sda
HDD Model ID : WDC WD1200JD-75GBB0
HDD Serial No: WD-WMAET1372951
HDD Revision : 02.05D02
HDD Size : 134110 MB
Interface : S-ATA
Temperature : 35 °C
Health : 4 %
Performance : 100 %
Power on time: 1215 days, 13 hours
Est. lifetime: 4 days

HDD Device 3: /dev/sdb
HDD Model ID : Verbatim STORE N GO
HDD Serial No: PMAP1234
HDD Revision : 5.00
HDD Size : 7631 MB
Interface : SCSI
Temperature : Unknown °C
Health : Unknown %
Performance : Unknown %
Power on time:
Est. lifetime:

As you can see, the one drive is basically dying. I knew this to begin with, though I was hoping to save it so that I could at least use it in this test machine and have as close to a "complete" test system as possible.

I am really hoping that a 5.0 beta is coming soon, but I realize there are more important things to catch up on: all the orders and now the inclusion of the Supermicro card to support, to name a few.
sdballer Posted March 7, 2010
I've used this program on my Windows machines for about 2 years, and it has worked very well.
purko Posted March 8, 2010
prostuff1 wrote: "For another comparison I have another tower (TestTower) that I have been messing with and have been moving very very old drives in and out of. I ran HDSentinel on that tower and here is the output I got back:

HDD Device 2: /dev/sda
HDD Model ID : WDC WD1200JD-75GBB0
HDD Serial No: WD-WMAET1372951
HDD Revision : 02.05D02
HDD Size : 134110 MB
Interface : S-ATA
Temperature : 35 °C
Health : 4 %
Performance : 100 %
Power on time: 1215 days, 13 hours
Est. lifetime: 4 days

As you can see the one drive is basically dying. I know this to begin with [...] inclusion of the Supermicro card to support to name a few."

It's funny how it estimates its remaining lifetime to be 4 days. Call us back in four days, will you?
prostuff1 Posted March 8, 2010
purko wrote: "It's funny how it estimates its remaining lifetime to be 4 days. Call us back in four days, will you?"
It was close, and it may have lasted had I not tried to run preclear on it. I had it running, and it was taking its good sweet time. The drive was so SOL that it was approaching the 48-hour mark and was only 50% done with the first step of preclear. I gave up on it and cancelled the preclear. I am going to take it apart tonight to see what is inside the drive.
WeeboTech Posted March 8, 2010
prostuff1 wrote: "It was close, and it may have lasted had I not tried to run preclear on it. I had it running, and it was taking its good sweet time. The drive was so SOL that it was going on the 48 hour mark and was only done with 50% of the first step in preclear. I gave up on it and cancelled the preclear. I am going to take it apart tonight to see what is inside the drive."
Can you post the SMART logs for this drive? I would be interested in seeing how it calculates 4% health.
prostuff1 Posted March 8, 2010 Share Posted March 8, 2010 Can you post the smartlogs on this drive. I would be interested in seeing how it calculates 4% health. Your wish is my command: root@TestTower:~# smartctl -a -d ata /dev/sda smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Model Family: Western Digital Caviar SE Serial ATA family Device Model: WDC WD1200JD-75GBB0 Serial Number: WD-WMAET1372951 Firmware Version: 02.05D02 User Capacity: 120,000,000,000 bytes Device is: In smartctl database [for details use: -P show] ATA Version is: 6 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Mon Mar 8 10:40:48 2010 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: FAILED! Drive failure expected in less than 24 hours. SAVE ALL DATA. See vendor-specific Attribute list for failed Attributes. General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 121) The previous self-test completed having the read element of the test failed. Total time to complete Offline data collection: (3796) seconds. Offline data collection capabilities: (0x79) SMART execute Offline immediate. No Auto Offline data collection support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. No General Purpose Logging support. Short self-test routine recommended polling time: ( 2) minutes. 
Extended self-test routine recommended polling time: ( 53) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 001 001 051 Pre-fail Always FAILING_NOW 31473 3 Spin_Up_Time 0x0007 146 144 021 Pre-fail Always - 3241 4 Start_Stop_Count 0x0032 100 100 040 Old_age Always - 388 5 Reallocated_Sector_Ct 0x0033 187 187 140 Pre-fail Always - 195 7 Seek_Error_Rate 0x000b 200 200 051 Pre-fail Always - 0 9 Power_On_Hours 0x0032 060 060 000 Old_age Always - 29218 10 Spin_Retry_Count 0x0013 100 100 051 Pre-fail Always - 0 11 Calibration_Retry_Count 0x0013 100 100 051 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 387 194 Temperature_Celsius 0x0022 107 253 000 Old_age Always - 43 196 Reallocated_Event_Count 0x0032 178 178 000 Old_age Always - 22 197 Current_Pending_Sector 0x0012 165 165 000 Old_age Always - 695 198 Offline_Uncorrectable 0x0012 176 176 000 Old_age Always - 485 199 UDMA_CRC_Error_Count 0x000a 200 253 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0009 134 155 051 Pre-fail Offline - 2136 SMART Error Log Version: 1 ATA Error Count: 27583 (device log contains only the most recent five errors) CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX] ER = Error register [HEX] ST = Status register [HEX] Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days. 
Error 27583 occurred at disk power-on lifetime: 807 hours (33 days + 15 hours) When the command that caused the error occurred, the device was doing SMART Offline or Self-test. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 01 00 00 e0 Error: Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- 00 00 c8 00 00 08 00 00 1d+09:48:57.450 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+09:48:57.450 NOP [Abort queued commands] 00 00 ef 00 00 45 00 00 1d+09:48:57.450 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+09:48:57.450 NOP [Abort queued commands] Error 27582 occurred at disk power-on lifetime: 803 hours (33 days + 11 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 ed 1d 4a e5 Error: Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- 00 00 c8 00 00 08 00 00 1d+05:40:44.100 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:44.100 NOP [Abort queued commands] 00 00 ef 00 00 44 00 00 1d+05:40:44.100 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:44.100 NOP [Abort queued commands] 00 00 c8 00 00 08 00 00 1d+05:40:44.100 NOP [Abort queued commands] Error 27581 occurred at disk power-on lifetime: 803 hours (33 days + 11 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 ed 1d 4a e5 Error: Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- 00 00 c8 00 00 08 00 00 1d+05:40:42.000 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:42.000 NOP [Abort queued commands] 00 00 ef 00 00 44 00 00 1d+05:40:42.000 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:42.000 NOP [Abort queued commands] 00 00 c8 00 00 08 00 00 1d+05:40:42.000 NOP [Abort queued commands] Error 27580 occurred at disk power-on lifetime: 803 hours (33 days + 11 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 ed 1d 4a e5 Error: Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- 00 00 c8 00 00 08 00 00 1d+05:40:39.850 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:39.850 NOP [Abort queued commands] 00 00 ef 00 00 44 00 00 1d+05:40:39.850 NOP [Abort queued commands] 00 00 27 00 00 00 00 00 1d+05:40:39.850 NOP [Abort queued commands] 05 00 4a 00 00 e8 1d 00 1d+05:40:39.850 [RESERVED] Error 27579 occurred at disk power-on lifetime: 803 hours (33 days + 11 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 ed 1d 4a e5 Error: Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- 00 00 c8 00 00 08 00 00 1d+05:40:37.750 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:37.750 NOP [Abort queued commands] 00 00 ef 00 00 44 00 00 1d+05:40:37.750 NOP [Abort queued commands] 00 00 00 00 00 00 00 00 1d+05:40:37.750 NOP [Abort queued commands] 00 00 c8 00 00 08 00 00 1d+05:40:37.750 NOP [Abort queued commands] SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended offline Completed: read failure 90% 774 122951383 # 2 Short offline Completed without error 00% 430 - # 3 Short offline Completed without error 00% 0 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Quote Link to comment
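The thread never establishes how HDSentinel arrives at 4%, but the smartctl report itself can be read mechanically. In smartctl's attribute table, an attribute is considered failed when its normalized VALUE is at or below THRESH. The sketch below applies that rule to a few rows copied from the output above and also lists the sector-related raw counts; the function names are hypothetical, and real smartctl output would need the rows cut out of the full report first.

```python
# Rows copied from the attribute table in the post above.
ATTRIBUTE_ROWS = """\
  1 Raw_Read_Error_Rate     0x000b 001 001 051 Pre-fail Always  FAILING_NOW 31473
  5 Reallocated_Sector_Ct   0x0033 187 187 140 Pre-fail Always  -           195
197 Current_Pending_Sector  0x0012 165 165 000 Old_age  Always  -           695
198 Offline_Uncorrectable   0x0012 176 176 000 Old_age  Always  -           485
200 Multi_Zone_Error_Rate   0x0009 134 155 051 Pre-fail Offline -           2136
"""


def parse_attributes(text):
    """Turn whitespace-separated attribute rows into dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip headers and anything that is not an attribute row
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": int(parts[9]),
        })
    return rows


def failing(rows):
    """Attributes whose normalized value is at or below the threshold."""
    return [r["name"] for r in rows if r["value"] <= r["thresh"]]


def bad_sector_counts(rows):
    """Non-zero raw counts for reallocated, pending and uncorrectable sectors."""
    watched = {5, 197, 198}
    return {r["name"]: r["raw"] for r in rows if r["id"] in watched and r["raw"] > 0}


rows = parse_attributes(ATTRIBUTE_ROWS)
print(failing(rows))
print(bad_sector_counts(rows))
```

Only Raw_Read_Error_Rate trips its threshold here, which matches the FAILING_NOW flag in the report. The reallocated, pending and uncorrectable counts are all non-zero as well, even though their normalized values are still above threshold. Whether HDSentinel weighs those raw counts in its health figure is not something this thread confirms.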
JudyZ Posted March 8, 2015
Yup, it's 5 years since the last post.

I'm running HDSentinel Enterprise to monitor SMART info for all my Windows machines and would like to monitor unRAID drives too. I'm struggling to install HDSentinel under unRAID 5.0-rc12a and would be grateful for guidance. The instructions at http://www.hdsentinel.com/hard_disk_sentinel_linux.php are terminal-based and suggest:

chmod 755 ./HDS.zip
sudo ./HDS.zip
etc.

HDS.zip is in /boot. chmod works, but running the file gives 'cannot execute binary file'. So what am I missing, please? And how would I make it auto-start at boot time?

Many thanks, Judy
c3 Posted March 8, 2015
1) Please update to a supported version of unRAID. You are running beta software, and the released version is readily available.
2) The file you downloaded needs to be decompressed. The .zip indicates a compressed file. This step is listed on the website you linked.
JudyZ Posted March 9, 2015
Thanks for the rapid reply. The linked website says to 'double click to open and decompress it to any folder'. I tried that first under Windows, then copied the unzipped file to /boot and ran:

chmod 755 ./HDS
sudo ./HDS

but I get the same 'cannot execute binary file'. And how does one arrange for it to auto-start at boot, please?
JudyZ Posted March 9, 2015
Thanks for your help. My error was thinking unRAID was 64-bit rather than 32-bit. It turns out that HDSentinel has its execute bits set when copied over from Windows, and it runs fine from the USB key - a relief!

If someone could advise how to make HDSentinel start automatically, that would be cool. To see its output in the HDSentinel server on a Windows machine, I need to find out how to make it talk on port 61230. Does unRAID ship with a firewall active?

Sample output from an out-of-date unRAID test server attached.

Cheers, Judy

Attachment: HDS.txt
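Both causes of 'cannot execute binary file' that came up here (running the .zip itself, and running a 64-bit build on a 32-bit system) can be spotted from the first few bytes of the file. This is a small sketch, not part of HDSentinel: an ELF binary starts with the bytes 0x7F 'E' 'L' 'F', and the fifth byte is 1 for 32-bit or 2 for 64-bit, while ZIP archives start with 'PK'. The function names are made up for illustration.

```python
def describe_header(header: bytes) -> str:
    """Classify a file from its first five bytes."""
    if len(header) >= 5 and header[:4] == b"\x7fELF":
        # Fifth byte of an ELF file is the class: 1 = 32-bit, 2 = 64-bit.
        bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
        return f"ELF {bits}"
    if header[:2] == b"PK":
        return "ZIP archive (unpack it before running)"
    return "unrecognised"


def describe_file(path: str) -> str:
    """Read the first five bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return describe_header(handle.read(5))


print(describe_header(b"\x7fELF\x02"))
print(describe_header(b"PK\x03\x04\x14"))
```

Calling `describe_file` on the downloaded file and comparing the result with the system's own word size would show a mismatch before anything is executed.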
trurl Posted March 10, 2015
JudyZ wrote: "Yup, it's 5 years since the last post"
A lot has happened in those 5 years. Many of the people who were on the thread back then are still around, and they are getting some of the same functionality just by using the tools built into the latest v6 beta and its webGUI. It's also 64-bit.
JudyZ Posted March 10, 2015
trurl wrote: "A lot has happened in those 5 years"
Yes, I'll cut over to v6 eventually too. With 20+ machines / 70+ HDDs, I need an HDSentinel-based solution, as it's tough to monitor all that and the Enterprise variant shows all drives on one screen.

HDS rocks, by the way: surface testing shows soft sectors and their locations, so I can watch the drives age and replace them before they die. Two drives have soft cylinders, so I have repartitioned them to use only the good areas - not on unRAID though!
marcsayer Posted March 13, 2015
HDSentinel is a great program. I use it on all 10 of my computers. There is a Pro Windows version, an Enterprise version (server), a Linux version, a Linux daemon version, a home version, and a free trial version that never expires (it just lacks some of the higher-order functions).

The guy who runs the company and develops the program, Janos, is a gem of a human being. He is as valuable as the product itself. Finding nice, helpful technical experts who will actually talk with you is just refreshing. Even back when I was only using the trial version, Janos was more than happy to help me problem-solve. He is why I went so strongly with HDS, along with the program itself and its capabilities. And there are others here who feel the same way, as JudyZ's comments will attest.

I liked Janos' HDS program so much that I installed it on my Linux computers too. But installation and use were strictly manual, command-line stuff in Linux. And it overwrites the report each time you run it, unless you manually rename the old report before you run HDS again. So I wrote a couple of simple scripts to help automate the installation, put launchers in a couple of logical places, and give you an automated way to run the program, generate a new report, and save it automatically with the date/time in the filename, thereby building a library of searchable reports.

HDSentinel is a gem; it does a great many basic tasks really well. Plus it helps you avoid catastrophic failures and premature retirements of drives, and gives you options for reclaiming or salvaging a disk with issues.

http://www.hdsentinel.com/ -- general info and links
http://www.hdsentinel.com/add-on-linux-installers.php -- the program bundled with my installers and scripts

Now Janos, Judy and I are working on a way to include the HDS daemon on an unRAID USB drive, have it run automatically at startup, and have its output be monitored by the Enterprise (server) version of HDS, either in Docker or remotely.
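To illustrate the timestamped-report idea described in this post, here is a rough sketch. It is not the poster's actual script. It relies only on the `-r [reportfile]` option shown in the tool's own banner; the binary location, report directory and file-name pattern are assumptions chosen for the example.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def report_path(directory, now=None):
    """Build a report file name that embeds the date and time."""
    now = now or datetime.now()
    return Path(directory) / f"hdsentinel-{now:%Y%m%d-%H%M%S}.txt"


def save_report(binary="/boot/HDSentinel", directory="/boot/hds-reports"):
    """Run the tool with -r so each run writes to a fresh file.

    Both default paths are hypothetical; adjust them to wherever the
    binary and the reports actually live.
    """
    target = report_path(directory)
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run([binary, "-r", str(target)], check=True)
    return target


# Example of the naming scheme only; nothing is executed here.
print(report_path("/boot/hds-reports", datetime(2015, 3, 13, 21, 46, 41)))
```

Because every run gets its own file name, older reports are never overwritten and can be searched or compared later.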
ljm42 Posted March 13, 2015
This would be great as a plugin, where it could potentially be incorporated into the Disk Health tab in the GUI and send notifications when it detects problems.
flaggart Posted March 13, 2015
What I am about to describe might be purely a coincidence, but it does not feel like it. I ran HDSentinel on my 5.0.6 box and afterwards a disk showed as "disabled" in the devices list. Stopping and restarting the array did not fix it, nor did rebooting. Here is the complete output from running HDSentinel - you can see that it could not read temperature or health data from /dev/sdh the first time, and the second time it failed to read any data at all:
root@BIGBOX:/mnt/cache/sentintel# ./HDSentinel Hard Disk Sentinel for LINUX console 0.08 (c) 2008-2011 [email protected] Start with -r [reportfile] to save data to report, -h for help Examining hard disk configuration ... HDD Device 0: /dev/sda HDD Model ID : SanDisk Cruzer Edge HDD Serial No: 20051740710F5C01ADC9 HDD Revision : 1.26 HDD Size : 7633 MB Interface : SCSI Temperature : Unknown °C Highest Temp.: Unknown °C Health : Unknown % Performance : Unknown % Power on time: Est. lifetime: HDD Device 1: /dev/sdb HDD Model ID : ST3000DM001-9YN166 HDD Serial No: W1F101TJ HDD Revision : CC9F HDD Size : 2861588 MB Interface : S-ATA II Temperature : 26 °C Highest Temp.: 48 °C Health : 100 % Performance : 100 % Power on time: 565 days, 15 hours Est. lifetime: more than 1000 days HDD Device 2: /dev/sdc HDD Model ID : WDC WD20EARX-00MMMB0 HDD Serial No: WD-WCAWZ2783588 HDD Revision : 80.00A80 HDD Size : 1907729 MB Interface : S-ATA II Temperature : 25 °C Highest Temp.: 48 °C Health : 100 % Performance : 100 % Power on time: 637 days, 9 hours Est. lifetime: more than 1000 days HDD Device 3: /dev/sdd HDD Model ID : SAMSUNG HD204UI HDD Serial No: S2H7J1BZB29336 HDD Revision : 1AQ10001 HDD Size : 1907729 MB Interface : S-ATA II Temperature : 24 °C Highest Temp.: 45 °C Health : 100 % Performance : 100 % Power on time: 1154 days, 14 hours Est.
lifetime: more than 670 days HDD Device 4: /dev/sde HDD Model ID : ST4000DM000-1F2168 HDD Serial No: S300PVJ0 HDD Revision : CC54 HDD Size : 3815448 MB Interface : S-ATA II Temperature : 27 °C Highest Temp.: 49 °C Health : 100 % Performance : 100 % Power on time: 121 days, 2 hours Est. lifetime: more than 1000 days HDD Device 5: /dev/sdf HDD Model ID : Samsung SSD 840 EVO 250GB HDD Serial No: S1DBNSADA51208J HDD Revision : EXT0BB0Q HDD Size : 238475 MB Interface : S-ATA II Temperature : 24 °C Highest Temp.: 40 °C Health : 100 % Performance : 100 % Power on time: 2 days, 12 hours, 40 minutes Est. lifetime: more than 1000 days HDD Device 6: /dev/sdg HDD Model ID : Hitachi HDS5C3020ALA632 HDD Serial No: ML0220F30R1MXD HDD Revision : ML6OA580 HDD Size : 1907729 MB Interface : S-ATA II Temperature : 24 °C Highest Temp.: 42 °C Health : 100 % Performance : 100 % Power on time: 1108 days, 16 hours Est. lifetime: more than 716 days HDD Device 7: /dev/sdh HDD Model ID : WDC WD20EADS-00R6B0 HDD Serial No: WD-WCAVY1187400 HDD Revision : 01.00A01 HDD Size : 1907729 MB Interface : S-ATA II Temperature : Unknown °C Highest Temp.: Unknown °C Health : Unknown % Performance : Unknown % Power on time: Est. lifetime: HDD Device 8: /dev/sdi HDD Model ID : ST4000DM000-1F2168 HDD Serial No: S300N8RB HDD Revision : CC54 HDD Size : 3815448 MB Interface : S-ATA II Temperature : 22 °C Highest Temp.: 49 °C Health : 100 % Performance : 100 % Power on time: 121 days, 8 hours Est. lifetime: more than 1000 days HDD Device 9: /dev/sdj HDD Model ID : ST4000DM000-1F2168 HDD Serial No: S3008WD3 HDD Revision : CC54 HDD Size : 3815448 MB Interface : S-ATA II Temperature : 23 °C Highest Temp.: 40 °C Health : 100 % Performance : 100 % Power on time: 36 days, 10 hours Est. 
lifetime: more than 1000 days

HDD Device 10: /dev/sdk
HDD Model ID : ST2000DL003-9VT166
HDD Serial No: 5YD6G1TX
HDD Revision : CC3C
HDD Size     : 1907729 MB
Interface    : S-ATA II
Temperature  : 24 °C
Highest Temp.: 43 °C
Health       : 100 %
Performance  : 100 %
Power on time: 1963 days, 0 hours
Est. lifetime: more than 100 days

HDD Device 11: /dev/sdl
HDD Model ID : WDC WD30EZRX-00D8PB0
HDD Serial No: WD-WCC4N0115460
HDD Revision : 80.00A80
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 24 °C
Highest Temp.: 42 °C
Health       : 100 %
Performance  : 100 %
Power on time: 315 days, 9 hours
Est. lifetime: more than 1000 days

HDD Device 12: /dev/sdm
HDD Model ID : ST3000DM001-9YN166
HDD Serial No: W1F1A2J9
HDD Revision : CC9F
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 24 °C
Highest Temp.: 48 °C
Health       : 100 %
Performance  : 100 %
Power on time: 513 days, 17 hours
Est. lifetime: more than 1000 days

HDD Device 13: /dev/sdn
HDD Model ID : TOSHIBA DT01ACA300
HDD Serial No: 531SBGZGS
HDD Revision : MX6OABB0
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 27 °C
Highest Temp.: 47 °C
Health       : 100 %
Performance  : 100 %
Power on time: 394 days, 1 hours
Est. lifetime: more than 1000 days

root@BIGBOX:/mnt/cache/sentintel# ./HDSentinel
Hard Disk Sentinel for LINUX console 0.08 (c) 2008-2011 [email protected]
Start with -r [reportfile] to save data to report, -h for help
Examining hard disk configuration ...

HDD Device 0: /dev/sda
HDD Model ID : SanDisk Cruzer Edge
HDD Serial No: 20051740710F5C01ADC9
HDD Revision : 1.26
HDD Size     : 7633 MB
Interface    : SCSI
Temperature  : Unknown °C
Highest Temp.: Unknown °C
Health       : Unknown %
Performance  : Unknown %
Power on time:
Est. lifetime:

HDD Device 1: /dev/sdb
HDD Model ID : ST3000DM001-9YN166
HDD Serial No: W1F101TJ
HDD Revision : CC9F
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 28 °C
Highest Temp.: 48 °C
Health       : 100 %
Performance  : 100 %
Power on time: 565 days, 15 hours
Est. lifetime: more than 1000 days

HDD Device 2: /dev/sdc
HDD Model ID : WDC WD20EARX-00MMMB0
HDD Serial No: WD-WCAWZ2783588
HDD Revision : 80.00A80
HDD Size     : 1907729 MB
Interface    : S-ATA II
Temperature  : 27 °C
Highest Temp.: 48 °C
Health       : 100 %
Performance  : 100 %
Power on time: 637 days, 9 hours
Est. lifetime: more than 1000 days

HDD Device 3: /dev/sdd
HDD Model ID : SAMSUNG HD204UI
HDD Serial No: S2H7J1BZB29336
HDD Revision : 1AQ10001
HDD Size     : 1907729 MB
Interface    : S-ATA II
Temperature  : 25 °C
Highest Temp.: 45 °C
Health       : 100 %
Performance  : 100 %
Power on time: 1154 days, 14 hours
Est. lifetime: more than 670 days

HDD Device 4: /dev/sde
HDD Model ID : ST4000DM000-1F2168
HDD Serial No: S300PVJ0
HDD Revision : CC54
HDD Size     : 3815448 MB
Interface    : S-ATA II
Temperature  : 27 °C
Highest Temp.: 49 °C
Health       : 100 %
Performance  : 100 %
Power on time: 121 days, 2 hours
Est. lifetime: more than 1000 days

HDD Device 5: /dev/sdf
HDD Model ID : Samsung SSD 840 EVO 250GB
HDD Serial No: S1DBNSADA51208J
HDD Revision : EXT0BB0Q
HDD Size     : 238475 MB
Interface    : S-ATA II
Temperature  : 28 °C
Highest Temp.: 40 °C
Health       : 100 %
Performance  : 100 %
Power on time: 2 days, 12 hours, 40 minutes
Est. lifetime: more than 1000 days

HDD Device 6: /dev/sdg
HDD Model ID : Hitachi HDS5C3020ALA632
HDD Serial No: ML0220F30R1MXD
HDD Revision : ML6OA580
HDD Size     : 1907729 MB
Interface    : S-ATA II
Temperature  : 25 °C
Highest Temp.: 42 °C
Health       : 100 %
Performance  : 100 %
Power on time: 1108 days, 16 hours
Est. lifetime: more than 716 days

HDD Device 7: /dev/sdh
HDD Model ID : ?
HDD Serial No: ?
HDD Revision : ?
HDD Size     : 0 MB
Interface    : SCSI
Temperature  : Unknown °C
Highest Temp.: Unknown °C
Health       : Unknown %
Performance  : Unknown %
Power on time:
Est. lifetime:

HDD Device 8: /dev/sdi
HDD Model ID : ST4000DM000-1F2168
HDD Serial No: S300N8RB
HDD Revision : CC54
HDD Size     : 3815448 MB
Interface    : S-ATA II
Temperature  : 23 °C
Highest Temp.: 49 °C
Health       : 100 %
Performance  : 100 %
Power on time: 121 days, 8 hours
Est. lifetime: more than 1000 days

HDD Device 9: /dev/sdj
HDD Model ID : ST4000DM000-1F2168
HDD Serial No: S3008WD3
HDD Revision : CC54
HDD Size     : 3815448 MB
Interface    : S-ATA II
Temperature  : 24 °C
Highest Temp.: 40 °C
Health       : 100 %
Performance  : 100 %
Power on time: 36 days, 10 hours
Est. lifetime: more than 1000 days

HDD Device 10: /dev/sdk
HDD Model ID : ST2000DL003-9VT166
HDD Serial No: 5YD6G1TX
HDD Revision : CC3C
HDD Size     : 1907729 MB
Interface    : S-ATA II
Temperature  : 25 °C
Highest Temp.: 43 °C
Health       : 100 %
Performance  : 100 %
Power on time: 1963 days, 0 hours
Est. lifetime: more than 100 days

HDD Device 11: /dev/sdl
HDD Model ID : WDC WD30EZRX-00D8PB0
HDD Serial No: WD-WCC4N0115460
HDD Revision : 80.00A80
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 24 °C
Highest Temp.: 42 °C
Health       : 100 %
Performance  : 100 %
Power on time: 315 days, 9 hours
Est. lifetime: more than 1000 days

HDD Device 12: /dev/sdm
HDD Model ID : ST3000DM001-9YN166
HDD Serial No: W1F1A2J9
HDD Revision : CC9F
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 25 °C
Highest Temp.: 48 °C
Health       : 100 %
Performance  : 100 %
Power on time: 513 days, 17 hours
Est. lifetime: more than 1000 days

HDD Device 13: /dev/sdn
HDD Model ID : TOSHIBA DT01ACA300
HDD Serial No: 531SBGZGS
HDD Revision : MX6OABB0
HDD Size     : 2861588 MB
Interface    : S-ATA II
Temperature  : 28 °C
Highest Temp.: 47 °C
Health       : 100 %
Performance  : 100 %
Power on time: 394 days, 1 hours
Est. lifetime: more than 1000 days

root@BIGBOX:/mnt/cache/sentintel# ./HDSentinel -solid
/dev/sda ? ? ? SanDisk_Cruzer_Edge 20051740710F5C01ADC9 7633
/dev/sdb 29 100 13575 ST3000DM001-9YN166 W1F101TJ 2861588
/dev/sdc 27 100 15297 WDC_WD20EARX-00MMMB0 WD-WCAWZ2783588 1907729
/dev/sdd 25 100 27710 SAMSUNG_HD204UI S2H7J1BZB29336 1907729
/dev/sde 27 100 2906 ST4000DM000-1F2168 S300PVJ0 3815448
/dev/sdf 29 100 60 Samsung_SSD_840_EVO_250GB S1DBNSADA51208J 238475
/dev/sdg 25 100 26608 Hitachi_HDS5C3020ALA632 ML0220F30R1MXD 1907729
/dev/sdh ? ? ? ? ? 0
/dev/sdi 24 100 2912 ST4000DM000-1F2168 S300N8RB 3815448
/dev/sdj 24 100 874 ST4000DM000-1F2168 S3008WD3 3815448
/dev/sdk 26 100 47112 ST2000DL003-9VT166 5YD6G1TX 1907729
/dev/sdl 25 100 7569 WDC_WD30EZRX-00D8PB0 WD-WCC4N0115460 2861588
/dev/sdm 26 100 12329 ST3000DM001-9YN166 W1F1A2J9 2861588
/dev/sdn 29 100 9457 TOSHIBA_DT01ACA300 531SBGZGS 2861588
root@BIGBOX:/mnt/cache/sentintel# cd ..
root@BIGBOX:/mnt/cache# cd ..
root@BIGBOX:/mnt# cd ..
root@BIGBOX:/#

Broadcast message from root (Fri Mar 13 21:46:41 2015):

The system is going down for reboot NOW!
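For anyone wanting to use the `-solid` output in a cron monitor, here is a rough Python sketch of how those lines might be parsed. The column meanings are an assumption inferred from the output in this thread, not from HDSentinel documentation: device, temperature, health %, power-on hours, model, serial, size in MB (for /dev/sdb, 13575 hours matches the reported 565 days, 15 hours). The names `DriveStatus`, `parse_solid` and `below_health`, and the health threshold, are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DriveStatus:
    device: str
    temp_c: Optional[int]          # None when the tool prints "?"
    health_pct: Optional[int]
    power_on_hours: Optional[int]
    model: str
    serial: str
    size_mb: Optional[int]


def _num(field: str) -> Optional[int]:
    """Turn a numeric column into an int; "?" (unknown) becomes None."""
    return int(field) if field.isdigit() else None


def parse_solid(text: str) -> List[DriveStatus]:
    """Parse lines shaped like the -solid output posted above.

    Assumed column order (inferred from the thread, not documented):
    device temp health power_on_hours model serial size_mb
    """
    drives = []
    for line in text.splitlines():
        parts = line.split()
        # Skip shell prompts, blank lines, and anything else unexpected.
        if len(parts) != 7 or not parts[0].startswith("/dev/"):
            continue
        dev, temp, health, hours, model, serial, size = parts
        drives.append(DriveStatus(dev, _num(temp), _num(health),
                                  _num(hours), model, serial, _num(size)))
    return drives


def below_health(drives: List[DriveStatus], min_health: int) -> List[DriveStatus]:
    """Return drives whose health is known and under min_health."""
    return [d for d in drives
            if d.health_pct is not None and d.health_pct < min_health]


# Three lines copied from the output above.
SAMPLE = """\
/dev/sda ? ? ? SanDisk_Cruzer_Edge 20051740710F5C01ADC9 7633
/dev/sdb 29 100 13575 ST3000DM001-9YN166 W1F101TJ 2861588
/dev/sdh ? ? ? ? ? 0
"""

if __name__ == "__main__":
    for drive in parse_solid(SAMPLE):
        print(drive.device, drive.health_pct, drive.power_on_hours)
```

Drives that report "?" (the flash drive and the empty /dev/sdh slot here) are kept in the list with `None` values rather than dropped, so a monitor could also flag a disk that has stopped reporting at all.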
flaggart Posted March 13, 2015
I have removed the drive and connected it to a Windows PC. It powers up, the partitions are still there, and I can access the data using RFSTool, so the drive itself seems fine. However, it refuses to work in unRAID: it powers up and is detected, but the array config page says it is disabled, and the temperature is not displayed. There is no option to rebuild data to the drive, so it now seems to be useless in unRAID.
Edit: I followed the procedure at http://lime-technology.com/wiki/index.php/Troubleshooting#Re-enable_the_drive and the data is now being rebuilt. But this seems silly, since the data was still OK!