SP67
-
Posts
69 -
Posts posted by SP67
-
-
Yeah, but the failing drive was part of a cache pool (I have one SSD for app data and one HDD for torrent downloads). So AFAIK the parity would not have worked in this case.
Copying the data from the old drive was just to avoid having to download what hadn’t already moved to the array. If this is not the proper way to do it, please correct me as I’m still learning.
-
Ok, so I shut the server down, added a 4 TB disk, moved the contents of the failing drive to the new one and added the new disk to the cache pool. Then I shut the server down again and removed the old drive.
So far so good, everything is going well.
thanks!
-
Can I stop the array, remove the drive and add a new one? Or do I need to shut down the server?
-
Reported uncorrect has grown from 4 to 10 in less than 24 hours. The drive is probably on its last legs...
For what it's worth, I've found that this drive is from the 7200.14 series from Seagate, which had early-death problems that were supposedly fixed with a later firmware. I never saw this update so the drive has been using the factory firmware since I bought it.
-
Thanks! Can I do that directly on unRAID?
Although it seems I might be looking at buying another drive...
-
Hi,
This morning the server reported some SMART errors for a 2 TB drive I use as a torrent download cache before moving data to the array (I've read that this reduces wear on the array).
The errors are:
ID   Attribute               Flag    Value  Worst  Thresh  Type     Updated  Failed  Raw
187  Reported uncorrect      0x0032  096    096    000     Old age  Always   Never   4
197  Current pending sector  0x0012  100    100    000     Old age  Always   Never   8
198  Offline uncorrectable   0x0010  100    100    000     Old age  Offline  Never   8
I've read online that I might be able to ignore the errors as the drive will just stop using those sectors, but there didn't seem to be much consensus about it.
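A check like the one described can be scripted. The following is a minimal Python sketch, not an Unraid or smartmontools tool: it assumes attribute lines in the short form quoted above (ID first, raw value last), and the function names and the choice of watched IDs are illustrative assumptions. It only reports which of the three attributes have a non-zero raw value; deciding whether the drive is safe to keep is still a judgment call.

```python
# Attribute IDs taken from the post; the mapping itself is an assumption
# made for this sketch, not an official list.
WATCHED = {
    187: "Reported_Uncorrect",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

SAMPLE_LINES = [
    "187 Reported uncorrect 0x0032 096 096 000 Old age Always Never 4",
    "197 Current pending sector 0x0012 100 100 000 Old age Always Never 8",
    "198 Offline uncorrectable 0x0010 100 100 000 Old age Offline Never 8",
]


def parse_attribute_line(line):
    """Return (attribute id, raw value) from one attribute line.

    Assumes the ID is the first field and the raw value is the last one.
    """
    fields = line.split()
    return int(fields[0]), int(fields[-1])


def nonzero_watched(lines):
    """Map the name of each watched attribute to its raw value if non-zero."""
    found = {}
    for line in lines:
        attr_id, raw = parse_attribute_line(line)
        if attr_id in WATCHED and raw > 0:
            found[WATCHED[attr_id]] = raw
    return found


print(nonzero_watched(SAMPLE_LINES))
```

Running the same function on a later report and comparing the two results shows whether the counts are growing, which is usually more telling than a single reading.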
Any suggestion?
Thanks
Full smart report:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.14.15-Unraid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-9YN164
Serial Number:
LU WWN Device Id:
Firmware Version: CC4B
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Mar 10 12:32:32 2022 CET

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121) The previous self-test completed having the read element of the test failed.
Total time to complete Offline data collection: ( 592) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    No Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 247) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   118   099   006    -    170591376
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   097   097   020    -    3850
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   076   060   030    -    45291769
  9 Power_On_Hours          -O--CK   091   091   000    -    8254
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   097   097   020    -    3531
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   096   096   000    -    4
188 Command_Timeout         -O--CK   100   099   000    -    1 3 3
189 High_Fly_Writes         -O-RCK   096   096   000    -    4
190 Airflow_Temperature_Cel -O---K   062   051   045    -    38 (Min/Max 21/45 #1)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    191
193 Load_Cycle_Count        -O--CK   001   001   000    -    242985
194 Temperature_Celsius     -O---K   038   049   000    -    38 (128 0 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    8
198 Offline_Uncorrectable   ----C-   100   100   000    -    8
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    -    4183h+31m+57.974s
241 Total_LBAs_Written      ------   100   253   000    -    183471166860639
242 Total_LBAs_Read         ------   100   253   000    -    81944486528248
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5067  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 4
CR     = Command Register
FEATR  = Features Register
COUNT  = Count (was: Sector Count) Register
LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
LH     = LBA High (was: Cylinder High) Register    ]   LBA
LM     = LBA Mid (was: Cylinder Low) Register      ] Register
LL     = LBA Low (was: Sector Number) Register     ]
DV     = Device (was: Device/Head) Register
DC     = Device Control Register
ER     = Error register
ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 4 [3] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 2e 36 f7 38 00 00 Error: WP at LBA = 0x2e36f738 = 775354168
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 00 00 08 00 00 3c a1 01 60 40 00 1d+05:44:31.301 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 6f 00 d0 68 40 00 1d+05:44:31.083 WRITE FPDMA QUEUED
61 00 00 05 20 00 00 6e f4 25 f8 40 00 1d+05:44:31.081 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 3c a1 01 58 40 00 1d+05:44:31.081 WRITE FPDMA QUEUED
61 00 00 00 48 00 00 3c 93 15 10 40 00 1d+05:44:31.081 WRITE FPDMA QUEUED

Error 3 [2] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 2e 36 f7 38 00 00 Error: WP at LBA = 0x2e36f738 = 775354168
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 00 08 40 00 00 3f 2a 19 80 40 00 1d+05:44:28.089 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 3f 40 d9 a8 40 00 1d+05:44:28.089 WRITE FPDMA QUEUED
61 00 00 04 60 00 00 3c 93 0b 38 40 00 1d+05:44:28.088 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 3c a1 01 50 40 00 1d+05:44:28.088 WRITE FPDMA QUEUED
60 00 00 00 08 00 00 2e 36 f7 38 40 00 1d+05:44:28.085 READ FPDMA QUEUED

Error 2 [1] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 2e 36 f7 38 00 00 Error: WP at LBA = 0x2e36f738 = 775354168
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 00 04 c0 00 00 16 e6 3c 48 40 00 1d+05:44:25.346 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 17 1d a5 78 40 00 1d+05:44:25.346 WRITE FPDMA QUEUED
61 00 00 04 00 00 00 6e f4 1a f8 40 00 1d+05:44:25.346 WRITE FPDMA QUEUED
61 00 00 00 08 00 00 6f 00 d0 58 40 00 1d+05:44:25.346 WRITE FPDMA QUEUED
60 00 00 00 08 00 00 2e 36 f7 38 40 00 1d+05:44:25.119 READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 2e 36 f7 38 00 00 Error: UNC at LBA = 0x2e36f738 = 775354168
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 00 08 78 00 00 2e 37 1f 30 40 00 1d+05:44:20.856 READ FPDMA QUEUED
60 00 00 0a 00 00 00 2e 37 15 30 40 00 1d+05:44:20.855 READ FPDMA QUEUED
60 00 00 00 08 00 00 2e 3f 31 20 40 00 1d+05:44:20.855 READ FPDMA QUEUED
60 00 00 03 80 00 00 2e 37 11 a8 40 00 1d+05:44:20.855 READ FPDMA QUEUED
60 00 00 0a 00 00 00 2e 37 07 a8 40 00 1d+05:44:20.855 READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed: read failure  90%        8249             775354168
# 2  Short offline     Completed without error  00%        1536             -
# 3  Short offline     Completed without error  00%        641              -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
1     0        0        Not_testing
2     0        0        Not_testing
3     0        0        Not_testing
4     0        0        Not_testing
5     0        0        Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version: 3
SCT Version (vendor specific): 522 (0x020a)
Device State: Active (0)
Current Temperature: 37 Celsius
Power Cycle Min/Max Temperature: 21/45 Celsius
Lifetime Min/Max Temperature: 5/49 Celsius
Under/Over Temperature Limit Count: 0/0

SCT Data Table command not supported
SCT Error Recovery Control command not supported
Device Statistics (GP/SMART Log 0x04) not supported
Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x000a  2     3      Device-to-host register FISes sent due to a COMRESET
0x0001  2     0      Command failed due to ICRC error
0x0003  2     0      R_ERR response for device-to-host data FIS
0x0004  2     0      R_ERR response for host-to-device data FIS
0x0006  2     0      R_ERR response for device-to-host non-data FIS
0x0007  2     0      R_ERR response for host-to-device non-data FIS
-
Thanks, both options helped. Very much appreciated.
-
Hi,
I'd like to download the USB backup to flash a new USB drive (the old one seems to have died). I have searched the manuals, but it seems that I need to access the My Servers web page from the Unraid server's own interface, which is not available as the server is powered down.
Is it possible?
Thanks
-
Last time I did that I had issues with the array not stopping, so I thought about finding another way.
-
Ok, so I’ve run chkdsk /f and /r and it fixed an error on the USB drive. Now the server boots again. I’ll check the performance and come back with the news.
thanks
-
Ok, so I think my USB drive just died. Here's what I did, following this guide: https://wiki.unraid.net/Console#To_cleanly_Stop_the_array_from_the_command_line
Stopped mover
Stopped docker
Stopped Samba
umount /dev/md1
umount /dev/md2 (only 2 disk + parity in the array)
/root/mdcmd stop
reboot
Now the server refuses to boot. I do have a backup of the USB drive made with the My Servers plugin, but I’m not sure how to proceed.
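For the unmount steps listed above, one way to confirm that the array devices are really unmounted before stopping the array is to look for them in /proc/mounts. This is only a sketch in Python under stated assumptions: the helper names are made up for illustration, the sample text (including the mount points) is hypothetical, and only the device names /dev/md1 and /dev/md2 come from the post.

```python
# Hypothetical /proc/mounts-style sample; on a real server the text would
# be read from /proc/mounts instead.
SAMPLE_MOUNTS = """\
/dev/md1 /mnt/disk1 xfs rw,noatime 0 0
/dev/sdb1 /mnt/cache btrfs rw,noatime 0 0
"""


def mounted_devices(mounts_text):
    """Return the set of device names (first column) in the mounts text."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields:
            devices.add(fields[0])
    return devices


def still_mounted(mounts_text, devices):
    """Return the devices from the given list that are still mounted."""
    current = mounted_devices(mounts_text)
    return [dev for dev in devices if dev in current]


# With the sample above, /dev/md1 is still mounted and /dev/md2 is not.
print(still_mounted(SAMPLE_MOUNTS, ["/dev/md1", "/dev/md2"]))
```

On the server itself, the text could come from `open("/proc/mounts").read()`; an empty result would suggest both umount commands succeeded.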
-
Ok, I'll try rebooting.
I cannot stop the array because mover is running (which I find strange because it's scheduled to start at 3 AM, it's currently 6 PM and the server hasn't had much use lately due to its performance). I'll give it a couple more hours, but any advice in case it is stuck?
-
The "Fix Common Problems" plugin has found the following error:
"Your server has run out of memory, and processes (potentially required) are being killed off. You should post your diagnostics and ask for assistance on the unRaid forums". I'm attaching the diagnostics.
-
I'm abroad, but from what my girlfriend tells me, it's also very slow. Jellyfin is also unusable over local access.
-
Hi,
For the past few days I've been getting very slow performance (close to unusable). Access via SMB (using VPN) has become really slow (down from a couple of MB/s to 50-100 kB/s), using the web interface is quite painful, etc.
I've checked the system log and found the following error repeatedly: TCP: out of memory -- consider tuning tcp_mem.
After searching online I have not found anything of help. I attach some more lines of the log in case something can be of help.
Thanks!
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: veth31a3898: renamed from eth0
Feb 20 16:48:37 NAS avahi-daemon[1742]: Interface veth7126499.IPv6 no longer relevant for mDNS.
Feb 20 16:48:37 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth7126499.IPv6 with address fe80::5036:b8ff:fe52:19d.
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: device veth7126499 left promiscuous mode
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS avahi-daemon[1742]: Withdrawing address record for fe80::5036:b8ff:fe52:19d on veth7126499.
Feb 20 16:50:23 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:23 NAS kernel: veth75ec41e: renamed from eth0
Feb 20 16:50:24 NAS avahi-daemon[1742]: Interface veth1cd4cd0.IPv6 no longer relevant for mDNS.
Feb 20 16:50:24 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth1cd4cd0.IPv6 with address fe80::4470:62ff:fe0c:e210.
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS kernel: device veth1cd4cd0 left promiscuous mode
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS avahi-daemon[1742]: Withdrawing address record for fe80::4470:62ff:fe0c:e210 on veth1cd4cd0.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: veth945447c: renamed from eth0
Feb 20 16:53:46 NAS avahi-daemon[1742]: Interface veth9105c65.IPv6 no longer relevant for mDNS.
Feb 20 16:53:46 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth9105c65.IPv6 with address fe80::64e4:56ff:fe1f:baa6.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: device veth9105c65 left promiscuous mode
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS avahi-daemon[1742]: Withdrawing address record for fe80::64e4:56ff:fe1f:baa6 on veth9105c65.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: veth5922185: renamed from eth0
Feb 20 16:54:14 NAS avahi-daemon[1742]: Interface veth33f6111.IPv6 no longer relevant for mDNS.
Feb 20 16:54:14 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth33f6111.IPv6 with address fe80::40ba:fbff:fef8:f50d.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: device veth33f6111 left promiscuous mode
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS avahi-daemon[1742]: Withdrawing address record for fe80::40ba:fbff:fef8:f50d on veth33f6111.
Feb 20 16:54:32 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:55:15 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:56:03 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:59:27 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
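To get a feel for how often the kernel message appears, the log can be tallied per minute. Below is a minimal Python sketch, assuming syslog text with one entry per line in the timestamp format shown in the log; the function names are illustrative, and the tcp_mem sample values are hypothetical. On Linux the current limits can be read from /proc/sys/net/ipv4/tcp_mem, which holds three numbers (min, pressure, max) counted in memory pages.

```python
import re
from collections import Counter

# Captures "Mon DD HH:MM" so that events are grouped by minute.
OOM_PATTERN = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}):\d{2} \S+ kernel: TCP: out of memory"
)

SAMPLE_LOG = """\
Feb 20 16:54:14 NAS kernel: device veth33f6111 left promiscuous mode
Feb 20 16:54:32 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:55:15 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:56:03 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:59:27 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
"""


def oom_events_per_minute(log_text):
    """Count 'TCP: out of memory' kernel messages, grouped by minute."""
    counts = Counter()
    for line in log_text.splitlines():
        match = OOM_PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def parse_tcp_mem(text):
    """Split the contents of /proc/sys/net/ipv4/tcp_mem into named limits."""
    low, pressure, high = (int(value) for value in text.split())
    return {"min": low, "pressure": pressure, "max": high}


for minute, count in sorted(oom_events_per_minute(SAMPLE_LOG).items()):
    print(minute, count)

# Hypothetical values, only to show the shape of the result.
print(parse_tcp_mem("94500 126000 189000\n"))
```

A steady stream of these messages alongside the out-of-memory warning from the plugin would point at overall memory pressure rather than at the TCP limits alone, though that is an inference, not something the log proves.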
-
Hi,
I'm having some issues trying to set up the Docker container without VPN. I have changed the /data folder, but how should I proceed with the other options that refer to the VPN?
Another issue I'm having is that rTorrent is uploading practically nothing (including on torrents with leechers). I think it is because I need to open a port on my router, but I'm not sure which one. I assume it is the listening port that ruTorrent shows in its settings, but should I add anything to the Docker config? Or just open it on my router and point it to the Unraid server?
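Whether a listening port accepts connections can be tested with a plain TCP connect. This is a minimal Python sketch; the function name is an assumption of this example, and the host and port have to be replaced with the server's address and the listening port shown in ruTorrent. It only proves that something answers on that address and port from wherever the script runs, so checking the router's port forwarding would mean running it from outside the home network against the public address.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_is_open("192.168.1.10", 51413)` would test a hypothetical LAN address and port; both values are placeholders, not taken from the post.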
Thanks in advance
SMART errors
in Storage Devices and Controllers
Posted · Edited by SP67
I’m attaching a capture of my array to see if it helps clarify things. Thanks for the interest.