SP67

March 12, 2022

I’m attaching a capture of my array to see if it helps clarify things. Thanks for the interest.

March 12, 2022

Yeah, but the failing drive was part of a cache pool (I have one SSD for app data and one HDD for torrent downloads). So AFAIK the parity would not have worked in this case.

Copying the data from the old drive was just to avoid having to download what hadn’t already moved to the array. If this is not the proper way to do it, please correct me as I’m still learning.

March 11, 2022

OK, so I shut the server down, added a 4 TB disk, moved the contents of the failing drive to the new one, and added the new disk to the cache pool. Then I shut the server down again and removed the old drive.

So far so good, everything is going well.

thanks!

March 11, 2022

Can I stop the array, remove the drive and add a new one? Or do I need to shut down the server?

March 11, 2022

Reported uncorrect has grown from 4 to 10 in less than 24 h. The drive is probably on its last legs...

For what it's worth, I've found that this drive is from the 7200.14 series from Seagate, which had early-death problems that were supposedly fixed with a later firmware. I never saw this update so the drive has been using the factory firmware since I bought it.

March 10, 2022

Thanks! Can I do that directly on unRAID?

Although it seems I might be looking at buying another drive...

March 10, 2022

Hi,

This morning the server reported some SMART errors for a 2 TB drive I use as a torrent download cache before moving data to the array (I've read that this reduces wear on the array).

The errors are:

#    Attribute name          Flag    Value  Worst  Threshold  Type     Updated  Failed  Raw value
187  Reported uncorrect      0x0032  096    096    000        Old age  Always   Never   4
197  Current pending sector  0x0012  100    100    000        Old age  Always   Never   8
198  Offline uncorrectable   0x0010  100    100    000        Old age  Offline  Never   8

I've read online that I might be able to ignore the errors as the drive will just stop using those sectors, but there didn't seem to be much consensus about it.
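For anyone who wants to track these counters over time, here is a minimal Python sketch that pulls the raw values out of attribute rows in the `smartctl -x` layout (the layout used in the full report in this post). The watch list of IDs (5, 187, 197, 198) and the names `parse_attributes` and `nonzero_watched` are assumptions of this sketch, not part of smartmontools or Unraid.

```python
import re

# Attribute IDs often watched for media trouble. This set is an assumption
# of the sketch, not an official list.
WATCHED_IDS = {5, 187, 197, 198}

# Matches rows such as:
# "197 Current_Pending_Sector  -O--C-   100   100   000    -    8"
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s*$"
)


def parse_attributes(report: str) -> dict:
    """Return {attribute_id: (name, first integer of the raw value)}."""
    attrs = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, legend, or unrelated line
        raw_first = re.match(r"\d+", match.group(8))
        if raw_first is None:
            continue
        attrs[int(match.group(1))] = (match.group(2), int(raw_first.group()))
    return attrs


def nonzero_watched(report: str) -> dict:
    """Keep only watched attributes whose raw value is above zero."""
    return {
        attr_id: value
        for attr_id, value in parse_attributes(report).items()
        if attr_id in WATCHED_IDS and value[1] > 0
    }


# Rows copied from the report in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
187 Reported_Uncorrect      -O--CK   096   096   000    -    4
197 Current_Pending_Sector  -O--C-   100   100   000    -    8
198 Offline_Uncorrectable   ----C-   100   100   000    -    8
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
"""

for attr_id, (name, raw) in sorted(nonzero_watched(SAMPLE).items()):
    print(attr_id, name, raw)
```

Running it against a saved report once a day and comparing the numbers would show whether the counters are stable or still climbing.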

Any suggestions?

Thanks

Full smart report:

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.14.15-Unraid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-9YN164
Serial Number:    
LU WWN Device Id: 
Firmware Version: CC4B
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Mar 10 12:32:32 2022 CET

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(  592) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 247) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x3085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   118   099   006    -    170591376
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   097   097   020    -    3850
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   076   060   030    -    45291769
  9 Power_On_Hours          -O--CK   091   091   000    -    8254
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   097   097   020    -    3531
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   096   096   000    -    4
188 Command_Timeout         -O--CK   100   099   000    -    1 3 3
189 High_Fly_Writes         -O-RCK   096   096   000    -    4
190 Airflow_Temperature_Cel -O---K   062   051   045    -    38 (Min/Max 21/45 #1)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    191
193 Load_Cycle_Count        -O--CK   001   001   000    -    242985
194 Temperature_Celsius     -O---K   038   049   000    -    38 (128 0 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    8
198 Offline_Uncorrectable   ----C-   100   100   000    -    8
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    -    4183h+31m+57.974s
241 Total_LBAs_Written      ------   100   253   000    -    183471166860639
242 Total_LBAs_Read         ------   100   253   000    -    81944486528248
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5067  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 4
	CR     = Command Register
	FEATR  = Features Register
	COUNT  = Count (was: Sector Count) Register
	LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
	LH     = LBA High (was: Cylinder High) Register    ]   LBA
	LM     = LBA Mid (was: Cylinder Low) Register      ] Register
	LL     = LBA Low (was: Sector Number) Register     ]
	DV     = Device (was: Device/Head) Register
	DC     = Device Control Register
	ER     = Error register
	ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 4 [3] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 2e 36 f7 38 00 00  Error: WP at LBA = 0x2e36f738 = 775354168

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 00 00 08 00 00 3c a1 01 60 40 00  1d+05:44:31.301  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 6f 00 d0 68 40 00  1d+05:44:31.083  WRITE FPDMA QUEUED
  61 00 00 05 20 00 00 6e f4 25 f8 40 00  1d+05:44:31.081  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 3c a1 01 58 40 00  1d+05:44:31.081  WRITE FPDMA QUEUED
  61 00 00 00 48 00 00 3c 93 15 10 40 00  1d+05:44:31.081  WRITE FPDMA QUEUED

Error 3 [2] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 2e 36 f7 38 00 00  Error: WP at LBA = 0x2e36f738 = 775354168

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 00 08 40 00 00 3f 2a 19 80 40 00  1d+05:44:28.089  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 3f 40 d9 a8 40 00  1d+05:44:28.089  WRITE FPDMA QUEUED
  61 00 00 04 60 00 00 3c 93 0b 38 40 00  1d+05:44:28.088  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 3c a1 01 50 40 00  1d+05:44:28.088  WRITE FPDMA QUEUED
  60 00 00 00 08 00 00 2e 36 f7 38 40 00  1d+05:44:28.085  READ FPDMA QUEUED

Error 2 [1] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 2e 36 f7 38 00 00  Error: WP at LBA = 0x2e36f738 = 775354168

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 00 04 c0 00 00 16 e6 3c 48 40 00  1d+05:44:25.346  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 17 1d a5 78 40 00  1d+05:44:25.346  WRITE FPDMA QUEUED
  61 00 00 04 00 00 00 6e f4 1a f8 40 00  1d+05:44:25.346  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 6f 00 d0 58 40 00  1d+05:44:25.346  WRITE FPDMA QUEUED
  60 00 00 00 08 00 00 2e 36 f7 38 40 00  1d+05:44:25.119  READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 8245 hours (343 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 2e 36 f7 38 00 00  Error: UNC at LBA = 0x2e36f738 = 775354168

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 00 08 78 00 00 2e 37 1f 30 40 00  1d+05:44:20.856  READ FPDMA QUEUED
  60 00 00 0a 00 00 00 2e 37 15 30 40 00  1d+05:44:20.855  READ FPDMA QUEUED
  60 00 00 00 08 00 00 2e 3f 31 20 40 00  1d+05:44:20.855  READ FPDMA QUEUED
  60 00 00 03 80 00 00 2e 37 11 a8 40 00  1d+05:44:20.855  READ FPDMA QUEUED
  60 00 00 0a 00 00 00 2e 37 07 a8 40 00  1d+05:44:20.855  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      8249         775354168
# 2  Short offline       Completed without error       00%      1536         -
# 3  Short offline       Completed without error       00%       641         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
Device State:                        Active (0)
Current Temperature:                    37 Celsius
Power Cycle Min/Max Temperature:     21/45 Celsius
Lifetime    Min/Max Temperature:      5/49 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

March 10, 2022

Thanks, both options helped. Very much appreciated.

March 8, 2022

Hi,

I'd like to download the USB backup to flash a new USB drive (the old one seems to have died). I have searched the manuals, but it seems that I need to open the My Servers web page from the Unraid server's interface, which is not available as the server is powered down.

Is it possible?

Thanks

February 20, 2022

Last time I did that I had issues with the array not stopping, so I thought about finding another way.

February 20, 2022

OK, so I’ve run chkdsk /f and /r and it fixed an error on the USB drive. Now the server boots again. I’ll check the performance and report back.

thanks

February 20, 2022

OK, so I think my USB drive just died. Here is what I did, following this guide: https://wiki.unraid.net/Console#To_cleanly_Stop_the_array_from_the_command_line

Stopped mover

Stopped docker

Stopped Samba

umount /dev/md1

umount /dev/md2 (only 2 disk + parity in the array)

/root/mdcmd stop

reboot

Now the server refuses to boot. I do have a backup of the USB drive made with the My Servers plugin, but I’m not sure how to proceed.
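The command-line steps listed above could be wrapped in a small script so that each one is checked before the next runs. The Python sketch below is only an illustration: it contains just the four commands written out in this post (the mover, Docker, and Samba steps are left out because their commands are not given here), the helper name `run_sequence` is made up, and it defaults to a dry run that only prints what it would do.

```python
import subprocess

# Commands exactly as listed in the post, in the same order.
STOP_SEQUENCE = [
    ["umount", "/dev/md1"],
    ["umount", "/dev/md2"],
    ["/root/mdcmd", "stop"],
    ["reboot"],
]


def run_sequence(commands, dry_run=True):
    """Run commands in order and stop at the first failure.

    With dry_run=True (the default) nothing is executed; each command is
    only printed. Returns the commands that were run or would be run.
    """
    done = []
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if the command fails,
            # so a failed umount prevents the later steps from running.
            subprocess.run(cmd, check=True)
        done.append(cmd)
    return done


planned = run_sequence(STOP_SEQUENCE)
```

Stopping at the first failure matters here: if an unmount fails because something still has files open, continuing to stop the array and reboot would not be a clean shutdown.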

February 20, 2022

OK, I'll try rebooting.

I cannot stop the array because mover is working (which I find strange because it's scheduled to start at 3 AM, it's currently 6 PM, and the server hasn't had much use lately due to its performance). I'll give it a couple more hours, but any advice in case it is stuck?

February 20, 2022

The "Fix Common Problems" plugin has found the following error:

"Your server has run out of memory, and processes (potentially required) are being killed off. You should post your diagnostics and ask for assistance on the unRaid forums". I'm attaching the diagnostics.

nas-diagnostics-20220220-1137.zip

February 20, 2022

I'm abroad, but from what my girlfriend tells me, it's also very slow. Jellyfin is also unusable over local access.

February 20, 2022

Hi,

For the past few days I've been getting very slow performance (close to unusable). Access via SMB (using VPN) has become really slow (down from a couple of MB/s to 50-100 kB/s), using the web interface is quite painful, etc.

I've checked the system log and found the following error repeated: TCP: out of memory -- consider tuning tcp_mem.

After searching online I have not found anything of help. I attach some more lines of the log in case something can be of help.

Thanks!

Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: veth31a3898: renamed from eth0
Feb 20 16:48:37 NAS avahi-daemon[1742]: Interface veth7126499.IPv6 no longer relevant for mDNS.
Feb 20 16:48:37 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth7126499.IPv6 with address fe80::5036:b8ff:fe52:19d.
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: device veth7126499 left promiscuous mode
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS avahi-daemon[1742]: Withdrawing address record for fe80::5036:b8ff:fe52:19d on veth7126499.
Feb 20 16:50:23 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:23 NAS kernel: veth75ec41e: renamed from eth0
Feb 20 16:50:24 NAS avahi-daemon[1742]: Interface veth1cd4cd0.IPv6 no longer relevant for mDNS.
Feb 20 16:50:24 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth1cd4cd0.IPv6 with address fe80::4470:62ff:fe0c:e210.
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS kernel: device veth1cd4cd0 left promiscuous mode
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS avahi-daemon[1742]: Withdrawing address record for fe80::4470:62ff:fe0c:e210 on veth1cd4cd0.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: veth945447c: renamed from eth0
Feb 20 16:53:46 NAS avahi-daemon[1742]: Interface veth9105c65.IPv6 no longer relevant for mDNS.
Feb 20 16:53:46 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth9105c65.IPv6 with address fe80::64e4:56ff:fe1f:baa6.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: device veth9105c65 left promiscuous mode
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS avahi-daemon[1742]: Withdrawing address record for fe80::64e4:56ff:fe1f:baa6 on veth9105c65.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: veth5922185: renamed from eth0
Feb 20 16:54:14 NAS avahi-daemon[1742]: Interface veth33f6111.IPv6 no longer relevant for mDNS.
Feb 20 16:54:14 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth33f6111.IPv6 with address fe80::40ba:fbff:fef8:f50d.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: device veth33f6111 left promiscuous mode
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS avahi-daemon[1742]: Withdrawing address record for fe80::40ba:fbff:fef8:f50d on veth33f6111.
Feb 20 16:54:32 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:55:15 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:56:03 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:59:27 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
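Before changing tcp_mem it may help to see how close the kernel actually is to the limit. The Python sketch below compares the TCP page count from /proc/net/sockstat with the three thresholds in /proc/sys/net/ipv4/tcp_mem (low, pressure, high, all in pages). The function names and the "ok / under pressure / over limit" labels are made up for this example, and the parsing assumes the usual layout of those two files.

```python
def parse_tcp_mem(text: str) -> tuple:
    """Parse the three page thresholds: low, pressure, high."""
    low, pressure, high = (int(field) for field in text.split())
    return low, pressure, high


def parse_sockstat_tcp_pages(text: str) -> int:
    """Return the 'mem' value (pages) from the TCP line of sockstat."""
    for line in text.splitlines():
        if line.startswith("TCP:"):
            # Typical line: "TCP: inuse 8 orphan 0 tw 0 alloc 9 mem 1"
            fields = line.split()
            return int(fields[fields.index("mem") + 1])
    raise ValueError("no TCP line found in sockstat text")


def tcp_memory_status(tcp_mem_text: str, sockstat_text: str) -> tuple:
    """Return (pages_in_use, high_limit, label)."""
    _low, pressure, high = parse_tcp_mem(tcp_mem_text)
    used = parse_sockstat_tcp_pages(sockstat_text)
    if used >= high:
        label = "over limit"
    elif used >= pressure:
        label = "under pressure"
    else:
        label = "ok"
    return used, high, label


# On the server itself the inputs would come from the real files, e.g.:
#   from pathlib import Path
#   print(tcp_memory_status(
#       Path("/proc/sys/net/ipv4/tcp_mem").read_text(),
#       Path("/proc/net/sockstat").read_text(),
#   ))
```

If the page count stays far below the high threshold while the message keeps appearing, the cause is more likely general memory exhaustion than a tcp_mem setting that is too low.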

December 31, 2021

Hi,

I'm having some issues trying to set up the Docker container without VPN. I have changed the /data folder, but how should I proceed with the other options that refer to the VPN?

Another issue I'm having is that rTorrent is uploading practically nothing (including torrents with leechers). I think it is because I need to open a port on my router, but I'm not sure which one. I assume it is the listening port that ruTorrent lists in the settings, but should I add anything to the Docker config? Or just open it on my router and point it to the Unraid server?

Thanks in advance
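One way to narrow this down is to check whether anything accepts TCP connections on the listening port at all. The Python sketch below is a minimal check using only the standard library. The function name `port_is_open` is made up, and the address and port in the usage comment are placeholders, not values from this setup. Run from inside the LAN it only shows that the container is listening; testing the router's port forward needs the same check from a machine outside the network against the public address.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (placeholder address and port, replace with your own):
#   port_is_open("192.168.1.10", 49160)
```

If the port is closed even from inside the LAN, the problem is in the container's port mapping rather than in the router.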

Posts posted by SP67

SMART errors

How to download usb backup with powered-down server

Very slow performance - TCP out of memory

[Support] binhex - rTorrentVPN