Disk errors: do I need to replace the disk?



Today unRAID reported disk errors while the 'mover' process was running.

See Image


 

Two days ago I updated from unRAID 6 beta14b to beta15.

 

SMART DATA

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.4-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   220   142   021    Pre-fail  Always       -       5958
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1628
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       8895
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       588
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       84
193 Load_Cycle_Count        0x0032   174   174   000    Old_age   Always       -       79873
194 Temperature_Celsius     0x0022   116   099   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       83

 

The log repeats this many times:

Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964672
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964680
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964688
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964696
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964704
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964712
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964720
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964728
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964736
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964744
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964752
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964760
Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964768
Apr 22 14:52:40 aina kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 22 14:52:40 aina kernel: ata6.00: irq_stat 0x40000001
Apr 22 14:52:40 aina kernel: ata6.00: failed command: READ DMA EXT
Apr 22 14:52:40 aina kernel: ata6.00: cmd 25/00:00:20:60:ab/00:02:da:00:00/e0 tag 18 dma 262144 in
Apr 22 14:52:40 aina kernel:         res 51/40:2f:f0:60:ab/00:01:da:00:00/e0 Emask 0x9 (media error)
Apr 22 14:52:40 aina kernel: ata6.00: status: { DRDY ERR }
Apr 22 14:52:40 aina kernel: ata6.00: error: { UNC }
Apr 22 14:52:40 aina kernel: ata6.00: configured for UDMA/133
Apr 22 14:52:40 aina kernel: sd 6:0:0:0: [sdd] UNKNOWN Result: hostbyte=0x00 driverbyte=0x08
Apr 22 14:52:40 aina kernel: sd 6:0:0:0: [sdd] Sense Key : 0x3 [current] [descriptor]
Apr 22 14:52:40 aina kernel: sd 6:0:0:0: [sdd] ASC=0x11 ASCQ=0x4
Apr 22 14:52:40 aina kernel: sd 6:0:0:0: [sdd] CDB:
Apr 22 14:52:40 aina kernel: cdb[0]=0x28: 28 00 da ab 60 20 00 02 00 00
Apr 22 14:52:40 aina kernel: blk_update_request: I/O error, dev sdd, sector 3668664560
Apr 22 14:52:40 aina kernel: ata6: EH complete
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664496
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664504
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664512
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664520
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664528
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664536
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664544
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664552
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664560
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664568
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664576
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664584
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664592
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664600
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664608
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664616
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664624
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664632
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664640
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664648
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664656
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664664
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664672
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664680
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664688
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664696
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664704
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664712
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664720
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664728
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664736
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664744
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664752
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664760
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664768
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664776
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664784
Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664792
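
Since the same message repeats so often, a small script can collapse the log into runs of neighbouring sectors, which makes it easier to see how many separate bad areas there are. This is only a sketch: the function name `group_read_errors` is made up for illustration, and the default `step=8` is an assumption taken from the log above, where each reported sector is 8 higher than the previous one.

```python
import re

# Matches lines such as: "md: disk1 read error, sector=2577964672"
SECTOR_RE = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def group_read_errors(log_lines, step=8):
    """Collapse consecutive md read errors into (disk, first, last, count) runs.

    `step` is the assumed distance between neighbouring sectors in one run.
    """
    runs = []
    for line in log_lines:
        match = SECTOR_RE.search(line)
        if not match:
            continue  # ignore lines that are not md read errors
        disk, sector = match.group(1), int(match.group(2))
        last_run = runs[-1] if runs else None
        if last_run and last_run[0] == disk and sector == last_run[2] + step:
            last_run[2] = sector  # extend the current run
            last_run[3] += 1
        else:
            runs.append([disk, sector, sector, 1])  # start a new run
    return [tuple(run) for run in runs]


sample = [
    "Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964672",
    "Apr 22 14:51:14 aina kernel: md: disk1 read error, sector=2577964680",
    "Apr 22 14:52:40 aina kernel: ata6: EH complete",
    "Apr 22 14:52:40 aina kernel: md: disk1 read error, sector=3668664496",
]
for run in group_read_errors(sample):
    print(run)
```

Feeding it the full syslog instead of `sample` would give one line per damaged region rather than one line per sector.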

 

Do you think I need to replace the disk? If so, I want to install a new 8TB drive, but for that the 8TB drive has to become the parity drive and the current 6TB parity drive has to replace the failing data drive. How can I do that without losing anything?

 

Do read errors mean that the data is corrupted?

 

Thanks for everything.


The SMART report seems OK. Cables are the easiest thing to check: both ends of the SATA and power cables. Are these all motherboard connections?

Yes, my motherboard has 12 SATA connections on two controllers, a Marvell and an Intel.

I don't know exactly which controller this disk is connected to because I use a disk enclosure. I will check everything tomorrow. Thanks.


Apr 22 14:52:40 aina kernel:        res 51/40:2f:f0:60:ab/00:01:da:00:00/e0 Emask 0x9 (media error)

Apr 22 14:52:40 aina kernel: ata6.00: error: { UNC }

 


 

The critical part of that is the 'media error' flag (meaning a bad sector) and 'UNC' (meaning UNCorrectable). This appears to be a true bad sector, which needs to be fixed by rebuilding the drive. However, this should absolutely have shown up in the SMART report. Are you absolutely sure you obtained those SMART attributes from the right drive?
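
For anyone who wants to see which sector the drive is complaining about, the `res` line can be decoded by hand. The sketch below assumes the usual libata layout for that line, `status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device`, and that bit 0x40 of the error register is the UNC flag. The function name `decode_res_lba` is hypothetical, not part of any tool.

```python
def decode_res_lba(line):
    """Decode the 48-bit LBA and the UNC flag from a libata 'res' log line.

    Assumed field layout (hex bytes):
    status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    token = line.split("res ", 1)[1].split()[0]
    status_hex, low, high, _device = token.split("/")
    error, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # LBA48: the "hob" (high order byte) registers hold the upper 24 bits.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "status": int(status_hex, 16),
        "lba": lba,
        "unc": bool(error & 0x40),  # assumed: bit 6 = uncorrectable data
    }


res_line = (
    "Apr 22 14:52:40 aina kernel:         "
    "res 51/40:2f:f0:60:ab/00:01:da:00:00/e0 Emask 0x9 (media error)"
)
print(decode_res_lba(res_line))
```

For the line from this thread it yields LBA 3668664560 with the UNC flag set, which is the same sector the kernel reports in its `blk_update_request: I/O error, dev sdd` message.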

 

In unRAID, data is not corrupted because of a read error.  It's still available through the 'virtual' disk, the simulated one, and will be correctly written onto a replacement drive.
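
To show why the contents of an unreadable sector are still recoverable, here is a toy sketch of single-parity reconstruction with XOR. It is a simplified model of the idea, not unRAID's actual code; the function names and the tiny two-byte "disks" are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block = XOR of the same block on every data disk."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one unreadable block from the other disks plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical data disks, each holding one tiny block.
disk1, disk2, disk3 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = compute_parity([disk1, disk2, disk3])

# Pretend disk1 returned a read error for this block.
recovered = rebuild_missing([disk2, disk3], parity)
print(recovered == disk1)
```

This only works while exactly one block per stripe is unreadable, which is why a failing disk should be dealt with before a second one starts throwing errors.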

 

The process you want (to simultaneously upgrade both the data drive and the parity drive) is called a 'swap-disable' (or something like that), but I cannot remember where the instructions are.  There's a specific set of steps that have to be carefully followed.  A search should find it.


Yes, ata6.00 is disk1 (sdd).

 

Apr 22 18:42:52 aina kernel: ata6.00: ATA-8: WDC WD20EARS-00J99B0,      WD-WCAWZ1638173, 80.00A80, max UDMA/133
Apr 22 18:42:52 aina kernel: ata6.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Apr 22 18:42:52 aina kernel: ata6.00: configured for UDMA/133
Apr 22 18:42:52 aina kernel: ata15: SATA link down (SStatus 0 SControl 310)
Apr 22 18:42:52 aina kernel: scsi 6:0:0:0: Direct-Access     ATA      WDC WD20EARS-00J 0A80 PQ: 0 ANSI: 5
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: [sdd] 4096-byte physical blocks
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: [sdd] Write Protect is off
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 22 18:42:52 aina kernel: sd 6:0:0:0: Attached scsi generic sg3 type 0


I ran an extended SMART test and this is the result... :S

 

Disk 1 attached to port: sdd
Num	Test Description	Status	Remaining	LifeTime(hours)	LBA of first error
1	Extended offline	Completed: read failure	40%	8917	2194146584
2	Short offline	Completed without error	00%	8899	None
3	Extended offline	Completed: read failure	60%	8897	1338112088
4	Short offline	Completed without error	00%	7993	None
5	Short offline	Completed without error	00%	7969	None
6	Short offline	Completed without error	00%	7946	None
7	Short offline	Completed without error	00%	7922	None
8	Short offline	Completed without error	00%	7910	None
9	Short offline	Completed without error	00%	7898	None
10	Short offline	Completed without error	00%	7891	None
11	Short offline	Completed without error	00%	7878	None
12	Short offline	Completed without error	00%	7862	None
13	Short offline	Completed without error	00%	7848	None
14	Short offline	Completed without error	00%	7812	None
15	Short offline	Completed without error	00%	7788	None
16	Short offline	Completed without error	00%	7764	None
17	Short offline	Completed without error	00%	7740	None
18	Short offline	Completed without error	00%	7718	None
19	Short offline	Completed without error	00%	7693	None
20	Short offline	Completed without error	00%	7669	None
21	Short offline	Completed without error	00%	7645	None

 

but ...

 

ID#	ATTRIBUTE NAME	FLAG	VALUE	WORST	THRESH	TYPE	UPDATED	FAILED	RAW VALUE
1	Raw Read Error Rate	0x002f	200	200	051	Pre-fail	Always	Never	0
3	Spin Up Time	0x0027	233	142	021	Pre-fail	Always	Never	5308
4	Start Stop Count	0x0032	099	099	000	Old age	Always	Never	1630
5	Reallocated Sector Ct	0x0033	200	200	140	Pre-fail	Always	Never	0
7	Seek Error Rate	0x002e	200	200	000	Old age	Always	Never	0
9	Power On Hours	0x0032	088	088	000	Old age	Always	Never	8919
10	Spin Retry Count	0x0032	100	100	000	Old age	Always	Never	0
11	Calibration Retry Count	0x0032	100	100	000	Old age	Always	Never	0
12	Power Cycle Count	0x0032	100	100	000	Old age	Always	Never	589
192	Power-Off Retract Count	0x0032	200	200	000	Old age	Always	Never	84
193	Load Cycle Count	0x0032	174	174	000	Old age	Always	Never	79875
194	Temperature Celsius	0x0022	116	099	000	Old age	Always	Never	36
196	Reallocated Event Count	0x0032	200	200	000	Old age	Always	Never	0
197	Current Pending Sector	0x0032	200	200	000	Old age	Always	Never	0
198	Offline Uncorrectable	0x0030	200	200	000	Old age	Offline	Never	0
199	UDMA CRC Error Count	0x0032	200	200	000	Old age	Always	Never	1
200	Multi Zone Error Rate	0x0008	192	192	000	Old age	Offline	Never	1888
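
Because every attribute still shows "Never" failed, it can help to compare the raw values of the two reports instead of looking at pass/fail. A minimal sketch, with the raw values copied by hand from the two tables in this thread for a few error-related attributes; the helper name `changed_raw_values` is hypothetical.

```python
def changed_raw_values(before, after):
    """Return {attribute_id: (old_raw, new_raw)} for values that differ."""
    return {
        attr_id: (before[attr_id], after[attr_id])
        for attr_id in sorted(before)
        if attr_id in after and before[attr_id] != after[attr_id]
    }


# Raw values from the first SMART report (before the extended test).
first_report = {5: 0, 196: 0, 197: 0, 198: 0, 199: 1, 200: 83}
# Raw values from the second report (after the extended test).
second_report = {5: 0, 196: 0, 197: 0, 198: 0, 199: 1, 200: 1888}

for attr_id, (old, new) in changed_raw_values(first_report, second_report).items():
    print(f"attribute {attr_id}: {old} -> {new}")
```

Of these attributes only 200 (Multi Zone Error Rate) moved, from 83 to 1888, while the reallocated and pending sector counts stayed at 0.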

