
Read-only files and finding 5500 errors



It seems that some of my files are being listed as read-only, and I am unable to alter them over SMB. Here is the syslog. I need some help with next steps to address this.

 

Aug 4 22:24:13 SERVER shfs/user: shfs_unlink: unlink: /mnt/disk6/Personal/-=-Account Info-=-/New Text Document.txt (30) Read-only file system
Aug 4 22:24:16 SERVER avahi-daemon[32059]: Invalid response packet from host 192.168.0.102.
Aug 4 22:24:23 SERVER avahi-daemon[32059]: Invalid response packet from host 192.168.0.102.
Aug 4 22:24:23 SERVER avahi-daemon[32059]: Invalid response packet from host 192.168.0.106.
Aug 4 22:24:39 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/-=-Account Info-=-/account.txt (30) Read-only file system
Aug 4 22:24:44 SERVER avahi-daemon[32059]: Invalid response packet from host 192.168.0.106.
Aug 4 22:24:48 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/-=-Account Info-=-/account.txt (30) Read-only file system
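
To see which disk the failures cluster on, the shfs lines above can be tallied per disk. This is only a rough sketch: the regular expression is inferred from the few lines quoted in this post, and the function name is made up for illustration. Error 30 is EROFS ("Read-only file system") on Linux.

```python
import re
from collections import Counter

EROFS = 30  # Linux errno for "Read-only file system"

# Pattern inferred from the quoted lines, e.g.
# "... shfs/user: shfs_open: open: /mnt/disk6/... (30) Read-only file system"
SHFS_LINE = re.compile(r"shfs/user: (\w+): \w+: (/mnt/(disk\d+)/.*) \((\d+)\) ")

def read_only_errors_by_disk(lines):
    """Count shfs errors with errno 30, grouped by the disk in the path."""
    counts = Counter()
    for line in lines:
        m = SHFS_LINE.search(line)
        if m and int(m.group(4)) == EROFS:
            counts[m.group(3)] += 1
    return counts
```

Feeding it the syslog lines from this post puts every hit on disk6, which points at one filesystem rather than at the share or SMB settings.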

 

When I do an ls -l on that directory, everything looks OK:

 

root@SERVER:/mnt/disk6/Personal/-=-Account Info-=-# ls -l
total 100
-rw-rw-rw- 1 nobody users     0 2015-06-15 22:03 New\ Text\ Document.txt
-rw-rw-rw- 1 nobody users    96 2014-11-08 17:21 account.txt
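
The mode bits shown by ls -l look fine because error 30 comes from the mount state, not from file permissions. A hedged sketch of how one might check for this is below: it takes text in the format of /proc/mounts and returns the mount points whose options include "ro". The function name and the sample line in the usage note are illustrative assumptions, not taken from this server.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        if "ro" in options.split(","):
            result.append(mount_point)
    return result
```

On the server itself this could be called as read_only_mounts(open("/proc/mounts").read()). A line such as "/dev/md6 /mnt/disk6 reiserfs ro,noatime 0 0" (a made-up example) would be reported as /mnt/disk6.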

 

I ran a check using: reiserfsck --check /dev/md6

 

Will read-only check consistency of the filesystem on /dev/md6
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Aug  4 22:28:06 2015
###########
Filesystem seems mounted read-only. Skipping journal replay.
Checking internal tree.. \/  1 (of  23|/ 27 (of  86// 93 (of 170-root@SERVER:/mn

 

I'm getting a lot of these:

 

bad_indirect_item: block 115197421: The item (3478 3551 0x1 IND (1), len 2688, location 96 entry count 0, fsck need 0, format new) has the bad pointer (177) to the block (115159433), which is in tree already
bad_indirect_item: block 115197421: The item (3478 3551 0x1 IND (1), len 2688, location 96 entry count 0, fsck need 0, format new) has the bad pointer (178) to the block (115159434), which is in tree already
bad_indirect_item: block 115197421: The item (3478 3551 0x1 IND (1), len 2688, location 96 entry count 0, fsck need 0, format new) has the bad pointer (179) to the block (115159435), which is in tree already
bad_indirect_item: block 115197421: The item (3478 3551 0x1 IND (1), len 2688, location 96 entry count 0, fsck need 0, format new) has the bad pointer (180) to the block (115159436), which is in tree already

 

Aug  4 08:23:12 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0648.CR2 (30) Read-only file system
Aug  4 08:23:12 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0651.CR2 (30) Read-only file system
Aug  4 08:23:13 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0654.CR2 (30) Read-only file system
Aug  4 08:23:13 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0657.CR2 (30) Read-only file system
Aug  4 08:23:13 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0658.CR2 (30) Read-only file system
Aug  4 08:23:14 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0660.CR2 (30) Read-only file system
Aug  4 08:23:14 SERVER shfs/user: shfs_open: open: /mnt/disk6/Personal/`Pictures/LightRoom Pictures/2014/2014_05_18/IMG_0663.CR2 (30) Read-only file system

 

Array Status -> using unmenu

STARTED, 7 disks in array.    Parity is Valid:.  Last parity check 3 days ago .  Parity updated  5500  times to address sync errors.

Last checked on Sun Aug 2 07:23:33 2015 ADT, finding 5500 errors.

 

 

syslog-2015-08-04_1of2.zip

syslog-2015-08-04_2of2.zip

Link to comment

OK, I finished "--rebuild-tree" and the "Check" option ("Check will start a Parity-Check. Correct any Parity-Check errors by writing the Parity disk with corrected parity."). After a restart, I still see this message in unmenu:

 

STOPPED, unRAID ARRAY is STOPPED 7 disks in array.    Parity is Valid:.  Last parity check < 1 day ago .  Parity updated  6266  times to address sync errors.

 

any ideas?

Link to comment

How does one check what is currently set?

I haven't used v5 recently. On the page where you tell it to start a parity check, is there a checkbox or anything to set this? If not then it probably does the default correcting parity check.

 

So if you are doing a monthly parity check, are you using unMenu for this? I'm pretty sure it has a setting for the monthly check. I'm not sure whether that controls all parity checks or just the monthly one, though.

 

If you look in your syslog there should be some entries when it does a parity check. I think when it starts it says either "check CORRECT" or "check NOCORRECT".
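
If those markers do appear, something like the following could pull them out of a saved syslog. It is only a sketch: it looks for the two substrings mentioned above and assumes nothing else about the line format, and the function name is made up for illustration.

```python
def parity_check_modes(lines):
    """Return 'CORRECT' or 'NOCORRECT' for each syslog line that starts a check."""
    modes = []
    for line in lines:
        if "check NOCORRECT" in line:
            modes.append("NOCORRECT")
        elif "check CORRECT" in line:
            modes.append("CORRECT")
    return modes
```

For example, parity_check_modes(open("syslog").read().splitlines()) would list the mode of every check since the last reboot, in order.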

 

Looks like the parity checks you mentioned corrected errors, so if you didn't do anything to make them correcting I would assume that's what it always does.

Link to comment

OK, in unmenu I found the script that does the monthly check, and it was set to "NOCORRECT". I have changed it to "CORRECT".

 

http://Tower:8080/pkg_manager?select-monthly_parity_check=Select+monthly_parity_check

Description:	This package installs a script that will schedule a monthly parity check on the 1st of the month at midnight. 
Use NOCORRECT if you do not want parity to be automatically updated. 
Use CORRECT if you do want it automatically updated. It is recommended that you NOT automatically correct parity, since it might be a data drive that is in error, and the parity drive might be correct. 
Challenge is determining which is correct, and which is in error. unRAID normally assumes the data is correct and parity is wrong. Pressing the "Check" button on the web-interface will check AND update parity based on the data disks

 

As for the check you advised me to do again, it finished and now shows 3333 errors, so it's dropping. I'm running another check now.

 

Last checked on Sat Aug 8 13:47:47 2015 ADT, finding 3333 errors.

Last checked on Sat Aug 5 2015 ADT, finding 6266 errors.

Last checked on Sat Aug 1 2015 ADT, finding 5500 errors.

 

Link to comment

I don't know how you got to this point, but after a correcting parity check, all parity errors should be fixed and subsequent parity checks should have zero parity errors. Not less than the last time but exactly zero. Normally there should be NO parity errors. Any parity errors at all need to be investigated. If your parity isn't ALL correct then you can't expect to rebuild a failed drive.
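
To illustrate why the count should drop to exactly zero, here is a toy model. It assumes single parity computed as the XOR of the data disks, and it assumes every disk returns the same data on every read. The disk contents and function names are invented for the example.

```python
from functools import reduce
from operator import xor

def compute_parity(data_disks):
    """XOR the blocks at each position across all data disks."""
    return [reduce(xor, blocks) for blocks in zip(*data_disks)]

def parity_check(data_disks, parity, correct):
    """Return the number of sync errors; rewrite parity in place if correct=True."""
    expected = compute_parity(data_disks)
    errors = 0
    for i, (have, want) in enumerate(zip(parity, expected)):
        if have != want:
            errors += 1
            if correct:
                parity[i] = want
    return errors

# Three toy data disks with four blocks each.
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
parity = compute_parity(data)
parity[1] ^= 0xFF  # simulate two stale parity blocks
parity[3] ^= 0x01

first = parity_check(data, parity, correct=True)    # counts and fixes the errors
second = parity_check(data, parity, correct=False)  # nothing should be left
print(first, second)  # prints "2 0"
```

In this model the second check can only find errors if the data read back differs from what the first check saw, so a count that shrinks without reaching zero suggests something in the system is returning inconsistent data between passes.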

Link to comment

So, was your most recent parity check a correcting parity check?

 

It's not entirely clear to me whether there has been a correcting parity check, which should have corrected all parity errors, followed by another parity check, which would have verified that no parity errors remained.

 

Have you done more than one parity check since the last time you rebooted? If so, post a syslog.

Link to comment

Looking at the syslog it seems you may have some sort of file system corruption on disk 6 that needs resolving.

 

It would also be sensible to get SMART reports for all your disks to see if any of them have non-zero values for Pending Sectors as that might be a possible cause of inconsistent results when trying to correct parity.
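
As a rough sketch of that check, the function below scans the attribute table printed by smartctl -A and reports any watched attribute with a non-zero raw value. The list of watched attributes is my own choice for illustration, and the parser assumes the usual ten-column layout (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw).

```python
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)

def nonzero_watched_attributes(smartctl_output):
    """Map each watched attribute with a non-zero raw value to that value."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) != 0:
            flagged[name] = int(raw)
    return flagged
```

An empty result for every disk would rule out the obvious media problems, though it would not rule out cabling, controller, or memory issues.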

Link to comment

Here is the SMART report. The only issue I see is that the two oldest drives (2x 1TB) are over the power-on hours threshold. Disk 6, which you pointed out, is the newest drive in the system. It's also the same drive that reported errors when I ran reiserfsck --check and --rebuild-tree.

 

Pending Sectors

S300JX7M: OK - Current_Pending_Sector is 0
WD-WCAV52771708: OK - Current_Pending_Sector is 0
5VPCREE7: OK - Current_Pending_Sector is 0
WD-WMAVU0591376: OK - Current_Pending_Sector is 0
WD-WCAWZ2958513: OK - Current_Pending_Sector is 0
WD-WCC1T1372439: OK - Current_Pending_Sector is 0
WD-WCC1T1394124: OK - Current_Pending_Sector is 0
Z500DL5F: OK - Current_Pending_Sector is 0

 

DISK6 Report

Z500DL5F: OK - ATA_Error_Count is 
Z500DL5F: OK - Raw_Read_Error_Rate is 174723920
Z500DL5F: OK - Spin_Up_Time is 0
Z500DL5F: OK - Start_Stop_Count is 515
Z500DL5F: OK - Spin_Retry_Count is 0
Z500DL5F: OK - Power_Cycle_Count is 6
Z500DL5F: OK - Reallocated_Sector_Ct is 0
Z500DL5F: OK - Seek_Error_Rate is 18022861
Z500DL5F: OK - Temperature_Celsius is 39 (0 19 0 0 0)
Z500DL5F: OK - Current_Pending_Sector is 0
Z500DL5F: OK - Power_On_Hours is 2527
Z500DL5F: OK - Offline_Uncorrectable is 0
Z500DL5F: OK - UDMA_CRC_Error_Count is 0

 

 

Full SMART report

SERVER unRAID Server    Sun Aug 9 13:10:50 ADT 2015
Array Status
STARTED; 7 disks in array.

Parity Check in progress	
Total Size	3,907,018,532	 KB
Current	2,025,657,136	 (51.8%)
Speed	40,333	 KB/sec
Finish	774	 minutes
Sync Errors	0	 (corrected)

S300JX7M: OK - ATA_Error_Count is 
S300JX7M: OK - Raw_Read_Error_Rate is 160237480
S300JX7M: OK - Spin_Up_Time is 0
S300JX7M: OK - Start_Stop_Count is 1736
S300JX7M: OK - Spin_Retry_Count is 0
S300JX7M: OK - Power_Cycle_Count is 56
S300JX7M: OK - Reallocated_Sector_Ct is 0
S300JX7M: OK - Seek_Error_Rate is 43262053
S300JX7M: OK - Temperature_Celsius is 31 (0 14 0 0 0)
S300JX7M: OK - Current_Pending_Sector is 0
S300JX7M: OK - Power_On_Hours is 8175
S300JX7M: OK - Offline_Uncorrectable is 0
S300JX7M: OK - UDMA_CRC_Error_Count is 0
WD-WCAV52771708: OK - ATA_Error_Count is 5
WD-WCAV52771708: OK - Raw_Read_Error_Rate is 0
WD-WCAV52771708: OK - Spin_Up_Time is 6641
WD-WCAV52771708: OK - Start_Stop_Count is 8374
WD-WCAV52771708: OK - Spin_Retry_Count is 0
WD-WCAV52771708: OK - Calibration_Retry_Count is 0
WD-WCAV52771708: OK - Power_Cycle_Count is 632
WD-WCAV52771708: OK - Reallocated_Sector_Ct is 0
WD-WCAV52771708: OK - Seek_Error_Rate is 0
WD-WCAV52771708: OK - Temperature_Celsius is 42
WD-WCAV52771708: OK - Reallocated_Event_Count is 0
WD-WCAV52771708: OK - Current_Pending_Sector is 0
WD-WCAV52771708: WARNING - Power_On_Hours it is now 43676 (warning threshold is 20000)
WD-WCAV52771708: OK - Offline_Uncorrectable is 0
WD-WCAV52771708: OK - UDMA_CRC_Error_Count is 0
WD-WCAV52771708: OK - Multi_Zone_Error_Rate is 0
5VPCREE7: OK - ATA_Error_Count is 
5VPCREE7: OK - Raw_Read_Error_Rate is 171151553
5VPCREE7: OK - Spin_Up_Time is 0
5VPCREE7: OK - Start_Stop_Count is 2144
5VPCREE7: OK - Spin_Retry_Count is 0
5VPCREE7: OK - Power_Cycle_Count is 91
5VPCREE7: OK - Reallocated_Sector_Ct is 0
5VPCREE7: OK - Seek_Error_Rate is 5670907
5VPCREE7: OK - Temperature_Celsius is 30 (0 18 0 0 0)
5VPCREE7: OK - Current_Pending_Sector is 0
5VPCREE7: OK - Power_On_Hours is 13354
5VPCREE7: OK - Offline_Uncorrectable is 0
5VPCREE7: OK - UDMA_CRC_Error_Count is 0
WD-WMAVU0591376: OK - ATA_Error_Count is 
WD-WMAVU0591376: OK - Raw_Read_Error_Rate is 0
WD-WMAVU0591376: OK - Spin_Up_Time is 5850
WD-WMAVU0591376: OK - Start_Stop_Count is 5580
WD-WMAVU0591376: OK - Spin_Retry_Count is 0
WD-WMAVU0591376: OK - Calibration_Retry_Count is 0
WD-WMAVU0591376: OK - Power_Cycle_Count is 415
WD-WMAVU0591376: OK - Reallocated_Sector_Ct is 0
WD-WMAVU0591376: OK - Seek_Error_Rate is 0
WD-WMAVU0591376: OK - Temperature_Celsius is 28
WD-WMAVU0591376: OK - Reallocated_Event_Count is 0
WD-WMAVU0591376: OK - Current_Pending_Sector is 0
WD-WMAVU0591376: *ERROR* - Power_On_Hours it is now 44650 (error threshold is 44000)
WD-WMAVU0591376: OK - Offline_Uncorrectable is 0
WD-WMAVU0591376: OK - UDMA_CRC_Error_Count is 0
WD-WMAVU0591376: OK - Multi_Zone_Error_Rate is 0
WD-WCAWZ2958513: OK - ATA_Error_Count is 
WD-WCAWZ2958513: OK - Raw_Read_Error_Rate is 0
WD-WCAWZ2958513: OK - Spin_Up_Time is 9875
WD-WCAWZ2958513: OK - Start_Stop_Count is 2694
WD-WCAWZ2958513: OK - Spin_Retry_Count is 0
WD-WCAWZ2958513: OK - Calibration_Retry_Count is 0
WD-WCAWZ2958513: OK - Power_Cycle_Count is 104
WD-WCAWZ2958513: OK - Reallocated_Sector_Ct is 0
WD-WCAWZ2958513: OK - Seek_Error_Rate is 0
WD-WCAWZ2958513: OK - Temperature_Celsius is 33
WD-WCAWZ2958513: OK - Reallocated_Event_Count is 0
WD-WCAWZ2958513: OK - Current_Pending_Sector is 0
WD-WCAWZ2958513: OK - Power_On_Hours is 15774
WD-WCAWZ2958513: OK - Offline_Uncorrectable is 0
WD-WCAWZ2958513: OK - UDMA_CRC_Error_Count is 0
WD-WCAWZ2958513: OK - Multi_Zone_Error_Rate is 0
WD-WCC1T1372439: OK - ATA_Error_Count is 
WD-WCC1T1372439: OK - Raw_Read_Error_Rate is 0
WD-WCC1T1372439: OK - Spin_Up_Time is 5950
WD-WCC1T1372439: OK - Start_Stop_Count is 2887
WD-WCC1T1372439: OK - Spin_Retry_Count is 0
WD-WCC1T1372439: OK - Calibration_Retry_Count is 0
WD-WCC1T1372439: OK - Power_Cycle_Count is 107
WD-WCC1T1372439: OK - Reallocated_Sector_Ct is 0
WD-WCC1T1372439: OK - Seek_Error_Rate is 0
WD-WCC1T1372439: OK - Temperature_Celsius is 32
WD-WCC1T1372439: OK - Reallocated_Event_Count is 0
WD-WCC1T1372439: OK - Current_Pending_Sector is 0
WD-WCC1T1372439: OK - Power_On_Hours is 14740
WD-WCC1T1372439: OK - Offline_Uncorrectable is 0
WD-WCC1T1372439: OK - UDMA_CRC_Error_Count is 0
WD-WCC1T1372439: OK - Multi_Zone_Error_Rate is 0
WD-WCC1T1394124: OK - ATA_Error_Count is 
WD-WCC1T1394124: OK - Raw_Read_Error_Rate is 10
WD-WCC1T1394124: OK - Spin_Up_Time is 6100
WD-WCC1T1394124: OK - Start_Stop_Count is 2624
WD-WCC1T1394124: OK - Spin_Retry_Count is 0
WD-WCC1T1394124: OK - Calibration_Retry_Count is 0
WD-WCC1T1394124: OK - Power_Cycle_Count is 106
WD-WCC1T1394124: OK - Reallocated_Sector_Ct is 0
WD-WCC1T1394124: OK - Seek_Error_Rate is 0
WD-WCC1T1394124: OK - Temperature_Celsius is 32
WD-WCC1T1394124: OK - Reallocated_Event_Count is 0
WD-WCC1T1394124: OK - Current_Pending_Sector is 0
WD-WCC1T1394124: OK - Power_On_Hours is 14350
WD-WCC1T1394124: OK - Offline_Uncorrectable is 0
WD-WCC1T1394124: *ERROR* - UDMA_CRC_Error_Count it is now 176600 (error threshold is 75)
WD-WCC1T1394124: OK - Multi_Zone_Error_Rate is 0
Z500DL5F: OK - ATA_Error_Count is 
Z500DL5F: OK - Raw_Read_Error_Rate is 174723920
Z500DL5F: OK - Spin_Up_Time is 0
Z500DL5F: OK - Start_Stop_Count is 515
Z500DL5F: OK - Spin_Retry_Count is 0
Z500DL5F: OK - Power_Cycle_Count is 6
Z500DL5F: OK - Reallocated_Sector_Ct is 0
Z500DL5F: OK - Seek_Error_Rate is 18022861
Z500DL5F: OK - Temperature_Celsius is 39 (0 19 0 0 0)
Z500DL5F: OK - Current_Pending_Sector is 0
Z500DL5F: OK - Power_On_Hours is 2527
Z500DL5F: OK - Offline_Uncorrectable is 0
Z500DL5F: OK - UDMA_CRC_Error_Count is 0
8 device(s) active, 1 sleeping, 0 did not return SMART data.
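
With this many lines it is easy to miss the few that are not OK. A small filter like the one below can pull them out. It is a sketch that assumes the "SERIAL: STATUS - Attribute ..." layout seen in this paste, and the function name is made up.

```python
import re

# Layout inferred from the report above: "SERIAL: STATUS - Attribute_Name ..."
SUMMARY_LINE = re.compile(r"^(\S+): (\*ERROR\*|WARNING|OK) - (\w+) ")

def non_ok_attributes(lines):
    """Return (serial, status, attribute) for every line that is not OK."""
    flagged = []
    for line in lines:
        m = SUMMARY_LINE.match(line)
        if m and m.group(2) != "OK":
            flagged.append((m.group(1), m.group(2).strip("*"), m.group(3)))
    return flagged
```

Run over the report above, this would return the two Power_On_Hours entries plus UDMA_CRC_Error_Count on WD-WCC1T1394124 (now 176600 against a threshold of 75). That attribute generally counts transfer errors on the link between the drive and the controller, so the cabling to that drive may be worth a look.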

Link to comment

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    Z500DL5F
LU WWN Device Id: 5 000c50 079456cc3
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Aug  9 14:39:40 2015 ADT

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
				was never started.
				Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		(   89) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				No Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 334) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       103032368
  3 Spin_Up_Time            0x0003   097   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       515
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   072   060   030    Pre-fail  Always       -       18027181
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2529
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       6
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   096   096   000    Old_age   Always       -       4
190 Airflow_Temperature_Cel 0x0022   062   057   045    Old_age   Always       -       38 (Min/Max 37/39)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       23
193 Load_Cycle_Count        0x0032   097   097   000    Old_age   Always       -       6049
194 Temperature_Celsius     0x0022   038   043   000    Old_age   Always       -       38 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       664h+59m+31.878s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       17161621216
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       60570380593

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Link to comment
