Repeating Error On Screen

August 14, 2014

Hello,

I initiated a parity sync last night and went to bed. I have a monitor attached to my UNRAID server, and when I woke up the monitor was scrolling this over and over again:

reiserfs error (device sds1): zam-7001 reiserfs_find_entry: io error

I ran the command:

cp /var/log/syslog /boot/syslog.txt

to grab my log, but it's empty.

UNRAID login: root
Linux 3.9.11p-unRAID.
root@UNRAID:~# cp /var/log/syslog /boot/syslog.txt
root@UNRAID:~#
root@UNRAID:~# cp /var/log/syslog /boot/syslog.txt
root@UNRAID:~# cp /var/log/syslog /boot/syslog1.txt
root@UNRAID:~# cat /car/log/syslog
cat: /car/log/syslog: No such file or directory
root@UNRAID:~# cat /var/log/syslog
root@UNRAID:~# tail -f /var/log/syslog
^C
root@UNRAID:~#
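
A quick way to confirm that the log really is empty before copying it is to test the file size. This is only a minimal sketch, assuming a Bourne-style shell; the function name `copy_log_if_nonempty` is made up for illustration and is not part of unRAID.

```shell
# Hypothetical helper: copy a log file only if it exists and is non-empty.
copy_log_if_nonempty() {
    src=$1
    dest=$2
    # "-s" is true only when the file exists and has a size greater than zero.
    if [ -s "$src" ]; then
        cp "$src" "$dest" && echo "copied $src to $dest"
    else
        echo "$src is missing or empty; nothing copied" >&2
        return 1
    fi
}

# Usage on the server would look like:
# copy_log_if_nonempty /var/log/syslog /boot/syslog.txt
```

If the helper reports an empty file, the copy on the flash drive would have been empty too, so the problem lies with logging rather than with the `cp` command.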

The webgui is still available, and parity sync is still going.

Does anyone have any suggestions? Thanks for the help!

August 15, 2014

Author

I just got home and have had some time to try and look into this a bit further.

I don't have a resolution yet, but I did find this thread:

http://lime-technology.com/forum/index.php?topic=8386.0

In this thread Joe L. tells Teamhood that his file system appears to have become corrupt. He was receiving a combination of the error I am receiving plus an error that his file system was read-only. I am not receiving the read-only error.

Should I attempt a file system repair?

Thank You

August 15, 2014

See the "Check Disk Filesystems" link in my signature.

August 16, 2014

Author

Hello dgaschk,

You are always the one who replies and helps me. It has happened several times over the years, and I really appreciate your assistance.

I think my cache drive is dying. I noticed that Simple Features showed a SMART error on the drive that would go away and come back. I ran

smartctl -a -A  /dev/sds | todos >/boot/smart.txt

on the drive and got this output:

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               /10:0:7:
Product:              0
Physical block size:  0 bytes
Lowest aligned LBA:   14138
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

I did what smartctl told me and ran the following:

smartctl -a -A -T permissive /dev/sds | todos >/boot/smart.txt

and got this:

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Short INQUIRY response, skip product id

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Read defect list: asked for grown list but didn't get it
Error Counter logging not supported

Device does not support Self Test logging

Do you think I should still run the check disk? Or just replace the drive?

August 16, 2014

Author

I took out the disk and put it in another computer. I then ran a SeaTools quick scan on the drive. SeaTools claims the drive is good and that S.M.A.R.T. has not been tripped.

August 16, 2014

If you have file system corruption of any sort, the SMART report will not show it. Running reiserfsck is the only way to fix such issues.

Having said that, a disk can pass the SMART check and still be failing. It would be useful if you provided the full output of the SMART report so that we can see if there are signs of problems. One item of particular interest is whether the value of the pending or reallocated sector counts is non-zero.
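
For reference, the attributes that usually matter here are 5 (Reallocated_Sector_Ct), 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). The following is a rough sketch that pulls their raw values out of a smartctl attribute table, assuming the usual ten-column layout with the raw value in the last column; `sector_counts` is a made-up name, not a smartmontools feature.

```shell
# Hypothetical helper: read a smartctl attribute table on stdin and print
# the raw value of the sector-related attributes (IDs 5, 197 and 198).
sector_counts() {
    # NF >= 10 skips shorter lines, such as the selective self-test spans,
    # whose first field could otherwise be mistaken for an attribute ID.
    awk 'NF >= 10 && ($1 == 5 || $1 == 197 || $1 == 198) { print $2 "=" $10 }'
}

# Usage on the server would look like:
# smartctl -A /dev/sds | sector_counts
```

Any non-zero value in that output would be worth reporting back along with the full report.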

August 16, 2014

Author

Hello itimpi,

After putting the disk back in and running smartctl again, I was able to get the full output.

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164
Serial Number:    <REDACTED>
LU WWN Device Id: 5 000c50 04f347dfa
Firmware Version: CC24
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Aug 16 03:02:48 2014 PDT

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
				was never started.
				Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		(  584) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				No Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 219) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x3085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       127972408
  3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       137
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       955454005
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       13189
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       134
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       11 11 11
189 High_Fly_Writes         0x003a   042   042   000    Old_age   Always       -       58
190 Airflow_Temperature_Cel 0x0022   063   050   045    Old_age   Always       -       37 (Min/Max 26/37)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       104
193 Load_Cycle_Count        0x0032   073   073   000    Old_age   Always       -       54273
194 Temperature_Celsius     0x0022   037   050   000    Old_age   Always       -       37 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10424h+34m+24.623s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       125385362821
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       122287978581

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Aborted by host               90%     13184         -
# 2  Short offline       Completed without error       00%     13184         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

August 16, 2014

That SMART report does not show anything obvious that indicates the disk is dying!

On that basis it is quite likely that something happened that caused file system corruption. You might therefore want to follow the earlier suggestion to run a reiserfsck file system check on the drive. Initially make sure that you only do a check and do not attempt to fix any errors. I would suggest that you report back here with the output of the check before attempting any suggested recovery action.

August 18, 2014

Author

Hello Again,

I have done as requested. I started the array in maintenance mode and ran the following:

reiserfsck --check /dev/md1

###########
reiserfsck --check started at Sun Aug 17 14:17:47 2014
###########
Replaying journal: Done.
Reiserfs journal '/dev/md1' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 711607
        Internal nodes 4334
        Directories 3021
        Other files 41133
        Data block pointers 714805349 (37 of them are zero)
        Safe links 0
###########
reiserfsck finished at Sun Aug 17 15:14:01 2014
###########

I also ran a check on just the drive in question, sds:

reiserfsck --check /dev/sds1

Replaying journal: Done.
Reiserfs journal '/dev/sds1' in blocks [18..8211]: 534 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 217811
        Internal nodes 1427
        Directories 638172
        Other files 426507
        Data block pointers 138123478 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Sun Aug 17 17:17:03 2014
###########

It does not look like any errors were found.
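
If the check output is saved to a file, the summary line can also be tested mechanically. This is a minimal sketch, assuming the report was captured to a text file and that a clean run prints "No corruptions found" as in the output above; `fsck_report_clean` and the file name in the usage comment are made up for illustration.

```shell
# Hypothetical helper: succeed only if a saved reiserfsck report
# contains the "No corruptions found" summary line.
fsck_report_clean() {
    grep -q "No corruptions found" "$1"
}

# Usage would look like:
# fsck_report_clean /boot/reiserfsck_md1.txt && echo "clean"
```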

August 18, 2014

Author

I think I may have figured out my issue.

On boot I got an error message that sda1 was not unmounted properly and that I should run a fsck on it.

That didn't work; I got an error saying it couldn't find the vfat.sys file or something similar.

I did some digging and found that there is a command called dosfsck. I ran the following:

dosfsck -av /dev/sda1

and it showed me a bunch of columns of numbers on screen, along with a comment like "will not repair automatically". Then it gave me three choices (paraphrasing here):

1. Copy backup to original

2. Copy original to backup

3. Do nothing.

Originally I figured the safest option would be to 'copy backup to original'. I chose that option and nothing happened. So I ran it again and chose 'copy original to backup'. It then continued and appeared to repair the filesystem on my USB thumb drive. After that I rebooted, and my issues seem to have gone away.

Thank You
