November 11, 2010

Hi, I need some help with my unRAID system. It is a 4.5.6 Pro system with 12 drives.

Six weeks ago I received an error when I tried to copy a file from my Windows machine to the unRAID volume. I found out that one of my hard drives had a lot of pending sectors, raw read errors, and all this beautiful stuff. I decided to replace the drive and send it to my dealer for a warranty replacement. From past experience I know that this takes a few weeks, and because I did not want to leave my data unprotected I bought a new HD, inserted it, and rebuilt the volume. Everything was fine for the last 4 weeks.

Yesterday I received the replacement drive. Because I am running out of space (sigh...), I decided to replace one of my 1TB drives (DRIVE 3) with this new 2TB drive (WD20EARS with jumper set). I shut down the unRAID server, replaced the drive, booted, and started the rebuild process, as I have done dozens of times before. The expected time was 860 minutes, so I went to sleep.

This morning I checked the status. At first I saw only green lights, but then I noticed more than 23,000 errors on DRIVE 4 =:-(

The SMART report for this drive shows:

      1 Raw_Read_Error_Rate      0x000f  099  099  051  Pre-fail  Always   -  2669
      3 Spin_Up_Time             0x0007  075  075  011  Pre-fail  Always   -  8280
      4 Start_Stop_Count         0x0032  099  099  000  Old_age   Always   -  1127
      5 Reallocated_Sector_Ct    0x0033  100  100  010  Pre-fail  Always   -  3
      7 Seek_Error_Rate          0x000f  100  100  051  Pre-fail  Always   -  0
      8 Seek_Time_Performance    0x0025  100  100  015  Pre-fail  Offline  -  12739
      9 Power_On_Hours           0x0032  097  097  000  Old_age   Always   -  13238
     10 Spin_Retry_Count         0x0033  100  100  051  Pre-fail  Always   -  0
     11 Calibration_Retry_Count  0x0012  100  100  000  Old_age   Always   -  0
     12 Power_Cycle_Count        0x0032  100  100  000  Old_age   Always   -  278
     13 Read_Soft_Error_Rate     0x000e  099  099  000  Old_age   Always   -  2601
    183 Unknown_Attribute        0x0032  100  100  000  Old_age   Always   -  0
    184 Unknown_Attribute        0x0033  100  100  099  Pre-fail  Always   -  0
    187 Reported_Uncorrect       0x0032  100  100  000  Old_age   Always   -  2608
    188 Unknown_Attribute        0x0032  100  100  000  Old_age   Always   -  0
    190 Airflow_Temperature_Cel  0x0022  072  058  000  Old_age   Always   -  28 (Lifetime Min/Max 20/32)
    194 Temperature_Celsius      0x0022  077  058  000  Old_age   Always   -  23 (Lifetime Min/Max 17/34)
    195 Hardware_ECC_Recovered   0x001a  100  100  000  Old_age   Always   -  228353600
    196 Reallocated_Event_Count  0x0032  100  100  000  Old_age   Always   -  3
    197 Current_Pending_Sector   0x0012  087  087  000  Old_age   Always   -  531
    198 Offline_Uncorrectable    0x0030  100  100  000  Old_age   Offline  -  0
    199 UDMA_CRC_Error_Count     0x003e  100  006  000  Old_age   Always   -  0
    200 Multi_Zone_Error_Rate    0x000a  100  100  000  Old_age   Always   -  0
    201 Soft_Read_Error_Rate     0x000a  072  072  000  Old_age   Always   -  2681

I then connected to the unRAID server by telnet and tried to copy files from DISK4 (the one with errors) to the new DISK3. After a few seconds I received an error. The logs show:

    Nov 11 04:40:01 nas syslogd 1.4.1: restart.
    Nov 11 06:08:06 nas kernel: md: sync done. time=43177sec rate=45244K/sec
    Nov 11 06:08:07 nas kernel: md: recovery thread sync completion status: 0
    Nov 11 07:08:07 nas kernel: mdcmd (4578): spindown 0
    Nov 11 07:08:08 nas kernel: mdcmd (4579): spindown 3
    Nov 11 07:08:09 nas kernel: mdcmd (4580): spindown 9
    Nov 11 07:08:09 nas kernel: mdcmd (4581): spindown 10
    Nov 11 07:08:10 nas kernel: mdcmd (4582): spindown 11
    Nov 11 07:58:40 nas kernel: mdcmd (4887): clear
    Nov 11 07:59:28 nas kernel: mdcmd (4899): spinup 3
    Nov 11 07:59:28 nas kernel:
    Nov 11 07:59:56 nas kernel: mdcmd (4905): spinup 4
    Nov 11 07:59:56 nas kernel:
    Nov 11 08:01:00 nas in.telnetd[5309]: connect from 192.168.1.105 (192.168.1.105)
    Nov 11 08:01:07 nas login[5310]: ROOT LOGIN on `pts/0' from `192.168.1.105'
    Nov 11 08:03:31 nas kernel: REISERFS error (device md3): reiserfs-2025 reiserfs_cache_bitmap_metadata: bitmap block 235438080 is corrupted: first bit must be 1
    Nov 11 08:03:31 nas kernel: REISERFS (device md3): Remounting filesystem read-only
    Nov 11 08:03:31 nas kernel: REISERFS warning (device md3): clm-6006 reiserfs_dirty_inode: writing inode 2003 on readonly FS
    Nov 11 08:04:07 nas kernel: mdcmd (4942): clear
    Nov 11 08:04:16 nas kernel: mdcmd (4947): spinup 3

On the forum I found the hint to run a filesystem check:

    cd
    samba stop
    umount /dev/md3
    reiserfsck --check /dev/md3

and received a lot of ...

    Trans replayed: mountid 99, transid 146490, desc 1051, len 6, commit 1058, next trans offset 1041
    Trans replayed: mountid 99, transid 146491, desc 1059, len 24, commit 1084, next trans offset 1067
    Trans replayed: mountid 99, transid 146492, desc 1085, len 21, commit 1107, next trans offset 1090
    Trans replayed: mountid 99, transid 146493, desc 1108, len 23, commit 1132, next trans offset 1115
    Trans replayed: mountid 99, transid 146494, desc 1133, len 23, commit 1157, next trans offset 1140
    Trans replayed: mountid 99, transid 146495, desc 1158, len 22, commit 1181, next trans offset 1164
    Trans replayed: mountid 99, transid 146496, desc 1182, len 24, commit 1207, next trans offset 1190
    Trans replayed: mountid 99, transid 146497, desc 1208, len 10, commit 1219, next trans offset 1202
    Replaying journal: Done.
    Checking internal tree.. finished
    Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
    Checking Semantic tree: finished
    1 found corruptions can be fixed when running with --fix-fixable
    ###########
    reiserfsck finished at Thu Nov 11 08:48:23 2010
    ###########

Sorry for the long description and my English, but now I'm sitting in front of my computer asking myself what the next step is.

Can I trust the parity disk?
Can I trust the newly rebuilt DISK3? (The old 1TB disk with all its data is lying next to me.)
Can I trust DISK4?

Any advice is welcome... unfortunately I'm a Linux noob.

Attached: syslog-2010-11-11.txt

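A side note for anyone reading the SMART report in this post: the raw values usually watched for media trouble are Reallocated_Sector_Ct (5), Reported_Uncorrect (187), Current_Pending_Sector (197) and Offline_Uncorrectable (198). Below is a minimal Python sketch, not an unRAID or smartmontools tool, that picks the non-zero ones out of pasted report text. The function name nonzero_watched, the watch list, and the assumption that the raw value is the tenth whitespace-separated field are illustrative choices, and the sample rows are copied from the report above.

```python
# Hypothetical helper for illustration only; the name and the watch list
# are assumptions, not part of any real tool.
WATCHED_IDS = {5, 187, 197, 198}

# A few rows copied from the SMART report in the post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail Always  - 3
187 Reported_Uncorrect      0x0032 100 100 000 Old_age  Always  - 2608
194 Temperature_Celsius     0x0022 077 058 000 Old_age  Always  - 23 (Lifetime Min/Max 17/34)
197 Current_Pending_Sector  0x0012 087 087 000 Old_age  Always  - 531
198 Offline_Uncorrectable   0x0030 100 100 000 Old_age  Offline - 0
"""


def nonzero_watched(report):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    hits = {}
    for line in report.splitlines():
        fields = line.split()
        # Skip anything that does not look like an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]  # assumed position of the raw value
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            hits[fields[1]] = int(raw)
    return hits


if __name__ == "__main__":
    for name, raw in nonzero_watched(SAMPLE).items():
        print(f"{name}: {raw}")
```

On the rows above this reports the 3 reallocated sectors, the 2608 reported uncorrectable errors and the 531 pending sectors, which fits a disk that cannot be read reliably.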
November 11, 2010

Your solution is to run reiserfsck with the --fix-fixable option, as specified in the output of the reiserfsck you ran:

    cd
    samba stop
    umount /dev/md3
    reiserfsck --fix-fixable /dev/md3

November 11, 2010 (Author)

Hmm... I'm not sure that the rebuild of DRIVE 3 has "really" rebuilt my data. Joe, if I got read errors on Drive 4 while I rebuilt DRIVE 3, won't that corrupt my data?

What is the best way to go back to the old HD?

1. Replace the new 2TB Drive 3 with the old 1TB drive
2. Rebuild the parity disk?
3. Exchange Drive 4
4. Rebuild DRIVE 4

Thanks!!

November 11, 2010

> Hmm... I'm not sure that the rebuild of DRIVE 3 has "really" rebuilt my data. Joe, if I got read errors on Drive 4 while I rebuilt DRIVE 3, won't that corrupt my data?

Correct. It will corrupt your data.

> What is the best way to go back to the old HD?
> Replace the new 2TB Drive 3 with the old 1TB drive
> Rebuild the parity disk?
> Exchange Drive 4
> Rebuild DRIVE 4
> Thanks!!

If disk4 is giving errors, then rebuilding parity with it will result in the same corruption of parity. There is no simple solution to your errors.

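To illustrate why read errors on disk4 end up in the rebuilt disk3, here is a minimal Python sketch. It assumes single parity computed as the XOR of all data disks and uses three tiny made-up "disks"; the byte values, the xor_blocks helper and the way the bad read is modelled (a wrong byte returned) are all assumptions for illustration, not how the unRAID driver is implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Made-up contents of three data disks (4 bytes each, for illustration).
disk3 = b"\x11\x22\x33\x44"
disk4 = b"\xa0\xb0\xc0\xd0"
disk5 = b"\x0f\x0e\x0d\x0c"

# Parity is the XOR of all data disks.
parity = xor_blocks([disk3, disk4, disk5])

# Healthy rebuild of disk3: XOR parity with every other data disk.
rebuilt_ok = xor_blocks([parity, disk4, disk5])

# Same rebuild, but disk4 returns a wrong value for byte 2 (a bad read).
disk4_bad = b"\xa0\xb0\x00\xd0"
rebuilt_bad = xor_blocks([parity, disk4_bad, disk5])

# Positions where the rebuilt disk3 differs from the original.
damaged = [i for i in range(len(disk3)) if rebuilt_bad[i] != disk3[i]]

if __name__ == "__main__":
    print("clean rebuild matches original:", rebuilt_ok == disk3)
    print("damaged positions after bad read:", damaged)
```

In this model every position that disk4 misreads comes out wrong on the rebuilt disk3 at the same position. The same reasoning applies in the other direction: recalculating parity while disk4 misreads bakes the wrong values into parity.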
November 11, 2010 (Author)

Hi, my question was not clear, sorry. I have read some things in the FAQ. Can I do this:

1. Shut down unRAID
2. Replace the new 2TB drive with the old 1TB one (DISK3)
3. Unplug DRIVE4
4. Boot the server
5. Remove DRIVE4 from the configuration and check that DRIVE3 is the old one
6. Log in as root and run initconfig

Result: a working system with 11 drives, losing everything that I have not copied from the old DRIVE4.

November 11, 2010

> Hi, my question was not clear, sorry. I have read some things in the FAQ. Can I do this:
> Shut down unRAID
> Replace the new 2TB drive with the old 1TB one (DISK3)
> Unplug DRIVE4
> Boot the server
> Remove DRIVE4 from the configuration and check that DRIVE3 is the old one
> Log in as root and run initconfig
> Result: a working system with 11 drives, losing everything that I have not copied from the old DRIVE4.

Yes, you can do that. When you next start the array it will begin a complete new parity calculation. You'll be without any parity protection until it is complete. When it is done, you should then do a full parity check by pressing the "Check" button.

Joe L.