sacretagent Posted March 1, 2011

Hi, I'm scratching my head a bit here:

Mar 1 14:48:36 p5bplus kernel: ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Mar 1 14:48:36 p5bplus kernel: ata9.00: irq_stat 0x40000001 (Drive related)
Mar 1 14:48:36 p5bplus kernel: ata9.00: failed command: READ DMA EXT (Minor Issues)
Mar 1 14:48:36 p5bplus kernel: ata9.00: cmd 25/00:a8:27:55:13/00:03:64:00:00/e0 tag 0 dma 479232 in (Drive related)
Mar 1 14:48:36 p5bplus kernel: res 51/40:17:b0:55:13/00:03:64:00:00/e0 Emask 0x9 (media error) (Errors)
Mar 1 14:48:36 p5bplus kernel: ata9.00: status: { DRDY ERR } (Drive related)
Mar 1 14:48:36 p5bplus kernel: ata9.00: error: { UNC } (Errors)
Mar 1 14:48:36 p5bplus kernel: ata9.00: configured for UDMA/133 (Drive related)
Mar 1 14:48:36 p5bplus kernel: ata9: EH complete (Drive related)

I'm not sure what these are. I already changed a SATA cable to rule that out, but the problem returns. It is one of the ports on my motherboard. I might be running at the threshold of my PSU's capacity, as it started after adding a disk, and it was not this disk.

Syslog attached. I'm running 5.0b6, but that is not related, as I had these errors before too. It just gets on my nerves now that it is filling up my logs.

syslog-2011-03-01.zip
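For readers trying to make sense of the `cmd` and `res` lines in the log above, here is a minimal Python sketch that unpacks them. It assumes the usual libata layout (first byte / feature-or-error : sector count : three low LBA bytes / the matching "previous" high-order bytes / device register) and 512-byte logical sectors. The function names are made up for illustration, and only a handful of well-known status and error bits are decoded.

```python
# Sketch: decode the libata "cmd"/"res" taskfile strings from the syslog.
# Assumed layout: AA/FF:NS:L0:L1:L2/hf:hn:L3:L4:L5/DD
#   AA = command (cmd line) or status (res line)
#   FF = feature (cmd line) or error (res line)

STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_taskfile(taskfile):
    """Return first byte, second byte, 48-bit LBA and sector count."""
    head, current, previous, _device = taskfile.split("/")
    second, nsect, lbal, lbam, lbah = (int(x, 16) for x in current.split(":"))
    _hob_feat, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in previous.split(":")
    )
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "first": int(head, 16),
        "second": second,
        "lba": lba,
        "count": (hob_nsect << 8) | nsect,
    }


def flag_names(value, table):
    """List the names of the known bits that are set in value."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


cmd = decode_taskfile("25/00:a8:27:55:13/00:03:64:00:00/e0")
res = decode_taskfile("51/40:17:b0:55:13/00:03:64:00:00/e0")

print("read starts at LBA", cmd["lba"], "for", cmd["count"], "sectors")
print("bytes requested:", cmd["count"] * 512)
print("status:", flag_names(res["first"], STATUS_BITS))
print("error:", flag_names(res["second"], ERROR_BITS))
print("LBA reported in result:", res["lba"])
```

The requested length works out to the same 479232 bytes the kernel prints after `dma`, which is a reasonable sanity check on the assumed layout. The LBA in the `res` line is presumably the sector the drive gave up on.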
sacretagent Posted March 1, 2011

So what are media errors? I ran reiserfsck and this is the result:

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Mar 1 18:37:51 2011
###########
Replaying journal: Done.
Reiserfs journal '/dev/md2' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree: finished
No corruptions found
There are on the filesystem:
    Leaves 229750
    Internal nodes 1478
    Directories 459
    Other files 2277
    Data block pointers 232091559 (0 of them are zero)
    Safe links 0
###########
reiserfsck finished at Tue Mar 1 18:54:39 2011
###########
root@p5bplus:~#

What else can I do?
sacretagent Posted March 1, 2011

I ran a short SMART report:

root@p5bplus:~# smartctl -a -d ata /dev/sdj
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (Adv. Format) family
Device Model:     WDC WD10EARS-00MVWB0
Serial Number:    WD-WCAZA0014151
Firmware Version: 50.0AB50
User Capacity:    1,000,204,886,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Mar 1 19:03:58 2011 ICT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (18300) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 211) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 188   188   051    Pre-fail Always  -           12249
  3 Spin_Up_Time            0x0027 174   163   021    Pre-fail Always  -           6291
  4 Start_Stop_Count        0x0032 099   099   000    Old_age  Always  -           1026
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 096   096   000    Old_age  Always  -           2995
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   100   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           148
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           116
193 Load_Cycle_Count        0x0032 192   192   000    Old_age  Always  -           25796
194 Temperature_Celsius     0x0022 115   083   000    Old_age  Always  -           35
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 192   192   000    Old_age  Always  -           1309
198 Offline_Uncorrectable   0x0030 199   198   000    Old_age  Offline -           196
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 192   184   000    Old_age  Offline -           2375

SMART Error Log Version: 1
ATA Error Count: 7 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

SMART Self-test log structure revision number 1
Num Test_Description Status                   Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed without error  00%       2995            -
# 2 Short offline    Interrupted (host reset) 10%       2992            -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
root@p5bplus:~#

I don't see anything wrong, but I am a noob on drives.
Joe L. Posted March 1, 2011

Media errors are unreadable sectors on a disk. Your disk is basically failing... and badly. There are over 1300 unreadable sectors marked for re-allocation when they are next written.

197 Current_Pending_Sector 0x0032 192 192 000 Old_age Always - 1309

You need to replace the disk. DO NOT perform a parity CHECK; if you do, parity will be updated with the incorrect data from the unreadable sectors. Stop the array, replace the drive, and then simply "Start" the array. It will reconstruct the correct contents from parity, as long as you do not overwrite parity by performing a "Check".

Joe L.
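Since the overall "PASSED" line in the report says nothing about pending sectors, it can help to look at the raw values directly. The following is a rough Python sketch, not part of any smartmontools tooling, that scans the attribute table from `smartctl -a` output and lists the sector-health attributes whose raw value is non-zero. The choice of attributes to watch (5, 197, 198) and the function names are assumptions for illustration, and it assumes the raw value is a plain integer in the tenth column, as in the report posted above.

```python
# Sketch: flag non-zero raw values for sector-health attributes in the
# attribute table printed by `smartctl -a`. The watch list is an assumption.

WATCHED_IDS = (5, 197, 198)  # reallocated, pending, offline-uncorrectable

SAMPLE_REPORT = """\
  5 Reallocated_Sector_Ct   0x0033 200 200 140 Pre-fail Always  - 0
197 Current_Pending_Sector  0x0032 192 192 000 Old_age  Always  - 1309
198 Offline_Uncorrectable   0x0030 199 198 000 Old_age  Offline - 196
199 UDMA_CRC_Error_Count    0x0032 200 200 000 Old_age  Always  - 0
"""


def parse_attributes(report):
    """Map attribute ID -> (name, raw value) for every table row found."""
    attributes = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have ten columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attributes[int(fields[0])] = (fields[1], int(fields[9]))
    return attributes


def sector_warnings(report):
    """Return {name: raw} for watched attributes with a non-zero raw value."""
    attributes = parse_attributes(report)
    return {
        name: raw
        for attr_id, (name, raw) in attributes.items()
        if attr_id in WATCHED_IDS and raw > 0
    }


for name, raw in sector_warnings(SAMPLE_REPORT).items():
    print(f"{name}: raw value {raw}")
```

On the rows quoted from this thread it reports Current_Pending_Sector and Offline_Uncorrectable, while Reallocated_Sector_Ct stays quiet because its raw value is still 0.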
Joe L. Posted March 1, 2011

"I might be running on the threshold of my PSU capacities as it started after adding a disk ... and it was not this disk"

That power supply has multiple 12 volt rails, but ONLY 1 is used for all the disks, and it is probably also shared by the motherboard. That 1 rail is apparently rated at 19 Amps peak. It is probably WAY overloaded, even if you have all green drives and figure on only 2 Amps per drive. You have about 33 Amps being drawn upon spin-up by your disks. Add to that the motherboard's needs, and the fans... and I figure you need a 12 volt rail of 40 to 45 Amps capacity.
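The arithmetic behind that estimate can be sketched in a few lines of Python. The 19 A rail rating, the 2 A per green drive, the 33 A spin-up draw and the 40 to 45 A target are the figures from the post above. The motherboard-and-fan allowance of 7 to 12 A is not stated anywhere in the thread; it is simply what is left when the disk draw is subtracted from the target, so treat it as an assumption.

```python
import math

# Figures taken from the post above.
RAIL_PEAK_A = 19.0        # rated peak of the single 12 V rail feeding the disks
AMPS_PER_DRIVE = 2.0      # rough spin-up draw assumed for a green drive
DISK_SPINUP_A = 33.0      # total disk draw at spin-up quoted in the post

# Assumption: allowance for motherboard and fans implied by the 40-45 A target.
OTHER_LOAD_A = (7.0, 12.0)


def max_drives_on_rail(rail_amps, amps_per_drive):
    """How many drives can spin up at once without exceeding the rail."""
    return math.floor(rail_amps / amps_per_drive)


def overload(load_amps, rail_amps):
    """Amps drawn beyond the rail rating (negative means headroom)."""
    return load_amps - rail_amps


print("drives the 19 A rail can spin up:", max_drives_on_rail(RAIL_PEAK_A, AMPS_PER_DRIVE))
print("disk-only overload at spin-up (A):", overload(DISK_SPINUP_A, RAIL_PEAK_A))
print("suggested rail capacity (A):", [DISK_SPINUP_A + extra for extra in OTHER_LOAD_A])
```

At 2 A per drive, 33 A corresponds to roughly 16 or 17 drives spinning up together, well beyond what a 19 A rail is rated for even before the motherboard and fans are counted.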
SSD Posted March 1, 2011

I have run a large array with an older 550 watt PSU that was reported to be multi-rail, but I found out later that internally it was single-rail. (There was a time when "multi-rail" had marketing appeal, I guess.) But I agree that if this really is a multi-rail PSU internally, your PSU is underpowered.

I would be a little surprised if an underpowered PSU caused read errors on this one disk. I've always thought that an underpowered PSU would cause boot problems, as disks could not get enough power to spin up (spinning up is the most power-hungry operation your computer does).

There have been several cases in the forums of pending sectors clearing themselves and all going back to zero after a parity check. Although I agree with Joe L. that running a regular parity check could pollute your parity, I would suggest you run a read-only parity check to see if the pending sectors clear or become true reallocated sectors. It would also be instructive to see if you get parity sync errors.
dgaschk Posted March 1, 2011

Change the PSU, then run a read-only parity check.
sacretagent Posted March 2, 2011

OK, parity was already botched through my own fault, so there's no saving that info. Not a biggie. I just wanted to know: is it OK to run a preclear on a disk while a parity sync is running? I removed disk 2 from the array and now I am trying to get a valid parity before I add a new disk. I might go for a WD Black 2 TB for parity and try preclearing the old parity drive.

Another PSU is planned, but I want to go for a good one, so I need to save some money to buy a Corsair TX 850, as eventually I want to go for a 20-drive setup. I have the icute case already standing here, but I'm waiting for the new PSU and for a way to get 5-in-3 cages here in Thailand.
Rajahal Posted March 3, 2011

You can run preclear during a parity check, or during any other array activity for that matter. Unless you want to run mostly 7200 rpm drives, a Corsair 650W PSU is enough for 20+ drives.