Treytor Posted April 13, 2015

I got a couple new 5TB drives, replacing a 4TB parity with one. This went without a hitch. I then swapped out an older 2TB drive with the "old" but working parity drive. While populating and expanding the file system it failed around the 2TB mark. I then tried a brand new 4TB drive. Same thing. I then tried a brand new 5TB drive, and it just happened again.

The port this drive is on is running off a Silicon Image 8 port controller, so replacing one of the two cables coming off the controller is not an ideal solution. I checked, cleaned, and re-seated them just to make sure. I also cleaned the hot-swap bay the drive is in just to be sure. I find it curious that the parity rebuild worked perfectly fine right before doing this.

I checked the Tunable (md_num_stripes), Tunable (md_write_limit), and Tunable (md_sync_window) settings, and the first two were default and the 3rd wasn't. I don't remember what it was set at before but I changed it back to default (384) and the rebuild seemed to go faster but it still failed. The only difference being the drive failed after 320 errors instead of the usual 288.

Is it a possibility that there's an error happening somewhere else in my array? I noticed these errors in the syslog a little while after the failure occurred while trying to do something else. Note Disk 5 is NOT the one that is being rebuilt:

Apr 12 20:16:15 Cooper kernel: REISERFS error (device md5): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [128045 1209 0x0 SD]
Apr 12 20:18:19 Cooper shfs/user: shfs_readdir: fstatat: UFC (13) Permission denied
Apr 12 20:18:19 Cooper shfs/user: shfs_readdir: readdir_r: /mnt/disk5/TV (13) Permission denied
Apr 12 20:18:19 Cooper shfs/user: shfs_readdir: fstatat: UFC (13) Permission denied
Apr 12 20:18:19 Cooper shfs/user: shfs_readdir: readdir_r: /mnt/disk5/TV (13) Permission denied
Apr 12 20:18:19 Cooper kernel: md: disk5 read error, sector=1464509760
Apr 12 20:18:19 Cooper kernel: REISERFS error (device md5): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [128045 1209 0x0 SD]
Apr 12 20:18:19 Cooper kernel: REISERFS error (device md5): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [128045 1209 0x0 SD]

I'm not sure if that's related, though. I missed the syslog when the error happened, so I don't have that available right now. I could try again and fetch it if necessary. Any ideas? Thanks!
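For anyone wanting to check a full syslog for the same pattern, here is a minimal sketch that tallies `md: diskN read error` lines per disk. The regular expression is based only on the single line format quoted in the post above, and the function name `tally_md_read_errors` is made up for illustration. It assumes Python 3 is available somewhere to run it against a copy of the log.

```python
import re
from collections import defaultdict

# Matches kernel lines of the form quoted in the post, e.g.
#   "... kernel: md: disk5 read error, sector=1464509760"
MD_READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def tally_md_read_errors(lines):
    """Return {disk_number: [sector, ...]} for every md read error line."""
    errors = defaultdict(list)
    for line in lines:
        match = MD_READ_ERROR.search(line)
        if match:
            errors[int(match.group(1))].append(int(match.group(2)))
    return dict(errors)


# Two lines copied from the syslog excerpt in the post.
sample = [
    "Apr 12 20:18:19 Cooper shfs/user: shfs_readdir: readdir_r: /mnt/disk5/TV (13) Permission denied",
    "Apr 12 20:18:19 Cooper kernel: md: disk5 read error, sector=1464509760",
]

for disk, sectors in sorted(tally_md_read_errors(sample).items()):
    print(f"disk{disk}: {len(sectors)} read error(s), sectors {sectors}")
```

To run it on a whole log, pass an open file instead of the sample list, for example `tally_md_read_errors(open("syslog.txt"))`, where `syslog.txt` is a hypothetical saved copy of the syslog. Disks that show up here are the ones whose SMART reports are worth reading first.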
BRiT Posted April 13, 2015

What do the SMART reports from the drives' preclear cycles indicate?
Treytor Posted April 13, 2015

I've never done that preclear script on any of my drives. Shame on me, I know. I don't have any open ports to do it, though. Every drive supported by the case / mobo is used in the array (21 drives).
trurl Posted April 13, 2015

Better get a SMART report from every one of those 21 drives then.
morbidpete Posted April 13, 2015

Agreed, get full SMART reports. Also, just for good measure, with the 5TB parity drive and all the old drives in place, run another parity check with error correction to be on the safe side.
SSD Posted April 13, 2015

"Agreed, get full SMART reports. Also, just for good measure, with the 5TB parity drive and all the old drives in place, run another parity check with error correction to be on the safe side."

I would not do that. Let's see if we have any serious disk issues. A parity check with a failing drive can cause the drive to completely fail and make recovery more difficult / impossible.
trurl Posted April 13, 2015

A complete syslog might allow us to concentrate on a few drives. I would still recommend at least checking SMART reports for all drives though. Since you have posted in the v6 subforum, it is very easy to get a SMART report from the GUI. From Main, click on the drive, then Health - Disk attributes.

Known ATA S.M.A.R.T. attributes
morbidpete Posted April 13, 2015

"Agreed, get full SMART reports. Also, just for good measure, with the 5TB parity drive and all the old drives in place, run another parity check with error correction to be on the safe side."

"I would not do that. Let's see if we have any serious disk issues. A parity check with a failing drive can cause the drive to completely fail and make recovery more difficult / impossible."

Good call, my bad.
Treytor Posted April 13, 2015

"A complete syslog might allow us to concentrate on a few drives. I would still recommend at least checking SMART reports for all drives though. Since you have posted in the v6 subforum, it is very easy to get a SMART report from the GUI. From Main, click on the drive, then Health - Disk attributes. Known ATA S.M.A.R.T. attributes"

Crap... I blew it. I'm actually running 5.0.6. Sorry guys!

The rebuild is still going from my latest retry (I did one more reseat and a full system reboot) and it's around 70% now, which is farther than it's ever gotten before. So it may finish this time.

How would I go about getting a SMART report on all the drives in 5.0.6? Also, could a mod move this to the v5 subforum? Thanks guys!
Treytor Posted April 14, 2015

So the rebuild finished successfully this time around. Not sure why. I should still probably get a SMART report for every drive though, eh?
sureguy Posted April 14, 2015

"So the rebuild finished successfully this time around. Not sure why. I should still probably get a SMART report for every drive though, eh?"

Yes.
Treytor Posted April 15, 2015

Is there an easy way to do that for all 21 drives? Reminder: I'm actually on 5.0.6. Thanks again!
trurl Posted April 15, 2015

"Is there an easy way to do that for all 21 drives? Reminder: I'm actually on 5.0.6. Thanks again!"

unMenu
Treytor Posted April 15, 2015

Here's a screenshot of the SMART status window in unMenu: http://i.imgur.com/KM1l0LJ.jpg

Is this what we need? Does anything stand out? Thanks!
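As a text-based alternative to reading a screenshot, the following is a rough sketch of how the attribute table printed by `smartctl -A` could be scanned for a few counters that are commonly watched for early disk trouble. The watch list, the function names, and the sample report are all assumptions made for illustration; nothing here comes from the thread's actual drives. It also assumes Python 3.7+ and `smartctl` are available, which may not be the case on a stock 5.0.6 install.

```python
import subprocess

# Attributes often checked first; this list is an assumption for illustration,
# not something specified anywhere in the thread.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def read_smart(device):
    """Run `smartctl -A` on one device (e.g. a /dev/sdX path) and return its text."""
    result = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return result.stdout


def nonzero_watched(report):
    """Return {attribute_name: raw_value} for watched attributes with a nonzero raw value."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) != 0:
                found[fields[1]] = int(raw)
    return found


# Made-up sample in the usual smartctl attribute layout, NOT from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

print(nonzero_watched(SAMPLE))
```

On a live system, something like `nonzero_watched(read_smart(dev))` could be called in a loop over the array's device paths (the exact paths are system-specific). An empty result for a drive only means none of the watched counters are nonzero; it is not a full health verdict.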