Koperfild Posted October 15, 2017 Hi, my server has been running for over 8 years without any big issues. Now the disks seem to be wearing out. The SMART reports don't look good on four of the six drives. Please have a look. The problem arose because sync errors started to show up. They appear every time I run a parity check. The last check completed with 100 errors, an earlier one with 26, and so on. I ran reiserfsck on the filesystems. Only one drive had a problem, and it was fixed using the --fix-fixable option. I know I have to replace the faulty drives (those with a high reallocated sector count or a nonzero Reported Uncorrect count). I just have one question: how can I be sure that all of my files are OK if there are constant sync errors? Is it even possible?
JorgeB Posted October 15, 2017 1 hour ago, Koperfild said: "How can I be sure that all of my files are OK if there are constant sync errors? Is it even possible?" Only if you have checksums for your files or use btrfs. Do you have notifications enabled? You should have noticed those issues before they affected so many disks.
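As a rough illustration of the checksum approach mentioned above, here is a minimal Python sketch. It assumes you record a manifest of SHA-256 digests while the files are known to be good and compare against it later. The function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical and not part of Unraid or any plugin.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def changed_files(root, manifest):
    """Return sorted relative paths that are missing or whose digest differs."""
    current = build_manifest(root)
    return sorted(p for p, d in manifest.items() if current.get(p) != d)
```

A manifest only helps if it was built before the corruption happened. Without one, comparing against a known-good backup is the remaining option.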
Koperfild Posted October 15, 2017 45 minutes ago, johnnie.black said: "Do you have notifications enabled? You should have noticed those issues before they affected so many disks." Unfortunately not. Do you think that all disks except the second one should be replaced? How high does the "Reported Uncorrect" count have to be before it is bad? So it seems I can't make sure the files are OK. Some of them may be bit-rotted? I think I will replace the faulty drives and create a new setup using btrfs...
JorgeB Posted October 15, 2017 The most critical are the ones with current pending sectors. You should run an extended SMART test on all disks. Sometimes pending sectors are false positives, but if two or more disks fail the test there will be some data loss during the replacement.
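The attributes discussed above can be read with smartmontools. The sketch below is one possible way to script that in Python, assuming `smartctl` is installed and the script runs as root. It relies on the usual ten-column layout of `smartctl -A` output, and the helper names are hypothetical. Attribute names are taken from the SMART IDs (5, 187, 197) rather than from the drive's own labels, which vary by vendor.

```python
import subprocess

# Attributes discussed in this thread, keyed by SMART ID.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    187: "Reported_Uncorrect",
    197: "Current_Pending_Sector",
}


def parse_watched(attribute_table):
    """Extract raw values of the watched attributes from `smartctl -A` text."""
    found = {}
    for line in attribute_table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and int(fields[0]) in WATCHED:
            raw = fields[9]
            if raw.isdigit():
                found[WATCHED[int(fields[0])]] = int(raw)
    return found


def read_watched(device):
    """Run `smartctl -A` on a device and parse the watched attributes."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_watched(result.stdout)


def start_extended_test(device):
    """Queue the drive's extended (long) self-test and return smartctl's exit status."""
    return subprocess.run(["smartctl", "-t", "long", device], check=False).returncode
```

The extended test runs inside the drive and can take hours on large disks. Its result appears in the self-test log shown by `smartctl -a` once it finishes.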
Koperfild Posted November 2, 2017 So I replaced the disks, verified the data against my backups, and ran into more problems. Could you help me, johnnie? I can't figure out what went wrong. By the way, can I buy you a coffee somehow? I bought two brand new Seagate 2TB Pipeline HDDs and also reused a 1TB drive from my previous server because it was fine (according to SMART). I created a new config and everything was fine for two days. Then I noticed a Reported Uncorrect count on one of the new 2TB disks. I quickly bought another one and replaced the drive (and also fitted a new SATA cable). The new one started to produce the same errors very quickly. Right after the data rebuild I ran a parity check and there were errors. Both the Reported Uncorrect and Reallocated Sectors counts rose after each parity check. Unraid reported errors in the web GUI (over 1400 errors). Today I tried to access the server and all the data from the drive was missing. I downloaded diagnostics and rebooted the server. The drive was shown as unmountable! I tried to mount it a few times, and when that didn't work I just disabled it. The problem is that even with the disk disabled and its contents emulated, there is no data at all. It's not a tragedy yet, because I have fresh backups, but I don't know what I did wrong. I don't want to repeat these problems, so could you please look at my syslogs? The file named -1 is from before the reboot and -2 is from after it. syslog-1.txt syslog-2.txt
JorgeB Posted November 2, 2017 15 minutes ago, Koperfild said: "Both the Reported Uncorrect and Reallocated Sectors counts rose after each parity check." These are disk errors, so you'll need to replace the disks. It's also a good idea to test any new disk with preclear or another utility before using it in the array. As for the disk being unmountable, you need to run xfs_repair on the emulated disk, or on the new disk after it's rebuilt. https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_XFS
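For anyone scripting the filesystem check, a minimal Python sketch follows. It assumes the filesystem is unmounted (on Unraid this typically means starting the array in maintenance mode, as described on the linked wiki page) and that you pass the correct device for the affected disk. The helper names are hypothetical, and the device path is left to the caller because it depends on the disk slot and the Unraid version. The `-n` flag makes xfs_repair report problems without writing anything, which is a safer first pass.

```python
import subprocess


def xfs_repair_command(device, modify=False):
    """Build an xfs_repair command line; -n checks without writing anything."""
    command = ["xfs_repair", "-v"]
    if not modify:
        command.append("-n")
    command.append(device)
    return command


def run_xfs_repair(device, modify=False):
    """Run the command and return its exit status (needs root)."""
    return subprocess.run(xfs_repair_command(device, modify), check=False).returncode
```

Running with `modify=False` first and reading the report before allowing changes keeps the emulated disk untouched until you have seen what xfs_repair intends to fix.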
This topic is now archived and is closed to further replies.