Hi,
New user here, so fair warning: what is obvious to you may not be obvious to me.
I discovered unRAID a little over a week ago. After some reading on the forums, the wiki and elsewhere I knew unRAID was right for me, so I decided to take the plunge and build my first unRAID server with hardware from an old gaming PC I still had. I keep an up-to-date overview of the specs, config and so on in Google Docs, both for myself and for support purposes. This is the link.
First off I started with a simultaneous preclear_disk run on 4 new disks (logs attached). It lasted 32 hours without any errors. In the smartctl report I did notice a high and increasing "Hardware_ECC_Recovered" count on all drives, but after some research this seems normal for these Samsung HD154UI drives (correct me if I'm wrong). During preclearing the disk temperatures didn't exceed 30 degrees and the disks ran at 60 MB/s on average, which seemed quite low (again, correct me if I'm wrong), but I planned to investigate that later. The result for all 4 drives was about the same:
[color=blue]Preclear Successful
... Total time 33:08:00
... Pre-Read time 5:12:22 (80 MB/s)
... Zeroing time 8:11:43 (50 MB/s)
... Post-Read time 19:42:51 (21 MB/s)[/color]
Aren't these MB/s values rather low? Anyway, I created the array by adding 2 of the disks as data drives, without a parity drive, and copied everything over from my old server. After that completed I added the parity drive and started the parity sync. Still no errors at this point.
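A quick sanity check on the preclear figures above, as a minimal Python sketch. It only uses the three durations and speeds from the report; the helper names are made up for illustration, and it assumes the reported MB/s means 10^6 bytes per second.

```python
# Sanity check of the preclear report: how much data does each phase imply,
# and what is the overall average speed across the three phases?
# Assumption: "MB/s" in the report means 10^6 bytes per second.

def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' duration to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def implied_terabytes(hms: str, mb_per_s: float) -> float:
    """Data volume implied by a duration and an average speed, in TB (10^12 bytes)."""
    return to_seconds(hms) * mb_per_s / 1_000_000

# (phase, duration, reported MB/s) copied from the preclear report
phases = [
    ("Pre-Read", "5:12:22", 80),
    ("Zeroing", "8:11:43", 50),
    ("Post-Read", "19:42:51", 21),
]

total_seconds = sum(to_seconds(duration) for _, duration, _ in phases)
total_tb = sum(implied_terabytes(duration, rate) for _, duration, rate in phases)
overall_mb_per_s = total_tb * 1_000_000 / total_seconds

for name, duration, rate in phases:
    print(f"{name:<10} {implied_terabytes(duration, rate):.2f} TB at {rate} MB/s")
print(f"Overall    {overall_mb_per_s:.1f} MB/s over {total_seconds / 3600:.1f} hours")
```

Each phase works out to roughly 1.5 TB, i.e. one full pass over a disk of that size, and the overall average lands around 37 MB/s, dominated by the slow post-read.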
The parity sync went fine for a while, starting at 110 MB/s and dropping a little to around 90 MB/s, when suddenly the horror started. The speed fell to only a few hundred KB/s and sync errors started occurring and counting up rapidly. The syslog (attached) started showing loads of errors. In unMENU MyMain I checked the syslog entries per disk. The parity disk seems normal. Disk 1 and disk 2, however, both report many errors. Disk 4 is not in the array, since I use the Basic version, and is spun down. When I look at the smartctl reports per disk, however, I only see trouble with disk 1 (all smartctl reports attached).
This raises several questions which I hope somebody can help me with:
- What is the obvious thing I should do?
- Is it as simple as disk1 being bad? The syslog entries for disk2 also show many errors. Or is that a side effect of the array?
- All drives are new. Everything went fine, even preclearing for 32 hours, and now there are suddenly errors. Is it the disk, PSU, memory, etc.? Where should I look?
- Is the data on disk1 bad, and should I replace it with the warm spare (disk4) and rebuild, or just copy all data over again from the old server?
Please advise, and thank you in advance.
UPDATE 1:
The parity error count is now 8222 and is no longer increasing, though the syslog is still spitting out errors. I also attached a text file with the parity sync progress at the 3 times I checked.
UPDATE 2:
It eventually finished and now displays parity as valid (is this true?). unMENU says "Parity updated 8222 times to address sync errors". The count didn't increase any further. Can anybody advise me on what to do next and what to check?
UPDATE 3:
I started smartctl --test=long /dev/sd* on all disks. On disk1 the test aborts after 20 seconds or so (I tried several times); see the report below. The other disks are still running the long test.
SMART Self-test log structure revision number 1
[color=blue]Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 86 1526120678
# 2 Extended offline Completed: read failure 90% 86 1526120671
# 3 Short offline Completed: read failure 20% 85 1526120671
# 4 Extended offline Aborted by host 90% 85 -[/color]
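To compare the failing LBAs in the self-test log above, here is a minimal Python sketch that pulls them out and checks how far apart they are. The regular expression is only a guess at the column layout based on the four lines pasted here, and the 512-byte sector size is an assumption that the report does not state.

```python
import re

# Self-test log lines copied from the report above.
LOG = """\
# 1 Extended offline Completed: read failure 90% 86 1526120678
# 2 Extended offline Completed: read failure 90% 86 1526120671
# 3 Short offline Completed: read failure 20% 85 1526120671
# 4 Extended offline Aborted by host 90% 85 -
"""

# Assumed layout: "# N <test> <status> <remaining>% <hours> <LBA or ->"
LINE = re.compile(
    r"#\s*(?P<num>\d+)\s+"
    r"(?P<test>(?:Extended|Short) offline)\s+"
    r"(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)\s*$"
)

SECTOR_BYTES = 512  # assumption: logical sector size, not stated in the report

def parse(log: str) -> list:
    """Return one dict per recognised self-test log line."""
    entries = []
    for line in log.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip headers and anything that does not fit the layout
        lba = None if match["lba"] == "-" else int(match["lba"])
        entries.append({
            "num": int(match["num"]),
            "test": match["test"],
            "status": match["status"],
            "hours": int(match["hours"]),
            "lba": lba,
        })
    return entries

entries = parse(LOG)
failed_lbas = sorted({e["lba"] for e in entries if e["lba"] is not None})
span = failed_lbas[-1] - failed_lbas[0]

for lba in failed_lbas:
    print(f"LBA {lba}: about {lba * SECTOR_BYTES / 1e9:.1f} GB into the disk")
print(f"Distance between first and last failing LBA: {span} sectors")
```

With these four lines, the three failed tests stop at just two LBAs that are 7 sectors apart, around 781 GB into the disk if 512-byte sectors apply. That suggests one small bad region on disk1 rather than failures spread across the whole disk.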
preclear_reports.zip
syslog-2011-09-05.zip
smartctl_2011-09-05.zip
parity_rebuilding_progress.txt