November 22, 2015
I just moved all my disks and USB key to a new server the other day. Ever since then, device sdg1, which is the second disk in my cache pool, has been having numerous read errors, and after a while the pool is forced into read-only mode. Stopping and restarting the array has solved the problem each time, but I'm wondering what I should do to troubleshoot and prevent this behavior. Should I shut down the server and move the drive to a different slot? Should I remove it from the pool and run a preclear or some other sort of test on this drive? The drive is relatively new, so I would expect it to be under warranty if there truly is a problem, but I'm not sure how to proceed. I tried to attach the syslog, but it is too large to be accepted by the forum (500K). I have attached a shortened file, which should be enough to get the idea. syslog.zip
November 22, 2015 (Community Expert)
Post the complete diagnostics or at least the SMART info of the problem disk.
November 22, 2015 (Author)
Is this attachment what you are talking about? nalbonefs1-smart-20151122-1429.zip
November 22, 2015 (Community Expert)
I don't see any obvious issues with the disk. Since the problem started after moving to a new server, the first thing I would try is replacing the power and SATA cables on that SSD.
November 22, 2015 (Author)
Well, it's on a SAS backplane, but I will reseat it in a different slot.
November 22, 2015 (Community Expert)
You should also run a scrub from the cache webpage to check the file system, either before or after changing the backplane.
November 22, 2015 (Community Expert)
Yes, but performance will suffer while the scrub is in progress.
November 22, 2015 (Author)
Thanks. I will kick off the scrub soon and probably reseat the drive in the morning.
November 22, 2015 (Author)
Scrub finished and this is the result:
    scrub status for c6da8ae9-6b2d-491f-be0e-2a21ebcd7e82
    scrub started at Sun Nov 22 16:10:18 2015 and finished after 00:03:32
    total bytes scrubbed: 199.90GiB with 27530 errors
    error details: verify=46 csum=27484
    corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
Should I have run it without -r? Should I run it again and allow repair, or wait until I reseat the drive?
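For anyone wanting to check a scrub summary like the one above from a script, here is a minimal Python sketch. It only parses the text exactly as posted; the function names `parse_scrub_status` and `found_but_not_repaired` are made up for illustration, and the regular expressions assume the summary keeps this layout, which may differ between btrfs-progs versions. Note that `-r` runs the scrub read-only, so it reports errors without attempting repairs, which fits the "corrected errors: 0" line.

```python
import re

# Summary text copied from the scrub result posted above.
SCRUB_OUTPUT = """\
scrub status for c6da8ae9-6b2d-491f-be0e-2a21ebcd7e82
    scrub started at Sun Nov 22 16:10:18 2015 and finished after 00:03:32
    total bytes scrubbed: 199.90GiB with 27530 errors
    error details: verify=46 csum=27484
    corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
"""


def parse_scrub_status(text):
    """Extract the error counters from a scrub summary of this shape."""
    counters = {}

    # "... with 27530 errors" carries the overall error count.
    total = re.search(r"with (\d+) errors", text)
    if total:
        counters["total"] = int(total.group(1))

    # "error details:" is followed by key=value pairs on the same line.
    details = re.search(r"error details:(.*)", text)
    if details:
        for key, value in re.findall(r"(\w+)=(\d+)", details.group(1)):
            counters[key] = int(value)

    # The last line lists "<kind> errors: <n>" for three kinds.
    pattern = r"\b(corrected|uncorrectable|unverified) errors: (\d+)"
    for kind, value in re.findall(pattern, text):
        counters[kind] = int(value)

    return counters


def found_but_not_repaired(counters):
    """True when errors were reported and none were corrected."""
    return counters.get("total", 0) > 0 and counters.get("corrected", 0) == 0


if __name__ == "__main__":
    result = parse_scrub_status(SCRUB_OUTPUT)
    print(result)
    print("errors found but not repaired:", found_but_not_repaired(result))
```

In this result the verify and csum counts (46 + 27484) add up to the 27530 total, so every reported error is accounted for by those two categories.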
November 22, 2015 (Community Expert)
There's little documentation on repairing a btrfs filesystem. You can try to repair it: first back up your cache data, then see here.
Alternatively, I don't know if there's a way to tell whether the errors are all on cache2. If they are, and you are using the cache pool with 2 SSDs and the default mirror option, it's probably simpler to just rebuild that disk:
- back up cache data
- stop the array and unassign cache2 (note: if you start the array with one cache disk unassigned but still connected to the server, the cache will appear as unmountable; that's normal)
- put cache2 on a different backplane just in case
- run preclear on that disk
- reassign the disk to cache2 and start the array; unRAID will rebuild the cache mirror
- rerun scrub on cache
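On the question of telling which pool member the errors belong to: btrfs keeps per-device error counters, which `btrfs device stats <mountpoint>` prints one per line. The Python sketch below sums those counters per device. Everything in the sample is an assumption for illustration: the thread never shows this output, `/dev/sdf1` is a hypothetical name for the first cache disk (only sdg1 is named in the thread), and the numbers are invented. The function names are likewise made up.

```python
import re
from collections import defaultdict

# HYPOTHETICAL sample in the layout `btrfs device stats` prints.
# The device name /dev/sdf1 and all counter values are invented.
SAMPLE_STATS = """\
[/dev/sdf1].write_io_errs   0
[/dev/sdf1].read_io_errs    0
[/dev/sdf1].flush_io_errs   0
[/dev/sdf1].corruption_errs 0
[/dev/sdf1].generation_errs 0
[/dev/sdg1].write_io_errs   0
[/dev/sdg1].read_io_errs    12
[/dev/sdg1].flush_io_errs   0
[/dev/sdg1].corruption_errs 340
[/dev/sdg1].generation_errs 0
"""

# One line looks like: [<device>].<counter_name> <value>
STATS_LINE = re.compile(
    r"^\[(?P<device>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)


def errors_by_device(text):
    """Sum all error counters per device; unparsable lines are skipped."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = STATS_LINE.match(line.strip())
        if match:
            totals[match.group("device")] += int(match.group("value"))
    return dict(totals)


def devices_with_errors(text):
    """Return the sorted list of devices whose counters are not all zero."""
    return sorted(dev for dev, n in errors_by_device(text).items() if n > 0)


if __name__ == "__main__":
    print(errors_by_device(SAMPLE_STATS))
    print("devices with errors:", devices_with_errors(SAMPLE_STATS))
```

If only one device shows non-zero counters, that would support rebuilding just that disk; scrubbing the remaining disk on its own, as suggested in the next post, is still the more direct check.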
November 22, 2015 (Community Expert)
Thinking more about the 2nd option, it will only be worth doing if cache1 is error free, so in case you opt for that, I would do this:
- back up cache data
- power off the server and disconnect cache2
- power on, start the array, and run scrub on cache; if there are no errors, proceed with rebuilding the mirror
November 22, 2015 (Author)
Thanks. I will try this method.
November 23, 2015 (Author)
Disconnected cache2 this morning and ran scrub on cache: 0 errors. Moved cache2 to a new slot and precleared it; it passed with flying colors. Re-added cache2 to the pool and let it balance. Scrubbed the cache pool again with 0 errors. Looks like this is solved. Thank you for your help, Johnnie.black.