tazman Posted October 3, 2018

(I have updated this first post to reflect the solution I have found.)

My syslog shows read errors like these:

Sep 3 08:17:01 SS kernel: print_req_error: critical medium error, dev sdr, sector 3989616
Sep 3 08:17:01 SS kernel: md: disk17 read error, sector=3989552

A parity check also confirms read errors on the same drive. I wanted to find out which files are affected and used the following approach:

1. Start maintenance mode.
2. Mount the drive partition, e.g. mount /dev/sdr1 /mnt/test. Get the drive's device name from the unRAID Main GUI; appending 1 to it selects the first partition.
3. Check the block size: xfs_info /mnt/test - look for data = bsize=[block size]. In my case, on a 4TB drive, it was 4096.
4. Check the start sector of the partition with fdisk -lu /dev/sdr. In my case 64.
5. Calculate the block number of the sector as: (int)(([sector] - [start sector]) * 512 / [block size]). My bad sector 3989616 is in block 498694.
6. Unmount the partition so that xfs_db can run, e.g. umount /mnt/test.
7. Run xfs_db -r /dev/sdr1 (-r is for read-only).
8. On the xfs_db command line, get the information for the block with blockget -n -b [block number], e.g. blockget -n -b 498694. This runs for a while as it reads the entire disk; at the beginning it outputs the inode number for the block. In my case it was 35676. The larger the drive, the more memory blockget needs: with 4GB of RAM on a 4TB disk I got an out-of-memory error (xfs_db: ___initbuf can't memalign 32768 bytes: Cannot allocate memory); upgrading to 16GB allowed the command to run.
9. Get the file name for the inode with: ncheck -i [inode]
10. Enter quit or press Ctrl-D to exit xfs_db.

I have not figured out:
- How to make blockget run faster or use less memory. Maybe there is an alternative way to determine the inode of a block.
- How to check additional blocks without exiting xfs_db and running blockget again. I tried convert but couldn't get it to work.
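The sector-to-block calculation above can be sketched as a small shell script. The device name, sector, start sector, and block size are the values from my system; treat them as placeholders for yours.

```shell
#!/bin/sh
# Sketch of steps 3-8 above, using the values from my logs.
# Adjust DEV, SECTOR, START, and BSIZE for your own system.
DEV=/dev/sdr1
SECTOR=3989616   # bad sector from the syslog (print_req_error line)
START=64         # partition start sector, from: fdisk -lu /dev/sdr
BSIZE=4096       # filesystem block size, from: xfs_info (data bsize=)

# Sectors are 512 bytes; integer division gives the filesystem block.
BLOCK=$(( (SECTOR - START) * 512 / BSIZE ))
echo "block: $BLOCK"   # prints "block: 498694" for the values above

# Then, with the partition unmounted, run interactively:
#   xfs_db -r $DEV
#   xfs_db> blockget -n -b $BLOCK
#   xfs_db> ncheck -i <inode printed by blockget>
#   xfs_db> quit
```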
Maybe someone from the community has an idea about those. Unfortunately, xfs_db is not well documented on the web beyond the man pages, e.g. https://linux.die.net/man/8/xfs_db. Kind regards, Tazman
JorgeB Posted October 3, 2018

Why not just do a standard disk replacement? As long as your parity is valid, all data will be correctly rebuilt. If you still want to, you can then use the old disk unassigned, or in another PC, and copy all the files: any file that you can't copy due to an I/O error was on the affected sectors.
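This suggestion can be sketched in shell. The mount point /mnt/test and the destination are assumptions for illustration; the idea is simply that cp reports an I/O error for any file whose data sits on a bad sector, so capturing stderr gives a list of affected files.

```shell
#!/bin/sh
# Copy everything off the old disk (assumed mounted at /mnt/test)
# and collect the failures. Paths here are placeholders.
mkdir -p /mnt/copytest
# Any file hitting a bad sector fails with an I/O error; the error
# messages (with file names) land in bad_files.log.
cp -a /mnt/test/. /mnt/copytest/ 2> bad_files.log
cat bad_files.log
```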
tazman Posted October 4, 2018 (Author)

Sure, that is a possibility; it just takes longer. But I am still wondering how to use the sector number reported in the log to identify the affected file(s).
tazman Posted October 7, 2018 (Author)

I found a solution and have updated the first post accordingly.