steve1977 Posted July 30, 2016

This is a new and different issue. I am using RC2 and one disk in the array can no longer be accessed. It still shows "green", "error 0", and indicates free space in the GUI. However, I cannot access it over the network, and when I try to view the folder structure in the GUI, it shows no folders. I imagine that rebooting may fix things, at least temporarily, but I wanted to get advice on this forum first, to make sure I don't break anything by restarting. The diagnostic file is attached. Your help is much appreciated!

tower-diagnostics-20160730-1324.zip
JorgeB Posted July 30, 2016

You need to use xfs_repair on disk1. Before that, note there are a few errors on disk11; you may want to check/replace the cables and keep monitoring the log.

Jul 25 21:14:33 Tower kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jul 25 21:14:33 Tower kernel: ata2.00: failed command: READ DMA EXT
Jul 25 21:14:33 Tower kernel: ata2.00: cmd 25/00:08:60:5b:f9/00:00:71:01:00/e0 tag 6 dma 4096 in
Jul 25 21:14:33 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul 25 21:14:33 Tower kernel: ata2.00: status: { DRDY }
Jul 25 21:14:33 Tower kernel: ata2: hard resetting link
Jul 25 21:14:33 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 25 23:41:42 Tower kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jul 25 23:41:42 Tower kernel: ata2.00: failed command: STANDBY IMMEDIATE
Jul 25 23:41:42 Tower kernel: ata2.00: cmd e0/00:00:00:00:00/00:00:00:00:00/40 tag 5
Jul 25 23:41:42 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul 25 23:41:42 Tower kernel: ata2.00: status: { DRDY }
Jul 25 23:41:42 Tower kernel: ata2: hard resetting link
Jul 25 23:41:42 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 26 11:12:44 Tower kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jul 26 11:12:44 Tower kernel: ata2.00: failed command: STANDBY IMMEDIATE
Jul 26 11:12:44 Tower kernel: ata2.00: cmd e0/00:00:00:00:00/00:00:00:00:00/40 tag 29
Jul 26 11:12:44 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul 26 11:12:44 Tower kernel: ata2.00: status: { DRDY }
Jul 26 11:12:44 Tower kernel: ata2: hard resetting link
Jul 26 11:12:44 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
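[Editor's note] For readers landing on this thread later: on Unraid, xfs_repair is run against the md device with the array started in maintenance mode, so parity stays in sync. A minimal sketch, assuming disk1 corresponds to /dev/md1 on this system:

```shell
# Start the array in maintenance mode from the GUI first,
# so /dev/md1 exists but no filesystem is mounted.

# Dry run: -n reports problems without writing anything.
xfs_repair -n /dev/md1

# If the dry run looks sane, run the actual repair (verbose).
xfs_repair -v /dev/md1
```

Running against /dev/mdX rather than the raw /dev/sdX device is what keeps the parity disk consistent with the repair.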
steve1977 Posted July 30, 2016

Thanks for your reply. Let me run xfs_repair and report back. As for the cable-related errors, I have been having those for a while and changed the cables just recently (again). I stumbled across a thread about "autoparking" on Green drives (https://lime-technology.com/forum/index.php?topic=21007.0). I am using almost exclusively Green drives. Could this have anything to do with the errors on disk11 and others?
steve1977 Posted July 30, 2016

Thanks. I ran xfs_repair but am getting the error message below. Any thoughts?

Phase 1 - find and verify superblock...
        - block cache size set to 1421280 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1615378 tail block 1615345
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
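[Editor's note] The message itself spells out the safe order of operations: mounting the filesystem replays the journal, and only if the mount fails should -L be used. A sketch of that sequence (the device and mount point are illustrative; on Unraid, starting the array in normal mode performs the mount for you):

```shell
# Mounting an XFS filesystem replays its log automatically.
mount /dev/md1 /mnt/disk1
umount /mnt/disk1

# With the log replayed and clean, re-run the repair.
xfs_repair -v /dev/md1

# ONLY if the mount itself fails: zero the log and repair.
# This discards the most recent metadata changes and can
# cause corruption, hence the warning in the output above.
# xfs_repair -L /dev/md1
```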
JorgeB Posted July 30, 2016

Try starting and stopping the array in normal mode, then start in maintenance mode and run xfs_repair again. If you still get the same error, using -L is the only option.
steve1977 Posted July 30, 2016

The drive now shows as "unassigned", so it can no longer be mounted. Shall I try -L or shall I recover onto a new disk?
steve1977 Posted July 30, 2016

This means running:

xfs_repair -v -L /dev/md1
steve1977 Posted July 30, 2016

I was not able to restart in maintenance mode. See the new diagnostics below. Any advice?

tower-diagnostics-20160730-1631.zip
JorgeB Posted July 30, 2016

"The drive now shows as "unassigned", so it can no longer be mounted. Shall I try -L or shall I recover onto a new disk?"

If it shows as unassigned it may have dropped offline; reboot and try again.
steve1977 Posted July 30, 2016

As you may see from the log, disk11 has now also dropped, so I have two unassigned disks. Any thoughts?
steve1977 Posted July 30, 2016

Thanks. Disk11 came back after the reboot and I was able to go into maintenance mode again. I ran xfs_repair with -L and got some errors. I then stopped the array (in maintenance mode). Disk1 still shows as "unassigned". Reboot again?
steve1977 Posted July 30, 2016

I went ahead and rebooted. The disk shows up again. Now I am facing a critical question: shall I start the array (and probably lose something because of -L) or shall I rebuild from parity (if that is even possible)?

tower-diagnostics-20160730-2111.zip
trurl Posted July 30, 2016

Rebuilds typically will not fix filesystem issues. You need to figure out why the disks are dropping, or you are going to keep having problems even if you get your files back. Rebooting is not really a fix for anything. Have you checked all your connections? Power and SATA, at both ends? Controller seated well in its slot?
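[Editor's note] One low-effort way to rule the drives themselves in or out (assuming the smartmontools package is available on the server; the Unraid GUI also exposes SMART data per disk) is to check the SMART attributes of the disks that keep dropping:

```shell
# Overall drive health self-assessment (replace sdX with the
# device letter of the disk that keeps dropping).
smartctl -H /dev/sdX

# Full attribute dump; watch Reallocated_Sector_Ct,
# Current_Pending_Sector and UDMA_CRC_Error_Count.
smartctl -a /dev/sdX
```

A rising UDMA_CRC_Error_Count usually points at the cable or backplane rather than the disk, which would fit the suspicion voiced later in this thread.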
JorgeB Posted July 30, 2016

If the disk goes offline while running xfs_repair, there is a hardware issue, be it disk, cable, or controller. After checking the cabling, and if you have a spare, you can do a rebuild; you will still need to run xfs_repair after the rebuild completes.
steve1977 Posted July 30, 2016

Thanks for your messages. Actually, I got it wrong: the disk did not go offline during xfs_repair. Things are working again now. Thanks for your help!

Having said this, there remains some form of issue, which (probably) has nothing to do with the xfs issue on disk1. The disk11 issue has been a constant problem since I started running Unraid: Unraid disables my disks from time to time. I have really tried everything you can imagine. I bought a new M1015 controller card, exchanged the PSU for one of the most expensive ones, and had a professional IT person redo the cabling (twice). The issue remains, though. From all I can tell, it only affects disks connected to the M1015, not those connected to the Mobo directly, but I really doubt the M1015 itself is the issue. My suspicion is that it may still be cable-related, as the IT guys could have picked cheap cables. I don't know how to do this better, or whether anyone has recommendations for good cables. A new idea from today is that it may be related to "autoparking", given I am mostly using Green drives and I don't think one of my Seagate disks has ever caused an issue. But I saw your (johnnie.black) reply that this is rather unlikely.

Links to the threads with more details and context about my issues:
https://lime-technology.com/forum/index.php?topic=49815.0
https://lime-technology.com/forum/index.php?topic=21007.0