bamhm182 Posted May 27, 2017

I have had an issue twice in the past day where everything is working fine, then I notice a problem with a docker or VM. When I look at the Main tab, my cache drive shows nothing but a * under temperature. I shut down the array, which takes forever, and when I try to restart it the cache drive says unmountable. Additionally, Fix Common Problems reported "Call Traces found on your server" the last time this happened, but it didn't say that this time.

If I start the array in maintenance mode and tell the cache drive to do a File System Check, it outputs around 25,000 lines that say the following:

Quote
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
bad magic number
bad on-disk superblock 3 - bad magic number
primary/secondary superblock 3 conflict - AG superblock geometry info conflicts with filesystem geometry
would zero unused portion of secondary superblock (AG #3)
would reset bad sb for ag 3
bad uncorrected agheader 3, skipping ag...
sb_icount 65344, counted 41728
sb_ifree 137, counted 3705
sb_fdblocks 118821857, counted 89311086
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
entry "PLEX MEDIA SERVER" in shortform directory 268435554 references non-existent inode 805306466
would have junked entry "PLEX MEDIA SERVER" in directory inode 268435554
entry "PLEX DLNA SERVER" in shortform directory 268435554 references non-existent inode 805396887
        - agno = 3
would have junked entry "PLEX DLNA SERVER" in directory inode 268435554
entry "appdata" in shortform directory 99 references non-existent inode 805499114
would have junked entry "appdata" in directory inode 99
...Similar until line 17,084....
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 100, would move to lost+found
disconnected dir inode 107, would move to lost+found
disconnected dir inode 132749, would move to lost+found
...Similar until line 21,174...
Phase 7 - verify link counts...
would have reset inode 99 nlinks from 15 to 12
would have reset inode 100 nlinks from 10 to 9
would have reset inode 132858 nlinks from 11 to 8
...Similar until line 24,585...
No modify flag set, skipping filesystem flush and exiting.

At this point, if I use -v instead of -n in the File System Check, it tells me that it can't proceed. I should have written down the message, but it basically tells me that I should try remounting the drive, or that I can force it to fix the problems with -L. If I try remounting, it still doesn't mount, so I run -L and it fixes the problem.

Since this started happening, I've taken the long-overdue steps to configure CA AutoBackup and a VM Backup solution, but I would still like to see if anyone here can help me diagnose the root cause in case it comes back again... I have also attached the zips generated with the diagnostics command. Thank you for your time!
Specs of my R710 in case they're needed:
CPU: 2x 6-Core Xeon Processors
RAM: 72 GB
HDD: 2x 3TB WD Reds (Both Data, no Parity)
SSD: 1x 512GB Intel 600p attached via Ablecon M.2 PCI-e Card (Cache)

r710-diagnostics-20170527-1302.zip
r710-diagnostics-20170526-2259.zip
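With roughly 25,000 lines of check output, it can help to reduce the log to a count per kind of problem before deciding how to proceed. The following is a minimal sketch, assuming the output has been saved to a text file; the regular expressions are derived only from the messages quoted in the post above, and the category names (`bad_superblock`, `dangling_entry`, and so on) are invented for this example rather than taken from any XFS tool.

```python
import re
from collections import Counter

# Patterns based only on the messages quoted in the post above.
# The category names are hypothetical labels chosen for this sketch.
PATTERNS = {
    "bad_superblock": re.compile(r"bad on-disk superblock \d+"),
    "counter_mismatch": re.compile(r"^sb_(?:icount|ifree|fdblocks) \d+, counted \d+"),
    "dangling_entry": re.compile(
        r'^entry ".*" in shortform directory \d+ references non-existent inode \d+'
    ),
    "would_junk_entry": re.compile(r"^would have junked entry "),
    "disconnected_inode": re.compile(
        r"^disconnected (?:dir )?inode \d+, would move to lost\+found"
    ),
    "nlinks_reset": re.compile(r"^would have reset inode \d+ nlinks from \d+ to \d+"),
}


def summarize(lines):
    """Count how many lines fall into each category; unmatched lines are ignored."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[name] += 1
                break  # each line is counted at most once
    return counts
```

Passing an open file works because the function only iterates over lines, for example `summarize(open("check.txt"))`, where `check.txt` is a hypothetical name for the saved output. The counts only describe what the check reported; they do not explain why the corruption happened.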
JorgeB Posted May 27, 2017
No apparent reason in the logs for the corruption. Is this a new config or a new cache device? If not, back up your cache and re-format instead of repairing, then restore the data and see if it holds up.
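The back up, re-format, restore cycle suggested above is only useful if the backup itself is intact, so it is worth checking the copy before wiping the cache. Below is a minimal sketch using only the Python standard library; it is not the method JorgeB describes, just one way to copy a directory tree and compare checksums. The function names are invented for this example, and any paths passed in (for instance `/mnt/cache` as source and a folder on an array disk as destination) are assumptions about the setup.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(src, dst):
    """Return relative paths of regular files that are missing or differ in dst."""
    src, dst = Path(src), Path(dst)
    mismatches = []
    for src_file in src.rglob("*"):
        # Only regular files are compared; symlinks and directories are skipped.
        if src_file.is_symlink() or not src_file.is_file():
            continue
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if not dst_file.is_file() or file_digest(src_file) != file_digest(dst_file):
            mismatches.append(rel.as_posix())
    return sorted(mismatches)


def backup_and_verify(src, dst):
    """Copy src to dst (dst must not exist yet), then verify the copy."""
    shutil.copytree(src, dst, symlinks=True)
    return verify_copy(src, dst)
```

An empty list from `backup_and_verify` means every regular file in the source has a byte-identical copy. The same `verify_copy` call can be run again after restoring onto the re-formatted drive, with the backup as `src`, to see whether the data survived the round trip.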
bamhm182 (Author) Posted May 27, 2017
Thanks for the reply. I'll give it a reformat when I get a moment. Amazon says the package was delivered on 28APR2017, so it has been working fine for the past month or so. I put together unRAID with the two HDDs around 2 months ago, then added the cache drive 1 month ago, and now all of a sudden I'm having issues. The only thing I've changed is that over the past week or two, I've spun up two VMs: one to host websites on an Apache server, and another that has multicraft installed on it. Neither of these seems to me like it would be incredibly problematic, so I'm not sure what is going on. They both run strictly on the cache drive, as do the 8 or so docker containers. I have all the docker containers and VMs running 24/7.
bamhm182 (Author) Posted May 29, 2017
Reformatted today, and while I was putting my data back on the cache drive, it did it again. I just reformatted as XFS. General consensus says to avoid btrfs, but at this point I'm willing to try it...
JorgeB Posted May 29, 2017
You can try, but there may be an underlying issue, something hardware related. That's not normal.