wheel

Members
  • Content Count

    167
  • Joined

  • Last visited

Community Reputation

1 Neutral

About wheel

  • Rank
    Advanced Member

Converted

  • Gender
    Undisclosed


  1. So Disk13 completed the extended SMART self-test without error. Since I'm probably going to end up upgrading a handful of other disks during the course of this mess, my new concern is why Disk13 threw read errors during the Disk12 rebuild - and how to prevent that from happening again the next time I rebuild a disk. Any guidance on how best to trace that problem to its source and stop it from recurring would be greatly appreciated!
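     A rough starting point for the tracing (a sketch only - smartctl ships with unraid, but /dev/sdX below is a stand-in for whatever device letter Disk13 currently has, and the grep patterns are guesses at the relevant syslog wording):
         # pending/reallocated sectors point at the platters; a rising UDMA CRC count points at cabling or the backplane
         smartctl -a /dev/sdX | grep -Ei 'reallocated_sector|current_pending|offline_uncorrectable|udma_crc'
         # see what the md driver logged against disk13 during the rebuild window
         grep -i 'disk13' /var/log/syslog | grep -i 'error'
     If the CRC counter is the one moving, reseating or swapping the cable and hotswap bay for that slot before the next rebuild is the usual first step.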
  2. Extended test's at 50% now, so - holding off! Been spot-checking D12, and already found a few files that won't open properly. Going to be a hunt, but I've got time for it. Thanks a ton for your patience and advice in such a weird time for everyone, JB.
  3. Well, the short SMART test on D13 came back fine, but the extended test has been sitting at 10% for over 2 hours now, which feels weird on a 6TB. I'm going to let it keep rolling for a while, but I feel like this doesn't bode well for that 6TB having much life left in it. Am I better off replacing that 6TB (if the extended SMART test fails) before upgrading unraid to a newer version? If so, since I just ran a non-correcting parity check, is any of the (now-corrupted) D12 data repairable through the old parity I haven't "corrected" yet? Or should I run a correcting parity check before replacing that 6TB?
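     One way to tell whether the extended test is actually moving (a sketch - /dev/sdX again stands in for D13's device node): smartctl reports the running self-test as a percentage of the test remaining, so polling it every half hour or so shows whether it's progressing or stalled:
         # current self-test execution status, including percent remaining
         smartctl -c /dev/sdX
         # history of completed/aborted self-tests, with the LBA of the first error if one was hit
         smartctl -l selftest /dev/sdX
     A reading that sits on the same percentage for hours usually means the test is grinding on a bad region rather than just being slow.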
  4. Old disk 12 is still in the exact same shape, and I have an eSATA caddy on another tower I can hopefully easily use for the checksum compare on the two 12s over the network (about to do some reading on that). Also looking into the reiserfs thing - definitely news to me, and feeling like I should be better safe than sorry on all towers during this mess. (EDIT: File juggling is going to be tough until I can get some more drives in the mail. Hopefully their being reiserfs won’t screw me too hard during the crisis if external hard drives keep getting their shipment times pushed back as non-essential.) Any recommendations on how to confirm whether D13 needs replacing now with the unraid version still sitting at 6.3.5? Thanks again!
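     For the checksum compare itself, a minimal sketch (paths are assumptions - /mnt/old12 standing in for wherever the eSATA caddy mounts the old disk, /mnt/disk12 for the rebuilt disk on the unraid tower):
         # on the box with the old disk 12 in the caddy: hash every file
         cd /mnt/old12 && find . -type f -exec md5sum {} + > /tmp/disk12_old.md5
         # copy disk12_old.md5 to the unraid tower, then verify the rebuilt disk against it
         cd /mnt/disk12 && md5sum -c /tmp/disk12_old.md5 | grep -v ': OK$'
     Anything that prints is a file that's missing or doesn't match the old copy, which narrows the hunt for corrupted files considerably.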
  5. Damn. No special reason on the old version; I vaguely remember planning to upgrade around 6.6.6(?) but read about some weird stuff going on and decided to hold off for a future version. Time flew by in between then and now (unraid's mostly a set-and-forget thing for me). So I'm out of 6TBs, but I can upgrade one in another tower to an 8TB, which frees up a 6TB to replace 13's 6TB if needed. I'm guessing these are my next steps:
     (1) Confirm file integrity on D12 and D13
     (2) Identify whether disk 13 has a problem or if it's related to the hotswap cage or wires or whatever (NOT sure on this one)
     (3) Upgrade to the last stable Unraid release, OR
     (3) Replace D13 and then upgrade to the last stable Unraid release
     On the right track? Thanks for the swift help, JB!
  6. Some history on this tower: the LSI controllers have been in since December and doing fine. I've successfully upgraded at least 3 (maybe more) 4TB drives to 6TB in the time since, always running a parity check before and a non-correcting parity check after. This is the first time since the LSI cards came in that I've had a disk die on me (Disk 12: 2 sync errors on the GUI and the drive automatically disconnected). I was maybe 2 days max away from upgrading a random 4TB to a 6TB for space reasons anyway, so I went ahead and put the 6TB in and started the rebuild.
     I'd been steadily adding files over the past month and a half or so since my last upgrade (and last parity check), but weirdly not many to the disk that's now showing 166k errors (Disk 13). The first half or so of the parity check had zero errors. I checked it with about 3 hours left and saw the 166k errors, but let the check run to completion. No more errors popped up in the last 3 hours of the check, the disk with the sync errors (13) isn't disabled or marked in any negative way beyond the error count, and all files (including the ones added to that disk during the ~45 days without parity checks) still seem to open fine.
     With all these factors in play, any suggestions on next steps here? I've got a feeling hardware replacements are going to be a pain in this environment, but I'm swimming in free time if there are some time-intensive steps I can take to figure out what's going wrong and get things back to normal. Thanks in advance for any help or guidance!
     tower-diagnostics-20200326-0913.zip
  7. For sure; that's what I'm going to do with this and my other unraid boxes, but since it's never happened before and has now happened twice in relatively quick succession, I figured I should check in here in case that's a symptom of something else weird going on under the hood. I'm a ridiculously basic user with almost no linux experience, so I realize this could all be totally innocuous - it's just that I've run this specific setup (network, unraid, kodi, no changes) for years with no issues, and now two strange things are happening concurrently, which is kind of freaking me out. I'm really appreciating all the eyes on this and advice from everyone, though! This place is the best.
  8. Aaaaand my IP just dropped and renewed on that box again out of nowhere. Logging from that happening up to the most current log entry:
     Jan 8 17:14:39 Tower3 kernel: mdcmd (1048): spindown 4
     Jan 8 17:49:39 Tower3 kernel: mdcmd (1049): spindown 1
     Jan 8 18:00:38 Tower3 dhcpcd[1703]: eth0: NAK: from 10.0.0.1
     Jan 8 18:00:38 Tower3 avahi-daemon[3554]: Withdrawing address record for 10.0.0.9 on eth0.
     Jan 8 18:00:38 Tower3 avahi-daemon[3554]: Leaving mDNS multicast group on interface eth0.IPv4 with address 10.0.0.9.
     Jan 8 18:00:38 Tower3 avahi-daemon[3554]: Interface eth0.IPv4 no longer relevant for mDNS.
     Jan 8 18:00:38 Tower3 dhcpcd[1703]: eth0: deleting route to 10.0.0.0/24
     Jan 8 18:00:38 Tower3 dhcpcd[1703]: eth0: deleting default route via 10.0.0.1
     Jan 8 18:00:38 Tower3 dnsmasq[4383]: no servers found in /etc/resolv.conf, will retry
     Jan 8 18:00:38 Tower3 dhcpcd[1703]: eth0: soliciting a DHCP lease
     Jan 8 18:00:39 Tower3 dhcpcd[1703]: eth0: offered 10.0.0.17 from 10.0.0.1
     Jan 8 18:00:39 Tower3 dhcpcd[1703]: eth0: probing address 10.0.0.17/24
     Jan 8 18:00:43 Tower3 dhcpcd[1703]: eth0: leased 10.0.0.17 for 86400 seconds
     Jan 8 18:00:43 Tower3 dhcpcd[1703]: eth0: adding route to 10.0.0.0/24
     Jan 8 18:00:43 Tower3 dhcpcd[1703]: eth0: adding default route via 10.0.0.1
     Jan 8 18:00:43 Tower3 avahi-daemon[3554]: Joining mDNS multicast group on interface eth0.IPv4 with address 10.0.0.17.
     Jan 8 18:00:43 Tower3 avahi-daemon[3554]: New relevant interface eth0.IPv4 for mDNS.
     Jan 8 18:00:43 Tower3 avahi-daemon[3554]: Registering new address record for 10.0.0.17 on eth0.IPv4.
     Jan 8 18:00:43 Tower3 dnsmasq[4383]: reading /etc/resolv.conf
     Jan 8 18:00:43 Tower3 dnsmasq[4383]: using nameserver 10.0.0.1#53
     Jan 8 18:19:10 Tower3 in.telnetd[17347]: connect from 10.0.0.16 (10.0.0.16)
     Jan 8 18:19:11 Tower3 login[17348]: ROOT LOGIN on '/dev/pts/2' from '10.0.0.16'
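     If the goal is just to stop the address from moving again, the two usual fixes are a DHCP reservation on the router at 10.0.0.1 (tie the tower's MAC to 10.0.0.9) or a static assignment on the unraid side via Settings > Network Settings. A rough sketch of what the static version ends up looking like on the flash drive - the file path and variable names here are from memory of 6.x and may differ by version, and the addresses just mirror the log above:
         # /boot/config/network.cfg (written by the GUI; shown only for reference)
         USE_DHCP="no"
         IPADDR="10.0.0.9"
         NETMASK="255.255.255.0"
         GATEWAY="10.0.0.1"
         DNS_SERVER1="10.0.0.1"
     Setting it through the GUI rather than hand-editing the file is the safer route.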
  9. Yeah, Kodi buddies were mystified too and said it must have something to do with unraid. I'll probably just ignore it for now unless more weird things happen - no real "oh man someone's looking at my stuff" issues so much as "don't want my box to be part of some botnet" concerns.
  10. Screenshots attached; "1405986280" is the folder that's weirdly inaccessible / not visible on the unraid side (Kodi's listing everything that shows up in the root directory of the unraid tower, including shares and actual disks). Thanks for the swift response! And yeah, I knew about the DHCP setting, but I feel like I've never had an unraid tower "drop itself" and swap addresses with another device in the middle of a nighttime period of otherwise zero activity for a good while on either side - this and the weird folder got my paranoia tingling.
  11. Strangeness continues, with no new ideas on the Kodi front. Woke up this morning, and my tower's weirdly assigned to a different IP address. Feels like it's the first time it's ever happened in a decade of unraid usage. No new strange wireless or wired devices showing up on my network, but the old unraid address is now being held by a wireless device (iPad) which was turned on, connected to wifi, in the home, and completely untouched for hours before and hours after the swap. This looks like where it happened in the log; full diagnostics attached again. tower3-diagnostics-20200108-1542.zip
  12. Yeah, I've been meaning to work on circulation - weekend project for sure! I ran ls -ail /mnt/user, and the 10-digit folder doesn't show up. Everything else looks in order. If nothing else seems off from the unraid end of things, I'm going to check with Kodi forums to see if that software has a history of "creating" weird folders like this that only it can see. Thanks for the swift help!
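      A couple of extra places worth checking before pinning it on Kodi (a sketch, assuming a stock 6.x layout - /mnt/cache may not exist on this box):
          # does the folder physically exist at the top level of any data disk or the cache?
          find /mnt/disk*/ /mnt/cache/ -maxdepth 1 -name '1405986280' 2>/dev/null
          # does anything in the samba config or the flash config reference it as a share?
          grep -r '1405986280' /etc/samba /boot/config 2>/dev/null
      If both come back empty, the phantom entry is more likely something cached on the Kodi side than anything living on the unraid box.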
  13. Random, and possibly innocuous, but figured I'd check to see if this has ever happened to anyone before and been a warning sign: A strange folder is showing up among all my other user shares (named "1405986280") when I look at my unraid box's list of usual user shares in Kodi. It's not showing up in any of the individual drives viewed by Windows or Putty terminal. It's not showing up under /user or /user0. When I click on it in Kodi to try and access it, I'm warned that the share is not available. If I reset the Kodi system and go back to the folder list, the strange 10-digit folder is there and still inaccessible. Could someone have compromised my Unraid box and created some sort of folder like this for whatever purpose? If so, is there a good way to go about finding out when it happened if it didn't occur during my system's current log uptime? tower3-diagnostics-20200104-0231.zip
  14. Earlier this year, a disk inexplicably dropped from my array (but looked fine on SMART), and my Marvell AOC-SASLP-MV8 PCI Express x4 Low Profile controllers were identified as the culprit (see attached logs for where the machine was left the last time it was active - I have not turned it back on since they were captured). Those cards were totally fine in my 5.0 days, but I'm on 6.0+ now and forever with this box, so... I'm finally ready to make the LSI controller upgrade and get this box back up and running! I thought I'd ask three big questions in here before I put any money down, just to be safe:
      (1) From what I've read on IT flashing, I personally won't have an easy time of it (I'd need to take down and mess with a pretty crowded active unraid box regardless of which board I use to flash the controller), so unless the cost difference is double or more what I'd pay for a card I'd have to flash myself, I'm fine with paying a premium for a plug-and-play return to form. It sounds like a Dell H310 6Gbps SAS HBA (LSI 9211-8i, P20 IT mode, low profile) or an HP H220 6Gbps SAS PCI-E 3.0 HBA (LSI 9207-8i, P20) might be a good option for my setup, but I figured I'd check in here with my diagnostics to be safe.
      (2) Whether I'm getting one of those two cards or something completely different, is there a particular retailer I should steer toward (or country I should avoid) in hopes of getting my shipment within a week or so of ordering? Right now I'm leaning toward "The Art of Server" on eBay, who pre-flashes, whose prices seem right (especially for the Dell), and who seems fast on shipping.
      (3) Once the cards arrive, should I take any special steps in terms of re-attaching drive cables, or do anything in the unraid GUI before I remove the SASLPs and install the LSIs? Since the April problem disk is apparently physically fine, I'm hoping to get everything back to where it was in April by simply buying new cards and replacing the old ones, but I'm paranoid there's a step I'm missing despite all the threads I've gone through on these Marvell issues.
      Bonus question: Am I looking at this entirely wrong? Is there something new I could be doing with controller cards, instead of getting another pair of low-profile 8-lane SAS cards, that I'm completely overlooking? I think I'm on the right track, but again, asking to be safe; I don't upgrade often, so I may as well future-proof as much as possible (I don't really need more drive performance day to day, but faster parity checks would be nice).
      Thanks in advance to everyone for any guidance before I embark on my biggest unraid hardware upgrade since I started using the product 8 years ago!
      tower-diagnostics-20190418-2316.zip tower-diagnostics-20190418-2344.zip
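      On (3), a quick sanity-check sketch for after the swap (nothing unraid-specific assumed beyond a root shell; the driver name is a guess based on these being SAS2-generation cards):
          # is the HBA visible on the PCIe bus?
          lspci | grep -i -e lsi -e sas
          # did the mpt2sas/mpt3sas driver bring it up and find the drives?
          dmesg | grep -iE 'mpt[23]sas'
          # are all the expected drives (by serial) visible to the OS?
          ls -l /dev/disk/by-id/ | grep -v part
      Since unraid tracks array assignments by drive serial number rather than by controller port, the expectation is that the array comes back with the same assignments after the controller swap - but comparing the assignment screen against a pre-swap screenshot before hitting Start is cheap insurance.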
  15. Well, hell. Nothing like staying up late to complete a rebuild and starting the parity check immediately after and doing it half-cocked because I'm half-asleep. Hopefully the slew of errors during the correcting check didn't screw me on a rebuild, but if I lost a few things, c'est la data.