Auggie

Everything posted by Auggie

  1. I noticed this some time ago, but I wasn't sure exactly what was happening. About a month ago I saw 128 errors after a parity check completed, but only noticed them a week or so after the fact. All SMART reports were good. I ran my monthly parity check several days ago and exactly the same 128 errors occurred; again, all SMART reports showed zero anomalies on all drives. The syslog showed a spin-down an hour or so after the parity check started (FYI, my disk settings are set to spin down after an hour), and that's when the read I/O errors occurred (at that time involving only three drives). After performing an XFS repair on the three drives with zero issues, I ran the parity check again last night. This morning the syslog showed that, again an hour after the parity check was initiated, UnRAID spun down ALL drives, and soon thereafter I/O errors were reported on ALL drives. Two hours later, UnRAID again spun down ALL drives and the I/O errors resumed. This appears to be an UnRAID 6.7 bug wherein it incorrectly spins down drives that are physically I/O active. I don't believe it's hardware-related, as I've had this particular motherboard/chassis combination (SuperMicro X11SPH/Chenbro 48-bay RM43348) running for over six months now and have never experienced anything like this before. Attached is my syslog (FYI, the parity check is still in progress, and although there were numerous I/O errors, it has reported ZERO errors thus far). syslog.txt
  2. It appears the problem was due to the Dynamix Cache Dirs plugin, which I had installed recently. After removing it, the issue has not surfaced since.
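For anyone chasing the same pattern, correlating the spin-downs with the errors doesn't require reading the whole syslog; a single grep over the relevant lines shows the sequence. A minimal sketch (the sample lines below are illustrative, with made-up timestamps and devices, not from my actual log):

```shell
#!/bin/sh
# Illustrative syslog excerpt (made-up timestamps/devices; same message shapes).
cat > /tmp/sample_syslog.txt <<'EOF'
May  1 02:00:01 UnRAID kernel: mdcmd (40): check
May  1 03:00:12 UnRAID emhttpd: spinning down /dev/sdb
May  1 03:00:12 UnRAID emhttpd: spinning down /dev/sdc
May  1 03:00:14 UnRAID kernel: sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May  1 03:00:14 UnRAID kernel: md: disk2 read error, sector=123456
EOF

# Show each spin-down event together with the errors that follow it,
# with line numbers so you can see how closely they cluster in time:
grep -nE 'spinning down|read error|FAILED Result' /tmp/sample_syslog.txt
```

On a live server you'd point the same grep at /var/log/syslog instead of the sample file.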
  3. This has been an ongoing, irritating issue with my Media Server for some time now; I can't pinpoint when this started happening, whether it was in recent 6.x releases or earlier ones (I don't recall experiencing this issue under v5, and certainly not under v4). Upgrading to a completely brand new system in every single aspect has not resolved it. As I watch a movie, every 20 minutes or so the video freezes for 10-20 seconds, almost like clockwork. I have several different media players from different companies, so it's not an issue specific to one player (though my Oppo 203 seems to experience this less often; however, it is used only to watch 4K). If I log onto the NAS and spin up all drives, this seems to alleviate the issue for the remainder of the movie. These movies are full copies with no re-encoding, so they command a lot of NAS/network traffic when playing. Regarding the network, I recently moved, so the topology is completely different, with only one switch (16-port rackmount NetGear ProSafe) transferred to the new LAN. My drives are set to the default 1-hour spin-down delay. I don't believe the issue is with the specific drive the movie is on, but I haven't dug into the logs to see if UnRAID was attempting to spin that drive down, or any other drives. Perhaps it's when UnRAID attempts to spin another drive down that it unexpectedly causes a problem with other drives. I have the Cache Directories, Auto Fan Control (I no longer need this and will delete it), Nerd Tools, VM Wake-On-LAN, and Unassigned Drives plugins. My VMs are currently not running (I haven't resolved a libvirt service error since the hardware migration), though they have been in the past. When it happens again, I will try to review the logs at that time. Until then, has anyone else experienced this issue?
  4. Interesting. I will test on my 9211-8i to see if the LSI's are more immune to the PWDIS feature...
  5. Is there one molex connector per backplane, or two? The single-molex version is the newer one, and that's the version in my system on which the pins had to be taped. If you have the single-molex version, what HBAs are you using?
  6. It didn't with mine with 1 LSI 9211-8i, 1 SuperMicro SAS2LP, and 1 IBM M1015, connected to a Norco 4224. I'm not sure which of the HBAs I initially tried without taping, but with at least two drives I experienced no power-up unless the pins were taped, after which I started taping them all immediately after shucking. FWIW, the Norco in this setup is the "newer" version that supports only a single PSU and thus has a differently designed "backplane". I've retired this case and am relocating the X9 hardware to my older, dual-PSU-capable 4224 for my backup server. And the next PWDIS drive I get, I'll test without tape in both my new Chenbro and the Norco system (all three HBAs)... FYI, rebuilding an 8TB drive on the X9/Norco setup netted an average of 110MB/s. Rebuilding a 6TB drive of the same array on the new X11/Chenbro averaged 198MB/s, and replacing an 8TB with a 10TB averaged 117MB/s; CPU utilization maybe pegged 50% now and then, but it was typically idling around 5%, unlike the X9/Pentium which maxed out quite often during rebuilds. Certainly a big step up, especially for my VMs. Now I have to build a noise reduction cabinet to quiet down this maddeningly howling rig, and upgrade my home network to 10Gb...
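Back-of-the-envelope, those averages translate directly into rebuild hours (drive bytes divided by rate). A quick sketch using decimal TB and MB/s, ignoring the slower inner tracks:

```shell
#!/bin/sh
# Estimated rebuild hours = drive size / average rate.
# TB * 1e12 bytes / (MB/s * 1e6 bytes-per-second) = TB * 1e6 / MBs seconds.
est_hours() {
    tb=$1; mbs=$2
    echo $(( tb * 1000000 / mbs / 3600 ))
}
echo "8TB  @ 110MB/s: ~$(est_hours 8 110) h"    # ~20 h
echo "6TB  @ 198MB/s: ~$(est_hours 6 198) h"    # ~8 h
echo "10TB @ 117MB/s: ~$(est_hours 10 117) h"   # ~23 h
```

Which matches why the bigger replacement drives still take roughly a day despite the faster hardware: capacity grows faster than sustained throughput.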
  7. I got it all sorted out, so this is to provide closure and hopefully help anyone else who experiences similar networking issues. By default, UnRAID automatically activates and uses the first network interface it comes across, which may not be the physically first one (on my SuperMicro X9 board, UnRAID selected eth1 as the default rather than eth0), regardless of which port is physically connected to the LAN. All other network interfaces are then automatically set to inactive (shut down). Since I didn't know the physical port order on the 2-port IBM NIC, I had plugged the ethernet cable into the most convenient port at the time of initial setup, which happened to be port 2 (eth1), the port closest to the motherboard PCIe slots. This caused all the problems I had experienced with no IP assignment and thus no network connectivity other than through the dedicated IPMI port. During my back-and-forth testing, I had by chance reconnected the ethernet cable to the other port (port 1, or eth0) when I installed the trial UnRAID stick. This is why network connectivity had become fully functional under that system, which I had initially attributed to some configuration difference between my registered UnRAID stick and the trial; that was not the case. It's all due to how UnRAID detects, enables, and disables the network ports it discovers. My recommended procedure for handling a new system and/or new NIC with multiple ports is to boot UnRAID into GUI mode (I had never before used the console GUI, as all my UnRAID servers are headless) to determine which port has been assigned as the default, and whether it's currently active. If not, you can either switch the cable to the actively assigned port, or manually enable and assign the port you wish to use on the Network Settings page.
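On a headless server you can also see which port actually carries the cable from any console/IPMI session with ip -br link; the interface showing UP has link, and that may not be the interface UnRAID enabled. A sketch against canned output (the sample below is made up; on a live system just run ip -br link directly):

```shell
#!/bin/sh
# Sample 'ip -br link' output for a 2-port NIC (illustrative MACs/states).
cat > /tmp/iplink.txt <<'EOF'
lo     UNKNOWN 00:00:00:00:00:00
eth0   DOWN    00:1b:21:aa:bb:00
eth1   UP      00:1b:21:aa:bb:01
EOF

# The interface that is UP is the one with the cable plugged in;
# UnRAID's default interface is whichever it enumerated first,
# which is not necessarily the same one.
awk '$2 == "UP" { print $1 " carries the cable" }' /tmp/iplink.txt
```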
  8. Most of the Easystore/My Book drives are "white label" Red drives and may need pins 1-3 taped to disable the SATA 3.2+ spec Power Disable (PWDIS) function in order for most motherboards to recognize them through their SATA ports, both onboard and via HBAs. The current anecdotal evidence, and my own experience, shows the 10TB Easystores are presently assembled with white-label WD100EMAZ drives, which have this PWDIS feature. On my SuperMicro X7 and X9 boards with a mix of SAS2LPs, 9211-8i's, and M1015 HBAs in Norco RPC-4224 chassis, I've had to tape the pins for these newer-spec'd drives. I just got my new X11 system up and running with a Chenbro RM43348 chassis; I went ahead and taped the WD100EMAZ pins, but I haven't tested whether it actually needs it.
  9. OKAY, from this tidbit, I discovered how to enable the web GUI at the console via this "Boot GUI Mode" thread (I manually edited the syslinux.cfg file on another computer), and just in case, I also learned how to pause console text scrolling by piping output through "| less", should I need it for screen capture purposes. Now, due to a SNAFU (while fiddling behind the "retiring" server during a disk-to-disk file move operation, I inadvertently bumped the power cord enough for it to turn off), I downloaded a trial UnRAID and used that to boot my new build and enable GUI mode while my "old" server is busy checking parity. Right off the bat, the trial UnRAID recognized the PCI NIC and acquired a network IP address, and all was well according to both the GUI and lspci at the text console. So my next step is to retest my "Pro" UnRAID stick and see if it, too, will now recognize the PCI NIC. If not, at least I've narrowed it down to something in the configuration of the Pro stick, and I'm prepared for further troubleshooting and tweaking now that I know how to enable the web GUI locally and control console text scrolling...
  10. Yea, I'll do that once I find the time to bring down the "retiring" machine again (I've been swapping its stick to the new machine for setup and testing). I'm also going to attach a monitor to the new machine to get better "eyes" on the output. I also have a brand new ASUS XG-C100C that is intended for my other UnRAID NAS, and I'll test it in place of the IBM 1G NIC I'm having problems with to see if it gets automatically recognized by UnRAID...
  11. Well, I was able to "disable" via BIOS, but I still can't seem to get UnRAID to recognize the PCI NIC; now there don't appear to be any Ethernet NICs available in UnRAID at all. The problem is that since I can't get a network-assigned IP address, I can't access UnRAID's web GUI from anywhere on the network. I can't figure out how to pause the scrolling in the small window of the SuperMicro mobo's iKVM/HTML5 screen, so there's no way to see what has scrolled out of view. I tried the Java-based console redirection but continually get errors getting the screen working. Regardless, I don't think being IBM-branded should be problematic; it's still an Intel-chipped dual-RJ45 Pro/1000 NIC. So at this time, I'm dead in the water, as I can't access UnRAID's GUI, nor can I effectively use the console since I can't pause the scrolling of pertinent information before it disappears.
  12. I know that in the past, UnRAID always used the first onboard NIC it came across, with no option to select any other onboard or PCI NIC. Has this changed with the current version of UnRAID? I'm in the last steps of transitioning to my new SuperMicro X11SPH-nCTPF build, but since it has a pair of SFP+ 10Gb ports and I haven't yet upgraded my network to 10Gb, I'm trying to use a PCI 1G-BaseT NIC and haven't been able to get UnRAID to default to it, as there is no way to disable the onboard NICs via BIOS or other means. Also, I'm not sure the IBM Intel Pro/1000 39Y6127/39Y6128 NIC is being recognized at all, if I read the lspci list correctly (I'm a real noob at this command). So, two questions: 1) How do I set up UnRAID to use a specific NIC? I came across a few threads which mentioned "stub", but those were way too technical and lacked a "layman's" step-by-step approach that I could follow. 2) What are some UnRAID-"friendly" (i.e. plug-n-play) NICs I could select from, if the one I got isn't compatible?
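As a follow-up on the lspci confusion: filtering for the Ethernet class is enough to confirm whether a card is detected at all, before worrying about which one UnRAID defaults to. A sketch over sample output (the controller strings below are illustrative; on a live system you'd run lspci directly):

```shell
#!/bin/sh
# Sample lspci output (illustrative device strings; a dual-port card
# shows up as two PCI functions, .0 and .1).
cat > /tmp/lspci.txt <<'EOF'
00:1f.2 SATA controller: Intel Corporation C600 SATA Controller
02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller
02:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller
EOF

# On a live system: lspci | grep -i ethernet
# Count detected Ethernet functions; 0 means the card isn't seen at all.
grep -ic ethernet /tmp/lspci.txt
```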
  13. I bit the bullet and did the long route. I just didn't want to spend another 48 hours or more building parity and then rebuilding the drive, but it's the safest bet.
  14. Yea, the data rebuild finished with massive errors; I didn't know anything was wrong until I got home and the rebuild had completed. Before I left, I had been checking in on it all day; it had less than 2TB left to go in rebuilding from 4TB to 6TB, with zero errors at the time. I'm just going to reinstall the original 4TB and rebuild the parity disk itself. Can't wait to migrate this server to the new Chenbro case with the new SM X11 board. This current RPC4224 case with that broken SAS backplane has given me fits since day one.
  15. In my ongoing saga, a cable got disconnected during a disk-replacement data rebuild (actually, it's the metal cage on the SAS backplane that holds the SAS cable in place that has partially broken off, so it no longer securely locks the cable). After resecuring the cable, I ran a parity check (no write corrections) and now it's coming across sync errors, but the log shows only very generic entries, for example: Oct 15 09:05:16 UnRAID kernel: md: recovery thread: P incorrect, sector=9344918184 I don't know which disks these errors are being detected on, or if it's the parity disk itself, so my question: how do I rebuild the replacement drive correctly? Should I reinstall the original drive and rebuild the parity drive, then once that successfully completes, reinstall the replacement drive? Or is there another way to just restart the data rebuild? I really need to know which drive(s) the sync errors are occurring on, as I want to make sure it's only the replacement drive itself. All other data drives should be fine, and parity should be accurate, as I rebuilt it from scratch a few days ago after my previous SNAFU.
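One note on reading those entries: "P incorrect" means the parity computed across all data disks disagreed at that sector, so with single parity the message by itself cannot identify which disk is wrong. You can still count the errors and see what sector range they span, which hints at whether the damage is localized. A sketch over sample lines (made-up sectors):

```shell
#!/bin/sh
# Sample parity-check sync-error lines (illustrative sectors).
cat > /tmp/sync.txt <<'EOF'
Oct 15 09:05:16 UnRAID kernel: md: recovery thread: P incorrect, sector=9344918184
Oct 15 09:05:16 UnRAID kernel: md: recovery thread: P incorrect, sector=9344918192
Oct 15 09:07:01 UnRAID kernel: md: recovery thread: P incorrect, sector=9352000000
EOF

# Total sync errors:
grep -c 'P incorrect' /tmp/sync.txt
# Lowest and highest affected sector (first and last after numeric sort):
sed -n 's/.*sector=//p' /tmp/sync.txt | sort -n | sed -n '1p;$p'
```

Against a real /var/log/syslog the same two commands give the error count and the sector span the errors cover.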
  16. I figured that however the limit was established, many of the dependent routines were coded with that limit in mind, and changing it would cascade through all of them in order to support the new limits. Ergo, not a simple task.
  17. Yea, I've always been wary of the SAS2LPs, as there have been unexplained "drops" that eventually get cleared up by reseating or restarting. Hence my very thorough swapping of SAS cables through multiple changes, both to see if it magically corrects itself and whether a dropped disk follows the cable. In this case, it was a cable. The problem is that the corruption was already done by the parity check with auto-correction. Anywho, I'm switching to a Chenbro chassis and a new mobo which has two 12Gb/s SAS ports built in, so I will be doing away with the SAS2LPs; the current mobo will be swapped into my backup server, which already has one LSI, and with the addition of the IBM and the expander card, I can finally rid myself of the problematic SAS2LPs entirely.
  18. I ran xfs_repair again via the terminal, and the attached text output is what I got. And this is my new syslog from startup after getting home today (I didn't save the original log when the errors occurred, as I felt at the time it was all related to the bad cable and not a hardware failure of the drive):
      Oct 9 17:50:24 UnRAID kernel: XFS (md10): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 117
      Oct 9 17:50:24 UnRAID kernel: XFS (md10): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
      Oct 9 17:50:24 UnRAID root: mount: /mnt/disk10: wrong fs type, bad option, bad superblock on /dev/md10, missing codepage or helper program, or other error.
      Oct 9 17:50:24 UnRAID emhttpd: /mnt/disk10 mount error: No file system
      I figured that if the emulated drive also shows no mountable file system, then the data is essentially lost unless I engage a data recovery service. It's all media files and I could repopulate; a time-consuming but not insurmountable task... Terminal Saved Output
  19. Apparently one of my SAS cables went bad after I physically moved the server's site location. I determined it was a bad cable by swapping cables to different ports and then different backplanes; the issue followed the cable. Unfortunately, unRAID did not detect any issues until I started a parity check with auto error correction and suddenly got thousands of errors on the two affected drives, both resulting in No Mountable Filesystem errors. I was able to correct one of them, but xfs_repair failed on the other, so presently I have no way of seeing any of its contents, not even in emulated mode. I've tried both the webGUI and terminal sessions to fix the drive, to no avail. So my question: should I just reformat the drive and rebuild the data directly over it, or is there a possible way to recover the data on it? If there is a reasonable way, then I'll pull it and get a brand new drive to rebuild the data onto, and use the old drive to compare or recover any corrupted data. Otherwise, I won't spend the extra coin and will just rebuild over it.
  20. Perhaps. For my specific application, I need just one "share point" to access my entire library; I can't have multiple arrays, as that would make accessing the media very cumbersome (everyone would need to know where certain titles exist, or have to individually and separately access each array until they find what they're looking for). Still, there hasn't been any real answer as to why the 28+2 data drive limit exists; it seems an arbitrary number based on assumptions about target-audience usage. In the meantime, I'm just throwing more money into the pit by being forced to switch to 10TB drives to replace perfectly functioning 6TB drives in the media server, as it's presently the only way to increase data capacity within the 28+2 data drive limit.
  21. This is what I currently do, as I have two unRAID boxes in Norco RP4224s (one of which will be replaced with the new Chenbro 43348), but I've reached the point where I just don't need any more capacity on my backup unRAID, as it has more than enough to handle all my computers. So all these replaced drives are not being utilized in any meaningful manner while wasting power. I'm now just stacking the unused drives on a shelf to gather dust...
  22. Yea, I just noticed those. The latter is somewhat generic, as 'device' could refer to either data or cache, while the former is a bit more specific, though I would certainly be satisfied with 45-60 data drives, as I just received my Chenbro 48 (3.5") + 2 (2.5") drive top-load chassis and a 45-60 drive limit would maximize this investment. It doesn't hurt to keep adding multiple feature requests to increase its visibility!
  23. Request to have the current 30 data drive (28+2) limit increased. My specific application is a media server, and with the advent of 4K videos, the storage needed per video has almost tripled. I've hit the maximum 30 drives, and as I replace my smaller 4TB and 6TB models with 8TB or 10TB ones, I cannot reutilize those 4TBs within the same data drive pool; cache drives serve no real purpose for me on this server. Speaking for myself, I am willing to pay an upgrade or higher-tier license fee for the ability to go beyond 30 data drives.
  24. I, too, would like to see the 30 data drive (28+2) limit increased. My specific application is a media server; therefore, data integrity is not the highest concern, as I could always reload any media files lost. I agree with Ashman that the responsibility rests on the end user, and so should the option to incorporate a larger data pool beyond 30 drives. With the advent of 4K videos, the storage needed per video has almost tripled. As the OP, I'm about to purchase a SuperMicro 36-bay chassis, since in my current Norco 4224 I've begun putting drives loosely inside the enclosure in any free space around the mobo, as well as on PCI backplane brackets. Since the chassis is installed in a rack with other enclosures, even with sliding rails, accessing these drives requires removing the enclosure immediately above it in order to get the cover off, since it only slides out about 2/3rds of its length; a big PITA. I did a quick search in the feature request thread but didn't see whether increasing the data drive limit had been requested. If it hasn't, I will post a new feature request.
  25. I just installed 6..1-rc5 and under OS X Safari and Firefox, I still don't have any cursor inside the VNC window for Ubuntu 17.10 VM.