Posts posted by gubbgnutten

  1. I’d gladly pay for one disk of extra redundancy.

     

Please don't dismiss #1/#2 until you at least get a quote for the royalties requested. It would be awesome not to have the entire array spun up. It's not a deal breaker for me, but it would be really nice.

     

Given my writing pattern, I'd rather have dual fault tolerance and all drives spinning while writing than single fault tolerance and mostly spun-down drives. There's always the cache pool if it bothers me...

  2. ...and for clarity please include what you mean by "getting hacked" in your answer to jonp's question. :)

     

If a service is insecure it simply shouldn't be exposed to the Internet. Depending on the situation, I might share such a service with select friends using OpenVPN or SSH tunnels.

     

    If I had a (supposedly) secure service exposed to the Internet and got lots of malicious access attempts from Chinese IP ranges, I would have my firewall block all known Chinese IP ranges from accessing the service as I don't expect any connections from China. Blocking those IP ranges wouldn't actually improve security that much, of course, but it wouldn't make it any less secure and my logs would probably get slightly less cluttered. Win.
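
For anyone curious what that kind of blocking looks like in practice at the Linux level, here's a minimal sketch using ipset and iptables (the list file, set name and port are placeholders, not from any actual setup):

    # Load aggregated CN CIDR ranges into an ipset set (one range per line in cn.txt).
    ipset create cn_block hash:net
    while read -r range; do
        ipset add cn_block "$range"
    done < cn.txt
    # Drop traffic from those ranges before it reaches the exposed service (port 443 here).
    iptables -I INPUT -p tcp --dport 443 -m set --match-set cn_block src -j DROP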

  3. Works for me in 6.0.0.

     

    I had been running with VT-d disabled in BIOS since the first beta I tried in order to get the onboard 88SE9172 to work in 6. Tried enabling VT-d now after RobJ's post, and all my disks are still present and functional. Yay! :-)

     

This is the second report I've seen of a card that *shouldn't* work (according to outside reports) but *does* work.  The other report seemed to involve a card with newer firmware, so perhaps Marvell has quietly fixed the problem (but the fix is only available in new purchases).

     

    I'm more inclined to believe that the kernel now includes fixes than that someone replaced my hardware without me noticing.  :D

     

The 88SE9172 did not work properly in 6b8; drives dropped out immediately on boot after a bunch of DMAR errors, until I connected the IOMMU dots and disabled VT-d. In 6.0.0, on the other hand, the drives still work after enabling VT-d. The only hardware changes between 6b8 and 6.0.0 have been the addition of a couple more hard drives and a new NIC. No firmware updates or motherboard replacements...

4. I hadn't given it a thought that 11 MB/s was slow... now I have another thing to think about! I'm using brand-new HGST 3TB drives with a brand-new Samsung SSD on a wired powerline-adapter Ethernet network; I should be getting much higher speeds! And yes, I had cache enabled on the shares, because I saw the red triangles to the left of each share I was using.  I actually started using TeraCopy because Windows Explorer was giving me less than 6 MB/s.  The ironic thing is that I can stream huge 1080p files without a hiccup!  I'm going to have to run more experiments and report back; this doesn't seem right at all.  Thanks for your help!

     

    The powerline part is most likely your number one limiting factor.

     

    As for 1080p streaming, not so ironic. It doesn't require that much bandwidth.
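
To put rough numbers on it (the bitrates are ballpark figures, not measurements): even a high-bitrate 1080p Blu-ray rip peaks around 30-40 Mbit/s, which is only about 4-5 MB/s. A powerline link managing 11 MB/s (~90 Mbit/s) streams that comfortably, while being nowhere near the ~112 MB/s a healthy gigabit wire can move.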

5. Those speeds are very good. You should get that range even with a parity drive, since you're using a cache [as long as you don't move more data than you can cache before the Mover has a chance to empty the cache].

     

    Sure. In this case, more data has been transferred than the cache drive can hold, so speeds would certainly be lower towards the end with parity. But I have a hard time imagining that these volumes of data will be transferred very often. No biggie.

     

More worrying, though: If the web UI in the screenshot is up to date, the cache drive isn't being used for the destination "movies". The amount of used space looks more like just a Docker image (around 10G). No user share write caching...

     

I'm moving nearly 2 TB of movies to my new server (through my 750GB SSD cache).  Getting about 100MB/s average transfer speed over the entire process.  I think that's pretty good, but wanted to check with folks to see what kind of speeds they are getting, in case I should be considering something.  A 1Gb/s network is about 125MB/s, so with protocol overhead and such, 100MB/s (at a 20% overhead) seems OK?

     

This is what the Array Status page and the file transfer look like with 5GB to go.

     

    http://my.jetscreenshot.com/12412/20150423-zkoq-230kb

     

    Comments?  Suggestions?  "just enjoy it"?

     

    I typically use FTP for transfers, so I can't comment on the overhead and transfer rates for SMB. Still, you might be slightly limited by the speeds of the HDD you are reading from and/or the ones you are writing to, but it isn't a life-changing difference in speed.  :)

     

From a network perspective it's pretty good. With gigabit networking you can't go much faster, so I'd say "just enjoy it" if it weren't for two things...

     

1. Speed tests without parity are not very interesting to me; they're only relevant for the initial transfer, not real-life use.

2. The cache drive doesn't seem to be in use for the transfer. If the web UI in the screenshot was up to date, you should probably double-check your user share cache drive settings.

     

6. But I could see an argument for a stop array button and a power down button (which is actually just stop array followed by power down).

     

    I am also a bit annoyed by the lack of an ever-present power down button in the web UI and the need for multiple steps, but just a bit. Pushing the power button works just fine for me (thanks LT for fixing that bug a bunch of betas ago!).

     

    If you need a workaround for quick and easy power down via a browser, bookmark http://tower/update.htm?shutdown=apply (replace tower with the correct server name).
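
And if you'd rather script it than bookmark it, the same endpoint should work from a shell (a sketch, assuming your version still accepts it; wget works just as well as curl):

    # Trigger a power down through the web UI endpoint (replace tower with your server name).
    curl "http://tower/update.htm?shutdown=apply"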

7. I guess the easiest way to see if the cache drive is properly set up is to write data to a user share and check in the web UI whether the used space on the cache drive went up. As for configuration, you might have to check the individual shares as well and make sure that they are set to use the cache drive... I had that problem once: added a cache drive but didn't actually have it used. That was a long time and many versions ago, though.
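
A quick console alternative to eyeballing the web UI, for what it's worth (a sketch assuming the cache is mounted at /mnt/cache; "someshare" and "bigfile" are placeholders):

    df -h /mnt/cache                  # note the used space
    cp bigfile /mnt/user/someshare/   # write something through the user share
    df -h /mnt/cache                  # used space should have gone up if caching works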

     

    Anyways, a couple of things to consider regarding your performance evaluations:

     

If you have plenty of RAM in your server, it will use RAM to buffer lots of writes. As long as there is RAM available, files transferred to the server over the network will move as fast as the network can shuffle data. You won't notice any difference between cache drive/no cache drive writes until you've written enough data to fill the RAM buffer. With 16GiB of RAM and a benchmark application writing a 4GiB file over the network, for example, the network will in virtually all cases be the bottleneck. It doesn't matter if the transfer is destined for an SSD or an array device with parity active.
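
You can actually watch this buffering happen during a transfer. A sketch; nothing unRAID-specific about it:

    # Dirty = data sitting in RAM waiting to be flushed; Writeback = currently being written out.
    # Watch the numbers grow during a network transfer and shrink as the kernel flushes.
    watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'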

     

If you performed your tests with only one data drive assigned, you hit a special case for the parity update which made it much faster than it would have been with multiple data drives: with a single data drive, parity is simply a mirror of that drive, so both disks can be written in parallel without the usual read-modify-write cycle.

  8. I think you are making it way too complicated... Let your Mac and CCC bother with HFS+ and disk images. unRAID should only provide storage of the image file created/edited via your Mac, and for that you don't need additional software.

9. That's true -- if the amount of data written is always less than unRAID can cache, then the perceived write speed could indeed be that high. With 16GB of RAM, that's a possibility. But the actual writes are NOT happening at those speeds.

     

As I'm sure you know, the 80MB/s you see "for the first gig or 2" isn't actually being written to the disk at that rate ... it's just being cached in the server's memory.

     

So true. I have 16GiB of RAM in my server, and most transfers clock in at well above 100MB/s (typically 112MB/s), but if I issue the "sync" command directly afterwards, it takes some time to finish...
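
Easy to demonstrate, by the way (file name and share are placeholders):

    cp somebigfile /mnt/user/someshare/   # returns quickly thanks to the RAM buffer
    time sync                             # blocks until the buffered data actually hits the disks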

     

The RAM buffer generally keeps perceived array writes quick enough that I don't bother with a cache drive and the Mover, or with setting reconstruct-write manually, in everyday use.

10. Powerdown package 2.08 installed with rc.unRAID patched: the line with /root/mdcmd stop is commented out.

    Unclean shutdown (6b9_powerdown208_patchednostop_unclean.txt)

     

    This is guaranteed to cause an unclean shutdown.  The array is stopped by powerdown in the rc.unRAID script.

     

    Well duh. That's the point I was trying to make in the post you replied to. Granted, English is not my first language :-)

     

    I looked back through this post, but never saw anything about the plugins or extra stuff you have installed.  Do you have any plugins running?

     

    For those logs - no plugins added, no extras added.

     

Except for the Powerdown package (when mentioned), there was nothing but Docker Manager and unRAID Server OS in the plugin list, and not a single Docker container in sight. The reason you didn't see anything about plugins or extra stuff in that post is that I didn't have any installed for those logs, and I expected the syslogs to speak for themselves in that regard.

     

    I can't work with the logs you posted.  I'd like to see a plain txt file of the powerdown shutdown.  It will tell me what powerdown is stopping that might be causing the unclean shutdowns of the stock unRAID.

     

    Depending on platform, might I suggest gunzip or 7-zip?
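
For reference, on the unRAID console it's a one-liner:

    # Turn an attached .gz back into a plain text file (use -k to keep the .gz, if your gzip supports it).
    gunzip 6b9_powerdown208_clean.txt.gz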

     

    Where exactly does the stock unRAID try to stop the array if everything is working as expected?

  11. Post a log of the unclean shutdown.  If powerdown is installed, the log is saved on the flash drive.

     

I've tried powering down in so many situations that I don't really consider there to be one specific "the unclean shutdown". The single line for "other cases" represents quite a few variations. See the attached 6b9_powerbutton_arraystarted.txt a couple of posts back for one of the unclean shutdowns. It should still be valid, although Docker might not have been manually stopped on that one before the power button was pressed.

     

As I wrote, shutdowns with Powerdown installed work reliably and cleanly, most likely because rc.unRAID stops the array if it's running.

     

    A few logs for comparison:

     

    Powerdown package 2.08 installed:

    OK (6b9_powerdown208_clean.txt)

     

Powerdown package 2.08 installed with rc.unRAID patched: the line with /root/mdcmd stop is commented out.

    Unclean shutdown (6b9_powerdown208_patchednostop_unclean.txt)

     

    Stock (without Powerdown package), Docker manually stopped before pressing power button. Missing the end of the log for obvious reasons...

    Unclean shutdown (6b9_stock_dockeralreadystopped.txt)

    6b9_powerdown208_clean.txt.gz

    6b9_powerdown208_patchednostop_unclean.txt.gz

    6b9_stock_dockeralreadystopped.txt.gz

  12. The following methods seem to reliably result in clean shutdowns on my lab server:

    • Web UI (stop array, then checkbox+button)
    • Executing /root/powerdown (or the symlinked /usr/local/sbin/powerdown).
    • Quick press on the power button after stopping the array using the web UI.
• Quick press on the power button after manually stopping the array in a manner I wouldn't use on a non-lab server (/bin/umount on /dev/md* and /root/mdcmd stop; see the sketch below)
    • Quick press on the power button with a Powerdown package installed (https://github.com/dlandon/unraid-snap/raw/master/powerdown-x86_64.plg)

    The following method seems to reliably result in unclean shutdowns:

    • Quick press on the power button in other cases than listed above.

Docker is running without apps, but stopping Docker via either "/etc/rc.d/rc.docker stop" or the web UI before pressing the power button does not help; the shutdown is still unclean. I can only get a clean shutdown when the array is nicely stopped before powering down, and I can't see that happening automatically in stock 6b9 when the power button is pressed. Switching to runlevel 0, I don't see any opportunity for an array stop via /etc/rc.d/rc.0, unless there is a custom stop script present in /boot/config/stop...
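
For reference, the "manner I wouldn't use on a non-lab server" from the list above boils down to something like this (paths as on stock 6b9; really, don't do this on a server you care about):

    # Unmount every array device, then stop the md array by hand.
    for dev in /dev/md*; do
        /bin/umount "$dev"
    done
    /root/mdcmd stop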

     

    I’m probably overlooking something vital, as I can’t see how anyone could get a stock system to power down nicely when the power button was pressed (with the array running at the time). Would love to have someone point out what I’m missing so I can read up on it :-)

  13. Can this be closed now?

     

    Not unless you made changes that fix this in the upcoming beta 10, or consider this to be either reasonable behaviour or user error.  :)

     

I saw the same problem earlier: "No space left on device" when trying to create a cache-only share on a newly formatted BTRFS drive. I could, however, create the folder manually with "mkdir /mnt/cache/appdata" and start using it right away, and I couldn't reproduce it after a restart, so I didn't think too much about it back then.

     

    Seeing this topic and reading ljm42's post made me remember I had this problem, so I sacrificed the cache drive on my lab server and reproduced it with the following steps:

     

    1. Set cache drive to BTRFS

    2. Start array

    3. Format unformatted disks

4. Try to add a cache-only share "appdata". Silently fails in the web UI, with a "No space left on device" message in the log.

    5. Repeat 4, identical result.

    6. Stop array

    7. Start array

8. Try to add a cache-only share "appdata". OK.

     

    System Log attached.

    6b9_btrfscache_cacheonlyfailed.txt.gz

14. There are numerous ways to get the system log, e.g., click on Tools, then System Log.  Capture that log, then click on the Log button on the menu bar to bring up a real-time log.  Then hit your power switch and let's see what is causing your unclean shutdown.

     

Another log can't hurt, I guess...

     

    Shutdown via web interface: OK (never a problem)

    Shutdown via power button when array hadn't been started: OK (6b9_powerbutton_arrayneverstarted.txt for comparison)

    Shutdown via power button when array was started: Unclean shutdown detected (6b9_powerbutton_arraystarted.txt)

     

More or less a clean install of unRAID. Plex Media Server is installed as a Docker app on the system, but the container was never started between boot and power down in any of the above situations.

     

I could probably connect to the server's remote management interface and capture a video of any console messages if necessary.

    6b9_powerbutton_arrayneverstarted.txt

    6b9_powerbutton_arraystarted.txt

15. OK, this is an interesting theory, but I fail to see the connection between VT-d and the disk device drops.  I realize you performed these steps and got a good result, but what made you think to disable VT-d in the first place?

     

    Right, probably should've mentioned the connection. I wasn't just randomly changing settings, although I realise it might have sounded like that. :D

     

    After upgrading BIOS and verifying that all (reasonably related) settings still had decent values, I started looking up the error messages more thoroughly and the trace eventually led to a discussion about VT-d. Archedraft's suggestion of googling "dmesg vtd marvell" probably says it all, but if needed I could provide dmesg output with the problem.
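
For anyone wanting to check their own system for the same symptom, something along these lines shows the relevant messages (the exact error wording varies between kernels):

    dmesg | grep -iE "dmar|iommu"   # DMAR faults are the telltale sign
    lspci | grep -i marvell         # confirm how the 88SE9172 shows up on the bus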

  16. Safe to use?

    Is what safe to use?

     

If you simply mean beta 8, then it is certainly safe to use if you are using base unRAID features.

     

    Assuming by base features you do not mean disk replacement expansion. It does not support replacing a smaller disk with a larger disk.

     

Well, sort of. You can replace a smaller disk with a larger disk, and your data will still be safe, right? You just won't get any of the extra space automatically at this time. Annoying for sure, but I'd still consider it "safe to use" in the context of that particular flaw, although there are obviously different opinions regarding what "safe to use" means...

     

    Speaking of beta software, it is certainly the case that things that “used to work” in the stable release might stop working in a beta.

     

    After running betas flawlessly for a while on a dedicated lab computer I decided to try 6b8 on a more normal server that previously ran 5.0.5 flawlessly. Didn’t start too well, of course… Long story short, two disks were missing after booting 6b8, and I had to chase the problem for a bit before I could have everything up and running. Was tempted for a while to revert to my backup…

     

So what went wrong? The server in question has a Gigabyte Z87X-UD3H motherboard, and the two missing disks were both connected to the Marvell 88SE9172 chip. Combing through the dmesg output showed that the kernel indeed detected the disks when booting, but then lost them after DMAR errors. Upgrading the BIOS didn't help, but disabling VT-d in BIOS made everything work nicely.

  17. 1. USB On Board

2. 10x SATA 6Gb/s ("I know unless SSD, it's pointless"), but is there anything wrong with the SATA controller?

3. Dual LAN ("once again overkill"). I haven't heard anything wrong with the Intel.

     

    Trying out a Z87 board at the moment. It is not the same model or brand, but I guess the chipset-related stuff is still somewhat relevant:

     

    1. Connected the unRAID flash drive via an adapter to an internal USB header. So far so good, very nice.

    2. Running pre-clear for 4-8 cycles on three drives in parallel. No problems for the first 66 hours (and counting). Speed as expected.

    3. No luck with the I217V LAN, using a separate Intel NIC for the time being.

     

    I was considering that particular ASRock board for a brief moment, but then remembered I usually stay away from that brand. Bad experiences from years back (probably not relevant anymore of course).

18. This card is detected by unRAID (v5.0) with the SATA Controller enabler script, and works perfectly with one drive (WDC_WD1502), but when the time comes to plug in more drives, it fails...

     

Lots of data corruption and controller timeouts... I had to unplug it and return it to the vendor...


    In what situations did you encounter data corruption and controller timeouts?

     

I'm using a Rocket 640L on 5rc11 (no enabler script though), and am in the process of adding my second drive to it. The second drive is currently pre-clearing, and I've run a parity check at the same time to stress the system a bit. So far nothing bad reported in the system log, and no parity errors found. Still waiting for the pre-clear results, and I guess simultaneous writes to both drives on the 640L remain to be tested as well.

     

EDIT: Two drives worked perfectly. Pre-clearing the third drive brought down the system: catastrophic failure, and perfectly reproducible. Two drives fine, three drives disaster. The controller card disappears under load.