Considering Unraid: please critique and/or advise



Hello all. I've been looking at building a file server for a long time, and have been lurking quite a while. I'm considering Unraid, firstly, and then possibly OMV or a free-standing Linux installation...or maaaaybe a Server 2019 install. I run a business and also enjoy movies and music. I'm not quite an /r/DataHoarder, but it wouldn't take much to push me over the edge.

 

My need is for a reasonable level of redundancy: real-time or snap-shotted parity should be fine. Most of the data just sits, so allowing the drives to spin down is very S.M.A.R.T. My backup plan is to take the crucial data, sitting on one drive, and back it up twice: a nightly, and a monthly. That used to be my 'D:' drive with robocopy. To facilitate all of this, I'm making a rack-mounted server (my first) in a small, well-ventilated room under lock and key.

 

Here are the decisions I've made or am making. Please feel free to critique them: better to reevaluate or take a step backward, now, than to lose a lot of money or data, later.

 

I bought a used rack. In that rack will sit the file server, a surveillance camera server (to-do), and eventually, a web server for the business (long-term to-do). I'll use two power strips: one going to line (240v 30amp), the other going to a large UPS (needs selection and purchase). The 30amp line will eventually be backed by an automatic transfer switch on a whole-house generator. Every piece of equipment will have redundant power: one power cable to each of the two sources. Anything that's not redundant (switches, modem, screen, etc.) will plug into a power distribution unit which itself has the redundant cabling. I'm still struggling to understand how to arrange the rack, i.e. which order for the equipment, which pieces face which direction (facing front, facing rear), how to run cables neatly, how to install and populate patch panels (RJ-11, RJ-45, RG-6), etc. It's a huge learning curve. It's no wonder that Information Technology (IT) requires a four-year degree.

 

The file server: I bought a used, 24-drive Supermicro with a Supermicro X9 motherboard, two Xeons, and 64GB of RAM. The bays are driven by three (3) SAS 9210-8i cards, sitting on an 846A backplane. I'll populate the bay with my existing 'D:' drive (henceforth, "data" drive) followed by five (5) 10TB HGST drives (henceforth, "media" drive(s)) to start with. The data drive will rcopy (to-do: which backup software?) to the nightly, and it, to the weekly. The 10TB drives will hold (4) x 10TB and (1) x Parity. For cache, I'm upgrading the file server with a PCIe NVMe card and 2TB drive. After much research, I chose the ASUS Hyper M.2 (4x4x4x4 Bifurcation) and an HP EX950. I guess I'll boot the thing from a small Samsung 960 Pro. I'm also dropping in an Intel X550-T2 since I require fast file ops from my desktop. Those are my hardware choices.

 

For networking, I'm considering going with a complete Ubiquiti UniFi solution. In the short-term, I'll reuse an existing 100GbE switch, and an existing Cisco 24-port POE switch. The Cisco will be strictly for IPMI and cameras (12 ports are POE). A lot of research will be required to create safe, public-facing entry points for remote maintenance and connecting to the camera software.

 

- I have a dozen other uses for this server, including MySQL, source control, web server storage, Emby, etc. Unraid needs to play nicely, and professionally. The thought of running from a USB key is, quite frankly, crazy. That would be a show-stopper. What are the latest and greatest how-to instructions for running sanely, from a drive?

 

- How do I make hard drive lights blink for safer administration? SES? SGPIO? What software? How can the software be integrated into Unraid? They're all to-do questions that I've back-burnered. I'm pretty sure it'll be OK, because I didn't choose a SAS Expander but have individual cabling from the SAS cards.

 

- Is a 30-day trial enough? I'm worried. I need to get everything running on the bench, then I need to unplug it all for a week or two as I work on the rack, at which point I can open it to the rest of the network and do some informal load testing. Thirty days is not a lot of time to try before you buy.

 

Well, any thoughts are appreciated: tips and tricks for building the server, known sticking points or land mines, suggested how-to pages...anything. I'll be posting in here, occasionally, as I have questions or if anyone would like updates. Thanks for listening, and please feel free to advise and/or (gently) critique my setup. The community surrounding Unraid feeds into my decision-making process. Cheers!

 

Edited by mmmann
Link to comment
2 hours ago, mmmann said:

The thought of running from a USB key is, quite frankly, crazy. That would be a show-stopper. What are the latest and greatest how-to instructions for running sanely, from a drive?

Unraid doesn't run from a USB key. The USB key contains the archive of the OS, and settings from the webUI. These are unpacked and copied into RAM at boot and the OS runs completely in RAM. The USB key is used very little, mostly just to save config changes to be reapplied at boot.

 

2 hours ago, mmmann said:

populate the bay with my existing 'D:' drive

Unraid must format any disk it will use in the parity array or cache pool. The Unassigned Devices plugin will allow you to mount and read other disks with various filesystems.

 

2 hours ago, mmmann said:

real-time or snap-shotted parity should be fine.

Realtime parity is the only way Unraid does it.

 

It might be useful for you to take a look at some of the documentation, since the storage design of Unraid (not RAID) may be somewhat different from what you are familiar with in other systems.

 

Here is the Overview from the wiki:

 

https://wiki.unraid.net/UnRAID_6/Overview

 

And of course the website:

 

https://unraid.net/

 

If you want to get into some of the details, see the pinned topics, such as the FAQs, in the subforums. Here is one:

 

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?page=2&tab=comments#comment-554741

 

Link to comment

Update on Unraid Testing: I'm stuck!

 

Help! I put my hardware together, loaded Unraid on an approved key, loaded my drives and ran parity all night. I thought I was ready to send some files over.

 

Unraid seems to be completely hung! I decided to enable the 10GbE card--why not, since it's part of the test, and I was planning to send 2TB over the wire. I went to Network Settings and, following a Spaceinvader One video, set 'eth4' to a static IP address and hit [Apply]. A message appeared inside the 'eth5' box that said, "please wait ... configuring interfaces", and it never came back. Long story short, I ended up issuing a shutdown at the root prompt. After a reboot, Unraid is dead.

 

My test of Unraid is dead in the water at this point. Any help would be appreciated.

Link to comment

Thanks. Unraid is alive, again.

 

This is an Intel X550-T2. According to Spaceinvader One, you assign 'eth4' a static IP address and enable Bridging (eth4, eth5) and hit [Apply]. Should I try again? Unraid v6.6.7. The adapter is found, looking at Tools->System Log ("Intel(R) 10GbE PCI Express Linux Network Driver - version 5.5.2").

Link to comment

file-server-diagnostics-20190419-0953.zip

 

Well, now I have a "red ball" next to one of my drives. Perhaps due to the shutdown(s)? Still, you'd think that Unraid, upon reception of a shutdown, would behave gracefully. Or maybe the drive really /is/ bad--it's brand new, and could be suffering from Infant Mortality. I found the instructions in the Troubleshooting documentation, and have uploaded the Diagnostics File.

 

Can someone please look at the diagnostics for me?

Link to comment

Wow. Another 24 hours before I can try to write my first file?

 

Trying to get the red-balled drive online, again, I ran the drive diagnostics, and then again, removing the '-n' flag. It seems that there was a dangling inode, probably due to the shutdown that I was forced to issue. Then I followed the advice in the Troubleshooting documentation to bring the drive back online. I stopped the array. I set the drive to unassigned. I restarted the array. I re-assigned the drive. God help me--1 day and 6 hours to do a rebuild?! There is NOTHING on any of the drives--they're empty of any user files. I have yet to write a single file, so far. Unraid has to rebuild 9.9TB of data again?

 

My test drive of Unraid is going poorly. Unraid hangs upon enabling an ethernet adapter. It doesn't gracefully exit on a shutdown? It thinks it has to spend 24 hours rebuilding a drive, due to one fault in the file structure? And the interface is difficult: you shouldn't have to refer to a Wiki to learn how to re-enable a drive--there should be a big [RE-ENABLE THE DRIVE] button next to the drive that's offline, or a drop-down with RE-ENABLE, REMOVE, ETCETERA options.

 

If a developer wants to talk to me, privately, please do. You can (a) gather my /fresh/ user experiences as feedback on the product, or (b) help me with this <redacted> product.

Link to comment

An unclean shutdown doesn't disable a disk; only a write error does. There are more errors in the short log you posted:

 

Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: rejecting I/O to offline device
Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: [sdc] killing request
Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: [sdc] CDB: opcode=0x7f, sa=0x9
Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: [sdc] CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
Apr 19 09:47:03 FILE-SERVER kernel: scsi 7:0:0:0: [sdc] CDB[10]: 00 00 00 08 00 00 00 08 00 00 00 00 00 00 00 01
Apr 19 09:47:03 FILE-SERVER kernel: print_req_error: I/O error, dev sdc, sector 64
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=0
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=8
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=16
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=24
Apr 19 09:47:03 FILE-SERVER kernel: Buffer I/O error on dev md3, logical block 0, async page read
Apr 19 09:47:03 FILE-SERVER kernel: Buffer I/O error on dev md5, logical block 0, async page read

These are hardware errors. I suggest you update the LSI firmware to the latest, 20.00.07, since the one you're using has known problems.
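If you want to see at a glance which array disks are throwing errors, a few lines of Python can tally the `md:` lines in a syslog excerpt. This is only a rough sketch: the sample text is copied from the excerpt above, the function name `count_md_errors` is made up for illustration, and matching a "write error" variant of the message is an assumption, since the excerpt only shows read errors.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt quoted in this post.
LOG = """\
Apr 19 09:47:03 FILE-SERVER kernel: print_req_error: I/O error, dev sdc, sector 64
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=0
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=8
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=16
Apr 19 09:47:03 FILE-SERVER kernel: md: disk3 read error, sector=24
"""

# The "write" alternative is an assumption; only "read error" appears above.
MD_ERROR = re.compile(r"md: (disk\d+) (read|write) error, sector=(\d+)")


def count_md_errors(text):
    """Count md errors per (disk, kind) pair in a block of syslog text."""
    counts = Counter()
    for line in text.splitlines():
        match = MD_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


for (disk, kind), n in count_md_errors(LOG).items():
    print(f"{disk}: {n} {kind} error(s)")
```

Running it over the full syslog from the diagnostics zip would show whether the errors are confined to one disk or spread across a controller.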

Link to comment

It is worth pointing out that Unraid has no idea that the drives are empty, which is why it wants to do the rebuild. The sort of time you quote is not atypical for 10TB of data. In this particular case you could just use Tools->New Config to reset the array. If you had data on the disks that would not be a good idea, as it could lead to data loss.
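To put a number on that, here is a back-of-the-envelope calculation using the figures quoted earlier in the thread (9.9TB, 1 day and 6 hours). It assumes decimal terabytes and a rebuild that reads or writes every sector once at a steady rate; the function name is just for illustration.

```python
def rebuild_rate_mb_s(size_tb, hours):
    """Average throughput in MB/s needed to cover size_tb in the given hours."""
    size_bytes = size_tb * 1e12      # decimal terabytes, as drive vendors count
    seconds = hours * 3600
    return size_bytes / seconds / 1e6


# 9.9TB in 1 day 6 hours (30 hours), the estimate quoted in the thread.
rate = rebuild_rate_mb_s(9.9, 30)
print(f"average rate: {rate:.1f} MB/s")
```

That works out to a little over 90 MB/s averaged across the whole disk, which is why the estimate does not shrink just because no files have been written yet.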

 

As was mentioned, a disk being ‘disabled’ means a write to it failed. You need to work out why. It could just be a cabling issue, but whatever the cause, it needs resolving.

Link to comment

I'm going to figure out how to flash the latest firmware to the LSI SAS 9210-8i boards. Then I'll give Unraid 24 hours to rebuild, and give it another go. I can't be more fair than that. I'll still require help on the 10GbE card. So I'm dead in the water until tomorrow, at least, and my 30-day try-before-you-buy timer is ticking away...

 

I stumbled around and onto the LSI web site, and downloaded the 20.00.07 firmware. The instructions are a joke--a bunch of mumbo-jumbo command line options that require extensive web crawling to decipher. Something like, "make a bootable DOS USB stick, copy these files, boot it, and run this program" would be nice. I've come to believe that you can't be a programmer /and/ write user-friendly code or documentation...or interfaces. The companies that conquer this task are the titans of the business world. Edit: I'm sure that the LSI series of boards is incredibly common. Why, then, doesn't Unraid have a hardware-compatibility module that spits out the appropriate warnings?

 

Shouldn't Unraid have buttons or a drop-down next to the offline drive? Shouldn't there be balloon hints on EVERY object? Where are the [?] clickables that bring up documentation, online and on tap, inside the program? Nobody should have to refer to a Wiki or a forum in order to run business-critical software. If I were running Server 2016, one call to Microsoft (and a $200 fee) would resolve just about any question I had. So $130 for an Unraid license starts to look like a bad deal, when, for a few hundred dollars, you can have a Fortune 100 company backing you. I'm just keeping it real, here. I hope the unraid.net folks are listening: these are the observations of a fresh presales client. As far as customer feedback goes, it doesn't get any better than this.

 

EDIT: I read the above response, something about a "Tools->New Config to reset the array." It sounds like that's a shortcut to bring an empty drive online more quickly. THEN...why is there not a [RESET THE ARRAY] option on the main page? This just further illustrates my point: the user experience is paramount to the operating of the software. If this was in production, running my (very) small business, I'd be royally <redacted>. I'd be where I am, now: begging for scraps of help off the forum table, trawling and crawling around the web, reading outdated wikis, etc. Once again, I'm just feeding back my initial observations and struggles and experiences.

Link to comment

Consulting the forum is a better idea than having some button whose function the user doesn't really understand and which may take an inappropriate action.

 

Unraid disables a disk when a write to it fails. But, that failed write still updates parity. So, the parity array still contains the written data, but the physical disk does not. And after a disk is disabled, Unraid doesn't use it again until it is rebuilt, because its contents are no longer in sync with parity and in fact it has something missing from it. The disabled disk will be emulated by the other disks until it is rebuilt from the parity calculation. This means that you can continue to get the data, AND it even means you can continue to write, to the emulated disk. This allows the array to continue to operate even though it is somewhat degraded due to the missing disk.
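A toy model may make the emulation easier to picture. This is a minimal sketch that assumes plain XOR single parity over tiny 4-byte "disks"; the helper `xor_blocks` and the byte values are made up for illustration and are not Unraid's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three tiny data "disks" and the parity computed across them.
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
disk3 = bytes([0x01, 0x02, 0x03, 0x04])
parity = xor_blocks([disk1, disk2, disk3])

# disk3 is disabled: its contents are emulated from parity plus the other disks.
emulated = xor_blocks([parity, disk1, disk2])
assert emulated == disk3

# A write to the emulated disk3 lands only in parity; the physical disk is untouched.
new_disk3 = bytes([0xFF, 0x02, 0x03, 0x04])
parity = xor_blocks([disk1, disk2, new_disk3])

# The rebuild recomputes every byte of disk3 from parity and the other disks.
rebuilt = xor_blocks([parity, disk1, disk2])
assert rebuilt == new_disk3
print("rebuilt disk3:", rebuilt.hex())
```

Note that the old physical disk3 no longer matches what parity says it should hold, which is exactly why it has to be rebuilt before it is trusted again.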

 

New Config isn't the default way to deal with this problem; rebuilding the disk is. But that decision should be made based on a lot of information that is best left up to people, and if the disk needs to be replaced, the system can't do that by itself anyway.

 

And to elaborate on another point about rebuilding empty disks. As far as parity is concerned, the only way a disk doesn't affect parity is if all of its bits are zero. This is known as a clear disk. And an empty disk is not clear, not by a long shot. It has the metadata that represents an empty filesystem, and it probably has a lot of other nonzero bits as well that aren't part of that empty filesystem. When you format a disk it doesn't get cleared, and if it had data on it before most of those bits from that data are still on the disk.
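The difference between clear and merely empty can be shown in a few lines, again assuming simple XOR parity. The parity bytes and the "metadata" bytes here are invented purely for illustration.

```python
def xor_pair(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


parity = bytes([0x5A, 0x3C, 0xF0, 0x0F])   # made-up existing parity

cleared = bytes(4)                          # a clear disk: every bit is zero
formatted = b"META"                         # stand-in for empty-filesystem metadata

# Adding a clear disk leaves parity exactly as it was.
assert xor_pair(parity, cleared) == parity

# Adding a formatted-but-empty disk changes parity, so it has to be accounted for.
assert xor_pair(parity, formatted) != parity
print("parity with formatted disk:", xor_pair(parity, formatted).hex())
```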

Link to comment
35 minutes ago, mmmann said:

Where are the [?] clickables that bring up documentation, online and on tap, inside the program?

Most of the settings in the webUI have help. You can show/hide the help for a particular setting by clicking on its label. And you can toggle Global Help on/off with Help on the menu.

Link to comment
35 minutes ago, mmmann said:

Shouldn't Unraid have buttons or a drop-down next to the offline drive? Shouldn't there be balloon hints on EVERY object? Where are the [?] clickables that bring up documentation, online and on tap, inside the program? Nobody should have to refer to a Wiki or a forum in order to run business-critical software. If I were running Server 2016, one call to Microsoft (and a $200 fee) would resolve just about any question I had. So $130 for an Unraid license starts to look like a bad deal, when, for a few hundred dollars, you can have a Fortune 100 company backing you. I'm just keeping it real, here. I hope the unraid.net folks are listening...

You do have built-in documentation. You click on the text alongside any field to get help on that specific field. Alternatively, you can click the ? at the top of the page to turn the Help text on/off for all fields on the page. However, feel free to go the Microsoft route if that is what you think is best for you!

Link to comment

Re: the built-in documentation: I wouldn't mention it unless it wasn't obvious to me, a new user. I spent a lot of time hovering my mouse around, clicking on things, trying to bring up some documentation or built-in help. I do see the [?] Help menu item. Believe it or not, I brushed right past it. At any rate, my perceptions are the real deal--they're absolutely fresh and therefore of value; I hope that unraid.net are considering them.

 

Edit: I had written that none of the drives were showing up. I figured out why: whereas I had been connecting as <name>/main, I tried to get to the machine from its DHCP address, and had forgotten to turn on scripting for that site. It's quite interesting: Unraid displays most of the main/ screen and leaves a big gap in the middle, waiting for javascript to fill in the list of drives.

 

Suggestion: make sure that there is wording in that big gap, like "PLEASE WAIT … JAVASCRIPT IS RUNNING TO DISCOVER DRIVES". Otherwise, I was staring at a half-formed screen, scratching my head.

Link to comment

I do want to make another observation: even though it seems that official unraid employees are not actively monitoring the General Help threads--I haven't heard from developers or managers or technical support--it does seem that the community approach is working fairly well. I am getting responses, and I'm making (slow) progress. Bravo to those who are standing in the gap, trying to help.

Link to comment

Notifications: I'm confused about notifications--the linked, colored boxes at the upper-right corner of the interface. Sometimes they're there, sometimes they're not.

 

I just clicked the [Sync] button, which will restart the rebuild of the 6th drive, and I was expecting a notification--a pink (?) box that states that a rebuild is in progress. No notification appeared, and I can't see a place to click to reveal or display the notifications. I'm confused.

 

A rebuild seems to be running, so a notification should have been displayed, right? Please help me understand notifications, how to enable them at all times, and/or how to display them at will. Thanks.

Link to comment
1 minute ago, mmmann said:

Please help me understand notifications, how to enable them at all times, and/or how to display them at will.

Settings - Notifications. You can control what notifications you get and how you get them. Personally, I don't even get them in the browser. I have those I consider important emailed to me. That way I can know about them immediately wherever I am without even needing to get to the Unraid webUI. Other agents besides email are also supported. And many plugins let you get notifications from them also.

Link to comment
27 minutes ago, mmmann said:

I do want to make another observation: even though it seems that official unraid employees are not actively monitoring the General Help threads--I haven't heard from developers or managers or technical support--it does seem that the community approach is working fairly well. I am getting responses, and I'm making (slow) progress. Bravo to those who are standing in the gap, trying to help.

Limetech (the Unraid company) is a very small organization, fewer than 10 employees. Support is mostly from the enthusiastic user community on these forums. If you get a license, it allows you to get upgrades forever, but it doesn't include any technical support. There is the possibility of getting paid support but most things can be handled in the forum with a little patience. Lots of helpful, friendly, and as I already said, enthusiastic users helping each other. And almost all the addons (dockers, plugins) for Unraid are developed and supported by the users.

Link to comment

> Limetech is...small. ...enthusiastic users helping each other.

So I gather. When you're small, you have to lean on community support. Suggestion: the moderator(s) should have a mandate to keep a sharp lookout for presales trouble, and forward it to the owner(s). Has anyone worked software? I'm retired from C++, Perl, et al. VERY IMPORTANT: the struggles of new users MUST be seen by the developers; there's no other way to work out the Human Factors Engineering issues in a product. My observations, a month from now, will be largely useless; the clock is ticking on my ability to reveal UI issues.

 

Notifications:

 

All notifications are set on for the Browser. Tools -> Notifications shows no archived notification(s).

 

#1 - Archived notifications should be (also?) available under Settings -> Notifications. How is an archived notif a "setting?" It really isn't. The user interface element is not well-placed.

 

#2 - The product needs a button (or something) on the main page(s) that brings up the latest notifications. That's my opinion--it would have solved my question without coming to the forum.

 

#3 - I started a Sync and didn't get a notification!

 

Thanks

Link to comment
10 minutes ago, mmmann said:

Has anyone worked software?

I wouldn't be surprised if most of the more helpful users here have/had IT careers of some kind. I am retired from a long career of it myself, all software development, with a healthy dose of hardware on the hobbyist side.

 

The UI has evolved considerably over the years and continues to do so. A lot of new features have been added, with more to come. This was originally just one guy's idea for a different way to do a NAS with a unique (not RAID) implementation of parity. To a large extent the UI in its current form was driven and implemented by users.

21 minutes ago, mmmann said:

the moderator(s) should have a mandate

The moderators are just users too, unpaid and enthusiastic. They mostly do what they want; some are more active than others, and some were very active in the past but now are seldom seen. The employees will jump in from time to time. I wouldn't be surprised if they pay a little more attention to the presales area than some of the other areas. You will seldom see them on any of the threads for the user created/supported addons, for example.

Link to comment

[Cancel] : "Cancel will stop Parity-Sync/Data-Rebuild. WARNING: canceling may leave the array unprotected!"

 

I'm afraid to click on [Cancel], although it seems that I need to in order to create a share (everything is greyed out under Shares). With 24 hours of rebuilding ahead of me, I hate to lose even one hour. (Yes, I'll take some time to research New Config, as mentioned earlier. Maybe I can cut down on the pain.)

 

#1 - Can the sync be restarted where it left off? If so, then the wording should be: "Cancel will temporarily halt the Parity-Sync/Data-Rebuild."

 

#2 - When tasks are not possible due to the array running, there should be wording to that effect, e.g. "User Shares Cannot Be Configured: Array is Running." It could also be a bubble.

 

Link to comment

Flashing LSI 9210-8i RAID or HBA cards.

 

I'm writing this in case someone needs to update the firmware and BIOS on LSI 9210-8i cards. Hopefully, web search engines will find this post and spit it out to the next, poor sucker that has to navigate the LSI update instructions. STEP #0: don't try and read the LSI update instructions. They're garbage--twenty-four pages of technical fru-fru and printouts of command line options.

 

#1: Download the firmware from www.broadcom.com/products/storage/host-bus-adapters/sas-9210-8i#downloads.

 

#2: Make a bootable DOS USB key using Rufus.

 

#3: Copy SAS2FLSH.EXE onto the USB. Copy the *.ROM file, inside the SASBIOS_REL directory, onto the USB. Copy the *.BIN file, deep inside the Firmware directory, onto the USB. There will be two (2) BIN files: the IR file is for RAID; the IT file is for HBA (Host Bus Adapter) use, i.e. the card passes each drive straight through to the operating system as a normal hard drive.

 

#4: Run SAS2FLSH -LISTALL and take a picture of the screen for posterity. This will display the current versions of the firmware and BIOS.

 

#5: Run SAS2FLSH -FWALL *.BIN. This updates the firmware on all the cards; it will take a while. Let MS-DOS convert the '*' to the actual file name.

 

#6: Run SAS2FLSH -BIOSALL *.ROM. Update the BIOS(es).

 

#7: Take another picture of the -LISTALL output, to be safe.

 

Reboot. Praying wouldn't hurt.
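For anyone who would rather not type the commands at the DOS prompt, steps #4 through #7 can be written out as a small batch file. The sketch below only generates the text; it uses the same SAS2FLSH options listed in the steps above, and the file names FIRMWARE.BIN and BIOS.ROM are hypothetical placeholders for whatever the downloaded package actually contains. Spelling out the names also avoids relying on the '*' wildcard.

```python
def flash_batch(firmware_bin, bios_rom):
    """Return the lines of a DOS batch file mirroring steps #4 to #7."""
    return [
        "SAS2FLSH -LISTALL",                  # step 4: record current versions
        f"SAS2FLSH -FWALL {firmware_bin}",    # step 5: firmware on all cards
        f"SAS2FLSH -BIOSALL {bios_rom}",      # step 6: BIOS on all cards
        "SAS2FLSH -LISTALL",                  # step 7: confirm the new versions
    ]


# Hypothetical file names; substitute the ones from the downloaded package.
lines = flash_batch("FIRMWARE.BIN", "BIOS.ROM")

# DOS batch files use CRLF line endings.
print("\r\n".join(lines))
```

Saving that output as, say, FLASH.BAT on the USB key next to SAS2FLSH.EXE would let you run the whole sequence in one go, though running the steps by hand the first time makes it easier to stop if something looks wrong.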

Link to comment

I've been running Unraid for a couple of days. Everything is running as expected with minimal to zero configuration. I've been messing around with dockers so far, but my array has been running smoothly since day one. I'm currently on my free trial but expecting to get a license soon.

 

I plugged an 8GB Kingston DataTraveler USB key into the motherboard USB, a couple of SATA drives, an old drive for cache (appdata for dockers mainly), 2 ethernet cables on LACP to my switch, and that's it.

Link to comment
12 minutes ago, mmmann said:

Can the sync be restarted where it left off? If so, then the wording should be: "Cancel will temporarily halt the Parity-Sync/Data-Rebuild."

This feature is under development.

12 minutes ago, mmmann said:

When tasks are not possible due to the array running, there should be wording to that effect, e.g. "User Shares Cannot Be Configured: Array is Running."

Not sure what the problem is here. You should be able to use everything normally during rebuild, though read/write of any disks will of course compete with the rebuild process for the disks.

17 minutes ago, mmmann said:

everything is greyed out under Shares

Maybe a screenshot and diagnostics would help make sense of this behavior.

Link to comment
