
Dr. Ew

Members
  • Content Count

    10
  • Joined

  • Last visited

Community Reputation

1 Neutral

About Dr. Ew

  • Rank
    Member


  1. Dr. Ew

    Slow NVMe Cache Pool

    I'm trying to figure out what the issue is with my cache pool. I have both 40GbE and 10GbE NICs, but I mainly use the 10GbE, so I'll refer to that for now. My NVMe cache pool isn't saturating the 10GbE line, while my unassigned array of spinners does saturate it, so I'm deducing it isn't a network issue. I have six 2TB NVMe drives in a Supermicro server; the client machine is an i9 Extreme and/or Xeon-W. I also have an array of 24 spinners on an LSI MegaRAID, attached as an unassigned device, and that array hits 1.03 GB/s, no problem. I haven't put more than two of the six NVMe drives in the cache pool yet, so I'm testing with a RAID 1 config of Intel 760p drives. It maxes out around 750 MB/s write. What could be the cause of this slow NVMe transfer rate?
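
    To take the network out of the equation, here's a rough sequential-write sketch I can run locally on the server. /mnt/cache is the usual UnRAID cache mount, but /mnt/disks/spinner_array is just a placeholder for wherever the unassigned array is actually mounted.

        #!/usr/bin/env python3
        # Rough sequential-write benchmark: compare the cache pool against the
        # unassigned array locally, with the network out of the picture.
        # Paths are assumptions: /mnt/cache = UnRAID cache mount,
        # /mnt/disks/spinner_array = hypothetical unassigned-device mount.
        import os, time

        def seq_write(path, total_gb=4, block_mb=64):
            block = os.urandom(block_mb * 1024 * 1024)   # incompressible data
            blocks = (total_gb * 1024) // block_mb
            target = os.path.join(path, "bench.tmp")
            start = time.time()
            with open(target, "wb") as f:
                for _ in range(blocks):
                    f.write(block)
                f.flush()
                os.fsync(f.fileno())                     # force data to the device
            elapsed = time.time() - start
            os.remove(target)
            return (total_gb * 1024) / elapsed           # rough MB/s

        for p in ("/mnt/cache", "/mnt/disks/spinner_array"):
            print(p, round(seq_write(p)), "MB/s")

    If the pool clears well above 1 GB/s locally, the bottleneck is more likely SMB or network tuning; if it tops out around 750 MB/s locally as well, I'd look at the pool itself, for example the 760p's sustained write speed once its SLC cache fills, BTRFS RAID 1 overhead, or the PCIe lanes the drives are sitting on.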
  2. Dr. Ew

    More storage SSDs and other drives

    You could also use a separate RAID controller, run RAID 5, 6, or 10 with the SSDs, and use that array as an unassigned drive. On one of my servers, I have a 12-disk array with a 3-SSD cache on an LSI MegaRAID using FastPath and CacheCade, plus an 8TB SSD array with FastPath. Then for the UnRAID cache, I have two 2TB NVMe drives. That's one option. On my new UnRAID server, I'm testing out a hybrid array: a 2TB Samsung SSD as parity, twelve 500GB SSDs, twelve 2TB Seagate Barracudas, and two 1TB NVMe drives for cache. It's certainly limited by parity and doesn't reach 10GbE speed, but it does reach 450 MB/s transfer, which is good enough for its use case.
  3. Dr. Ew

    Supermicro JBOD Chassis

    I’ve got several of those. I have two traditional servers, and two more where I swapped out the backplane and use them as storage enclosures. After a few hiccups with the storage enclosures, I’m all set up and loving it. As your needs grow, you can add another HBA when needed. The Mellanox NICs work just fine, but depending on which other machines on your network need to access the server, you may want to consider a Chelsio T580 instead. I use both. I thought the Mellanox allowed QSFP+ to SFP+ breakout cables, but it does not; it runs either two 40GbE ports or two 10GbE ports. The Chelsio does the same and also allows the breakout cables. I use the Chelsio to connect to my switch, second UnRAID server, workstation, and Hackintosh; the Mellanox is used for a 40GbE direct connection. You may want to grab a Supermicro add-on card with dual NVMe and save a PCIe slot. You can also get a riser card with dual 10GbE and two to four NVMe ports; I’ve seen them go for as low as $200.
  4. Dr. Ew

    unRAID [6.6.6] can't see all PCIe cards

    I’m having a similar problem. UnRAID sees my Chelsio T580 but not my Mellanox ConnectX-3. One reboot got it to see the T580 and one port on the Mellanox, but now it just sees the one again. Not sure why.
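
    To narrow it down, here's a minimal sketch I can run from the console to see whether the ConnectX-3 is even enumerated at the PCI level, independent of the mlx4 driver. The vendor IDs below are assumptions on my part (0x15b3 = Mellanox, 0x1425 = Chelsio).

        #!/usr/bin/env python3
        # List PCI network-class devices from sysfs, and the driver bound to each.
        import glob, os

        VENDORS = {"0x15b3": "Mellanox", "0x1425": "Chelsio"}

        for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
            with open(os.path.join(dev, "class")) as f:
                pci_class = f.read().strip()
            if not pci_class.startswith("0x02"):      # 0x02xxxx = network controller
                continue
            with open(os.path.join(dev, "vendor")) as f:
                vendor = f.read().strip()
            driver_link = os.path.join(dev, "driver")
            driver = os.path.basename(os.path.realpath(driver_link)) \
                if os.path.exists(driver_link) else "no driver bound"
            print(os.path.basename(dev), VENDORS.get(vendor, vendor), "-", driver)

    If the Mellanox doesn't appear in that list at all, it points at the slot or a BIOS setting (above-4G decoding, bifurcation) rather than at UnRAID; if it appears with "no driver bound", it's more likely a driver or firmware issue.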
  5. Dr. Ew

    Massive Hardware Failure - Supermicro

    Turns out it was apparently just the battery. The battery itself was fine; it just needed to be reset. The automatic relearn cycle was going to take a month, so I ran a manual one, and it zipped back into working order. I don’t know why it caused that big of a problem, but it’s fixed now.
  6. Dr. Ew

    Massive Hardware Failure - Supermicro

    I should also mention some of the error messages received:
    - BBU failure
    - PHY error on some disks, then fine; error on all disks, then fine
    - Diagnostic system error - backplane
    - Backplane power error (even with a new PSU, and other PSUs)
    - Several other strange errors I still need to document

    One very peculiar final thing worth mentioning: the EL1 backplane has two MiniSAS ports, one to the server and one for daisy-chaining. I plugged a 6-bay SSD enclosure into the daisy-chain port, installed three drives, and they all show up fine. So that confuses me even further.
  7. After months of design and testing, then implementation and testing, I finally finished my AI, deep learning, & AR lab. It's a complex system whose full scope goes beyond what's needed to explain my current situation and hardware failure. One UnRAID server's storage system is failing (or has failed). It's a Supermicro 8028 (server nAR4) with two connected 12-bay enclosures (two 6027TTs that have been converted into storage-only enclosures: nARstore4a and nARstore4b). The server itself contains eight 1TB SSDs. 4a contains the unRAID data disks, and 4b contains a RAID10 array that serves as a super-fast unassigned drive. The 6027TTs came with the 8027hd backplane; I removed those backplanes and replaced them with SAS2-EL1 backplanes. For a month, everything worked properly, just as it should.

    There is nothing wrong with the 8028 server or its storage, but both storage enclosures are failing or have failed (I'm trying to determine which). Last week the unRAID system spun up a few times with missing drives, and then was normal for a few days. This happened with both enclosures. A few times, as UnRAID was loading, I saw script output scrolling by very fast, and all I could make out was 'error'; it was too fast to read. Yesterday, both backplanes went fully offline. It's very strange, and I have taken nearly everything out of the equation and still can't figure out what's up. It's very strange that both enclosures would fail at the same time. Here is what it looks like:

    Server nAR4
    -> nARstore4a -> EL1 backplane -> IronWolf HDDs -> LSI 9286 (12-drive RAID 10)
    -> nARstore4b -> EL1 backplane -> IronWolf HDDs (unRAID data) -> Areca HBA

    After going completely offline, it came back missing several drives. I fiddled around with it, took the trays out, put them back in; the array came back, then went away again. Occasionally a few hard drives will spin up, sometimes they all spin up, but ultimately both enclosures end up offline. I tested on a separate server; I tried a different HBA, different controller, different cables, different PSU, and verified the drives are okay. I've tried every failure point, even changing out the backplanes for new ones. Something weird is going on, and nothing makes sense. I tried an HP HBA and another LSI RAID card. The BBU on both controllers shows failed, even though all its metrics look fine.

    My conclusion is that these are the only logical explanations:
    1) Some unknown cause killed both HBAs and both controllers, along with their batteries. Very improbable, but I have a new Supermicro HBA on the way to check this.
    2) A BIOS setting replicated on both server nAR4 and the other servers I tested with to make sure it wasn't the server itself. Doubtful, but I'll try resetting the BIOS a second time.

    Those are my only two possible explanations at this point. Everything else has been tested. How it went from working fine to this type of failure is insane. Any suggestions?
  8. Dr. Ew

    NIC Priority

    I haven't seen any other threads on this topic, nor could I find much in my search. I have finally built two high-performance UnRAID rigs and finally put the network together. Now, my final piece is to actually use the performance I have theoretically achieved. My overarching questions:

    1) How do I assign priority to my 10GbE or 40GbE NIC? I have two NICs on the motherboard and two NICs installed (Mellanox ConnectX-4 and Chelsio T580). The Chelsio is directly attached to two devices, and both ports of the Mellanox are connected to a switch. Since the switch is also how all networked devices reach the gateway, the onboard NICs are on this switch as well. Every transfer I try goes over eth0, and I cannot figure out how to get the shares/transfers to go through either the 10GbE or 40GbE connection. Would it be better to put the file-transfer network on a separate subnet from the subnet the 1GbE NICs are on? Or would it be better to bond the two 1GbE and two 10GbE routes? (That leads to the next question.)

    2) Of all the aggregation/bonding protocols, which one provides an increase in both bandwidth and throughput? I've re-read the descriptions of all the protocols and conceptually understand what each does, but there is very little documentation I could easily find (via Google) that discusses both bandwidth and throughput in relation to the protocols. The protocols usually just mention how bandwidth is increased, if at all; they don't say whether throughput will scale with bandwidth. So, if I use balance-rr to bond two 40GbE NICs and have 80 gigabits per second of bandwidth, does throughput scale in the same manner? Ideally I'd like to achieve 3000-4000 megabytes per second of throughput for at least two connected devices (most likely using NVMe-oF).

    3) In addition to the above, for the Chelsio, which provides a DAC connection to two separate devices: how do I send the transfer through the 40GbE route rather than the default route over eth0? I assume the only option here is to assign static addresses on a separate subnet, e.g. with a 255.255.0.0 netmask.

    Thanks!
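
    One more note: to verify which interface a transfer actually takes, here's a rough iperf-style sketch I can run between two machines. It binds the sending socket to a chosen source address, which forces the traffic out that NIC as long as the destination is reachable on the same subnet. All addresses below are hypothetical (10.40.0.1 = the server's 40GbE port, 10.40.0.2 = the client on that subnet).

        #!/usr/bin/env python3
        # Crude throughput test with a pinned source address.
        import socket, sys, time

        PORT = 5201
        BLOCK = b"\0" * (1 << 20)          # 1 MiB per send
        SECONDS = 10

        def server(listen_addr):
            with socket.socket() as s:
                s.bind((listen_addr, PORT))
                s.listen(1)
                conn, peer = s.accept()
                total, start = 0, time.time()
                while True:
                    data = conn.recv(1 << 20)
                    if not data:
                        break
                    total += len(data)
                print(f"received {total / (time.time() - start) / 1e6:.0f} MB/s from {peer}")

        def client(src_addr, dst_addr):
            with socket.socket() as s:
                s.bind((src_addr, 0))      # pin the source to the 40GbE interface
                s.connect((dst_addr, PORT))
                end, sent = time.time() + SECONDS, 0
                while time.time() < end:
                    s.sendall(BLOCK)
                    sent += len(BLOCK)
                print(f"sent {sent / SECONDS / 1e6:.0f} MB/s via {src_addr}")

        if __name__ == "__main__":
            if sys.argv[1] == "server":
                server(sys.argv[2])                 # e.g. server 10.40.0.1
            else:
                client(sys.argv[2], sys.argv[3])    # e.g. client 10.40.0.2 10.40.0.1

    Keeping the fast NICs on their own subnet, as I suggested above, seems like the simplest way to make the routing deterministic, since traffic for that subnet then has only one way out. As far as I understand, balance-rr is the only bonding mode that can stripe a single connection across links, but packet reordering usually eats into the gain, so I wouldn't count on one transfer doubling just because the bond shows 80 Gb/s.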
  9. Dr. Ew

    Super High Performance Server

    Correct, the parity drive is still initializing. Will R/W suffer decreased performance once the parity drive kicks in? The IronWolf Pros hit 200 MB/s+ sequential. My initial thought was along those lines, but I don’t see how the math works there. I figured UnRAID would see them as single volumes, but with one volume of ~1.5TB and one of ~8TB, I’m not sure what is mirroring. The available cache space is a strange number.

    The purpose of using UnRAID is multi-fold. Storage utilization is my most critical objective. The six IronWolf drives currently dedicated to UnRAID are there as an integral failsafe for my critical, non-replaceable/non-replicable data. I expect UnRAID to be the holding area for my cold, archival, and critical data. The UnRAID data will be replicated on one of the QNAPs I decided to keep. There will also be a second array (still to be determined) that will hold a backup of the critical data. This data currently resides on a DAS array on the QNAP, is transferring to UnRAID now, and will perhaps occupy another UnRAID server (the still-TBD part).

    The VM support and Dockers will be utilized; that’s also one of the deciding factors which sold me on UnRAID compared to the other options. The VMs will be spun up to handle A/R content, research, AI, and deep learning. So that will be coming, once high performance is established. The three RAID arrays exist purely for the needed performance, with protection to keep the data flowing in the event of failure. I don’t need any platform to manage these arrays; the controller management is all that’s needed, given their use.

    I have explored. I’ve read through the documentation and FAQs a few times now, but I’ve also consumed a lot of information on the comparable platforms. With so much information consumption, I have undoubtedly crossed a few concepts into confusion; I’ve read conflicting information, and some information still remains unclear after my initial and today’s re-assessment. After spinning up active installs of the aforementioned platforms, the clear contenders are UnRAID and Windows Server 2019. I would much prefer UnRAID, if I can clear up the high-performance needs. Once I decide on the platform, I’m going to purchase licenses for my employees, all of whom have similar, but not totally congruent, use cases.

    My remaining lack of clarity is around using cache. The attached cache array is just for testing purposes. Ideally, I would like at least 16TB of cache for hot storage: 8TB of NVMe and 8TB of SSD in the cache pool, in addition to the warm storage from the 24-drive array. I want to be able to offload the cache to the large RAID array and back with very high transfer rates and very little waiting. The 24-drive array, directly attached to a system, clears 3 GB/s sequential. I need this speed, or close to it, over the 40GbE network, and I eventually need to expand to 100GbE to cover even higher speeds with NVMe-oF.
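
    For reference, a back-of-the-envelope check on the wire speeds (my own rough figures, assuming about 8% protocol overhead, not anything measured):

        # Line rate in Gbit/s divided by 8 gives GB/s; knock off ~8% for
        # Ethernet/IP/TCP overhead to get a usable ceiling.
        for gbit in (10, 25, 40, 100):
            raw = gbit / 8.0
            usable = raw * 0.92
            print(f"{gbit} GbE: {raw:.2f} GB/s raw, ~{usable:.2f} GB/s usable")

    By that math, 3 GB/s sequential fits within a single 40GbE link on paper, while the higher NVMe-oF targets are where 100GbE starts to matter.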
  10. After trying out several boxed NAS solutions, I decided none were as powerful or as high-performing as I need. The only solution I could see was to turn one of my workstations into an open-source NAS. So, over the last several days I’ve tested FreeNAS, Xpenology, OMV, and now UnRAID. After all the testing, I’ve decided, thus far, that I like UnRAID’s UI and initial ease of use the most. I’d like to make this a permanent solution for myself and several of my employees. I don’t know if I’m misunderstanding something or just looking at old information, but past the main data volumes I’m a bit confused.

    A) I’m looking at SSD cache first. I’ve read that UnRAID doesn’t support multiple cache pools, yet it appears I have been able to set up multiple pools, and I’m not sure how it’s working. UnRAID let me assign two tri-mode RAID volumes as cache: volume 1 is eight 256GB NVMe drives in RAID 5, and volume 2 is eight 2TB Samsung EVO SSDs in RAID 10. It is only showing 3TB of the 8TB SSD RAID volume, and I’m not sure how much of the NVMe volume it’s showing. When I transfer to a share assigned to the cache pool, it shows equal in/out on both volumes, but all the data lands on the SSD volume. I don’t understand how this works.

    B) I have six 14TB IronWolf Pros attached to the motherboard, which is my main UnRAID data storage. I’m fine with this being my slow storage volume; it reads/writes over 210 MB/s, and that’s fine for main storage. But I also have a 24-drive RAID 6 array comprised of 8TB Seagate EXOS SAS drives. Originally UnRAID saw this volume, but it doesn’t show up anymore, and I’m not sure how to set up access to it. The volume is currently empty, but I want to store my warm data on it. The idea is:

    - Cold data on the UnRAID data volume
    - Warm data on the external RAID
    - Hot data on the NVMe-oF and SSD array

    How do I achieve this?

    System specs:
    - Intel Core i9 7280x
    - ASUS X299 motherboard
    - 64GB RAM
    - 6 internal IronWolf drives
    - Intel tri-mode RAID
    - 2x Mellanox ConnectX-4 or Chelsio T580 NICs
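
    To illustrate the hot/warm split I'm after while the 24-drive RAID stays an unassigned device, here's a minimal sketch of an age-based mover script. The mount points are assumptions (/mnt/cache for the hot pool, /mnt/disks/warm_raid for the RAID 6 mounted as an unassigned device); UnRAID's own mover would still handle the cache-to-array move for the cold tier, and this could run nightly from cron or the User Scripts plugin.

        #!/usr/bin/env python3
        # Age-based "mover" for the warm tier: anything on the hot pool untouched
        # for N days gets relocated to the external RAID volume.
        import os, shutil, time

        HOT = "/mnt/cache/projects"              # hypothetical hot-tier path
        WARM = "/mnt/disks/warm_raid/projects"   # hypothetical unassigned RAID 6 path
        MAX_AGE_DAYS = 14

        cutoff = time.time() - MAX_AGE_DAYS * 86400

        for root, dirs, files in os.walk(HOT):
            for name in files:
                src = os.path.join(root, name)
                if os.path.getmtime(src) > cutoff:   # still hot, leave it alone
                    continue
                dst = os.path.join(WARM, os.path.relpath(src, HOT))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)                # copies across filesystems, then deletes
                print("moved", src, "->", dst)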