Chia farming, plotting; array and unassigned devices


Shunz

Recommended Posts

ok. added *another* script to my collection (monitor). :D

folders:

  • temp\# ... individual drives where a single plot is processed.
  • harvest... where each plot dumps its final file locally, before it's moved to the farm computer

scripts:

  • wrapper... starts all the other scripts (monitor, mover, initchia). one and done.
  • monitor... checks the harvest folder every 60 seconds; if a TMP file appears it suspends the mover (.\pssuspend.exe robocopy.exe) to keep from thrashing the disk (which MASSIVELY impacts both final file creation and network transfers), and resumes the mover otherwise (see the sketch after this list).
  • mover... checks the harvest folder (every 300 seconds) for the presence of TMP or PLOT files, and starts moving them (robocopy) when only PLOT files exist.
  • initchia... kicks off (every 1000 seconds) the individual plotter scripts (plotschia) for each of the temp\# drives. one and done.
  • plotschia... keeps a plot going on each temp\# drive, and processes the log file generated.
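
The suspend/resume logic in monitor, sketched below for illustration only. The real scripts are Windows-side (pssuspend.exe against robocopy.exe), so the Linux signals, the rsync stand-in for the mover, and the harvest path here are all assumptions:

#!/bin/bash
# Poll the harvest folder; pause the mover while any TMP file exists so
# the final-file write doesn't fight the network copy for disk I/O.
HARVEST=/mnt/harvest                 # hypothetical path

while true; do
    if ls "$HARVEST"/*.tmp >/dev/null 2>&1; then
        pkill -STOP -f rsync         # suspend the mover (pssuspend equivalent)
    else
        pkill -CONT -f rsync         # resume once only PLOT files remain
    fi
    sleep 60
done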

 

 

 

Link to comment

Are you planning on just solo mining for now or going to give hpool a try?

 

I just got plotman set up with some baseline settings to try out and kicked off a run.

 

I have each disk set up individually (except for those under 300GB, which I raided together). It is set to run either 3 plots per disk or the max the disk can hold space-wise, whichever is smaller.

 

It will wait until phase 2 before starting a second plot on each disk; I might need to tweak this setting once I see the outcome. Netdata is a miracle for this: I can go back and see every conceivable metric for the system across the entire plotting process, making it easy to see where the bottlenecks are.
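
For reference, the scheduling knobs in plotman.yaml that control this look roughly like the excerpt below. The key names are from memory and the values just mirror the settings described above, so treat it as an approximation rather than my exact config:

scheduling:
  tmpdir_stagger_phase_major: 2   # wait for phase 2...
  tmpdir_stagger_phase_minor: 1   # ...before starting the next plot on a tmpdir
  tmpdir_max_jobs: 3              # cap of 3 plots per disk
  global_max_jobs: 32             # overall cap across all disks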

 

Kinda sad the 300GB 10k drives are too small for more plots; plenty of wasted bandwidth there.

 

I did manage to get some 600GB 15k drives for $13 each, so I grabbed 3 of them for my inside server; figure I can use them as a cache if Chia doesn't work out, as I've always hated burning SSDs on caching mundane items. Although in retrospect I should have ordered 4, as that would fill the slots I can't really use for 3.5" drives anyway.

Link to comment

Kinda fun watching all the plots get started; 1 plot per drive, all in phase 1 now. Have to wait a while for phase 2 before the next batch starts:

 

[screenshots: plot status overview, all drives in phase 1]

 

Here I can watch each disk's space usage over time to see what the peak usage is and dial in the most plots per disk without over-filling:

[screenshot: per-disk space usage over time]

 

 

Can do the same for memory:

[screenshot: memory usage]

 

CPU is just pegged lol:

 

[screenshot: CPU usage, fully pegged]

Link to comment

You see how far Chia has slid?

Actually, I think if it went down by half from here it would be good; it would cause a bunch of people to bail, I bet.

$250 Chia might cause a nice-sized contraction of netspace.

 

Also, I'm at 194 plots now.

Edited by sota
Link to comment

Yeah, I noticed the price has been on the downturn; as more people win and sell off, I kinda expect that to continue TBH, which is why I think people spending thousands on hardware for this are crazy lol. Heck, I feel bad about the ~$70 or so I have spent on hardware basically just for Chia, but I should be able to get that back, or worst case it's more spare hardware for the closet.

 

Plotman is working beautifully. I think I will need to dial in the settings, but boy does it make things easier and more efficient. With proper tweaking it looks like I can keep 2x plots under 400GB, and possibly 3x plots under 600GB, so I could raid 2x 300GB drives together to get an extra plot.

 

It is also super nice to hit go and it automatically starts new plots at the optimal timing.

 

[screenshot: Machinaris plotting dashboard, 2021-06-09]

 

I am CPU-limited 100%; drives are NOT the limiting factor for me lol. I need to move some to my other system to spread the load out, although I will have to start using these drives for farming before long as I fill them up.

 

With a basic phase 2 stagger setup, memory usage has peaked so far around 140GB, with 2x plots just under 400GB on a drive. A few drives have a 3rd plot starting now, so going to see what the space gets up to on those.

 

Looks like I am at the 32-plot limit I set up right now, so it won't start any more until some finish.

 

Over-provisioned in a massive way, but at least I know I am getting all the performance it has to give. Makes it easier on the drives since they have time to catch up, and I've still got plenty of RAM available.

Link to comment

Caught plotting error: bad allocation

 

Seems to be the bane of my existence now. Getting that error at random times on 2 of the 3 plotter machines. The bad part is it leaves lint (orphaned temp files) that can clog up the disk, preventing a plot from completing. Just added code to the script to flush the disk after every plot run, successful or not (sketch below).
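
The flush amounts to something like this after each run. The actual scripts are Windows batch, so the bash form, the run_one_plot placeholder, and the temp path are illustrative assumptions:

run_one_plot                # hypothetical: kick off a single plot job and wait
rm -f /mnt/temp1/*.tmp      # flush leftover temp files, successful run or not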

Link to comment

It has been 48 hours since I started plotman on all the drives.

 

It is copying the last plots now to the farm drive, and it looks like in 2 days I managed to basically fill up a 6TB drive. A few more hours and it will be completely full.

 

Now that includes the spin-up time for the parallel plots (it takes 6-8 hours for them all to get going full speed), and I found some optimizations I could do as well.

 

Basically this one system, using nothing but old HDDs, could easily plot over 3TB a day. I am not complaining about that at all; at that rate it will only take me 2 weeks to fill up my old drives.

 

Now, once these last few plots finish copying, I plan to give this new plotter a try and see what results I get from ramdisk plotting.

 

With a very similar setup, the maker of that multi-threaded plotter said he was doing a plot every 45 mins, which would work out to basically the same ~32 plots a day and a lot less wear on the drives.

 

He also said he plans to get GPU plotting working next which will be even faster.

Edited by TexasUnraid
Link to comment
13 hours ago, TexasUnraid said:

Ok, this looks REALLY interesting for guys like you and me with lots of RAM and cores at our disposal:

 

 

 

Apparently a Docker build is on the way, with only a ~5% performance drop compared to bare metal.

Sounds like a good deal to me.

 

K32 will be dead far sooner than the devs anticipated, and the final nail in the coffin will certainly be the coming GPU acceleration from the madmax guys.

If you're replotting for pools, I'd think twice about K32, just my 2c.

 

Link to comment
6 hours ago, tjb_altf4 said:

Apparently a Docker build is on the way, with only a ~5% performance drop compared to bare metal.

Sounds like a good deal to me.

 

K32 will be dead far sooner than the devs anticipated, and the final nail in the coffin will certainly be the coming GPU acceleration from the madmax guys.

If you're replotting for pools, I'd think twice about K32, just my 2c.

 

Yep, I was thinking the same thing. If it is already down to 24 mins in the first days after this release, I can see it getting down to ~5-10 mins pretty easily with GPU plotting. At that point we will become drive-speed limited again, and someone will use a massive ram drive and do everything in memory.

 

Heck, my mobo can technically support 384GB of cheap DDR3 RDIMMs; I could plot the whole thing in RAM if I wanted, and it is something I am actually considering.

 

I just cleared all my old plots off the system and am going to give this new plotter a try today. If it could be integrated with plotman and then this docker, it would be magical. I am not sure I could give up this nice GUI lol.

 

Of course, I will be moving to hpool soon anyway, and I suppose this docker is not compatible with that anyway.

Link to comment

Just gave madmax a go... wow.

(note: this seems legit, with few vectors for issues and lots of eyeballs on the code, but please DYOR)

 

So on the original Chia plotter, with an overzealous 12C/6G allocation, it still took 9.5 hrs to plot.

Throwing 24C at the madmax plotter got this down to 1 hr 6 min!

 

Lots of interesting optimizations to experiment with around lower core count and parallel jobs!

Unfortunately I've run out of storage so more plotting will have to wait a little while.

 

Link to comment
4 minutes ago, tjb_altf4 said:

Note I only have 64GB of RAM on my server, so this was a "normal" plot on NVMe drives

 

That is really interesting. I somehow killed my install, so I'm having to reinstall this morning; I plan on giving it a go later myself with a ramdrive for the 2nd temp folder and a 10k HDD for the 1st temp, and tweak from there.

 

Considering grabbing some more RAM and running the whole thing in RAM to completely eliminate the drive wear and maximize speed.

 

With DDR3 prices, I can get enough RAM for the price of some good NVMe drives and never worry about killing it, plus faster speed. Plus, Unraid really seems to like more RAM; it keeps a ton of stuff cached, greatly reducing the need to spin up drives.

Edited by TexasUnraid
Link to comment

Could not get the Docker version of madmax to use more than 50% of the cores (1 CPU), so I installed the full program on the clean Ubuntu install.

 

I set up a 180GB ramdisk to see exactly how much room it uses and started it up. It is indeed using 100% of the CPU, and the HDD accesses so far also appear to be sequential, so raiding the drives together should get some impressive bandwidth there as well.
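
Creating a ramdisk like that is a one-liner with tmpfs; the mount point below is inferred from the log's "Working Directory 2" path:

sudo mkdir -p /media/chia/ramdisk
sudo mount -t tmpfs -o size=180G tmpfs /media/chia/ramdisk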

 

First test finished in 72 mins and was disk-bottlenecked most of the time. Raiding some disks together should remove that bottleneck.

 

Max memory usage was 126GB of total system RAM, and max temp1 usage was 180GB.

 

I will try it again later with a raid setup to see what the numbers are.

Edited by TexasUnraid
Link to comment
Number of Threads: 32
Number of Buckets: 2^7 (128)
Pool Public Key:  
Farmer Public Key:
Working Directory:   /media/chia/300gb-1/
Working Directory 2: /media/chia/ramdisk/
Plot Name:
[P1] Table 1 took 14.5046 sec
[P1] Table 2 took 133.808 sec, found 4295060384 matches
[P1] Table 3 took 157.282 sec, found 4295086275 matches
[P1] Table 4 took 192.85 sec, found 4295149514 matches
[P1] Table 5 took 190.693 sec, found 4295061899 matches
[P1] Table 6 took 182.84 sec, found 4294980729 matches
[P1] Table 7 took 141.748 sec, found 4295035658 matches
Phase 1 took 1013.75 sec
[P2] max_table_size = 4295149514
[P2] Table 7 scan took 10.2265 sec
[P2] Table 7 rewrite took 44.4503 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 33.606 sec
[P2] Table 6 rewrite took 50.9536 sec, dropped 581245485 entries (13.5331 %)
[P2] Table 5 scan took 32.5058 sec
[P2] Table 5 rewrite took 227.18 sec, dropped 762067859 entries (17.7429 %)
[P2] Table 4 scan took 32.0901 sec
[P2] Table 4 rewrite took 208.52 sec, dropped 828989235 entries (19.3006 %)
[P2] Table 3 scan took 31.3222 sec
[P2] Table 3 rewrite took 205.521 sec, dropped 855131107 entries (19.9095 %)
[P2] Table 2 scan took 346.265 sec
[P2] Table 2 rewrite took 52.2521 sec, dropped 865637822 entries (20.1543 %)
Phase 2 took 1299.26 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 313.3 sec, wrote 3429422562 right entries
[P3-2] Table 2 took 124.027 sec, wrote 3429422562 left entries, 3429422562 final
[P3-1] Table 3 took 62.3612 sec, wrote 3439955168 right entries
[P3-2] Table 3 took 80.7924 sec, wrote 3439955168 left entries, 3439955168 final
[P3-1] Table 4 took 140.171 sec, wrote 3466160279 right entries
[P3-2] Table 4 took 93.836 sec, wrote 3466160279 left entries, 3466160279 final
[P3-1] Table 5 took 286.112 sec, wrote 3532994040 right entries
[P3-2] Table 5 took 135.383 sec, wrote 3532994040 left entries, 3532994040 final
[P3-1] Table 6 took 273.757 sec, wrote 3713735244 right entries
[P3-2] Table 6 took 71.7316 sec, wrote 3713735244 left entries, 3713735244 final
[P3-1] Table 7 took 70.6805 sec, wrote 4295035658 right entries
[P3-2] Table 7 took 126.446 sec, wrote 4294967296 left entries, 4294967296 final
Phase 3 took 1786.17 sec, wrote 21877234589 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 228.939 sec, final plot size is 108835866333 bytes
Total plot creation time was 4328.22 sec

 

Dropping this here more for my own reference than anything. This is madmax using a ramdisk for temp2 and a single 10k 300GB drive for temp1.
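
The invocation behind that log would have looked roughly like this (32 threads and 128 buckets per the log header; the keys are elided as in the log, and the -d destination isn't shown above, so that path is a placeholder):

./chia_plot -r 32 -u 128 \
    -t /media/chia/300gb-1/ \
    -2 /media/chia/ramdisk/ \
    -d /path/to/farm/ \
    -p <pool_public_key> -f <farmer_public_key>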

 

Just started another round, except I made a 10x 300GB 10k-drive RAID0 array for temp1 this time around (bottlenecked to 1GB/s due to the SAS1 backplane). Gonna see how it keeps up, as I was HDD-bottlenecked for a significant part of the time last time.
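
One typical way to build a scratch RAID0 like that with mdadm; the device names, filesystem choice, and mount point are assumptions, not necessarily what I used:

sudo mdadm --create /dev/md0 --level=0 --raid-devices=10 /dev/sd[b-k]
sudo mkfs.xfs /dev/md0
sudo mkdir -p /media/chia/raid0
sudo mount /dev/md0 /media/chia/raid0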

Link to comment

Finished the second test with the RAID; dropped the time to 40 mins, but I was doing some other stuff that might have affected it some. Gonna try setting up a swap file and plotting "entirely" in memory, just letting the swap file hold the overflow.

 

It has been reported that you can plot entirely in memory with a ~256GB ramdrive. The mobo can support up to 384GB of cheap RAM before you have to get the high-capacity LRDIMMs.

 

Even without that, though, I should be able to do ~36 plots a day like this if they could be staggered to start the next one while the last is copying to storage. Better than parallel plotting on individual drives, and a lot less wear and tear / fewer drives needed.

Edited by TexasUnraid
Link to comment

After some trial and error, I figured out how to get a ramdrive and swap file to work together. For the first test I just used a 10k SAS drive as the swap file.

 

The server has 224GB of RAM; anything over that gets pushed to the swap drive.
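
The setup amounts to a tmpfs sized past physical RAM plus a disk-backed swap file to absorb the spill (tmpfs pages are swappable, so the overflow lands on the swap device). The paths and sizes below are assumptions:

# swap file on the 10k SAS drive
sudo fallocate -l 100G /mnt/sas10k/swapfile
sudo chmod 600 /mnt/sas10k/swapfile
sudo mkswap /mnt/sas10k/swapfile
sudo swapon /mnt/sas10k/swapfile
# ramdisk bigger than physical RAM; excess pages go to swap
sudo mount -t tmpfs -o size=260G tmpfs /media/chia/ramdisk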

 

The results are... interesting.

 

Naturally, the single drive is slow to write the swap data out, but that only causes minor slowdowns at a few points. The real issue comes when it has to read back from the drive; that slows things down a lot and takes much longer (lots of random access at around 30MB/s).

 

The good news is it looks like it only writes around ~50GB to the swap drive with this much RAM; I can tolerate that on an SSD.

 

I think I will give the same run a try with an enterprise SSD as the swap file and see if it fares any better, although I kinda doubt it; it looks like madmax did a good job optimizing the temp1 vs temp2 split.

 

Does leave me thinking I should just grab some more memory, as doing it all in memory is within reach and would be the fastest and simplest option of all.

 

You will note below how some stages went much faster and others took much longer: the stages that did not have to read from the HDD went faster, but the ones that did went MUCH slower.

 

Net result was 62 mins, slower than using the RAID HDDs for temp1.

 

Working Directory:   /media/chia/ramdisk/
Working Directory 2: /media/chia/ramdisk/
Plot Name: 
[P1] Table 1 took 15.2192 sec
[P1] Table 2 took 130.282 sec, found 4294930133 matches
[P1] Table 3 took 150.935 sec, found 4294852263 matches
[P1] Table 4 took 183.736 sec, found 4294796489 matches
[P1] Table 5 took 180.497 sec, found 4294754453 matches
[P1] Table 6 took 177.737 sec, found 4294532726 matches
[P1] Table 7 took 136.991 sec, found 4294102621 matches
Phase 1 took 975.415 sec
[P2] max_table_size = 4294967296
[P2] Table 7 scan took 9.36441 sec
[P2] Table 7 rewrite took 100.805 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 32.2058 sec
[P2] Table 6 rewrite took 48.0625 sec, dropped 581314467 entries (13.5362 %)
[P2] Table 5 scan took 30.2798 sec
[P2] Table 5 rewrite took 70.9984 sec, dropped 762006945 entries (17.7427 %)
[P2] Table 4 scan took 30.2274 sec
[P2] Table 4 rewrite took 111.403 sec, dropped 828856065 entries (19.2991 %)
[P2] Table 3 scan took 31.7391 sec
[P2] Table 3 rewrite took 101.573 sec, dropped 855088805 entries (19.9096 %)
[P2] Table 2 scan took 336.482 sec
[P2] Table 2 rewrite took 173.266 sec, dropped 865601272 entries (20.154 %)
Phase 2 took 1099.08 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 439.708 sec, wrote 3429328861 right entries
[P3-2] Table 2 took 33.2532 sec, wrote 3429328861 left entries, 3429328861 final
[P3-1] Table 3 took 61.3089 sec, wrote 3439763458 right entries
[P3-2] Table 3 took 34.8653 sec, wrote 3439763458 left entries, 3439763458 final
[P3-1] Table 4 took 62.4652 sec, wrote 3465940424 right entries
[P3-2] Table 4 took 37.4194 sec, wrote 3465940424 left entries, 3465940424 final
[P3-1] Table 5 took 61.2894 sec, wrote 3532747508 right entries
[P3-2] Table 5 took 36.3129 sec, wrote 3532747508 left entries, 3532747508 final
[P3-1] Table 6 took 65.1089 sec, wrote 3713218259 right entries
[P3-2] Table 6 took 37.4954 sec, wrote 3713218259 left entries, 3713218259 final
[P3-1] Table 7 took 657.524 sec, wrote 4294102621 right entries
[P3-2] Table 7 took 43.6903 sec, wrote 4294102621 left entries, 4294102621 final
Phase 3 took 1576.8 sec, wrote 21875101131 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 80.2494 sec, final plot size is 108823343581 bytes
Total plot creation time was 3731.63 sec

 

Edited by TexasUnraid
Link to comment

Ok, finished the RAM/swap test with the SSD.

 

Still bottlenecked at a few points by the SSD speed, but it only lasted a few seconds each time. Also confirmed that with 224GB of RAM, it only wrote 53GB to the SSD.

 

Total time was 39 mins, basically the same as the 10x 10k drives with the SAS1 bottleneck. Gonna re-arrange some things and move those drives out of the JBOD so they have full bandwidth, and see what the performance is next.

 

Overall, I think that if I did it 100% in RAM I would cut a few more minutes off the time.

 

Working Directory:   /media/chia/ramdisk/
Working Directory 2: /media/chia/ramdisk/
Plot Name: plot-k32-2021-06-11-16-30-34e8fe3838d4592d253464114514f4824088d34305ffb97a4305850cd6fef807
[P1] Table 1 took 15.3652 sec
[P1] Table 2 took 130.496 sec, found 4294912808 matches
[P1] Table 3 took 149.661 sec, found 4294990999 matches
[P1] Table 4 took 185 sec, found 4294910364 matches
[P1] Table 5 took 180.235 sec, found 4294770273 matches
[P1] Table 6 took 176.25 sec, found 4294453532 matches
[P1] Table 7 took 133.391 sec, found 4293991461 matches
Phase 1 took 970.415 sec
[P2] max_table_size = 4294990999
[P2] Table 7 scan took 9.47834 sec
[P2] Table 7 rewrite took 54.3766 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 31.9529 sec
[P2] Table 6 rewrite took 50.5049 sec, dropped 581346092 entries (13.5371 %)
[P2] Table 5 scan took 30.9766 sec
[P2] Table 5 rewrite took 56.566 sec, dropped 762041241 entries (17.7435 %)
[P2] Table 4 scan took 29.5163 sec
[P2] Table 4 rewrite took 68.378 sec, dropped 828891100 entries (19.2994 %)
[P2] Table 3 scan took 29.5927 sec
[P2] Table 3 rewrite took 67.0983 sec, dropped 855094826 entries (19.9091 %)
[P2] Table 2 scan took 67.6979 sec
[P2] Table 2 rewrite took 98.9228 sec, dropped 865543140 entries (20.1528 %)
Phase 2 took 617.233 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 86.8226 sec, wrote 3429369668 right entries
[P3-2] Table 2 took 32.75 sec, wrote 3429369668 left entries, 3429369668 final
[P3-1] Table 3 took 59.3424 sec, wrote 3439896173 right entries
[P3-2] Table 3 took 37.1951 sec, wrote 3439896173 left entries, 3439896173 final
[P3-1] Table 4 took 59.2572 sec, wrote 3466019264 right entries
[P3-2] Table 4 took 35.8962 sec, wrote 3466019264 left entries, 3466019264 final
[P3-1] Table 5 took 61.5009 sec, wrote 3532729032 right entries
[P3-2] Table 5 took 35.9014 sec, wrote 3532729032 left entries, 3532729032 final
[P3-1] Table 6 took 59.1766 sec, wrote 3713107440 right entries
[P3-2] Table 6 took 37.7628 sec, wrote 3713107440 left entries, 3713107440 final
[P3-1] Table 7 took 134.387 sec, wrote 4293991461 right entries
[P3-2] Table 7 took 43.4205 sec, wrote 4293991461 left entries, 4293991461 final
Phase 3 took 690.245 sec, wrote 21875113038 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 79.61 sec, final plot size is 108822915035 bytes
Total plot creation time was 2357.6 sec

 

Link to comment
