Squid Posted May 2, 2020
3 hours ago, guitarlp said: Is this container still up to date and okay for use? It failed to install a couple of times, but now that it has installed, unRAID shows the version as "not available".
Both symptoms suggest a problem connecting to the internet. As a first step, try setting appropriate static DNS addresses in your network settings (208.67.222.222 & 208.67.220.220).
andrew207 (author) Posted May 2, 2020
Yes, the container is still actively maintained; it was updated to the latest Splunk and Alpine Linux only days ago. Thanks for your interest, and I hope you get your internet fixed up so you can use it!
tkohhh Posted May 5, 2020
I came across Splunk while looking at the limitations of long-term logging on pfSense. Any chance you can explain exactly what Splunk is? Is it a database that stores logs? Presumably I can have logs from many different services all stored (and searchable/reportable) in Splunk? Also, is it just this one Docker container that's needed, or are other containers required to make this work?
andrew207 (author) Posted May 5, 2020
Hey @tkohhh, yeah, in super simple terms Splunk is like a syslog server with built-in search/report/alert functionality. You only need this container, no dependencies. Check the readme for volume definitions, and make sure you set them up so your data and config persist.

It gives you an interface for searching, which is particularly useful when you want to correlate things. For example, monitoring a connection through your pfSense into your reverse proxy into your web server is bread-and-butter for Splunk.

For long-term logging Splunk is great. It holds the full raw log events; under the hood it just gzips them and runs a heap of acceleration over the data to make it miles faster than grep.

You can store literally any machine data in Splunk. I have mine configured to predict my solar panels' output (and therefore my solar battery capacity) based on weather forecasts, and it can alert me (infinite alert vectors; I use Telegram) if it thinks my batteries might run out if I keep using electricity as normal.

It also has a heap of enterprise features like reporting/alerting/dashboarding/blah blah, but for UnRAID use cases the above is a decent overview. It has a bit of a learning curve, but the search documentation is pretty good. Happy to PM if you have any specific questions.
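To make the correlation example concrete, a search along these lines would pull the pfSense, proxy, and web server events for a single client into one view (the index names and the src_ip field are hypothetical and depend on how each source is ingested):

```
(index=pfsense OR index=proxy OR index=web) src_ip=192.168.1.50
| sort _time
| table _time index sourcetype _raw
```

From there you can pivot to stats, transactions, or alerts over the combined stream.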
tkohhh Posted May 6, 2020
Thanks so much for the explanation! Can you tell me what's different about your container compared to the official (splunk/splunk) repository on Dockerhub?
andrew207 (author) Posted May 6, 2020
@tkohhh The key reason I made my own was that the official repo requires root to install itself. My container does not, and it is therefore compatible with locked-down environments like OpenShift where security is important.

Additionally, the official repo is bloated for what it provides. It seems to install itself by first installing Ansible inside a Debian/RedHat/CentOS container, then installing Splunk using Ansible. This seemed odd to me, as I'm able to fully download and install Splunk in about 5 lines of bash.

The official container also has no optimisations for small-footprint use, while mine has several tweaks that make a massive difference. For example, the official repo stores Splunk's 20-ish internal log files in /opt/splunk/var/log/splunk/*, allows each to grow to 25 MB before rolling it, and then keeps 5 additional rolled files (e.g. splunkd.log.1, splunkd.log.2, splunkd.log.n), consuming a not-insignificant amount of disk space. My container stores only one of each file, in addition to logging dramatically less noise to them thanks to an optimised internal log config that I wrote.

I have also written config that disables some features that are enabled by default but only used in enterprise environments, such as several internal automated Distributed Service Monitoring dashboards and reports in the Monitoring Console. Splunk's official repo doesn't even support distributed deployments, so it's odd that they don't disable this stuff too. This container has no issues running in distributed mode (and it is in several instances) if that is required, but I suspect the majority of my Docker users run standalone and don't need the resource wastage.
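For a sense of what that rotation tuning looks like: Splunk's internal log rotation is controlled by $SPLUNK_HOME/etc/log.cfg, and the stock settings are roughly as below (the values shown are illustrative of the defaults described above, not copied from either image):

```ini
# Stock-style rotation for splunkd.log: ~25 MB per file, 5 rolled copies kept.
appender.A1 = RollingFileAppender
appender.A1.fileName = ${SPLUNK_HOME}/var/log/splunk/splunkd.log
appender.A1.maxFileSize = 25000000
appender.A1.maxBackupIndex = 5
```

Dropping maxBackupIndex and lowering per-component log levels is the kind of tweak that shrinks the footprint without losing anything useful for a home setup.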
So the TL;DR is that mine has a much smaller storage footprint, requires significantly fewer resources to run, and can run without the root user, i.e. as an arbitrary user.
tkohhh Posted May 6, 2020
Perfect, thank you! I generally try to use the official repository unless there's a compelling reason not to. What you've described seems pretty compelling, and you've been super helpful so far. I'm going to give it a shot and see what I come up with!
andrew207 (author) Posted May 7, 2020
One of the coolest things about Docker is how easy it is to test them both out and make the best decision. Happy to help!
tkohhh Posted May 8, 2020
So... first question in what will likely be many! Is there a way to set the Splunk instance timezone with your container? From what I've read, Splunk assumes that the timestamps in log files are in the same timezone as the Splunk instance. I believe this is making all of my logs appear in GMT, even when I change my user account to my actual time zone. While we're talking about user account settings: my timezone setting is not persistent across a container restart. Any ideas about that? Thanks for your help!
andrew207 (author) Posted May 8, 2020
When you set any user's timezone in the UI, it is saved to a user-prefs.conf file under /opt/splunk/etc/users/. You'll need to put this in a volume if you want timezone changes to persist across restarts. I suggest making /opt/splunk/etc/users/ a volume; mapping the whole /opt/splunk/etc parent directory will cause other issues. Perhaps in the next version I'll add this to the UnRAID template.
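For reference, the persisted setting is just a small stanza. On a typical install it ends up under a per-user path such as /opt/splunk/etc/users/admin/user-prefs/local/user-prefs.conf (the username and timezone below are examples only):

```ini
# Written by Splunk when a user picks a timezone in their preferences UI.
[general]
tz = America/Los_Angeles
```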
tkohhh Posted May 8, 2020
OK, I'll look at that. But what about the instance time zone? All of my logs have timestamps in Pacific time, but Splunk thinks they are GMT. Any ideas there?
tkohhh Posted May 8, 2020
Disregard that question for now... I'm reading some stuff that might set me straight. https://answers.splunk.com/answers/320021/how-do-i-set-timezone-properly-in-propsconf.html
tkohhh Posted May 9, 2020
Alright, I'm still trying to get my time zone issues settled. One thing that seems to be making it difficult is that the Splunk container time zone is set to UTC. All of my other dockers are using the Pacific timezone (which is the timezone used by my Unraid server). Is that perhaps a configuration in your container somewhere?
andrew207 (author) Posted May 9, 2020
TL;DR: seems to be a UI bug, unsure what's causing it. Am looking into it.

Generally there are three things to consider:
1. The timezone of your log source. If the source data has a correct TZ specified in it, you shouldn't need to change anything. Splunk will use this timestamp and create a metadata timestamp in GMT for storage and retrieval.
2. The timezone of your Splunk server. Generally servers are set to UTC, but unRAID's Docker implementation applies an environment variable set to your TZ. This container always runs the server in GMT (per Splunk best practices); all data in Splunk is stored with a GMT timestamp.
3. The timezone of your end user, as set in the UI or defaulted from your web browser settings. All data presented to the end user has the GMT timestamp converted to their local time. THIS SETTING DOES NOT SEEM TO BE HONOURED BY SPLUNK.

If you have logs that are being created and sent in real time (i.e. most syslogs), you can check the difference between your server time and user time by running this search:

index=myindex | eval indextime=strftime(_indextime,"%Y-%m-%d %H:%M:%S") | table _time indextime

_time should be the local time of the event, and indextime should be the GMT time of the event. It looks like both appear the same for me despite setting a timezone in the UI, so there must be a bug somewhere. For now you can just pretend you're in GMT, or hard-code a conversion into any searches you're running -- e.g. there's a 28800-second difference between UTC and PST, so you can run | eval _time=_time-28800. I'll look into this TZ issue. Here's some extra reading: https://docs.splunk.com/Documentation/Splunk/latest/Search/Abouttimezones
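As a quick sanity check of that hard-coded 28800-second offset, the arithmetic can be verified from a shell (a sketch assuming GNU date is available; note the inverted POSIX sign convention, where Etc/GMT+8 means fixed UTC-8 with no DST):

```shell
# Compute the hour difference between UTC and fixed Pacific Standard Time
# for a given epoch timestamp. PST is 8 hours (28800 seconds) behind UTC.
offset_hours() {
  epoch="$1"
  utc=$(TZ=UTC date -d "@$epoch" +%H)
  pst=$(TZ=Etc/GMT+8 date -d "@$epoch" +%H)  # POSIX sign inverted: GMT+8 == UTC-8
  # 10# forces base-10 so hours like "08" aren't parsed as octal
  echo $(( (10#$utc - 10#$pst + 24) % 24 ))
}

offset_hours 1589000000   # prints 8
```

Bear in mind the hard-coded eval breaks twice a year when daylight saving flips the real offset to 7 hours, which is why fixing the UI preference is the proper solution.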
andrew207 (author) Posted May 15, 2020
@tkohhh I'm testing a fix for this now. It'll be available temporarily under the :tzfix dockerhub tag; I'll push it to master (and update this post) later this arvo once I've fully tested it. The issue seems to be related to my recent update to Alpine Linux -- they changed the way timezones are applied and I didn't notice.
EDIT: Now working fine. Update your container and set your timezone in user preferences in the UI; should be all good.
Phoenix Down Posted June 4, 2020
Does this install the 60 day free enterprise trial license?
andrew207 (author) Posted June 4, 2020
@Phoenix Down Yes, this installs the 60-day free enterprise trial license. The Github repo readme has instructions on how to reset the license when it runs out.
Phoenix Down Posted June 4, 2020
1 hour ago, andrew207 said: Yes this installs the 60 day free enterprise trial license. In the Github repo readme there are instructions on how to reset the license when it runs out.
Thanks! Does switching the version wipe out your settings, indexes, and KV stores?
andrew207 (author) Posted June 4, 2020
It will wipe anything that's not in a volume. So make sure you map /splunkdata to preserve your indexes, and make sure your settings are in /etc/apps/* (not /etc/system/local/*), as well as ensuring /etc/apps is a volume too. In the UnRAID template XML I have called these two volumes "DataPersist" and "ConfigPersist". Your KVStore will be wiped, as it is not in a volume by default. If you want to retain your KVStore (tbh I just use lookups for the simple things I do in Docker), you'll need to do a backup + restore, which is simple too: https://docs.splunk.com/Documentation/Splunk/8.0.4/Admin/BackupKVstore
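If you do need the KVStore contents to survive an image change, the linked docs describe a backup/restore cycle using Splunk's own CLI. From the host it would look something like this (the container name "splunk" and archive name are examples, and the exact flags should be checked against the linked page for your version):

```shell
# Back up the KV store from inside the running container; by default the
# archive lands under $SPLUNK_HOME/var/lib/splunk/kvstorebackup.
docker exec splunk /opt/splunk/bin/splunk backup kvstore -archiveName pre_upgrade

# ...recreate/update the container, then restore from the same archive:
docker exec splunk /opt/splunk/bin/splunk restore kvstore -archiveName pre_upgrade
```

Because the backup directory lives under var/lib, you'd also want it inside a mapped volume so the archive itself survives the container swap.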
loheiman Posted June 15, 2020
I'm having trouble getting this to work, but I'm probably missing something very obvious. I have no experience with syslog servers. I'm trying to use this as a remote syslog server for my Unifi controller (running on the same Unraid machine but a different IP). I verified that my Unifi controller is sending logs by creating a temporary syslog server on my Mac in the terminal (using UDP 514). I then pointed the Unifi controller at this container's IP and set the port to 9997 (I also tried 514). When I search in Splunk, it says 0 events have been recorded.
EDIT: Once I created a Data Input in the Splunk container for UDP 514 and added the 514 port to the container during the install, it worked. Am I correct that both of those steps were required? What's the benefit of using 9997, and how would I do that? Thanks!
andrew207 (author) Posted June 15, 2020
@loheiman Nice job figuring out you need to define UDP 514 as a Data Input through the user interface. This config is applied in an "inputs.conf" file at /opt/splunk/etc/apps/*/local/inputs.conf (where * is whatever app context you were in when you created the input; it defaults to "search" or "launcher"), so as long as you used the ConfigPersist volume configuration in the UnRAID Docker template you're all good there -- even the default is fine.

9997 is the port used by default for Splunk's own (somewhat) proprietary TCP format. It supports TLS and compression, which is why it's generally preferred. This port is generally used by Splunk's "Universal Forwarder", an agent that gets installed on endpoints you want to monitor. I say "somewhat" proprietary because several third parties have implemented Splunk's TCP input stream in their own commercial applications, so theoretically anyone could too.

Splunk can also listen for HTTP or HTTPS data using a HTTP Event Collector data input, which defaults to port 8088. HTTP is available because, similar to syslog over 514, it is very universal and supported by just about everything.
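For reference, the UI-created Data Input boils down to a small stanza. A hand-written equivalent (the sourcetype and app path here are illustrative) placed in /opt/splunk/etc/apps/search/local/inputs.conf would look something like:

```ini
# Listen for syslog on UDP 514; connection_host = ip records the sender's IP
# address as the host field instead of doing a reverse DNS lookup.
[udp://514]
sourcetype = syslog
connection_host = ip
```

The container-side port mapping is still needed separately, which is why both steps were required: Splunk has to listen on 514 inside the container, and Docker has to forward 514 into it.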
4554551n Posted July 10, 2020
Hi, I am trying to set this up for the first time and have a couple of questions.

Re resetting the trial license: if I don't need the enterprise version and the free version suits my needs, can I just never worry about this and keep Splunk running as a free version?

Re Splunk index/data location: I would really like the hot bucket to live in appdata (on the cache, for faster writes to the SSD and to avoid constantly hammering my array with writes), limited to maybe 2-3 GB (or until the docker is restarted), and have the warm and cold buckets moved to the array (so as not to fill up the cache disk). Could I please have some guidance on how to use the docker edit menu in Unraid to create the extra folder/mount point for the warm and cold (and frozen/thawed, I guess?) buckets? Note: I see in the Splunk docs that both hot and warm buckets live in "$SPLUNK_HOME/var/lib/splunk/defaultdb/db/*", so it may be impossible to separate them, in which case how can I keep hot and warm on the cache but all the others on the array? (I would like frozen buckets archived rather than deleted too; that's kinda the point of a massive RAID array, right?)
andrew207 (author) Posted July 10, 2020
Hey @4554551n, thanks for your interest. Here are some answers to your questions.

> Resetting the trial license

Yeah sure, you can set it to the free license if you want. Whenever you upgrade the container you'll just need to set it to the free license again.

> Splunk index data location / splitting hot/warm and cold

You can't split hot and warm, but you can split hot/warm and cold. With Splunk there are a lot of ways to split cold data off into its own location; I'd use "volumes". Here's a link to the spec for the config file we'll be editing: https://docs.splunk.com/Documentation/Splunk/8.0.4/Admin/Indexesconf

In my docker startup script I run code to change the default SPLUNK_DB location to the /splunkdata mount the container uses. SPLUNK_DB contains both hot/warm and cold. (OPTIMISTIC_ABOUT_FILE_LOCKING fixes an unrelated bug.) We set this in splunk-launch.conf, meaning the SPLUNK_DB variable is set at startup and is persistent through the whole Splunk ecosystem. As you correctly identified from the Splunk docs, SPLUNK_DB is used as the storage location for all index data and all buckets by default; this config was made to split it off into a volume:

printf "\nOPTIMISTIC_ABOUT_FILE_LOCKING = 1\nSPLUNK_DB=/splunkdata" >> $SPLUNK_HOME/etc/splunk-launch.conf

1. Create a new volume in your docker config to store your cold data (e.g. /mynewcoldlocation).

2. Create an indexes.conf file, preferably in a persistent location such as $SPLUNK_HOME/etc/apps/<app>/default/indexes.conf.

3. Define hotwarm/cold volumes in your new indexes.conf. Here's an example:

[volume:hotwarm]
path = /splunkdata
# Roughly 3GB in MB
maxVolumeDataSizeMB = 3072

[volume:cold]
path = /mynewcoldlocation
# Roughly 50GB in MB
maxVolumeDataSizeMB = 51200

It would be up to you to ensure /splunkdata is stored on your cache disk and /mynewcoldlocation is on your array, as defined in your docker config for this container.

4. Configure your indexes to utilise those volumes by default by updating the same indexes.conf file:

[default]
# 365 days in seconds
frozenTimePeriodInSecs = 31536000
homePath = volume:hotwarm/$_index_name/db
coldPath = volume:cold/$_index_name/colddb
# Unfortunately we can't use volumes for the thawed path, so we need to hardcode the directory.
# Chances are you won't need this anyway unless you "freeze" data to an offline disk.
thawedPath = /mynewFROZENlocation/$_index_name/thaweddb
# Tstats summaries should reside on the fastest disk for maximum performance
tstatsHomePath = volume:hotwarm/$_index_name/datamodel_summary

5. Remember that Splunk's internal indexes won't follow config in [default], so if we want Splunk's own self-logging to follow these rules we need to hard-code it:

[_internal]
# 90 days in seconds
frozenTimePeriodInSecs = 7776000
# Override defaults set in $SPLUNK_HOME/etc/system/default/indexes.conf
homePath = volume:hotwarm/_internaldb/db
coldPath = volume:cold/_internaldb/colddb

[_audit]
# 90 days in seconds
frozenTimePeriodInSecs = 7776000
# Override defaults set in $SPLUNK_HOME/etc/system/default/indexes.conf
homePath = volume:hotwarm/audit/db
coldPath = volume:cold/audit/colddb

# ... etc etc for other indexes you want to honour this.

> Freezing buckets

You'll see in the config above I set "maxVolumeDataSizeMB" and "frozenTimePeriodInSecs". For our volumes, once the entire volume hits that size it'll start moving buckets to the next tier (hotwarm --> cold --> frozen). Additionally, each individual index has similar max-size config that controls how quickly it freezes off; iirc the default is 500GB. Each individual index can also have a "frozenTimePeriodInSecs", which freezes data once it hits a certain age. With this set, data will freeze either when it is the oldest bucket and you've hit your maxVolumeDataSizeMB, or when it's older than frozenTimePeriodInSecs.

When data freezes it is deleted unless you tell it where to go. The easiest way to tell it where to go is by setting a coldToFrozenDir in your indexes.conf for every index. For example, for an index called "web" in our same indexes.conf it might look like the below; here's some doco to explain further: https://docs.splunk.com/Documentation/Splunk/8.0.4/Indexer/Automatearchiving

[web]
# 90 days in seconds
frozenTimePeriodInSecs = 7776000
coldToFrozenDir = /myfrozenlocation
# alternatively, coldToFrozenScript = /mover.sh

Hope this helps.
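If you go the coldToFrozenScript route instead of coldToFrozenDir, Splunk runs your script with the path of the bucket it is about to freeze as the first argument, and only deletes the bucket once the script exits 0. A minimal sketch of such a mover.sh, written here as a shell function for illustration (FROZEN_DIR and the fallback path are assumptions, not part of the container):

```shell
# Archive a bucket directory before Splunk deletes it. Splunk passes the
# bucket path as the first argument; a non-zero exit tells Splunk the archive
# failed, so it keeps the bucket and retries later.
freeze_bucket() {
  bucket="$1"
  archive="${FROZEN_DIR:-/myfrozenlocation}"
  mkdir -p "$archive" || return 1
  # Copy the whole bucket directory into the archive location.
  cp -r "$bucket" "$archive"/ || return 1
  return 0
}

# In a real mover.sh this would simply be invoked as: freeze_bucket "$1"
```

Only returning 0 after a verified copy is the important part of the design; an archive script that exits 0 unconditionally would let Splunk delete buckets it never actually saved.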
4554551n Posted July 11, 2020
Thank you. I'll get onto all of that over the weekend; there's a bit there to get my head around. Quick question about the license though, since that's something I can knock out quickly: this container is made to run the non-free license, presumably for your own purposes. So how would I go about changing that to free (which you say I would need to redo on every update)? Also, with auto-updating dockers, what would happen if I updated the docker and didn't immediately change it to free? Would I have issues ingesting logs from forwarders (or other issues), or would it work perfectly fine as long as I changed it within the 60-day trial window (and what would happen if I didn't do it in time)?
andrew207 (author) Posted July 11, 2020
Yeah, it's set to enterprise by default because that's what I use. Persisting the free license is a little more awkward than, say, modifying your bucket locations, due to the precedence with which Splunk reads config files. There is a server.conf file at $SPLUNK_HOME/etc/system/local that contains your license details, and this file takes precedence over anything you put in $SPLUNK_HOME/etc/apps/. You could make this file (or folder) a volume in your docker config, then modify the file to instruct Splunk to operate under a free license. Such a server.conf entry might look like this:

[lmpool:auto_generated_pool_free]
description = auto_generated_pool_free
quota = MAX
slaves = *
stack_id = free

[license]
active_group = Free

So the TL;DR is: set your license type to "free" in the GUI, then ensure the file $SPLUNK_HOME/etc/system/local/server.conf is persisted in a docker volume. If you do this, your free license will work fine and will persist through container updates/autoupdates.

The only reason it wouldn't work is if the config got messed up somehow. It's easy to check: just restart Splunk and make sure you're still on the free license. You can even swap your container from the :latest tag to the :8.0.4 tag to force it to redownload the whole image; if you're still on free after that, you're golden. If your config didn't apply for whatever reason (e.g. you removed read permissions on the file), Splunk will start up under the trial license. Splunk checks a bunch of things to figure out the 60-day limit (I discuss how to reset it in the Github readme), but basically if you're within the 60 days then all is good and you can change back to free and debug your issue; if you're over 60 days it will stop ingesting data and you will be unable to search non-internal indexes. You'll also get an alert banner on the login page and a heap of messages in the interface telling you so. Thanks!