Splunk – Centralized, Real Time Log Analysis (NOS)
NOS = not open source, but there is a ‘free’ version… Before diving in, some attractive verbiage from Wikipedia:
Splunk is a search, monitoring and reporting tool for IT system administrators with search capabilities. It crawls logs, metrics, and other data from applications, servers and network devices and indexes it in a searchable repository from which it can generate graphs, SQL reports and alerts. It is intended to assist system administrators in the identification of patterns and the diagnosis of problems.
Before reading any further I encourage the visitor to read my ‘about page’ – I tend to be a disruptive user and say what I think so expect direct reactions to whatever I encounter.
Splunk sounds great if you are an SA (well, the features sound great.) I will hazard a guess that the solution combines a bit of Open Source approaches with custom crunch/analysis tool(s). Could you create your own? Most likely yes (and there probably already are some free Open Source solutions with similar features.) Since this is a commercial product my expectations are quite high:
- I should not have any configuration issues (should work without any manual tweaks by me.)
- I should not need to go grab yet another tool, library, etc.
- The installer should be the only command that I need to run – I should NOT have to make any system changes.
Ok, so what is the reality?
- I create a new VirtualBox VM, install FC12 (manually) and then do a complete system update | ~ 2 hours
- I lock down the network so that traffic from the VM NIC cannot ‘reach out’… (I block traffic via the VM MAC address by adding a rule on my router; internal network traffic is allowed.)
- I download Splunk | a few minutes
- I review the install information from the product web page(s) | ~30+ minutes. There is quite a bit of performance and ‘sizing’ information – I pick up on this:
Performance Considerations
Splunk has three primary roles - indexer, searcher and forwarder. In many cases a single Splunk instance may [serve] two or all three roles at once. All have their own performance requirements, and bottlenecks.
- indexing, while relatively resource inexpensive, is often disk I/O bound
- searching can be both CPU and disk I/O bound
- forwarding uses very little resources, and is rarely a bottleneck
As you can see, disk I/O is frequently the limiting factor in Splunk performance, and deserves extra consideration in your planning. That also makes Splunk a poor virtualization candidate unless dedicated disk access can be arranged.
Did someone mention Beer? (some marketing from the Splunk web pages…)
More about Splunk Free
Splunk Free is a totally free (as in beer) version of Splunk. It allows you to index up to 500MB/day and will never expire. If you go over 500MB/day more than 3 times in a 30 day period, Splunk will continue to index your data, but search will be disabled until you are back down to 3 or fewer times in the 30 day period.
What’s it for?
Splunk Free is designed for personal, ad-hoc search and visualization of IT data. You can use Splunk Free for ongoing indexing of small volumes (<500MB/day) of data. Additionally, you can use it for short-term bulk-loading and analysis of larger data sets–Splunk Free allows you to bulk-load much larger data sets up to 3 times within a 30 day period. This can be useful for forensic review of large data sets.
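The licensing rule quoted above is simple enough to model. The following is a minimal sketch of the rule as worded in that marketing text, not of how Splunk actually enforces it; the function names and the idea of feeding it a list of daily volumes in MB are assumptions made for illustration.

```python
from typing import Sequence

DAILY_LIMIT_MB = 500   # per-day indexing allowance from the quoted text
MAX_VIOLATIONS = 3     # overage days tolerated inside the window
WINDOW_DAYS = 30       # length of the rolling window from the quoted text


def count_violations(daily_mb: Sequence[float]) -> int:
    """Count days in the trailing 30-day window that exceeded the limit."""
    window = daily_mb[-WINDOW_DAYS:]
    return sum(1 for mb in window if mb > DAILY_LIMIT_MB)


def search_enabled(daily_mb: Sequence[float]) -> bool:
    """Search stays available while the window holds 3 or fewer overage days."""
    return count_violations(daily_mb) <= MAX_VIOLATIONS


if __name__ == "__main__":
    quiet_month = [120.0] * 30
    bulk_loads = [120.0] * 26 + [2000.0] * 4  # four oversized days
    print(search_enabled(quiet_month))
    print(search_enabled(bulk_loads))
```

Read this way, three bulk loads in a month are fine and a fourth disables search until one of the oversized days ages out of the window.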
Potential Install Concerns
- I locate a possible ‘red flag’ in the Linux install docs – seems that user/owner might be a concern (this should be a standardized item with perhaps an install option to customize further.) I would also expect this to have process/service configuration options; at this point I would prefer to see Apache integration for any HTTP service (that would allow one point of access/security configuration.) It may be possible to proxy this service via Apache – we’ll see.
- The software will present itself as a web server running on a dedicated port (you need to consider firewall or other network aspects for your environment – not a problem just an item to resolve.)
The actual Splunk Install
- I confirm that the new VM cannot send packets through the router (but, if needed, I can enable access via a monitored proxy.)
- I install Splunk via ‘time rpm -i splunk-rpmfile’ | output is:
warning: splunk-4.1.2-79191.i386.rpm: Header V3 DSA signature: NOKEY, key ID 653fb112
----------------------------------------------------------------------
Splunk has been installed in:
/opt/splunk
To start Splunk, run the command:
/opt/splunk/bin/splunk start
To use the Splunk Web interface, point your browser at:
http://Your-SERVER:8000
Complete documentation is at http://www.splunk.com/r/docs
----------------------------------------------------------------------
real 0m24.007s
user 0m1.347s
sys 0m16.849s
Hmm – done? ready to run?
Ok, as a non-privileged user I try and get:
/opt/splunk/bin/splunk start
bash: /opt/splunk/bin/splunk: Permission denied
This is good and bad… If this is a ‘service’ then I would expect a system-level startup approach. I do an ‘su - splunk’ and retry.
Ok – after agreeing to the license agreement there is a lot of setup that occurs – note the message about SELinux (this should have been disclosed in the setup info; also, there is a ‘management port’ which should also have been mentioned in the setup info…) The install process did not prompt for any setup info…
Do you agree with this license? [y/n]: y
Copying '/opt/splunk/etc/myinstall/splunkd.xml.cfg-default' to '/opt/splunk/etc/myinstall/splunkd.xml'.
Copying '/opt/splunk/etc/openldap/ldap.conf.default' to '/opt/splunk/etc/openldap/ldap.conf'.
Moving '/opt/splunk/share/splunk/search_mrsparkle/modules.new' to '/opt/splunk/share/splunk/search_mrsparkle/modules'.
/opt/splunk/etc/auth/audit/private.pem
/opt/splunk/etc/auth/audit/public.pem
['openssl', 'genrsa', '-out', '/opt/splunk/etc/auth/audit/private.pem', '1024']
/opt/splunk/etc/auth/audit/private.pem generated.
/opt/splunk/etc/auth/audit/public.pem generated.
Generating RSA private key, 1024 bit long modulus
.......++++++
.............................................++++++
e is 65537 (0x10001)
writing RSA key
/opt/splunk/etc/auth/distServerKeys/private.pem
/opt/splunk/etc/auth/distServerKeys/trusted.pem
['openssl', 'genrsa', '-out', '/opt/splunk/etc/auth/distServerKeys/private.pem', '1024']
/opt/splunk/etc/auth/distServerKeys/private.pem generated.
/opt/splunk/etc/auth/distServerKeys/public.pem generated.
Generating RSA private key, 1024 bit long modulus
...................++++++
.......++++++
e is 65537 (0x10001)
writing RSA key
This appears to be your first time running this version of Splunk.
Creating: /opt/splunk/var/lib
Creating: /opt/splunk/var/run/splunk
Creating: /opt/splunk/var/run/splunk/upload
Creating: /opt/splunk/var/spool/splunk
Creating: /opt/splunk/var/spool/dirmoncache
Creating: /opt/splunk/var/lib/splunk/authDb
Creating: /opt/splunk/var/lib/splunk/hashDb
Checking databases...
Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, main, sample, splunklogger, summary
Splunk> All batbelt. No tights.
Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: open
Checking configuration... Done.
Checking index directory... Done.
Checking databases...
Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, main, sample, splunklogger, summary
Checking for SELinux.
Command error: Splunk will not run with SELinux enabled. If you have adjusted Splunk's security level with chcon, you can bypass this check by setting the 'SPLUNK_IGNORE_SELINUX' environment variable.
I disable SELinux (it will reset on reboot – I am not ready to make a long-term change via chcon for Splunk…)
export SPLUNK_IGNORE_SELINUX=1
/opt/splunk/bin/splunk start
Now it’s time to read the docs and perhaps configure data inputs. I don’t locate any quick way to get real info into the system. I do encounter a page indicating that I need a new version of Flash… Since I was running the Browser in a VM running Splunk I decided to try from an external browser. Firewall changes are required to accomplish this.
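Before fiddling with firewall rules it helps to confirm from the external machine whether the Splunk ports are reachable at all. Here is a minimal sketch using only the Python standard library; the helper name `port_open` is made up for illustration, the host defaults to `127.0.0.1` as a placeholder for the VM's address, and ports 8000 and 8089 are the web and management ports reported in the startup output above.

```python
import socket
import sys


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Pass the VM's address as the first argument; 127.0.0.1 is a placeholder.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    for port in (8000, 8089):
        state = "reachable" if port_open(host, port) else "blocked or closed"
        print(f"{host}:{port} {state}")
```

A "blocked or closed" result from outside the VM combined with "reachable" from inside it points at the firewall rather than at Splunk.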
Next I select the ‘Getting Started’ link which leads me to Manager –> Data inputs –> Files & Directories. I attempt to add a standard system log file to the data set (expecting problems) and get one – ‘access denied’. It seems silly to me that a tool (seemingly) designed to monitor sensitive system data has a default install that does not take this into account on the system where the tool is installed. [ I will guess that 'advanced install' documentation may explain this; also I did run across docs on using Splunk via Apache... ]
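The ‘access denied’ is most likely plain file permissions: the account Splunk runs under cannot read root-owned logs. A quick way to see which files are affected is to run a check like the sketch below as that account. The function name `readable_report` and the example paths are assumptions for illustration, not anything from the Splunk docs.

```python
import os
from typing import Dict, Iterable


def readable_report(paths: Iterable[str]) -> Dict[str, bool]:
    """Map each path to whether the current user may open it for reading."""
    return {path: os.access(path, os.R_OK) for path in paths}


if __name__ == "__main__":
    # Hypothetical candidates; substitute the logs you intend to index.
    candidates = ["/var/log/messages", "/var/log/secure"]
    for path, ok in readable_report(candidates).items():
        print(f"{path}: {'readable' if ok else 'not readable'}")
```

Anything reported as not readable needs a group or ACL change (or a copy into a watched folder) before Splunk can index it.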
- Build reports opens a new browser window – would prefer a new browser tab…
- ‘Help’ link in search drop-down box requires Internet access? (I prefer that apps include such data – no Internet or outside connections should be needed…)
- Since download I have received 4 emails (in five days) from this vendor: one ‘thanks for looking’, one follow-up from ‘sales’, an announcement about the weekly ‘webcast’ (every Wednesday?) and the most recent a personal follow-up from a real person. (Sounds like pre-sales/sales process/team have their act together.)
- I load some system log data onto the test box and configure a ‘watch’ folder; sure enough as files show up they become accessible via Splunk
- I add a GEO-IP data file (IP addresses with countries); the IP addresses are from the previous data files; the GEO-IP info was generated via a custom script. If I search on an IP then the results include ‘hits’ from both files; if I search on country codes or cities then I only get the GEO-IP data – of course, what I want is a ‘correlated result’ that ties the IP to a country so I could ask for something like, “Show me all log entries from city X OR country Y or region Whatever…” With my very limited knowledge of this tool I don’t see a simple way to do this so my review stops at this point.
- So far I have only tried a few simple searches and reports – lots of standard features with many options.
- SA perspective – install process/docs need tweaks for a seamless install.
- User perspective – looks really promising.
- This is not a ‘drop it in’ tool – it will take time to evaluate how to get the most benefit from using it but it has lots of potential in streamlining the analysis of system data.
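The ‘correlated result’ wished for in the GEO-IP item above is essentially a join between the log entries and the GEO-IP file, keyed on the IP address. A minimal sketch of that join outside Splunk follows; it assumes a GEO-IP file with `ip,country,city` rows and log lines that start with the client IP, and all function names and sample data are hypothetical (the addresses come from the reserved documentation ranges).

```python
import csv
import io
from typing import Dict, Iterable, List, Set, Tuple

GeoRow = Tuple[str, str]  # (country, city)


def load_geo(csv_text: str) -> Dict[str, GeoRow]:
    """Build an ip -> (country, city) lookup from 'ip,country,city' rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: (row[1], row[2]) for row in reader if len(row) >= 3}


def entries_from(
    log_lines: Iterable[str],
    geo: Dict[str, GeoRow],
    countries: Set[str] = frozenset(),
    cities: Set[str] = frozenset(),
) -> List[str]:
    """Return log lines whose leading IP maps to a wanted country OR city."""
    hits = []
    for line in log_lines:
        parts = line.split()
        if not parts or parts[0] not in geo:
            continue  # no IP on the line, or IP missing from the GEO-IP data
        country, city = geo[parts[0]]
        if country in countries or city in cities:
            hits.append(line)
    return hits


if __name__ == "__main__":
    geo = load_geo(
        "192.0.2.10,US,Denver\n"
        "198.51.100.7,DE,Berlin\n"
        "203.0.113.5,US,Austin\n"
    )
    logs = [
        "192.0.2.10 GET /index.html 200",
        "198.51.100.7 GET /login 401",
        "203.0.113.5 GET /about 200",
        "192.0.2.99 GET /missing 404",
    ]
    # "All log entries from country DE OR city Austin"
    for line in entries_from(logs, geo, countries={"DE"}, cities={"Austin"}):
        print(line)
```

Whether Splunk can do the equivalent at search time is exactly the open question from the review; the sketch only pins down what the desired result looks like.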
I plan on attending some of the online webcasts/screencast events for this product. A quick review reveals quite a number of tools and settings that could be quite helpful when attempting to discover meaning in system data. A side note – this solution appears to be using Python for at least some portion of its work.
As always, your mileage should vary.