Splunk – Centralized, Real Time Log Analysis (NOS)

NOS = not open source; but there is a ‘free’ version…  Before diving in, here is some attractive verbiage from Wikipedia:

Splunk[1][2][3] is a search, monitoring and reporting tool for IT system administrators with search capabilities[4]. It crawls logs, metrics, and other data from applications, servers and network devices and indexes it in a searchable repository from which it can generate graphs, SQL reports and alerts.[5] It is intended to assist system administrators in the identification of patterns and the diagnosis of problems.

Before reading any further I encourage the visitor to read my ‘about page’ – I tend to be a disruptive user and say what I think so expect direct reactions to whatever I encounter.

Splunk sounds great if you are an SA (well, the features sound great.)   I will hazard a guess that the solution combines a bit of Open Source approaches with custom crunch/analysis tool(s).  Could you create your own?  Most likely yes (and there probably already are some free Open Source solutions with similar features.) Since this is a commercial product my expectations are quite high:

  • I should not have any configuration issues (should work without any manual tweaks by me.)
  • I should not have to go grab yet another tool, library, etc.
  • The installer should be the only command that I need to run – I should NOT have to make any system changes.

Ok, so what is the reality?

  1. I create a new VirtualBox VM, install FC12 (manually) and then do a complete system update | ~ 2 hours
  2. I lock down the network so that traffic from the VM NIC cannot ‘reach out’… (I block traffic via the VM MAC address by adding a rule on my router; internal network traffic is allowed.)
  3. I download Splunk | a few minutes
  4. I review the install information from the product web page(s) | ~30+ minutes.  There is quite a bit of performance and ‘sizing’ information – I pick up on this:

Performance Considerations

Splunk has three primary roles - indexer, searcher and forwarder. In many cases a single Splunk instance may [serve] two or all three roles at once. All have their own performance requirements, and bottlenecks.

  • indexing, while relatively resource inexpensive, is often disk I/O bound
  • searching can be both CPU and disk I/O bound
  • forwarding uses very little resources, and is rarely a bottleneck

As you can see, disk I/O is frequently the limiting factor in Splunk performance, and deserves extra consideration in your planning. That also makes Splunk a poor virtualization candidate unless dedicated disk access can be arranged.

Did someone mention Beer? (some marketing from the Splunk web pages…)

More about Splunk Free

Splunk Free is a totally free (as in beer) version of Splunk. It allows you to index up to 500MB/day and will never expire. If you go over 500MB/day more than 3 times in a 30 day period, Splunk will continue to index your data, but search will be disabled until you are back down to 3 or fewer times in the 30 day period.

What’s it for?

Splunk Free is designed for personal, ad-hoc search and visualization of IT data. You can use Splunk Free for ongoing indexing of small volumes (<500MB/day) of data. Additionally, you can use it for short-term bulk-loading and analysis of larger data sets–Splunk Free allows you to bulk-load much larger data sets up to 3 times within a 30 day period. This can be useful for forensic review of large data sets.
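To make the quota rule above concrete for myself, here is a minimal Python sketch of how I read it. It is not Splunk's code: the function name, the dictionary of daily volumes, and the choice of a rolling 30-day window that ends on the current day are all my own assumptions; only the 500MB, "more than 3 times" and 30-day figures come from the quoted text.

```python
from datetime import date, timedelta

DAILY_LIMIT_MB = 500   # daily indexing limit quoted for Splunk Free
MAX_VIOLATIONS = 3     # search is disabled above this many overage days
WINDOW_DAYS = 30       # assumed to be a rolling window ending today


def search_disabled(daily_mb, today):
    """Return True if search would be disabled on `today`.

    daily_mb maps a date to the megabytes indexed on that day
    (a hypothetical structure used only for this illustration).
    """
    window_start = today - timedelta(days=WINDOW_DAYS - 1)
    violations = sum(
        1
        for day, mb in daily_mb.items()
        if window_start <= day <= today and mb > DAILY_LIMIT_MB
    )
    return violations > MAX_VIOLATIONS
```

Read this way, three bulk-load days in a month leave search available and a fourth one turns it off until the oldest overage falls out of the window.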

Potential Install Concerns

  1. I locate a possible ‘red flag’ in the Linux install docs – seems that user/owner might be a concern (this should be a standardized item with perhaps an install option to customize further.)  I would also expect this to have process/service configuration options; at this point I would prefer to see Apache integration for any HTTP service (that would allow one point of access/security configuration.)  It may be possible to proxy this service via Apache – we’ll see.
  2. The software will present itself as a web server running on a dedicated port (you need to consider firewall or other network aspects for your environment – not a problem just an item to resolve.)

The actual Splunk Install

  1. I confirm that the new VM cannot send packets through the router (but, if needed, I can enable access via a monitored proxy.)
  2. I install Splunk via ‘time rpm -i splunk-rpmfile’ | output is:

warning: splunk-4.1.2-79191.i386.rpm: Header V3 DSA signature: NOKEY, key ID 653fb112
----------------------------------------------------------------------
Splunk has been installed in:
/opt/splunk

To start Splunk, run the command:
/opt/splunk/bin/splunk start

To use the Splunk Web interface, point your browser at:
http://Your-SERVER:8000

Complete documentation is at http://www.splunk.com/r/docs
----------------------------------------------------------------------

real    0m24.007s
user    0m1.347s
sys    0m16.849s

Hmm – done? ready to run?

Ok, as a non-privileged user I try to start Splunk and get:

/opt/splunk/bin/splunk start
bash: /opt/splunk/bin/splunk: Permission denied

This is good and bad… If this is a ‘service’ then I would expect a system-level startup approach.  I do an ‘su - splunk’ and retry.

Ok – after agreeing to the license agreement there is a lot of setup that occurs – note the message about SELinux (this should have been disclosed in the setup info; there is also a ‘management port’ which should have been mentioned in the setup info…)  The install process did not prompt for any setup info…

Do you agree with this license? [y/n]: y
Copying '/opt/splunk/etc/myinstall/splunkd.xml.cfg-default' to '/opt/splunk/etc/myinstall/splunkd.xml'.
Copying '/opt/splunk/etc/openldap/ldap.conf.default' to '/opt/splunk/etc/openldap/ldap.conf'.
Moving '/opt/splunk/share/splunk/search_mrsparkle/modules.new' to '/opt/splunk/share/splunk/search_mrsparkle/modules'.
/opt/splunk/etc/auth/audit/private.pem
/opt/splunk/etc/auth/audit/public.pem
['openssl', 'genrsa', '-out', '/opt/splunk/etc/auth/audit/private.pem', '1024']
/opt/splunk/etc/auth/audit/private.pem generated.
/opt/splunk/etc/auth/audit/public.pem generated.
Generating RSA private key, 1024 bit long modulus
.......++++++
.............................................++++++
e is 65537 (0x10001)
writing RSA key

/opt/splunk/etc/auth/distServerKeys/private.pem
/opt/splunk/etc/auth/distServerKeys/trusted.pem
['openssl', 'genrsa', '-out', '/opt/splunk/etc/auth/distServerKeys/private.pem', '1024']
/opt/splunk/etc/auth/distServerKeys/private.pem generated.
/opt/splunk/etc/auth/distServerKeys/public.pem generated.
Generating RSA private key, 1024 bit long modulus
...................++++++
.......++++++
e is 65537 (0x10001)
writing RSA key

This appears to be your first time running this version of Splunk.
	Creating: /opt/splunk/var/lib
	Creating: /opt/splunk/var/run/splunk
	Creating: /opt/splunk/var/run/splunk/upload
	Creating: /opt/splunk/var/spool/splunk
	Creating: /opt/splunk/var/spool/dirmoncache
	Creating: /opt/splunk/var/lib/splunk/authDb
	Creating: /opt/splunk/var/lib/splunk/hashDb
	Checking databases...
	Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, main, sample, splunklogger, summary

Splunk> All batbelt. No tights.

Checking prerequisites...
	Checking http port [8000]: open
	Checking mgmt port [8089]: open
	Checking configuration...  Done.
	Checking index directory...  Done.
	Checking databases...
	Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, main, sample, splunklogger, summary
	Checking for SELinux.

Command error: Splunk will not run with SELinux enabled.
If you have adjusted Splunk's security level with chcon, you can bypass this check by setting the 'SPLUNK_IGNORE_SELINUX' environment variable.

I disable SELinux (it will reset on reboot – I am not ready to make a long-term change via chcon for Splunk…)

export SPLUNK_IGNORE_SELINUX=1

/opt/splunk/bin/splunk start
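The export above changes the whole shell session. As a hedged alternative, here is a small Python sketch that sets the variable only for the one Splunk process. The helper name and its parameters are hypothetical; the path comes from the RPM output and the variable name from the error message. I am assuming the value "1" is acceptable because that is what I exported; the message does not say whether the value matters.

```python
import os
import subprocess

SPLUNK_BIN = "/opt/splunk/bin/splunk"  # install path reported by the RPM


def start_splunk(run=subprocess.run, base_env=None):
    """Start Splunk with the SELinux check bypassed for this process only.

    `run` and `base_env` are injectable so the helper can be exercised
    without a real Splunk install (hypothetical parameters).
    """
    # Copy the environment so the caller's shell is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["SPLUNK_IGNORE_SELINUX"] = "1"
    # check=True raises CalledProcessError if Splunk exits non-zero.
    return run([SPLUNK_BIN, "start"], env=env, check=True)
```

Calling start_splunk() as the splunk user should behave like the two shell lines above, without leaving the bypass set for anything else started from that shell.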

Now it’s time to read the docs and perhaps configure data inputs. I don’t locate any quick way to get real info into the system. I do encounter a page indicating that I need a new version of Flash…  Since I was running the browser inside the same VM as Splunk, I decided to try from an external browser instead.  Firewall changes are required to accomplish this.
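A quick way to confirm that the firewall change took effect is to test the port from the external machine. This is a generic TCP check, nothing Splunk-specific, and the function name is my own.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

From the external box I would call port_is_reachable("Your-SERVER", 8000) for the web interface and, if remote management is wanted, the same for 8089 (both port numbers are from the startup output; the host name is the placeholder Splunk printed). A True result only proves that something accepts connections on that port, not that it is Splunk.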

Next I select the ‘Getting Started’ link which leads me to Manager –> Data inputs –> Files & Directories.  I attempt to add a standard system log file to the data set (expecting problems) and get one – ‘access denied’.  It seems silly to me that a tool (seemingly) designed to monitor sensitive system data has a default install that does not take this into account on the system where the tool is installed.  [ I will guess that 'advanced install' documentation may explain this;  also I did run across docs on using Splunk via Apache... ]
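Before adding inputs through the web interface, it may save time to check which files the account running Splunk can actually read. A minimal sketch, to be run as that account; the function name is mine, and any paths passed in are examples rather than something Splunk requires.

```python
import os


def unreadable_paths(paths):
    """Return the paths the current user cannot open for reading.

    Paths that do not exist are reported as unreadable too.
    """
    return [p for p in paths if not os.access(p, os.R_OK)]
```

For example, unreadable_paths(["/var/log/messages", "/var/log/secure"]) run as the splunk user would list whichever of those two (hypothetical choices of log file) are off limits, which is the same condition behind the ‘access denied’ above.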

  • Build reports opens a new browser window – would prefer a new browser tab…
  • ‘Help’ link in search drop-down box requires Internet access? (I prefer that apps include such data – no Internet or outside connections should be needed…)
  • Since download I have received 4 emails (in five days) from this vendor: one ‘thanks for looking’, one follow-up from ‘sales’, an announcement about the weekly ‘webcast’ (every Wednesday?) and the most recent a personal follow-up from a real person.  (Sounds like pre-sales/sales process/team have their act together.)  :)
  • I load some system log data onto the test box and configure a ‘watch’ folder; sure enough as files show up they become accessible via Splunk
  • I add a GEO-IP data file (IP addresses with countries); the IP addresses are from the previous data files; the GEO-IP info was generated via a custom script.  If I search on an IP then the results include ‘hits’ from both files; if I search on country codes or cities then I only get the GEO-IP data – of course, what I want is a ‘correlated result’ that ties the IP to a country so I could ask for something like, “Show me all log entries from city X OR country Y or region Whatever…”  With my very limited knowledge of this tool I don’t see a simple way to do this so my review stops at this point.
  • So far I have only tried a few simple searches and reports – lots of standard features with many options.
  • SA perspective – install process/docs need tweaks for a seamless install.
  • User perspective – looks really promising.
  • This is not a ‘drop it in’ tool – it will take time to evaluate how to get the most benefit from using it but it has lots of potential in streamlining the analysis of system data.  :)
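For the record, the ‘correlated result’ I was after in the GEO-IP note above can be sketched outside Splunk in a few lines of Python. This says nothing about how (or whether) Splunk does it. The file layouts are assumptions: log lines that begin with the client IP, and a GEO-IP file with ip,country_code,city rows. All function names are hypothetical.

```python
import csv
import io


def load_geoip(csv_text):
    """Map each IP to (country_code, city) from 'ip,country_code,city' rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: (row[1], row[2]) for row in reader if len(row) >= 3}


def correlate(log_lines, geoip):
    """Yield (country_code, city, line) for lines whose first field is a known IP."""
    for line in log_lines:
        fields = line.split()
        if fields and fields[0] in geoip:
            country, city = geoip[fields[0]]
            yield country, city, line


def entries_from(log_lines, geoip, countries=(), cities=()):
    """Return log lines whose IP maps to any of the given countries or cities."""
    return [
        line
        for country, city, line in correlate(log_lines, geoip)
        if country in countries or city in cities
    ]
```

With that in place, “show me all log entries from city X OR country Y” becomes entries_from(lines, geo, countries={"Y"}, cities={"X"}). The join is a plain dictionary lookup on the IP, which is the piece I could not find a simple way to express in my searches.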

I plan on attending some of the online webcasts/screencast events for this product.  A quick review reveals quite a number of tools and settings that could be quite helpful when attempting to discover meaning in system data.  A side note – this solution appears to be using Python for at least some portion of its work.

As always, your mileage should vary.  :)


