It seems as if all software today calls itself open source. That usually means free, yet not all of it is free. In some cases, the vendor gives away a free copy of one product and charges for a supported version or one with more features. In other cases, companies with a vested interest in the outcome of the product contribute programmers, time, and money to the open source project.
One well-known example of the second situation is OpenStack, the cloud operating system. Google, Yahoo, and others back it because they wanted an alternative to VMware, which they would have to pay for.
Also, open source can mean community supported. Linux is the best-known example of that.
Open source software comes with different licensing schemes, such as Apache, BSD, GNU, Eclipse, MIT, and Mozilla. Those vary in the details. Copyleft licenses such as the GNU GPL require anyone who distributes a modified version to release those changes under the same license, while permissive licenses such as MIT, BSD, and Apache let you modify the code without contributing improvements back to the project.
Log processing is one area in which open source software dominates. Splunk is one commercial leader, but it is expensive, which gives businesses a reason to look at open source alternatives. NXlog Community Edition is free, but its Enterprise Edition is not. And then there is ELK.
ELK stands for ElasticSearch, Logstash, and Kibana. Those three tools are often used together for log analysis. Many people put the Nginx web server in front as well so they can reach the Kibana web interface on port 80, which is simpler than opening firewall ports or changing the Kibana port.
In this blog post, we swap out Logstash for NXlog and then look at the three products together.
Logs are a headache for most organizations to manage because there are so many of them and there are so many formats. Yet it is necessary to process logs for reasons of security, performance, business analysis, legal requirements, etc.
In addition to ELK, there are various tools you can wrap around logs to do specific things, like Prometheus for performance monitoring or Snort for cybersecurity. And then there is Kibana, which is a visualization and query tool.
Logstash and NXlog act as middlemen between a log source, such as Linux syslog or Windows events, and a log repository, like ElasticSearch.
NXlog reads log data and then sends it somewhere else for storage. You configure it to load different modules, one for each type of data and one for each data destination.
For example, there are input modules to read the Windows Event Log and Linux syslog. There are output modules for sending data to a file, out over TCP, UDP, or HTTPS, or to a messaging server. And of course there is an output module for ElasticSearch.
A tool like NXlog can be used for lots of purposes, including cybersecurity. If you have ever worked with a SIEM (Security Information and Event Management) tool like ArcSight, then you know that the work is made really complicated by the difficulty of correlating events.
For example, a hacker who is attacking a network might first attack the firewall device, which has its own log, and then log in to Active Directory, which has another. The hacker uses different credentials for each. Putting those two events together in ArcSight is very difficult. With NXlog it is still difficult, but easier.
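To make the correlation problem concrete, here is a minimal Python sketch. It is not how NXlog or ArcSight works internally. It assumes the two logs have already been normalized into records with hypothetical `time`, `src_ip`, and `user` fields, and it assumes the attacker comes from the same source IP both times, since the credentials differ and cannot be used to join the events.

```python
from datetime import datetime, timedelta

# Hypothetical, already-normalized events from two separate logs.
firewall_events = [
    {"time": datetime(2024, 1, 1, 10, 0, 0), "src_ip": "203.0.113.7", "user": "admin"},
]
directory_events = [
    {"time": datetime(2024, 1, 1, 10, 3, 0), "src_ip": "203.0.113.7", "user": "jsmith"},
    {"time": datetime(2024, 1, 1, 12, 0, 0), "src_ip": "198.51.100.9", "user": "mjones"},
]


def correlate(first, second, window=timedelta(minutes=5)):
    """Pair events that share a source IP and occur within `window` of each other."""
    pairs = []
    for a in first:
        for b in second:
            same_ip = a["src_ip"] == b["src_ip"]
            close_in_time = abs(b["time"] - a["time"]) <= window
            if same_ip and close_in_time:
                pairs.append((a, b))
    return pairs


matches = correlate(firewall_events, directory_events)
for fw, ad in matches:
    # Different user names, same source address, a few minutes apart.
    print(fw["src_ip"], fw["user"], "->", ad["user"])
```

The join only works because both logs were first put into one common shape, which is the part a log shipper helps with.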
But event correlation is a special use case. What NXlog does best is handle different log formats and put them into common formats, like JSON, for loading into ElasticSearch or elsewhere.
NXlog can also handle the complex case of multi-line logs, where each event coming from a server, application, or device is written as more than one output line.
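The idea behind multi-line handling can be sketched in a few lines of Python. This is an illustration of the general technique, not NXlog's implementation. It assumes that every new event starts with a date such as 2024-01-01 and that any other line is a continuation of the previous event; the sample lines are invented.

```python
import re

# Assumption: a new event starts with an ISO-style date; other lines are continuations.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def group_multiline(lines):
    """Merge continuation lines into the event that precedes them."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if EVENT_START.match(line) or not events:
            events.append(line)          # start of a new event
        else:
            events[-1] += "\n" + line    # continuation of the current event
    return events


sample = [
    "2024-01-01 10:00:00 ERROR Unhandled exception",
    "  at com.example.App.run(App.java:42)",
    "  at com.example.App.main(App.java:10)",
    "2024-01-01 10:00:05 INFO Service restarted",
]
events = group_multiline(sample)
print(len(events))
```

Here the stack trace stays attached to the error that produced it instead of becoming three unrelated records.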
NXlog includes a set of fields that it gathers for all log events, like time, process id, source IP address, etc. But you can also parse data and fields further using regular expressions and even write your own subroutines in Perl.
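As a rough illustration of regular-expression parsing into a common format, the Python sketch below turns one log line into a JSON document. The line layout, the field names (`time`, `hostname`, `program`, `pid`, `message`), and the sample line are assumptions made for the example; they are not NXlog's built-in field names.

```python
import json
import re

# Assumed line layout: "<timestamp> <host> <program>[<pid>]: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<time>\S+) (?P<hostname>\S+) (?P<program>[^\[\s]+)\[(?P<pid>\d+)\]: (?P<message>.*)$"
)


def parse_line(line):
    """Return the line's fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["pid"] = int(event["pid"])  # store the process id as a number
    return event


line = "2024-01-01T10:00:00Z fred sshd[1234]: Failed password for root from 203.0.113.7"
event = parse_line(line)
print(json.dumps(event))
```

Lines that do not fit the pattern return None, so the caller can decide whether to drop them or pass them through unparsed.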
ElasticSearch is a NoSQL document store that uses JSON to store documents. It is written in Java. What is most noteworthy about this product is that it is very fast, both at loading and at querying. You can install a cluster of ElasticSearch servers and run distributed queries across them, much as you would with Hadoop or Spark.
But ElasticSearch is not Hadoop or Spark. Instead, you could use it in place of NoSQL databases such as MongoDB or RavenDB. And given its performance, you can use it to collect logs from applications and machines across a large organization and gather them in one place.
You interact with ElasticSearch by sending JSON documents to its REST API. You can either send documents to its bulk loader URL or send them to a specific index.
For example, you can send documents to a particular index and document type (e.g., sales receipt, Windows event log, syslog).
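Both options can be sketched in Python without contacting a server. The index name `logs`, the document type `syslog`, and the sample documents are placeholders invented for this example. The bulk format shown (one action line followed by one document line, newline-delimited, posted to `/_bulk`) is the one the ElasticSearch bulk API expects; note that newer ElasticSearch releases have dropped document types, so treat `_type` as applying to older versions.

```python
import json


def index_path(index, doc_type):
    """Path for sending a single document to one index and document type."""
    return f"/{index}/{doc_type}"


def bulk_body(index, doc_type, documents):
    """Build a newline-delimited body for the bulk endpoint (/_bulk)."""
    lines = []
    for doc in documents:
        # Action line: tells the bulk loader where the next document goes.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"hostname": "fred", "message": "Failed password for root"},
    {"hostname": "wilma", "message": "Service restarted"},
]
body = bulk_body("logs", "syslog", docs)
print(index_path("logs", "syslog"))
print(body)
```

The resulting string would be sent as the body of an HTTP POST; the host and port of the ElasticSearch server are left out here because they depend on your installation.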
Kibana is built to work with ElasticSearch.
Lots of YouTube presentations and articles on Kibana emphasize its ability to produce visually stunning graphs. But that is not how the average person is going to use it. A time series is important for performance monitoring and big data analytics, but most of the time you are simply looking for ordinary events that match some filter, and Kibana makes that relatively easy.
It does this using the Lucene query syntax. At its simplest, you can just write “*” and all events show up. Or you can filter by one field, for example hostname:"fred". You run queries against index patterns, which select the indices to search and so let you group your data by its source, like “Windows”, “Linux”, or “Amazon CloudWatch”.
You save these queries and then add them to dashboards. So you might have a dashboard for “Tier I Support,” “manufacturing operations,” or “SIEM.” These dashboards can include the results of queries output as a simple table of text, or they can include pie, bar, and scatter charts as shown in the graphic below.
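Lucene separates a field name from its value with a colon. The Python sketch below builds such a filter and wraps it in the kind of `query_string` search body that ElasticSearch accepts over its REST API. The helper names `field_query` and `search_body` are made up for this example, and the sketch only builds the request; it does not send it.

```python
import json


def field_query(field, value):
    """Build a Lucene filter such as hostname:"fred", escaping quotes in the value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def search_body(lucene_query):
    """Wrap a Lucene query string in an ElasticSearch query_string search body."""
    return {"query": {"query_string": {"query": lucene_query}}}


query = field_query("hostname", "fred")
body = search_body(query)
print(query)
print(json.dumps(body))
```

Passing "*" to `search_body` gives the match-everything query mentioned above.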