Tag Archives: security monitoring

Splunk Funk

Recently, I was asked to evaluate an organization’s Splunk deployment. This request flummoxed me, because while I’ve always been a fan of the tool’s capabilities, I’ve never actually designed an implementation or administered it. I love the empowerment of people building their own dashboards and alerts, but this only works when there’s a dedicated Splunk-Whisperer carefully overseeing the deployment and socializing the idea of using it as a self-service, cross-functional tool. As I started my assessment, I entered what can only be called a “dark night of the IT soul,” because my findings have led me to question the viability of most enterprise monitoring systems.

The original implementer had recently moved on to greener pastures and (typically) left only skeletal documentation. As I started my investigation, I discovered a painfully confusing distributed deployment built with little to no understanding of the product’s “best practices.” With no data normalization and almost non-existent data input management, the previous admin had created the equivalent of a Splunk Wild West, allowing most data to flow in with little oversight or control. With an obscenely large number of sources and sourcetypes, the situation horrified Splunk support, who told me my only option was to rebuild, a scenario that filled me with nerd-angst.
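
For reference, disciplined data onboarding in Splunk mostly comes down to declaring inputs and parsing rules explicitly instead of letting the auto-classifier invent a new sourcetype for every log format it sees. A minimal sketch of what that looks like; the monitored path, index name, and props settings below are illustrative, not taken from this deployment:

    # inputs.conf -- assign an explicit index and sourcetype at ingest
    [monitor:///var/log/httpd/access_log]
    index = web
    sourcetype = access_combined
    disabled = false

    # props.conf -- pin line breaking and timestamp parsing for that sourcetype
    [access_combined]
    SHOULD_LINEMERGE = false
    TIME_PREFIX = \[
    TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z

When every input is declared this way, the sourcetype count stays small and predictable, which is exactly what this deployment lacked.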

In the past, I’ve written about the importance of using machine data for infrastructure visibility. It’s critical for security, but also for performance monitoring and troubleshooting. Log correlation and analysis is a key component of any healthy infrastructure, and without it, you’re like a mariner lost at sea. So imagine my horror when confronted by a heaping pile of garbage data thrown into a very expensive application like Splunk.

Most organizations struggle with a monitoring strategy because it simply isn’t sexy. It’s hard to get business leadership excited about dashboards, pie charts, and graphs without contextualizing them in a report. “Yeah baby, let me show you those LOOOOW latency times in our web app.” It’s a hard sell, especially when you see the TCO for on-premises log correlation and monitoring tools. Why not focus instead on improving a product that could bring in more customer dollars, or on a new service to make your users happier? Most shops are so focused on product delivery and firefighting that they simply don’t have cycles left for thinking about proactive service management. So you end up with infrastructure train wrecks and little to no useful monitoring.

While a part of me still believes in using the best tools to gain intelligence and visibility from an infrastructure, I’m tired of struggling. I’m beginning to think I’d be happy with anything, even a Perl script, that works consistently with a low level of effort. I need that data now and no longer have the luxury of waiting until there’s a budget for qualified staff and the right application. Lately, I’m finding it pretty hard to resist the siren song of SaaS log management tools that promise onboarding and insight into machine data within minutes, not hours. Just picture it: no more agents or on-premises systems to manage, just immediate visibility into data. Most other infrastructure components have moved to the cloud, so maybe it’s inevitable for log management and monitoring too. Will I miss the flexibility and power of tools like Splunk and ELK? Probably, but I no longer have the luxury of nostalgia.
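
For what it’s worth, the “anything that works” baseline I have in mind is roughly the sketch below, in Python rather than Perl. The log path, pattern, and webhook URL are placeholders, not a recommendation of any particular alerting pipeline:

    # watch_log.py -- minimal, low-effort log watcher: tail a file, alert on a pattern.
    # The path, pattern, and webhook URL are hypothetical placeholders.
    import json
    import re
    import time
    import urllib.request

    LOG_PATH = "/var/log/auth.log"            # placeholder log file
    PATTERN = re.compile(r"Failed password")  # placeholder "interesting event" pattern
    WEBHOOK = "https://example.com/alert"     # placeholder alerting endpoint

    def alert(line: str) -> None:
        """Send the matching line to a webhook (could just as easily be email)."""
        payload = json.dumps({"alert": line}).encode()
        req = urllib.request.Request(
            WEBHOOK, data=payload, headers={"Content-Type": "application/json"}
        )
        urllib.request.urlopen(req)

    def follow(path: str):
        """Yield new lines appended to the file, like `tail -f`."""
        with open(path) as f:
            f.seek(0, 2)  # start at end of file
            while True:
                line = f.readline()
                if not line:
                    time.sleep(1.0)
                    continue
                yield line

    if __name__ == "__main__":
        for line in follow(LOG_PATH):
            if PATTERN.search(line):
                alert(line.strip())

It’s crude, but it runs every day with no licensing, no agents, and no care and feeding, which is more than I can say for some enterprise deployments I’ve seen.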



Security Vs. Virtualization

Recently, I wrote an article for Network Computing about the challenges of achieving visibility in the virtualized data center.

Security professionals crave data. Application logs, packet captures, audit trails: we’re vampiric in our need for information about what’s occurring in our organizations. Seeking omniscience, we believe that more data will help us detect intrusions, feeding the inner control freak that still believes prevention is within our grasp. We want to see it all, but the ugly reality is that most of us fight the feeling that we’re flying blind.

In the past, we begged for SPAN ports from the network team, frustrated with packet loss. Then we bought expensive security appliances that used prevention techniques and promised line-rate performance, but were often disappointed when they didn’t “fail open,” creating additional points of failure and impacting our credibility with infrastructure teams.

So we upgraded to taps and network packet brokers, hoping this would offer increased flexibility and insight for our security tools while easing fears of unplanned outages. We even created full network visibility layers in the infrastructure, thinking we were finally ahead of the game.

Then we came face-to-face with the nemesis of visibility: virtualization. It became clear that security architecture would need to evolve in order to keep up.

You can read and comment on the rest of the article here.


Is Your Security Architecture Default-Open or Default-Closed?

One of the most significant failures I see in organizations is a fundamental misalignment between Operations and Security over the default network state. Is it default-open or default-closed? And I’m talking about more than the fail-open or fail-closed configuration of your security controls.

Every organization must make a philosophical choice regarding its default security state and the risk it’s willing to accept. For example, you may want to take a draconian approach, i.e., shoot first and ask questions later. This means that once you’re notified of an incident, enforcement stands and you only resume normal operations after validating the event as benign.

But what if the security control detecting the incident negatively impacts operations through enforcement? If your business uptime is too critical to risk unnecessary outages, you may decide to continue operating until a determination is made that an event is actually malicious.
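
To make the contrast concrete, here’s a rough sketch of the same detection event handled under each posture. None of these function names come from a real product; they stand in for whatever enforcement and triage mechanisms your organization actually has:

    # Hypothetical illustration of default-closed vs. default-open incident handling.
    from enum import Enum

    class Posture(Enum):
        DEFAULT_CLOSED = "default-closed"  # block first, validate later
        DEFAULT_OPEN = "default-open"      # keep running, block only if confirmed malicious

    def handle_detection(event: dict, posture: Posture) -> str:
        if posture is Posture.DEFAULT_CLOSED:
            block(event)                     # enforcement happens immediately
            if triage(event) == "benign":
                unblock(event)               # resume normal operations after validation
                return "resumed"
            return "contained"
        else:
            if triage(event) == "malicious":  # operations continue while analysts decide
                block(event)
                return "contained"
            return "never interrupted"

    # Placeholder implementations so the sketch runs as-is.
    def block(event): print(f"blocking {event['src']}")
    def unblock(event): print(f"unblocking {event['src']}")
    def triage(event): return event.get("verdict", "benign")

    if __name__ == "__main__":
        print(handle_detection({"src": "10.0.0.5", "verdict": "benign"}, Posture.DEFAULT_CLOSED))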

Both choices can be valid, depending upon your risk appetite. But you must make a choice, socializing that decision within your organization. Otherwise, you’re left with confusion and conflict over how to proceed during an incident.



Mythology and the OPM Hack

Seems like every security “thought leader” on the planet has commented on the OPM hack, so I might as well join in.

Although the scope of the breach is huge, there’s nothing all that new here. In fact, it’s depressing how familiar the circumstances sound to those of us who work as defenders in an enterprise. For the moment, ignore attribution, because it’s a distraction from the essential problem. OPM was failing security kindergarten. They completely neglected the rudimentary basics of security: patching vulnerabilities, keeping operating systems upgraded, requiring multi-factor authentication for access to critical systems, and running intrusion detection.

Being on a security team in an organization often means that your cries of despair land on deaf ears, much like those of the mythical Cassandra. She was the daughter of the Trojan king Priam and greatly admired by Apollo, who gave her the gift of prophecy. When she spurned his affections, he converted the gift into a curse: her predictions were still true, but no one would believe them.

As a recent Washington Post story reminded us, many in security have been predicting this meltdown since the ’90s. Now that IT has become a critical component of most organizational infrastructures, there’s more at stake and we’re finally getting the attention we’ve been demanding. But it may be too late in the game, leaving worn-out security pros feeling like the Trojan War’s patron saint of “I told you so’s,” Cassandra.



The MSSP Is the New SIEM

In the last year, I’ve come to a realization about incident management: in most cases, buying a SIEM is a waste of money for the enterprise. The software and licensing costs aren’t trivial, and some products use what I like to call the “heroin dealer” model of consumption-based licensing. The first taste is free or inexpensive, but once you’re hooked, prepare to hand over your checkbook, because the costs often spiral out of control as you add more devices. Additionally, for most small to medium organizations, the complicated configuration often requires a consulting company to assist with the initial implementation and at least one full-time employee to manage and maintain it. Even then, you won’t really have 24×7 monitoring and alerting, because most can’t afford a large enough staff to work in shifts, which means you’re dependent upon email or text alerts. That’s not very useful if your employees actually have lives outside of work. Most often, what you’ll see is an imperfectly implemented SIEM that becomes a noise machine delivering little to no value.

The SIEM’s dirty secret is that it’s a money pit. Once you add up the software and licensing costs, the professional services you spend getting it deployed and regularly upgraded, the hardware, the annual support contract, and staffing, you’re looking at a sizable investment. Now ask yourself: are you really reducing risk with a SIEM, or just checking a box on a compliance list?
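
As a back-of-the-envelope illustration, and only that, the arithmetic looks something like the sketch below. Every figure is a made-up placeholder, not real vendor pricing:

    # Hypothetical SIEM cost-of-ownership arithmetic; all figures are placeholder
    # estimates for illustration only, not quotes from any vendor.
    yearly_costs = {
        "software_and_licensing": 150_000,   # consumption/ingest-based license
        "annual_support": 30_000,            # vendor maintenance contract
        "professional_services": 40_000,     # deployment and upgrades, amortized per year
        "hardware_or_infrastructure": 25_000,
        "staffing": 120_000,                 # at least one dedicated FTE
    }

    total = sum(yearly_costs.values())
    print(f"Estimated annual SIEM spend: ${total:,}")  # -> $365,000 with these placeholders

Swap in your own numbers; the point is that the license fee is usually the smallest part of the bill.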

Alternatively, let’s look at the managed security service provider (MSSP). For an annual fee, this outsourced SOC will ingest and correlate your logs, set up alerts, and monitor and/or manage devices 24×7, 365 days a year. An MSSP’s level-1 and level-2 staff significantly reduce the amount of repetitive work and noise your in-house security team must deal with, making it less likely that critical incidents are missed. The downside is that the service is often mediocre, leaving one with the sneaking suspicion that these companies are happy to employ any warm body to answer the phone and put eyeballs on a screen. This means that someone has to manage the relationship, ensuring that service level agreements are met.

While there are challenges with outsourcing, the MSSP is a great lesson in economies of scale. The MSSP is more efficient at delivering service because it performs the same functions for many customers. While not cutting-edge or innovative, the service is often good enough to allow a security team to focus on the incidents that matter without having to sift through the noise themselves. The caveat? While useful in the short term, security teams should still focus on building proactive controls with automation and anomaly detection for improved response. After all, the real goal is to make less garbage, not more sanitation workers.
