
Fear and Loathing in Security Dashboards

Recently a colleague asked for my help in understanding why he was seeing a specific security alert on a dashboard. The message said that his database instance was “exposed to a broad public IP range.” He disagreed with this assessment because it misrepresented the configuration context: while the database had a public IP, only one port was open, and the instance sat behind a proxy. Access to this test instance was also restricted to “authorized” IP address ranges. I explained that this is the kind of information security practitioners want as they evaluate risk, but then thought, “Is this a reasonable alert for a user, or just more noise?” When did security dashboards become like the news: more information than we can reasonably take in, overloading our cognitive faculties and creating stress?

I have a complicated relationship with security dashboards. I understand that different teams need a quick view of what to prioritize, but findings are broadly categorized as high, medium, and low without much background. This approach can create confusion and disagreements between groups, because the categories are generally aligned to the Vienna Convention on Road Signs and Signals: green is good, red is bad, and yellow means caution. The problem is that a lot of findings end up red or yellow, with categorization dependent on how well the security team has tuned alerts and on your organization’s risk tolerance. Most nuance is lost.

The other problem is that this categorization isn’t only a prioritization technique. It can communicate danger. As humans, we have learned to associate red on a dashboard with some level of threat. (This might be why some people develop fanariphobia, a fear of traffic lights.) Is this an intentional design choice? Historically, Protection Motivation Theory (PMT), which explains how humans are motivated to protect themselves when threatened, has been used as a standard justification within cybersecurity for fear appeals. But what if this doesn’t work as well as we think it does? A recent academic paper reviewed the literature in this space and found conflicting data on the value of fear appeals in promoting voluntary security behaviors; they often backfire, leading to a reduction in desired responses. What does work? The researchers identify Stewardship Theory as a more efficacious approach to improving employees’ security behaviors. They define it as “a covenantal relationship between the individual and the organization” which “connects both employee and employer to work toward a common goal, characterized by moral commitment between employees and the organization.”

Am I suggesting you should throw your security dashboards away? No, but I think we can agree that they’re a limited view, which can exacerbate conflict between teams. Instead of being the end of a conversation, they should be the beginning, a dialog tool that encourages a collaborative discussion between teams about risk.


Introducing: Security’s Social Problem

I’m releasing a new video series on the interpersonal challenges in cybersecurity and how they become the biggest hurdle to reducing risk in an organization. According to the 2023 Verizon Data Breach Investigations Report (DBIR), 74% of all breaches include a human element. For this reason, I think it’s time we address how to build the relational skills needed to manage more effective programs. Over the coming weeks I’ll be discussing some of the challenges in this area and approaches organizations can use.


Supply Chain Security Jumps the Shark

Can we collectively agree that the supply chain security discussion has grown tiresome? Ten years ago, I couldn’t get anyone outside the federal government crowd to pay attention to the supply chain, but now it continues to be the security topic du jour. And while this might seem like a good thing, it’s increasingly becoming a distraction from other product security topics, crowding out meaningful discussions about secure software development. So, like a once-loved, long-running TV show that has worn out its welcome but reaches for gimmicks to keep everyone’s attention, I’m officially declaring that supply chain security has jumped the shark.

First, let’s clarify the meaning of the term supply chain security. Contrary to what some believe, it’s not synonymous with the software development lifecycle (SDLC). That’s right, it’s time for a NIST definition! NIST, the National Institute of Standards and Technology, defines the supply chain broadly, because the term covers anything acquired by an organization:

…the term supply chain refers to the linked set of resources and processes between and among multiple levels of an enterprise, each of which is an acquirer that begins with the sourcing of products and services and extends through the product and service life cycle.

Given the definition of supply chain, cybersecurity risks throughout the supply chain refers to the potential for harm or compromise that may arise from suppliers, their supply chains, their products, or their services. Cybersecurity risks throughout the supply chain are the results of threats that exploit vulnerabilities or exposures within products and services that traverse the supply chain or threats that exploit vulnerabilities or exposures within the supply chain itself.

(If you’re annoyed by the US-centric discussion, I encourage you to review the ISO 28000 series on supply chain security management, which I haven’t included here because they charge more than $600 to download the standards.)

Typically, supply chain security refers to third parties, which is why the term is most often used in relation to open source software (OSS). You didn’t create the OSS you’re using, and it exists outside your own SDLC, so you need processes and capabilities in place to evaluate it for risk. But you also need to consider the commercial off-the-shelf software (COTS) you acquire. Consider SolarWinds: a series of attacks against the public and private sectors was caused by a breach of a commercial product, and that compromise is what allowed malicious parties into SolarWinds customers’ internal networks. This isn’t a new concept; it just gained widespread attention due to the pervasive use of SolarWinds as an enterprise monitoring system. Most organizations with procurement processes include robust third-party security programs for this reason, but they aren’t perfect.
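
Evaluating an OSS dependency for known vulnerabilities doesn’t have to be heavyweight. Here’s a minimal sketch, assuming you want to query the public OSV.dev database; the package name and version below are placeholders:

```python
# Minimal sketch: check one OSS dependency against the OSV.dev
# vulnerability database. The package name/version are placeholders.
import requests

def known_vulns(name: str, version: str, ecosystem: str = "PyPI") -> list:
    resp = requests.post(
        "https://api.osv.dev/v1/query",
        json={"version": version,
              "package": {"name": name, "ecosystem": ecosystem}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("vulns", [])

for vuln in known_vulns("jinja2", "2.4.1"):
    print(vuln["id"], vuln.get("summary", ""))
```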

If supply chain security isn’t a novel topic and isn’t inclusive of the entire SDLC, then why does it continue to captivate the attention of security leaders? Maybe because it presents a measurable, systematic approach to addressing application security issues. Vulnerability management is attractive because it offers the comforting illusion that if you do the right things, like updating OSS, you’ll beat the security game. Unfortunately, the truth is far more complicated. Just take a look at the following diagram that illustrates the typical elements of a product security program:

[Diagram: typical elements of a product security program]

Executives want uncomplicated answers when they ask, “Are we secure?” They often feel overwhelmed by security discussions because they want to focus on what they were hired for: running a business. As security professionals, we need to remember this motivation as we build programs to comprehensively address security risk. We should be giving our organizations what they need, not more empty security promises based on the latest trends.


Compliance As Property

In engineering, a common approach to security concerns is to address those requirements after delivery. This is inefficient for the following reasons:

  • It fails to consider how the requirements could be integrated during development, which would avoid reengineering to accommodate them later.
  • It disempowers engineering teams by outsourcing compliance, and the understanding of the requirements, to another group.

To improve individual and team accountability, it is recommended to borrow a key concept from Restorative Justice: Conflict as Property. This concept asserts that the disempowerment of individuals in Western criminal justice systems is the result of ceding ownership of conflict to a third party. Similarly, enterprise security programs often operate as “policing” systems, with engineering teams treating security requirements as owned by a compliance group. While this appears efficient, it results in the siloing of compliance activities and the infantilization of engineering teams.

Does this mean that engineering teams must become deep experts in all aspects of information security? How can they own security requirements without a full grounding in these concepts? Ownership does not necessarily imply expertise. While one may own a house or vehicle and be responsible for maintenance, most owners will understand when outside expertise is required.

The ownership of all requirements by an engineering team is critical for accountability. To proactively address security concerns, a team must see these requirements as their “property” to address them efficiently during the design and development phases. It is neither effective nor scalable to hand off the management of security requirements to another group. While an information security office can and should validate that requirements have been met in support of Separation of Duties (SoD), ownership for implementation and understanding belongs to the engineering team. 


DevSecOps Decisioning Principles

I know you’ve heard this before, but DevOps is not about tools. At its core, DevOps is really a supply chain for efficiently delivering software. At various stages of the process, you need testing and validation to ensure the delivery of a quality product. With that in mind, DevSecOps should adhere to certain principles to best support the automated SDLC process. To this end, I’ve developed a set of fundamental propositions for the practice of good DevSecOps.

  • Security tools should integrate as decision points in a DevOps pipeline, aka DevSecOps.
  • DevSecOps tools should have a policy engine that can respond with a pass/fail decision for the pipeline.
    • This optimizes response time.
    • It supports separation of duties (SoD) by externalizing security decisions outside the pipeline.
    • “Fast and frugal” decisioning is preferred over customized scoring, to better support velocity and consistency.
    • It does not exclude the need for detailed information provided as pipeline output.
  • Full inspection of the supply chain element to be decisioned, aka the “slow path,” should be used when an element is unknown to the pipeline decisioner.
  • Minimal or incremental inspection of the supply chain element, aka the “fast path,” should be used when an element is recognized (e.g., by hash) by the pipeline decisioner; a minimal sketch of this appears after the list.
  • Decision points should have a “fast path” available, where possible, to minimize any latency introduced by security decisioning.
  • There should be no attempt to use customized risk scores in the pipeline. While temporal and contextual elements are useful in reporting and in judging how to mitigate operational risk, custom scores in a pipeline can unnecessarily complicate the decisioning process, create inconsistency and decrease pipeline performance.
  • Security policy engines should not be managed by the pipeline team, but externally by a security SME, to comply with SoD and reduce opportunities for subversion of security policy decisions during automation.
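
To make the fast/slow-path idea concrete, here’s a minimal sketch of a pipeline decisioner, assuming a content hash is a sufficient recognition key; the in-memory cache and inspection stub are illustrative, not a production design:

```python
# Minimal sketch of "fast path" vs. "slow path" decisioning.
# A real decisioner would persist results and expire them per policy.
import hashlib

_decision_cache: dict[str, bool] = {}  # artifact hash -> prior pass/fail

def full_inspection(artifact: bytes) -> bool:
    """Slow path: full inspection of an unknown element (stub)."""
    return b"do-not-ship" not in artifact  # stand-in for real scanning

def decide(artifact: bytes) -> bool:
    digest = hashlib.sha256(artifact).hexdigest()
    if digest in _decision_cache:        # fast path: element recognized
        return _decision_cache[digest]
    verdict = full_inspection(artifact)  # slow path: element unknown
    _decision_cache[digest] = verdict
    return verdict
```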

Using a master policy engine, such as the Open Policy Agent (OPA), is an ideal way to “shift left” by providing a validation capability-as-a-service that can be integrated at different phases into the development and deployment of applications. Ideally, this allows the decoupling of compliance from control, reducing bottlenecks and inconsistency in the process from faulty security criteria integrated into pipeline code. By using security policy-as-code that is created and managed by security teams, DevSecOps will align more closely with the rest of the SDLC. Because at the end of the day, the supply chain is only as good as the product it delivers.
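
As an illustration, here’s a minimal sketch of a pipeline decision point querying an OPA server’s data API for a pass/fail verdict. The policy path (pipeline/deploy/allow) and input fields are hypothetical; the real ones would be defined by your security team’s policy-as-code:

```python
# Minimal sketch: a pipeline stage asking OPA for a pass/fail decision.
# Assumes an OPA server on localhost:8181; the policy package path and
# input fields are hypothetical.
import sys
import requests

OPA_URL = "http://localhost:8181/v1/data/pipeline/deploy/allow"

def pipeline_gate(image_digest: str, sbom_ok: bool) -> None:
    resp = requests.post(
        OPA_URL,
        json={"input": {"image_digest": image_digest, "sbom_ok": sbom_ok}},
        timeout=5,
    )
    resp.raise_for_status()
    # OPA returns {"result": <value>}; an absent result means undefined/deny.
    if not resp.json().get("result", False):
        sys.exit("Security policy denied this artifact")  # fail the pipeline

pipeline_gate("sha256:0123abcd", sbom_ok=True)
```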


Your Pets Don’t Belong in the Cloud

At too many organizations, I’ve seen a dangerous pattern in migrations to public Infrastructure as a Service (IaaS), i.e. the cloud. It’s often approached like a colo or a data center hosting service, and the result is eventual failure of the initiative due to massive cost overruns and terrible performance. Essentially, this can be attributed to inexperience on the organization’s side and a cloud provider business model based on consumption. The end result is usually layoffs and reorgs while senior leadership shakes its head: “But it worked for Netflix!”

Based on my experience with various public and hybrid cloud initiatives, I can offer the following advice.

  1. Treat public cloud like an application platform, not traditional infrastructure. That means you should have reference models and Infrastructure-as-Code (IaC) templates for the deployment of architecture and application components that have undergone security and peer reviews in advance. Practice “policy as code” by working with cloud engineers to build security requirements into IaC.
  2. Use public cloud like an ephemeral ecosystem with immutable components. Translation: your “pets” don’t belong there, only cattle. Deploy resources to meet demand and establish expiration dates (a minimal sketch of enforcing these appears after this list). Don’t attempt to migrate your monolithic application without significant refactoring to make it cloud-friendly. If you need to change a configuration or resize, redeploy. Identify validation points in your cloud supply chain where you can catch vulnerable systems and components prior to deployment, because it reduces your attack surface AND it’s cheaper. You should also have monitoring in place (AWS Config or a third-party app) that catches any deviation and automatically remediates. You want cloud infrastructure that is standardized, secure and repeatable.
  3. Become an expert in understanding the cost of services in public cloud. Remember, it’s a consumption model and the cloud provider isn’t going to lose any sleep over customers hemorrhaging money due to bad design.
  4. Hybrid cloud doesn’t mean creating inefficient design patterns based on dependencies between public cloud and on-premise infrastructure. You don’t do this with traditional data centers, so why would you do it with hybrid cloud?
  5. Hire experienced automation engineers/developers to lead your cloud migration and train staff who believe in the initiative. Send the saboteurs home early on or you’ll have organizational chaos.
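
As an example of enforcing the expiration dates from item 2, here’s a minimal boto3 sketch that reaps EC2 instances whose hypothetical ExpireAt tag has passed. It assumes configured credentials and region, and an ISO 8601 tag value with a timezone:

```python
# Minimal sketch: terminate EC2 instances past their (hypothetical)
# "ExpireAt" tag, enforcing ephemerality. ExpireAt is assumed to be
# ISO 8601 with a timezone, e.g. "2030-01-01T00:00:00+00:00".
from datetime import datetime, timezone
import boto3

ec2 = boto3.client("ec2")
now = datetime.now(timezone.utc)

resp = ec2.describe_instances(
    Filters=[{"Name": "tag-key", "Values": ["ExpireAt"]},
             {"Name": "instance-state-name", "Values": ["running"]}]
)

expired = []
for reservation in resp["Reservations"]:
    for inst in reservation["Instances"]:
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        if datetime.fromisoformat(tags["ExpireAt"]) <= now:
            expired.append(inst["InstanceId"])

if expired:
    ec2.terminate_instances(InstanceIds=expired)
```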

If software ate the world, it burped out the Cloud. If you don’t approach this initiative with the right architecture, processes and people, there aren’t enough fancy tools in the world to help you clean up the result: organizational indigestion.



The Five Stages of Cloud Grief

Over the last five years as a security architect, I’ve been at organizations in various phases of cloud adoption. During that time, I’ve noticed that the most significant barrier isn’t technical. In many cases, public cloud is actually a step up from an organization’s on-premise technical debt.

One of the main obstacles to migration is emotional, and it can derail a cloud strategy faster than any technical roadblock. Our organizations are still filled with carbon units with messy emotions, who can quietly sabotage the initiative.

The emotional trajectory of an organization attempting to move to the public cloud can be illustrated through the Five Stages of Cloud Grief, which I’ve based on the Kübler-Ross Grief Cycle.

  1. Denial – Senior leadership tells the IT organization they’re spending too much money and that they need to move everything to the cloud, because it’s cheaper. The CIO curls into the fetal position under his desk. Infrastructure staff eventually hear about the new strategy and run screaming to the data center, grabbing onto random servers and switches. Other staff hug each other and cry tears of joy, hoping that they can finally get new services deployed before they retire.
  2. Anger – IT staff shows up at all-hands meeting with torches and pitchforks calling for the CIO’s blood and demanding to know if there will be layoffs. The security team predicts a compliance apocalypse. Administrative staff distracts them with free donuts and pizza.
  3. Depression – CISO tells everyone cloud isn’t secure and violates all policies. Quietly packs a “go” bag and stocks bomb shelter with supplies. Infrastructure staff are forced to take cloud training, but continue to miss project timeline milestones while they refresh their resumes and LinkedIn pages.
  4. Bargaining – After senior leadership sets a final “drop dead” date for cloud migration, IT staff complain that they don’t have enough resources. New “cloud ready” staff are hired and enter the IT Sanctum Sanctorum like the Visigoths invading Rome. The information security team presents a threat intelligence report showing that $THREAT_ACTOR_DU_JOUR has pwned public cloud.
  5. Acceptance – 75% of cloud migration goal is met, but since there wasn’t a technical strategy or design, the Opex is higher and senior leadership starts wearing diapers in preparation for the monthly bill. Most of the “cloud ready” staff has moved on to the next job out of frustration and the only people left don’t actually understand how anything works.



Infrastructure-as-Code Is Still *CODE*

After working in a DevOps environment for over a year, I’ve become an automation acolyte. The future is here and I’ve seen the benefits when you get it right: improved efficiency, better control and fewer errors. However, I’ve also seen the dark side with Infrastructure-as-Code (IaC). Bad things happen because people forget that it’s still code and it should be subject to the same types of security controls you use in the rest of your SDLC.

That means including automated or manual reviews, threat modeling and architectural risk assessments. Remember, you’re not only looking for mistakes in provisioning your infrastructure or opportunities for cost control. Some of this code might introduce vulnerabilities that could be exploited by attackers. Are you storing credentials in the code? Are you calling scripts or homegrown libraries and has that code been reviewed? Do you have version control in place? Are you using open source tools that haven’t been updated recently? Are your security groups overly permissive?
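
To make the credentials question concrete, here’s a minimal sketch that greps IaC files for the well-known AWS access key ID pattern; real scanners (e.g., trufflehog or detect-secrets) cover far more secret types:

```python
# Minimal sketch: scan IaC files for hardcoded AWS access key IDs.
# AKIA[0-9A-Z]{16} matches the well-known key ID format; a real
# secret scanner covers many more credential types.
import re
from pathlib import Path

KEY_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")
IAC_SUFFIXES = {".tf", ".yaml", ".yml", ".json"}

for path in Path(".").rglob("*"):
    if path.is_file() and path.suffix in IAC_SUFFIXES:
        for lineno, line in enumerate(
                path.read_text(errors="ignore").splitlines(), start=1):
            if KEY_PATTERN.search(line):
                print(f"{path}:{lineno}: possible hardcoded AWS key")
```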

IaC is CODE. Why aren’t you treating it that way?



NTP Rules of the Road

There’s nothing more infuriating than watching organizations screw up foundational protocols, and NTP seems to be one of the most commonly misconfigured. For some reason, people seem to think the goal is to have “perfect” time, when what is really needed is consistent organizational time. You need everything within a network to be synchronized for troubleshooting and incident management purposes. Otherwise, you’re going to waste a lot of energy identifying root causes and attacks.

It’s recommended to synchronize a few external-facing systems or devices at your network perimeter against public stratum one servers, but only if you don’t have your own stratum zero GPS source with a stratum one server attached. I can’t tell you how many times I’ve seen a network team go to the trouble of setting this up while the systems people still point everything at ntp.org.

Everything inside a network should cascade from those perimeter devices, which are usually a router, an Active Directory system or a stratum one server. This design reduces the possibility of internal time drift, the load on public NTP servers and your firewalls, and the organizational risk of opening unnecessary ports for outgoing traffic to the Internet. Over the last few years, some serious vulnerabilities have been identified in the protocol, and it can also be used as a data exfiltration channel by attackers.
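
One way to check for consistent organizational time is to poll your internal servers and compare their clock offsets. Here’s a minimal sketch using the ntplib package; the host names and tolerance are placeholders:

```python
# Minimal sketch: verify internal NTP consistency by comparing clock
# offsets across hosts. Host names are placeholders; requires `ntplib`.
import ntplib

HOSTS = ["ntp1.internal.example.com", "ntp2.internal.example.com"]
TOLERANCE = 0.1  # seconds of relative drift you're willing to accept

client = ntplib.NTPClient()
offsets = {}
for host in HOSTS:
    try:
        offsets[host] = client.request(host, version=3).offset
    except ntplib.NTPException as exc:
        print(f"{host}: query failed ({exc})")

if offsets and max(offsets.values()) - min(offsets.values()) > TOLERANCE:
    print("WARNING: internal time drift exceeds tolerance:", offsets)
```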

In addition to the IETF’s draft on NTP “best practices,” the SEI also has an excellent guidance document.

While it’s not realistic to have your own stratum zero device in the cloud, within AWS it’s recommended to use the designated NTP pool specified in their documentation.

Oh, and for the love of all that is holy, please use UTC. I cannot understand why I’m still having this argument with people.


Security Group Poop

One of the most critical elements of an organization’s security posture in AWS is the configuration of security groups. In my architectural reviews, I often see rules that are confusing, overly permissive and without any clear business justification for the access allowed. Basically, the result is a big, steaming pile of security turds.

While I understand many shops don’t have dedicated network or infrastructure engineers to help configure their VPCs, AWS has created some excellent documentation to make it a bit easier to deploy services there. You can and should plow through the entirety of this information. But for those with short attention spans or very little time, I’ll point out some key principles and “best practices” that you must grasp when configuring security groups:
  • A VPC automatically comes with a default security group, and each instance launched in that VPC will be associated with it unless you assign a different security group.
  • “Allow” rules are explicit; “deny” rules are implicit. With no rules, the default behavior is “deny.” If you want to authorize ingress or egress access, you add a rule; if you remove a rule, you’re revoking access.
  • The default rule for a security group denies all inbound traffic and permits all outbound traffic. It is a “best practice” to remove this default outbound rule, replacing it with granular rules that allow only the outbound traffic specifically needed by the systems and services in the VPC (a minimal sketch of this appears after the list).
  • Security groups are stateful. This means that if you allow inbound traffic to an instance on a specific port, the return traffic is automatically allowed, regardless of outbound rules.
  • Use cases requiring both inbound and outbound rules for application functionality include:
    • ELB/ALBs – if the default outbound rule has been removed from the security group containing an ELB/ALB, an outbound rule must be configured to forward traffic to the instances hosting the service(s) being load balanced.
    • Instances that must forward traffic to a system or service outside their configured security group.
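
To make the “replace the default outbound rule” practice concrete, here’s a minimal boto3 sketch; the group ID, port and CIDR are placeholders:

```python
# Minimal sketch: replace a security group's default allow-all egress
# with one granular rule. Group ID, port and CIDR are placeholders.
import boto3

ec2 = boto3.client("ec2")
SG_ID = "sg-0123456789abcdef0"

# Revoke the default "all traffic to 0.0.0.0/0" outbound rule.
ec2.revoke_security_group_egress(
    GroupId=SG_ID,
    IpPermissions=[{"IpProtocol": "-1",
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)

# Allow only what the service actually needs, e.g. HTTPS to one subnet.
ec2.authorize_security_group_egress(
    GroupId=SG_ID,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                    "IpRanges": [{"CidrIp": "10.0.2.0/24",
                                  "Description": "app tier HTTPS"}]}],
)
```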
AWS provides documentation, including security group templates, covering multiple use cases.
Security groups are more effective when layered with Network ACLs, providing an additional control to help protect your resources in the event of a misconfiguration. But there are some important differences to keep in mind according to AWS:
  • Scope: a security group operates at the instance level (first layer of defense); a network ACL operates at the subnet level (second layer of defense).
  • Rules: security groups support allow rules only; network ACLs support both allow and deny rules.
  • State: security groups are stateful (return traffic is automatically allowed, regardless of any rules); network ACLs are stateless (return traffic must be explicitly allowed by rules).
  • Evaluation: all security group rules are evaluated before deciding whether to allow traffic; network ACL rules are processed in number order.
  • Association: a security group applies to an instance only if someone specifies it when launching the instance, or associates it later; a network ACL automatically applies to all instances in the subnets it’s associated with (a backup layer of defense, so you don’t have to rely on someone specifying the security group).
Additionally, the AWS Security Best Practices document makes the following recommendations:
  • Always use security groups: They provide stateful firewalls for Amazon EC2 instances at the hypervisor level. You can apply multiple security groups to a single instance, and to a single ENI.
  • Augment security groups with Network ACLs: They are stateless but they provide fast and efficient controls. Network ACLs are not instance-specific so they can provide another layer of control in addition to security groups. You can apply separation of duties to ACLs management and security group management.
  • For large-scale deployments, design network security in layers. Instead of creating a single layer of network security protection, apply network security at external, DMZ, and internal layers. 

For those who believe the purchase of some vendor magic beans (i.e. a product) will instantly fix the problem, get ready for disappointment. You won’t be able to configure that tool properly for enforcement until you understand how security groups work and what the rules should be for your environment.
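
If you want a quick read on how big the pile is, a few lines of boto3 will do. Here’s a minimal audit sketch that flags world-open ingress rules, assuming credentials and region are configured:

```python
# Minimal sketch: flag security group ingress rules open to the world.
import boto3

ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_security_groups")

for page in paginator.paginate():
    for sg in page["SecurityGroups"]:
        for perm in sg["IpPermissions"]:
            for ip_range in perm.get("IpRanges", []):
                if ip_range.get("CidrIp") == "0.0.0.0/0":
                    print(f"{sg['GroupId']} ({sg['GroupName']}): "
                          f"world-open on port {perm.get('FromPort', 'all')}")
```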

