Software vulnerability reports and reports of software exploitations continue to grow at an alarming rate, and a significant number of these reports result in technical security alerts. To address this growing threat to the government, corporations, educational institutions, and individuals, systems that are free of software vulnerabilities must be developed.

Coding errors cause the majority of software vulnerabilities. For example, 64 percent of the nearly 2,500 vulnerabilities in the National Vulnerability Database in 2004 were caused by programming errors [Heffley 2004].

Java is a relatively secure language. It has no explicit pointer manipulation; array and string bounds are automatically checked; attempts at referencing a null pointer are trapped; the arithmetic operations are well defined and platform independent, as are the type conversions. The built-in bytecode verifier ensures that these checks are always in place. Moreover, Java provides comprehensive, fine-grained security mechanisms that can control access to individual files, sockets, and other sensitive resources.

Java program safety, however, can be compromised. The remainder of this chapter describes scenarios in which Java programs might be exploited and gives examples of rules that mitigate these attacks. Not all of the rules apply to all Java language programs; frequently, their applicability depends on how the software is deployed and your assumptions concerning trust.

Input Validation and Data Sanitization

Leaking Sensitive Data

Type Safety

Leaking Capabilities

Denial of Service

Libraries

Concurrency, Visibility, and Memory

Privilege Escalation

Intro and stuff on SCALe.

The Myth of Trust

Software programs often contain multiple components that act as subsystems, where each component operates in one or more trusted domains. For example, one component may have access to the file system but lack access to the network, while another component has access to the network but lacks access to the file system. Distrustful decomposition and privilege separation [Dougherty 2009] are examples of secure design patterns that recommend reducing the amount of code that runs with special privileges by designing the system using mutually untrusting components.

When components with differing degrees of trust share data, the data are said to flow across a trust boundary. Because Java allows components under different trusted domains to communicate with each other, data can be transmitted across a trust boundary. Furthermore, a Java program can contain both internally developed and third-party code. Data that are transmitted to or accepted from third-party code also flow across a trust boundary.

While software components can obey policies that allow them to transmit data across trust boundaries, they cannot specify the level of trust given to any component. The deployer of the application must define the trust boundaries with the help of a system-wide security policy. A security auditor can use that definition to determine whether the software adequately supports the security objectives of the application.

Third-party code should operate in its own trusted domain; any code potentially exported to a third party, such as libraries, should be deployable in well-defined trusted domains. The public API of the potentially exported code can be considered a trust boundary. Data flowing across a trust boundary should be validated when the publisher lacks guarantees of validation. A subscriber or client may omit validation when the data flowing into its trust boundary are appropriate for use as is. In all other cases, inbound data must be validated.

Injection Attacks

Data received by a component from a source outside the component's trust boundary may be malicious. Consequently, the program must take steps to ensure that the data are both genuine and appropriate.

These steps can include the following:

Validation: Validation is the process of ensuring that input data fall within the expected domain of valid program input. For example, not only must method arguments conform to the type and numeric range requirements of a method or subsystem, but also they must contain data that conform to the required input invariants for that method.

Sanitization: In many cases, the data may be passed directly to a component in a different trusted domain. Data sanitization is the process of ensuring that data conform to the requirements of the subsystem to which they are passed. Sanitization also involves ensuring that data conform to security-related requirements regarding the leaking or exposure of sensitive data when output across a trust boundary. Sanitization may include the elimination of unwanted characters from the input by removing, replacing, encoding, or escaping the characters. Sanitization may occur following input (input sanitization) or before the data are passed across a trust boundary (output sanitization). Data sanitization and input validation may coexist and complement each other. Refer to the related guideline IDS01-J. Sanitize data passed across a trust boundary for more details on data sanitization.

Canonicalization and Normalization: Canonicalization is the process of lossless reduction of the input to its equivalent simplest known form. Normalization is the process of lossy conversion of input data to the simplest known (and anticipated) form. Canonicalization and normalization must occur before validation to prevent attackers from exploiting the validation routine to strip away illegal characters and thus constructing a forbidden (and potentially malicious) character sequence. Refer to the guideline IDS02-J. Normalize strings before validating them for more details. In addition, ensure that normalization is performed only on fully assembled user input. Never normalize partial input or combine normalized input with non-normalized input.
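The ordering requirement above (normalize the complete input first, then validate) can be illustrated with a minimal sketch. The class name InputChecker, its method names, and the policy of rejecting angle brackets are assumptions made for illustration; they are not taken from the guidelines themselves. The sketch uses the standard java.text.Normalizer class with the NFKC form.

```java
import java.text.Normalizer;
import java.text.Normalizer.Form;
import java.util.regex.Pattern;

// Hypothetical helper; the name and the validation policy are illustrative assumptions.
final class InputChecker {
    // Assumed policy: the fully assembled input must not contain angle brackets.
    private static final Pattern FORBIDDEN = Pattern.compile("[<>]");

    private InputChecker() {}

    /**
     * Normalizes the complete input to NFKC and only then validates it.
     * Returns the normalized string, or throws if a forbidden character is present.
     */
    static String normalizeThenValidate(String input) {
        String normalized = Normalizer.normalize(input, Form.NFKC);
        if (FORBIDDEN.matcher(normalized).find()) {
            throw new IllegalArgumentException("Input contains forbidden characters");
        }
        return normalized;
    }

    /** Shown only for contrast: validating the raw input misses compatibility forms. */
    static boolean passesValidationWithoutNormalizing(String input) {
        return !FORBIDDEN.matcher(input).find();
    }
}
```

With this sketch, an input written with the compatibility characters U+FE64 and U+FE65 (small less-than and greater-than signs) passes the raw check but is rejected once normalized, because NFKC maps those characters to the ASCII angle brackets.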

For example, POSIX file systems provide a syntax for expressing file names on the system using paths. A path is a string which indicates how to find any file by starting at a particular directory (usually the current working directory), and traversing down directories until the file is found. Canonical paths lack both symbolic links and special entries such as '.' or '..', which are handled specially on POSIX systems. Each file accessible from a directory has exactly one canonical path, along with many non-canonical paths.
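As a rough sketch of how canonical paths can support validation, the following compares canonical forms before deciding whether an untrusted file name stays inside an allowed directory. PathChecker and isInsideBase are hypothetical names introduced here; the check relies on the standard File.getCanonicalFile() method, which resolves '.' and '..' entries and, on platforms that support them, symbolic links.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative assumptions.
final class PathChecker {
    private PathChecker() {}

    /**
     * Returns true only if the canonical form of baseDir/untrustedName
     * is located at or beneath the canonical form of baseDir.
     */
    static boolean isInsideBase(File baseDir, String untrustedName) throws IOException {
        // Canonicalize both sides so that the comparison is between canonical paths.
        Path base = baseDir.getCanonicalFile().toPath();
        Path target = new File(baseDir, untrustedName).getCanonicalFile().toPath();
        // Path.startsWith compares whole name elements, not raw string prefixes.
        return target.startsWith(base);
    }
}
```

Comparing canonical paths rather than the strings supplied by the caller means that a non-canonical name such as one containing '..' is judged by where it actually leads.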

In particular, complex subsystems are often components that accept string data that specify commands or instructions to the component. String data passed to these components may contain special characters that can trigger commands or actions, resulting in a software vulnerability.

Examples of components which can interpret commands or instructions:

Many rules address proper filtering of untrusted input, especially when such input is passed to a component that can interpret commands or instructions. For example, see IDS08-J. Prevent XML Injection.

When data must be sent to a component in a different trusted domain, the sender must ensure that the data is suitable for the receiver's trust boundary by properly encoding and escaping any data flowing across the trust boundary. For example, if a system is infiltrated by malicious code or data, many attacks are rendered ineffective if the system's output is appropriately escaped and encoded. Refer to the guideline IDS04-J. Properly encode or escape output for more details.
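A minimal sketch of output escaping follows, assuming that the receiving component is an HTML renderer; that choice, and the HtmlEscaper class itself, are assumptions made for illustration. Production code would more likely rely on a vetted encoding library than on a hand-written routine like this one.

```java
// Hypothetical helper; assumes the receiver interprets its input as HTML.
final class HtmlEscaper {
    private HtmlEscaper() {}

    /** Replaces the characters that are significant in HTML with character references. */
    static String escapeHtml(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() + 16);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the escaping is applied at the point of output, it remains effective even when malicious data have already entered the system by some other route.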

Capabilities

A capability (known in some systems as a key) is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights. A user program on a capability-based operating system must use a capability to access an object [Wikipedia 2011].

The term capability was introduced by Dennis and Van Horn [Dennis 1966]. The basic idea is that for a program to access an object it must have a special token. This token designates an object and gives the program the authority to perform a specific set of actions (such as reading or writing) on that object. Such a token is known as a capability.
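In Java, an object reference can play a role similar to such a token. The sketch below is only an analogy under stated assumptions: Notebook and ReadCapability are hypothetical types, and the token is unforgeable only to the extent that the language's type safety and access control are enforced (for example, when reflection is restricted).

```java
// Hypothetical token type: holding one grants the authority to read, nothing more.
interface ReadCapability {
    String read();
}

// Hypothetical protected object.
final class Notebook {
    private final StringBuilder text = new StringBuilder();

    /** Writing requires a reference to the Notebook itself. */
    void append(String s) {
        text.append(s);
    }

    private String contents() {
        return text.toString();
    }

    /** Hands out a token that designates this notebook and authorizes reading only. */
    ReadCapability readCapability() {
        return this::contents;
    }
}
```

Code that receives only the ReadCapability can observe the notebook's contents but has no method through which to modify them, so the token both designates the object and limits the actions permitted on it.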

Some rules that involve capabilities include:

Leaking Sensitive Data

A system's security policy determines which information is sensitive. Sensitive data may include user information such as social security or credit card numbers, passwords, or private keys.

Java software components provide many opportunities to output sensitive information. Several rules address the mitigation of sensitive information disclosure, including EXC06-J. Do not allow exceptions to expose sensitive information and FIO08-J. Do not log sensitive information.
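One way exceptions can disclose information is through messages that echo file system paths. The following sketch, in the spirit of EXC06-J, replaces such an exception with a generic one before it crosses a trust boundary. SafeOpener and the wording of its message are assumptions made for illustration, not code from the guideline.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; names and message text are illustrative assumptions.
final class SafeOpener {
    private SafeOpener() {}

    /** Opens a file without revealing the requested path or file system layout on failure. */
    static InputStream open(String path) throws IOException {
        try {
            return new FileInputStream(path);
        } catch (FileNotFoundException e) {
            // The original exception message contains the path. Any internal
            // recording of it is omitted here; callers see a generic exception.
            throw new IOException("Unable to open the requested resource");
        }
    }
}
```

The generic exception deliberately omits the original as its cause, so the path is not reachable through getCause() either.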

[Dennis 1966] Jack B. Dennis and Earl C. Van Horn. 1966. Programming semantics for multiprogrammed computations. Commun. ACM 9, 3 (March 1966), 143-155. DOI=10.1145/365230.365252 http://doi.acm.org/10.1145/365230.365252