SQL injection vulnerabilities arise in applications where elements of a SQL query originate from an untrusted source. Without precautions, the untrusted data may maliciously alter the query, resulting in information leaks or data modification. The primary means of preventing SQL injection are sanitization and validation, which are typically implemented as parameterized queries and stored procedures.
Suppose a system authenticates users by issuing the following query to a SQL database. If the query returns any results, authentication succeeds; otherwise, authentication fails.
Code Block | ||
---|---|---|
| ||
SELECT * FROM db_user WHERE username='<USERNAME>' AND
password='<PASSWORD>' |
Suppose an attacker can substitute arbitrary strings for <USERNAME>
and <PASSWORD>
. In that case, the authentication mechanism can be bypassed by supplying the following <USERNAME>
with an arbitrary password:
Code Block | ||
---|---|---|
| ||
validuser' OR '1'='1
|
The authentication routine dynamically constructs the following query:
Code Block | ||
---|---|---|
| ||
SELECT * FROM db_user WHERE username='validuser' OR '1'='1' AND password='<PASSWORD>'
|
If validuser
is a valid user name, this SELECT
statement yields the validuser
record in the table. The password is never checked because username='validuser'
is true; consequently, the items after the OR
are not tested. As long as the components after the OR
generate a syntactically correct SQL expression, the attacker is granted the access of validuser
.
Similarly, an attacker could supply the following string for <PASSWORD>
with an arbitrary username:
Code Block | ||
---|---|---|
| ||
' OR '1'='1
|
producing the following query:
Code Block | ||
---|---|---|
| ||
SELECT * FROM db_user WHERE username='<USERNAME>' AND password='' OR '1'='1'
|
'1'='1'
always evaluates to true, causing the query to yield every row in the database. In this scenario, the attacker would be authenticated without needing a valid username or password.
Noncompliant Code Example
This noncompliant code example shows JDBC code to authenticate a user to a system. The password is passed as a char
array, the database connection is created, and then the passwords are hashed.
Unfortunately, this code example permits a SQL injection attack by incorporating the unsanitized input argument username
into the SQL command, allowing an attacker to inject validuser' OR '1'='1
. The password
argument cannot be used to attack this program because it is passed to the hashPassword()
function, which also sanitizes the input.
Code Block | ||||
---|---|---|---|---|
| ||||
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
class Login {
public Connection getConnection() throws SQLException {
DriverManager.registerDriver(new
com.microsoft.sqlserver.jdbc.SQLServerDriver());
String dbConnection =
PropertyManager.getProperty("db.connection");
// Can hold some value like
// "jdbc:microsoft:sqlserver://<HOST>:1433,<UID>,<PWD>"
return DriverManager.getConnection(dbConnection);
}
String hashPassword(char[] password) {
// Create hash of password
}
public void doPrivilegedAction(String username, char[] password)
throws SQLException {
Connection connection = getConnection();
if (connection == null) {
// Handle error
}
try {
String pwd = hashPassword(password);
String sqlString = "SELECT * FROM db_user WHERE username = '"
+ username +
"' AND password = '" + pwd + "'";
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery(sqlString);
if (!rs.next()) {
throw new SecurityException(
"User name or password incorrect"
);
}
// Authenticated; proceed
} finally {
try {
connection.close();
} catch (SQLException x) {
// Forward to handler
}
}
}
}
|
Noncompliant Code Example (PreparedStatement
)
The JDBC library provides an API for building SQL commands that sanitize untrusted data. The java.sql.PreparedStatement
class properly escapes input strings, preventing SQL injection when used correctly. This code example modifies the doPrivilegedAction()
method to use a PreparedStatement
instead of java.sql.Statement
. However, the prepared statement still permits a SQL injection attack by incorporating the unsanitized input argument username
into the prepared statement.
Code Block | ||||
---|---|---|---|---|
| ||||
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
class Login {
public Connection getConnection() throws SQLException {
DriverManager.registerDriver(new
com.microsoft.sqlserver.jdbc.SQLServerDriver());
String dbConnection =
PropertyManager.getProperty("db.connection");
// Can hold some value like
// "jdbc:microsoft:sqlserver://<HOST>:1433,<UID>,<PWD>"
return DriverManager.getConnection(dbConnection);
}
String hashPassword(char[] password) {
// Create hash of password
}
public void doPrivilegedAction(
String username, char[] password
) throws SQLException {
Connection connection = getConnection();
if (connection == null) {
// Handle error
}
try {
String pwd = hashPassword(password);
String sqlString = "select * from db_user where username=" +
username + " and password =" + pwd;
PreparedStatement stmt = connection.prepareStatement(sqlString);
ResultSet rs = stmt.executeQuery();
if (!rs.next()) {
throw new SecurityException("User name or password incorrect");
}
// Authenticated; proceed
} finally {
try {
connection.close();
} catch (SQLException x) {
// Forward to handler
}
}
}
}
|
Compliant Solution (PreparedStatement
)
This compliant solution uses a parametric query with a ?
character as a placeholder for the argument. This code also validates the length of the username
argument, preventing an attacker from submitting an arbitrarily long user name.
Code Block | ||||
---|---|---|---|---|
| ||||
public void doPrivilegedAction(
String username, char[] password
) throws SQLException {
Connection connection = getConnection();
if (connection == null) {
// Handle error
}
try {
String pwd = hashPassword(password);
// Validate username length
if (username.length() > 8) {
// Handle error
}
String sqlString =
"select * from db_user where username=? and password=?";
PreparedStatement stmt = connection.prepareStatement(sqlString);
stmt.setString(1, username);
stmt.setString(2, pwd);
ResultSet rs = stmt.executeQuery();
if (!rs.next()) {
throw new SecurityException("User name or password incorrect");
}
// Authenticated; proceed
} finally {
try {
connection.close();
} catch (SQLException x) {
// Forward to handler
}
}
}
|
Use the set*()
methods of the PreparedStatement
class to enforce strong type checking. This technique mitigates the SQL injection vulnerability because the input is properly escaped by automatic entrapment within double quotes. Note that prepared statements must be used even with queries that insert data into the database.
Risk Assessment
Failure to sanitize user input before processing or storing it can result in injection attacks.
Rule |
---|
Every program operates in several security domains. Any system can be divided into several subsystem component, where each component has a specific security domain. For instance, one component may have access to the filesystem, but not the network, while another component can access the network but not the filesystem. When two components share data, if they have different degrees of trust, the data is said to flow across a trust boundary.
Since Java allows untrusted code to run alongside trusted code, it is perfectly possible for a Java program to maintain different security domains, and to have internal trust boundaries. Every Java program contains both locally-written code (that is, local to the developers) as well as code whose implementation is beyond the developers' control. Similarly, even within code whose implementation is controlled by the developers, there may be a distinction between trusted and untrusted code. Alternatively, the locally developed portions of a program may include both application-specific code and also locally-developed library code that is shared with other programs (either local or external). We define a trust boundary to be the points at which control or data pass from one aggregation of locally-developed code to another aggregation of code, without regard to whether the second aggregation is locally or externally developed.
The trust boundaries of any system are mandated by the system administrator. While the system components can provide support for trust boundaries, they cannot dictate what trust is given to any component. Consequently, the deployer of a system must define the trust boundaries as part or the system's security policy. Any security auditor can then take that definition and ensure that the software adequately supports the policy. The definition of security domains — and consequently of the boundaries between those domains — is necessarily application-specific.
We recommend that any library that may be exported outside its home organization should consider itself to be in its own unique security domain (in the sense that its client code cannot be known when the library is written). Consequently the library should consider its API to be a trust boundary. Similarly, we recommend that libraries acquired from outside organizations should be considered to be separate security domains from in-house code. Circumstances will differ; the important thing is to identify and respect the necessary security domains in your application.
Any program that maintains a trust boundary must deal with data coming in over that trust boundary...that is, from a process with a differing security domain. Likewise, any program with must also deal with data going out over a trust boundary. It is imperative that data that crosses a trust boundary undergo filtering. We shall examine the two cases in turn:
Data Output
Data that is output by a program may be sent to some component of the system with a different security domain. In this case, the program must ensure that the data is suitable to the remote component's trust boundary. The system must do this by eamining the data and filtering out any sensitive information.
Like trust boundaries, the question of what information is /sensitive/ is resolved by the system's security policy. Therefore, a program cannot define which information is sensitive, it can only provide support for handling information that may potentially be declared sensitive by the system administrator.
For instance, if malicious code manages to infiltrate a system, many attacks will be futile if the system's output is appropriately escaped and encoded. Refer to the guideline IDS04-J. Properly encode or escape output for more details.
Java programs have many opportunities to output sensitive information. Several rules address the mitigation of sensitive information disclosure, including EXC06-J. Do not allow exceptions to expose sensitive information and FIO08-J. Do not log sensitive information.
Data Input
Data that is received by a program from a source outside the program's trust boundary may, in fact, be malicious. The program must therefore take steps to ensure the data is valid and not malicious.
These steps can include the following:
/Validation/, in the broadest sense, is the process of ensuring that input data falls within the expected range of valid program input. For example, method arguments must conform not only to the type and numeric range requirements of a method or subsystem, but also contain data that conforms to the required input invariants for that method.
/Sanitization/: In many cases, the data may be fed directly to some subsystem. Data sanitization is the process of ensuring that data conforms to the requirements of the subsystem to which it are passed. Sanitization also involves ensuring that data also conforms to any security-related requirements regarding leaking or exposure of sensitive data when it is output across a trust boundary. Refer to the related guideline IDS01-J. Carefully filter any data that passes through a trust boundary for more details on data sanitization. Data sanitization and input validation may coexist and complement each other.
/Canonicalization/ and /Normalization/: Canonicalization is the process of lossless reduction of the input to its equivalent simplest known form. Normalization is the process of lossy conversion of the data to its simplest known (and anticipated) form. Canonicalization and normalization must occur before validation to prevent attackers from exploiting the validation routine to strip away illegal characters and thus construct a forbidden character sequence. Refer to the guideline IDS02-J. Validate strings after performing normalization for more details. In addition, ensure that normalization is performed only on fully assembled user input. Never normalize partial input or combine normalized input with non-normalized input.
For instance, POSIX filesystems provide a syntax for expressing file names on the system using paths. A path is a string which indicates how to find any file by starting at a particular directory (usually the current working directory), and traversing down directories until the file is found. A path is canonical if it contains no symbolic links, and no special entries, such as '.' or '..'; as these are handled specially on POSIX systems. Every file accessible from a directory has exactly one canonical path, along with many non-canonical paths.
Many rules address proper filtering of untrusted input, especially when such input is passed to a complex subsystem. For example, see IDS08-J. Prevent XML Injection.
Risk Assessment
Failure to properly filter data that crosses a trust boundary can cause information leakage and execution of arbitrary code.
Severity | Likelihood | Remediation Cost | Priority | Level | |
---|---|---|---|---|---|
IDS00-J |
High |
Likely |
Medium |
P12
L1
P18 | L1 |
Automated Detection
Tool | Version | Checker | Description | ||||||
---|---|---|---|---|---|---|---|---|---|
The Checker Framework |
| Tainting Checker | Trust and security errors (see Chapter 8) | ||||||
CodeSonar |
| JAVA.IO.INJ.SQL | SQL Injection (Java) | ||||||
Coverity | 7.5 | SQLI | Implemented | ||||||
Findbugs | 1.0 | SQL_NONCONSTANT_STRING_PASSED_TO_EXECUTE | Implemented | ||||||
Fortify | 1.0 | HTTP_Response_Splitting | Implemented | ||||||
Klocwork |
| SV.DATA.DB | Implemented | ||||||
Parasoft Jtest |
| CERT.IDS00.TDSQL | Protect against SQL injection | ||||||
SonarQube |
| ||||||||
SpotBugs |
| SQL_NONCONSTANT_STRING_PASSED_TO_EXECUTE | Implemented |
Related Vulnerabilities
Bibliography
Wiki Markup |
---|
\[[OWASP 2005|AA. Bibliography#OWASP 05]\]
\[[OWASP 2007|AA. Bibliography#OWASP 07]\] |
describes a vulnerability in Apache Tomcat 4.1.0 through 4.1.37, 5.5.0 through 5.5.26, and 6.0.0 through 6.0.16. When a RequestDispatcher
is used, Tomcat performs path normalization before removing the query string from the URI, which allows remote attackers to conduct directory traversal attacks and read arbitrary files via a ..
(dot dot) in a request parameter.
Related Guidelines
SEI CERT Perl Coding Standard | IDS33-PL. Sanitize untrusted data passed across a trust boundary |
Injection [RST] | |
CWE-116, Improper Encoding or Escaping of Output |
Android Implementation Details
This rule uses Microsoft SQL Server as an example to show a database connection. However, on Android, DatabaseHelper
from SQLite is used for a database connection. Because Android apps may receive untrusted data via network connections, the rule is applicable.
Bibliography
A Guide to Building Secure Web Applications and Web Services | |
[Seacord 2015] | |
[W3C 2008] | Section 4.4.3, "Included If Validating" |
...
13. Input Validation and Data Sanitization (IDS) 13. Input Validation and Data Sanitization (IDS) IDS01-J. Carefully filter any data that passes through a trust boundary