A log injection vulnerability arises when a log entry contains unsanitized user input. A malicious user can insert fake log data and consequently deceive system administrators as to the system's behavior [OWASP 2008]. For example, an attacker might split a legitimate log entry into two log entries by entering a carriage return and line feed (CRLF) sequence to mislead an auditor. Log injection attacks can be prevented by sanitizing and validating any untrusted input sent to a log.
Logging unsanitized user input can also result in leaking sensitive data across a trust boundary. For example, an attacker might inject a script into a log file such that when the file is viewed using a web browser, the browser could provide the attacker with a copy of the administrator's cookie so that the attacker might gain access as the administrator.
Noncompliant Code Example
This noncompliant code example logs untrusted data from an unauthenticated user without data sanitization.
if (loginSuccessful) {
  logger.severe("User login succeeded for: " + username);
} else {
  logger.severe("User login failed for: " + username);
}
Without sanitization, a log injection attack is possible. A standard log message when username is guest might look like this:
May 15, 2011 2:19:10 PM java.util.logging.LogManager$RootLogger log
SEVERE: User login failed for: guest
If the username that is used in a log message is not guest but rather a multiline string like this:
guest
May 15, 2011 2:25:52 PM java.util.logging.LogManager$RootLogger log
SEVERE: User login succeeded for: administrator
the log would contain the following misleading data:
May 15, 2011 2:19:10 PM java.util.logging.LogManager$RootLogger log
SEVERE: User login failed for: guest
May 15, 2011 2:25:52 PM java.util.logging.LogManager$RootLogger log
SEVERE: User login succeeded for: administrator
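The forgery described above can be reproduced with a small, self-contained sketch. It assumes a simplified one-line format ("LEVEL: message") instead of the default java.util.logging layout, and the class and method names (LogForgingDemo, logFailedLogin, countEntries) are hypothetical, chosen only for illustration. A single logging call with an embedded newline produces text that reads as two separate entries.

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.StreamHandler;

// Hypothetical demo class; names are illustrative and not part of the original example.
final class LogForgingDemo {

    // Logs one failed login for the given username and returns the raw text written.
    static String logFailedLogin(String username) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Assumed simplified format: one "LEVEL: message" line per record.
        Formatter formatter = new Formatter() {
            @Override
            public String format(LogRecord record) {
                return record.getLevel().getName() + ": " + record.getMessage() + "\n";
            }
        };
        StreamHandler handler = new StreamHandler(sink, formatter);
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false); // keep demo output off the console
        logger.addHandler(handler);

        // Unsanitized concatenation, as in the noncompliant example.
        logger.severe("User login failed for: " + username);

        handler.flush();
        logger.removeHandler(handler);
        return sink.toString();
    }

    // Counts lines that look like the start of a log entry.
    static int countEntries(String logText) {
        int count = 0;
        for (String line : logText.split("\n")) {
            if (line.startsWith("SEVERE: ")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String forged = "guest\nSEVERE: User login succeeded for: administrator";
        System.out.print(logFailedLogin(forged));
        System.out.println("apparent entries: " + countEntries(logFailedLogin(forged)));
    }
}
```

An auditor reading the resulting text has no way to tell that the second line came from user input rather than from the logger.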
Compliant Solution (Sanitized User)
This compliant solution sanitizes the username before logging it, preventing injection attacks.
if (loginSuccessful) {
  logger.severe("User login succeeded for: " + sanitizeUser(username));
} else {
  logger.severe("User login failed for: " + sanitizeUser(username));
}
The sanitization is done by a dedicated method for sanitizing user names:
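// Requires: import java.util.regex.Pattern;
public String sanitizeUser(String username) {
  // Accept only names made up entirely of letters, digits, and underscores
  return Pattern.matches("[A-Za-z0-9_]+", username)
      ? username : "unauthorized user";
}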
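As a minimal, runnable sketch of the same whitelist check, the following wraps the method in a hypothetical holder class (UserSanitizer is not a name from the original). It additionally assumes that a null username should be treated as invalid and precompiles the pattern; both are illustrative choices rather than requirements of the rule.

```java
import java.util.regex.Pattern;

// Hypothetical holder class for the whitelist check; the name is illustrative.
final class UserSanitizer {
    // One or more ASCII letters, digits, or underscores, compiled once.
    private static final Pattern VALID_USERNAME = Pattern.compile("[A-Za-z0-9_]+");

    static String sanitizeUser(String username) {
        // matches() must consume the whole input, so any CR, LF, or space rejects the name.
        // Treating null as invalid is an assumption added for this sketch.
        return username != null && VALID_USERNAME.matcher(username).matches()
                ? username
                : "unauthorized user";
    }

    public static void main(String[] args) {
        System.out.println(sanitizeUser("guest"));
        System.out.println(sanitizeUser("guest\nSEVERE: User login succeeded for: administrator"));
    }
}
```

Because the check is a whitelist, the multiline username from the noncompliant example is replaced wholesale instead of being partially escaped.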
Compliant Solution (Sanitized Logger)
This compliant solution uses a text logger that automatically sanitizes its input. A sanitized logger saves the developer from having to worry about unsanitized log messages.
Logger sanLogger = new SanitizedTextLogger(logger);
if (loginSuccessful) {
  sanLogger.severe("User login succeeded for: " + username);
} else {
  sanLogger.severe("User login failed for: " + username);
}
The sanitized text logger takes as delegate an actual logger. We assume the logger outputs text log messages to a file, network, or the console, and each log message has no indented lines. The sanitized text logger sanitizes all text to be logged by indenting every line except the first by two spaces. While a malicious user can indent text by more, a malicious user cannot create a fake log entry because all of her output will be indented, except for the real log output.
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SanitizedTextLogger extends Logger {
  Logger delegate;

  public SanitizedTextLogger(Logger delegate) {
    super(delegate.getName(), delegate.getResourceBundleName());
    this.delegate = delegate;
  }

  // Indent every line after the first by two spaces
  public String sanitize(String msg) {
    Pattern newline = Pattern.compile("\n");
    Matcher matcher = newline.matcher(msg);
    return matcher.replaceAll("\n  ");
  }

  public void severe(String msg) {
    delegate.severe(sanitize(msg));
  }

  // .. Other Logger methods which must also sanitize their log messages
}
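The indentation step can also be exercised on its own. The sketch below is a variant, not the code above: it assumes Java 8 or later and uses the \R linebreak matcher so that CRLF pairs and bare carriage returns are indented as well as line feeds. The class name LogLineIndenter is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper showing the indentation step in isolation; the name is illustrative.
final class LogLineIndenter {
    // \R (Java 8+) matches any line break, treating "\r\n" as a single unit.
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    static String sanitize(String msg) {
        // Replace each line break with "\n" followed by a two-space indent.
        return LINE_BREAK.matcher(msg).replaceAll("\n  ");
    }

    public static void main(String[] args) {
        String username = "guest\r\nSEVERE: User login succeeded for: administrator";
        System.out.println("SEVERE: " + sanitize("User login failed for: " + username));
    }
}
```

After sanitization, the injected text still appears in the log, but its line starts with two spaces, so it cannot be mistaken for a genuine entry as long as real entries are never indented.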
Risk Assessment
Allowing unvalidated user input to be logged can result in forging of log entries, leaking secure information, or storing sensitive data in a manner that violates a local law or regulation.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
IDS03-J | Medium | Probable | Medium | P8 | L2 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
The Checker Framework | 2.1.3 | Tainting Checker | Trust and security errors (see Chapter 8) |
CodeSonar | 8.1p0 | JAVA.IO.TAINT.LOG | Tainted Log (Java) |
Fortify | | Log_Forging | Implemented |
Klocwork | 2024.4 | SVLOG_FORGING | Implemented |
Parasoft Jtest | 2024.1 | CERT.IDS03.TDLOG | Protect against log forging |
Related Guidelines
ISO/IEC TR 24772:2013 | Injection [RST] |
MITRE CWE | CWE-144, Improper neutralization of line delimiters |
MITRE CAPEC | CAPEC-93, Log Injection-Tampering-Forging |
Bibliography
[API 2006] | Java Platform, Standard Edition 6 API Specification |
[Seacord 2015] | IDS03-J. Do not log unsanitized user input LiveLesson |
13 Comments
A Bishop
I don't see any rule about having a consistent sanitization policy. There are two main ways to ensure sanitization:
(1) Sanitize at the point of receiving untrusted input.
(2) Sanitize at the point of using untrusted input.
Some projects will specify to use both.
(1) is usually a lot more performant as usually you only capture input once but use it multiple times.
e.g. registering a new user will capture the username once, but it will be stored in a DB, queried for on login, displayed in html, logged on login failure etc. Having to sanitize the username when storing, querying, generating html, logging etc. is a high development burden and potential performance issue that someone would need to assess those costs against any additional security benefit.
What is the security benefit of doing (2) everywhere instead of (1) as most of these sanitisation rules seem to specify?
David Svoboda
Well, I would argue that IDS00-J. Sanitize untrusted data passed across a trust boundary indicates that both should be used. That is, sanitize when you receive untrusted input (e.g., data crosses a trust boundary), and sanitize again when you send data to an untrusted output sink (e.g., data crosses a trust boundary).
I disagree that these rules prefer output sanitization vs input sanitization. Clearly sanitizing input is faster than sanitizing output. But sometimes it is more difficult to do correctly, esp if you don't know where the input is going. For instance, receiving some text from the user requires different sanitization if the text is going to an SQL database, a web browser, or a log file. In that case, you may choose to sanitize the text only when it gets output to whichever sink is chosen.
Robert Seacord
This sentence is really a mess:
It is redundant, and also sort of ends badly. Fred, you touched this last... can you repair it?
Robert Seacord (Manager)
OK, I fixed.
Adam Walczak
In my opinion the proposed solution here has two flaws:
The point of having logging APIs is to abstract you from the complexities of properly sanitizing, serializing, and storing log entries. Because of this, your logging framework should be configured to handle and encode any data passed to the loggers in a secure manner. Escaping characters etc. depends on the log storage format you configure under the logging APIs. File encoders / appenders should escape newline characters, JSON encoders / appenders on the other hand should escape the " character, and so on.
Besides the above, it is not only your code that uses those logging APIs but also third-party libraries. JSON demarshallers and HTTP servlet containers can log user data without your knowledge and before you even get a chance to validate it.
Therefore the code that sanitizes the logging data should not be mixed into the business logic, but should live in the code under the logging API, which knows the destination log format and knows how to encode and sanitize the data securely.
This is the so called Output encoding strategy mentioned in: https://cwe.mitre.org/data/definitions/117.html
David Svoboda
I added CWE-117 to the list of associated CWEs, thanks.
As to the problems you cite with the CS, I do feel your pain. We do recommend that systems that accept string output provide mechanisms for allowing callers to sanitize their output. In this case, that means the logger's class should provide a sanitize() method to prevent log injections.
Unfortunately in this world, many such systems fail to provide any sanitization. Java's Logger package provides none. Furthermore, while it is trivial to sanitize data going to an XML-based log file, or JSON-based log file, how would you sanitize a plain text format such as what java.util.logging.Logger uses? Eliminating newlines renders the file almost unreadable. I've seen plenty of log messages with multiple lines; usually they contain a Java exception. I suspect there are makeshift solutions you could implement (such as enforcing indentation on all user-supplied newlines), but I have not seen any standard solutions (outside of XML or some similar format).
Adam Walczak
Well, because of their limitations, plain text log files are becoming a thing of the past in larger, more mature systems. Especially in web systems that scale horizontally, it's not practical to have text files on each server. If used at all, text log files are only a last-resort mechanism for when a single server instance cannot connect to logstash or any other centralized log repository.
I agree with you that you should not trust the default configuration of your logging framework. But you can cleanly enable sanitization in the logger's configuration via custom encoders, formatters, adapters, etc. You don't have to mix it into business code as in the example given here.
Relying only on input validation in the application's code still leaves a potential security hole: the logging done by third-party libraries, which process and log data before it even gets to the application code. This is why I think that relying solely on the output encoding strategy when it comes to logging is more secure and provides cleaner code.
Adam Walczak
Could we collaborate on an output-encoding-based example as another compliant solution?
David Svoboda
Hello, Adam.
I'm happy to work with you to help fix this rule. (I'm not sure if a 2nd compliant solution is the answer, will think about it over the weekend.)
The more I think about this the more I'm convinced that if a logger accepts data that can create a fake log message, the logger is faulty by design. Ideally your code should do nothing differently than today, but the logger itself automatically sanitizes log data.
However, that would prevent an attacker from creating a fake log message. It would not prevent other bad usernames such as "your mom". Sanitizing something that should be a valid username should definitely not be done in the Logger; it should definitely be done at least in the same class as the compliant solution. In that context, the current CS has sound business logic.
That is independent of the general sanitization to prevent fake log messages.
David Svoboda
I have added a new compliant solution that implements my suggestion of indenting subsequent log lines to distinguish them from 'real' new log entries.
Josh Cummings
Is the replaceAll in the second compliant example correct? I believe replaceAll takes two parameters, and it appears the second example passes only one.
David Svoboda
replaceAll() takes only one argument, as shown here: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#replaceAll(java.lang.String)
Josh Cummings
Perfect, thanks for the clarification. (I had initially read it as String#replaceAll)