Regular expressions are widely used to match strings of text. For example, the POSIX {{grep}} utility supports regular expressions for finding patterns in the specified text. For introductory information on regular expressions, see the Java Tutorials \[[Tutorials 08|AA. Bibliography#Tutorials 08]\]. The {{java.util.regex}} package provides the {{Pattern}} class, which encapsulates a compiled representation of a regular expression, and the {{Matcher}} class, which is an engine that uses a {{Pattern}} to perform matching operations on a {{CharSequence}}.
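As a minimal sketch of how the two classes cooperate, a {{Pattern}} is compiled once and each {{Matcher}} it creates scans a {{CharSequence}}. The class name, method name, and pattern below are illustrative assumptions, not part of this guideline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternMatcherSketch {
    // Compiled once and reused; Pattern instances are immutable.
    // Hypothetical pattern: captures the digits inside "public[...]".
    private static final Pattern PID = Pattern.compile("public\\[(\\d+)\\]");

    // Accepts any CharSequence (String, StringBuilder, CharBuffer, ...).
    static List<String> processIds(CharSequence text) {
        List<String> ids = new ArrayList<>();
        Matcher m = PID.matcher(text);
        while (m.find()) {          // advance to the next match, if any
            ids.add(m.group(1));    // first capturing group: the digits
        }
        return ids;
    }

    public static void main(String[] args) {
        CharSequence text =
            new StringBuilder("public[48964] Backup failed; public[1] Exited");
        System.out.println(processIds(text));
    }
}
```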
Java's powerful regular expression (regex) facilities must be protected from misuse. An attacker may supply malicious input that modifies the original regular expression so that the regex fails to comply with the program's specification. This attack vector, called regex injection, might affect control flow, cause information leaks, or result in denial-of-service (DoS) vulnerabilities.
Certain constructs and properties of Java regular expressions are susceptible to exploitation:
...
Suppose a system log file contains messages output by various system processes. Some processes produce public messages and some processes produce sensitive messages marked "private." Here is an example log file:
Code Block |
---|
4/8/11 10:47:03 AM private[423] Successful logout name: usr1 ssn: 111223333
4/8/11 10:47:04 AM public[48964] Failed to resolve network service using name = Scipio type = _afpovertcp._tcp domain = local.
4/8/11 10:47:04 AM public[1] (public.message[49367]) Exited with exit code: 255
4/8/11 10:47:43 AM private[423] Successful login name: usr2 ssn: 444556666
4/8/11 10:48:08 AM public[48964] Backup failed with error: 19 |
A user wishes to search the log file for interesting messages but must be prevented from seeing the private messages. A program might accomplish this by permitting the user to provide search text that becomes part of the following regex:
...
Code Block | ||
---|---|---|
| ||
import java.io.FileInputStream;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Keywords {
  private static ScheduledExecutorService scheduler
    = Executors.newSingleThreadScheduledExecutor();
  private static CharBuffer log;
  private static final Object lock = new Object();

  // Map log file into memory, and periodically reload
  static {
    try {
      FileChannel channel = new FileInputStream(
        "path").getChannel();

      // Get the file's size and map it into memory
      int size = (int) channel.size();
      final MappedByteBuffer mappedBuffer = channel.map(
        FileChannel.MapMode.READ_ONLY, 0, size);

      Charset charset = Charset.forName("ISO-8859-15");
      final CharsetDecoder decoder = charset.newDecoder();
      log = decoder.decode(mappedBuffer); // Read file into char buffer

      Runnable periodicLogRead = new Runnable() {
        @Override public void run() {
          synchronized (lock) {
            try {
              log = decoder.decode(mappedBuffer);
            } catch (CharacterCodingException e) {
              // Forward to handler
            }
          }
        }
      };
      scheduler.scheduleAtFixedRate(periodicLogRead,
                                    0, 5, TimeUnit.SECONDS);
    } catch (Throwable t) {
      // Forward to handler
    }
  }

  public static Set<String> suggestSearches(String search) {
    synchronized (lock) {
      Set<String> searches = new HashSet<String>();

      // Construct regex dynamically from user string
      // (noncompliant: the untrusted search text is concatenated unsanitized)
      String regex = "(.*? +public\\[\\d+\\] +.*" + search + ".*)";

      Pattern keywordPattern = Pattern.compile(regex);
      Matcher logMatcher = keywordPattern.matcher(log);
      while (logMatcher.find()) {
        String found = logMatcher.group(1);
        searches.add(found);
      }
      return searches;
    }
  }
} |
This code permits a trusted user to search for public log messages such as "error." However, it also allows a malicious attacker to perform the regex injection outlined above.
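The difference between a benign and a hostile search can be sketched with a self-contained program that reuses the same regex construction against the sample log. The class name and the attack string {{|.*}} are assumptions chosen for illustration, and the log is assumed to hold one entry per line. The injected {{|}} adds an alternative inside the capturing group, so the group no longer requires the {{public}} marker.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexInjectionDemo {
    // Sample log from this section, assumed to hold one entry per line.
    static final String LOG = String.join("\n",
        "4/8/11 10:47:03 AM private[423] Successful logout name: usr1 ssn: 111223333",
        "4/8/11 10:47:04 AM public[48964] Failed to resolve network service using "
            + "name = Scipio type = _afpovertcp._tcp domain = local.",
        "4/8/11 10:47:04 AM public[1] (public.message[49367]) Exited with exit code: 255",
        "4/8/11 10:47:43 AM private[423] Successful login name: usr2 ssn: 444556666",
        "4/8/11 10:48:08 AM public[48964] Backup failed with error: 19");

    // Same regex construction as suggestSearches() in the noncompliant example.
    static Set<String> suggestSearches(String search) {
        Set<String> searches = new LinkedHashSet<>();
        String regex = "(.*? +public\\[\\d+\\] +.*" + search + ".*)";
        Matcher logMatcher = Pattern.compile(regex).matcher(LOG);
        while (logMatcher.find()) {
            searches.add(logMatcher.group(1));
        }
        return searches;
    }

    public static void main(String[] args) {
        // Benign search: only public entries containing "error" are returned.
        System.out.println(suggestSearches("error"));
        // Hostile search (illustrative): the regex becomes
        // (.*? +public\[\d+\] +.*|.*.*), whose second alternative matches any line.
        for (String entry : suggestSearches("|.*")) {
            System.out.println(entry);
        }
    }
}
```

With the hostile input, the returned set includes the {{private}} entries and their {{ssn}} fields, along with empty strings produced by zero-length matches at line ends.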
Compliant Solution (Whitelisting)
This compliant solution filters out nonalphanumeric characters (except space and single quote) from the search string, which prevents regex injection.
...
This solution also limits the set of valid search terms. For instance, a user may no longer search for "name =" because the = character would be sanitized out of the regex.
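The filtering step described above might look like the following sketch. The class and method names are hypothetical, and {{Character.isLetterOrDigit}} is one reasonable reading of "alphanumeric" (it also accepts non-ASCII letters and digits).

```java
class SearchSanitizer {
    // Keeps letters, digits, spaces, and single quotes; drops everything else,
    // so regex metacharacters such as | . * ( ) never reach Pattern.compile().
    static String sanitize(String search) {
        StringBuilder sb = new StringBuilder(search.length());
        for (int i = 0; i < search.length(); ++i) {
            char ch = search.charAt(i);
            if (Character.isLetterOrDigit(ch) || ch == ' ' || ch == '\'') {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + sanitize("name =") + "]"); // the = is dropped
        System.out.println("[" + sanitize("|.*") + "]");    // nothing survives
    }
}
```

The sanitized string would then be concatenated into the regex in place of the raw search text.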
...
Another method of mitigating this vulnerability is to filter out the sensitive information prior to matching. Such a solution would require the filtering to be done every time the log file is periodically refreshed, incurring extra complexity and a performance penalty. Sensitive information may still be exposed if the log format changes but the class is not also refactored to accommodate these changes.
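A rough sketch of such prefiltering follows, assuming one entry per line and assuming that private entries can be recognized by a {{private[pid]}} marker as in the sample log. The class name, method name, and marker pattern are illustrative only.

```java
import java.util.regex.Pattern;

class PrivateEntryFilter {
    // Assumed marker for a private entry, modeled on the sample log format.
    // If the log format changes, this pattern must change with it.
    private static final Pattern PRIVATE_ENTRY =
        Pattern.compile(" +private\\[\\d+\\] +");

    // Returns a copy of the log with every private entry removed.
    // This would have to run on each periodic reload of the log.
    static String publicOnly(CharSequence log) {
        StringBuilder kept = new StringBuilder();
        for (String line : log.toString().split("\n")) {
            if (!PRIVATE_ENTRY.matcher(line).find()) {
                kept.append(line).append('\n');
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String log = "4/8/11 10:47:03 AM private[423] Successful logout name: usr1\n"
                   + "4/8/11 10:48:08 AM public[48964] Backup failed with error: 19";
        System.out.print(publicOnly(log));
    }
}
```

Because matching then runs only against the filtered text, an injected regex has no private entries left to capture, at the cost of the extra pass and the format coupling noted above.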
Risk Assessment
Failing to sanitize untrusted data included as part of a regular expression can result in the disclosure of sensitive information.
...
Related Guidelines
Bibliography
| \[[Tutorials 08|AA. Bibliography#Tutorials 08]\] | [Regular Expressions|http://java.sun.com/docs/books/tutorial/essential/regex/index.html] |
| \[[CVE 05|AA. Bibliography#CVE]\] | [CVE-2005-1949|http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2005-1949] |
...