Java's regular expression facilities are wide ranging and powerful. When a regular expression string is built from untrusted input, an attacker can modify it to form a pattern that matches far more widely than intended, possibly exposing much more information than the programmer meant to match.

The primary means of preventing this vulnerability is to sanitize any regular expression string that comes from untrusted input. The programmer should also look for ways to avoid building regular expressions from untrusted input at all, or provide only a very limited subset of regular expression functionality to the user.
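One simple form of sanitization is to treat the untrusted string as literal text. The sketch below assumes the application only needs literal substring searches; the class and method names (LiteralSearch, literalPattern) are hypothetical. It relies on Pattern.quote(), which wraps the input so that metacharacters and embedded flags lose their special meaning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralSearch {
    // Hypothetical helper: compiles untrusted input as a literal string.
    // Pattern.quote() escapes the whole input, so "(?s)" and ".*" are
    // matched as ordinary characters rather than interpreted.
    static Pattern literalPattern(String untrusted) {
        return Pattern.compile(Pattern.quote(untrusted));
    }

    public static void main(String[] args) {
        String data = "JohnPaul,secret1\nJohnJackson,secret2\n";

        // The malicious input is now searched for literally and is not found.
        Matcher attack = literalPattern("(?s)John.*").matcher(data);
        System.out.println("Attack pattern found: " + attack.find());

        // A plain name still matches as expected.
        Matcher plain = literalPattern("John").matcher(data);
        System.out.println("Plain name found: " + plain.find());
    }
}
```

This gives up regular expression functionality entirely, which may be too restrictive for some applications; it is one end of the "limited subset" spectrum described above.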

Constructs and properties of Java regular expressions to watch out for include:

  • Match flags used in non-capturing groups (these override matching options that may or may not have been passed into the compile() method)
  • Greediness
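Both constructs in the list above can be illustrated with a short sketch. The class and method names (FlagAndGreedDemo, firstMatch) and the sample input are hypothetical. The flags argument passed to compile() is 0 in every call, yet an embedded flag in the pattern still changes how '.' behaves, and a greedy quantifier then extends the match as far as it can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagAndGreedDemo {
    // Hypothetical helper: returns the first match of regex in input,
    // or null when there is no match.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "a,b\nc,d\n";

        // No embedded flag: '.' does not match the line terminator,
        // so the match stays on the first line.
        System.out.println(firstMatch("a.*,", 0, input).length());

        // Embedded flag: (?s) enables DOTALL although flags is 0, and the
        // greedy '.*' runs across the newline up to the last comma.
        System.out.println(firstMatch("(?s)a.*,", 0, input).length());

        // The same flag scoped to a non-capturing group has the same effect.
        System.out.println(firstMatch("(?s:a.*),", 0, input).length());

        // A reluctant quantifier stops at the first comma even in DOTALL mode.
        System.out.println(firstMatch("(?s)a.*?,", 0, input).length());
    }
}
```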

Because Java regular expressions are similar to Perl's, it is a good idea to apply lessons learned from Perl regular expressions.

Noncompliant Code Example

This class does not sanitize the incoming regular expression and, as a result, exposes too much information to the user.

This program searches a database of users for usernames that match a regular expression. A non-malicious example would be to search for 'John.*'. A malicious example would be to search for '(?s)John.*'.

Code Block
import java.util.regex.Pattern;
import java.util.regex.Matcher;

/* Usage: Test1 <regex>
 * The regex is used directly without sanitization, causing sensitive data to be exposed
 *
 * Imagine this program searches a database of users for usernames that match a regex
 * Non-malicious usage: Test1 John.*
 * Malicious usage: Test1 (?s)John.*
 */
public class Test1
{
    public static void main(String[] args)
    {
        if (args.length < 1) {
            System.err.println("Failed to specify a regex");
            return;
        }

        String sensitiveData; //represents sensitive data from a file or something
        int flags;
        String regex;
        Pattern p;
        Matcher m;

        //imagine a CSV style database: user,password
        sensitiveData = "JohnPaul,HearsGodsVoice\nJohnJackson,OlympicBobsleder\nJohnMayer,MakesBadMusic\n";

        regex = args[0];
        //regex = "(?s)John.*";

        flags = 0;

        regex += ","; //supposedly this forces the regex to only match names
        System.out.println("Pattern: \'" + regex + "\'");
        p = Pattern.compile(regex, flags);
        m = p.matcher(sensitiveData);

        while (m.find())
            System.out.println("Found \'" + m.group() + "\'");
        System.err.println("DONE");
    }
}

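When searching using the regex '(?s)John.*', the program returns users' passwords along with their usernames. The (?s) flag turns on single-line (DOTALL) mode, in which '.' also matches line terminators. The greedy '.*' therefore runs across record boundaries, and the match extends from the first 'John' to the last comma in the data, taking in every password that lies between them.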

Compliant Solution

It is very difficult to filter out bad regular expressions. It might be easier and more secure to rewrite the application to limit the usage of regular expressions.

For the above code sample, the easy solution is to parse the CSV into a class, so that a search is applied only to the username field and never to the raw data.
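A minimal sketch of that approach follows. It assumes the same "user,password" record layout as the noncompliant example and that a prefix search on the username is sufficient; the names UserSearch, User, parse, and findNames are hypothetical. No regular expression is built from user input, and the password field is never searched or returned.

```java
import java.util.ArrayList;
import java.util.List;

public class UserSearch {
    // Hypothetical record type holding one "user,password" row.
    static final class User {
        final String name;
        final String password;

        User(String name, String password) {
            this.name = name;
            this.password = password;
        }
    }

    // Splits the CSV data into records; lines without a comma are skipped.
    static List<User> parse(String csv) {
        List<User> users = new ArrayList<>();
        for (String line : csv.split("\n")) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue;
            }
            users.add(new User(line.substring(0, comma), line.substring(comma + 1)));
        }
        return users;
    }

    // Returns the usernames that start with the given prefix.
    // The prefix is compared as plain text, never compiled as a regex.
    static List<String> findNames(List<User> users, String prefix) {
        List<String> names = new ArrayList<>();
        for (User u : users) {
            if (u.name.startsWith(prefix)) {
                names.add(u.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String data = "JohnPaul,HearsGodsVoice\nJohnJackson,OlympicBobsleder\nJohnMayer,MakesBadMusic\n";
        List<User> users = parse(data);
        System.out.println(findNames(users, "John"));
        System.out.println(findNames(users, "(?s)John.*"));
    }
}
```

If pattern matching on usernames is genuinely required, the same structure still helps: the pattern would be applied to each name field in isolation, so even an overly broad expression cannot reach the password field.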

Risk Assessment

| Rule    | Severity | Likelihood | Remediation Cost | Priority | Level |
| IDS18-J | medium   | unlikely   | high             |          |       |

References

CWE-625, Permissive Regular Expression