IDS50-JG. Properly encode or escape output

Proper input sanitization can prevent insertion of malicious data into a subsystem such as a database. However, different subsystems require different types of sanitization. Fortunately, it is usually obvious which subsystems will receive input and consequently what sanitization is required.

Several subsystems exist for the purpose of showing output. An HTML renderer, as part of a web browser, is one common subsystem for displaying output. Data sent to an output subsystem may appear to originate from a trusted source; consequently, it is tempting to assume that output sanitization is unnecessary. However, data sent to an output subsystem may indirectly originate from an untrusted source and may include malicious content. Failure to properly sanitize data for output subsystems can enable several types of attacks. For example, HTML renderers can be prone to HTML injection and cross-site scripting (XSS) [OWASP 2011] attacks. (Note, however, that the term cross-site scripting attack is generally applied to such attacks even when they involve only one site.) Output sanitization to prevent such attacks is as vital as input sanitization.

As with input validation, normalize data before sanitizing for malicious characters. Properly encode all output characters other than those known to be safe to avoid vulnerabilities caused by data that bypasses validation. See IDS01-J. Normalize strings before validating them for more information.
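
The following is a minimal sketch of why normalization has to come first. The class and method names (NormalizeFirstDemo, containsAngleBrackets) are hypothetical and chosen only for illustration, and the angle-bracket check is deliberately tiny rather than a realistic filter. It assumes the input uses the fullwidth forms U+FF1C and U+FF1E, which NFKC maps to the ASCII characters < and >.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

final class NormalizeFirstDemo {
  // Illustrative check only: looks for ASCII angle brackets
  private static final Pattern ANGLE = Pattern.compile("[<>]");

  static boolean containsAngleBrackets(String s) {
    return ANGLE.matcher(s).find();
  }

  static String normalize(String s) {
    return Normalizer.normalize(s, Normalizer.Form.NFKC);
  }

  public static void main(String[] args) {
    // U+FF1C and U+FF1E are the fullwidth forms of '<' and '>'
    String input = "\uFF1Cscript\uFF1E";
    System.out.println(containsAngleBrackets(input));            // prints false
    System.out.println(containsAngleBrackets(normalize(input))); // prints true
  }
}
```

A check applied to the raw string misses the brackets; the same check applied after normalization finds them.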

Noncompliant Code Example

This noncompliant code example uses the model-view-controller (MVC) concept of the Java EE–based Spring Framework to display data to the user without encoding or escaping it. Because the data is sent to a web browser, the code is subject to both HTML injection and XSS attacks.

@RequestMapping("/getnotifications.htm")
public ModelAndView getNotifications(HttpServletRequest request, HttpServletResponse response) {
  ModelAndView mv = new ModelAndView();
  try {
    UserInfo userDetails = getUserInfo();
    List<Map<String,Object>> list = new ArrayList<Map<String,Object>>();
    List<Notification> notificationList = 
        NotificationService.getNotificationsForUserId(userDetails.getPersonId());
           
    for (Notification notification: notificationList) {
      Map<String,Object> map = new HashMap<String,Object>();
      map.put("id",notification.getId());
      map.put("message", notification.getMessage());
      list.add(map);
    }
            
    mv.addObject("Notifications", list);
  }
  catch(Throwable t){
    // Log to file and handle
  }
 
  return mv;
}
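
To see why the example above is exploitable, consider what reaches the browser when a stored notification message contains markup. The sketch below is not Spring code: renderItem is a hypothetical stand-in for a view template, and it assumes the view writes the model value into the page verbatim.

```java
final class UnescapedOutputDemo {
  // Hypothetical stand-in for a view template that emits the message as-is
  static String renderItem(String message) {
    return "<li>" + message + "</li>";
  }

  public static void main(String[] args) {
    // A message that an attacker managed to store earlier
    String stored = "<script>alert(document.cookie)</script>";
    // prints <li><script>alert(document.cookie)</script></li>
    System.out.println(renderItem(stored));
  }
}
```

Because the script element arrives intact, the browser would treat it as part of the page rather than as text to display.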

Compliant Solution

This compliant solution defines a ValidateOutput class that normalizes the output to a known character set, validates it against a whitelist, and then encodes every character outside the known-safe set as a second line of defense. Note that the required whitelisting patterns may vary according to the specific needs of different fields [OWASP 2008].

import java.text.Normalizer;
import java.util.regex.Pattern;

public class ValidateOutput {
  // Allows only ASCII alphanumeric characters and whitespace, at most 20 characters
  private static final Pattern pattern = Pattern.compile("^[a-zA-Z0-9\\s]{0,20}$");

  // Validates the field against the whitelist, then encodes it for HTML output
  public String validate(String name, String input) throws ValidationException {
    String canonical = normalize(input);

    if (!pattern.matcher(canonical).matches()) {
      throw new ValidationException("Improper format in " + name + " field");
    }

    // Second line of defense: encode anything that is not a letter, digit, or whitespace
    return HTMLEntityEncode(canonical);
  }

  // Normalizes to NFKC so that equivalent character sequences share one representation
  private String normalize(String input) {
    return Normalizer.normalize(input, Normalizer.Form.NFKC);
  }

  // Encodes every code point other than letters, digits, and whitespace as a numeric entity
  private static String HTMLEntityEncode(String input) {
    StringBuilder sb = new StringBuilder();

    // Iterate by code point so that supplementary characters are not split into surrogates
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      if (Character.isLetterOrDigit(cp) || Character.isWhitespace(cp)) {
        sb.appendCodePoint(cp);
      } else {
        sb.append("&#").append(cp).append(';');
      }
      i += Character.charCount(cp);
    }
    return sb.toString();
  }
}
 
// ...
 
@RequestMapping("/getnotifications.htm")
public ModelAndView getNotifications(HttpServletRequest request, HttpServletResponse response) {
  ValidateOutput vo = new ValidateOutput();

  ModelAndView mv = new ModelAndView();
  try {
    UserInfo userDetails = getUserInfo();
    List<Map<String,Object>> list = new ArrayList<Map<String,Object>>();
    List<Notification> notificationList = 
        NotificationService.getNotificationsForUserId(userDetails.getPersonId());
           
    for (Notification notification: notificationList) {
      Map<String,Object> map = new HashMap<String,Object>();
      map.put("id", vo.validate("id", notification.getId()));
      map.put("message", vo.validate("message", notification.getMessage()));
      list.add(map);
    }
            
    mv.addObject("Notifications", list);
  }
  catch(Throwable t){
    // Log to file and handle
  }
 
  return mv;
}
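
The compliant solution applies one pattern to every field, although the text above notes that whitelisting patterns may vary by field. One possible way to organize per-field patterns is sketched below. The FieldWhitelists class, its isValid method, and the pattern chosen for the id field are all assumptions made for illustration; only the message pattern is taken from the compliant solution.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class FieldWhitelists {
  private static final Map<String, Pattern> PATTERNS = new HashMap<>();
  static {
    // Assumed format for illustration: a numeric id of 1 to 10 digits
    PATTERNS.put("id", Pattern.compile("^[0-9]{1,10}$"));
    // Same pattern as the compliant solution
    PATTERNS.put("message", Pattern.compile("^[a-zA-Z0-9\\s]{0,20}$"));
  }

  // Returns true only when the field is known and the value matches its pattern
  static boolean isValid(String field, String value) {
    Pattern p = PATTERNS.get(field);
    return p != null && value != null && p.matcher(value).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValid("id", "42"));            // prints true
    System.out.println(isValid("id", "4 2"));           // prints false
    System.out.println(isValid("message", "<b>hi</b>")); // prints false
  }
}
```

Treating an unknown field name as invalid keeps the check fail-closed when a new field is added without a corresponding pattern.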

See also the weblogic.servlet.security.Utils.encodeXSS() method for more information on preventing XSS attacks.

Applicability

Failure to encode or escape output before it is displayed or passed across a trust boundary can result in the execution of arbitrary code.
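
As a concrete illustration of encoding before display, the sketch below applies numeric entity encoding to a hostile string. EntityEncodeDemo and encode are hypothetical names; the logic follows the same letter, digit, and whitespace rule as the compliant solution and is not a substitute for a vetted encoding library.

```java
final class EntityEncodeDemo {
  // Encodes every code point that is not a letter, digit, or whitespace
  // as a decimal numeric character reference
  static String encode(String input) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      if (Character.isLetterOrDigit(cp) || Character.isWhitespace(cp)) {
        sb.appendCodePoint(cp);
      } else {
        sb.append("&#").append(cp).append(';');
      }
      i += Character.charCount(cp);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints &#60;script&#62;alert&#40;1&#41;&#60;&#47;script&#62;
    System.out.println(encode("<script>alert(1)</script>"));
  }
}
```

After encoding, an HTML renderer displays the angle brackets as literal characters instead of interpreting them as the start of an element.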

Related Vulnerabilities

The Apache GERONIMO-1474 vulnerability, reported in January 2006, allowed attackers to submit URLs containing JavaScript. The Web-Access-Log viewer failed to sanitize the data it forwarded to the administrator console, thereby enabling a classic XSS attack.

Related Guidelines

MITRE CWE

CWE-116, Improper Encoding or Escaping of Output

Bibliography

[OWASP 2008]	How to Add Validation Logic to HttpServletRequest; XSS (Cross Site Scripting) Prevention Cheat Sheet
[OWASP 2011]	Cross-site Scripting (XSS)
