Proper input sanitization can prevent insertion of malicious data into a subsystem such as a database. However, different subsystems require different types of sanitization. Fortunately, it is usually obvious which subsystems will receive input and consequently what sanitization is required.
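As a rough illustration of why the required sanitization depends on the receiving subsystem, the sketch below prepares the same raw value for two different consumers: a URL query string and a regular-expression engine. The class and method names are hypothetical and chosen only for this example; the library calls are the standard `java.net.URLEncoder.encode` and `java.util.regex.Pattern.quote`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class ContextEncoding {

  // Encodes a value for use as a URL query parameter value.
  static String forUrlParameter(String raw) {
    try {
      return URLEncoder.encode(raw, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 support is mandatory on every Java platform
      throw new IllegalStateException(e);
    }
  }

  // Quotes a value so that a regex engine treats it as a literal string.
  static String forRegexLiteral(String raw) {
    return Pattern.quote(raw);
  }

  public static void main(String[] args) {
    String raw = "a+b&c=d";
    System.out.println(forUrlParameter(raw));  // prints a%2Bb%26c%3Dd
    System.out.println(forRegexLiteral(raw));  // prints \Qa+b&c=d\E
  }
}
```

Neither result is appropriate for the other subsystem, which is why the destination has to be known before a sanitization method is chosen.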
Several subsystems exist for the purpose of showing output. An HTML renderer, as part of a web browser, is one common subsystem for displaying output. Because data that is sent to an output subsystem might not come directly from an untrusted source, it is tempting to assume that no sanitization is required. That assumption is dangerous: the data may still originate indirectly from an untrusted source and contain malicious content. Data that is not properly sanitized for these subsystems can enable several types of attacks. For example, an HTML renderer can be prone to HTML injection and Cross-Site Scripting (XSS) [OWASP 2011]. (The term Cross-Site Scripting is applied to such attacks even when they do not cross from one site to another.) Therefore, output sanitization to prevent such attacks is as vital as input sanitization.
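A minimal sketch of the problem, assuming a value that was stored earlier and is now being written into an HTML page. The class name, the `escapeHtml` helper, and the payload are hypothetical. The helper escapes only the five characters that are significant in HTML text and quoted attribute values, which is narrower than encoding every character not known to be safe, so treat it as an illustration of the attack rather than a complete defense.

```java
// Hypothetical illustration; not part of any framework.
final class HtmlEscapeSketch {

  // Replaces the characters that let data break out of HTML text
  // or a quoted attribute value.
  static String escapeHtml(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char ch = s.charAt(i);
      switch (ch) {
        case '&':  sb.append("&amp;");  break;
        case '<':  sb.append("&lt;");   break;
        case '>':  sb.append("&gt;");   break;
        case '"':  sb.append("&quot;"); break;
        case '\'': sb.append("&#39;");  break;
        default:   sb.append(ch);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Value read back from storage; it looks "trusted" but was
    // originally supplied by a user.
    String stored = "<script>alert(1)</script>";

    // Unescaped: the browser would parse this as a script element.
    System.out.println("<p>" + stored + "</p>");

    // Escaped: the browser renders the payload as inert text.
    // prints <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
    System.out.println("<p>" + escapeHtml(stored) + "</p>");
  }
}
```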
As with input validation, normalize data before sanitizing for malicious characters. All output characters other than those known to be safe should be encoded to avoid vulnerabilities caused by data that bypasses validation. See IDS01-J. Normalize strings before validating them for more information.
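The ordering matters because some characters only become dangerous after normalization. The sketch below, with a hypothetical class and check method, uses the fullwidth forms of the angle brackets (U+FF1C and U+FF1E), which NFKC normalization maps to the ASCII `<` and `>`. A check performed before normalization misses them.

```java
import java.text.Normalizer;

// Hypothetical illustration of why normalization must come first.
final class NormalizeFirst {

  // A deliberately simple check for ASCII angle brackets.
  static boolean containsAngleBracket(String s) {
    return s.indexOf('<') >= 0 || s.indexOf('>') >= 0;
  }

  public static void main(String[] args) {
    // U+FF1C and U+FF1E are the fullwidth forms of '<' and '>'
    String input = "\uFF1Cscript\uFF1E";

    // Checking first: the fullwidth characters slip past the check.
    System.out.println(containsAngleBracket(input));       // prints false

    // Normalizing first: NFKC folds them to ASCII, so the check fires.
    String normalized = Normalizer.normalize(input, Normalizer.Form.NFKC);
    System.out.println(containsAngleBracket(normalized));  // prints true
  }
}
```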
Noncompliant Code Example
This noncompliant code example uses the model-view-controller (MVC) concept of the Java EE-based Spring Framework to display data to the user without encoding or escaping it.
@RequestMapping("/getnotifications.htm")
public ModelAndView getNotifications(HttpServletRequest request,
                                     HttpServletResponse response) {
  ModelAndView mv = new ModelAndView();
  try {
    UserInfo userDetails = getUserInfo();
    List<Map<String,Object>> list = new ArrayList<Map<String,Object>>();
    List<Notification> notificationList =
        notificationService.getNotificationsForUserId(userDetails.getPersonId());

    for (Notification notification: notificationList) {
      Map<String,Object> map = new HashMap<String,Object>();
      map.put("id", notification.getId());
      map.put("message", notification.getMessage());
      list.add(map);
    }

    mv.addObject("Notifications", list);
  } catch (Throwable t) {
    // log to file and handle
  }
  return mv;
}
Compliant Solution
This compliant solution defines a ValidateOutput class that normalizes the output to a known character set, performs output sanitization using a whitelist, and encodes any unspecified data values to enforce a double-checking mechanism. Note that the required whitelisting patterns may vary according to the specific needs of different fields [OWASP 2008].
import java.text.Normalizer;
import java.util.regex.Pattern;

// ValidationException is assumed to be an application-defined checked exception
public class ValidateOutput {
  // Allows only alphanumeric characters and spaces
  private static final Pattern pattern = Pattern.compile("^[a-zA-Z0-9\\s]{0,20}$");

  // Validates and encodes the input field based on a whitelist
  private String validate(String name, String input) throws ValidationException {
    String canonical = normalize(input);

    if (!pattern.matcher(canonical).matches()) {
      throw new ValidationException("Improper format in " + name + " field");
    }

    // Performs output encoding for nonvalid characters
    canonical = HTMLEntityEncode(canonical);
    return canonical;
  }

  // Normalizes to known instances
  private String normalize(String input) {
    return Normalizer.normalize(input, Normalizer.Form.NFKC);
  }

  // Encodes nonvalid data
  public static String HTMLEntityEncode(String input) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); i++) {
      char ch = input.charAt(i);
      if (Character.isLetterOrDigit(ch) || Character.isWhitespace(ch)) {
        sb.append(ch);
      } else {
        sb.append("&#" + (int)ch + ";");
      }
    }
    return sb.toString();
  }

  // description and input are String variables containing values obtained from a database
  // description = "description" and input = "2 items available"
  public static void display(String description, String input) throws ValidationException {
    ValidateOutput vo = new ValidateOutput();
    // Keep the validated, encoded result; the raw input must not be used for output
    String safe = vo.validate(description, input);
    // Pass safe to another system or display it to the user
  }
}
See also the method weblogic.servlet.security.Utils.encodeXSS().
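One caveat of a char-by-char encoder such as HTMLEntityEncode above is that a character outside the Basic Multilingual Plane is represented in a Java String as two surrogate chars, each of which would be emitted as its own numeric reference. The following is a sketch of a code-point-aware variant, not a replacement mandated by this rule; the class and method names are hypothetical. Unpaired surrogates are not handled specially here and would still produce a reference to a surrogate value.

```java
// Hypothetical variant of the encoder that iterates by Unicode code point.
final class CodePointEncoder {

  // Passes letters, digits, and whitespace through unchanged and emits
  // a decimal numeric character reference for every other code point.
  static String encode(String input) {
    StringBuilder sb = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      if (Character.isLetterOrDigit(cp) || Character.isWhitespace(cp)) {
        sb.appendCodePoint(cp);
      } else {
        sb.append("&#").append(cp).append(';');
      }
      // Advance by one or two chars depending on the code point
      i += Character.charCount(cp);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode("<b>"));           // prints &#60;b&#62;
    // U+1F600 is stored as the surrogate pair D83D DE00
    System.out.println(encode("\uD83D\uDE00"));  // prints &#128512;
  }
}
```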
Applicability
Failure to encode or escape output before it is displayed or passed across a trust boundary can result in the execution of arbitrary code.
Related Vulnerabilities
The Apache GERONIMO-1474 vulnerability, reported in January 2006, allowed attackers to submit URLs containing JavaScript. The Web-Access-Log viewer did not sanitize the data it forwarded to the administrator console, thereby enabling a classic Cross-Site Scripting attack.
Bibliography
[MITRE 2009] CWE ID 116 "Improper Encoding or Escaping of Output"
[OWASP 2008] "How to add validation logic to HttpServletRequest"; "XSS (Cross Site Scripting) Prevention Cheat Sheet"
[OWASP 2011] "Cross-site Scripting (XSS)"