Entity declarations define shortcuts to commonly used text or special characters. An entity declaration may define either an internal or an external entity. For internal entities, the content of the entity is given in the declaration. For external entities, the content is specified by a Uniform Resource Identifier (URI).
Entities may be either parsed or unparsed. The contents of a parsed entity are referred to as its replacement text. An unparsed entity is a resource whose contents may or may not be text, and if text, may be other than XML. Parsed entities are invoked by name using an entity reference; unparsed entities are invoked by name, given in the value of ENTITY or ENTITIES attributes.
An XML document can be dynamically constructed from smaller logical blocks called entities. Entities can be internal, external, or parameter-based. External entities allow the inclusion of XML data from external files.
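To make the notion of replacement text concrete, the following is a minimal sketch of an internal parsed entity being expanded by the JDK's default DOM parser. The class name InternalEntityDemo, the entity name company, and the document contents are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InternalEntityDemo {
    // Hypothetical document: the internal entity "company" is declared
    // in the internal DTD subset and referenced once in the body.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note [\n"
      + "  <!ENTITY company \"Example Corp\">\n"
      + "]>\n"
      + "<note>Sent by &company;</note>";

    // Parses the document and returns the text of the root element,
    // with the entity reference replaced by its replacement text.
    static String expand(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(expand(XML));
    }
}
```

Because the entity is internal, its content comes entirely from the declaration and no URI is dereferenced. An external entity would instead name a URI, and the parser may fetch that URI to obtain the replacement text.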
According to the XML W3C Recommendation [W3C 2008], section 4.4.3, "Included If Validating":
When an XML processor recognizes a reference to a parsed entity, to validate the document, the processor MUST include its replacement text. If the entity is external, and the processor is not attempting to validate the XML document, the processor MAY, but need not, include the entity's replacement text.
An XML external entity (XXE) attack occurs when XML input containing a reference to an external entity is processed by an improperly configured XML parser. An attacker might use an XXE attack to gain access to sensitive information by manipulating the URI of the entity to refer to special files on the local file system, for example, files containing sensitive data such as passwords or private user data. An attacker might also launch a denial-of-service attack, for example, by specifying /dev/random or /dev/tty as the input URI, which can crash or indefinitely block a program. Because inclusion of replacement text from an external entity is optional, not all XML processors are vulnerable to external entity attacks.
...
This noncompliant code example attempts to parse the file evil.xml, report any errors, and exit. However, a SAX (Simple API for XML) or a DOM (Document Object Model) parser will attempt to access the URI specified by the SYSTEM attribute, which means it will attempt to read the contents of the local /dev/tty file. On POSIX systems, reading this file causes the program to block until input data is supplied to the machine's console. Consequently, an attacker can use this malicious XML file to cause the program to hang.
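The parser's attempt to dereference the SYSTEM identifier can be observed without actually opening the device file. The following is a minimal sketch, not the example referred to above: the class name ExternalEntityProbe and the document contents are assumptions made for illustration. It installs a resolver that records each requested system identifier and returns an empty input source, so nothing is read from /dev/tty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ExternalEntityProbe {
    // Hypothetical malicious input: the external DTD subset points at /dev/tty.
    static final String EVIL_XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE foo SYSTEM \"file:/dev/tty\">\n"
      + "<foo>bar</foo>";

    // Parses the input and returns every system identifier the parser
    // asked to resolve. The resolver supplies an empty stream instead of
    // the real resource, so the parse cannot block on the console.
    static List<String> probe(String xml) throws Exception {
        final List<String> requested = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                requested.add(systemId);
                return new InputSource(new StringReader(""));
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return requested;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Parser requested: " + probe(EVIL_XML));
    }
}
```

Without the overriding resolver, a default-configured parser would try to open the requested URI itself, which is the blocking behavior described above. The same hook can also be used to reject unexpected identifiers rather than merely record them.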
...