Safely escape or restore special characters (< > & " ') in XML documents. Prevent parsing errors, avoid XML injection, and ensure valid data interchange.
XML (Extensible Markup Language) uses a small set of delimiter characters to define markup: <, >, &, ", and '. When these characters appear inside element content or attribute values, they must be replaced by predefined entities to keep the document well-formed. This replacement is called XML escaping (encoding). The reverse operation, restoring the original characters, is called unescaping or decoding.
| Original character | Replacement entity | Description |
|---|---|---|
| & | &amp; | Ampersand – always escaped first |
| < | &lt; | Less-than sign |
| > | &gt; | Greater-than sign |
| " | &quot; | Double quote (attribute delimiter) |
| ' | &apos; | Apostrophe / single quote |
Under the W3C XML 1.0 specification, these five entities are predefined, and every conforming XML processor must recognize them. Numeric character references (e.g., &#60;) are also valid; our tool preserves them during encoding and handles common entities during decoding.
A financial API accepts XML requests for transaction records. An attacker attempts to inject malicious elements using <script> tags. By properly encoding all dynamic input (e.g., a <script> tag submitted in the user's name field becomes &lt;script&gt;), the XML parser never interprets it as markup. Our tool helps developers test payloads and validate escaping routines, reducing the vulnerability surface.
The encoding function scans the input string and replaces each occurrence of &, <, >, ", and ' with the corresponding XML entity. The encoding order matters: the ampersand (&) must be replaced first, because otherwise the ampersands introduced by the other replacements would themselves be escaped (< would end up as &amp;lt;). Our implementation ensures that already encoded entities like &lt; are not re-encoded. Decoding applies the inverse mapping (&lt; → <, etc.), taking care to decode the ampersand last. This behavior matches what common XML processors expect.
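The ordering rules described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function names `encodeXml` and `decodeXml`, the `preserveEntities` option, and the pattern used to recognize existing entities are all assumptions made for the example.

```javascript
// Hypothetical helpers; the tool's real implementation is not shown in this document.

// Matches an ampersand that does NOT start one of the five predefined
// entities or a numeric character reference (assumed detection rule).
const BARE_AMPERSAND = /&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9a-fA-F]+);)/g;

function encodeXml(text, { preserveEntities = false } = {}) {
  // Ampersand first, so the "&" introduced by later replacements is never re-escaped.
  const ampersand = preserveEntities ? BARE_AMPERSAND : /&/g;
  return text
    .replace(ampersand, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function decodeXml(text) {
  // Ampersand last, so "&amp;lt;" decodes to "&lt;" rather than to "<".
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

console.log(encodeXml('5 < 7 & "trustworthy"'));
// 5 &lt; 7 &amp; &quot;trustworthy&quot;
console.log(encodeXml("a &lt; b & c", { preserveEntities: true }));
// a &lt; b &amp; c
console.log(decodeXml("&amp;lt;"));
// &lt;
```

With `preserveEntities` left at its default of `false`, encoding text that already contains entities escapes their ampersands again, which is why the two modes are kept separate in this sketch.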
All operations are performed with native JavaScript string manipulation using global replace with regular expressions, guaranteeing O(n) complexity and real‑time feedback even for large XML fragments (tested up to 5 MB).
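As an illustration of the linear-time claim, the five replacements can also be done in a single pass with one global regular expression and a lookup table. This is a sketch under assumptions: `XML_ESCAPES` and `escapeXmlSinglePass` are hypothetical names, and the tool itself may chain separate replacements instead.

```javascript
// Hypothetical single-pass variant: one scan over the input, one lookup per match.
const XML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

function escapeXmlSinglePass(text) {
  // Each character is examined once, so replacement order no longer matters.
  return text.replace(/[&<>"']/g, (ch) => XML_ESCAPES[ch]);
}

console.log(escapeXmlSinglePass(`<a href="x">Tom & Jerry's</a>`));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&apos;s&lt;/a&gt;
```

Because every match is replaced from the original input rather than from an intermediate result, this variant cannot accidentally re-escape an ampersand it has just produced.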
Performance note: String replacement in JavaScript is handled by the browser's native engine. For texts under 1 MB, the conversion completes in under 50 ms on typical devices. Larger inputs (up to 10 MB) may take a few seconds but remain functional. The tool is not designed for streaming multi-gigabyte XML files.
| Scenario | Recommended action | Why |
|---|---|---|
| Generating XML dynamically from database values | Encode (escape) text content & attributes | Prevent broken markup and injection. |
| Reading escaped XML stored in logs or JSON | Decode to human‑readable form | Restore original characters for analysis. |
| Embedding XML inside XML (e.g., SOAP messages) | Encode the inner XML as text | Keeps outer document well‑formed. |
| Sanitizing user input for XML export | Encode + validate | Eliminates risk of element injection. |
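The "Embedding XML inside XML" row can be made concrete with a short sketch. The helper names (`escapeForXmlText`, `wrapAsPayload`) and the element names (`payload`, `order`, `note`) are invented for illustration; a real SOAP envelope would use its own schema.

```javascript
// Hypothetical example: carry an inner XML document as plain text inside an outer element.
const XML_TEXT_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;" };

function escapeForXmlText(text) {
  return text.replace(/[&<>"']/g, (ch) => XML_TEXT_ESCAPES[ch]);
}

function wrapAsPayload(innerXml) {
  // The inner document travels as character data, not as child elements.
  return `<payload>${escapeForXmlText(innerXml)}</payload>`;
}

const inner = '<order id="42"><note>Fish & chips</note></order>';
console.log(wrapAsPayload(inner));
// <payload>&lt;order id=&quot;42&quot;&gt;&lt;note&gt;Fish &amp; chips&lt;/note&gt;&lt;/order&gt;</payload>
```

The receiver parses the outer document first, reads the text of `payload`, and only then parses that text as a second XML document.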
Example transformation:
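Input: <message>5 < 7 & "trustworthy"</message>
Encoded: &lt;message&gt;5 &lt; 7 &amp; &quot;trustworthy&quot;&lt;/message&gt;
Decoded back to original: <message>5 < 7 & "trustworthy"</message>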
If the input includes a CDATA section containing < or &, the encoder will escape them, which is correct when you need to embed the CDATA section inside another XML document. However, if you want to preserve a literal CDATA block for direct XML consumption, you should not escape its content. The tool works transparently on any string. For edge cases involving the sequence ]]>, note that XML does not allow it inside a CDATA section, because it marks the end of the section; our tool treats it as plain text.
The tool leaves HTML named entities (e.g., &copy;) unchanged unless they collide with predefined entities. We focus on the five core XML entities. Numeric entities remain as-is. This prevents unwanted character conversion and stays within the scope of XML 1.0 syntax. For full entity decoding, specialized tools may be required.
Encoding text that already contains entities can produce double-encoded sequences (e.g., &amp;lt;). Therefore, we recommend using encode on raw text only. If you accidentally double-encode, use the decode button twice to revert. Our decode method properly reverses standard entity sequences.