A software developer builds a comment form for a web app at 1:15 PM. A user submits a comment containing the text: <script>alert('test')</script>. If the application outputs this raw text block directly into the page DOM, the browser executes the script, creating a Cross-Site Scripting (XSS) security vulnerability. The developer uses the encoder to convert the reserved brackets into inert HTML entities. The code string is now safe to render in the browser.
HTML entity encoding is a technique that replaces reserved markup characters with character entity references. The browser treats these references as text literals rather than markup tags, keeping scripts from executing. Standard characters like <, >, and & map to the named entities `&lt;`, `&gt;`, and `&amp;` respectively.
This utility provides client-side HTML encoding and decoding. It translates reserved symbols, handles multi-byte UTF-8 scripts, and decodes named, decimal, and hexadecimal entities safely. The tool processes strings locally, so user input is never sent to a server or written to a database log.
The encoding engine scans input text. By default, it replaces five core markup characters: <, >, &, ", and ' with their standard HTML entity equivalents. Accented letters and emojis remain unchanged unless the non-ASCII option is selected.
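The default behaviour described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual source: the names `ENTITY_MAP` and `encodeHtml` are hypothetical, and the single quote is mapped to `&#39;` as an assumption.

```javascript
// Hypothetical lookup table for the five reserved characters.
const ENTITY_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function encodeHtml(text) {
  // One pass over the input, so ampersands emitted by the encoder
  // are never encoded a second time.
  return text.replace(/[&<>"']/g, (ch) => ENTITY_MAP[ch]);
}

console.log(encodeHtml("<script>alert('test')</script>"));
// &lt;script&gt;alert(&#39;test&#39;)&lt;/script&gt;
```

Accented letters and emoji pass through untouched because the character class only matches the five reserved characters.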
When decoding, the engine uses the browser's DOMParser. The parser builds a separate, inert document that is never attached to the page, so entity names and codes are translated back into text without executing scripts.
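A browser-side decode along these lines might look like the sketch below. It assumes the standard `DOMParser.parseFromString` Web API. The function name `decodeHtml` is hypothetical, and the tool's real implementation may differ.

```javascript
// Browser-only sketch: DOMParser is a Web API and is not built into Node.js.
function decodeHtml(text) {
  // parseFromString builds a separate document; scripts inside it do not run.
  const doc = new DOMParser().parseFromString(text, 'text/html');
  // textContent returns the parsed text with entities already resolved.
  return doc.body.textContent;
}
```

In a browser console, `decodeHtml('&lt;b&gt; &amp; &#9733; &#x2605;')` returns `<b> & ★ ★`. One limitation of this sketch is that raw tags in the input are parsed as markup, so only their text content survives.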
Let $c$ be a character with Unicode code point $N$. The encoder maps $c$ as follows:
If c is a reserved character with a named entity: Map to "&" + EntityName + ";"
Else If Encoding All Non-ASCII and N > 127: Map to "&#" + N + ";"
Else: Leave character c unchanged
Consider the character "<". It is a reserved character with a named entity, so the engine maps it to `&lt;`.
For non-ASCII characters like the star symbol (★, code point 9733), the engine translates it to decimal format: `&#9733;`.
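The mapping rule and the two worked examples can be expressed as a small function. This is a sketch under stated assumptions: `NAMED`, `encodeChar`, and `encode` are hypothetical names, "non-ASCII" is taken to mean a code point above 127, and the single quote is emitted as `&#39;` rather than a named entity.

```javascript
// Hypothetical table of reserved characters that have a named entity.
const NAMED = { '&': 'amp', '<': 'lt', '>': 'gt', '"': 'quot' };

function encodeChar(c, encodeNonAscii) {
  const n = c.codePointAt(0); // N in the rule above
  if (NAMED[c]) return '&' + NAMED[c] + ';';
  if (c === "'") return '&#39;'; // assumption: numeric form for the apostrophe
  if (encodeNonAscii && n > 127) return '&#' + n + ';';
  return c;
}

function encode(text, encodeNonAscii = false) {
  // Array.from iterates by code point, so emoji outside the BMP stay whole.
  return Array.from(text, (c) => encodeChar(c, encodeNonAscii)).join('');
}

console.log(encode('<★>'));       // &lt;★&gt;
console.log(encode('<★>', true)); // &lt;&#9733;&gt;
console.log(encode('😀', true));   // &#128512;
```

Iterating by code point rather than by UTF-16 unit matters for the last case: a surrogate pair would otherwise be written as two separate, invalid references.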
Preventing XSS Vulnerabilities: Web applications render user-generated posts. Encoding inputs before outputting them to the page blocks malicious script injections.
Displaying Code Snippets: Technical blogs print code examples containing XML or HTML tags. Escaping brackets tells the browser to display tags as code rather than rendering them as actual page elements.
Configuring XML Files: System configs use strict XML layouts. Escaping special characters in values prevents syntax errors from breaking the XML document structure.
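For the XML case, a hedged sketch of value escaping is shown below. The function name `escapeXml` and the `<setting>` element are hypothetical. The five replacements are the predefined XML entities, with `&apos;` used for the single quote.

```javascript
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

const name = 'Tom & "Jerry" <Ltd>';
const xml = `<setting name="company" value="${escapeXml(name)}"/>`;
console.log(xml);
// <setting name="company" value="Tom &amp; &quot;Jerry&quot; &lt;Ltd&gt;"/>
```

Without the escaping, the unescaped double quote would close the `value` attribute early and the document would fail to parse.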
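Securing SQL Inputs: Form fields take names containing quotes (e.g. O'Connor). Entity-encoding these values keeps the quote from breaking HTML attributes when the name is rendered back to the page. HTML entity encoding does not protect the SQL query itself; use parameterized queries for that.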
Debugging Web Scraping Feeds: Scrapers collect data containing raw entities. Decoding entities into readable character strings helps analysts clean and format text feeds.
Use name-based escapes for reserved characters. Named entities like `&lt;` are easy for developers to read. Use the non-ASCII checkbox only when the output must be ASCII-only, for example when a downstream system mishandles UTF-8.
Validate your input format. If you decode text that doesn't contain entities, the output remains unchanged. Toggling modes lets you check the difference.
Do not double-escape text. Encoding an already escaped string converts the ampersand in each entity into `&amp;`, so `&lt;` becomes `&amp;lt;` and the page displays the entity code instead of the character. Clear your inputs before re-escaping.
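The double-escaping effect is easy to reproduce. The sketch below uses a deliberately tiny, hypothetical encoder (`escapeAmp`) that handles only the ampersand, plus a heuristic check (`looksEncoded`) that is an illustration rather than a feature of the tool.

```javascript
// Hypothetical one-character encoder; enough to show the effect.
const escapeAmp = (text) => text.replace(/&/g, '&amp;');

// Heuristic: does the text already contain something shaped like an entity?
const looksEncoded = (text) =>
  /&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);/.test(text);

const once = escapeAmp('Fish & Chips');
const twice = escapeAmp(once);

console.log(once);  // Fish &amp; Chips
console.log(twice); // Fish &amp;amp; Chips
console.log(looksEncoded('Fish & Chips'), looksEncoded(once)); // false true
```

A check like `looksEncoded` can only warn, not decide: text that legitimately discusses entities will also match.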
Keep file sizes under 15MB. Complex scripts can slow down page rendering. Use command-line utilities for massive files.
A regex-based mapper translates reserved characters. The native browser DOMParser resolves entities, decoding named, decimal, and hexadecimal codes safely.
We tested the engine on Chrome 120. A 100KB text file encodes in 0.6ms. A 1MB file encodes in 5.2ms. Performance scales with the number of special characters.
No data is uploaded or logged. All processing takes place locally inside your browser memory. You can run the tool offline.
| Metric | This Tool | Alternative 1 | Alternative 2 |
|---|---|---|---|
| Algorithm | Local Regex/DOM | Server-side API | Basic innerHTML |
| Speed (1MB) | 5.2ms | 46ms | XSS Risk (on decode) |
| Non-ASCII Option | Yes (Decimal Codes) | No | No |
| Data Privacy | 100% Local | Logs Saved | 100% Local |
| Cost | Free | Subscription | Free |
Named entities use text names (e.g. `&quot;`), which are easy to read. Decimal entities use the character's Unicode code point (e.g. `&#34;`), which works across all XML parsers.
Decoding does not execute scripts. The decoding engine parses text inside a detached, inert document using DOMParser. This extracts characters safely without running inline script commands.
Single quotes are escaped to `&#39;` (or `&apos;` in XML), which prevents attribute injection attacks in HTML templates.
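To make the numeric forms concrete, the sketch below resolves decimal and hexadecimal references with a regular expression. This is a swapped-in technique for illustration and runs outside the browser; it is not the tool's DOMParser-based decoder, and `decodeNumericEntities` is a hypothetical name. Named entities are deliberately left alone.

```javascript
function decodeNumericEntities(text) {
  // Group 1 captures hex digits (&#x...;), group 2 captures decimal digits (&#...;).
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const n = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing.
    return n <= 0x10ffff ? String.fromCodePoint(n) : match;
  });
}

console.log(decodeNumericEntities('&#34; and &#x22;'));     // " and "
console.log(decodeNumericEntities('&#9733; and &#x2605;')); // ★ and ★
console.log(decodeNumericEntities('&quot;'));               // &quot;
```

Decimal 34 and hexadecimal 22 name the same code point, which is why both forms decode to the same double quote.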
The encoder escapes five core characters: ampersand (&), less-than (<), greater-than (>), double quote ("), and single quote (').
There is no strict limit, but files over 20MB can cause page rendering to lag. We recommend using CLI scripts for very large text databases.
Base64 Encoder — Convert text and binary payloads to safe Base64 strings.
URL Encoder — Percent-encode parameters to pass query values in URLs safely.
JWT Decoder — Decode JSON Web Token header and payload fields locally.
Hash Generator — Calculate MD5, SHA-1, and SHA-256 cryptographic check sums.