Convert international text (Unicode) to Punycode — the ASCII‑compatible encoding used for internationalized domain names (IDN). Decode Punycode back to readable characters.
Punycode is a standardized encoding (RFC 3492) that converts Unicode strings into the limited ASCII character set allowed in domain names. It was created to support internationalized domain names (IDNs), so that users can type domain names in their native scripts (e.g. 例子.中国, münchen.de). Punycode itself produces only the encoded body; in a domain name, IDNA adds the prefix xn-- to every encoded label so that software can tell it apart from an ordinary ASCII label. For example, "bücher" encodes to "bcher-kva" and appears in the DNS as "xn--bcher-kva".
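As a quick illustration, the round trip for "bücher" can be reproduced with the "punycode" codec that ships with Python's standard library. This is a minimal sketch, not the code behind this tool; the variable names are only illustrative, and the xn-- prefix is added by hand because the codec returns the raw Punycode body.

```python
# Round trip with Python's built-in "punycode" codec (raw RFC 3492 output).
label = "bücher"

# Encode: the codec returns bytes without the IDNA prefix.
encoded = label.encode("punycode").decode("ascii")

# IDNA marks encoded labels with the ACE prefix "xn--".
ace_label = "xn--" + encoded

# Decode: strip the prefix, then reverse the encoding.
decoded = ace_label[len("xn--"):].encode("ascii").decode("punycode")

print(encoded)    # bcher-kva
print(ace_label)  # xn--bcher-kva
print(decoded)    # bücher
```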
Punycode uses the Bootstring algorithm with parameters (base = 36, tmin = 1, tmax = 26, skew = 38, damp = 700, initial_bias = 72, initial_n = 128).
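It's a lossless, variable‑length encoding: ASCII characters are copied unchanged, and each non‑ASCII character is written as an integer (a delta that captures both its code point and its insertion position) in base‑36 digits, using thresholds that shift with an adaptive bias.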
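To make the variable‑length digits concrete, here is a minimal sketch of the integer encoding step with the bias held fixed at its initial value. The helper names (digit_to_char, threshold, encode_integer) are hypothetical and chosen for this example; the input 869 is the delta the algorithm computes for "ü" in "münchen".

```python
# Bootstring parameters for Punycode, as listed above (RFC 3492).
BASE, TMIN, TMAX = 36, 1, 26
INITIAL_BIAS = 72

def digit_to_char(d: int) -> str:
    """Map a digit value 0..35 to its character: a-z, then 0-9."""
    return chr(ord("a") + d) if d < 26 else chr(ord("0") + d - 26)

def threshold(k: int, bias: int) -> int:
    """Threshold for digit position k: k - bias clamped to [TMIN, TMAX]."""
    return min(max(k - bias, TMIN), TMAX)

def encode_integer(q: int, bias: int) -> str:
    """Write q as a generalized variable-length integer (illustrative helper)."""
    digits = []
    k = BASE
    while True:
        t = threshold(k, bias)
        if q < t:  # a digit below the threshold terminates the number
            break
        digits.append(digit_to_char(t + (q - t) % (BASE - t)))
        q = (q - t) // (BASE - t)
        k += BASE
    digits.append(digit_to_char(q))
    return "".join(digits)

print(encode_integer(869, INITIAL_BIAS))  # 3ya
```

A full encoder also updates the bias after every encoded delta; that step is left out here to keep the digit logic visible.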
Before Punycode, domain name labels were limited to the ASCII letters a‑z (case‑insensitive), the digits 0‑9, and the hyphen. The Internet Engineering Task Force (IETF) recognized the need for multilingual domains and adopted Punycode in 2003 as part of IDNA (Internationalizing Domain Names in Applications). The algorithm was designed by Adam M. Costello and is now supported by all major browsers and by most IDN‑aware libraries and mail software; the DNS itself only ever handles the ASCII form. Understanding Punycode helps developers debug IDN issues, test email validation, and handle user input with international characters.
The core idea: process the non‑ASCII code points in increasing order, encode the difference between successive code points (combined with the insertion position) as a variable‑length integer, and adapt the bias after each one so that the small deltas typical of a single script need few digits.
The steps Bootstring takes for "münchen" can be summarized as follows (simplified):
| Step | Description | Result |
|---|---|---|
| 1. | Basic (ASCII) characters m, n, c, h, e, n are copied in order | mnchen |
| 2. | Non‑basic character: ü (U+00FC) at position 1 (0‑based) | code point 252 |
| 3. | Encode delta: (252 - 128) × (6 + 1) + 1 = 869 with initial bias 72 → digits "3ya" | appended after the delimiter "-" |
| 4. | Final: "mnchen" + "-" + "3ya" = "mnchen-3ya", plus prefix "xn--" | xn--mnchen-3ya |
All examples are verified using the punycode.js library.
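The delta passed to the digit encoder combines the jump in code point with the insertion position; for "münchen" that is (252 - 128) × (6 + 1) + 1 = 869. The sketch below traces only this part of the algorithm and then cross‑checks the final string against Python's built‑in "punycode" codec. The function name insertion_deltas is hypothetical, and digit output and bias adaptation are deliberately omitted.

```python
INITIAL_N = 128  # first non-basic code point

def insertion_deltas(label: str) -> list[tuple[str, int]]:
    """Return (character, delta) pairs in the order Bootstring encodes them."""
    code_points = [ord(ch) for ch in label]
    n = INITIAL_N
    delta = 0
    handled = sum(1 for cp in code_points if cp < INITIAL_N)  # basic characters
    result = []
    while handled < len(code_points):
        # Jump to the smallest code point not handled yet.
        m = min(cp for cp in code_points if cp >= n)
        delta += (m - n) * (handled + 1)
        n = m
        for cp in code_points:
            if cp < n:
                delta += 1            # skip an already-placed character
            elif cp == n:
                result.append((chr(n), delta))
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return result

print(insertion_deltas("münchen"))                     # [('ü', 869)]
print("münchen".encode("punycode").decode("ascii"))    # mnchen-3ya
```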
The city of Munich's official website uses "muenchen.de" (ASCII) for broader compatibility. But many locals prefer "münchen.de". When you type "münchen.de" in a browser, the label "münchen" is converted to "xn--mnchen-3ya" behind the scenes, giving "xn--mnchen-3ya.de". The DNS sees only the Punycode version, while users normally see the readable Unicode form in the address bar (IDNA defines the conversion in both directions, and browsers may fall back to showing the xn-- form for labels they consider confusable). This tool allows you to preview exactly what string is sent to the DNS.
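The conversion is applied label by label, so ASCII labels such as "de" pass through untouched. A minimal sketch of that idea follows; the function to_ascii_domain is hypothetical, and it assumes the input is already lowercased and normalized, which real IDNA processing performs before encoding. Python's built‑in "idna" codec (an IDNA 2003 implementation) is used only as a cross‑check.

```python
def to_ascii_domain(domain: str) -> str:
    """Encode each non-ASCII label with Punycode and add the xn-- prefix.

    Assumption: the domain is already lowercased and normalized; no
    validity checks are performed.
    """
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)  # plain ASCII labels are left as they are
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

print(to_ascii_domain("münchen.de"))                  # xn--mnchen-3ya.de
print("münchen.de".encode("idna").decode("ascii"))    # xn--mnchen-3ya.de
```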
| Encoding | Character set | Use case |
|---|---|---|
| Punycode | ASCII (a‑z, 0‑9, -) | Domain labels (IDN) |
| UTF‑8 | Any Unicode, multi‑byte | General text, web pages |
| Percent‑encoding | ASCII + %xx | URL components (RFC 3986) |
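To see how the three encodings in the table differ, the short sketch below runs the same string through each of them using only the Python standard library. The variable names are illustrative.

```python
from urllib.parse import quote

text = "münchen"

# Punycode with the IDNA prefix: what a domain label looks like in the DNS.
punycode_form = "xn--" + text.encode("punycode").decode("ascii")

# UTF-8: the byte sequence used for general text; "ü" takes two bytes.
utf8_bytes = text.encode("utf-8")

# Percent-encoding: the UTF-8 bytes written as %XX for use in URL components.
percent_form = quote(text)

print(punycode_form)  # xn--mnchen-3ya
print(utf8_bytes)     # b'm\xc3\xbcnchen'
print(percent_form)   # m%C3%BCnchen
```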
Punycode's Bootstring uses a set of carefully chosen parameters to make the encoding compact for small scripts (like Latin with diacritics) while still handling large repertoires such as CJK ideographs. The damp parameter scales down the very first delta before the bias is recomputed, because that delta has to jump from 128 to the first non‑ASCII code point and is usually much larger than the ones that follow; later deltas are simply halved. The skew parameter shapes how the scaled delta is mapped to the new bias. RFC 3492 includes a sample C implementation, and our JavaScript library follows it exactly.
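The effect of damp and skew is easiest to see in the bias adaptation function itself. The sketch below is a Python transcription of the adaptation procedure described in RFC 3492, written for illustration rather than taken from this tool's library; it compares the bias obtained for the first delta of "münchen" (869, with 7 code points handled) with and without damping.

```python
BASE, TMIN, TMAX = 36, 1, 26
SKEW, DAMP = 38, 700

def adapt(delta: int, num_points: int, first_time: bool) -> int:
    """Compute the next bias from the delta that was just encoded."""
    # The first delta is divided by DAMP, later ones are halved.
    delta = delta // DAMP if first_time else delta // 2
    # Compensate for the string growing by one code point.
    delta += delta // num_points
    k = 0
    # Estimate how many digits the next delta is likely to need.
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + (BASE - TMIN + 1) * delta // (delta + SKEW)

print(adapt(869, 7, first_time=True))   # 0
print(adapt(869, 7, first_time=False))  # 45
```

With damping the bias drops to 0, so a following small delta fits in a single digit; without it the bias would stay at 45 and the same delta would need two digits.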