Punycode Converter

Convert international text (Unicode) to Punycode — the ASCII‑compatible encoding used for internationalized domain names (IDN). Decode Punycode back to readable characters.

Accepts any Unicode characters or existing Punycode (xn-- prefix).
中文 → xn--fiq228c
bücher → xn--bcher-kva
русский → xn--h1acbxfam
العربية → xn--mgbcd4a2b0d2b
(emoji) → xn--h28h...
Mix: köln 北京 → xn--kln-7qa9553b
Privacy first: All conversion runs locally in your browser. No text is sent to any server.

What is Punycode?

Punycode is a standardized encoding (RFC 3492) that converts Unicode strings into the limited ASCII character set allowed in domain names. It was created to support internationalized domain names (IDNs), so that users can type domain names in their native scripts (e.g. 例子.中国, münchen.de). In a domain name, every encoded label carries the prefix xn--, which IDNA adds in front of the raw Punycode output. For example, "bücher" encodes to "bcher-kva", and the resulting domain label is "xn--bcher-kva".

Punycode uses the Bootstring algorithm with parameters (base = 36, tmin = 1, tmax = 26, skew = 38, damp = 700, initial_bias = 72, initial_n = 128).

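It's a variable‑length, lossless encoding: ASCII characters are copied unchanged, and each non‑ASCII code point is described by a variable‑length integer written with the digits a‑z and 0‑9. The thresholds that decide how many digits an integer needs are adjusted after every code point through an adaptive bias.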

Historical & Technical Context

Before IDNA, host names were limited to the letters A‑Z, the digits 0‑9, and the hyphen. The Internet Engineering Task Force (IETF) recognized the need for multilingual domains and adopted Punycode in 2003 as part of IDNA (Internationalizing Domain Names in Applications). The algorithm was designed by Adam M. Costello and is now implemented in modern browsers, email software, and IDNA libraries; the DNS itself only ever sees the ASCII form. Understanding Punycode helps developers debug IDN issues, test email validation, and handle user input with international characters.

Why Use This Interactive Converter?

  • Web Development: Test how Unicode domain names will appear in punycode before registration.
  • Quality Assurance: Verify that your application correctly encodes/decodes IDNs.
  • Learning: See the Bootstring algorithm in action; examine length changes and ASCII range.
  • Linguistics: Convert non‑Latin scripts to a shareable ASCII representation.

How Punycode Works 

The Bootstring algorithm used by Punycode can be summarized in these steps (simplified):

  1. Separate basic ASCII characters (if any) from non‑basic Unicode code points.
  2. Output the basic characters in their original order, followed by a delimiter (hyphen) if there was at least one basic character.
  3. Encode the non‑basic code points, in increasing code‑point order, as digits (a‑z, 0‑9) using a generalized variable‑length integer representation with bias adaptation.
  4. The resulting digits are case‑insensitive. For use in a domain name, IDNA prepends "xn--" to the encoded label.

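The core idea: each non‑basic code point is stored as a delta that combines its distance from the previously encoded code point with its insertion position in the string. The bias (not the base, which stays at 36) adapts to the size of recent deltas, so that the small deltas typical of text in a single script take fewer digits.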
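
The steps above can be sketched in JavaScript as follows. This is a minimal illustration that follows the encoding procedure of RFC 3492 (section 6.3), not the source of punycode.js: the names punycodeEncode, adapt, and digitToChar are chosen here for illustration, and the overflow checks a production encoder needs are left out.

```javascript
// Bootstring parameters for Punycode (RFC 3492, section 5).
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128, DELIMITER = "-";

// Map a digit value 0..35 to its character: 0-25 -> a-z, 26-35 -> 0-9.
const digitToChar = (d) => String.fromCharCode(d < 26 ? 97 + d : 22 + d);

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - TMIN) * TMAX) / 2)) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encode one label. The "xn--" prefix is NOT added here.
function punycodeEncode(input) {
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));

  // Steps 1 and 2: copy the basic code points, then the delimiter.
  const basic = codePoints.filter((cp) => cp < INITIAL_N);
  let output = String.fromCodePoint(...basic);
  const b = basic.length;
  if (b > 0) output += DELIMITER;

  // Step 3: encode the remaining code points in increasing order.
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS, h = b;
  while (h < codePoints.length) {
    // Smallest code point that has not been handled yet.
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        // Write delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = Math.min(Math.max(k - bias, TMIN), TMAX);
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log("xn--" + punycodeEncode("bücher")); // xn--bcher-kva
```

The function returns the raw Punycode string; adding "xn--" is left to the caller because the prefix belongs to IDNA rather than to the encoding.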

Step‑by‑Step Example: "münchen" → xn--mnchen-3ya

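Step | Description | Result
1 | Basic (ASCII) characters m, n, c, h, e, n are copied in order | mnchen
2 | Non‑basic character: ü (U+00FC, code point 252) at position 1 (0‑based) | one code point to insert
3 | Delta = (252 - 128) × (6 + 1) + 1 = 869: 124 code‑point steps, each worth 7 possible positions, plus 1 to reach position 1. Written as a variable‑length integer with the initial bias 72 | 3ya
4 | Basic characters + delimiter "-" + digits = "mnchen-3ya", then the IDNA prefix "xn--" | xn--mnchen-3ya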
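
The digit step in the example can be checked with a few lines of JavaScript. This is a sketch of the variable‑length integer encoding from RFC 3492 for a single delta; the helper name encodeDelta is hypothetical, and the delta passed in is the full value the algorithm computes for "münchen", which accounts for both the code‑point distance (252 - 128) and the insertion position.

```javascript
// Digit values 0..25 map to a-z and 26..35 map to 0-9 (RFC 3492, section 5).
const DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789";

// Encode one delta as a generalized variable-length integer.
// `bias` defaults to 72, Punycode's initial_bias.
function encodeDelta(delta, bias = 72) {
  const base = 36, tmin = 1, tmax = 26;
  let q = delta;
  let out = "";
  for (let k = base; ; k += base) {
    // Threshold for this digit position, clamped to [tmin, tmax].
    const t = Math.min(Math.max(k - bias, tmin), tmax);
    if (q < t) break;
    out += DIGITS[t + ((q - t) % (base - t))];
    q = Math.floor((q - t) / (base - t));
  }
  return out + DIGITS[q];
}

// "münchen": 6 basic characters are already in the output, so there are
// 6 + 1 = 7 possible insertion positions; ü (252) belongs at position 1.
const delta = (252 - 128) * (6 + 1) + 1;
console.log(delta, encodeDelta(delta)); // prints: 869 3ya
```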

All examples are verified using the punycode.js library.

Real‑World Applications

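  • Domain registration: registries store the ASCII form of each label; for example, the top‑level domain .中国 is delegated in the DNS as xn--fiqs8s.
  • Email internationalization (EAI): in an address such as 用户@邮件.中国, only the domain labels can be converted to Punycode (中国 becomes xn--fiqs8s). The local part 用户 is never Punycode‑encoded; it is sent as UTF‑8 and requires SMTPUTF8 support.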
  • URL encoding: browsers automatically convert IDNs to Punycode before DNS lookup.
  • Data storage: systems that only accept ASCII can store Punycode for non‑ASCII identifiers.
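
For the email case in the list above, a conversion that touches only the domain might look like the sketch below. The helper name asciiDomainAddress is hypothetical, and the code assumes a runtime whose WHATWG URL parser performs IDNA processing (current browsers, and Node.js builds with ICU support).

```javascript
// Convert only the domain of an address to its ASCII form.
// The local part (before the last "@") is returned untouched.
function asciiDomainAddress(address) {
  const at = address.lastIndexOf("@");
  if (at < 0) throw new Error("not an email address: " + address);
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);
  // The URL parser converts each non-ASCII label of the host to Punycode.
  const asciiDomain = new URL("http://" + domain).hostname;
  return local + "@" + asciiDomain;
}

console.log(asciiDomainAddress("info@bücher.de")); // info@xn--bcher-kva.de
```

A non‑ASCII local part passes through unchanged, so the result is only deliverable if the mail servers involved support SMTPUTF8.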

Case Study: München.de vs. xn--mnchen-3ya.de

The city of Munich's official website uses "muenchen.de" (ASCII) for broader compatibility. But many locals prefer "münchen.de". When you type "münchen.de" in a browser, it is converted to "xn--mnchen-3ya.de" behind the scenes. The DNS sees only the Punycode version, while the browser may show the readable Unicode form in the address bar, depending on its IDN display policy. This tool allows you to preview exactly what string is sent to the DNS.
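
The same preview can be reproduced in code. The snippet below is a small sketch that relies on the standard WHATWG URL class available in browsers and Node.js (built with ICU support); the helper name dnsHostname is made up for this example. The URL parser follows UTS #46 processing, which can differ from strict IDNA2008 for a few characters.

```javascript
// Return the ASCII host name that a WHATWG URL parser derives from a URL.
function dnsHostname(urlString) {
  return new URL(urlString).hostname;
}

console.log(dnsHostname("https://münchen.de/"));  // xn--mnchen-3ya.de
console.log(dnsHostname("https://muenchen.de/")); // muenchen.de
```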

Common Misconceptions

  • Punycode is only for domain names: While its primary use is IDN, it can encode any Unicode text into ASCII (though other encodings like UTF‑8 are more common for general text).
  • xn-- is part of the encoding: It's merely a prefix defined by IDNA to signal that the label is Punycode‑encoded.
  • Punycode is encrypted: No, it’s just an encoding; it can be decoded back to the original text without any key.
  • All Unicode characters are allowed in domain names: Punycode itself can encode any Unicode code point, but IDNA permits only a subset of them in domain labels. The output also grows longer for higher code points.
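
To illustrate that decoding needs no key, here is a minimal decoder sketch that follows the procedure in RFC 3492 (section 6.2). The function names are illustrative and not taken from punycode.js, overflow checks are omitted, and the input is expected without the "xn--" prefix.

```javascript
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;

// Digit value of a Punycode character: a-z -> 0..25, 0-9 -> 26..35.
function charToDigit(ch) {
  const c = ch.toLowerCase().charCodeAt(0);
  if (c >= 97 && c <= 122) return c - 97;
  if (c >= 48 && c <= 57) return c - 22;
  throw new RangeError("invalid Punycode digit: " + ch);
}

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - TMIN) * TMAX) / 2)) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

function punycodeDecode(input) {
  // Everything before the last hyphen is copied as basic characters.
  const lastDash = input.lastIndexOf("-");
  const output = lastDash > 0
    ? Array.from(input.slice(0, lastDash), (ch) => ch.codePointAt(0))
    : [];
  let pos = lastDash > 0 ? lastDash + 1 : 0;

  let n = 128, i = 0, bias = 72;
  while (pos < input.length) {
    // Read one variable-length integer and add it to i.
    const oldI = i;
    let w = 1;
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new RangeError("truncated input");
      const digit = charToDigit(input[pos++]);
      i += digit * w;
      const t = Math.min(Math.max(k - bias, TMIN), TMAX);
      if (digit < t) break;
      w *= BASE - t;
    }
    const len = output.length + 1;
    bias = adapt(i - oldI, len, oldI === 0);
    // i now encodes both the code point increase and the position.
    n += Math.floor(i / len);
    i %= len;
    output.splice(i++, 0, n);
  }
  return String.fromCodePoint(...output);
}

console.log(punycodeDecode("mnchen-3ya")); // münchen
```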

Punycode vs. other encodings

Encoding | Character set | Use case
Punycode | ASCII (a‑z, 0‑9, -) | Domain labels (IDN)
UTF‑8 | Any Unicode, multi‑byte | General text, web pages
Percent‑encoding | ASCII + %xx | URL components (RFC 3986)
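
As a quick comparison, the sketch below encodes the same label three ways using only standard JavaScript APIs (TextEncoder, encodeURIComponent, and the WHATWG URL class). The host suffix ".example" is just a placeholder so the URL parser has a complete host name to work on.

```javascript
const label = "bücher";

// UTF-8: the ü becomes the two bytes 0xC3 0xBC.
const utf8Bytes = Array.from(new TextEncoder().encode(label));

// Percent-encoding: the same UTF-8 bytes written as %XX escapes.
const percentEncoded = encodeURIComponent(label);

// Punycode with the IDNA prefix, obtained through the URL parser.
const aLabel = new URL("http://" + label + ".example").hostname.split(".")[0];

console.log(utf8Bytes.length); // 7
console.log(percentEncoded);   // b%C3%BCcher
console.log(aLabel);           // xn--bcher-kva
```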

Algorithm Deep Dive: Bootstring Parameters

Punycode's Bootstring uses a set of carefully chosen parameters to make the encoding compact for small scripts (like Latin with diacritics) while still handling large CJK ideographs. The damp parameter scales down the first delta, which is usually far larger than the ones that follow, so that it does not distort the bias. The skew parameter enters the formula that derives the new bias from the scaled delta. RFC 3492 includes a complete C implementation, and the JavaScript library used by this tool follows the same algorithm.
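
To make the role of damp and skew concrete, the following sketch reproduces the bias adaptation function from RFC 3492 (section 6.1) and prints the digit thresholds that result from a given bias. The helper thresholds is hypothetical and exists only for this illustration.

```javascript
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;

// Bias adaptation from RFC 3492, section 6.1.
function adapt(delta, numPoints, firstTime) {
  // The first delta is divided by DAMP, later deltas are only halved.
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  // Compensate for the string being one code point longer next time.
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - TMIN) * TMAX) / 2)) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  // SKEW shapes how the remaining delta turns into the new bias.
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Digit thresholds for the first three digit positions under a given bias.
// A digit below its threshold ends the number, so high thresholds favour
// short encodings of small deltas.
function thresholds(bias) {
  return [BASE, 2 * BASE, 3 * BASE].map((k) =>
    Math.min(Math.max(k - bias, TMIN), TMAX)
  );
}

const biasLatin = adapt(869, 7, true);  // after the ü in "münchen"
const biasCjk = adapt(19885, 1, true);  // after the 中 in "中文"

console.log(thresholds(72));                   // initial bias: [ 1, 1, 26 ]
console.log(biasLatin, thresholds(biasLatin)); // 0 [ 26, 26, 26 ]
console.log(biasCjk, thresholds(biasCjk));     // 21 [ 15, 26, 26 ]
```

With a bias of 0, any following delta below 26 fits in a single digit, which is why further accented Latin letters in the same label are cheap to encode.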

Based on international standards – This tool implements the official Punycode algorithm as defined in RFC 3492 and uses the well‑tested punycode.js library by Mathias Bynens, which is also used in Node.js and many other projects. Reviewed by the GetZenQuery web standards team, last updated March 2025.

Frequently Asked Questions

What does the "xn--" prefix mean?
The prefix "xn--" indicates that the rest of the label is Punycode‑encoded. It is defined by the IDNA standard so that applications know to decode the label for display; DNS resolvers treat it as an ordinary ASCII label.

Is Punycode case‑sensitive?
The Punycode digits themselves are case‑insensitive, and DNS labels are normally lowercased. Our converter preserves case when encoding, and decoding returns the Unicode text with the case it had when it was encoded.

Can I convert email addresses?
The tool works on individual strings. For email, only the domain part (after @) is normally Punycode‑encoded; the local part may use other extensions (SMTPUTF8). You can convert each part separately.

Why is the encoded output longer than the input?
Each code point above U+007F is written as one or more ASCII digits, and the first such code point in a label usually needs several. For example, the two CJK characters in 中文 become the seven digits fiq228c. The stats show the exact counts.

Does the tool follow the IDNA standards?
For the encoding itself, yes: Punycode is the same under IDNA2003 and IDNA2008, and punycode.js implements it. However, validation of domain labels (such as the CONTEXTJ rules) is not performed; the tool strictly encodes and decodes.