UTF-8 Encoder/Decoder
Convert text to its UTF-8 byte representation and back.
How the UTF-8 Encoder/Decoder Works
This tool shows how text is represented in UTF-8, the dominant character encoding on the World Wide Web. It converts characters to their hexadecimal byte sequences and converts such byte sequences back to text.
- Encode to UTF-8: Enter any text, including emojis or international characters, into the input box and click "Encode". The tool will display the sequence of hexadecimal bytes that represent that text in UTF-8.
- Decode from UTF-8: Enter a space-separated sequence of hexadecimal bytes (e.g., `48 65 6c 6c 6f`, which decodes to `Hello`) and click "Decode". The tool will interpret the bytes as UTF-8 and display the corresponding text.
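The two operations above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation; the function names `encode_to_hex` and `decode_from_hex` are hypothetical, and the sketch assumes each byte is written as exactly two hex digits.

```python
def encode_to_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated lowercase hex."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


def decode_from_hex(hex_bytes: str) -> str:
    """Interpret space-separated two-digit hex bytes as UTF-8 text.

    Raises ValueError if the input is not valid hex, and
    UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    # bytes.fromhex skips whitespace between byte pairs.
    return bytes.fromhex(hex_bytes).decode("utf-8")


if __name__ == "__main__":
    print(encode_to_hex("Hello"))             # 48 65 6c 6c 6f
    print(decode_from_hex("48 65 6c 6c 6f"))  # Hello
```

Malformed input is reported through exceptions here; a real tool would likely catch these and show an error message instead.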
Frequently Asked Questions (FAQ)
Q: What is UTF-8?
A: UTF-8 is a variable-width character encoding that can represent every character in Unicode, using one to four bytes per code point. Its main advantage is backward compatibility with ASCII: the first 128 code points (standard English letters, digits, and common symbols, plus control characters) are each encoded as a single byte with the same value as in ASCII, so any valid ASCII text is also valid UTF-8.
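The ASCII compatibility can be checked directly. The following is a minimal sketch; the helper name `same_bytes_in_ascii_and_utf8` is hypothetical and exists only for this illustration.

```python
def same_bytes_in_ascii_and_utf8(text: str) -> bool:
    """Return True if `text` encodes to identical bytes in ASCII and UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a character outside ASCII, so it has
        # no ASCII encoding to compare against.
        return False


if __name__ == "__main__":
    print(same_bytes_in_ascii_and_utf8("Hello, world! 123"))  # True
    print(same_bytes_in_ascii_and_utf8("caf\u00e9"))          # False
```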
Q: Why do some characters use more bytes than others?
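A: This is the "variable-width" nature of UTF-8: the number of bytes depends on the character's Unicode code point. Code points U+0000 to U+007F (basic Latin letters, digits, and punctuation) use 1 byte. Code points U+0080 to U+07FF, which include accented Latin letters such as `é`, use 2 bytes. Code points U+0800 to U+FFFF, which include most Chinese, Japanese, and Korean characters, use 3 bytes. Code points above U+FFFF, which include most emojis, use 4 bytes. This makes UTF-8 very compact for text that is mostly English.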
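To make the byte counts concrete, here is a hand-written encoder for a single code point, following the standard UTF-8 bit layout. It is a teaching sketch only; the function name `utf8_encode_code_point` is hypothetical, and in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def utf8_encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Out of the Unicode range, or a surrogate (not encodable in UTF-8).
        raise ValueError(f"not an encodable code point: {cp!r}")
    if cp < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    # A, e-acute, a CJK character, and an emoji, written as escapes.
    for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
        encoded = utf8_encode_code_point(ord(ch))
        print(f"U+{ord(ch):04X}: {len(encoded)} byte(s): {encoded.hex(' ')}")
```

Each continuation byte carries 6 bits of the code point, which is why the thresholds fall at 7, 11, and 16 bits.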