1. Computing

Beginning Python: Python Encodings

From , former About.com Guide

7 of 7

Python Makes ASCII "Hello" Read as "Hello" in Unicode

In order to show the difference more clearly, let's get a bit more technical. ASCII key values bridge a range of 128 hexadecimal values from 0x00 to 0x7f ("0x00" is the hexadecimal form of "0"; "0x7f" is the hexadecimal equivalent of "127"; the integers from 0 to 127 are 128 in number). As I stated earlier, non-English keybindings cannot use this range nor this set. For many years, different encodings were used as a foil. This is whence the ISO 8859 series developed. However, many of these encodings could not "talk" to each other. This caused programming to become incredibly complex and subsequently impinged upon the ability of users to interact internationally.

Then the Unicode Consortium developed a system whereby every possible alphabet (save for Old Chinese) had its own unique identifier. So, the ASCII set became the beginning of a series that now includes over 100,000 possible characters. The ASCII set is reflected in addresses U+0000 through U+007F of that series. So, when "Hello" is converted from ASCII to Unicode, the computer stops "seeing" ASCII (that range from 0x00 to 0x7f) and starts reading Unicode. This offers greater interoperability with other languages and protects the program itself from premature obsolescence. Instead of multiple encodings, Unicode effectively offers different parts of one giant encoding.

Other tutorials in this series: 1 | 2 | 3 | 4 | 5 | 6

  1. About.com
  2. Computing
  3. Python
  4. Beginning Python
  5. "Hello" in ASCII is "Hello" in Unicode

©2013 About.com. All rights reserved.