How Python Processes Non-English RSS Feeds

From a former About.com Guide

3 of 5

Troubleshooting the Encoding Problem in Python

When you try to access one of the feeds, the browser will seem to stall for a moment and then return no data. If you have been following this series and have worked through the last tutorial, your program should take the error in stride.

We already know what is wrong, but it is important to know how to tell. In this instance, the application simply fails to output meaningful data. The feed itself was retrieved and processed without a problem; the program just did not output the data correctly.

It is up to the programmer to troubleshoot the application behind the scenes. To do this, you may want to hard-code one of the feeds into the program and then run it from the shell. You may find that the data comes out perfectly fine there.
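A minimal sketch of that kind of shell test follows. It assumes Python 3 and uses the standard library's xml.etree.ElementTree rather than whatever parser the series' program uses. SAMPLE_FEED and item_titles are hypothetical names, and the inline XML is a made-up stand-in for a real feed so the check runs without a network connection.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a feed that would normally be fetched by URL.
# It is stored as UTF-8 bytes, the way a feed arrives over the wire.
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Noticias</title>
    <item><title>Economía española</title></item>
    <item><title>Fútbol</title></item>
  </channel>
</rss>""".encode("utf-8")


def item_titles(feed_bytes):
    """Return the title text of every <item> element in an RSS document."""
    root = ET.fromstring(feed_bytes)
    return [item.findtext("title") for item in root.iter("item")]


if __name__ == "__main__":
    for title in item_titles(SAMPLE_FEED):
        # ascii() escapes non-ASCII characters, so they remain visible
        # even if the terminal cannot render them.
        print(title, ascii(title))
```

If the titles look right here, retrieval and parsing are probably not at fault, which points the search toward the output step.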

What is happening here is that, while Python is receiving the non-English text, it is not printing it as non-English text. Given the HTML headers created by the PHP script, the browser is expecting a character set called 'UTF-8', an encoding of Unicode. Before we talk about how Python speaks UTF-8 to the browser, let's first talk about Unicode.
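The mismatch can be illustrated with a short sketch, again assuming Python 3. The helper readable_as_utf8 is a hypothetical name, and Latin-1 is only one example of an encoding the output might wrongly end up in; the source does not say which one the program actually emits.

```python
def readable_as_utf8(data):
    """Return True if the bytes form valid UTF-8, as the browser expects."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    text = "Economía"

    # The same text produces different bytes under different encodings.
    utf8_bytes = text.encode("utf-8")
    latin1_bytes = text.encode("latin-1")

    print(utf8_bytes, readable_as_utf8(utf8_bytes))
    print(latin1_bytes, readable_as_utf8(latin1_bytes))
```

A reader expecting UTF-8 accepts the first byte string and rejects the second, even though both came from the same text.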

©2009 About.com, a part of The New York Times Company.

All rights reserved.