Python

How to Analyze a File Line-By-Line With Python

From Al Lukaszewski, for About.com

One of the primary reasons people use Python is for analyzing and manipulating text. If your program needs to work through a file, it is usually best to read the file one line at a time, which keeps memory use low and lets processing begin before the whole file has been read. One way to do this is with a while loop, as follows:

import sys

# Open the file named by the first command-line argument for reading.
fileIN = open(sys.argv[1], "r")
line = fileIN.readline()

while line:
    # [some bit of analysis here]
    line = fileIN.readline()

fileIN.close()

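This code takes the first command-line argument as the name of the file to be processed. The call to open() opens that file for reading and returns a file object, 'fileIN'. The first call to readline() then reads the first line of the file and assigns it to a string variable, 'line'. The while loop runs for as long as 'line' is a non-empty string. At the end of each pass, readline() fetches the next line and the test is repeated. A blank line in the file still contains its newline character, so it does not stop the loop; only at the end of the file does readline() return an empty string, which Python treats as false. At that point the loop ends, and the program then exits.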
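The stopping condition can be checked with a small sketch. It assumes Python 3 and uses io.StringIO as a stand-in for a real file so that it runs without any file on disk; the sample text and the names 'sample' and 'seen' are made up for illustration.

```python
import io

# io.StringIO stands in for a real file so the sketch is self-contained.
# The sample text has a blank line in the middle.
sample = io.StringIO("first\n\nthird\n")

seen = []
line = sample.readline()
while line:  # "" (end of file) is false; "\n" (a blank line) is true
    seen.append(line)
    line = sample.readline()

print(seen)                      # ['first\n', '\n', 'third\n']
print(repr(sample.readline()))   # '' once the end has been reached
```

The blank line is read as "\n" and passes the test, so all three lines are collected before readline() returns the empty string.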

Reading the file in this way, the program works with only one line of the file at a time and does not bite off more data than it is set to process. It can also produce its output incrementally instead of waiting for the whole file to load. The memory footprint of the program therefore stays low, even for very large files. This can be important if one is writing a CGI script that may see a few hundred instances of itself running at a time. To read more about while loops, see the tutorial Beginning Python.
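As a sketch of what the "[some bit of analysis here]" placeholder might contain, the function below counts lines and words. It is not the method shown above but a common alternative idiom, assuming Python 3: a file object can be iterated over directly with a for loop, which also reads one line at a time, and a with statement closes the file automatically. The function name 'count_lines_and_words' and the choice of analysis are hypothetical.

```python
def count_lines_and_words(path):
    """Return (line_count, word_count) for the text file at `path`."""
    line_count = 0
    word_count = 0
    # The with statement closes the file when the block ends,
    # even if the analysis raises an exception.
    with open(path, "r") as file_in:
        # Iterating over the file object yields one line per pass,
        # so the whole file is never loaded at once.
        for line in file_in:
            line_count += 1
            word_count += len(line.split())
    return line_count, word_count
```

To mirror the command-line usage of the original loop, one could import sys and call count_lines_and_words(sys.argv[1]), then print the two counts.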

©2009 About.com, a part of The New York Times Company.

All rights reserved.