Sometimes when a program analyzes data, it needs to look at and compare several places in a file, or in several files, at the same time. If you read each file one line at a time, your program can run very slowly: each time a different location in a file is needed, the whole file may have to be re-read. If you are processing multiple files, the slowdown grows with the number of files involved. For this reason, if you are processing the files locally (i.e., not on a network server), it is sometimes better to read the file into memory (RAM) completely and analyze it from there.
A convenient way to do this is with a dictionary. One can then access the values of the dictionary as necessary. The basic code for this is very similar to reading a file line by line; the difference is that, instead of analyzing each line as it is read, we store it as a value in the dictionary.
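import sys

# First block: open the file, read the first line, set up the dictionary and counter
fileIN = open(sys.argv[1], "r")
line = fileIN.readline()
record = {}
keycounter = 1

# Second block: store each line under its counter key, then read the next line
while line:
    key = str(keycounter)
    record[key] = line
    keycounter = keycounter + 1
    line = fileIN.readline()

fileIN.close()  # the whole file is now in memory, so the handle can be released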
In the first block of this code, the program opens the file object fileIN based upon the first command line argument of the program. It then reads one line from that file object and assigns it to 'line'. Next, a dictionary named 'record' is initialized with no values. Finally, an integer value of 1 is assigned to 'keycounter'.
The second block is a while loop that runs as long as 'line' is not empty (readline() returns an empty string at the end of the file). Inside the loop, the value of the counter is first converted to a string and assigned to 'key'. This key is used to store the value of 'line' in the dictionary 'record'. After the line is stored, the counter is incremented and a new line is read.
When the loop has stopped executing, the dictionary 'record' will represent the entire file, with each key:value pair holding one line (including its trailing newline, if any). One can then iterate over the dictionary or selectively call out values at will.
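As a minimal sketch of what iterating over the dictionary or calling out values might look like, the example below uses a small hand-written dictionary in place of one read from a real file. The sample contents and the variable names 'second_line', 'ordered_lines', and 'first_equals_last' are assumptions made for illustration only.

```python
# Hypothetical sample standing in for a 'record' dictionary read from a file;
# keys are line numbers as strings, values are lines with their newline.
record = {
    "1": "alpha\n",
    "2": "beta\n",
    "3": "gamma\n",
}

# Selective access: fetch a single line by its key.
second_line = record["2"].rstrip("\n")

# Iteration: the keys are strings, so sort them numerically to keep file order
# (a plain string sort would place "10" before "2").
ordered_lines = [record[k].rstrip("\n") for k in sorted(record, key=int)]

# Compare two distant places in the file without re-reading it.
first_equals_last = record["1"] == record[str(len(record))]

print(second_line)
print(ordered_lines)
print(first_equals_last)
```

Because the keys are strings, any code that depends on line order has to convert them back to integers, as the sort above does.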
Naturally, one could repeat this process for another file and file object. The working variables (the file object, 'line', 'key', and 'keycounter') can either be given slightly different names or simply be reinitialized, but each file needs its own dictionary under its own name. The two dictionaries can then be compared as needed.
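One possible way to repeat the process for two files is to wrap the reading step in a function, so that each call returns its own dictionary. The sketch below is an illustration under stated assumptions: the function names 'read_into_dict' and 'differing_keys' are hypothetical, the two temporary files contain made-up content, and the while/readline loop has been swapped for a 'with' block and enumerate(), which produce the same key:value pairs.

```python
import os
import tempfile


def read_into_dict(path):
    """Read a file into a dictionary keyed by line number (as a string)."""
    record = {}
    with open(path, "r") as file_in:
        # enumerate(..., start=1) plays the role of the manual keycounter.
        for keycounter, line in enumerate(file_in, start=1):
            record[str(keycounter)] = line
    return record


def differing_keys(record_a, record_b):
    """Return keys, in numeric order, whose lines differ between the two
    dictionaries or that exist in only one of them."""
    all_keys = set(record_a) | set(record_b)
    return [k for k in sorted(all_keys, key=int)
            if record_a.get(k) != record_b.get(k)]


# Demonstration with two small temporary files (made-up content).
tmpdir = tempfile.mkdtemp()
path_a = os.path.join(tmpdir, "a.txt")
path_b = os.path.join(tmpdir, "b.txt")
with open(path_a, "w") as f:
    f.write("one\ntwo\nthree\n")
with open(path_b, "w") as f:
    f.write("one\nTWO\nthree\nfour\n")

record_a = read_into_dict(path_a)
record_b = read_into_dict(path_b)
diff = differing_keys(record_a, record_b)
print(diff)
```

Here the comparison is a simple line-by-line equality check; any other comparison could be substituted, since both files are fully available in memory.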
Reading the file this way is memory-intensive and should not be done on a network server. However, it is a practical way to compare texts over a wider context than a single line.

