So far all the data in our programs has either been hardcoded into the program itself or typed in at the keyboard by the user. This is pretty limiting, and it is fairly clear that we will want programs that can read data from files.
In this lesson we'll talk about what we can do with text files. Text files are files that use one of a number of standard encoding schemes where the file can be interpreted as printable characters. Later we might learn about binary files, where we can't view the file as characters, but for most of our purposes we can assume we have text.
First we will need a string to specify the name of our file.
We could have a variable for this, and could call input() to ask the user for the name and save the result in that variable.
Note: avoid using 'file' as a variable name; it was the name of a built-in type in older versions of Python.
Next, we use the command open with the name of the file.
f = open('story.txt', 'r')
f
This opens the file named story.txt from the current directory. It is open for reading (that's the 'r' mode) and the type of object is io.TextIOWrapper. Don't stress about the type at all. Just think of it as an open file. The important conceptual idea here is that this object not only knows the contents of the file, but also knows our current position in the file. So once we start reading, it knows how much we've read and is able to keep giving us the next piece.
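Here is a small sketch of the "current position" idea. So that it runs anywhere, it first creates a throwaway file; the file name demo.txt and its contents are made up for this demonstration, and the writing step is only setup, not something you need to understand yet.

```python
import os
import tempfile

# Setup only: create a small throwaway file (name and contents are made up).
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
out = open(path, 'w')
out.write('first line\nsecond line\n')
out.close()

f = open(path, 'r')
first = f.read(5)    # the first 5 characters of the file
second = f.read(5)   # the NEXT 5 characters, not the first 5 again
f.close()

print(repr(first))   # 'first'
print(repr(second))  # ' line'
```

The second call picks up where the first one stopped, because the open file object remembers how far we have read.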
myfile = open('story.txt', 'r')
s = myfile.readline() # Read a line into s.
print(s)
s # Notice the \n that you only see when you look
# at the contents of the variable.
The \n (backslash n) character is a single character representing a new line.
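To convince yourself that \n really is one character, you can try something like the following. The string here is made up; it is not taken from story.txt.

```python
s = 'Once upon a time\n'   # a made-up line that ends with a newline character

print(len('\n'))       # 1: backslash-n counts as one character, not two
print(s[-1] == '\n')   # True: the newline is the last character of s
print(repr(s))         # repr shows the \n that a plain print hides
```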
s = myfile.readline() # The next call continues where we left off.
print(s)
s = myfile.readline() # And so on...
print(s)
myfile.close()
I can use readline to read an entire file, line by line, under my control.
filename = 'story.txt'
myfile = open(filename)
s = myfile.read(10) # Read 10 characters into s.
print(s)
s = myfile.read(10) # Read the next 10 characters into s.
print(s)
myfile.close()
I can also use read with a character count to read an entire file, bit by bit, under my control.
If I know I want to read line by line through to the end, a for loop makes this easy. This is probably the most common way to read a file. Use this unless you have a reason not to.
f = open('story.txt')
for line in f:
    print(line) # Or do whatever you wish to line
f.close() # Good habit: close a file when you are done with it.
Question: Why is the output from the for loop double-spaced?
Answer: print gives you a \n and there was already one on the end of each line.
Question: How can you single space the output?
Answer: Strip the newline character from the end of each line before you print.
f = open('story.txt')
for line in f:
    line = line.strip('\n') # Remove the newline that was read from the file
    print(line)
f.close()
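Another way to get single-spaced output is to leave the line alone and tell print not to add its own newline. This is a sketch: the function name print_single_spaced is made up, and io.StringIO stands in for an open text file so the example runs without story.txt.

```python
import io

def print_single_spaced(f):
    """Print each line of the open file f without adding an extra newline."""
    for line in f:
        print(line, end='')   # end='' stops print from adding its own \n

# io.StringIO behaves like an open text file; the two lines are made up.
f = io.StringIO('line one\nline two\n')
print_single_spaced(f)
f.close()
```

Either approach works. Stripping is handier when you want to do more with the line than just print it.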
(4) Read everything in the file into one string
filename = "story.txt"
myfile = open(filename)
s = myfile.read() # Read the whole file into string s.
print(s)
myfile.close()
s
(5) Use readlines() to read the file into a list of lines.
myfile = open('story.txt')
contents = myfile.readlines()
type(contents)
contents
Beginners often use one of these last two approaches because they seem easy.
Don't use these techniques unless you really need access to the whole file at once.
Usually, we can read a piece, deal with it, and toss it out.
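As an illustration of "read a piece, deal with it, and toss it out", here is a sketch that finds the length of the longest line while only ever holding one line at a time. The function name longest_line_length is made up, and io.StringIO stands in for an open file with made-up contents.

```python
import io

def longest_line_length(f):
    """Return the length of the longest line in the open file f (newline not counted)."""
    longest = 0
    for line in f:
        line = line.strip('\n')      # don't count the newline itself
        if len(line) > longest:
            longest = len(line)      # remember the number, forget the line
    return longest

f = io.StringIO('short\na much longer line\nmid\n')
print(longest_line_length(f))   # 18
f.close()
```

Only one line and one number are kept in memory, no matter how big the file is.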
With the for loop approach, the loop automatically stops when the end of the file is encountered, and it never iterates even once if the file is empty!
But what happens if you are at the end of the file when you call read or readline?
You get the empty string. You then know you can stop trying to read more.
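A quick way to see this for yourself is sketched below, again with io.StringIO standing in for an open file and with made-up contents. Notice that a blank line in the middle of a file comes back as '\n', not as the empty string, so it is not mistaken for the end of the file.

```python
import io

# Three lines; the middle one is blank. The contents are made up.
f = io.StringIO('a\n\nb\n')

line1 = f.readline()     # 'a\n'
line2 = f.readline()     # '\n'  (a blank line still has its newline)
line3 = f.readline()     # 'b\n'
at_end = f.readline()    # ''    (nothing left to read)
still_end = f.read()     # ''    (read behaves the same way at the end)
f.close()

print(repr(line2))
print(repr(at_end))
```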
# Detecting the end of the file while reading line by line
myfile = open('story.txt')
next_line = myfile.readline()
while next_line != "":
    print(next_line)
    next_line = myfile.readline()
myfile.close()
This example introduces a new kind of loop: a while loop.
while (condition):
    while-body
What it does:
Check to see if the condition is true.
If it is, execute the entire body of the loop and go back to the top.
Check again ...
Important Note: If the condition becomes false during the body of the loop, the loop does not stop at that moment. The only time it decides whether to continue or stop is at the top of the loop on each iteration.
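The note above can be seen in a tiny sketch. The variable names n and log are made up for the illustration.

```python
n = 0
log = []
while n < 1:
    n = n + 1                        # from here on, n < 1 is false...
    log.append('still in the body')  # ...but the rest of the body still runs

print(log)   # ['still in the body']
print(n)     # 1
```

The condition became false on the first line of the body, yet the append still happened. The loop only stopped when it went back to the top and checked again.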
Write a function yes_or_no that asks a user to enter either 'yes' or 'no' and keeps looping, asking again and again, until the user enters one of these two options.
If you finish that exercise, change your function so that it accepts any case variation such as 'Yes', 'YES' or even 'nO' and then returns the lowercase version of what the user provided. But if the user says 'nope' or 'maybe', it doesn't return and asks again for 'yes' or 'no'.
def yes_or_no():
    answer = input("Please enter yes or no ")
    # Keep asking until the answer is 'yes' or 'no' in any mix of upper and lower case.
    while answer.lower() != 'yes' and answer.lower() != 'no':
        answer = input("Please enter yes or no ")
    return answer.lower()

yes_or_no()
The file january06.txt contains data from the UTM weather station for January 2006. Download it from the C4M website to your local machine and put it in the same directory as where Pyzo is storing your programs. Figuring out where to store the files or how to specify the paths to your file is half the battle!
Open it up in Pyzo to see what it looks like.
Write a Python program to open the file and read only the first line.
Read the second line (this is still a header).
Read the third line into a variable line.
What is the type of line?
Call the method split() on line and save the return value. What is the type that is returned by this method?
Look up the method split() in the Python 3 documentation.
f = open('../january06.txt') # adjust this path to wherever you saved the file
f.readline() # notice that I didn't bother to save the returned string from readline()
f.readline()
line = f.readline() # this time I saved the string because I want to use it
print(type(line))
print(line)
result = line.split()
print(result)
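If you want to experiment with split() without the data file, here is a sketch on a made-up line. This string is NOT the actual contents of january06.txt; it only shows how split() behaves when the pieces are separated by varying amounts of whitespace.

```python
line = '1  12.5   north\n'   # made-up line, not from january06.txt

parts = line.split()    # no argument: split on any run of whitespace
print(type(parts))      # <class 'list'>
print(parts)            # ['1', '12.5', 'north']

# Each element is still a string; convert it if you need a number.
value = float(parts[1])
print(value + 1)        # 13.5
```

Note that the trailing \n disappears too, because split() with no argument treats it as whitespace.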
Some more questions and steps to do:
Which element is the temperature?
Write a program that:
Run your program and make sure it works. Once it works, show a TA or instructor.