Friday, January 30, 2015

Using files to file stuff...

Everything we’ve dealt with up to this point has had no real permanence. That is to say, all of the data we’ve processed exists only during the life of the program; once the program is complete, the data is destroyed. This is ok for programs like calculators, but most programs need their data to be retained between uses. To do that, the data needs to be saved to disk in the form of a file. Luckily, reading and writing files is pretty easy in Python. Let’s begin by opening the interpreter and making a new file like this:
f = open(<path to file/filename>, "w")
Notice that this immediately creates a file in my file system. I created a text file, but you can add any extension you want to. However, adding a jpg or mov extension won’t automatically make your file an image or a video. Certain file types require special readers and writers to work properly. This method only allows you to save data in readable form, such as text or numbers. I will discuss more advanced methods of file reading and writing in a later post. For now, we’ll stick with the text file.

The open( ) function takes the file path as its first argument, followed by an optional mode argument, which in our case was “w”. The “w” stands for write, because we wanted to write a file at the path that we specified. Without the “w”, open( ) would default to read mode and raise an error because the file didn’t already exist on disk. So when creating a new file, use the “w” argument. Be careful, though: opening an existing file with “w” erases its current contents.

Now we have a blank file open in a file object. Let’s add some text into it using the write( ) method that is built in to our file object.
Remember that “\n” represents an “end of line” character. Now that we’ve added some data to our file object we can close the file using the close( ) method. Closing the file makes sure that everything we wrote is flushed to disk.
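Here is a minimal sketch of the write-and-close steps described above. The file name notes.txt, the temp-folder location, and the two lines of text are placeholders I made up for illustration, so swap in your own path.

```python
import os
import tempfile

# Hypothetical location: a file called "notes.txt" in the system temp folder.
path = os.path.join(tempfile.gettempdir(), "notes.txt")

f = open(path, "w")        # "w" creates the file (or empties an existing one)
f.write("First line\n")    # "\n" ends the line
f.write("Second line\n")
f.close()                  # flush the text to disk and release the file
```

After close( ) runs, the file on disk holds both lines, each ending in a newline.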
Now we can re-open our file by using the open( ) function again, but this time we will pass it “r” as a second argument. “r” stands for read and will prevent the file from being modified inadvertently by opening it as a read-only file stream.
Now we can iterate through our file object using a for loop to read the data from it:
If we repeat the for loop we will get nothing.
This is because the file has already been read through and our file “cursor” is sitting at the end of the file. We can confirm this by using the tell( ) method, which tells us what position our cursor is at in the file.
Now we can see that we are at the 103rd position in the file. In order to move the cursor back to the beginning of the file we can use the seek( ) method.
The seek( ) method will set the file cursor to the position specified. Using an argument of 0 will set the cursor back to the beginning of the file.
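The cursor behaviour described above can be sketched in a few lines. This is only an illustration: the file name and its two lines of text are made up, so the position that tell( ) reports will differ from file to file.

```python
import os
import tempfile

# Hypothetical demo file in the system temp folder.
path = os.path.join(tempfile.gettempdir(), "cursor_demo.txt")
f = open(path, "w")
f.write("First line\nSecond line\n")
f.close()

f = open(path, "r")

first_pass = []
for line in f:               # reads every line and leaves the cursor at the end
    first_pass.append(line)

second_pass = []
for line in f:               # nothing left to read, so the loop body never runs
    second_pass.append(line)

end_position = f.tell()      # cursor position at the end of the file
f.seek(0)                    # move the cursor back to the beginning
start_position = f.tell()

third_pass = []
for line in f:               # after seek(0) the lines can be read again
    third_pass.append(line)
f.close()

print(first_pass)
print(second_pass)
print(end_position, start_position)
```

The second loop collects nothing, while the loop that runs after seek(0) gets every line back.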

Now we can invoke a different method to read our file. The read( ) method can be used to assign the entire contents of the file to a variable.
There are lots of methods that can be used on file objects. The last one I’ll talk about is the readline( ) method. This method, as you might guess, reads from the cursor’s current position up to and including the next newline character “\n”, and returns that single line as a string.
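To compare read( ) and readline( ) side by side, here is a small sketch. The file name and contents are again placeholders of my own.

```python
import os
import tempfile

# Hypothetical demo file in the system temp folder.
path = os.path.join(tempfile.gettempdir(), "read_demo.txt")
f = open(path, "w")
f.write("First line\nSecond line\n")
f.close()

f = open(path, "r")
contents = f.read()          # the whole file as a single string
f.seek(0)                    # back to the start before reading line by line
first = f.readline()         # "First line\n"
second = f.readline()        # "Second line\n"
leftover = f.readline()      # "" because the cursor is at the end of the file
f.close()

print(repr(contents))
print(repr(first), repr(second), repr(leftover))
```

Notice that readline( ) keeps the “\n” on the end of each line, and returns an empty string once there is nothing left to read.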
Putting it all together
We now have the tools to save information from a program and read it back in again later. There are many ways we can go about this. For instance, let’s pretend we are writing a video game in which our player’s character has different attributes like health, speed, strength, and items. Each time we load our game we can read the attributes from a file, and each time we close the program we can write the data out to a file.
Let’s assume that the file looks something like this:
Name:Bob
Health:50
Strength:10
Speed:5
Handsomeness:2
We could load these lines in using a file object and read through each line to get the attribute and its value. The Attribute:value layout should give you an idea of how this data might be stored once we load it into the program: we could load it into a dictionary. We can do this in the following manner:
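file = open("D:/character.txt", "r")
character = {}
for line in file:
    # Split "Attribute:value" at the first colon only
    attributes = line.split(":", 1)
    key = attributes[0]
    # Remove the trailing newline from the value
    value = attributes[1].replace("\n", "")
    if key == "Name":
        character[key] = value
    else:
        # Every attribute other than Name is numeric
        character[key] = int(value)
file.close()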

Notice that each value has the newline character “\n” removed before loading. Also, if the item to be loaded is numeric it is converted from a string to an integer using int(value) when being assigned to the dictionary value.
Once the data is loaded into a dictionary it can be manipulated using standard methods.
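As a quick illustration of what “standard methods” might look like here, this sketch starts from the dictionary the loading code would build from the sample file. The damage amount, the speed boost, and the Luck attribute are all invented for the example.

```python
# The dictionary as it would look after loading the sample file above.
character = {"Name": "Bob", "Health": 50, "Strength": 10,
             "Speed": 5, "Handsomeness": 2}

character["Health"] -= 10                    # Bob takes 10 points of damage
character["Speed"] = character["Speed"] * 2  # a temporary speed boost
character["Luck"] = 7                        # hypothetical new attribute

print(character["Health"], character["Speed"], character["Luck"])
```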
Then, when the program is closed or the user requests a save, it can be written out using a similar method:
file = open("D:/character.txt", "w")
for key in character.keys():
    # Rebuild each "Attribute:value" line, ending it with a newline
    line = key + ":" + str(character[key]) + "\n"
    file.write(line)

file.close()

 
Now the data your programs generate can live on long after your program has been closed. There are many pre-existing methods to read and write files of different types, which I will cover in future posts, but for now, go wild with the possibilities inherent in long-term data storage and retrieval.

Quick recipes for the master caster

Now that you know a few things about objects, I thought I’d throw out a few suggestions on how to use that knowledge in a practical way. So this is just a quick post to show you how to do some clever things with the knowledge you already have. I also want to introduce the idea of “type casting”. Type casting in Python means exactly the opposite of what it means in acting. When you “cast” an object, you change its type from one thing to another. That means you can change a float object to an int object, an int object to a string object, a string to a list, and so on. Not all object types are interchangeable, although you can usually find a circuitous path to get an object of one type to another if you try hard enough.
Python has built-in functions for converting between object types, making casting really easy to do. The functions work as follows:
<proposed object type>(<object>)
Check out the image below for some examples of casting objects from one type to another.


As you can see, I began with an integer type object. I then converted it to a new float object. I then created a new string object from the float, followed by a list from the string. Float objects and int objects can’t be converted directly to a list, but strings can because a string is a collection/sequence style object, just like a list. So in order to convert my int to a list I first have to convert it to a string and then convert the string to a list. Note that in this example I assigned the converted objects to new variables. I did this because type casting is non-destructive, meaning that even when I assign b equal to float(a), the new object b is a float but object a is still an int.
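In text form, the sequence described above might look something like this. The starting value 5 and the variable names beyond a and b are my own picks for illustration.

```python
a = 5              # start with an int
b = float(a)       # 5.0
c = str(b)         # "5.0"
d = list(c)        # ["5", ".", "0"]

# An int can't go straight to a list, so take the detour through a string.
try:
    list(a)
    direct_cast_worked = True
except TypeError:
    direct_cast_worked = False
e = list(str(a))   # ["5"]

# Casting is non-destructive: a is still an int after all of this.
print(type(a), type(b), type(c), type(d))
```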

So here is a list of the conversion functions for the objects you know so far and the types of objects that each conversion will accept.

Casting function: accepted object types

int() – Integer cast
Accepts: float objects (the decimal part is dropped) and string objects that contain only digits, optionally with a leading + or - sign (no decimal points, letters, or other special characters).
It is possible to convert letter characters to int objects using the int() cast function, but it involves using non-base ten numbering systems which I won’t go into here.

float() – Float cast
Accepts: int objects and string objects written as an ordinary number with at most one decimal point.
The string can have a decimal point but doesn’t have to, so ‘500’ and ‘500.0’ are ok. ‘50.0.0’ will not work.

str() – String cast
Accepts: any object.
Any object can be converted to a string, though the results might not always be easily intelligible.

list() – List cast
Accepts: any ‘iterable’ object.
list() accepts any ‘collection’ style object, including strings and tuples.

tuple() – Tuple cast
Accepts: any ‘iterable’ object.
tuple() accepts any ‘collection’ style object, including strings and lists.
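Here is a short sketch that runs through the list above, with one successful cast per function and the two string cases that fail. All of the values are just examples I chose.

```python
from_float = int(9.9)           # 9, the decimal part is dropped
from_digits = int("500")        # 500
whole_string = float("500")     # 500.0
decimal_string = float("500.0") # 500.0
as_text = str([1, 2, 3])        # "[1, 2, 3]"
as_list = list("abc")           # ["a", "b", "c"]
as_tuple = tuple([1, 2, 3])     # (1, 2, 3)

# These strings don't fit the rules, so the cast raises a ValueError.
failed = []
for cast, text in [(int, "500.0"), (float, "50.0.0")]:
    try:
        cast(text)
    except ValueError:
        failed.append(text)

print(from_float, from_digits, whole_string, decimal_string)
print(as_text, as_list, as_tuple)
print("Could not convert:", failed)
```

If you need an int from a string like ‘500.0’, one circuitous path is to go through float first: int(float(‘500.0’)).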

So what can we do with this information? I’ll demonstrate a few bits of code that use some of the list functions I discussed in my last post and some type casting to do common programming tasks.

Finding an average value
If you have a list of numbers, but the length of the list might be different from time to time, you can find the average of the numbers in the list with a single line of code. For example, if I have a list myList = [5,7,2,100,3,1,1,23] I can find the average with the following line:
sum(myList) / float(len(myList))
Notice what happens when I don’t use the float() cast function. Remember, in Python 2, when a division is performed on two integers, an integer is returned. (In Python 3 the / operator always returns a float, and // is the operator that performs integer division.) Averages are rarely integer values, so in this case we cast one of the values in the equation to a float type object in order to force the resulting value to have floating point precision. We could have also done the following:
float(sum(myList)) / len(myList)
Regardless of the side that we cast to float, the outcome will be the same. If we don’t cast either side to a float object then, in Python 2, we are performing an int divided by int calculation, which will produce an int result.
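Here is the average calculation as a runnable sketch, using the list from above (renamed my_list). I used the // operator to show the integer-only result because it behaves the same way in Python 2 and Python 3.

```python
my_list = [5, 7, 2, 100, 3, 1, 1, 23]

average = sum(my_list) / float(len(my_list))      # cast the right side
same_average = float(sum(my_list)) / len(my_list) # or cast the left side
int_only = sum(my_list) // len(my_list)           # integer division drops the .75

print(average)       # 17.75
print(same_average)  # 17.75
print(int_only)      # 17
```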

Rounding a float to the nearest integer
When you cast a float object to an int, the decimal part of the float is just dropped. So 9.9 and 9.000001 both become 9. If you want to round a float to the nearest integer (meaning a number with a decimal part of .5 or greater will be rounded up to the next integer) you can do the following:
int(<float object> + 0.5)
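This bit of code will cause 9.9 to become 10.4 prior to the integer cast, and 9.000001 to become 9.500001. So 10.4 will be truncated to 10 and 9.500001 will be truncated to 9. Note that this trick only works for positive numbers, because int() truncates toward zero.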
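Wrapped in a small function, the rounding trick might look like this. The name round_half_up is my own, and the sketch assumes the input is zero or positive.

```python
def round_half_up(value):
    """Round a non-negative float to the nearest integer, with .5 going up."""
    return int(value + 0.5)

print(round_half_up(9.9))       # 10
print(round_half_up(9.000001))  # 9
print(round_half_up(9.5))       # 10

# For negative numbers the trick goes wrong: -9.9 + 0.5 is -9.4,
# and int() truncates that toward zero, giving -9 instead of -10.
negative_result = int(-9.9 + 0.5)
print(negative_result)          # -9
```

Python also has a built-in round() function, but its handling of values that end in exactly .5 differs between Python 2 and Python 3, so check how it behaves in your version before relying on it.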