Classes and objects

Object-oriented programming

Everything in Python is an object. But what is an object, really? In the most abstract sense, an object has some kind of data and provides a means to manipulate that data—to change the data, store new data, or retrieve the data in a different format. An object is a data structure with an interface. A good example is the list:

>>> foo = list() # create the object
>>> foo.append(1) # manipulate its data
>>> foo.append(2)
>>> foo.append(3)
>>> foo.pop() # get data: pop returns and removes last list element
3

The list foo has the ability to store data (a bunch of integers, in this case), and provides methods (append, pop, among others) to access that data.

There are a number of benefits to this arrangement:

  • The data itself is hidden from view: we can only access it through methods. This means that even if Python decides to change the underlying code that implements append, our programs will still work. (This is called encapsulation.)
  • The close association of data and the code that operates on that data provides a helpful abstraction: we can treat the combination of the two as a single unit. In non-object-oriented code, by contrast, the data and the functions that operate on it are separate, and it’s up to the programmer to keep track of which functions go with which data.
  • An encapsulated abstraction can be more easily reused. Once we have a class that does the job we want it to, it’s easy to use it again in other programs that we write.
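To make the contrast in the second point concrete, here is a small sketch of the same job done both ways. The add_item function is invented for illustration; only list and append come from Python itself.

```python
# Non-object-oriented style: the data (a plain list) and the function
# that operates on it are defined separately.
def add_item(items, item):
    # hypothetical helper: returns a new list with item on the end
    return items + [item]

numbers = []
numbers = add_item(numbers, 1)
numbers = add_item(numbers, 2)

# Object-oriented style: the list object carries its own method.
other_numbers = list()
other_numbers.append(1)
other_numbers.append(2)
```

Both lists end up holding the same data, but in the second version there is no question about which function belongs with which data.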

Defining our own classes

The idea of abstraction—of creating something in code that behaves like whatever it is we’re trying to model in the real world—is the main reason we might want to make our own kinds of objects. Python provides lists, dictionaries, sets, strings—but what if my program is about sentences, or poems, or people, or space rockets? It would be nice to be able to make our own objects, with data appropriate for those objects and methods that operate on them in an intuitive way. Something like:

>>> rocket = Rocket()
>>> rocket.power = 4
>>> rocket.awesomeness = 5000
>>> rocket.launch()
KABOOM!

Fortunately, Python provides a mechanism for us to define our own types of objects: the class.

If an object is a cookie, then the class is the cookie cutter. If an object is a car, then the class is the factory. If an object is a piece of Ikea furniture, then the class is the instructions that tell you how to put it together. Building, blueprint; vineyard, wine. Etc. Insert your own metaphor here.

Let’s make the simplest class possible:

>>> class Rocket(object):
...     pass
... 
>>>

As you can see, class definitions in Python begin with the keyword class, followed by the name of the class, followed (in parentheses) by the name of the class that this class should “inherit” from (see below for more details on inheritance). We’re not inheriting from any class we’ve already defined, so we inherit from the built-in class object. (In Python 2, which these notes use, this makes Rocket a “new-style” class; in Python 3, every class inherits from object whether or not you write it out.)

That’s all you need to make a class. (The pass keyword just means “nothing’s here”; it’s like empty curly brackets in other languages.) It won’t be a very interesting class, since it has no attributes or methods, but it’s a fully-fledged class nonetheless.

Now that we’ve defined a class, we can instantiate an object of that class. This is done by calling the class as though it were a function. The resulting object can’t do much, but it will respond to the type function (which returns the type of the object, i.e., the class from which it was instantiated), and the isinstance function (which returns True if the first parameter is an instance of the class specified as the second parameter):

>>> rocket = Rocket()
>>> type(rocket)
<class '__main__.Rocket'>
>>> isinstance(rocket, Rocket)
True
>>> isinstance(rocket, list)
False

Attributes, methods, and __init__

Class definitions can include more than just pass, of course. For the most part, when you define a class, you’ll be defining methods: functions that belong to the class and are designed to work with objects of that class. Let’s redefine our Rocket class with a bit more functionality:

>>> class Rocket(object):
...     def __init__(self, power, awesomeness):
...         self.power = power
...         self.awesomeness = awesomeness
...     def launch(self):
...         output = "kab" + ('o' * self.power) + "m!"
...         if self.awesomeness >= 5000:
...             output = output.upper()
...         return output
... 
>>> rocket = Rocket(5, 7000)
>>> rocket.power
5
>>> rocket.awesomeness
7000
>>> rocket.launch()
'KABOOOOOM!'

Here we’ve defined a class with two methods. The first thing you’ll notice is that both of these methods take self as a first parameter. You can think of self as analogous to this in Processing/Java: it simply means “whichever object this method was called on.” Python automatically passes this parameter to the method, so while you don’t have to include it when calling the method, you do have to include it when you define the method.
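One way to see what Python is doing with self is to make the call both ways. The sketch below repeats the Rocket class so that it runs on its own; the variable names usual and explicit are just for illustration.

```python
class Rocket(object):
    def __init__(self, power, awesomeness):
        self.power = power
        self.awesomeness = awesomeness

    def launch(self):
        output = "kab" + ('o' * self.power) + "m!"
        if self.awesomeness >= 5000:
            output = output.upper()
        return output


rocket = Rocket(2, 100)

# the usual call: Python fills in self with rocket for us
usual = rocket.launch()

# the same call spelled out: look the function up on the class and
# pass the object in explicitly as self
explicit = Rocket.launch(rocket)
```

Both calls return the same string. You would rarely write the second form, but it shows where the self parameter comes from.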

The __init__ (short for “initialize”) method is a special method: it’s called automatically whenever an object is instantiated. Any arguments that are passed when the object is instantiated (e.g., 5, 7000 in the transcript above) are passed along to __init__.

Attributes

Inside __init__, we use the parameters to set attributes on the object. You can set and access attributes by putting a dot (.) between the object and some name (e.g., rocket.power retrieves the power attribute of the rocket object). You can think of an object’s attributes as a strange-looking dictionary that holds the data associated with that object.
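The dictionary comparison is closer to the truth than it might seem. As a rough sketch, the built-in vars function returns the dictionary in which an ordinary object keeps its attributes, and getattr looks an attribute up by name. The fuel attribute below is made up for illustration.

```python
class Rocket(object):
    def __init__(self, power, awesomeness):
        self.power = power
        self.awesomeness = awesomeness


rocket = Rocket(5, 7000)

# copy the attribute dictionary so later changes don't show up in it
attrs = dict(vars(rocket))

# looking up an attribute by name, much like a dictionary lookup
power = getattr(rocket, 'power')

# attributes can also be added after the object has been created
rocket.fuel = "kerosene"
```

Here attrs holds the two attributes set in __init__, while vars(rocket) called again afterwards would also include fuel.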

Methods

The launch method then accesses those attributes to determine what it should output. Here we make a string that varies depending on the rocket’s power and awesomeness.

Example in context: sentence.py

import conjunctions

class Sentence(object):

  # __init__ provides default values for all four parts of the sentence
  def __init__(self, subj="No one", verb="ignored", direct_obj="",
      prep_phrase=""):
    self.subj = subj
    self.verb = verb
    self.direct_obj = direct_obj
    self.prep_phrase = prep_phrase

  # render puts all the parts of the sentence together, only adding
  # direct object and prepositional phrase if present
  def render(self):
    elems = [self.subj, self.verb]
    if len(self.direct_obj) > 0:
      elems.append(self.direct_obj)
    if len(self.prep_phrase) > 0:
      elems.append(self.prep_phrase)
    output = ' '.join(elems)
    return output

  # statement: renders this sentence as a statement
  def statement(self):
    output = self.render()
    output += "."
    return output

  # uses the conjunctions module to randomly conjoin this sentence with
  # another
  def statement_conjoined_with(self, other_sentence):
    output = conjunctions.random_conjoin(self.render(), other_sentence.render())
    output += "."
    return output

if __name__ == '__main__':
  sentence1 = Sentence('John', 'ate', 'cheese', 'in a sack')
  print sentence1.statement()
  sentence2 = Sentence('George', 'slept')
  print sentence2.statement()
  print sentence1.statement_conjoined_with(sentence2)
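The conjunctions module that sentence.py imports isn’t shown in these notes. If you want to run sentence.py yourself, a stand-in along the following lines (saved as conjunctions.py) should do. Everything here is a guess: the only thing we know from sentence.py is that the module has a random_conjoin function that takes two strings and returns one.

```python
import random

# hypothetical word list; the real module may use different conjunctions
CONJUNCTIONS = ["and", "but", "or", "so", "yet"]


def random_conjoin(first, second):
    # join the two strings with a randomly chosen conjunction
    # (the comma-and-space format is an assumption)
    conj = random.choice(CONJUNCTIONS)
    return first + ", " + conj + " " + second


example = random_conjoin("John ate cheese", "George slept")
```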

- explain if __name__ == '__main__'

(notes forthcoming)
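In the meantime, a brief sketch of the idiom. When Python runs a file directly, it sets the variable __name__ to the string '__main__'; when the same file is imported from another program, __name__ is set to the module’s name instead. The file name shout.py and the function shout below are made up for illustration.

```python
# contents of a hypothetical file named shout.py

def shout(text):
    # a function that other programs might want to import
    return text.upper() + "!"


if __name__ == '__main__':
    # this part runs only when the file is executed directly
    # (python shout.py), not when another program does "import shout"
    print(shout("hello"))
```

This is why the code at the bottom of sentence.py doesn’t run when another program imports the Sentence class.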

An object-oriented concordance

In previous weeks, we’ve looked at how to write a program that makes a simple concordance. This program works along the same model as many of the other programs we’ve made: it reads in some input, operates on that input, and produces some output. The thing about a concordance, however, is that in an abstract sense it does more than just filter text and print out text: once the data structure has been built, there might be any number of ways we want to access and print out that data. We’ve built the data structure; now we need to get all Donald Norman on that and design an interface to the data.

In the following program, we define a class called Concordance, which implements the same kind of simple word count concordance as programs of weeks past. We’ve decided that our interface to the concordance data will include functions to:

  • add a line of text to be processed and added to the concordance
  • get the count for a particular word
  • get a list of all unique words
  • return word/count pairs from the concordance, sorted in reverse order by word frequency
  • return the N most common words from the concordance

Implementation below.

class Concordance(object):

  def __init__(self):
    self.concord = dict()

  def tokenize(self, line):
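    # split on any run of whitespace; unlike split(" "), this doesn't
    # produce empty strings for blank lines or repeated spaces
    return line.split()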

  def feed(self, line):
    words = self.tokenize(line)
    for word in words:
      if word not in self.concord:
        self.concord[word] = 0
      self.concord[word] += 1

  def count_for_word(self, word):
    if word in self.concord:
      return self.concord[word]
    else:
      return 0

  def unique_words(self):
    return self.concord.keys()

  def reverse_sorted_pairs(self):
    # magic words for getting a list of word/count pairs, sorted in reverse
    # order by count (iteritems is Python 2; in Python 3, use items)
    return sorted(self.concord.iteritems(), key=lambda x: x[1],
      reverse=True)

  def most_common_words(self, num):
    pairs = self.reverse_sorted_pairs()
    # return a list with just the first item (the word) from the reverse sorted
    # list of word/count pairs (up to num items)
    return [p[0] for p in pairs[:num]]

  def get_concordance(self):
    return self.concord

if __name__ == '__main__':

  import sys
  concordance = Concordance()
  for line in sys.stdin:
    line = line.strip()
    concordance.feed(line)

  unique_word_count = len(concordance.unique_words())
  print "number of unique words in input: " + str(unique_word_count)

  the_count = concordance.count_for_word('the')
  print "the word 'the' appears " + str(the_count) + " times"

  print "top twenty words, in order of frequency:"
  pairs = concordance.reverse_sorted_pairs()
  for pair in pairs[:20]:
    print "\t" + pair[0] + ": " + str(pair[1])

The section after if __name__ == '__main__': shows one possible way to use this class: read lines from standard input, feed them to a concordance object, then call various methods to get useful data from the object. The great thing about this program is that because (a) we’ve defined the functionality in a class and (b) we’ve put our “test” code after if __name__ == '__main__':, we could potentially import Concordance in some other program, and use our concordance class as-is, with data that comes from any source.
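As a rough illustration of “data that comes from any source,” the sketch below feeds a concordance from a plain list of strings instead of standard input. To keep the example runnable on its own, it includes a trimmed-down version of the class with only two of its methods; in a real program you would import the full class instead. The sample lines are made up.

```python
class Concordance(object):
    # trimmed-down version of the class above, for illustration only

    def __init__(self):
        self.concord = dict()

    def feed(self, line):
        for word in line.split():
            if word not in self.concord:
                self.concord[word] = 0
            self.concord[word] += 1

    def count_for_word(self, word):
        return self.concord.get(word, 0)


# the data source is a list here, but it could just as easily be a
# file, a web page, or lines typed in by a user
lines = ["the cat sat on the mat", "the dog sat too"]

concordance = Concordance()
for line in lines:
    concordance.feed(line)

the_count = concordance.count_for_word("the")
sat_count = concordance.count_for_word("sat")
```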

Encapsulating and abstracting a poem generator

Poem generators might also have more sophisticated interfaces than just reading lines of text and printing out lines of text. Let’s write a very simple haiku generator:

(WorstPossibleHaikuGenerator.py)
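The file itself isn’t reproduced in these notes, so what follows is only a guessed stand-in. The interface (a constructor that takes three numbers, plus add_word and generate methods) is inferred from how combo.py uses the class, and the decision to count words rather than syllables is inferred from the sample output, which has five, seven, and five words per line. The real implementation may differ.

```python
import random


class WorstPossibleHaikuGenerator(object):
    """Guessed stand-in; the real file is not shown in these notes."""

    def __init__(self, first, second, third):
        # number of words (not syllables) on each of the three lines
        self.line_lengths = [first, second, third]
        self.words = list()

    def add_word(self, word):
        self.words.append(word)

    def generate(self):
        # build each line from randomly chosen words, then join the
        # lines with newlines
        lines = list()
        for length in self.line_lengths:
            chosen = [random.choice(self.words) for i in range(length)]
            lines.append(' '.join(chosen))
        return '\n'.join(lines)


haikugen = WorstPossibleHaikuGenerator(5, 7, 5)
for word in ["rays", "village", "night", "spears"]:
    haikugen.add_word(word)
haiku = haikugen.generate()
```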

The program in combo.py shows how we might use both of these classes together to create a program that generates terrible ersatz haiku from a combination of randomly selected unique words and the 100 most common words from a given source text:

from WorstPossibleHaikuGenerator import WorstPossibleHaikuGenerator
from Concordance import Concordance
import sys
import random

concord = Concordance()
for line in sys.stdin:
  line = line.strip()
  concord.feed(line)

haikugen = WorstPossibleHaikuGenerator(5, 7, 5)
unique_words = concord.unique_words()
for i in range(100):
  haikugen.add_word(random.choice(unique_words))

most_common = concord.most_common_words(100)
for word in most_common:
  haikugen.add_word(word)

print haikugen.generate()

Running this program against, e.g., lovecraft.txt might produce the following output:

space, sailed rays is men
rays bearded saw Then his night not
village thirteen when lands spears

Inheritance, adding functionality and overriding methods

(following material for reference only, we’re not talking about this stuff in lecture)

Another benefit of object-oriented programming is the concept of inheritance. Inheritance is a way to create a new class that behaves in all ways like another class, except for whatever differences we define. Here’s how to do it, using the Sentence class in sentence.py as the class we’re inheriting from (the “base class”):

>>> import sentence
>>> class Sentence2(sentence.Sentence):
...     pass
... 
>>> s = Sentence2('Joe', 'shrank', 'his sweater')
>>> s.statement()
'Joe shrank his sweater.'

The Sentence2 class here inherits from the Sentence class in sentence.py. That means that objects of class Sentence2 behave exactly like objects of class Sentence, without our having to define all of the same methods in class Sentence2.
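One consequence worth seeing in code: an object of the new class also counts as an instance of the base class. The sketch below uses a tiny stand-in for Sentence so that it runs without sentence.py; issubclass is a built-in function that checks the relationship between two classes.

```python
class Sentence(object):
    # tiny stand-in for the Sentence class in sentence.py
    def __init__(self, subj, verb):
        self.subj = subj
        self.verb = verb


class Sentence2(Sentence):
    pass


s = Sentence2('Joe', 'shrank')

is_sentence2 = isinstance(s, Sentence2)        # True
is_sentence = isinstance(s, Sentence)          # True: a Sentence2 is also a Sentence
is_subclass = issubclass(Sentence2, Sentence)  # True
```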

We can easily add new functionality in a subclass by defining a method that isn’t present in the base class:

>>> class Sentence3(sentence.Sentence):
...     def hedge(self):
...         output = self.render() + ", or at least that's what I heard."
...         return output
... 
>>> s = Sentence3('Joe', 'shrank', 'his sweater')
>>> s.statement()
'Joe shrank his sweater.'
>>> s.hedge()
"Joe shrank his sweater, or at least that's what I heard."

The Sentence3 class supports all methods that the Sentence class supports, in addition to the new hedge method.

But the real power of inheritance becomes manifest when we override methods of the base class. This lets us replace parts of the functionality of the base class, while leaving the rest of the functionality intact. Behold, weird_sentence.py:

import sentence
import random

class WeirdSentence(sentence.Sentence):

  def render(self):
    elems = [self.subj, self.verb]
    if len(self.direct_obj) > 0:
      elems.append(self.direct_obj)
    if len(self.prep_phrase) > 0:
      elems.append(self.prep_phrase)
    random.shuffle(elems)
    output = ' '.join(elems)
    return output

if __name__ == '__main__':
  weird = WeirdSentence('John', 'ate', 'a sandwich', 'yesterday')
  print weird.statement()
  normal = sentence.Sentence('George', 'bought', 'flowers')
  print normal.statement()
  print weird.statement_conjoined_with(normal)
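Notice that statement was never redefined in WeirdSentence, yet it picks up the shuffled word order, because the inherited statement method calls self.render(). An overriding method doesn’t have to start from scratch, either: it can call the base class’s version with super and then alter the result. The sketch below shows this with a trimmed stand-in for Sentence; the LoudSentence class is invented for illustration.

```python
class Sentence(object):
    # trimmed stand-in for sentence.Sentence
    def __init__(self, subj="No one", verb="ignored"):
        self.subj = subj
        self.verb = verb

    def render(self):
        return ' '.join([self.subj, self.verb])

    def statement(self):
        return self.render() + "."


class LoudSentence(Sentence):
    def render(self):
        # reuse the base class's render, then change what it returns
        return super(LoudSentence, self).render().upper()


loud = LoudSentence('George', 'slept')
result = loud.statement()  # 'GEORGE SLEPT.'
```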

Exercises

  1. Make a class that models a person. Specifically, instances of this class should have two attributes: first_name and last_name. The class should also define a full_name method which, when called, returns the first name and last name concatenated.
  2. Modify (or extend) the Sentence class in sentence.py to print out sentences in the form of a question. You can either define a new method (e.g., render_as_question) or override the existing render method.
  3. Take one of your homework assignments and re-envision it as an object-oriented program.

Helpful resources
