11.5   Modifying Variables

    11.5.1 Modifying Variables in Memory

    11.5.2 Variable Updates and Aliasing

    11.5.3 Updating Variables versus Updating Data Fields

    11.5.4 Updating Parameters in Function Calls

    11.5.5 Updating Top-Level Variables within Function Calls

    11.5.6 The Many Roles of Variables

11.5.1   Modifying Variables in Memory

Now that we have introduced the idea of the heap, let’s revisit our use of a variable to compute the sum of elements in a list. (If you haven’t seen Pyret for each loops, you may want to review the section where they were introduced.) Here again is our code from earlier:

Python

total = 0
for num in [5, 1, 7, 3]:
  total = total + num

Pyret

var total = 0
for each(num from [list: 5, 1, 7, 3]):
  total := total + num
end

Let’s see how the directory and heap update as we run this code. In Basic Data and the Heap, we pointed out that basic data (such as numbers, strings, and booleans) don’t get put in the heap because they have no internal structure. Those values are stored in the directory itself. Therefore, the initial value for total is stored within the directory.

Directory

  • total

      

    0

The for or for each loop also sets up a directory entry, this time for the variable num that is used to refer to the list elements. When the loop starts, num takes on the first value in the list. Thus, the directory appears as:

Directory

  • total

      

    0

  • num

      

    5

Inside the for / for each loop, we compute a new value for total. The use of = tells Python to modify the value of total; in Pyret, we have to use := to indicate we are modifying a variable (and this will error if the variable has not been declared with var).

Do Now!

Does this modification get made in the directory or the heap?

Since basic data values are stored only in the directory, this update modifies the contents of the directory. The heap isn’t involved:

Directory

  • total

      

    5

  • num

      

    5

This process continues: both Python and Pyret advance num to the next list element

Directory

  • total

      

    5

  • num

      

    1

then modify the value of total:

Directory

  • total

      

    6

  • num

      

    1

This process continues until all of the list elements have been processed. When the for/for each-loop ends, the directory contents are:

Directory

  • total

      

    16

  • num

      

    3
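As a rough cross-check of the trace above, the following Python sketch records the values of total and num at the end of each loop iteration. The snapshots list is our own addition for illustration and is not part of the original program.

```python
# Record the directory entries for total and num after each iteration.
# `snapshots` is a hypothetical helper list added only for this trace.
total = 0
snapshots = []
for num in [5, 1, 7, 3]:
    total = total + num
    snapshots.append({"total": total, "num": num})

for entry in snapshots:
    print(entry)
```

The last entry printed should match the final directory shown above: total is 16 and num is 3.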

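There are two takeaways from this example:

  1. A for or for each loop adds its own entry to the directory for the loop variable (here, num) and updates that entry on each iteration.

  2. Updating a variable that holds basic data changes the directory only; the heap is not involved.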

Exercise

Draw the sequence of directory contents for the following program:

Python

score = 0
score = score + 4
score = 10

Pyret

var score = 0
score := score + 4
score := 10

Exercise

Draw the sequence of directory contents for the following program:

Python

count_long = 0
words = ["here", "are",
         "some", "words"]
for word in words:
  if len(word) > 4:
    count_long = count_long + 1

Pyret

var count-long = 0
words = [list: "here", "are",
         "some", "words"]
for each(word from words):
  when string-length(word) > 4:
    count-long := count-long + 1
  end
end

11.5.2   Variable Updates and Aliasing

In State, Change, and Testing, we saw how a statement of the form elena.acct.balance = 500 resulted in a change to jorge.acct.balance. Does this same effect occur if we update the value of a variable directly, rather than a field? Consider the following example:

Python

y = 5
x = y

Pyret

var y = 5
x = y

Do Now!

What do the directory and heap look like after running this code?

Since x and y are assigned basic values, there are no values in the heap:

Directory

  • y

      

    5

  • x

      

    5

Do Now!

If we now evaluate y = 3 in Python, or y := 3 in Pyret, does the value of x change?

It does not. The value associated with y in the directory changes, but there is no connection between x and y in the directory. The statement x = y says "get the value of y and associate it with x in the directory". Immediately after this statement, y and x refer to the same value, but this relationship is neither tracked nor maintained. If we associate either variable with a new value, as we do with y = 3, the directory entry for that variable (and only that entry) is changed to reflect the new value. Thus, the directory after we evaluate y = 3 appears as follows:

Directory

  • y

      

    3

  • x

      

    5
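The same steps can be run directly in Python as a quick check; this is only a minimal sketch of the example above.

```python
y = 5
x = y      # x gets the value 5; no link between x and y is recorded
y = 3      # only the directory entry for y changes
print(x, y)
```

After the last assignment, x still holds 5 while y holds 3.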

This example highlights that aliasing occurs only when two variables refer to the same piece of data with components, not when variables refer to basic data. This is because data with components are stored in the heap, with their heap addresses stored in the directory. Note, though, that uses of varname = ... still affect the directory, even when the values are data with components.

Do Now!

After running the following code, what is the value of ac2.balance?

Python

ac1 = Account(8623, 600)
ac2 = ac1
ac1 = Account(8721, 350)

Pyret

var ac1 = Account(8623, 600)
ac2 = ac1
ac1 := Account(8721, 350)

Draw the directory and heap contents for this program and check your prediction.

All three of these lines result in changes in the directory. The first and third lines also result in changes in the heap, but only because they create new pieces of data. ac1 and ac2 are aliases immediately after running the second line, but the third line breaks that relationship, so ac2.balance is still 600.
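One way to check this in Python is with the is operator, which reports whether two names refer to the same datum in the heap. The sketch below assumes that Account is a simple dataclass with id and balance fields; that definition is our stand-in for the one used earlier in the book, and the two flag variables are hypothetical names added for illustration.

```python
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: int

ac1 = Account(8623, 600)
ac2 = ac1
aliases_after_line_2 = ac1 is ac2   # both names hold the same heap address

ac1 = Account(8721, 350)            # new heap datum; directory entry for ac1 updated
aliases_after_line_3 = ac1 is ac2   # ac2 still refers to the first Account

print(aliases_after_line_2, aliases_after_line_3, ac2.balance)
```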

Do Now!

After running the following code, what is the value of ac3.balance?

Python

savings = 475
ac3 = Account(8722, savings)
savings = 500

Pyret

var savings = 475
ac3 = Account(8722, savings)
savings := 500

Draw the directory and heap contents for this program and check your prediction.

Since the value of savings is stored in ac3.balance, and not the name savings itself, updating the value of savings on the third line does not affect ac3.balance.
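A short Python check of this behavior, again assuming a simple dataclass as a stand-in for the book's Account definition:

```python
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: int

savings = 475
ac3 = Account(8722, savings)   # the value 475 is copied into the balance field
savings = 500                  # only the directory entry for savings changes

print(ac3.balance, savings)
```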

11.5.3   Updating Variables versus Updating Data Fields

We’ve now seen two different forms of updates in programs: updates to fields of structured data in State, Change, and Testing, and updates to the values associated with names when computing over lists with for loops. At a quick glance, especially in Python, these two forms of update look similar:

Python

acct1.balance = acct1.balance - 50
run_total = run_total + fst

Pyret

acct1!{balance: acct1!balance - 50}
run-total := run-total + fst

While Pyret makes these two operations look quite distinct, in Python, both use the = operator and compute a new value on the right side. The left sides, however, are subtly different: one is a field within structured data, while the other is a name in the directory. This difference turns out to be significant: the first form changes a value stored in the heap but leaves the directory unchanged (hence using the heap modification syntax in Pyret), while the second updates the directory but leaves the heap unchanged (hence using the := directory updating syntax in Pyret).

At this point, you might not appreciate why this difference is significant. But for now, let’s summarize how each of these forms impacts each of the directory and the heap.

Strategy: Rules for updating the directory and the heap

Summarizing, the rules for how the directory and heap update are as follows:

  • We add to the heap when a data constructor is used

  • We update the heap when a field of existing data is reassigned

  • We add to the directory when a name is used for the first time (this includes parameters and internal variables when a function is called)

  • We update the directory when a name that is already in the directory is subsequently assigned a new value
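The four rules can be seen side by side in a small Python sketch. The Account dataclass and the names acct and other are assumptions made for illustration; each comment states which rule the line is meant to exercise.

```python
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: int

acct = Account(8001, 100)   # constructor: adds to the heap; new name: adds to the directory
other = acct                # new name: adds to the directory; no new heap data
acct.balance = 150          # field reassigned: updates the heap
acct = Account(8002, 75)    # constructor: adds to the heap; existing name: updates the directory

print(other.id, other.balance)
print(acct.id, acct.balance)
```

Because other still holds the address of the first Account, it sees the field update (balance 150) but not the later reassignment of acct.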

Do Now!

After running the following code, what is the value of ac3.balance?

Python

ac2 = Account(8728, 200)
ac3 = ac2
print(ac3.balance)
ac2.balance = 500
print(ac3.balance)
ac2 = Account(8734, 350)
ac2.balance = 700
print(ac3.balance)

Pyret

var ac2 = Account(8728, 200)
ac3 = ac2
print(ac3!balance)
ac2!{balance: 500}
print(ac3!balance)
ac2 := Account(8734, 350)
ac2!{balance: 700}
print(ac3!balance)

Draw the directory and heap contents for this program and check your prediction.

This example combines updates to variables and updates to fields. On the third line, ac2 and ac3 refer to the same address in the heap (which contains the Account with id 8728). Immediately after updating ac2.balance on the fourth line, the balance in both ac2 and ac3 is 500. Line six, however, creates a new Account in the heap and updates the directory to have ac2 refer to that new Account. From that point on, ac2 and ac3 refer to different accounts, so the update to the balance in ac2 on the seventh line does not affect ac3.

This example illustrates the subtleties and impacts of different uses of = in Python. Programs behave differently depending on whether the left side of the = is a variable name or a field reference, and on whether the right side is basic data or data with components. We will continue to work with these various combinations to build your understanding of when and how to use each one.

11.5.4   Updating Parameters in Function Calls

When we first learned about the directory in The Program Directory, we showed how function calls created their own local directory segments to store any names that got introduced while running the function. Now that we have the ability to update the values associated with variables, we should revisit this topic to understand what happens when these updates occur within functions.

Consider the following two functions, presented only in Python this time:

def add10(num: int):
  num = num + 10

def deposit10(ac: Account):
  ac.balance = ac.balance + 10

Let’s use these two functions in a program:

x = 15
a = Account(8435, 500)
add10(x)
deposit10(a)

Do Now!

What are the values of x and a when the program has finished?

Let’s draw out the directory and heap for this program.

After the first two lines but before the function calls, we have the following:

Directory

  • x

      15

  • a

      1014

Heap

  • 1014: 

    Account(8435, 500)

Calling add10 creates a local directory containing the name of the parameter:

Directory

  • num

      15

Heap

  • 1014: 

    Account(8435, 500)

Wait – why is the heap listed alongside the local directory? Only the directory gets localized during function calls. The same heap is used at all times.

The body of add10 now updates the value of num in the directory to 25. Note that, in order for this to work in Pyret, the variable num would have had to be declared as mutable (using var), but this is not possible with function parameters, hence only showing these examples in Python. This change does not affect the value of x in the top-level directory, for the same reasons we explained in Variable Updates and Aliasing regarding the lack of aliasing between variables that refer to basic data. Thus, once the function finishes and the local directory is deleted, the value associated with x is unchanged.

Now, let’s evaluate the call deposit10(a). As with add10, we create a local directory and create an entry for the parameter. What gets associated with that parameter in the directory, however?

Directory

  • ac

      1014

Heap

  • 1014: 

    Account(8435, 500)

Do Now!

Why didn’t we create a new Account datum when we made the function call?

Remember our rule for when we create new data in the heap: we only create heap data when we explicitly use a constructor. The function call does not involve creating a new Account. Whatever is associated with the name a gets associated with the parameter name ac. In other words, we have created an alias between a and ac.

In the body of deposit10, we update the balance of ac, which is also the balance of a due to the aliasing. Since there is no local heap, when the function call is over, the new balance persists in a.

All we’ve done here is put together pieces that we’ve already seen, just in a new context. We’re passing parameters and updating either the (local) directory or the heap according to how we have used =. But this example highlights a detail that initially confuses many people when they start writing functions that update variables.

Strategy: Updating Values within Functions

If you want a function to update a value and have that update persist after the function completes, you must put that value inside a piece of data. You cannot have it be basic data associated with a variable name.
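The contrast between the two functions can be confirmed with a short Python run. This is a minimal sketch that assumes Account is a simple dataclass with id and balance fields, standing in for the definition used earlier in the book.

```python
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: int

def add10(num: int):
    # Updates only the local directory entry for num.
    num = num + 10

def deposit10(ac: Account):
    # Updates the heap datum that both ac and the caller's name refer to.
    ac.balance = ac.balance + 10

x = 15
a = Account(8435, 500)
add10(x)
deposit10(a)

print(x, a.balance)
```

After both calls, x is unchanged while the balance stored in a has increased by 10.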

11.5.5   Updating Top-Level Variables within Function Calls

Let’s return to our banking example to illustrate a situation where the ability to update variables is extremely useful. Consider our current process for creating new accounts in the bank by looking at the following example:

ac5 = Account(8702, 435)
ac6 = Account(8703, 280)
ac7 = Account(8704, 375)

Notice that each time we create an Account, we have to take care to increase the ID number. What if we made a typo or accidentally forgot to do this?

ac5 = Account(8702, 435)
ac6 = Account(8703, 280)
ac7 = Account(8703, 375)

Now we’d have multiple accounts with the same ID number, when we really need these numbers to be unique across all accounts. To avoid such problems, we should instead have a function for creating accounts that takes the initial balance as input and uses a guaranteed-unique ID number.

How might we write such a function? The challenge is to be able to generate unique ID numbers each time. What if we used a variable to store the next available ID number, updating it each time we created a new account? That function might look as follows:

Python

# stores the next available ID number
nextID = 8000

def createacct(initbal: float) -> Account:
  newacct = Account(nextID, initbal)
  nextID = nextID + 1
  return newacct

Pyret

# stores the next available ID number
var nextID = 8000

fun createacct(initbal: Number) -> Account block:
  newacct = Account(nextID, initbal)
  nextID := nextID + 1
  newacct
end

Let’s run this program, creating new accounts as follows:

ac5 = createacct(435)
ac6 = createacct(280)
ac7 = createacct(375)

Do Now!

Copy this code into Pyret and run it. Check that each of ac5, ac6, and ac7 has a unique ID number. Now try the same in Python.

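What happened? In Pyret, the three accounts get distinct IDs. In Python, our update to nextID never reached the top-level variable. (Standard Python in fact reports an UnboundLocalError on the first call: because createacct assigns to nextID, Python treats nextID as a local name throughout the function body, so the read in Account(nextID, initbal) fails. The walkthrough that follows uses a simplified model in which that read falls back to the top-level directory, so that every account would receive the ID 8000. The underlying cause is the same either way.) The problem lies in where Python tries to make the update; to understand it, we have to look at what happens in the directory.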

Do Now!

Draw the memory diagram for this example.

After we set up nextID and define the function, our memory diagram appears as:

Directory

  • nextID

      8000

Now, let’s evaluate ac5 = createacct(435). We call createacct, which yields the following local directory after creating the Account but before updating nextID.

Directory

  • initbal

      435

  • newacct

      1015

Heap

  • 1015: 

    Account(8000, 435)

Do Now!

What do you think happens when we run nextID = nextID + 1 in Python?

Let’s run this carefully. Python first evaluates the right side of the = (nextID + 1). nextID is not in the local directory, so Python retrieves its value (8000) from the top-level directory. Thus, this computation becomes nextID = 8001.

The question here is how Python treats nextID = 8001: we currently have both the local directory for the function call and the top-level directory. Which one should get the new value of nextID? Since the local directory is active, Python sets the value of nextID there.

Directory

  • initbal

      435

  • newacct

      1015

  • nextID

      8001

Heap

  • 1015: 

    Account(8000, 435)

Let’s repeat that: Python computed nextID + 1 using the nextID value from the top-level directory since there was no value for nextID in the local directory. But the setting of the value of nextID could and did occur in the local directory. Thus, when createacct finishes, the value of nextID in the top-level directory is unchanged. As a result, all of the accounts get the same value.

The computation we are trying to do (updating the top-level variable) is just fine. The problem is that Python defaults to the local directory. To make this work, we need to tell Python that we want to make updates to nextID in the top-level directory. Here’s the version of createacct that does that:

def createacct(initbal: float) -> Account:
  global nextID
  newacct = Account(nextID, initbal)
  nextID = nextID + 1
  return newacct

The global keyword tells Python to make updates to the given variable in the top-level directory, not the local directory. Once we make this modification, each account we create will get a unique ID number.
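The following Python sketch puts the two versions next to each other. It assumes Account is a simple dataclass with id and balance fields; the function names createacct_without_global and createacct_with_global, and the variables outcome and ids, are hypothetical names chosen for this comparison. Note that in standard Python the version without global does not quietly reuse the top-level value: since the function assigns to nextID, the name is local to the whole body, and the first read raises an UnboundLocalError.

```python
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: float

# stores the next available ID number
nextID = 8000

def createacct_without_global(initbal: float) -> Account:
    # The assignment below makes nextID local to this function,
    # so this first read has no local value to use.
    newacct = Account(nextID, initbal)
    nextID = nextID + 1
    return newacct

def createacct_with_global(initbal: float) -> Account:
    global nextID   # reads and updates go to the top-level directory
    newacct = Account(nextID, initbal)
    nextID = nextID + 1
    return newacct

try:
    createacct_without_global(435)
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

ids = [createacct_with_global(bal).id for bal in (435, 280, 375)]
print(outcome, ids, nextID)
```

With the global declaration in place, the three accounts receive consecutive IDs starting at 8000, and the top-level nextID ends up one past the last ID handed out.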

Why is this not an issue with Pyret? Since := is used to modify, but not add, a value to the program directory, Pyret has no choice but to choose the version from the top-level directory when there is no local version. Indeed, even if we wanted to have a local version as well, declaring var nextID = ... within createacct, Pyret would complain due to the local variable shadowing the top-level one. By disallowing shadowing and separating variable declaration from modification, Pyret eliminates this potential source of confusion.

Responsible Computing: Keeping IDs Unpredictable

While this general pattern of generating unique IDs works, in practice we shouldn’t use consecutive numbers. Consecutive numbers are guessable: if there is an account 8000 there must be an account 8001, and so on. Guessable account numbers could make it easier for someone to find valid IDs by trial and error and then use them to log into websites or otherwise access information.

Instead, we would use a computation that is less predictable than “add 1” when storing the nextID value. For now, the pattern we have shown you is fine. If you were building a real system, however, you’d want to make that computation a bit more sophisticated.
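As one possible illustration (not a recommendation for a production system), the sketch below draws IDs from Python's standard secrets module and retries until it finds one that has not been used. The names used_ids and fresh_id, the eight-digit ID range, and the Account dataclass are all assumptions made for this example.

```python
import secrets
from dataclasses import dataclass

# Assumed stand-in for the Account definition used in the text.
@dataclass
class Account:
    id: int
    balance: float

# IDs handed out so far (hypothetical bookkeeping for this sketch).
used_ids = set()

def fresh_id() -> int:
    # Draw random eight-digit IDs until we find one not yet in use.
    while True:
        candidate = 10_000_000 + secrets.randbelow(90_000_000)
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate

def createacct(initbal: float) -> Account:
    return Account(fresh_id(), initbal)

accounts = [createacct(bal) for bal in (435, 280, 375)]
print([acct.id for acct in accounts])
```

No global declaration is needed here: fresh_id never assigns a new value to the name used_ids. It only modifies the set that the name refers to, which is an update to the heap rather than to the directory.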

11.5.6   The Many Roles of Variables

At this point, we have used the single coding construct of a variable in the directory for multiple purposes. It’s worth stepping back and calling those out explicitly. In general, variables serve one of the following purposes:

  1. Tracking progress of a computation (e.g., the running value of a result in a for-loop)

  2. Maintaining information across multiple calls to a single function (e.g., the nextID variable)

  3. Naming a local or intermediate value in a computation

Each of these uses involves a different programming pattern. The first creates a variable locally within a function. The second creates a top-level variable and, in Python, requires using global in functions that modify the contents. However, the variable is only meant to be used by a single function. Ideally, there would be a way to not expose the variable to all functions. Indeed, many programming languages (including Pyret) make it easy to do that. This is harder to achieve with introductory-level concepts in Python, however. The third is more about local names rather than variables, in that our code never updates the value after the variable is created.

We call out these three roles precisely because they involve different code patterns, despite using the same fine-grained concept (assigning a new value to a variable). When you look at a new programming problem, you can ask yourself whether the problem involves one of these purposes, and use that to guide your choice of pattern to use.
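To make the three roles concrete, here is a small Python sketch with one variable in each role. All of the names (count_long, calls_made, logged_count_long) are hypothetical and chosen only for this illustration.

```python
def count_long(words: list) -> int:
    count = 0                  # role 1: tracks the progress of the computation
    for word in words:
        length = len(word)     # role 3: names an intermediate value, never updated
        if length > 4:
            count = count + 1
    return count

calls_made = 0                 # role 2: information kept across calls to one function

def logged_count_long(words: list) -> int:
    global calls_made          # needed because we update a top-level variable
    calls_made = calls_made + 1
    return count_long(words)

result = logged_count_long(["here", "are", "some", "words"])
print(result, calls_made)
```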