Thursday, October 1, 2015

Copying lists (shallow copy vs. deep copy)

This post is an excerpt from my upcoming eBook "Mastering Python Lists".


Introduction

Python supports two types of list copying: “shallow” and “deep”. A shallow copy is a new list that shares all of its items with the original list. In contrast, a deep copy also copies the items themselves, including anything nested inside them, so it is independent of the original.

Shallow copies

A shallow copy can be created in three common ways:

  • list() — passing the original list to the “list” function
  • [:] — taking a full slice of the original list
  • copy.copy() — using the “copy” function from the “copy” module
All three produce a new list object whose items are the same objects held by the original list.

import copy

original_list = [1, 2, 3]
shallow_copy_1 = original_list[:]           # full slice
shallow_copy_2 = list(original_list)        # list() constructor
shallow_copy_3 = copy.copy(original_list)   # copy module
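
As a quick sanity check, here is a small sketch showing that each technique yields a new list object that compares equal to the original. The variable names follow the snippet above; the loop and the append at the end are my own illustrative additions.

```python
import copy

original_list = [1, 2, 3]
shallow_copy_1 = original_list[:]
shallow_copy_2 = list(original_list)
shallow_copy_3 = copy.copy(original_list)

# Each copy has equal contents but is a distinct list object.
for candidate in (shallow_copy_1, shallow_copy_2, shallow_copy_3):
    print(candidate == original_list, candidate is original_list)  # True False

# Appending to a copy changes only that copy's own list of references.
shallow_copy_1.append(4)
print(original_list)    # [1, 2, 3]
print(shallow_copy_1)   # [1, 2, 3, 4]
```

Because the items here are plain integers, which cannot be modified in place, the shallow copies behave as if they were fully independent. The difference only shows up once the list contains mutable items.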

Shallow copy example

Shallow copies can lead to surprising behavior if you don’t understand the difference between a “shallow” copy and a “deep” copy. Let’s create a list containing an inner list, then copy it and see what happens.

outer_list = [1,2,['a','b','c']]
copy_1 = list(outer_list)
copy_2 = outer_list[:]




Notice how the inner list ['a','b','c'] is shared between the original list and the two copies: all three outer lists refer to the same inner list object. This means any in-place change to outer_list[2], such as appending an item, is also visible through copy_1[2] and copy_2[2].
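
The sharing can be observed directly. The sketch below reuses the names from the example and adds an identity check plus two modifications of my own choosing to illustrate the point.

```python
outer_list = [1, 2, ['a', 'b', 'c']]
copy_1 = list(outer_list)
copy_2 = outer_list[:]

# All three outer lists hold a reference to the same inner list object.
print(outer_list[2] is copy_1[2] is copy_2[2])  # True

# Mutating the inner list in place is visible through every copy.
outer_list[2].append('d')
print(copy_1[2])  # ['a', 'b', 'c', 'd']
print(copy_2[2])  # ['a', 'b', 'c', 'd']

# Replacing a top-level item affects only the list it was assigned in.
copy_1[0] = 99
print(outer_list[0])  # 1
print(copy_2[0])      # 1
```

Note the distinction in the last step: assigning a new item into a copy changes only that copy, while mutating a shared item changes what every list sees.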

Deep copy example

If we want a copy of a list that is truly independent of the original list, we must use the “deepcopy” function from the “copy” module. It copies the outer list and then recursively copies the items inside it, so nested lists are duplicated rather than shared.

Compare this “deep copy” diagram to the previous “shallow copy” diagram and notice that the inner list is no longer shared.

import copy
outer_list = [1,2,['a','b','c']]
copy_1 = copy.deepcopy(outer_list)
copy_2 = copy.deepcopy(outer_list)
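
To confirm the independence, here is a short sketch that repeats the deep copy and then mutates the original’s inner list. The identity check and the append are my own additions for illustration.

```python
import copy

outer_list = [1, 2, ['a', 'b', 'c']]
copy_1 = copy.deepcopy(outer_list)

# The contents are equal, but the inner list is a separate object.
print(copy_1 == outer_list)        # True
print(copy_1[2] is outer_list[2])  # False

# Mutating the original's inner list leaves the deep copy untouched.
outer_list[2].append('d')
print(outer_list[2])  # ['a', 'b', 'c', 'd']
print(copy_1[2])      # ['a', 'b', 'c']
```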



Summary

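The usual copying techniques (list(), slicing, and copy.copy()) all make shallow copies, which is typically much faster than creating a deep copy because the nested items are not duplicated. Use caution whenever you modify a mutable item through a shallow copy: the change is visible through every list that shares that item, and this may cause hard-to-find side effects in other parts of your program.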
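
If you want to see the speed difference on your own machine, a rough measurement with the standard “timeit” module might look like the sketch below. The test data (1,000 small inner lists) and the repeat count are arbitrary choices of mine, and the actual numbers will vary by machine and Python version.

```python
import copy
import timeit

# Hypothetical test data: 1,000 small inner lists.
nested = [[i, i + 1, i + 2] for i in range(1000)]

# Time 100 shallow copies and 100 deep copies of the same list.
shallow_seconds = timeit.timeit(lambda: copy.copy(nested), number=100)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(nested), number=100)

print(f"shallow: {shallow_seconds:.4f}s  deep: {deep_seconds:.4f}s")
```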