In this module, we will see how Julia deals with collections when they are passed as arguments to a function, why this can be terrifying when coming from other languages that are less concerned with economy of memory, and how we can use this behavior to write more efficient code.
Let’s wrap the numbers from one to 25 in a matrix:
A = reshape(collect(1:25), (5, 5))
5×5 Matrix{Int64}:
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
And now, let’s double them:
2A
5×5 Matrix{Int64}:
2 12 22 32 42
4 14 24 34 44
6 16 26 36 46
8 18 28 38 48
10 20 30 40 50
Very well. Now let’s do this within a function:
function twice(M)
for i in eachindex(M)
M[i] = 2M[i]
end
return M
end
twice (generic function with 1 method)
We can call this function on A
:
twice(A)
5×5 Matrix{Int64}:
2 12 22 32 42
4 14 24 34 44
6 16 26 36 46
8 18 28 38 48
10 20 30 40 50
And let’s have a look at A
again:
A
5×5 Matrix{Int64}:
2 12 22 32 42
4 14 24 34 44
6 16 26 36 46
8 18 28 38 48
10 20 30 40 50
A
has been changed
because we passed it through the twice
function. This is super dangerous.
But also really useful. With great power, and all that.So what is going on?
When you pass a collection to a Julia function, you do not pass a copy of this object. You pass the location of this collection in memory, for the function to work with. It might make sense if you know C, and no sense if you know R, but this is what Julia does, and the net benefit is: regardless of the size of the array you work on, it “costs” the same memory footprint to send it to a function.
For this reason, it is very easy (in fact, it is the default) for Julia functions
to perform mutations on their arguments.
By convention, this behavior is indicated by a !
at the end of the
function name. Think of it as the language going Hey, listen!, and asking
you to pay attention.
So let’s start again, with a collection of values, and we will get a matrix with the remainder of their integer division by any other integer:
A = reshape(collect(1:25), (5, 5))
5×5 Matrix{Int64}:
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
The mutating function will have a !
at the end of its name:
function remainder!(X::Matrix{T}, d::T) where {T <: Integer}
for i in eachindex(X)
X[i] = X[i] % d
end
return X
end
remainder! (generic function with 1 method)
eachindex
to move through the matrix linearly, as we discussed
in the module on iteration. We would get the same behavior iterating over rows and columns
explicitly, but this notation is more concise, and therefore more readable and “better”.This function will overwrite its argument X
. How do we avoid this? We cannot. It is
a consequence of functions receiving the memory location of collections, as opposed to
a copy of the collection. But wait! This is our solution! If we do not want the function
to modify our original collection, we can point it towards a copy of this collection!
function remainder(X::Matrix{T}, d::T) where {T <: Integer}
Y = copy(X)
remainder!(Y, d)
return Y
end
remainder (generic function with 1 method)
This function does not end with a !
: this is because it is not modifying
its first argument. Instead, we create a copy of this argument. This is when
we re-use the function that changes its argument, to modify the copy, which we
then return. This forms the basis of a very powerful (and common) design pattern in
Julia: write a function foo!
, then write a function foo
who copies its first
argument and calls foo!
on the copy.
We can call our new function:
remainder(A, 3)
5×5 Matrix{Int64}:
1 0 2 1 0
2 1 0 2 1
0 2 1 0 2
1 0 2 1 0
2 1 0 2 1
We can check that A
has not been affected:
A
5×5 Matrix{Int64}:
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
Success!
remainder
and remainder!
are different functions, even though they belong to the
same “family”. This may be important if you look at the methods for
remainder
, as it will not list the methods for remainder!
.And now, here is how we turn this into a very powerful way to be efficient
about memory management. We will write another method for remainder!
that
over-writes, not its original argument, but an arbitrary array:
Z = similar(A)
5×5 Matrix{Int64}:
140312103748688 140313103692272 140312103749776 9 140312516550384
140312145116736 140312147945328 140312103749840 140312103749904 1
140312145116576 140312147945376 140312147945280 140312145116576 31799
140312144570448 140312147945424 140312934503200 140312533780016 140312147945472
140311136661328 140312103749712 31799 140312533780048 140311116135040
similar
function will create an object that is similar to its argument in
the sense that it will have the same shape and type, but the value inside this object are
not predictible.This might be a placeholder array we will use to store results temporarily. For example, when we perform thousands of iterations, we might not need to re-allocate the memory every time: in this case, it is beneficial to over-write the same memory locations over and over again.
function remainder!(x::Matrix{T}, X::Matrix{T}, d::T) where {T <: Integer}
for i in eachindex(X)
x[i] = X[i] % d
end
return x
end
remainder! (generic function with 2 methods)
The way to call this function is:
remainder!(Z, A, 2)
5×5 Matrix{Int64}:
1 0 1 0 1
0 1 0 1 0
1 0 1 0 1
0 1 0 1 0
1 0 1 0 1
remainder!(Z, A, 3)
5×5 Matrix{Int64}:
1 0 2 1 0
2 1 0 2 1
0 2 1 0 2
1 0 2 1 0
2 1 0 2 1
But wait! The new version of remainder!
we wrote is suspiciously similar
to the earliest version of remainder!
. In fact, with the exception of a
single change, they are the same function. It is therefore time to revisit our
initial remainder!
function:
function remainder!(X::Matrix{T}, d::T) where {T <: Integer}
return remainder!(X, X, d)
end
remainder! (generic function with 2 methods)
See, writing read from X and write in Z and read from X and write in X is
essentially the same thing, because we are giving Julia memory locations.
And because remainder!
is a function with different methods, the dispatch
system is now working for us.
@assert
, or more explicitly with checks
and the throwing of exceptions.So let’s summarize, and write our little remainder “library”, from the most basal to the most abstract function:
function remainder!(x::Matrix{T}, X::Matrix{T}, d::T) where {T <: Integer}
for i in eachindex(X)
x[i] = X[i] % d
end
return x
end
function remainder!(X::Matrix{T}, d::T) where {T <: Integer}
return remainder!(X, X, d)
end
function remainder(X::Matrix{T}, d::T) where {T <: Integer}
Y = copy(X)
remainder!(Y, d)
return Y
end
remainder (generic function with 1 method)
What happens when we do the following?
remainder(A, 3)
5×5 Matrix{Int64}:
1 0 2 1 0
2 1 0 2 1
0 2 1 0 2
1 0 2 1 0
2 1 0 2 1
We start by calling remainder
, which makes a copy of the first argument,
then calls remainder!
(using a single argument); this then calls
remainder!
using the two arguments, and gives us the result without
modifying A
:
A
5×5 Matrix{Int64}:
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
Let’s take a step back. Working with this design pattern is very useful when
we sometimes have to overwrite some memory. In other cases, we only care about
getting an object returned. Or we definitely want to overwrite the object we
pass as an argument. To accomodate these use-cases, writing functions with and
without !
, and that operate on themselves or similar objects, give us the
flexibility we need, and is a very Julian code design.