This Julia programming language tutorial is an introduction to Julia for Python programmers. It goes through the most important Python features (such as functions, basic types, list comprehensions, exceptions, generators, modules, packages, and so on) and shows you how to code them in Julia. By the end of this tutorial, you will have a fair mental model of what coding in Julia is all about.
Julia looks and feels a lot like Python, only much faster. It’s dynamic, expressive, extensible, with batteries included, in particular for Data Science.
1. Running This Code Locally
If you prefer to run this code on your machine, then:
- Install Julia
- Run the following command in a terminal (or Command Prompt on Windows) to install IJulia (the Jupyter kernel for Julia), and a few packages we will use:
julia -e 'using Pkg
pkg"add IJulia; precompile;"
pkg"add BenchmarkTools; precompile;"
pkg"add PyCall; precompile;"
pkg"add PyPlot; precompile;"'
Next, go to the directory containing this notebook:
cd /path/to/notebook/directory
Start Jupyter Notebook:
julia -e 'using IJulia; IJulia.notebook()'
Or replace notebook()
with jupyterlab()
if you prefer JupyterLab.
If you do not already have Jupyter installed, IJulia
will propose to install it. If you agree, it will automatically install a private Miniconda (just for Julia), and install Jupyter and Python inside it.
2. Checking the Installation
The versioninfo()
function should print your Julia version and some other info about the system (if you ever ask for help or file an issue about Julia, you should always provide this information).
versioninfo()
Output:
Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)
Environment:
JULIA_NUM_THREADS = 4
3. Getting Help
To get help on any module, function, variable, or just about anything else, just type ?
followed by what you’re interested in. For example:
?versioninfo
#> search: versioninfo
#> versioninfo(io::IO=stdout; verbose::Bool=false)
#> Print information about the version of Julia in use. The output is controlled with boolean keyword arguments:
#> `verbose`: print all additional information
This works in interactive mode only: in Jupyter, Colab and in the Julia shell (called the REPL).
Here are a few more ways to get help and inspect objects in interactive mode:
Julia | Python |
---|---|
?obj | help(obj) |
dump(obj) | print(repr(obj)) |
names(FooModule) | dir(foo_module) |
methodswith(SomeType) | dir(SomeType) |
@which func | func.__module__ |
apropos("bar") | Search for "bar" in docstrings of all installed packages |
typeof(obj) | type(obj) |
obj isa SomeType or isa(obj, SomeType) | isinstance(obj, SomeType) |
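Several of the functions in this table also work outside interactive mode, so they can be used in scripts. A minimal sketch (the values inspected here are illustrative and not part of the original notebook):

```julia
# typeof() and isa work anywhere, not just in the REPL
x = 42
println(typeof(x))                  # Int64 on a 64-bit machine
println(x isa Integer)              # true: Int64 is a subtype of Integer
println(isa(3.14, AbstractFloat))   # true

# names() lists the names exported by a module
println(:sqrt in names(Base))       # true: sqrt is exported by Base
```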
If you ever ask for help or file an issue about Julia, you should generally provide the output of versioninfo()
.
And of course, you can also learn and get help here:
- Learning
- Documentation
- Questions & Discussions:
4. A First Look at Julia
This section will give you an idea of what Julia looks like and what some of its major qualities are: it’s expressive, dynamic, flexible, and most of all, super fast.
Estimating π
Let’s write our first function. It will estimate π using the equation:
π = 4 × (1 − 1/3 + 1/5 − 1/7 + 1/9 − 1/11 + …)
There are much better ways to estimate π, but this one is easy to implement.
function estimate_pi(n)
    s = 1.0
    for i in 1:n
        s += (isodd(i) ? -1 : 1) / (2i + 1)
    end
    4s
end
p = estimate_pi(100_000_000)
println("π ≈ $p")
println("Error is $(p - π)")
#> π ≈ 3.141592663589326
#> Error is 9.999532757376528e-9
Compare this with the equivalent Python 3 code:
import math
def estimate_pi(n):
    s = 1.0
    for i in range(1, n + 1):
        s += (-1 if i % 2 else 1) / (2 * i + 1)
    return 4 * s
p = estimate_pi(100_000_000)
print(f"π ≈ {p}") # f-strings are available in Python 3.6+
print(f"Error is {p - math.pi}")
Pretty similar, right? But notice the small differences:
Julia | Python |
---|---|
function | def |
for i in X ... end | for i in X: ... |
1:n | range(1, n+1) |
cond ? a : b | a if cond else b |
2i + 1 | 2 * i + 1 |
4s | return 4 * s |
println(a, b) | print(a, b, sep="") |
print(a, b) | print(a, b, sep="", end="") |
"$p" | f"{p}" |
"$(p - π)" | f"{p - math.pi}" |
This example shows that:
1. Julia can be just as concise and readable as Python.
2. Indentation in Julia is not meaningful like it is in Python. Instead, blocks end with end.
3. Many math features are built into Julia and need no imports.
4. There’s some mathy syntactic sugar, such as 2i (but you can write 2 * i if you prefer).
5. In Julia, the return keyword is optional at the end of a function. The result of the last expression is returned (4s in this example).
6. Julia loves Unicode and does not hesitate to use Unicode characters like π. However, there are generally plain-ASCII equivalents (e.g., π == pi).
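The same series can also be written as a one-line function that sums a generator expression, which is close to how you might write it with Python's sum(). This is only a sketch: estimate_pi_sum is an illustrative name that does not appear in the original notebook, and the term count is kept small so it runs quickly.

```julia
# Same Leibniz series as estimate_pi(), written as a sum over a generator.
# estimate_pi_sum is a hypothetical name used for illustration only.
estimate_pi_sum(n) = 4 * sum((isodd(k) ? -1 : 1) / (2k + 1) for k in 0:n)

approx = estimate_pi_sum(1_000_000)
println("π ≈ $approx")
println("Error is $(approx - π)")
```

The generator does not build an intermediate array, so memory use stays constant regardless of n.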
5. Typing Unicode Characters
Typing Unicode characters is easy: for LaTeX symbols like π, just type \pi and press the Tab key. For emojis like 😃, type \:smiley: and press Tab.
This works in the REPL, in Jupyter, but unfortunately not in Colab (yet?). As a workaround, you can run the following code to print the character you want, then copy/paste it:
using REPL.REPLCompletions: latex_symbols, emoji_symbols
latex_symbols["\\pi"]
#> "π"
Emoji
emoji_symbols["\\:smiley:"]
#> "😃"
In Julia, using Foo.Bar: a, b
corresponds to running from foo.bar import a, b
in Python.
Julia | Python |
---|---|
using Foo | from foo import *; import foo |
using Foo.Bar | from foo.bar import *; from foo import bar |
using Foo.Bar: a, b | from foo.bar import a, b |
using Foo: Bar | from foo import bar |
6. Running Python code in Julia
Julia lets you easily run Python code using the PyCall
module. We installed it earlier, so we just need to import it:
using PyCall
Now that we have imported PyCall
, we can use the pyimport()
function to import a Python module directly in Julia! For example, let’s check which Python version we are using:
sys = pyimport("sys")
sys.version
#> "3.6.9 (default, Apr 18 2020, 01:56:04) \n[GCC 8.4.0]"
In fact, let’s run the Python code we discussed earlier (this will take about 15 seconds to run, because Python is so slow… ):
# Run Python code in Julia
py"""
import math
def estimate_pi(n):
    s = 1.0
    for i in range(1, n + 1):
        s += (-1 if i % 2 else 1) / (2 * i + 1)
    return 4 * s
p = estimate_pi(100_000_000)
print(f"π ≈ {p}") # f-strings are available in Python 3.6+
print(f"Error is {p - math.pi}")
"""
As you can see, running arbitrary Python code is as simple as using py-strings (py"..."
). Note that py-strings are not part of the Julia language itself: they are defined by the PyCall
module (we will see how this works later).
Unfortunately, Python’s print()
function writes to the standard output, which is not captured by Colab, so we can’t see the output of this code. That’s okay, we can look at the value of p
:
# Python 'p'
py"p"
#> 3.141592663589326
Let’s compare this to the value we calculated above using Julia:
# subtract Julia 'p' from Python 'p'
py"p" - p
#> 0.0
Perfect, they are exactly equal!
As you can see, it’s very easy to mix Julia and Python code. So if there’s a module you really love in Python, you can keep using it as long as you want! For example, let’s use NumPy:
np = pyimport("numpy")
a = np.random.rand(2, 3)
#> 2×3 Array{Float64,2}:
#> 0.326131 0.337986 0.475167
#> 0.537621 0.912136 0.792325
Notice that PyCall automatically converts some Python types to Julia types, including NumPy arrays. That’s really quite convenient! Note that Julia supports multi-dimensional arrays (analogous to NumPy arrays) out of the box. Array{Float64, 2} means that it’s a 2-dimensional array of 64-bit floats.
PyCall
also converts Julia arrays to NumPy arrays when needed:
exp_a = np.exp(a)
#> 2×3 Array{Float64,2}:
#> 1.3856 1.40212 1.60828
#> 1.71193 2.48963 2.20852
If you want to use some Julia variable in a py-string, for example exp_a
, you can do so by writing $exp_a
like this:
py"""
import numpy as np
result = np.log($exp_a)
"""
py"result"
#> 2×3 Array{Float64,2}:
#> 0.326131 0.337986 0.475167
#> 0.537621 0.912136 0.792325
If you want to keep using Matplotlib, it’s best to use the PyPlot
module (which we installed earlier), rather than trying to use pyimport("matplotlib")
, as PyPlot
provides a more straightforward interface with Julia, and it plays nicely with Jupyter and Colab:
using PyPlot
x = range(-5π, 5π, length=100)
plt.plot(x, sin.(x) ./ x) # we'll discuss this syntax in the next section
plt.title("sin(x) / x")
plt.grid(true)
plt.show()
That said, Julia has its own plotting libraries, such as the Plots
library, which you may want to check out.
As you can see, Julia’s range()
function acts much like NumPy’s linspace()
function, when you use the length
argument.
However, it acts like Python’s range()
function when you use the step
argument instead (except the upper bound is inclusive). Julia’s range()
function returns an object which behaves just like an array, except it doesn’t actually use any RAM for its elements, it just stores the range parameters. If you want to collect all of the elements into an array, use the collect()
function (similar to Python’s list()
function):
println(collect(range(10, 80, step=20)))
#> [10, 30, 50, 70]
println(collect(10:20:80)) # 10:20:80 is equivalent to the previous range
#> [10, 30, 50, 70]
println(collect(range(10, 80, length=5))) # similar to NumPy's linspace()
#> [10.0, 27.5, 45.0, 62.5, 80.0]
step = (80-10)/(5-1) # 17.5
println(collect(10:step:80)) # equivalent to the previous range
#> [10.0, 27.5, 45.0, 62.5, 80.0]
The equivalent Python code is:
# PYTHON
print(list(range(10, 80+1, 20)))
# there's no short-hand for range() in Python
print(np.linspace(10, 80, 5))
step = (80-10)/(5-1) # 17.5
print([i*step + 10 for i in range(5)])
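To make the point about memory concrete, here is a small sketch showing that a Julia range stores only its parameters and computes elements on demand. The variable names are illustrative, and the byte count assumes a 64-bit machine.

```julia
r = range(10, 80, step=20)          # same values as 10:20:80
println(typeof(r))                  # a range type, not an Array
println(length(r), " elements, the third one is ", r[3])

# A huge range stores only its bounds, never its elements
huge = 1:1_000_000_000
println(sizeof(huge), " bytes")     # two Int fields: 16 bytes on a 64-bit machine
println(huge[500_000_000])          # computed on demand
println(sum(huge))                  # no billion-element array is ever built

# collect() is what actually allocates an array
println(collect(r))
```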
Julia | Python |
---|---|
np = pyimport("numpy") | import numpy as np |
using PyPlot | from pylab import * |
1:10 | range(1, 11) |
1:2:10 or range(1, 10, step=2) | range(1, 11, 2) |
1.2:0.5:10.3 or range(1.2, 10.3, step=0.5) | np.arange(1.2, 10.3, 0.5) |
range(1, 10, length=3) | np.linspace(1, 10, 3) |
collect(1:5) or [i for i in 1:5] | list(range(1, 6)) or [i for i in range(1, 6)] |
7. Loop Fusion (Similar to Python’s List comprehension)
Did you notice that we wrote sin.(x) ./ x
(not sin(x) / x
)? This is equivalent to [sin(i) / i for i in x]
.
a = sin.(x) ./ x
b = [sin(i) / i for i in x]
@assert a == b
This is called a ‘dot’ operation.
This is not just syntactic sugar: it’s actually a very powerful Julia feature. Indeed, notice that the array only gets traversed once. Even if we chained more than two dotted operations, the array would still only get traversed once. This is called loop fusion.
This can be significantly faster than the equivalent NumPy code, even though NumPy is written in C. Why?
Because, when using NumPy arrays, np.sin(x) / x first computes a temporary array containing np.sin(x), and then it computes the final array: two loops and two arrays instead of one. NumPy is implemented in C and has been heavily optimized, but if you chain many operations, it can still end up slower and use more RAM than Julia.
However, all the extra dots can sometimes make the code a bit harder to read. To avoid that, you can write @.
before an expression: every operation will be “dotted” automatically, like this:
a = @. sin(x) / x
b = sin.(x) ./ x
@assert a == b
Note: Julia’s @assert
statement starts with an @
sign, just like @.
, which means that they are macros.
In Julia, macros are very powerful metaprogramming tools. A macro is evaluated at parse time, and it can inspect the expression that follows it and then transform it, or even replace it. In practice, you will often use macros, but you will rarely define your own. I’ll come back to macros later.
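Coming back to loop fusion, the sketch below chains several dotted operations and compares the result with a hand-written single-pass comprehension. It also shows the .= form, which writes the fused result into an existing array. The expressions and variable names are illustrative, not taken from the original notebook.

```julia
x = range(0.1, 1.0, length=5)

# Three dotted operations, fused into a single loop over x
y = @. exp(-x) * sin(x) + 1

# The hand-written equivalent: one pass, one output array
z = [exp(-v) * sin(v) + 1 for v in x]
@assert y ≈ z

# With .= the fused loop writes into an existing array
# instead of allocating a new one for the result
out = similar(y)
out .= 2 .* y .- x
@assert out ≈ 2y - x
```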
8. Julia is fast!
Let’s compare the Julia and Python implementations of the estimate_pi()
function:
@time estimate_pi(100_000_000);
#> 0.140922 seconds
To get a more precise benchmark, it’s preferable to use the BenchmarkTools
module. Just like Python’s timeit
module, it provides tools to benchmark code by running it multiple times. This provides a better estimate of how long each call takes.
using BenchmarkTools
@benchmark estimate_pi(100_000_000)
Output:
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 133.074 ms (0.00% GC)
median time: 137.283 ms (0.00% GC)
mean time: 137.457 ms (0.00% GC)
maximum time: 145.218 ms (0.00% GC)
--------------
samples: 37
evals/sample: 1
If this output is too verbose for you, simply use @btime
instead:
@btime estimate_pi(100_000_000)
#> 132.646 ms (0 allocations: 0 bytes)
Now let’s time the Python version. Since the call is so slow, we just run it once (it will take about 15 seconds):
py"""
from timeit import timeit
duration = timeit("estimate_pi(100_000_000)", number=1, globals=globals())
"""
py"duration"
#> 14.16427015499994
It looks like Julia is close to 100 times faster than Python in this case! To be fair, PyCall
does add some overhead, but even if you run this code in a separate Python shell, you will see that Julia crushes (pure) Python when it comes to speed.
So why is Julia so much faster than Python?
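Well, Julia compiles the code on the fly as it runs it: the first time a function is called with a given combination of argument types, Julia compiles a specialized native version of it (using LLVM), and later calls reuse that compiled code.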
Okay, let’s summarize what we learned so far:
Julia is a dynamic language that looks and feels a lot like Python. You can even execute Python code super easily, and pure Julia code runs much faster than pure Python code because it is compiled on the fly. I hope this convinces you to read on!
Next, let’s continue to see how Python’s main constructs can be implemented in Julia.
9. Working with Numbers
i = 42 # 64-bit integer
f = 3.14 # 64-bit float
c = 3.4 + 4.5im # 128-bit complex number
bi = BigInt(2)^1000 # arbitrarily long integer
bf = BigFloat(1) / 7 # arbitrary precision
r = 15//6 * 9//20 # rational number
#> 9//8
And the equivalent Python code:
# PYTHON
i = 42
f = 3.14
c = 3.4 + 4.5j
bi = 2**1000 # integers are seamlessly promoted to long integers
from decimal import Decimal
bf = Decimal(1) / 7
from fractions import Fraction
r = Fraction(15, 6) * Fraction(9, 20)
Dividing integers gives floats, like in Python:
5 / 2
#> 2.5
For integer division, use ÷
or div()
:
5 ÷ 2
#> 2
Or call div() directly:
div(5, 2)
#> 2
The %
operator is the remainder, not the modulo like in Python. These differ only for negative numbers:
# remainder
57 % 10
#> 7
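The difference only shows up with negative operands, so here is a short sketch of that case. rem(), mod(), div() and fld() are all standard Julia functions; the operands are just illustrative.

```julia
println((-57) % 10)      # -7: the remainder takes the sign of the dividend
println(rem(-57, 10))    # -7: % is shorthand for rem()
println(mod(-57, 10))    # 3: mod() takes the sign of the divisor, like Python's %
println(div(-57, 10))    # -5: ÷ and div() truncate toward zero
println(fld(-57, 10))    # -6: fld() rounds down, like Python's //
```

If you are porting Python code that relies on % or // with negative numbers, mod() and fld() are the functions that preserve its behavior.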
Julia | Python |
---|---|
3.4 + 4.5im | 3.4 + 4.5j |
BigInt(2)^1000 | 2**1000 |
BigFloat(3.14) | from decimal import Decimal; Decimal(3.14) |
9//8 | from fractions import Fraction; Fraction(9, 8) |
5/2 == 2.5 | 5/2 == 2.5 |
5÷2 == 2 or div(5, 2) | 5//2 == 2 |
57%10 == 7 | 57%10 == 7 |
(-57)%10 == -7 | (-57)%10 == 3 |
10. Strings
Julia strings use double quotes "
or triple quotes """
, but not single quotes '
:
s = "ångström" # Julia strings are UTF-8 encoded by default
println(s)
#> ångström
s = "Julia strings
can span
several lines\n\n
and they support the \"usual\" escapes like
\x41, \u5bb6, and \U0001f60a!"
println(s)
#> Julia strings
#> can span
#> several lines
#>
#>
#> and they support the "usual" escapes like
#> A, 家, and 😊!
Use repeat()
instead of *
to repeat a string, and use *
instead of +
for concatenation:
s = repeat("tick, ", 10) * "BOOM!"
println(s)
#> tick, tick, tick, tick, tick, tick, tick, tick, tick, tick, BOOM!
The equivalent Python code is:
# Python
s = "tick, " * 10 + "BOOM!"
print(s)
Use join(a, s)
instead of s.join(a)
:
s = join([i for i in 1:4], ", ")
println(s)
#> 1, 2, 3, 4
You can also specify a string for the last join:
s = join([i for i in 1:4], ", ", " and ")
#> "1, 2, 3 and 4"
split()
works as you might expect:
split(" one three four ")
#> 3-element Array{SubString{String},1}:
#> "one"
#> "three"
#> "four"
You can specify a separator as well.
split("one,,three,four!", ",")
#> 4-element Array{SubString{String},1}:
#> "one"
#> ""
#> "three"
#> "four!"
Check if a pattern occurs in a string.
occursin("sip", "Mississippi")
#> true
Replace a string with another.
replace("I like coffee", "coffee" => "tea")
#> "I like tea"
Triple quotes work a bit like in Python, but they also remove indentation and ignore the first line feed:
s = """
    1. the first line feed is ignored if it immediately follows \"""
    2. triple quotes let you use "quotes" easily
    3. indentation is ignored
        - up to left-most character
        - ignoring the first line (the one with \""")
    4. the final line feed is n̲o̲t̲ ignored
    """
println("<start>")
println(s)
println("<end>")
#> <start>
#> 1. the first line feed is ignored if it immediately follows """
#> 2. triple quotes let you use "quotes" easily
#> 3. indentation is ignored
#>     - up to left-most character
#>     - ignoring the first line (the one with """)
#> 4. the final line feed is n̲o̲t̲ ignored
#>
#> <end>
Let’s see some more examples.
11. String Interpolation
String interpolation uses $variable
and $(expression)
:
total = 1 + 2 + 3
s = "1 + 2 + 3 = $total = $(1 + 2 + 3)"
println(s)
#> 1 + 2 + 3 = 6 = 6
This means you must escape the $
sign:
s = "The car costs \$10,000"
println(s)
#> The car costs $10,000
12. Raw Strings
Raw strings use raw"..."
instead of the r"..."
used in Python.
s = raw"In a raw string, you only need to escape quotes \", but not
$ or \. There is one exception, however: the backslash \
must be escaped if it's just before quotes like \\\"."
println(s)
#> In a raw string, you only need to escape quotes ", but not
#> $ or \. There is one exception, however: the backslash \
#> must be escaped if it's just before quotes like \".
Another Example
s = raw"""
Triple quoted raw strings are possible too: $, \, \t, "
- They handle indentation and the first line feed like regular
triple quoted strings.
- You only need to escape triple quotes like \""", and the
backslash before quotes like \\".
"""
println(s)
#> Triple quoted raw strings are possible too: $, \, \t, "
#> - They handle indentation and the first line feed like regular
#> triple quoted strings.
#> - You only need to escape triple quotes like """, and the
#> backslash before quotes like \".
13. Characters
Single quotes are used for individual Unicode characters:
a = 'å' # Unicode code point (single quotes)
#> 'å': Unicode U+00E5 (category Ll: Letter, lowercase)
To be more precise:
1. A Julia “character” represents a single Unicode code point (sometimes called a Unicode scalar).
2. Multiple code points may be required to produce a single grapheme, i.e., something that readers would recognize as a single character. Such a sequence of code points is called a “Grapheme cluster”.
For example, the character é
can be represented either using the single code point \u00E9
, or the grapheme cluster e
+ \u0301
:
s = "café"
println(s, " has ", length(s), " code points")
#> café has 4 code points
Alternately:
s = "cafe\u0301"
println(s, " has ", length(s), " code points")
#> café has 5 code points
In a ‘For loop’:
for c in "cafe\u0301"
display(c)
end
#> 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
#> 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
#> 'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
#> 'e': ASCII/Unicode U+0065 (category Ll: Letter, lowercase)
#> '́': Unicode U+0301 (category Mn: Mark, nonspacing)
Julia represents any individual character like 'é' using 32 bits (4 bytes):
sizeof('é')
#> 4
But strings are represented using the UTF-8 encoding. In this encoding, code points 0 to 127 are represented using one byte, but any code point above 127 is represented using 2 to 4 bytes:
sizeof("a")
#> 1
Special characters:
sizeof("é")
#> 2
One more:
sizeof("家")
#> 3
Size of a grapheme.
sizeof("🏳️🌈") # this is a grapheme with 4 code points of 4 + 3 + 3 + 4 bytes
#> 14
Use a comprehension to get the size of each code point in the grapheme:
[sizeof(string(c)) for c in "🏳️🌈"]
#> 4-element Array{Int64,1}:
#> 4
#> 3
#> 3
#> 4
You can iterate through graphemes instead of code points:
using Unicode
for g in graphemes("e\u0301🏳️🌈")
println(g)
end
#> é
#> 🏳️🌈
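To tie the two representations of é together, the sketch below counts code points and graphemes for the decomposed form, then uses Unicode.normalize() (from the same Unicode standard library module) to compose it. The variable names are illustrative.

```julia
using Unicode

decomposed = "cafe\u0301"                        # e followed by a combining accent
composed = Unicode.normalize(decomposed, :NFC)   # canonical composition

println(length(decomposed))              # 5 code points
println(length(graphemes(decomposed)))   # 4 graphemes
println(length(composed))                # 4 code points after composition
println(composed == "caf\u00e9")         # true
println(decomposed == composed)          # false: the code points differ
```

Normalizing both sides before comparing is a common way to avoid surprises when two strings look identical on screen.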
14. String Indexing
Characters in a string are indexed based on the position of their starting byte in the UTF-8 representation. For example, the character ê
in the string "être"
is located at index 1, but the character 't'
is located at index 3, since the UTF-8 encoding of ê
is 2 bytes long:
s = "être"
println(s[1])
println(s[3])
println(s[4])
println(s[5])
#> ê
#> t
#> r
#> e
If you try to get the character at index 2, you get an exception:
try
s[2]
catch ex
ex
end
#> StringIndexError("être", 2)
By the way, notice the exception-handling syntax (we’ll discuss exceptions later):
Julia | Python |
---|---|
try ... catch ex ... end | try: ... except Exception as ex: ... |
You can get a substring easily, using valid character indices:
s[1:3]
#> "êt"
You can iterate through a string, and it will return all the code points:
for c in s
println(c)
end
#> ê
#> t
#> r
#> e
Or you can iterate through the valid character indices:
for i in eachindex(s)
println(i, ": ", s[i])
end
#> 1: ê
#> 3: t
#> 4: r
#> 5: e
Benefits of representing strings as UTF-8:
1. All Unicode characters are supported.
2. UTF-8 is fairly compact (at least for Latin scripts).
3. It plays nicely with C libraries which expect ASCII characters only, since ASCII characters correspond to the Unicode code points 0 to 127, which UTF-8 encodes exactly like ASCII.
Drawbacks:
1. UTF-8 uses a variable number of bytes per character, which makes indexing harder.
2. However, if the language tried to hide this by making s[5] search for the 5th character from the start of the string, then code like for i in 1:length(s); s[i]; end would be unexpectedly inefficient, since at each iteration there would be a search from the beginning of the string, leading to O(n²) performance instead of O(n).
To find the index of a character, use functions such as findfirst():
findfirst(isequal('t'), "être")
#> 3
Find last occurrence of:
findlast(isequal('p'), "Mississippi")
#> 10
Find next occurrence of:
findnext(isequal('i'), "Mississippi", 2)
#> 2
Find next occurrence of:
findnext(isequal('i'), "Mississippi", 2 + 1)
#> 5
Find previous occurrence of:
findprev(isequal('i'), "Mississippi", 5 - 1)
#> 2
Other useful string functions: ncodeunits(str)
, codeunit(str, i)
, thisind(str, i)
, nextind(str, i, n=1)
, prevind(str, i, n=1)
.
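Here is a brief sketch of how these index functions behave on the "être" string used earlier. All the functions are standard; only the choice of calls is illustrative.

```julia
s = "être"

println(length(s))        # 4 characters
println(ncodeunits(s))    # 5 bytes, since ê takes two code units
println(isvalid(s, 2))    # false: index 2 is in the middle of ê
println(thisind(s, 2))    # 1: start of the character that covers byte 2
println(nextind(s, 1))    # 3: index of the character after ê
println(prevind(s, 3))    # 1: index of the character before t

# Index of the 3rd character, found by stepping 3 characters from index 0
i = nextind(s, 0, 3)
println(i, ": ", s[i])    # 4: r
```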
15. Regular Expressions in Julia
To create a regular expression in Julia, use the r"..."
syntax:
regex = r"c[ao]ff?(?:é|ee)"
#> r"c[ao]ff?(?:é|ee)"
The expression r"..."
is equivalent to Regex("...")
except the former is evaluated at parse time, while the latter is evaluated at runtime, so unless you need to construct a Regex dynamically, you should prefer r"..."
.
occursin(regex, "A bit more coffee?")
#> true
Return the pattern match
m = match(regex, "A bit more coffee?")
m.match
#> "coffee"
Offset position.
m.offset
#> 12
Another example.
m = match(regex, "A bit more tea?")
isnothing(m) && println("I suggest coffee instead")
#> I suggest coffee instead
One more.
regex = r"(.*)#(.+)"
line = "f(1) # nice comment"
m = match(regex, line)
code, comment = m.captures
println("code: ", repr(code))
println("comment: ", repr(comment))
#> code: "f(1) "
#> comment: " nice comment"
Access a capture group by index:
m[2]
#> " nice comment"
Show Offsets
m.offsets
#> 2-element Array{Int64,1}:
#> 1
#> 7
Named capture groups:
m = match(r"(?<code>.+)#(?<comment>.+)", line)
m[:comment]
#> " nice comment"
Replace
replace("Want more bread?", r"(?<verb>more|some)" => s"a little")
#> "Want a little bread?"
A slightly involved replace
example.
replace("Want more bread?", r"(?<verb>more|less)" => s"\g<verb> and \g<verb>")
#> "Want more and more bread?"
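As noted earlier in this section, Regex("...") is the form to use when the pattern is only known at runtime. The sketch below builds one from a variable and also shows eachmatch(), which iterates over all matches. The word and sample sentences are hypothetical, and a word containing regex metacharacters would need escaping first.

```julia
word = "tea"                              # hypothetical value known only at runtime
dynamic = Regex("\\b$(word)\\b")          # same pattern as r"\btea\b"
println(occursin(dynamic, "More tea?"))   # true
println(occursin(dynamic, "steam"))       # false: no word boundary around "tea"

# eachmatch() returns an iterator over every match, not just the first
for m in eachmatch(r"\d+", "3 cups, 12 spoons")
    println(m.match, " at offset ", m.offset)
end
```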
16. Control Flow – if statement
Julia’s if statement works just like in Python, with a few differences:
- Julia uses elseif instead of Python’s elif.
- Julia’s logic operators are just like in C-like languages: && means and, || means or, ! means not, and so on.
a = 1
if a == 1
println("One")
elseif a == 2
println("Two")
else
println("Other")
end
#> One
Julia also has ⊻
for exclusive or (you can type \xor
to get the ⊻ character):
@assert false ⊻ false == false
@assert false ⊻ true == true
@assert true ⊻ false == true
@assert true ⊻ true == false
Oh, and notice that true
and false
are all lowercase, unlike Python’s True
and False
.
Since &&
is lazy (like and
in Python), cond && f()
is a common shorthand for if cond; f(); end
. Think of it as “cond then f()”:
a = 2
a == 1 && println("One")
a == 2 && println("Two")
#> Two
Similarly, cond || f()
is a common shorthand for if !cond; f(); end
. Think of it as “cond else f()”:
a = 1
a == 1 || println("Not one")
a == 2 || println("Not two")
#> Not two
All expressions return a value in Julia, including if
statements. For example:
a = 1
result = if a == 1
"one"
else
"two"
end
result
#> "one"
When an expression cannot return anything, it returns nothing
:
a = 1
result = if a == 2
"two"
end
isnothing(result)
#> true
nothing
is the single instance of the type Nothing
:
typeof(nothing)
#> Nothing
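In practice, nothing often plays the role of Python's None as a "no result" value. The sketch below returns it from a search function and then handles it with isnothing() and something(), both standard functions. first_even is an illustrative helper that is not part of the original notebook.

```julia
# first_even is a hypothetical helper used for illustration only
function first_even(xs)
    for x in xs
        iseven(x) && return x
    end
    return nothing                # explicit "no result"
end

println(first_even([1, 3, 4]))                # 4
println(isnothing(first_even([1, 3])))        # true
println(something(first_even([1, 3]), -1))    # -1: fall back to a default value
```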
17. For loops
You can use for
loops just like in Python, as we saw earlier. However, it’s also possible to create nested loops on a single line:
for a in 1:2, b in 1:3, c in 1:2
println((a, b, c))
end
#> (1, 1, 1)
#> (1, 1, 2)
#> (1, 2, 1)
#> (1, 2, 2)
#> (1, 3, 1)
#> (1, 3, 2)
#> (2, 1, 1)
#> (2, 1, 2)
#> (2, 2, 1)
#> (2, 2, 2)
#> (2, 3, 1)
#> (2, 3, 2)
The corresponding Python code would look like this:
# Python
from itertools import product
for a, b, c in product(range(1, 3), range(1, 4), range(1, 3)):
    print((a, b, c))
The continue
and break
keywords work just like in Python. Note that in single-line nested loops, break
will exit all loops, not just the inner loop:
for a in 1:2, b in 1:3, c in 1:2
println((a, b, c))
(a, b, c) == (2, 1, 1) && break
end
#> (1, 1, 1)
#> (1, 1, 2)
#> (1, 2, 1)
#> (1, 2, 2)
#> (1, 3, 1)
#> (1, 3, 2)
#> (2, 1, 1)
Julia does not support the equivalent of Python’s for
/else
construct. You need to write something like this:
found = false
for person in ["Joe", "Jane", "Wally", "Jack", "Julia"] # try removing "Wally"
println("Looking at $person")
person == "Wally" && (found = true; break)
end
found || println("I did not find Wally.")
#> Looking at Joe
#> Looking at Jane
#> Looking at Wally
#> true
The equivalent Python code looks like this:
# PYTHON
for person in ["Joe", "Jane", "Wally", "Jack", "Julia"]: # try removing "Wally"
    print(f"Looking at {person}")
    if person == "Wally":
        break
else:
    print("I did not find Wally.")
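When the loop body only searches for a value, you may not need a flag variable at all. The sketch below is one possible alternative, using the in operator and findfirst(); it skips the "Looking at" messages, so it is not a drop-in replacement for the loop above.

```julia
people = ["Joe", "Jane", "Wally", "Jack", "Julia"]   # try removing "Wally"

# Membership test, combined with the || shorthand
"Wally" in people || println("I did not find Wally.")

# findfirst() returns the index of the first match, or nothing
idx = findfirst(isequal("Wally"), people)
if isnothing(idx)
    println("I did not find Wally.")
else
    println("Found Wally at index $idx")
end
```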
Julia | Python |
---|---|
if cond1 ... elseif cond2 ... else ... end | if cond1: ... elif cond2: ... else: ... |
&& | and |
\|\| | or |
! | not |
⊻ (type \xor ) | ^ |
true | True |
false | False |
cond && f() | if cond: f() |
cond \|\| f() | if not cond: f() |
for i in 1:5 ... end | for i in range(1, 6): ... |
for i in 1:5, j in 1:6 ... end | from itertools import product for i, j in product(range(1, 6), range(1, 7)): ... |
while cond ... end | while cond: ... |
continue | continue |
break | break |
Now let’s look at data structures, starting with tuples.
18. Tuples
Julia has tuples, very much like Python. They can contain anything:
t = (1, "Two", 3, 4, 5)
#> (1, "Two", 3, 4, 5)
Let’s look at one element:
t[1]
#> 1
Hey! Did you see that? Julia is 1-indexed, like Matlab and other math-oriented programming languages, not 0-indexed like Python and most programming languages. I found it easy to get used to, and in fact I quite like it, but your mileage may vary.
Moreover, the indexing bounds are inclusive. In Python, to get the 1st and 2nd elements of a list or tuple, you would write t[0:2]
(or just t[:2]
), while in Julia you write t[1:2]
.
t[1:2]
#> (1, "Two")
Note that end
represents the index of the last element in the tuple. So you must write t[end]
instead of t[-1]
. Similarly, you must write t[end - 1]
, not t[-2]
, and so on.
t[end]
#> 5
Last two:
t[end - 1:end]
#> (4, 5)
Like in Python, tuples are immutable:
try
t[2] = 2
catch ex
ex
end
#> MethodError(setindex!, ((1, "Two", 3, 4, 5), 2, 2), 0x0000000000006a24)
The syntax for empty and 1-element tuples is the same as in Python:
empty_tuple = ()
one_element_tuple = (42,)
#> (42,)
You can unpack a tuple, just like in Python (it’s called “destructuring” in Julia):
a, b, c, d, e = (1, "Two", 3, 4, 5)
println("a=$a, b=$b, c=$c, d=$d, e=$e")
#> a=1, b=Two, c=3, d=4, e=5
It also works with nested tuples, just like in Python:
(a, (b, c), (d, e)) = (1, ("Two", 3), (4, 5))
println("a=$a, b=$b, c=$c, d=$d, e=$e")
#> a=1, b=Two, c=3, d=4, e=5
However, consider this example:
a, b, c = (1, "Two", 3, 4, 5)
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=3
In Python, this would cause a ValueError: too many values to unpack
. In Julia, the extra values in the tuple are just ignored.
If you want to capture the extra values in the variable c
, you need to do so explicitly:
t = (1, "Two", 3, 4, 5)
a, b = t[1:2]
c = t[3:end]
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=(3, 4, 5)
Or more concisely:
(a, b), c = t[1:2], t[3:end]
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=(3, 4, 5)
The corresponding Python code is:
# PYTHON
t = (1, "Two", 3, 4, 5)
a, b, *c = t
print(f"a={a}, b={b}, c={c}")
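Newer Julia versions also accept a trailing "slurp" on the left-hand side, which is much closer to Python's *c. Note the assumption: this syntax needs Julia 1.6 or later, while the output shown in this tutorial comes from Julia 1.4.2, where it is not available.

```julia
# Assumes Julia 1.6 or later
t = (1, "Two", 3, 4, 5)
a, b, c... = t            # c collects the remaining values as a tuple
println("a=$a, b=$b, c=$c")
```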
19. Named Tuples
Julia supports named tuples:
nt = (name="Julia", category="Language", stars=5)
#> (name = "Julia", category = "Language", stars = 5)
See name attribute.
nt.name
#> "Julia"
Get the full dump of info about the Tuple.
dump(nt)
#> NamedTuple{(:name, :category, :stars),Tuple{String,String,Int64}}
#> name: String "Julia"
#> category: String "Language"
#> stars: Int64 5
The corresponding Python code is:
# Python
from collections import namedtuple
Rating = namedtuple("Rating", ["name", "category", "stars"])
nt = Rating(name="Julia", category="Language", stars=5)
print(nt.name) # prints: Julia
print(nt) # prints: Rating(name='Julia', category='Language', stars=5)
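A few more operations on named tuples, as a short sketch. keys(), values() and merge() are standard functions; the nt2 variable is just an illustrative name.

```julia
nt = (name="Julia", category="Language", stars=5)

println(keys(nt))                  # (:name, :category, :stars)
println(values(nt))                # ("Julia", "Language", 5)
println(nt[1], " ", nt[:stars])    # access by position or by symbol

# Named tuples are immutable: merge() builds an updated copy instead
nt2 = merge(nt, (stars=4,))
println(nt2.stars, " vs ", nt.stars)

# Fields can be unpacked positionally, like a regular tuple
name, category, stars = nt
println("$name is a $category")
```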
20. Structs
Julia supports structs, which hold multiple named fields, a bit like named tuples:
struct Person
name
age
end
Structs have a default constructor, which expects all the field values, in order:
p = Person("Mary", 30)
#> Person("Mary", 30)
p.age
#> 30
You can create other constructors by creating functions with the same name as the struct:
function Person(name)
    Person(name, -1)
end
function Person()
    Person("no name")
end
p = Person()
#> Person("no name", -1)
This creates two constructors: the second calls the first, which calls the default constructor. Notice that you can create multiple functions with the same name but different arguments. We will discuss this later.
These two constructors are called “outer constructors”, since they are defined outside of the definition of the struct. You can also define “inner constructors”:
struct Person2
    name
    age
    function Person2(name)
        new(name, -1)
    end
end
function Person2()
    Person2("no name")
end
p = Person2()
#> Person2("no name", -1)
This time, the outer constructor calls the inner constructor, which calls the new()
function. This new()
function only works in inner constructors, and of course it creates an instance of the struct.
When you define inner constructors, they replace the default constructor:
try
    Person2("Bob", 40)
catch ex
    ex
end
#> MethodError(Person2, ("Bob", 40), 0x0000000000006a29)
Structs usually have very few inner constructors (often just one), which do the heavy duty work, and the checks. Then they may have multiple outer constructors which are mostly there for convenience.
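As a sketch of that division of labor, the struct below keeps a validity check in its single inner constructor and adds a convenience function on top of it. Temperature and from_celsius are illustrative names that do not appear in the original notebook.

```julia
# Temperature and from_celsius are hypothetical names used for illustration
struct Temperature
    kelvin
    # Inner constructor: the single place where the invariant is checked
    function Temperature(kelvin)
        kelvin >= 0 || throw(ArgumentError("kelvin must be non-negative"))
        new(kelvin)
    end
end

# Convenience helper: it goes through the inner constructor, so the check still runs
from_celsius(celsius) = Temperature(celsius + 273.15)

println(from_celsius(25).kelvin)
try
    Temperature(-1)
catch ex
    println(ex)
end
```

Because the inner constructor replaces the default one, there is no way to build a Temperature that bypasses the check.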
By default, structs are immutable:
try
    p.name = "Someone"
catch ex
    ex
end
#> ErrorException("setfield! immutable struct of type Person2 cannot be changed")
However, it is possible to define a mutable struct:
mutable struct Person3
    name
    age
end
p = Person3("Lucy", 79)
p.age += 1
p
#> Person3("Lucy", 80)
Structs look a lot like Python classes, with instance variables and constructors, but where are the methods? We will discuss this later, in the “Methods” section.
21. Arrays
Let’s create a small array:
a = [1, 4, 9, 16]
#> 4-element Array{Int64,1}:
#> 1
#> 4
#> 9
#> 16
Indexing and assignments work as you would expect:
a[1] = 10
a[2:3] = [20, 30]
a
#> 4-element Array{Int64,1}:
#> 10
#> 20
#> 30
#> 16
22. Element Type
Since we used only integers when creating the array, Julia inferred that the array is only meant to hold integers (NumPy arrays behave the same way). Let’s try adding a string:
try
a[3] = "Three"
catch ex
ex
end
MethodError(convert, (Int64, "Three"), 0x0000000000006a2a)
Nope! We get a MethodError exception, telling us that Julia could not convert the string "Three" to a 64-bit integer (we will discuss exceptions later). If we want an array that can hold any type, like Python’s lists can, we must prefix the array with Any, which is Julia’s root type (like object in Python):
a = Any[1, 4, 9, 16]
a[3] = "Three"
a
4-element Array{Any,1}:
1
4
"Three"
16
Prefixing with Float64, or String, or any other type works as well:
Float64[1, 4, 9, 16]
4-element Array{Float64,1}:
1.0
4.0
9.0
16.0
An empty array is automatically an Any array:
a = []
0-element Array{Any,1}
You can use the eltype() function to get an array’s element type (the equivalent of NumPy arrays’ dtype):
eltype([1, 4, 9, 16])
Int64
If you create an array containing objects of different types, Julia will do its best to use a type that can hold all the values as precisely as possible. For example, a mix of integers and floats results in a float array:
[1, 2, 3.0, 4.0]
4-element Array{Float64,1}:
1.0
2.0
3.0
4.0
This is similar to NumPy’s behavior:
# PYTHON
np.array([1, 2, 3.0, 4.0]) # => array([1., 2., 3., 4.])
A mix of unrelated types results in an Any array:
[1, 2, "Three", 4]
4-element Array{Any,1}:
1
2
"Three"
4
If you want to live in a world without type constraints, you can prefix all your arrays with Any, and you will feel like you’re coding in Python. But I don’t recommend it: the compiler can perform a bunch of optimizations when it knows exactly the type and size of the data the program will handle, so it will run much faster. So when you create an empty array but you know the type of the values it will contain, you might as well prefix it with that type (you don’t have to, but it will speed up your program).
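To make this advice concrete, here is a small sketch comparing an untyped empty array with a typed one (the variable names are only illustrative):

```julia
untyped = []            # Array{Any,1}
typed = Float64[]       # Array{Float64,1}, same as Vector{Float64}()

for x in (1.5, 2.5)
    push!(untyped, x)
    push!(typed, x)
end

push!(typed, 3)         # the integer 3 is converted to 3.0
```

Both arrays hold the same first two values, but only the typed one tells the compiler what to expect.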
23. Push and Pop
To append elements to an array, use the push!() function. By convention, functions whose name ends with a bang ! may modify their arguments:
a = [1]
push!(a, 4)
push!(a, 9, 16)
4-element Array{Int64,1}:
1
4
9
16
This is similar to the following Python code:
# PYTHON
a = [1]
a.append(4)
a.extend([9, 16]) # or simply a += [9, 16]
And pop!() works like in Python:
pop!(a)
16
Equivalent to:
# PYTHON
a.pop()
There are many more functions you can call on an array. We will see later how to find them.
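As a quick sketch of a few of them, the following standard functions also mutate an array in place (the Python calls in the comments are rough equivalents):

```julia
a = [1, 4]
append!(a, [9, 16])          # like a.extend([9, 16])
pushfirst!(a, 0)             # like a.insert(0, 0)
first_item = popfirst!(a)    # like a.pop(0)
insert!(a, 2, 2)             # insert the value 2 at index 2
deleteat!(a, 2)              # remove the item at index 2
```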
24. Multidimensional Arrays
Importantly, Julia arrays can be multidimensional, just like NumPy arrays:
M = [1 2 3 4
5 6 7 8
9 10 11 12]
3×4 Array{Int64,2}:
1 2 3 4
5 6 7 8
9 10 11 12
Another syntax for this is:
M = [1 2 3 4; 5 6 7 8; 9 10 11 12]
3×4 Array{Int64,2}:
1 2 3 4
5 6 7 8
9 10 11 12
You can index them much like NumPy arrays:
M[2:3, 3:4]
2×2 Array{Int64,2}:
7 8
11 12
You can transpose a matrix using the “adjoint” operator ' (for a real-valued matrix, the adjoint is simply the transpose):
M'
4×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
1 5 9
2 6 10
3 7 11
4 8 12
As you can see, Julia arrays are closer to NumPy arrays than to Python lists.
Arrays can be concatenated vertically using the vcat() function:
M1 = [1 2
3 4]
M2 = [5 6
7 8]
vcat(M1, M2)
4×2 Array{Int64,2}:
1 2
3 4
5 6
7 8
Alternatively, you can use the [M1; M2] syntax:
[M1; M2]
4×2 Array{Int64,2}:
1 2
3 4
5 6
7 8
To concatenate arrays horizontally, use hcat():
hcat(M1, M2)
2×4 Array{Int64,2}:
1 2 5 6
3 4 7 8
Or you can use the [M1 M2] syntax:
[M1 M2]
2×4 Array{Int64,2}:
1 2 5 6
3 4 7 8
You can combine horizontal and vertical concatenation:
M3 = [9 10 11 12]
[M1 M2; M3]
3×4 Array{Int64,2}:
1 2 5 6
3 4 7 8
9 10 11 12
Equivalently, you can call the hvcat() function. The first argument specifies the number of arguments to concatenate in each block row:
hvcat((2, 1), M1, M2, M3)
3×4 Array{Int64,2}:
1 2 5 6
3 4 7 8
9 10 11 12
hvcat() is useful to create a single cell matrix:
hvcat(1, 42)
1×1 Array{Int64,2}:
42
Or a column vector (i.e., an n×1 matrix = a matrix with a single column):
hvcat((1, 1, 1), 10, 11, 12) # a column vector with values 10, 11, 12
hvcat(1, 10, 11, 12) # equivalent to the previous line
3×1 Array{Int64,2}:
10
11
12
Alternatively, you can transpose a row vector (but hvcat() is a bit faster):
[10 11 12]'
3×1 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
10
11
12
The REPL and IJulia call display() to print the result of the last expression in a cell (except when it is nothing). It is fairly verbose:
display([1, 2, 3, 4])
4-element Array{Int64,1}:
1
2
3
4
The println() function is more concise, but be careful not to confuse vectors, column vectors and row vectors (printed with commas, semicolons and spaces, respectively):
println("Vector: ", [1, 2, 3, 4])
println("Column vector: ", hvcat(1, 1, 2, 3, 4))
println("Row vector: ", [1 2 3 4])
println("Matrix: ", [1 2 3; 4 5 6])
Vector: [1, 2, 3, 4]
Column vector: [1; 2; 3; 4]
Row vector: [1 2 3 4]
Matrix: [1 2 3; 4 5 6]
Although column vectors are printed as [1; 2; 3; 4], evaluating [1; 2; 3; 4] will give you a regular vector. That’s because [x;y] concatenates x and y vertically, and if x and y are scalars or vectors, you just get a regular vector.
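The difference is easy to check with size(). As a small sketch, reshape() is one more way (not used elsewhere in this tutorial) to turn a regular vector into a column vector:

```julia
v = [1; 2; 3; 4]            # vertical concatenation of scalars: a regular vector
col = reshape(v, :, 1)      # a 4×1 matrix (column vector) sharing v's data
row = [1 2 3 4]             # a 1×4 matrix (row vector)

println(size(v), " ", size(col), " ", size(row))
```

Going the other way, vec(col) flattens the column vector back into a regular vector.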
Julia | Python |
---|---|
a = [1, 2, 3] | a = [1, 2, 3] or np.array([1, 2, 3]) (after import numpy as np) |
a[1] | a[0] |
a[end] | a[-1] |
a[2:end-1] | a[1:-1] |
push!(a, 5) | a.append(5) |
pop!(a) | a.pop() |
M = [1 2 3] | np.array([[1, 2, 3]]) |
M = [1 2 3]' | np.array([[1, 2, 3]]).T |
M = hvcat(1, 1, 2, 3) | np.array([[1], [2], [3]]) |
M = [1 2 3; 4 5 6] (or the same literal with a line break instead of the semicolon) | M = np.array([[1,2,3], [4,5,6]]) |
M[1:2, 2:3] | M[0:2, 1:3] |
[M1; M2] | np.r_[M1, M2] |
[M1 M2] | np.c_[M1, M2] |
[M1 M2; M3] | np.r_[np.c_[M1, M2], M3] |
25. Comprehensions
List comprehensions are available in Julia, just like in Python (they’re usually just called “comprehensions” in Julia):
a = [x^2 for x in 1:4]
4-element Array{Int64,1}:
1
4
9
16
You can filter elements using an if clause, just like in Python:
a = [x^2 for x in 1:5 if x ∉ (2, 4)]
3-element Array{Int64,1}:
1
9
25
Notes:
* a ∉ b is equivalent to !(a in b) (or a not in b in Python). You can type ∉ with \notin followed by the Tab key.
* a ∈ b is equivalent to a in b. You can type it with \in followed by the Tab key.
In Julia, comprehensions can contain nested loops, just like in Python:
a = [(i,j) for i in 1:3 for j in 1:i]
6-element Array{Tuple{Int64,Int64},1}:
(1, 1)
(2, 1)
(2, 2)
(3, 1)
(3, 2)
(3, 3)
Here’s the corresponding Python code:
# PYTHON
a = [(i, j) for i in range(1, 4) for j in range(1, i+1)]
Julia comprehensions can also create multi-dimensional arrays (note the different syntax: there is only one for):
a = [row * col for row in 1:3, col in 1:5]
3×5 Array{Int64,2}:
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
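To see how the two syntaxes differ, here is a small sketch that builds the same products both ways and compares the shapes:

```julia
flat = [row * col for row in 1:3 for col in 1:5]   # two `for`s: a flat vector
grid = [row * col for row in 1:3, col in 1:5]      # one `for` and a comma: a matrix

println(size(flat))   # a 1-dimensional array with 15 elements
println(size(grid))   # a 3×5 matrix
```

The flat version visits the elements row by row, so it matches the matrix read in row-major order.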
26. Dictionaries
The syntax for dictionaries is a bit different from Python’s:
d = Dict("tree"=>"arbre", "love"=>"amour", "coffee"=>"café")
println(d["tree"])
arbre
println(get(d, "unknown", "pardon?"))
pardon?
keys(d)
Base.KeySet for a Dict{String,String} with 3 entries. Keys:
"coffee"
"tree"
"love"
values(d)
Base.ValueIterator for a Dict{String,String} with 3 entries. Values:
"café"
"arbre"
"amour"
haskey(d, "love")
true
"love" in keys(d) # this is slower than haskey()
true
The equivalent Python code is of course:
d = {"tree": "arbre", "love": "amour", "coffee": "café"}
d["tree"]
d.get("unknown", "pardon?")
d.keys()
d.values()
"love" in d
"love" in d.keys()
Dict comprehensions work as you would expect:
d = Dict(i=>i^2 for i in 1:5)
Dict{Int64,Int64} with 5 entries:
4 => 16
2 => 4
3 => 9
5 => 25
1 => 1
Note that the items (aka “pairs” in Julia) come out in an arbitrary order, since Julia’s Dict is hash-based and does not keep track of insertion order (unlike Python’s dict, which has preserved insertion order since Python 3.7).
You can easily iterate through the dictionary’s pairs like this:
for (k, v) in d
println("$k maps to $v")
end
4 maps to 16
2 maps to 4
3 maps to 9
5 maps to 25
1 maps to 1
The equivalent code in Python is:
# PYTHON
d = {i: i**2 for i in range(1, 6)}
for k, v in d.items():
print(f"{k} maps to {v}")
And you can merge dictionaries like this:
d1 = Dict("tree"=>"arbre", "love"=>"amour", "coffee"=>"café")
d2 = Dict("car"=>"voiture", "love"=>"aimer")
d = merge(d1, d2)
Dict{String,String} with 4 entries:
"car" => "voiture"
"coffee" => "café"
"tree" => "arbre"
"love" => "aimer"
Notice that the second dictionary has priority in case of conflict (it’s "love" => "aimer", not "love" => "amour").
In Python, this would be:
# PYTHON
d1 = {"tree": "arbre", "love": "amour", "coffee": "café"}
d2 = {"car": "voiture", "love": "aimer"}
d = {**d1, **d2}
Or if you want to update the first dictionary instead of creating a new one:
merge!(d1, d2)
Dict{String,String} with 4 entries:
"car" => "voiture"
"coffee" => "café"
"tree" => "arbre"
"love" => "aimer"
In Python, that’s:
# PYTHON
d1.update(d2)
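When the same key appears in both dictionaries, you may prefer to combine the two values rather than let the second one win. As a small sketch, merge() also accepts a combining function as its first argument (the word counts below are made-up data):

```julia
counts1 = Dict("tree" => 2, "love" => 1)
counts2 = Dict("love" => 4, "car" => 3)

# Values for keys present in both dictionaries are combined with +.
total = merge(+, counts1, counts2)
```

Like the two-argument form, this returns a new dictionary and leaves counts1 and counts2 untouched.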
In Julia, each pair is an actual Pair object:
p = "tree" => "arbre"
println(typeof(p))
k, v = p
println("$k maps to $v")
Pair{String,String}
tree maps to arbre
Note that any object for which a hash() method is implemented can be used as a key in a dictionary. This includes all the basic types like integers and floats, as well as strings, tuples, etc. But it also includes arrays! In Julia, you have the freedom to use arrays as keys (unlike in Python), but make sure not to mutate these arrays after insertion, or else things will break! Indeed, the pairs will be stored in memory in a location that depends on the hash of the key at insertion time, so if that key changes afterwards, you won’t be able to find the pair anymore:
a = [1, 2, 3]
d = Dict(a => "My array")
println("The dictionary is: $d")
println("Indexing works fine as long as the array is unchanged: ", d[a])
a[1] = 10
println("This is the dictionary now: $d")
try
println("Key changed, indexing is now broken: ", d[a])
catch ex
ex
end
The dictionary is: Dict([1, 2, 3] => "My array")
Indexing works fine as long as the array is unchanged: My array
This is the dictionary now: Dict([10, 2, 3] => "My array")
KeyError([10, 2, 3])
However, it’s still possible to iterate through the keys, the values or the pairs:
for pair in d
println(pair)
end
[10, 2, 3] => "My array"
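One way to avoid this pitfall, sketched below, is to convert the array to an immutable tuple before using it as a key (this is a suggestion, not something the rest of this tutorial relies on):

```julia
a = [1, 2, 3]
d = Dict(Tuple(a) => "My array")   # the key is the immutable tuple (1, 2, 3)
a[1] = 10                          # mutating the array leaves the key untouched

println(d[(1, 2, 3)])
```

Using copy(a) as the key would also work, as long as nobody keeps a reference to the copy.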
Julia | Python |
---|---|
Dict("tree"=>"arbre", "love"=>"amour") | {"tree": "arbre", "love": "amour"} |
d["tree"] | d["tree"] |
get(d, "unknown", "default") | d.get("unknown", "default") |
keys(d) | d.keys() |
values(d) | d.values() |
haskey(d, k) | k in d |
Dict(i=>i^2 for i in 1:4) | {i: i**2 for i in range(1, 5)} |
for (k, v) in d | for k, v in d.items(): |
merge(d1, d2) | {**d1, **d2} |
merge!(d1, d2) | d1.update(d2) |
27. Sets
Let’s create a couple of sets:
odd = Set([1, 3, 5, 7, 9, 11])
prime = Set([2, 3, 5, 7, 11])
Set{Int64} with 5 elements:
7
2
3
11
5
The order of sets is not guaranteed, just like in Python.
Use in or ∈ (type \in followed by the Tab key) to check whether a set contains a given value:
5 ∈ odd
true
5 in odd
true
Both of these expressions are equivalent to:
in(5, odd)
true
Now let’s get the union of these two sets:
odd ∪ prime
Set{Int64} with 7 elements:
7
9
2
3
11
5
1
∪ is the union symbol, not a U. To type this character, type \cup followed by the Tab key (it has the shape of a cup). Alternatively, you can just use the union() function:
union(odd, prime)
Set{Int64} with 7 elements:
7
9
2
3
11
5
1
Now let’s get the intersection using the ∩ symbol (type \cap followed by the Tab key):
odd ∩ prime
Set{Int64} with 4 elements:
7
3
11
5
Or use the intersect() function:
intersect(odd, prime)
Set{Int64} with 4 elements:
7
3
11
5
Next, let’s get the set difference and the symmetric difference between these two sets:
setdiff(odd, prime) # values in odd but not in prime
Set{Int64} with 2 elements:
9
1
symdiff(odd, prime) # values that are not in the intersection
Set{Int64} with 3 elements:
9
2
1
Lastly, set comprehensions work just fine:
Set([i^2 for i in 1:4])
Set{Int64} with 4 elements:
4
9
16
1
The equivalent Python code is:
# PYTHON
odds = {1, 3, 5, 7, 9, 11}
primes = {2, 3, 5, 7, 11}
5 in primes
odds | primes # union
odds.union(primes)
odds & primes # intersection
odds.intersection(primes)
odds - primes # set difference
odds.difference(primes)
odds ^ primes # symmetric difference
odds.symmetric_difference(primes)
{i**2 for i in range(1, 5)}
Note that you can store any hashable object in a Set (i.e., any instance of a type for which the hash() method is implemented). This includes arrays, unlike in Python. Just like for dictionary keys, you can add arrays to sets, but make sure not to mutate them after insertion.
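Sets are mutable, so you can also add and remove elements in place. Here is a short sketch using the standard push!(), delete!() and issubset() functions:

```julia
s = Set([2, 3, 5])
push!(s, 7)
push!(s, 7)       # adding a value that is already present changes nothing
delete!(s, 2)

println(length(s))
println(issubset(Set([3, 5]), s))   # can also be written Set([3, 5]) ⊆ s
```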
Julia | Python |
---|---|
Set([1, 3, 5, 7]) | {1, 3, 5, 7} |
5 in odd | 5 in odd |
Set([i^2 for i in 1:4]) | {i**2 for i in range(1, 5)} |
odd ∪ primes | odd \| primes |
union(odd, primes) | odd.union(primes) |
odd ∩ primes | odd & primes |
intersect(odd, primes) | odd.intersection(primes) |
setdiff(odd, primes) | odd - primes or odd.difference(primes) |
symdiff(odd, primes) | odd ^ primes or odd.symmetric_difference(primes) |
28. Enums
To create an enum, use the @enum macro:
@enum Fruit apple=1 banana=2 orange=3
This creates the Fruit enum, with 3 possible values. It also binds the names to the values:
banana
banana::Fruit = 2
Or you can get a Fruit instance using the value:
Fruit(2)
banana::Fruit = 2
And you can get all the instances of the enum easily:
instances(Fruit)
(apple, banana, orange)
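Enum instances can also be converted back to integers or to their names. The sketch below uses a separate, hypothetical Color enum (to avoid redefining Fruit) and a hypothetical helper, color_from_name(), that looks up an instance by name:

```julia
@enum Color red=1 green=2 blue=3

# Hypothetical helper: find the Color whose name matches the given string.
function color_from_name(name)
    for color in instances(Color)
        string(color) == name && return color
    end
    throw(ArgumentError("unknown color: $name"))
end

println(Int(green))                # the underlying integer value
println(string(blue))              # the instance's name, as a string
println(color_from_name("red"))
```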
Julia | Python |
---|---|
@enum Fruit apple=1 banana=2 orange=3 | class Fruit(Enum): APPLE = 1; BANANA = 2; ORANGE = 3 (after from enum import Enum) |
Fruit(2) === banana | Fruit(2) is Fruit.BANANA |
instances(Fruit) | list(Fruit) |
29. Object Identity
In the previous example, Fruit(2) and banana refer to the same object, not just two objects that happen to be equal. You can verify using the === operator, which is the equivalent of Python’s is operator:
banana === Fruit(2)
true
You can also check this by looking at their objectid(), which is the equivalent of Python’s id() function:
objectid(banana)
0x360d21ab82c8ee67
objectid(Fruit(2))
0x360d21ab82c8ee67
a = [1, 2, 4]
b = [1, 2, 4]
@assert a == b # a and b are equal
@assert a !== b # but they are not the same object
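The distinction matters most with mutable objects. In this short sketch (the names alias and clone are only illustrative), a mutation made through one name is visible through every name bound to the same object, but not through a copy:

```julia
a = [1, 2, 4]
alias = a           # a second name for the same array
clone = copy(a)     # a distinct array with equal contents

push!(alias, 8)     # also visible through a, but not through clone

println(a)
println(clone)
```

For immutable values such as tuples of integers, === compares contents, so (1, 2) === (1, 2) is true.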
Julia | Python |
---|---|
a === b | a is b |
a !== b | a is not b |
objectid(obj) | id(obj) |
30. Other Collections
For the Julia equivalent of Python’s other collections, namely defaultdict, deque, OrderedDict, and Counter, check out these libraries:
- https://github.com/JuliaCollections/DataStructures.jl
- https://github.com/JuliaCollections/OrderedCollections.jl
- https://github.com/andyferris/Dictionaries.jl
Now let’s look at various iteration constructs.
Iteration Tools
31. Generator Expressions
Just like in Python, a generator expression resembles a list comprehension, but without the square brackets, and it returns a generator instead of a list. Here’s a much shorter implementation of the estimate_pi() function using a generator expression:
function estimate_pi2(n)
4 * sum((isodd(i) ? -1 : 1)/(2i+1) for i in 0:n)
end
@assert estimate_pi(100) == estimate_pi2(100)
That’s very similar to the corresponding Python code:
# PYTHON
def estimate_pi2(n):
return 4 * sum((-1 if i%2==1 else 1)/(2*i+1) for i in range(n+1))
assert estimate_pi(100) == estimate_pi2(100)
zip, enumerate, collect
The zip() function works much like in Python:
for (i, s) in zip(10:13, ["Ten", "Eleven", "Twelve"])
println(i, ": ", s)
end
10: Ten
11: Eleven
12: Twelve
Notice that the parentheses in for (i, s) are required in Julia, as opposed to Python.
The enumerate() function also works like in Python, except of course it is 1-indexed:
for (i, s) in enumerate(["One", "Two", "Three"])
println(i, ": ", s)
end
1: One
2: Two
3: Three
To pull the values of a generator into an array, use collect():
collect(1:5)
5-element Array{Int64,1}:
1
2
3
4
5
A shorter syntax for that is:
[1:5;]
5-element Array{Int64,1}:
1
2
3
4
5
The equivalent Python code is:
# PYTHON
list(range(1, 6))
32. Generators
In Python, you can easily write a generator function to create an object that will behave like an iterator. For example, let’s create a generator for the Fibonacci sequence (where each number is the sum of the two previous numbers):
def fibonacci(n):
a, b = 1, 1
for i in range(n):
yield a
a, b = b, a + b
for f in fibonacci(10):
print(f)
This is also quite easy in Julia:
function fibonacci(n)
Channel() do ch
a, b = 1, 1
for i in 1:n
put!(ch, a)
a, b = b, a + b
end
end
end
for f in fibonacci(10)
println(f)
end
1
1
2
3
5
8
13
21
34
55
The Channel type is part of the API for tasks and coroutines. We’ll discuss these later.
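A channel can also be given an element type, and since it is iterable, collect() works on it too. The following sketch assumes Julia 1.3 or later, where Channel{T}(func) is available; fibonacci_typed() is just a typed variant of the function above:

```julia
function fibonacci_typed(n)
    # Channel{Int} only carries Int values, which helps type inference.
    Channel{Int}() do ch
        a, b = 1, 1
        for i in 1:n
            put!(ch, a)
            a, b = b, a + b
        end
    end
end

first_six = collect(fibonacci_typed(6))
println(first_six)
```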
Now let’s take a closer look at functions.
33. Functions
Arguments
Julia functions support positional arguments and default values:
function draw_face(x, y, width=3, height=4)
println("x=$x, y=$y, width=$width, height=$height")
end
draw_face(10, 20, 30)
x=10, y=20, width=30, height=4
However, unlike in Python, positional arguments must not be named when the function is called:
try
draw_face(10, 20, width=30)
catch ex
ex
end
MethodError(var"#draw_face##kw"(), ((width = 30,), draw_face, 10, 20), 0x0000000000006a3e)
Julia also supports a variable number of arguments (called “varargs”) using the syntax arg..., which is the equivalent of Python’s *arg:
function copy_files(target_dir, paths...)
println("target_dir=$target_dir, paths=$paths")
end
copy_files("/tmp", "a.txt", "b.txt")
target_dir=/tmp, paths=("a.txt", "b.txt")
Keyword arguments are supported, after a semicolon ;:
function copy_files2(paths...; confirm=false, target_dir)
println("paths=$paths, confirm=$confirm, $target_dir")
end
copy_files2("a.txt", "b.txt"; target_dir="/tmp")
paths=("a.txt", "b.txt"), confirm=false, /tmp
Notes:
* target_dir has no default value, so it is a required argument.
* The order of the keyword arguments does not matter.
You can have another vararg in the keyword section. It corresponds to Python’s **kwargs:
function copy_files3(paths...; confirm=false, target_dir, options...)
println("paths=$paths, confirm=$confirm, $target_dir")
verbose = options[:verbose]
println("verbose=$verbose")
end
copy_files3("a.txt", "b.txt"; target_dir="/tmp", verbose=true, timeout=60)
paths=("a.txt", "b.txt"), confirm=false, /tmp
verbose=true
The options vararg acts like a dictionary (see the Dictionaries section above). The keys are symbols, e.g., :verbose. Symbols are like strings, less flexible but faster. They are typically used as keys or identifiers.
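Since the keyword vararg behaves like a dictionary, the usual dictionary functions apply to it. As a rough sketch, the hypothetical summarize_options() function below reads an optional keyword with get() and lists the names it received with keys():

```julia
# Hypothetical function, for illustration only.
function summarize_options(; options...)
    verbose = get(options, :verbose, false)    # default when the key is absent
    names = sort(collect(keys(options)))       # keyword names, as symbols
    return verbose, names
end

println(summarize_options(timeout=60, verbose=true))
println(summarize_options(timeout=60))
```

Using get() with a default avoids the KeyError that options[:verbose] would raise when the caller does not pass verbose.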
Julia | Python (3.8+ if / is used) |
---|---|
function foo(a, b=2, c=3) ... end | def foo(a, b=2, c=3, /): ... |
function foo(;a=1, b, c=3) ... end | def foo(*, a=1, b, c=3): ... |
function foo(a, b=2; c=3, d) ... end | def foo(a, b=2, /, *, c=3, d): ... |
function foo(a, b=2, c...) ... end | def foo(a, b=2, /, *c): ... |
function foo(a, b=1, c...; d=1, e, f...) ... end | def foo(a, b=1, /, *c, d=1, e, **f): ... |
34. Concise Functions
In Julia, the following definition:
square(x) = x^2
square (generic function with 1 method)
is equivalent to:
function square(x)
x^2
end
square (generic function with 1 method)
For example, here’s a shorter way to define the estimate_pi() function in Julia:
estimate_pi3(n) = 4 * sum((isodd(i) ? -1 : 1)/(2i+1) for i in 0:n)
estimate_pi3 (generic function with 1 method)
To define a function on one line in Python, you need to use a lambda (but this is generally frowned upon, since the resulting function’s name is "<lambda>"):
# PYTHON
square = lambda x: x**2
assert square.__name__ == "<lambda>"
This leads us to anonymous functions.
35. Anonymous Functions
Just like in Python, you can define anonymous functions:
map(x -> x^2, 1:4)
4-element Array{Int64,1}:
1
4
9
16
Here is the equivalent Python code:
list(map(lambda x: x**2, range(1, 5)))
Notes:
* map() returns an array in Julia, instead of an iterator like in Python.
* You could use a comprehension instead: [x^2 for x in 1:4].
Julia | Python |
---|---|
x -> x^2 | lambda x: x**2 |
(x,y) -> x + y | lambda x,y: x + y |
() -> println("yes") | lambda: print("yes") |
In Python, lambda functions must be simple expressions. They cannot contain multiple statements. In Julia, they can be as long as you want. Indeed, you can create a multi-statement block using the syntax (stmt_1; stmt_2; ...; stmt_n). The return value is the output of the last statement. For example:
map(x -> (println("Number $x"); x^2), 1:4)
Number 1
Number 2
Number 3
Number 4
4-element Array{Int64,1}:
1
4
9
16
This syntax can span multiple lines:
map(x -> (
println("Number $x");
x^2), 1:4)
Number 1
Number 2
Number 3
Number 4
4-element Array{Int64,1}:
1
4
9
16
But in this case, it’s probably clearer to use the begin ... end syntax instead:
map(x -> begin
println("Number $x")
x^2
end, 1:4)
Number 1
Number 2
Number 3
Number 4
4-element Array{Int64,1}:
1
4
9
16
Notice that this syntax allows you to drop the semicolons ; at the end of each line in the block.
Yet another way to define an anonymous function is using the function (args) ... end syntax:
map(function (x)
println("Number $x")
x^2
end, 1:4)
Number 1
Number 2
Number 3
Number 4
4-element Array{Int64,1}:
1
4
9
16
Lastly, if you’re passing the anonymous function as the first argument to a function (as is the case in this example), it’s usually much preferable to define the anonymous function immediately after the function call, using the do syntax, like this:
map(1:4) do x
println("Number $x")
x^2
end
Number 1
Number 2
Number 3
Number 4
4-element Array{Int64,1}:
1
4
9
16
This syntax lets you easily define constructs that feel like language extensions:
function my_for(func, collection)
for i in collection
func(i)
end
end
my_for(1:4) do i
println("The square of $i is $(i^2)")
end
The square of 1 is 1
The square of 2 is 4
The square of 3 is 9
The square of 4 is 16
In fact, Julia has a similar foreach() function.
The do syntax could be used to write a Domain Specific Language (DSL), for example an infrastructure automation DSL:
function spawn_server(startup_func, server_type)
println("Starting $server_type server")
server_id = 1234
println("Configuring server $server_id...")
startup_func(server_id)
end
# This is the DSL part
spawn_server("web") do server_id
println("Creating HTML pages on server $server_id...")
end
Starting web server
Configuring server 1234...
Creating HTML pages on server 1234...
It’s also quite nice for event-driven code:
handlers = []
on_click(handler) = push!(handlers, handler)
click(event) = foreach(handler->handler(event), handlers)
on_click() do event
println("Mouse clicked at $event")
end
on_click() do event
println("Beep.")
end
click((x=50, y=20))
click((x=120, y=10))
Mouse clicked at (x = 50, y = 20)
Beep.
Mouse clicked at (x = 120, y = 10)
Beep.
It can also be used to create context managers, for example to automatically close an object after it has been used, even if an exception is raised:
function with_database(func, name)
println("Opening connection to database $name")
db = "a db object for database $name"
try
func(db)
finally
println("Closing connection to database $name")
end
end
with_database("jobs") do db
println("I'm working with $db")
#error("Oops") # try uncommenting this line
end
Opening connection to database jobs
I'm working with a db object for database jobs
Closing connection to database jobs
The equivalent code in Python would look like this:
# PYTHON
class Database:
def __init__(self, name):
self.name = name
def __enter__(self):
print(f"Opening connection to database {self.name}")
return f"a db object for database {self.name}"
def __exit__(self, type, value, traceback):
print(f"Closing connection to database {self.name}")
with Database("jobs") as db:
print(f"I'm working with {db}")
#raise Exception("Oops") # try uncommenting this line
Or you could use contextlib:
from contextlib import contextmanager
@contextmanager
def database(name):
print(f"Opening connection to database {name}")
db = f"a db object for database {name}"
try:
yield db
finally:
print(f"Closing connection to database {name}")
with database("jobs") as db:
print(f"I'm working with {db}")
#raise Exception("Oops") # try uncommenting this line
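Julia’s standard library uses this same do pattern for resource management. For instance, open() accepts a function as its first argument and closes the file once that function returns or throws. Here is a small sketch that writes to a temporary file created with mktemp(), just for the demo:

```julia
path, io = mktemp()     # a temporary file, just for this demo
close(io)

open(path, "w") do file
    write(file, "hello")
end                     # the file is closed here, even if the block throws

contents = read(path, String)
rm(path)                # clean up the temporary file
println(contents)
```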
36. Piping
If you are used to the object-oriented syntax "a b c".upper().split(), you may feel that writing split(uppercase("a b c")) is a bit backwards. If so, the piping operator |> is for you:
"a b c" |> uppercase |> split
3-element Array{SubString{String},1}:
"A"
"B"
"C"
If you want to pass more than one argument to some of the functions, you can use anonymous functions:
"a b c" |> uppercase |> split |> tokens->join(tokens, ", ")
"A, B, C"
The dotted version of the pipe operator works as you might expect, applying the i-th function of the right array to the i-th value in the left array:
[π/2, "hello", 4] .|> [sin, length, x->x^2]
3-element Array{Real,1}:
1.0
5
16
37. Composition
Julia also lets you compose functions like mathematicians do, using the composition operator ∘ (type \circ followed by the Tab key in the REPL or Jupyter, but not Colab):
f = exp ∘ sin ∘ sqrt
f(2.0) == exp(sin(sqrt(2.0)))
true
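Piping and composition are two ways of writing the same chain: a pipe applies functions from left to right to a value, while ∘ builds a reusable function that applies them from right to left. A small sketch (the name clean is only illustrative):

```julia
clean = uppercase ∘ strip                    # strip first, then uppercase
piped = "  julia " |> strip |> uppercase     # same steps, written left to right

println(clean("  julia "))
println(map(clean, [" a", "b "]))            # a composed function can be passed around
```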
38. Methods
Earlier, we discussed structs, which look a lot like Python classes, with instance variables and constructors, but they did not contain any methods (just the inner constructors). In Julia, methods are defined separately, like regular functions:
struct Person
name
age
end
function greetings(greeter)
println("Hi, my name is $(greeter.name), I am $(greeter.age) years old.")
end
p = Person("Alice", 70)
greetings(p)
Hi, my name is Alice, I am 70 years old.
Since the greetings() method in Julia is not bound to any particular type, we can use it with any other type we want, as long as that type has a name and an age (i.e., if it quacks like a duck):
struct City
name
country
age
end
using Dates
c = City("Auckland", "New Zealand", year(now()) - 1840)
greetings(c)
Hi, my name is Auckland, I am 180 years old.
You could code this the same way in Python if you wanted to:
# PYTHON
class Person:
def __init__(self, name, age):
self.name = name
self.age = age
class City:
def __init__(self, name, country, age):
self.name = name
self.country = country
self.age = age
def greetings(greeter):
print(f"Hi there, my name is {greeter.name}, I am {greeter.age} years old.")
p = Person("Lucy", 70)
greetings(p)
from datetime import date
c = City("Auckland", "New Zealand", date.today().year - 1840)
greetings(c)
However, many Python programmers would use inheritance in this case:
class Greeter:
def __init__(self, name, age):
self.name = name
self.age = age
def greetings(self):
print(f"Hi there, my name is {self.name}, I am {self.age} years old.")
class Person(Greeter):
def __init__(self, name, age):
super().__init__(name, age)
class City(Greeter):
def __init__(self, name, country, age):
super().__init__(name, age)
self.country = country
p = Person("Lucy", 70)
p.greetings()
from datetime import date
c = City("Auckland", "New Zealand", date.today().year - 1840)
c.greetings()
39. Extending a Function
One nice thing about having a class hierarchy is that you can override methods in subclasses to get specialized behavior for each class. For example, in Python you could override the greetings() method like this:
# PYTHON
class Developer(Person):
def __init__(self, name, age, language):
super().__init__(name, age)
self.language = language
def greetings(self):
print(f"Hi there, my name is {self.name}, I am {self.age} years old.")
print(f"My favorite language is {self.language}.")
d = Developer("Amy", 40, "Julia")
d.greetings()
Notice that the expression d.greetings() will call a different method depending on whether d is a Person or a Developer. This is called “polymorphism”: the same method call behaves differently depending on the type of the object. The language chooses which actual method implementation to call, based on the type of d: this is called method “dispatch”. More specifically, since it only depends on a single variable, it is called “single dispatch”.
The good news is that Julia can do single dispatch as well:
struct Developer
name
age
language
end
function greetings(dev::Developer)
println("Hi, my name is $(dev.name), I am $(dev.age) years old.")
println("My favorite language is $(dev.language).")
end
d = Developer("Amy", 40, "Julia")
greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.
Notice that the dev argument is followed by ::Developer, which means that this method will only be called if the argument has that type.
We have extended the greetings function, so that it now has two different implementations, called methods, each for different argument types: namely, greetings(dev::Developer) for arguments of type Developer, and greetings(greeter) for values of any other type.
You can easily get the list of all the methods of a given function:
methods(greetings)
# 2 methods for generic function "greetings":
- greetings(dev::Developer) in Main at In[200]:8
- greetings(greeter) in Main at In[198]:7
You can also get the list of all the methods which take a particular type as argument:
methodswith(Developer)
1-element Array{Method,1}:
- greetings(dev::Developer) in Main at In[200]:8
When you call the greetings() function, Julia automatically dispatches the call to the appropriate method, depending on the type of the argument. If Julia can determine at compile time what the type of the argument will be, then it optimizes the compiled code so that there’s no choice to be made at runtime. This is called static dispatch, and it can significantly speed up the program. If the argument’s type can’t be determined at compile time, then Julia makes the choice at runtime, just like in Python: this is called dynamic dispatch.
41. Multiple Dispatch
Julia actually looks at the types of all the positional arguments, not just the first one. This is called multiple dispatch. For example:
multdisp(a::Int64, b::Int64) = 1
multdisp(a::Int64, b::Float64) = 2
multdisp(a::Float64, b::Int64) = 3
multdisp(a::Float64, b::Float64) = 4
multdisp(10, 20) # try changing the arguments to get each possible output
1
Julia always chooses the most specific method it can, so the following method will only be called if the first argument is neither an Int64 nor a Float64:
multdisp(a::Any, b::Int64) = 5
multdisp(10, 20)
1
Julia will raise an exception if there is some ambiguity as to which method is the most specific:
ambig(a::Int64, b) = 1
ambig(a, b::Int64) = 2
try
ambig(10, 20)
catch ex
ex
end
MethodError(ambig, (10, 20), 0x0000000000006a68)
To solve this problem, you can explicitly define a method for the ambiguous case:
ambig(a::Int64, b::Int64) = 3
ambig(10, 20)
3
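Multiple dispatch is not limited to numeric types. As an illustrative sketch (the types and the beats() function below are hypothetical, not part of this tutorial’s code), here is rock-paper-scissors, where the result depends on the types of both arguments:

```julia
# Hypothetical types, used only for illustration.
struct Rock end
struct Paper end
struct Scissors end

beats(::Rock, ::Scissors) = true
beats(::Paper, ::Rock) = true
beats(::Scissors, ::Paper) = true
beats(a, b) = false    # fallback for every other pair of arguments

println(beats(Rock(), Scissors()))
println(beats(Scissors(), Rock()))
```

With single dispatch you would typically need an isinstance() check on the second argument inside each method. Here every winning combination is simply its own method, and the untyped fallback handles the rest.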
So you can have polymorphism in Julia, just like in Python. This means that you can write your algorithms in a generic way, without having to know the exact types of the values you are manipulating, and it will work fine, as long as these types act in the general way you expect (i.e., if they “quack like ducks”). For example:
function how_can_i_help(greeter)
greetings(greeter)
println("How can I help?")
end
how_can_i_help(p) # called on a Person
how_can_i_help(d) # called on a Developer
Hi, my name is Alice, I am 70 years old.
How can I help?
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.
How can I help?
42. Calling super()?
You may have noticed that the greetings(dev::Developer) method could be improved, since it currently duplicates the implementation of the base method greetings(greeter). In Python, you would get rid of this duplication by calling the base class’s greetings() method, using super():
# PYTHON
class Developer(Person):
def __init__(self, name, age, language):
super().__init__(name, age)
self.language = language
def greetings(self):
super().greetings() # <== THIS!
print(f"My favorite language is {self.language}.")
d = Developer("Amy", 40, "Julia")
d.greetings()
In Julia, you can do something pretty similar, although you have to implement your own super() function, as it is not part of the language:
super(dev::Developer) = Person(dev.name, dev.age)
function greetings(dev::Developer)
greetings(super(dev))
println("My favorite language is $(dev.language).")
end
greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.
However, this implementation creates a new Person instance when calling super(dev), copying the name and age fields. That’s okay for small objects, but it’s not ideal for larger ones. Instead, you can explicitly call the specific method you want by using the invoke() function:
function greetings(dev::Developer)
invoke(greetings, Tuple{Any}, dev)
println("My favorite language is $(dev.language).")
end
greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.
The invoke()
function expects the following arguments:
* The first argument is the function to call.
* The second argument is the type of the desired method’s arguments tuple: Tuple{TypeArg1, TypeArg2, etc.}
. In this case we want to call the base function, which takes a single Any
argument (the Any
type is implicit when no type is specified).
* Lastly, it takes all the arguments to be passed to the method. In this case, there’s just one: dev
.
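As a small illustration of the arguments listed above, here is a sketch using a hypothetical describe_pair() function (it is not part of this tutorial's code). It assumes two methods, one generic and one specialized for integers, and shows how invoke() selects the generic method even when the specialized one matches:

```julia
# Hypothetical function with a generic method and a more specific one.
describe_pair(x, y) = "generic: $x, $y"
describe_pair(x::Int, y::Int) = "ints: $x, $y"

# Normal dispatch picks the most specific method.
specific = describe_pair(1, 2)

# invoke() forces the method whose signature is (Any, Any).
generic = invoke(describe_pair, Tuple{Any, Any}, 1, 2)

println(specific)
println(generic)
```

The tuple type passed as the second argument has one entry per argument of the target method, which is why a two-argument method needs Tuple{Any, Any}.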
As you can see, we managed to get the same advantages Object-Oriented programming offers, without defining classes or using inheritance. This takes a bit of getting used to, but you might come to prefer this style of generic programming. Indeed, OO programming encourages you to bundle data and behavior together, but this is not always a good idea. Let’s look at one example:
# PYTHON
class Rectangle:
    def __init__(self, height, width):
        self.height = height
        self.width = width
    def area(self):
        return self.height * self.width
class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)
It makes sense for the Square
class to be a subclass of the Rectangle
class, since a square is a special type of rectangle. It also makes sense for the Square
class to inherit all of the Rectangle
class’s behavior, such as the area()
method. However, it does not really make sense for rectangles and squares to have the same memory representation: a Rectangle
needs two numbers (height
and width
), while a Square
only needs one (length
).
It’s possible to work around this issue like this:
# PYTHON
class Rectangle:
    def __init__(self, height, width):
        self.height = height
        self.width = width
    def area(self):
        return self.height * self.width
class Square(Rectangle):
    def __init__(self, length):
        self.length = length
    @property
    def width(self):
        return self.length
    @property
    def height(self):
        return self.length
That’s better: now, each square is only represented using a single number. We’ve inherited the behavior, but not the data.
In Julia, you could code this like so:
struct Rectangle
width
height
end
width(rect::Rectangle) = rect.width
height(rect::Rectangle) = rect.height
area(rect) = width(rect) * height(rect)
struct Square
length
end
width(sq::Square) = sq.length
height(sq::Square) = sq.length
height (generic function with 2 methods)
area(Square(5))
25
Notice that the area()
function relies on the getters width()
and height()
, rather than directly on the fields width
and height
. This way, the argument can be of any type at all, as long as it has these getters.
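To make the point about getters concrete, here is a minimal sketch with two hypothetical types, UnitSquare and TupleRect, which are assumptions made for illustration and do not appear elsewhere in this tutorial. Neither has width or height fields, yet both work with the generic area() function because they provide the two getters:

```julia
# Generic algorithm: it only assumes that width() and height() are defined.
area(shape) = width(shape) * height(shape)

# Hypothetical type with no fields at all.
struct UnitSquare end
width(::UnitSquare) = 1
height(::UnitSquare) = 1

# Hypothetical type that stores both dimensions in a single tuple field.
struct TupleRect
    dims::Tuple{Int, Int}
end
width(r::TupleRect) = r.dims[1]
height(r::TupleRect) = r.dims[2]

unit_area = area(UnitSquare())
rect_area = area(TupleRect((3, 4)))
println(unit_area)
println(rect_area)
```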
43. Abstract Types
One nice thing about the class hierarchy we defined in Python is that it makes it clear that a square is a kind of rectangle. Any new function you define that takes a Rectangle
as an argument will automatically accept a Square
as well, but no other non-rectangle type. In contrast, our area()
function currently accepts anything at all.
In Julia, a concrete type like Square
cannot extend another concrete type like Rectangle
. However, any type can extend an abstract type. Let’s define some abstract types to create a type hierarchy for our Square
and Rectangle
types.
abstract type AbstractShape end
abstract type AbstractRectangle <: AbstractShape end # <: means "subtype of"
abstract type AbstractSquare <: AbstractRectangle end
The <:
operator means “subtype of”.
Now we can attach the area()
function to the AbstractRectangle
type, instead of any type at all:
area(rect::AbstractRectangle) = width(rect) * height(rect)
area (generic function with 2 methods)
Now we can define the concrete types, as subtypes of AbstractRectangle
and AbstractSquare
:
struct Rectangle_v2 <: AbstractRectangle
width
height
end
width(rect::Rectangle_v2) = rect.width
height(rect::Rectangle_v2) = rect.height
struct Square_v2 <: AbstractSquare
length
end
width(sq::Square_v2) = sq.length
height(sq::Square_v2) = sq.length
height (generic function with 4 methods)
In short, the Julian approach to type hierarchies looks like this:
- Create a hierarchy of abstract types to represent the concepts you want to implement.
- Write functions for these abstract types. Much of your implementation can be coded at that level, manipulating abstract concepts.
- Lastly, create concrete types, and write the methods needed to give them the behavior that is expected by the generic algorithms you wrote.
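The three steps above can be sketched on a tiny example. Everything in it (AbstractPet, Dog, Goldfish, pet_name(), sound() and introduce()) is hypothetical and only meant to show the shape of the approach:

```julia
# Step 1: an abstract type representing the concept.
abstract type AbstractPet end

# Step 2: a generic algorithm written against the abstract type.
# It assumes every concrete pet defines pet_name() and sound().
introduce(pet::AbstractPet) = "$(pet_name(pet)) says $(sound(pet))"

# Step 3: concrete types, plus the methods the generic code expects.
struct Dog <: AbstractPet
    name::String
end
pet_name(d::Dog) = d.name
sound(::Dog) = "Woof"

struct Goldfish <: AbstractPet end
pet_name(::Goldfish) = "Bubbles"
sound(::Goldfish) = "..."

dog_intro = introduce(Dog("Rex"))
fish_intro = introduce(Goldfish())
println(dog_intro)
println(fish_intro)
```

Because introduce() is restricted to AbstractPet, calling it with an unrelated value such as a string raises a MethodError instead of failing somewhere inside the function.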
This pattern is used everywhere in Julia’s standard libraries. For example, here are the supertypes of Float64
and Int64
:
Base.show_supertypes(Float64)
Float64 <: AbstractFloat <: Real <: Number <: Any
Base.show_supertypes(Int64)
Int64 <: Signed <: Integer <: Real <: Number <: Any
Note: Julia implicitly runs using Core
and using Base
when starting the REPL. However, the show_supertypes()
function is not exported by the Base
module, thus you cannot access it by just typing show_supertypes(Float64)
. Instead, you have to specify the module name: Base.show_supertypes(Float64)
.
And here is the whole hierarchy of Number
types:
function show_hierarchy(root, indent=0)
println(repeat(" ", indent * 4), root)
for subtype in subtypes(root)
show_hierarchy(subtype, indent + 1)
end
end
show_hierarchy(Number)
Number
    Complex
    Real
        AbstractFloat
            BigFloat
            Float16
            Float32
            Float64
        AbstractIrrational
            Irrational
        FixedPointNumbers.FixedPoint
            FixedPointNumbers.Fixed
            FixedPointNumbers.Normed
        Integer
            Bool
            Signed
                BigInt
                Int128
                Int16
                Int32
                Int64
                Int8
            Unsigned
                UInt128
                UInt16
                UInt32
                UInt64
                UInt8
        Rational
44. Iterator Interface
You will sometimes want to provide a way to iterate over your custom types. In Python, this requires defining the __iter__()
method which should return an object which implements the __next__()
method. In Julia, you must define at least two functions:
* iterate(::YourIteratorType)
, which must return either nothing
if there are no values in the sequence, or (first_value, iterator_state)
.
* iterate(::YourIteratorType, state)
, which must return either nothing
if there are no more values, or (next_value, new_iterator_state)
.
For example, let’s create a simple iterator for the Fibonacci sequence:
struct FibonacciIterator end
import Base.iterate
iterate(f::FibonacciIterator) = (1, (1, 1))
function iterate(f::FibonacciIterator, state)
new_state = (state[2], state[1] + state[2])
(new_state[1], new_state)
end
iterate (generic function with 224 methods)
Now we can iterate over a FibonacciIterator
instance:
for f in FibonacciIterator()
println(f)
f > 10 && break
end
1
1
2
3
5
8
13
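The Fibonacci iterator never ends, so it cannot be passed to functions like collect(). As a complementary sketch, here is a hypothetical finite iterator named Countdown (an assumption for illustration). It returns nothing when the sequence is exhausted, and it also defines the optional length() and eltype() methods so that collect() can build a typed array:

```julia
# Hypothetical finite iterator: counts down from `start` to 1.
struct Countdown
    start::Int
end

# One method with a default argument covers both required signatures:
# iterate(c) and iterate(c, state). The state is the next value to emit.
Base.iterate(c::Countdown, state=c.start) =
    state < 1 ? nothing : (state, state - 1)

# Optional methods: they tell collect() the size and element type.
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

collected = collect(Countdown(3))
total = sum(Countdown(4))
println(collected)
println(total)
```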
45. Indexing Interface
You can also create a type that will be indexable like an array (allowing syntax like a[5] = 3
). In Python, this requires implementing the __getitem__()
and __setitem__()
methods. In Julia, you must implement the getindex(A::YourType, i)
, setindex!(A::YourType, v, i)
, firstindex(A::YourType)
and lastindex(A::YourType)
methods.
struct MySquares end
import Base.getindex, Base.firstindex
getindex(::MySquares, i) = i^2
firstindex(::MySquares) = 0
S = MySquares()
S[10]
100
S[begin]
0
getindex(S::MySquares, r::UnitRange) = [S[i] for i in r]
getindex (generic function with 228 methods)
S[1:4]
4-element Array{Int64,1}:
1
4
9
16
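The MySquares example only implements getindex() and firstindex(). The following sketch covers the other two methods mentioned above, setindex!() and lastindex(), on a hypothetical Shelf type that wraps a vector of strings (the type and its field are assumptions made for illustration):

```julia
# Hypothetical wrapper type around a vector of strings.
struct Shelf
    items::Vector{String}
end

Base.getindex(s::Shelf, i::Int) = s.items[i]
Base.setindex!(s::Shelf, v, i::Int) = (s.items[i] = v)
Base.firstindex(s::Shelf) = 1
Base.lastindex(s::Shelf) = length(s.items)

shelf = Shelf(["a", "b", "c"])
shelf[2] = "B"              # calls setindex!(shelf, "B", 2)
first_item = shelf[begin]   # calls getindex(shelf, firstindex(shelf))
last_item = shelf[end]      # calls getindex(shelf, lastindex(shelf))
println(shelf.items)
```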
For more details on these interfaces, and to learn how to build full-blown array types with broadcasting and more, check out this page.
46. Creating a Number Type
Let’s create a MyRational
struct and try to make it mimic the built-in Rational
type:
struct MyRational <: Real
num # numerator
den # denominator
end
MyRational(2, 3)
MyRational(2, 3)
It would be more convenient and readable if we could type 2 ⨸ 3
to create a MyRational
:
function ⨸(num, den)
MyRational(num, den)
end
⨸ (generic function with 1 method)
2 ⨸ 3
MyRational(2, 3)
I chose ⨸
because it’s a symbol that Julia’s parser treats as a binary operator, but which is otherwise not used by Julia (see the full list of parsed symbols and their priorities). This particular symbol will have the same priority as multiplication and division.
If you want to know how to type it and check that it is unused, type ?⨸
(copy/paste the symbol):
?⨸
"⨸" can be typed by \odiv
search: ⨸
No documentation found.
⨸
is a Function
.
# 1 method for generic function "⨸":
[1] ⨸(num, den) in Main at In[227]:2
Now let’s make it possible to add two MyRational
values. We want it to be possible for our MyRational
type to be used in existing algorithms which rely on +
, so we must create a new method for the Base.+
function:
import Base.+
function +(r1::MyRational, r2::MyRational)
(r1.num * r2.den + r1.den * r2.num) ⨸ (r1.den * r2.den)
end
+ (generic function with 173 methods)
2 ⨸ 3 + 3 ⨸ 5
MyRational(19, 15)
It’s important to import Base.+
first, or else you would just be defining a new +
function in the current module (Main
), which would not be called by existing algorithms.
You can easily implement *
, ^
and so on, in much the same way.
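As a rough sketch of what such methods could look like, here is a self-contained version using a hypothetical Frac struct (a stand-in for MyRational, so that the snippet runs on its own). Like the + method above, it does not reduce fractions, so the == method compares by cross-multiplication:

```julia
# Hypothetical stand-in for MyRational, so this snippet is self-contained.
struct Frac
    num::Int
    den::Int
end

# Qualifying the operator with Base extends the existing function,
# which has the same effect as importing it first.
Base.:*(a::Frac, b::Frac) = Frac(a.num * b.num, a.den * b.den)
Base.:-(a::Frac, b::Frac) = Frac(a.num * b.den - a.den * b.num, a.den * b.den)

# Fractions are not reduced here, so compare by cross-multiplication.
Base.:(==)(a::Frac, b::Frac) = a.num * b.den == a.den * b.num

product = Frac(2, 3) * Frac(3, 5)
difference = Frac(2, 3) - Frac(1, 3)
println(product)
println(difference)
```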
Let’s change the way MyRational
values are printed, to make them look a bit nicer. For this, we must create a new method for the Base.show(io::IO, x)
function:
import Base.show
function show(io::IO, r::MyRational)
print(io, "$(r.num) ⨸ $(r.den)")
end
2 ⨸ 3 + 3 ⨸ 5
19 ⨸ 15
We can expand the show()
function so it can provide an HTML representation for MyRational
values. This will be called by the display()
function in Jupyter or Colab:
function show(io::IO, ::MIME"text/html", r::MyRational)
print(io, "<sup><b>$(r.num)</b></sup>⁄<sub><b>$(r.den)</b></sub>")
end
2 ⨸ 3 + 3 ⨸ 5
19⁄15
Next, we want to be able to perform any operation involving MyRational
values and values of other Number
types. For example, we may want to multiply integers and MyRational
values. One option is to define a new method like this:
import Base.*
function *(r::MyRational, i::Integer)
(r.num * i) ⨸ r.den
end
2 ⨸ 3 * 5
10⁄3
Since multiplication is commutative, we need the reverse method as well:
function *(i::Integer, r::MyRational)
r * i # this will call the previous method
end
5 * (2 ⨸ 3) # we need the parentheses since * and ⨸ have the same priority
10⁄3
It’s cumbersome to have to define these methods for every operation. There’s a better way, which we will explore in the next two sections.
47. Conversion
It is possible to provide a way for integers to be automatically converted to MyRational
values:
import Base.convert
MyRational(x::Integer) = MyRational(x, 1)
convert(::Type{MyRational}, x::Integer) = MyRational(x)
convert(MyRational, 42)
42⁄1
The Type{MyRational}
type is a special type which has a single instance: the MyRational
type itself. So this convert()
method only accepts MyRational
itself as its first argument (and we don’t actually use the first argument, so we don’t even need to give it a name in the function declaration).
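To see how dispatching on Type{...} behaves, here is a small sketch with a hypothetical type_label() function (not part of this tutorial's code). Each Type{X} method matches only when the type X itself is passed as the argument:

```julia
# Hypothetical function: the first two methods match only the types themselves.
type_label(::Type{Int64}) = "the Int64 type itself"
type_label(::Type{String}) = "the String type itself"

# Fallback for any other argument, including other types such as Float64.
type_label(x) = "a value of type $(typeof(x))"

label_for_type = type_label(Int64)
label_for_value = type_label(64)
label_for_other_type = type_label(Float64)
println(label_for_type)
println(label_for_value)
println(label_for_other_type)
```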
Now integers will be automatically converted to MyRational
values when you assign them to an array whose element type is MyRational
:
a = [2 ⨸ 3] # the element type is MyRational
a[1] = 5 # convert(MyRational, 5) is called automatically
push!(a, 6) # convert(MyRational, 6) is called automatically
println(a)
MyRational[5 ⨸ 1, 6 ⨸ 1]
Conversion will also occur automatically in these cases:
* r::MyRational = 42
: assigning an integer to r
where r
is a local variable with a declared type of MyRational
.
* s.b = 42
if s
is a struct and b
is a field of type MyRational
(also when calling new(42)
on that struct, assuming b
is the first field).
* return 42
if the return type is declared as MyRational
(e.g., function f(x)::MyRational ... end
).
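These three cases can be tried out with built-in types. The sketch below uses Float64 instead of MyRational so that it runs on its own; Parcel, typed_local() and half() are hypothetical names introduced only for this illustration:

```julia
# Hypothetical struct with a typed field.
struct Parcel
    weight::Float64
end

function typed_local()
    w::Float64 = 3     # convert(Float64, 3) is called on assignment
    w
end

# The Rational result of x // 2 is converted to Float64 on return.
half(x)::Float64 = x // 2

parcel = Parcel(2)     # the default constructor converts 2 to 2.0
println(parcel.weight)
println(typed_local())
println(half(3))
```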
However, there is no automatic conversion when calling functions:
function for_my_rationals_only(x::MyRational)
println("It works:", x)
end
try
for_my_rationals_only(42)
catch ex
ex
end
MethodError(for_my_rationals_only, (42,), 0x0000000000006a8f)
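If you do want a function to accept convertible values, one option is to add a method that converts explicitly. Here is a minimal sketch under that assumption, using a hypothetical Celsius type and report() function rather than MyRational, so that it is self-contained:

```julia
# Hypothetical type, used instead of MyRational to keep this self-contained.
struct Celsius
    degrees::Float64
end
Base.convert(::Type{Celsius}, x::Real) = Celsius(x)

report(t::Celsius) = "It is $(t.degrees)°C"

# Extra method: convert explicitly, then call the Celsius method.
report(x::Real) = report(convert(Celsius, x))

message = report(21)

# Assigning into a typed array still converts automatically.
temps = Celsius[]
push!(temps, 18)
println(message)
println(temps)
```

This keeps the conversion visible at the function boundary, rather than relying on dispatch to do it implicitly.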
48. Promotion
The Base
functions +
, -
, *
, /
, ^
, etc. all use a “promotion” algorithm to convert the arguments to the appropriate type. For example, adding an integer and a float promotes the integer to a float before the addition takes place. These functions use the promote()
function for this. For example, given several integers and a float, all integers get promoted to floats:
promote(1, 2, 3, 4.0)
(1.0, 2.0, 3.0, 4.0)
This is why a sum of integers and floats results in a float:
1 + 2 + 3 + 4.0
10.0
The promote()
function is also called when creating an array. For example, the following array is a Float64
array:
a = [1, 2, 3, 4.0]
4-element Array{Float64,1}:
1.0
2.0
3.0
4.0
What about the MyRational
type? Rather than create new methods for the promote()
function, the recommended approach is to create a new method for the promote_rule()
function. It takes two types and returns the type to convert to:
promote_rule(Float64, Int64)
Float64
Let’s implement a new method for this function, to make sure that any subtype of the Integer
type will be promoted to MyRational
:
import Base.promote_rule
promote_rule(::Type{MyRational}, ::Type{T}) where {T <: Integer} = MyRational
promote_rule (generic function with 141 methods)
This method definition uses parametric types: the type T
can be any type at all, as long as it is a subtype of the Integer
abstract type. If you tried to define the method promote_rule(::Type{MyRational}, ::Type{Integer})
, it would expect the type Integer
itself as the second argument, which would not work, since the promote_rule()
function will usually be called with concrete types like Int64
as its arguments.
Let’s check that it works:
promote(5, 2 ⨸ 3)
(5 ⨸ 1, 2 ⨸ 3)
Yep! Now whenever we call +
, -
, etc., with an integer and a MyRational
value, the integer will get automatically promoted to a MyRational
value:
5 + 2 ⨸ 3
17⁄3
Under the hood:
* this called +(5, 2 ⨸ 3)
,
* which called the +(::Number, ::Number)
method (thanks to multiple dispatch),
* which called promote(5, 2 ⨸ 3)
,
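* which looked up the promotion rule by calling promote_rule() with both argument orders, (Int64, MyRational) and (MyRational, Int64)
,
* the latter of which matched our promote_rule(::Type{MyRational}, ::Type{T}) where {T <: Integer}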
,
* which returned MyRational
,
* then the +(::Number, ::Number)
method called convert(MyRational, 5)
,
* which called MyRational(5)
,
* which returned MyRational(5, 1)
,
* and finally +(::Number, ::Number)
called +(MyRational(5, 1), MyRational(2, 3))
,
* which returned MyRational(17, 3)
.
The benefit of this approach is that we only need to implement the +
, -
, etc. functions for pairs of MyRational
values, not with all combinations of MyRational
values and integers.
If your head hurts, it’s perfectly normal. 😉 Writing a new type that is easy to use, flexible and plays nicely with existing types takes a bit of planning and work, but the point is that you will not write these every day, and once you have, they will make your life much easier.
Now let’s handle the case where we want to execute operations with MyRational
values and floats. In this case, we naturally want to promote the MyRational
value to a float. We first need to define how to convert a MyRational
value to any subtype of AbstractFloat
:
convert(::Type{T}, x::MyRational) where {T <: AbstractFloat} = T(x.num / x.den)
convert (generic function with 246 methods)
This convert()
works with any type T
which is a subtype of AbstractFloat
. It just computes x.num / x.den
and converts the result to type T
. Let’s try it:
convert(Float64, 3 ⨸ 2)
1.5
Now let’s define a promote_rule()
method which will work for any type T
which is a subtype of AbstractFloat
, and which will give priority to T
over MyRational
:
promote_rule(::Type{MyRational}, ::Type{T}) where {T <: AbstractFloat} = T
promote_rule (generic function with 142 methods)
promote(1 ⨸ 2, 4.0)
(0.5, 4.0)
Now we can combine floats and MyRational
values easily:
2.25 ^ (1 ⨸ 2)
1.5
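To recap the whole mechanism in one self-contained place, here is a minimal sketch with a hypothetical Cents type (an assumption for illustration, unrelated to MyRational). It defines + only for pairs of Cents values, plus one convert() method and one promote_rule() method, and relies on the generic +(::Number, ::Number) fallback for mixed arguments:

```julia
# Hypothetical number type holding an amount of money in cents.
struct Cents <: Number
    value::Int
end

# Arithmetic is only defined for pairs of Cents values.
Base.:+(a::Cents, b::Cents) = Cents(a.value + b.value)

# How to turn an integer into Cents, and which type wins when mixing them.
Base.convert(::Type{Cents}, x::Integer) = Cents(x)
Base.promote_rule(::Type{Cents}, ::Type{T}) where {T <: Integer} = Cents

total = Cents(250) + 100     # 100 is promoted to Cents(100) first
reversed = 100 + Cents(250)  # the argument order does not matter
println(total)
println(reversed)
```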
49. Parametric Types and Functions
Julia’s Rational
type is actually a parametric type which ensures that the numerator and denominator have the same type T
, subtype of Integer
. Here’s a new version of our rational struct which enforces the same constraint:
struct MyRational2{T <: Integer}
num::T
den::T
end
To instantiate this type, we can specify the type T
:
MyRational2{BigInt}(2, 3)
MyRational2{BigInt}(2, 3)
Alternatively, we can use the MyRational2
type’s default constructor, with two integers of the same type:
MyRational2(2, 3)
MyRational2{Int64}(2, 3)
If we want to be able to construct a MyRational2
with integers of different types, we must write an appropriate constructor which handles the promotion rule:
function MyRational2(num::Integer, den::Integer)
MyRational2(promote(num, den)...)
end
MyRational2
This constructor accepts two integers of potentially different types, and promotes them to the same type. Then it calls the default MyRational2
constructor which expects two arguments of the same type. The syntax f(args...)
is analogous to Python’s f(*args)
.
Let’s see if this works:
MyRational2(2, BigInt(3))
MyRational2{BigInt}(2, 3)
Great!
Note that all parametrized types such as MyRational2{Int64}
or MyRational2{BigInt}
are subtypes of MyRational2
. So if a function accepts a MyRational2
argument, you can pass it an instance of any specific, parametrized type:
function for_any_my_rational2(x::MyRational2)
println(x)
end
for_any_my_rational2(MyRational2{BigInt}(1, 2))
for_any_my_rational2(MyRational2{Int64}(1, 2))
MyRational2{BigInt}(1, 2)
MyRational2{Int64}(1, 2)
A more explicit (but verbose) syntax for this function is:
function for_any_my_rational2(x::MyRational2{T}) where {T <: Integer}
println(x)
end
for_any_my_rational2 (generic function with 1 method)
It’s useful to think of types as sets. For example, the Int64
type represents the set of all 64-bit integer values, so 42 isa Int64
:
* When x
is an instance of some type T
, it is an element of the set T
represents, and x isa T
.
* When U
is a subtype of V
, U
is a subset of V
, and U <: V
.
The MyRational2
type itself (without any parameter) represents the set of all values of MyRational2{T}
for all subtypes T
of Integer
. In other words, it is the union of all the MyRational2{T}
types. This is called a UnionAll
type, and indeed the type MyRational2
itself is an instance of the UnionAll
type:
@assert MyRational2{BigInt}(2, 3) isa MyRational2{BigInt}
@assert MyRational2{BigInt}(2, 3) isa MyRational2
@assert MyRational2 === (MyRational2{T} where {T <: Integer})
@assert MyRational2{BigInt} <: MyRational2
@assert MyRational2 isa UnionAll
If we dump the MyRational2
type, we can see that it is a UnionAll
instance, with a parameter type T
, constrained to a subtype of the Integer
type (since the upper bound ub
is Integer
):
dump(MyRational2)
UnionAll
var: TypeVar
name: Symbol T
lb: Union{}
ub: Integer <: Real
body: MyRational2{T<:Integer} <: Any
num::T
den::T
There’s a lot more to learn about Julia types. When you feel ready to explore this in more depth, check out this page. You can also take a look at the source code of Julia’s rationals.
50. Writing/Reading Files
The do
syntax we saw earlier is helpful when using the open()
function:
open("test.txt", "w") do f
write(f, "This is a test.\n")
write(f, "I repeat, this is a test.\n")
end
open("test.txt") do f
for line in eachline(f)
println("[$line]")
end
end
[This is a test.]
[I repeat, this is a test.]
The open()
function automatically closes the file at the end of the block. Notice that the line feeds \n
at the end of each line are not returned by the eachline()
function. So the equivalent Python code is:
# PYTHON
with open("test.txt", "w") as f:
    f.write("This is a test.\n")
    f.write("I repeat, this is a test.\n")
with open("test.txt") as f:
    for line in f.readlines():
        line = line.rstrip("\n")
        print(f"[{line}]")
Alternatively, you can read the whole file into a string:
open("test.txt") do f
s = read(f, String)
end
"This is a test.\nI repeat, this is a test.\n"
Or more concisely:
s = read("test.txt", String)
"This is a test.\nI repeat, this is a test.\n"
The Python equivalent is:
# PYTHON
with open("test.txt") as f:
    s = f.read()
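A few more file operations, sketched here on a temporary file so that nothing is left behind. The variable names are arbitrary; the snippet assumes only standard Base functions (tempname(), write(), readlines(), open() and rm()):

```julia
path = tempname()                 # a fresh temporary file path

# write(filename, data) opens the file, writes to it and closes it.
write(path, "first\nsecond\n")

# readlines() returns one string per line, without the trailing "\n".
before = readlines(path)

# The "a" mode appends to the file instead of truncating it.
open(path, "a") do f
    println(f, "third")
end
after = readlines(path)

rm(path)                          # clean up
println(before)
println(after)
```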
51. Exceptions
Julia’s exceptions behave very much like in Python:
a = [1]
try
push!(a, 2)
#throw("Oops") # try uncommenting this line
push!(a, 3)
catch ex
println(ex)
push!(a, 4)
finally
push!(a, 5)
end
println(a)
[1, 2, 3, 5]
The equivalent Python code is:
# PYTHON
a = [1]
try:
    a.append(2)
    #raise Exception("Oops") # try uncommenting this line
    a.append(3)
except Exception as ex:
    print(ex)
    a.append(4)
finally:
    a.append(5)
print(a)
There is a whole hierarchy of standard exceptions which can be thrown, just like in Python. For example:
choice = 1 # try changing this value (from 1 to 4)
try
choice == 1 && open("/foo/bar/i_dont_exist.txt")
choice == 2 && sqrt(-1)
choice == 3 && push!(a, "Oops")
println("Everything worked like a charm")
catch ex
if ex isa SystemError
println("Oops. System error #$(ex.errnum) ($(ex.prefix))")
elseif ex isa DomainError
println("Oh no, I could not compute sqrt(-1)")
else
println("I got an unexpected error: $ex")
end
end
Oops. System error #2 (opening file "/foo/bar/i_dont_exist.txt")
Compare this with Python’s equivalent code:
# PYTHON
import math
choice = 3 # try changing this value (from 1 to 4)
try:
    if choice == 1:
        open("/foo/bar/i_dont_exist.txt")
    if choice == 2:
        math.sqrt(-1)
    if choice == 3:
        #a.append("Ok") # this would actually work
        raise TypeError("Oops") # so let's fail manually
    print("Everything worked like a charm")
except OSError as ex:
    print(f"Oops. OS error #{ex.errno} ({ex.strerror})")
except ValueError:
    print("Oh no, I could not compute sqrt(-1)")
except Exception as ex:
    print(f"I got an unexpected error: {ex}")
A few things to note here:
- Julia only allows a single catch block, which handles all possible exceptions.
- obj isa SomeClass is shorthand for isa(obj, SomeClass), which is equivalent to Python’s isinstance(obj, SomeClass).
Julia | Python
---|---
try ... catch ex if ex isa SomeError ... else ... end finally ... end | try: ... except SomeException as ex: ... except Exception as ex: ... finally: ...
throw(any_value) | raise SomeException(...)
obj isa SomeType or isa(obj, SomeType) | isinstance(obj, SomeType)
Note that the Julia version used here (1.4) does not support the equivalent of Python’s try / except / else
construct (an else clause was only added to try blocks in Julia 1.8). You need to write something like this:
catch_exception = true
try
println("Try something")
#error("ERROR: Catch me!") # try uncommenting this line
catch_exception = false
#error("ERROR: Don't catch me!") # try uncommenting this line
println("No error occurred")
catch ex
if catch_exception
println("I caught this exception: $ex")
else
throw(ex)
end
finally
println("The end")
end
println("After the end")
Try something
No error occurred
The end
After the end
The equivalent Python code is shorter, but it’s fairly uncommon:
# PYTHON
try:
    print("Try something")
    #raise Exception("Catch me!") # try uncommenting this line
except Exception as ex:
    print(f"I caught this exception: {ex}")
else:
    #raise Exception("Don't catch me!") # try uncommenting this line
    print("No error occurred")
finally:
    print("The end")
print("After the end")
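You can also define your own exception types by subtyping Exception, and let unexpected errors propagate with rethrow(). Here is a minimal sketch, in which InsufficientFunds and withdraw() are hypothetical names made up for the example:

```julia
# Hypothetical custom exception carrying some context.
struct InsufficientFunds <: Exception
    needed::Int
end

# Hypothetical function that throws the custom exception.
function withdraw(balance, amount)
    amount > balance && throw(InsufficientFunds(amount - balance))
    balance - amount
end

# A try block is an expression: its value is the value of the branch that ran.
result = try
    withdraw(100, 150)
catch ex
    if ex isa InsufficientFunds
        "short by $(ex.needed)"
    else
        rethrow()   # let any other exception propagate unchanged
    end
end
println(result)
```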
52. Docstrings
It’s good practice to add docstrings to every function you export. The docstring is placed just before the definition of the function:
"Compute the square of number x"
square(x::Number) = x^2
square
You can retrieve a function’s docstring using the @doc
macro:
@doc square
Compute the square of number x
The docstring is displayed when asking for help:
?square
search: square Square Square_v2 MySquares AbstractSquare lastdayofquarter
Compute the square of number x
Docstrings follow the Markdown format.
A typical docstring starts with the signature of the function, indented by 4 spaces, so it will get syntax highlighted as Julia code.
It also includes an Examples
section with Julia REPL outputs:
"""
    cube(x::Number)

Compute the cube of `x`.

# Examples
```julia-repl
julia> cube(5)
125
julia> cube(im)
0 - 1im
```
"""
cube(x) = x^3
cube
Instead of using julia-repl code blocks for the examples, you can use jldoctest to mark these examples as doctests (similar to Python's doctests).
The help gets nicely formatted:
?cube
search: cube Cdouble
cube(x::Number)
Compute the cube of x
.
Examples
julia> cube(5)
125
julia> cube(im)
0 - 1im
When there are several methods for a given function, it is common to give general information about the function in the first method (usually the most generic), and only add docstrings to other methods if they add useful information (without repeating the general info).
Alternatively, you may attach the general information to the function itself:
"""
    foo(x)

Compute the foo of the bar
"""
function foo end # declares the foo function
# foo(x::Number) behaves normally, no need for a docstring
foo(x::Number) = "baz"
"""
    foo(x::String)

For strings, compute the qux of the bar instead.
"""
foo(x::String) = "qux"
foo
?foo
search: foo floor pointer_from_objref waitforbuttonpress OverflowError
foo(x)
Compute the foo of the bar
foo(x::String)
For strings, compute the qux of the bar instead.
54. Macros
We have seen a few macros already: @which
, @assert
, @time
, @benchmark
, @btime
and @doc
. You guessed it: all macros start with an @
sign.
What is a macro? It is a function which can fully inspect the expression that follows it, and apply any transformation to that code at parse time, before compilation.
This makes it possible for anyone to effectively extend the language in any way they please. Whereas C/C++ macros just do simple text replacement, Julia macros are powerful meta-programming tools.
On the flip side, this also means that each macro has its own syntax and behavior.
A personal opinion: in my experience, languages that provide great flexibility typically attract a community of programmers with a tinkering mindset, who will love to experiment with all the fun features the language has to offer. This is great for creativity, but it can also be a nuisance if the community ends up producing too much experimental code, without much care for code reliability, API stability, or even for simplicity. By all means, let’s be creative, let’s experiment, but with great power comes great responsibility: let’s also value reliability, stability and simplicity.
That said, to give you an idea of what macro definitions look like in Julia, here’s a simple toy macro that replaces a + b
expressions with a - b
, and leaves other expressions alone.
macro addtosub(x)
if x.head == :call && x.args[1] == :+ && length(x.args) == 3
Expr(:call, :-, x.args[2], x.args[3])
else
x
end
end
@addtosub 10 + 2
8
In this macro definition, :call
, :+
and :-
are symbols. These are similar to strings, only more efficient and less flexible. They are typically used as identifiers, such as keys in dictionaries.
If you’re curious, the macro works because the parser converts 10 + 2
to Expr(:call, :+, 10, 2)
and passes this expression to the macro (before compilation). The if
statement checks that the expression is a function call, where the called function is the +
function, with two arguments. If so, then the macro returns a new expression, corresponding to a call to the -
function, with the same arguments. So a + b
becomes a - b
.
For more info, check out this page.
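For a slightly more practical flavor, here is a sketch of a hypothetical @unless macro (an assumption for illustration, not a standard Julia macro). It builds its result with a quote block instead of calling Expr() directly, and uses esc() so that the condition and body are evaluated in the caller's scope. The last lines inspect the expression tree that the parser produces for 10 + 2:

```julia
# Hypothetical macro: evaluate `body` only when `cond` is false.
macro unless(cond, body)
    quote
        if !($(esc(cond)))
            $(esc(body))
        end
    end
end

x = 3
ran = @unless x > 5 "condition was false"      # the body is evaluated
skipped = @unless x > 1 "condition was false"  # the body is skipped

# The parser's view of 10 + 2: a :call expression with three arguments.
ex = :(10 + 2)
println(ex.head, " ", ex.args)
println(ran)
```

When the condition is true, the if expression has no value to return, so the macro call evaluates to nothing.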
55. Special Prefixed Strings
py"..."
strings are defined by the PyCall
module. Writing py"something"
is equivalent to writing @py_str "something"
. In other words, anyone can write a macro that defines a new kind of prefixed string. For example, if you write the @ok_str
macro, it will be called when you write ok"something"
.
Another example is the Pkg
module which defines the @pkg_str
macro: this is why you can use pkg"..."
to interact with the Pkg
module. This is how pkg"add PyCall; precompile;"
worked (at the end of the very first cell). This downloaded, installed and precompiled the PyCall
module.
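Defining your own prefixed string takes only a few lines. In this sketch, @shout_str is a hypothetical macro name chosen for the example. The macro receives the raw string and returns the value that replaces the literal:

```julia
# Hypothetical macro: shout"..." produces an upper-case string with a "!".
macro shout_str(s)
    uppercase(s) * "!"
end

greeting = shout"hello"                      # same as: @shout_str "hello"
same = (@shout_str "hello") == greeting
println(greeting)
```

Since the macro runs when the code is parsed, the transformation here happens once, before the surrounding code is compiled.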
56. Modules
In Python, a module must be defined in a dedicated file. In Julia, modules are independent from the file system. You can define several modules per file, or define one module across multiple files; it’s up to you. Let’s create a simple module containing two submodules, each containing a variable and a function:
module ModA
pi = 3.14
square(x) = x^2
module ModB
e = 2.718
cube(x) = x^3
end
module ModC
root2 = √2
relu(x) = max(0, x)
end
end
Main.ModA
The default module is Main
, so whatever we define is put in this module (except when defining a package, as we will see). This is why the full name of ModA
is Main.ModA
.
We can now access the contents of these modules by providing the full paths:
Main.ModA.ModC.root2
1.4142135623730951
Since our code runs in the Main
module, we can leave out the Main.
part:
ModA.ModC.root2
1.4142135623730951
Alternatively, you can use import
:
import Main.ModA.ModC.root2
root2
1.4142135623730951
Or we can use import
with a relative path. In this case, we need to prefix ModA
with a dot .
to indicate that we want the module ModA
located in the current module:
import .ModA.ModC.root2
root2
1.4142135623730951
Alternatively, we can import
the submodule:
import .ModA.ModC
ModC.root2
1.4142135623730951
When you want to import more than one name from a module, you can use this syntax:
import .ModA.ModC: root2, relu
This is equivalent to this more verbose syntax:
import .ModA.ModC.root2, .ModA.ModC.relu
Nested modules do not automatically have access to names in enclosing modules. To import names from a parent module, use ..x
. From a grand-parent module, use ...x
, and so on.
module ModD
d = 1
module ModE
try
println(d)
catch ex
println(ex)
end
end
module ModF
f = 2
module ModG
import ..f
import ...d
println(f)
println(d)
end
end
end
UndefVarError(:d)
2
1
Main.ModD
Instead of import
, you can use using
. It is analogous to Python’s from foo import *
. It only gives access to names which were explicitly exported using export
(similar to the way from foo import *
in Python only imports names listed in the module’s __all__
list):
module ModH
h1 = 1
h2 = 2
export h1
end
Main.ModH
using .ModH
println(h1)
try
println(h2)
catch ex
ex
end
1
UndefVarError(:h2)
Note that using Foo
not only imports all exported names (like Python’s from foo import *
), it also imports Foo
itself (similarly, using Foo.Bar
imports Bar
itself):
ModH
Main.ModH
Even if a name is not exported, you can always access it using its full path, or using import
:
ModH.h2
2
import .ModH.h2
h2
2
You can also import individual names like this:
module ModG
g1 = 1
g2 = 2
export g2
end
using .ModG: g1, g2
println(g1)
println(g2)
1
2
Notice that this syntax gives you access to any name you want, whether or not it was exported. In other words, whether a name is exported or not only affects the using Foo
syntax.
Importantly, when you want to extend a function which is defined in a module, you must import the function using import
, or you must specify the function’s path:
module ModH
double(x) = x * 2
triple(x) = x * 3
end
import .ModH: double
double(x::AbstractString) = repeat(x, 2)
ModH.triple(x::AbstractString) = repeat(x, 3)
println(double(2))
println(double("Two"))
println(ModH.triple(3))
println(ModH.triple("Three"))
4
TwoTwo
9
ThreeThreeThree
WARNING: replacing module ModH.
You must never extend a function imported with using
, unless you provide the function’s path:
module ModI
quadruple(x) = x * 4
export quadruple
end
using .ModI
ModI.quadruple(x::AbstractString) = repeat(x, 4) # OK
println(quadruple(4))
println(quadruple("Four"))
#quadruple(x::AbstractString) = repeat(x, 4) # uncomment to see the error
16
FourFourFourFour
In the Julia version used here (1.4), there is no equivalent of Python’s import foo as x
(the import Foo as X syntax was only added in Julia 1.6), but you can do something like this:
import .ModI: quadruple
x = quadruple
quadruple (generic function with 2 methods)
In general, a module named Foo
will be defined in a file named Foo.jl
(along with its submodules). However, if the module becomes too big for a single file, you can split it into multiple files and include these files in Foo.jl
using the include()
function.
For example, let’s create three files: Awesome.jl
, great.jl
and amazing/Fantastic.jl
, where:
* Awesome.jl
defines the Awesome
module and includes the other two files
* great.jl
just defines a function
* amazing/Fantastic.jl
defines the Fantastic
submodule
code_awesome = """
module Awesome
include("great.jl")
include("amazing/Fantastic.jl")
end
"""
code_great = """
great() = "This is great!"
"""
code_fantastic = """
module Fantastic
fantastic = true
end
"""
open(f->write(f, code_awesome), "Awesome.jl", "w")
open(f->write(f, code_great), "great.jl", "w")
mkdir("amazing")
open(f->write(f, code_fantastic), "amazing/Fantastic.jl", "w")
38
If we try to execute import Awesome
now, it won’t work since Julia does not search in the current directory by default. Let’s change this:
pushfirst!(LOAD_PATH, ".")
4-element Array{String,1}:
"."
"@"
"@v#.#"
"@stdlib"
Now when we import the Awesome
module, Julia will look for a file named Awesome.jl
in the current directory, or for Awesome/src/Awesome.jl
, or for Awesome.jl/src/Awesome.jl
. If it does not find any of these, it will look in the other places listed in the LOAD_PATH
array (we will discuss this in more detail in the “Package Management” section).
import Awesome
println(Awesome.great())
println("Is fantastic? ", Awesome.Fantastic.fantastic)
┌ Info: Precompiling Awesome [top-level]
└ @ Base loading.jl:1260
This is great!
Is fantastic? true
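When you modify LOAD_PATH temporarily like this, a try/finally block is one way to make sure the original value comes back even if something fails in between. The helper below is a hypothetical sketch (with_load_path is not part of Julia), shown only to illustrate the pattern:

```julia
# Hypothetical helper: temporarily put `dir` at the front of LOAD_PATH
function with_load_path(f, dir)
    pushfirst!(LOAD_PATH, dir)
    try
        return f()
    finally
        popfirst!(LOAD_PATH)   # always restore, even if f() throws
    end
end

before = copy(LOAD_PATH)
first_entry = with_load_path(".") do
    first(LOAD_PATH)           # inside the block, "." comes first
end
println(first_entry)           # .
println(LOAD_PATH == before)   # true
```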
Let’s restore the original LOAD_PATH
:
popfirst!(LOAD_PATH)
"."
In short:
Julia | Python |
---|---|
import Foo | import foo |
import Foo.Bar | from foo import bar |
import Foo.Bar: a, b | from foo.bar import a, b |
import Foo.Bar.a, Foo.Bar.b | from foo.bar import a, b |
import .Foo | from . import foo |
import ..Foo.Bar | from ..foo import bar |
import ...Foo.Bar | from ...foo import bar |
import .Foo: a, b | from .foo import a, b |
using Foo | from foo import *; import foo |
using Foo.Bar | from foo.bar import *; from foo import bar |
using Foo.Bar: a, b | from foo.bar import a, b |
Extending function Foo.f() | Result |
---|---|
import Foo.f (or import Foo: f) followed by f(x::Int64) = ... | OK |
import Foo followed by Foo.f(x::Int64) = ... | OK |
using Foo followed by Foo.f(x::Int64) = ... | OK |
import Foo.f (or import Foo: f) followed by Foo.f(x::Int64) = ... | ERROR: Foo not defined |
using Foo followed by f(x::Int64) = ... | ERROR: Foo.f must be explicitly imported |
using Foo: f followed by f(x::Int64) = ... | ERROR: Foo.f must be explicitly imported |
57. Scopes
Julia has two types of scopes: global and local.
Every module has its own global scope, independent from all other global scopes. There is no overarching global scope.
Modules, macros and types (including structs) can only be defined in a global scope.
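The two rules above can be sketched as follows. The module names ScopeA and ScopeB and the function make_type are hypothetical; the second part evaluates a string so that the expected failure can be caught at run time instead of stopping the whole cell:

```julia
module ScopeA            # hypothetical module names
counter = 1
end

module ScopeB
counter = 2
get_counter() = counter  # resolves to ScopeB's own global
end

counter = 3              # a third, unrelated global in the current module

println(ScopeA.counter)        # 1
println(ScopeB.get_counter())  # 2
println(counter)               # 3

# Type definitions are only accepted at the top level (global scope),
# so a struct nested inside a function body is rejected.
outcome = try
    eval(Meta.parse("function make_type()\n    struct Inner end\nend"))
    "accepted"
catch err
    "rejected"
end
println(outcome)
```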
Most code blocks, including function
, struct
, for
, while
, etc., have their own local scope. For example:
for q in 1:3
println(q)
end
try
println(q) # q is not available here
catch ex
ex
end
1
2
3
UndefVarError(:q)
A local scope inherits from its parent scope:
z = 5
for i in 1:3
w = 10
println(i * w * z) # i and w are local, z is from the parent scope
end
50
100
150
An inner scope can assign to a variable in the parent scope, if the parent scope is not global:
for i in 1:3
s = 0
for j in 1:5
s = j # variable s is from the parent scope
end
println(s)
end
5
5
5
You can force a variable to be local by using the local
keyword:
for i in 1:3
s = 0
for j in 1:5
local s = j # variable s is local now
end
println(s)
end
0
0
0
To assign to a global variable, you must declare the variable as global
in the local scope:
for i in 1:3
global p
p = i
end
p
3
There is one exception to this rule: when executing code directly in the REPL (since Julia 1.5) or in IJulia, you do not need to declare a variable as global
if the global variable already exists:
s = 0
for i in 1:3
s = i # implicitly global s: only in REPL Julia 1.5+ or IJulia
end
s
3
In functions, assigning to a variable which is not explicitly declared as global always makes it local (even in the REPL and IJulia):
s, t = 1, 2 # globals
function foo()
s = 10 * t # s is local, t is global
end
println(foo())
println(s)
20
1
Just like in Python, functions can capture variables from the enclosing scope (not from the scope the function is called from):
t = 1
foo() = t # foo() captures t from the global scope
function bar()
t = 5 # this is a new local variable
println(foo()) # foo() still uses t from the global scope
end
bar()
1
function quz()
global t
t = 5 # we change the global t
println(foo()) # and this affects foo()
end
quz()
5
Closures work much like in Python:
function create_multiplier(n)
function mul(x)
x * n # variable n is captured from the parent scope
end
end
mul2 = create_multiplier(2)
mul2(5)
10
An inner function can modify variables from its parent scope:
function create_counter()
c = 0
inc() = c += 1 # this inner function modifies the c from the outer function
end
cnt = create_counter()
println(cnt())
println(cnt())
1
2
Consider the following code, and see if you can figure out why it prints the same result multiple times:
funcs = []
i = 1
while i ≤ 5
push!(funcs, ()->i^2)
global i += 1
end
for fn in funcs
println(fn())
end
36
36
36
36
36
The answer is that there is a single variable i
, which is captured by all 5 closures. By the time these closures are executed, the value of i
is 6, so the square is 36, for every closure.
If we use a for
loop, we don’t have this problem, since a new local variable is created at every iteration:
funcs = []
for i in 1:5
push!(funcs, ()->i^2)
end
for fn in funcs
println(fn())
end
1
4
9
16
25
Any local variable created within a for
loop, a while
loop or a comprehension also gets a new copy at each iteration. So we could code the above example like this:
funcs = []
i = 1
while i ≤ 5 # since we are in a while loop...
global i
local j = i # ...and j is created here, it's a new `j` at each iteration
push!(funcs, ()->j^2)
i += 1
end
for fn in funcs
println(fn())
end
1
4
9
16
25
Another way to get the same result is to use a let
block, which also creates a new local variable every time it is executed:
funcs = []
i = 0
while i < 5
let i=i
push!(funcs, ()->i^2)
end
global i += 1
end
for fn in funcs
println(fn())
end
0
1
4
9
16
This let i=i
block defines a new local variable i
at every iteration, and initializes it with the value of i
from the parent scope. Therefore each closure captures a different local variable i
.
Variables in a let
block are initialized from left to right, so they can access variables on their left:
a = 1
let a=a+1, b=a
println("a=$a, b=$b")
end
a=2, b=2
In this example, the local variable a
is initialized with the value of a + 1
, where a
comes from the parent scope (i.e., it’s the global a
in this case). However, b
is initialized with the value of the local a
, since it now hides the variable a
from the parent scope.
Default values in function arguments also have this left-to-right scoping logic:
a = 1
foobar(a=a+1, b=a) = println("a=$a, b=$b")
foobar()
foobar(5)
a=2, b=2
a=5, b=5
In this example, the first argument’s default value is a + 1
, where a
comes from the parent scope (i.e., the global a
in this case). However, the second argument’s default value is a
, where a
in this case is the value of the first argument (not the parent scope’s a
).
Note that if
blocks and begin
blocks do not have their own local scope, they just use the parent scope:
a = 1
if true
a = 2 # same `a` as above
end
a
2
a = 1
begin
a = 2 # same `a` as above
end
a
2
57. Package Management
Basic Workflow
The simplest way to write a Julia program is to create a .jl
file somewhere and run it using julia
. You would usually do this with your favorite editor, but in this notebook we must do this programmatically. For example:
code = """
println("Hello world")
"""
open(f->write(f, code), "my_program1.jl", "w")
23
Then let’s run the program using a shell command:
;julia my_program1.jl
Hello world
If you need to use a package which is not part of the standard library, such as PyCall
, you first need to install it using Julia’s package manager Pkg
:
using Pkg
Pkg.add("PyCall")
Updating registry at `~/.julia/registries/General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
Resolving package versions...
Updating `~/.julia/environments/v1.4/Project.toml`
[no changes]
Updating `~/.julia/environments/v1.4/Manifest.toml`
[no changes]
Alternatively, in interactive mode, you can enter the Pkg
mode by typing ]
, then type a command:
]add PyCall
Resolving package versions...
Updating `~/.julia/environments/v1.4/Project.toml`
[no changes]
Updating `~/.julia/environments/v1.4/Manifest.toml`
[no changes]
You can also precompile the new package to avoid the compilation delay when the package is first used:
]add PyCall; precompile;
Resolving package versions...
Updating `~/.julia/environments/v1.4/Project.toml`
[no changes]
Updating `~/.julia/environments/v1.4/Manifest.toml`
[no changes]
Precompiling project...
One last alternative is to use pkg"..."
strings to run commands in your programs:
pkg"add PyCall; precompile;"
Resolving package versions...
Updating `~/.julia/environments/v1.4/Project.toml`
[no changes]
Updating `~/.julia/environments/v1.4/Manifest.toml`
[no changes]
Precompiling project...
Now you can import PyCall
in any of your Julia programs:
code = """
using PyCall
py"print('1 + 2 =', 1 + 2)"
"""
open(f->write(f, code), "my_program2.jl", "w")
41
;julia my_program2.jl
1 + 2 = 3
You can also add packages by providing their URL (typically on github). This is useful when you want to use a package which is not in the official Julia Package registry, or when you want the very latest version of a package:
]add https://github.com/JuliaLang/Example.jl
Cloning git-repo `https://github.com/JuliaLang/Example.jl`
Updating git-repo `https://github.com/JuliaLang/Example.jl`
Resolving package versions...
Updating `~/.julia/environments/v1.4/Project.toml`
[7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
Updating `~/.julia/environments/v1.4/Manifest.toml`
[7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
You can install a specific package version like this:
]add PyCall@1.91.3
Resolving package versions...
Installed PyCall ─ v1.91.3
Updating `~/.julia/environments/v1.4/Project.toml`
[438e738f] ↓ PyCall v1.91.4 ⇒ v1.91.3
Updating `~/.julia/environments/v1.4/Manifest.toml`
[438e738f] ↓ PyCall v1.91.4 ⇒ v1.91.3
Building PyCall → `~/.julia/packages/PyCall/kAhnQ/deps/build.log`
If you only specify version 1
or version 1.91
, Julia will get the latest version with that prefix. For example, ]add PyCall@0.91
would install the latest version 0.91.x
.
You can also update a package to its latest version:
]update PyCall
Updating registry at `~/.julia/registries/General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
Updating `~/.julia/environments/v1.4/Project.toml`
[438e738f] ↑ PyCall v1.91.3 ⇒ v1.91.4
Updating `~/.julia/environments/v1.4/Manifest.toml`
[438e738f] ↑ PyCall v1.91.3 ⇒ v1.91.4
You can update all packages to their latest versions:
]update
Updating registry at `~/.julia/registries/General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
Updating git-repo `https://github.com/JuliaLang/Example.jl`
Updating `~/.julia/environments/v1.4/Project.toml`
[no changes]
Updating `~/.julia/environments/v1.4/Manifest.toml`
[no changes]
If you don’t want a particular package to be updated the next time you call ]update
, you can pin it:
]pin PyCall
Updating `~/.julia/environments/v1.4/Project.toml`
[438e738f] ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲
Updating `~/.julia/environments/v1.4/Manifest.toml`
[438e738f] ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲
To unpin the package:
]free PyCall
Updating `~/.julia/environments/v1.4/Project.toml`
[438e738f] ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4
Updating `~/.julia/environments/v1.4/Manifest.toml`
[438e738f] ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4
You can also run the tests defined in a package:
]test Example
Testing Example
Status `/tmp/jl_2kZjcq/Manifest.toml`
[7876af07] Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
[2a0f44e3] Base64
[8ba89e20] Distributed
[b77e0a4c] InteractiveUtils
[56ddb016] Logging
[d6f4376e] Markdown
[9a3f8284] Random
[9e88b42a] Serialization
[6462fe0b] Sockets
[8dfed614] Test
Testing Example tests passed
Of course, you can remove a package:
]rm Example
Updating `~/.julia/environments/v1.4/Project.toml`
[7876af07] - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
Updating `~/.julia/environments/v1.4/Manifest.toml`
[7876af07] - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
Lastly, you can check which packages are installed using ]status
(or ]st
for short):
]st
Status `~/.julia/environments/v1.4/Project.toml`
[6e4b80f9] BenchmarkTools v0.5.0
[052768ef] CUDA v1.0.2
[7073ff75] IJulia v1.21.2
[438e738f] PyCall v1.91.4
[d330b81b] PyPlot v2.9.0
For more Pkg
commands, type ]help
.
Julia (in interactive mode) | Python (in a terminal) |
---|---|
]status | pip freeze or conda list |
]add Foo | pip install foo or conda install foo |
]add Foo@1.2 | pip install foo==1.2 or conda install foo=1.2 |
]update Foo | pip install --upgrade foo or conda update foo |
]pin Foo | foo==<version> in requirements.txt or foo=<version> in environment.yml |
]free Foo | foo in requirements.txt or foo in environment.yml |
]test Foo | python -m unittest foo |
]rm Foo | pip uninstall foo or conda remove foo |
]help | pip --help |
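The Pkg mode commands also have function equivalents in the Pkg module, which can be handy in scripts. The sketch below lists the ones matching the commands used in this section. It is wrapped in a hypothetical function (basic_workflow_demo is not part of Pkg) so that nothing is installed until you call it, since these calls need network access and modify the active project:

```julia
using Pkg

# Hypothetical helper: calling it runs the same steps as the Pkg mode commands above.
function basic_workflow_demo()
    Pkg.add("PyCall")                                         # ]add PyCall
    Pkg.add(Pkg.PackageSpec(name="PyCall", version="1.91"))   # ]add with a version constraint
    Pkg.add(Pkg.PackageSpec(url="https://github.com/JuliaLang/Example.jl"))  # ]add with a URL
    Pkg.update("PyCall")                                      # ]update PyCall
    Pkg.pin("PyCall")                                         # ]pin PyCall
    Pkg.free("PyCall")                                        # ]free PyCall
    Pkg.test("Example")                                       # ]test Example
    Pkg.rm("Example")                                         # ]rm Example
    Pkg.status()                                              # ]status
end
```

Call basic_workflow_demo() only in an environment you are happy to modify.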
This workflow is fairly simple, but it means that all of your programs will be using the same version of each package. This is analogous to installing packages using pip install
without using virtual environments.
58. Projects
If you want to have multiple projects, each with different libraries and library versions, you should define projects. These are analogous to Python virtual environments.
A project is just a directory containing a Project.toml
file and a Manifest.toml
file:
my_project/
Project.toml
Manifest.toml
* Project.toml is similar to a requirements.txt file (for pip) or environment.yml (for conda): it lists the dependencies of the project, and compatibility constraints (e.g., SomeDependency = 2.5).
* Manifest.toml is an automatically generated file which lists the exact versions and unique IDs (UUIDs) of all the packages that Julia found, based on Project.toml. It includes all the implicit dependencies of the project’s packages. This is useful to reproduce an environment precisely. It is analogous to the output of pip freeze.
By default, the active project is located in ~/.julia/environments/v#.#
(where #.#
is the Julia version you are using, such as 1.4). You can set a different project when starting Julia:
# BASH
julia --project=/path/to/my_project
Or you can set the JULIA_PROJECT
environment variable:
# BASH
export JULIA_PROJECT=/path/to/my_project
julia
Or you can just activate a project directly in Julia (this is analogous to running source my_project/env/bin/activate
when using virtualenv):
Pkg.activate("my_project")
Activating new environment at `/content/my_project/Project.toml`
The my_project
directory does not exist yet, but it gets created automatically, along with the Project.toml
and Manifest.toml
files, when you first add a package:
]add PyCall
Resolving package versions...
Updating `/content/my_project/Project.toml`
[438e738f] + PyCall v1.91.4
Updating `/content/my_project/Manifest.toml`
[8f4d0f93] + Conda v1.4.1
[682c06a0] + JSON v0.21.0
[1914dd2f] + MacroTools v0.5.5
[69de0a69] + Parsers v1.0.6
[438e738f] + PyCall v1.91.4
[81def892] + VersionParsing v1.2.0
[2a0f44e3] + Base64
[ade2ca70] + Dates
[8ba89e20] + Distributed
[b77e0a4c] + InteractiveUtils
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[de0858da] + Printf
[9a3f8284] + Random
[9e88b42a] + Serialization
[6462fe0b] + Sockets
[8dfed614] + Test
[4ec0a83e] + Unicode
You can also add a package via its URL:
]add https://github.com/JuliaLang/Example.jl
Updating git-repo `https://github.com/JuliaLang/Example.jl`
Resolving package versions...
Updating `/content/my_project/Project.toml`
[7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
Updating `/content/my_project/Manifest.toml`
[7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
Let’s also add a package with a specific version:
]add Example@0.3
Resolving package versions...
Installed Example ─ v0.3.3
Updating `/content/my_project/Project.toml`
[7876af07] ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3
Updating `/content/my_project/Manifest.toml`
[7876af07] ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3
Now the Project.toml
and Manifest.toml
files were created:
;find my_project
my_project
my_project/Manifest.toml
my_project/Project.toml
Notice that the packages we added to the project were not placed in the my_project
directory itself. They were saved in the ~/.julia/packages
directory, the compiled files were placed in ~/.julia/compiled
directory, logs were written to ~/.julia/logs
and so on.
If several projects use the same package, it will only be downloaded and built once (well, once per version). The ~/.julia/packages
directory can hold multiple versions of the same package, so it’s fine if different projects use different versions of the same package. There will be no conflict, no “dependency hell”.
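To see where these shared directories live on your machine, you can look at the DEPOT_PATH global, whose first entry is normally ~/.julia. A minimal sketch, assuming a default setup (some subdirectories may not exist yet on a fresh installation):

```julia
# The first depot is where packages, compiled files and logs are stored by default
depot = first(DEPOT_PATH)

for sub in ("packages", "compiled", "logs", "environments")
    dir = joinpath(depot, sub)
    state = isdir(dir) ? "exists" : "not created yet"
    println(rpad(sub, 14), state, "  ", dir)
end
```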
The Project.toml
just says that the project depends on PyCall
and Example
, and it specifies the UUID of each package:
print(read("my_project/Project.toml", String))
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
UUIDs are useful to avoid name conflicts. If several people name their package CoolStuff
, then the UUID will clarify which one we are referring to.
The Manifest.toml
file is much longer, since it contains all the packages which PyCall
and Example
depend on, along with their versions (except for the standard library packages), and the dependency graph. This file should never be modified manually:
print(read("my_project/Manifest.toml", String))
# This file is machine-generated - editing it directly is not advised
[[Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
[[Conda]]
deps = ["JSON", "VersionParsing"]
git-tree-sha1 = "7a58bb32ce5d85f8bf7559aa7c2842f9aecf52fc"
uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
version = "1.4.1"
[[Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
[[Distributed]]
deps = ["Random", "Serialization", "Sockets"]
uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"
[[Example]]
git-tree-sha1 = "276fa06109ac5c80035cff711b0a18ad5b3117cc"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.3.3"
[[InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
[[JSON]]
deps = ["Dates", "Mmap", "Parsers", "Unicode"]
git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
version = "0.21.0"
[[Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
[[LinearAlgebra]]
deps = ["Libdl"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
[[Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
[[MacroTools]]
deps = ["Markdown", "Random"]
git-tree-sha1 = "f7d2e3f654af75f01ec49be82c231c382214223a"
uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
version = "0.5.5"
[[Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
[[Mmap]]
uuid = "a63ad114-7e13-5084-954f-fe012c677804"
[[Parsers]]
deps = ["Dates", "Test"]
git-tree-sha1 = "20ef902ea02f7000756a4bc19f7b9c24867c6211"
uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
version = "1.0.6"
[[Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
[[PyCall]]
deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
git-tree-sha1 = "3a3fdb9000d35958c9ba2323ca7c4958901f115d"
uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
version = "1.91.4"
[[Random]]
deps = ["Serialization"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
[[Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
[[Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
[[Test]]
deps = ["Distributed", "InteractiveUtils", "Logging", "Random"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
[[Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
[[VersionParsing]]
git-tree-sha1 = "80229be1f670524750d905f8fc8148e5a8c4537f"
uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
version = "1.2.0"
Note that Manifest.toml
contains the precise version of the Example
package that was installed, but the Project.toml
file does not specify that version 0.3
is required. That’s because Julia cannot know whether your project is supposed to work only with versions 0.3.x
, or whether it could work with other versions as well. So if you want to specify a version constraint for the Example
package, you must add it manually in Project.toml
. You would normally use your favorite editor to do this, but in this notebook we’ll update Project.toml
programmatically:
append_config = """
[compat]
Example = "0.3"
"""
open(f->write(f, append_config), "my_project/Project.toml", "a")
26
Here is the updated Project.toml
file:
print(read("my_project/Project.toml", String))
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
[compat]
Example = "0.3"
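If you want to inspect such a file from code, you can parse it with a TOML parser. The sketch below assumes Julia 1.6 or later, where TOML is a standard library (this notebook’s Julia 1.4 bundled a similar parser as Pkg.TOML), and it parses an inline copy of the file rather than reading it from disk:

```julia
using TOML   # standard library since Julia 1.6 (an assumption about your version)

project = TOML.parse("""
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

[compat]
Example = "0.3"
""")

deps = project["deps"]
compat = get(project, "compat", Dict{String,Any}())

# Print each dependency with its compatibility constraint, if any
for name in sort(collect(keys(deps)))
    println(name, " => ", get(compat, name, "(no constraint)"))
end
# Example => 0.3
# PyCall => (no constraint)
```

A compat entry of "0.3" accepts versions from 0.3.0 up to, but not including, 0.4.0.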
Now if we try to replace Example
0.3 with version 0.2, we get an error:
try
pkg"add Example@0.2"
catch ex
ex
end
Resolving package versions...
Pkg.Resolve.ResolverError("empty intersection between Example@0.2 and project compatibility 0.3", nothing)
Now you can run a program based on this project, and it will be able to use all the packages which have been added to this project, with their specific versions. If you import a package which was not explicitly added to this project, Julia will fall back to the default project:
code = """
import PyCall # found in the project
import PyPlot # not found, so falls back to default project
println("Success!")
"""
open(f->write(f, code), "my_program3.jl", "w")
117
;julia --project=my_project my_program3.jl
Success!
59. Packages
Falling back to the default project is fine, as long as you run the code on your own machine, but if you want to share your code with other people, it would be brittle to count on packages installed in their default project. Instead, if you plan to share your code, you should clearly specify which packages it depends on, and use only these packages. Such a shareable project is called a package.
A package is a regular project (as defined above), but with a few extras:
* the Project.toml
file must specify a name
, a version
and a uuid
.
* there must be a src/PackageName.jl
file containing a module named PackageName
.
* you generally want to specify the authors
and description
, and maybe also the license
, repository
(e.g., the package’s github URL), and some keywords
, but all of these are optional.
It is very easy to create a new package using the ]generate
command. To define the authors
field, Pkg
will look up the user.name
and user.email
git config entries, so let’s define them before we generate the package:
;git config --global user.name "Alice Bob"
;git config --global user.email "[email protected]"
]generate MyPackages/Hello
Generating project Hello:
MyPackages/Hello/Project.toml
MyPackages/Hello/src/Hello.jl
This generated the MyPackages/Hello/Project.toml
file (along with the enclosing directories) and the MyPackages/Hello/src/Hello.jl
file. Let’s take a look at the Project.toml
file:
print(read("MyPackages/Hello/Project.toml", String))
name = "Hello"
uuid = "b1200148-98bf-43d1-9bb1-85f7b4552217"
authors = ["Alice Bob "]
version = "0.1.0"
Notice that the project has no dependencies yet, but it has a name, a unique UUID, and a version (plus an author).
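The same generation step can also be done from code with Pkg.generate(). The sketch below creates a throwaway package in a temporary directory and checks the fields mentioned above. The package name HelloDemo is hypothetical, and TOML is assumed to be available as a standard library (Julia 1.6 or later):

```julia
using Pkg, TOML

# Generate a throwaway package in a temporary directory
pkg_dir = joinpath(mktempdir(), "HelloDemo")   # hypothetical package name
Pkg.generate(pkg_dir)

project = TOML.parsefile(joinpath(pkg_dir, "Project.toml"))
println(project["name"])                                  # HelloDemo
println(haskey(project, "uuid"))                          # true
println(haskey(project, "version"))                       # true
println(isfile(joinpath(pkg_dir, "src", "HelloDemo.jl"))) # true
```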
Note: if Pkg
does not find your name or email in the git config, it falls back to environment variables (GIT_AUTHOR_NAME
, GIT_COMMITTER_NAME
, USER
, USERNAME
, NAME
and GIT_AUTHOR_EMAIL
, GIT_COMMITTER_EMAIL
, EMAIL
).
And let’s look at the src/Hello.jl
file:
print(read("MyPackages/Hello/src/Hello.jl", String))
module Hello
greet() = print("Hello World!")
end # module
Let’s try to use the greet()
function from the Hello
package:
try
import Hello
Hello.greet()
catch ex
ex
end
ArgumentError("Package Hello not found in current path:\n- Run `import Pkg; Pkg.add(\"Hello\")` to install the Hello package.\n")
Julia could not find the Hello
package. When you’re working on a package, don’t forget to activate it first!
]activate MyPackages/Hello
Activating environment at `/content/MyPackages/Hello/Project.toml`
import Hello
Hello.greet()
┌ Info: Precompiling Hello [b1200148-98bf-43d1-9bb1-85f7b4552217]
└ @ Base loading.jl:1260
Hello World!
It works!
If the Hello
package depends on other packages, we must add them:
]add PyCall Example
Resolving package versions...
Installed Example ─ v0.5.3
Updating `/content/MyPackages/Hello/Project.toml`
[7876af07] + Example v0.5.3
[438e738f] + PyCall v1.91.4
Updating `/content/MyPackages/Hello/Manifest.toml`
[8f4d0f93] + Conda v1.4.1
[7876af07] + Example v0.5.3
[682c06a0] + JSON v0.21.0
[1914dd2f] + MacroTools v0.5.5
[69de0a69] + Parsers v1.0.6
[438e738f] + PyCall v1.91.4
[81def892] + VersionParsing v1.2.0
[2a0f44e3] + Base64
[ade2ca70] + Dates
[8ba89e20] + Distributed
[b77e0a4c] + InteractiveUtils
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[de0858da] + Printf
[9a3f8284] + Random
[9e88b42a] + Serialization
[6462fe0b] + Sockets
[8dfed614] + Test
[4ec0a83e] + Unicode
You must not use any package which has not been added to the project. If you do, you will get a warning.
Once you are happy with your package, you can deploy it to github (or anywhere else). Then you can add it to your own projects just like any other package.
If you want to make your package available to the world via the official Julia registry, you just need to send a Pull Request to https://github.com/JuliaRegistries/General. However, it’s highly recommended to automate this using the Registrator.jl github app.
If you want to use other registries (including private registries), check out this page.
Also check out the PkgTemplates
package, which provides more sophisticated templates for creating new packages, for example with continuous integration, code coverage tests, etc.
60. Fixing Issues in a Dependency
Sometimes you may run into an issue inside one of the packages your project depends on. When this happens, you can use Pkg
’s dev
command to fix the issue. For example, let’s pretend the Example
package has a bug:
]dev Example
Cloning git-repo `https://github.com/JuliaLang/Example.jl.git`
Resolving package versions...
Updating `/content/MyPackages/Hello/Project.toml`
[7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]
Updating `/content/MyPackages/Hello/Manifest.toml`
[7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]
This command cloned the repo into ~/.julia/dev/Example
:
;ls -l "~/.julia/dev"
total 4
drwxr-xr-x 7 root root 4096 Jul 2 00:06 Example
It also updated the Hello
package’s Manifest.toml
file to ensure the package now uses the Example
clone. You can see this using ]status
:
]st
Project Hello v0.1.0
Status `/content/MyPackages/Hello/Project.toml`
[7876af07] Example v0.5.4 [`~/.julia/dev/Example`]
[438e738f] PyCall v1.91.4
So you would now go ahead and edit the clone and fix the bug. Of course, you would also want to send a PR to the package’s owners so the source package gets fixed. Once that happens, you can go back to the official Example
package easily:
]free Example
Updating `/content/MyPackages/Hello/Project.toml`
[7876af07] ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3
Updating `/content/MyPackages/Hello/Manifest.toml`
[7876af07] ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3
]st
Project Hello v0.1.0
Status `/content/MyPackages/Hello/Project.toml`
[7876af07] Example v0.5.3
[438e738f] PyCall v1.91.4
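In a script, the same round trip can be written with Pkg.develop() and Pkg.free(). The sketch below only computes where development clones would be placed (assuming the default layout, which the JULIA_PKG_DEVDIR environment variable can override) and defines two hypothetical helpers, start_fix and finish_fix, that are not called here because they need network access and modify the active project:

```julia
using Pkg

# Where development clones go by default, unless JULIA_PKG_DEVDIR overrides it
dev_dir = get(ENV, "JULIA_PKG_DEVDIR", joinpath(first(DEPOT_PATH), "dev"))
println(dev_dir)

# Hypothetical helpers mirroring the Pkg mode commands shown above
start_fix(pkg::AbstractString) = Pkg.develop(pkg)   # ]dev Example
finish_fix(pkg::AbstractString) = Pkg.free(pkg)     # ]free Example
```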
61. Instantiating a Project
If you want to run someone else’s project and you want to make sure you are using the exact same package versions, you can clone the project, and assuming it has a Manifest.toml
file, you can activate the project and run ]instantiate
to install all the appropriate packages. For example, let’s instantiate the Registrator.jl
project:
;git clone https://github.com/JuliaRegistries/Registrator.jl
Cloning into 'Registrator.jl'...
]activate Registrator.jl
Activating environment at `/content/Registrator.jl/Project.toml`
]instantiate
Installed TableTraits ───────────────── v1.0.0
Installed AutoHashEquals ────────────── v0.2.0
Installed Hiccup ────────────────────── v0.2.2
Installed DataAPI ───────────────────── v1.2.0
Installed Lazy ──────────────────────── v0.14.0
Installed WebSockets ────────────────── v1.5.2
Installed JSON2 ─────────────────────── v0.3.1
Installed HTTP ──────────────────────── v0.8.14
Installed IniFile ───────────────────── v0.5.0
Installed ZMQ ───────────────────────── v1.2.0
Installed GitForge ──────────────────── v0.1.5
Installed AssetRegistry ─────────────── v0.1.0
Installed TimeToLive ────────────────── v0.3.0
Installed DataValueInterfaces ───────── v1.0.0
Installed IteratorInterfaceExtensions ─ v1.0.0
Installed ZeroMQ_jll ────────────────── v4.3.2+2
Installed Tables ────────────────────── v1.0.4
Installed Mux ───────────────────────── v0.7.1
Installed Parsers ───────────────────── v1.0.2
Installed MbedTLS_jll ───────────────── v2.16.0+2
Installed Mustache ──────────────────── v1.0.2
Installed Pidfile ───────────────────── v1.1.0
Installed GitHub ────────────────────── v5.1.5
Installed RegistryTools ─────────────── v1.5.0
Usually, that’s all you need to know about projects and packages, but let’s look a bit under the hood, so you can handle less common cases.
62. Load Path
When you import a package, Julia searches for it in the environments listed in the LOAD_PATH
array. An environment can be a project or a directory containing a bunch of packages directly. By default, the LOAD_PATH
array contains three elements:
LOAD_PATH
3-element Array{String,1}:
"@"
"@v#.#"
"@stdlib"
Here’s what these elements mean:
* "@"
represents the active project, if any: that’s the project activated via --project
, JULIA_PROJECT
, ]activate
or Pkg.activate()
.
* "@v#.#"
represents the default shared project for the version of Julia we are running. That’s why it is used by default when there is no active project.
* "@stdlib"
represents the standard library. This is not a project: it’s a directory containing many packages.
If you want to see the actual paths, you can call Base.load_path()
:
Base.load_path()
3-element Array{String,1}:
"/content/Registrator.jl/Project.toml"
"/root/.julia/environments/v1.4/Project.toml"
"/usr/local/share/julia/stdlib/v1.4"
You can change the load path if you want to. For example, if you want Julia to look only in the active project and in the standard library, without looking in the default project, then you can set the JULIA_LOAD_PATH
environment variable to "@:@stdlib"
.
If you try to run my_program3.jl
this way, it will successfully import PyCall
, but it will fail to import PyPlot
, since it is not listed in Project.toml
(however, it would successfully import any package from the standard library):
try
withenv("JULIA_LOAD_PATH"=>"@:@stdlib") do
run(`julia --project=my_project my_program3.jl`)
end
catch ex
ex
end
ERROR: LoadError: ArgumentError: Package PyPlot not found in current path:
- Run `import Pkg; Pkg.add("PyPlot")` to install the PyPlot package.
Stacktrace:
[1] require(::Module, ::Symbol) at ./loading.jl:892
[2] include(::Module, ::String) at ./Base.jl:377
[3] exec_options(::Base.JLOptions) at ./client.jl:288
[4] _start() at ./client.jl:484
in expression starting at /content/my_program3.jl:2
ProcessFailedException(Base.Process[Process(`julia --project=my_project my_program3.jl`, ProcessExited(1))])
You can also modify the LOAD_PATH
array programmatically, for example to make all the packages in the my_packages/
directory available to the project:
push!(LOAD_PATH, "my_packages")
4-element Array{String,1}:
"@"
"@v#.#"
"@stdlib"
"my_packages"
Now any package added to this directory will be directly available to us:
]generate my_packages/Hello2
 Generating project Hello2:
my_packages/Hello2/Project.toml
my_packages/Hello2/src/Hello2.jl
using Hello2
Hello2.greet()
┌ Info: Precompiling Hello2 [b76a3422-75bc-4a82-ad3b-dff89fdf93f4]
└ @ Base loading.jl:1260
Hello World!
This is a convenience for development, as we didn’t have to push this package to a repository or even add it to the project. However, it’s just for development: once you’re happy with your package, make sure to push it to a repo, and add it to the project normally.
63. Depots
As we saw earlier, new packages you add to a project are placed in the ~/.julia/packages
directory, logs are placed in ~/.julia/logs
, and so on.
A directory like ~/.julia which contains Pkg-related content is called a depot. Julia installs all new packages in the default depot, which is the first directory in the DEPOT_PATH array (this array can be modified manually in Julia, or set via the JULIA_DEPOT_PATH environment variable):
DEPOT_PATH
3-element Array{String,1}:
"/root/.julia"
"/usr/local/local/share/julia"
"/usr/local/share/julia"
The default depot needs to be writeable for the current user, since that’s where new packages will be written to (as well as logs and other stuff). The other depots can be read-only: they’re typically used for private package registries.
You can occasionally run the ]gc command, which will remove all unused package versions (Pkg will use the logs to locate existing projects).
In summary: when some code runs using Foo
or import Foo
, the LOAD_PATH
is used to determine which specific package Foo
refers to, while the DEPOT_PATH
is used to determine where it is. The exception is when the LOAD_PATH
contains directories which directly contain packages: for these packages, the DEPOT_PATH
is not used.
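To see this lookup in action, here is a small sketch using Base.find_package(), which returns the path of the file Julia would load for a given package name, or nothing if the name cannot be resolved through the LOAD_PATH. The Test package is used only because it ships with the standard library, and NoSuchPackage12345 is a made-up name:

```julia
# Ask Julia which file `using Test` would load (resolved through LOAD_PATH).
test_path = Base.find_package("Test")
println("Test would be loaded from: ", test_path)

# A name that no environment in LOAD_PATH knows about resolves to `nothing`.
missing_path = Base.find_package("NoSuchPackage12345")  # made-up package name
println("NoSuchPackage12345 resolves to: ", repr(missing_path))

# The default depot, where newly added packages get installed.
println("Default depot: ", first(DEPOT_PATH))
```

The exact paths printed depend on your installation.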
64. Parallel Computing
Julia supports coroutines (aka green threads), multithreading (without a GIL like CPython’s!), multiprocessing and distributed computing.
65. Coroutines
Let’s go back to the fibonacci()
generator function:
function fibonacci(n)
Channel() do ch
a, b = 1, 1
for i in 1:n
put!(ch, a)
a, b = b, a + b
end
end
end
for f in fibonacci(10)
println(f)
end
1
1
2
3
5
8
13
21
34
55
Under the hood, Channel() do ... end
creates a Channel
object, and spawns an asynchronous Task
to execute the code in the do ... end
block. The task is scheduled to execute immediately, but when it calls the put!()
function on the channel to yield a value, it blocks until another task calls the take!()
function to grab that value. You do not see the take!()
function explicitly in this code example, since it is executed automatically in the for
loop, in the main task. To demonstrate this, we can just call the take!()
function 10 times to get all the items from the channel:
ch = fibonacci(10)
for i in 1:10
println(take!(ch))
end
1
1
2
3
5
8
13
21
34
55
This channel is bound to the task, therefore it is automatically closed when the task ends. So if we try to get one more element, we will get an exception:
try
take!(ch)
catch ex
ex
end
InvalidStateException("Channel is closed.", :closed)
Here is a more explicit version of the fibonacci()
function:
function fibonacci(n)
function generator_func(ch, n)
a, b = 1, 1
for i in 1:n
put!(ch, a)
a, b = b, a + b
end
end
ch = Channel()
task = @task generator_func(ch, n) # creates a task without starting it
bind(ch, task) # the channel will be closed when the task ends
schedule(task) # start running the task asynchronously
ch
end
fibonacci (generic function with 1 method)
And here is a more explicit version of the for
loop:
ch = fibonacci(10)
while isopen(ch)
value = take!(ch)
println(value)
end
1
1
2
3
5
8
13
21
34
55
Note that asynchronous tasks (also called “coroutines” or “green threads”) are not actually run in parallel: they cooperate to alternate execution. Some functions, such as put!()
, take!()
, and many I/O functions, interrupt the current task’s execution, at which point it lets Julia’s scheduler decide which task should resume its execution. This is just like Python’s coroutines.
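Here is a minimal sketch of this cooperation, assuming two tasks started with @async (a shorthand that creates a task and schedules it) and a hypothetical worker() function that calls yield() to hand control back to the scheduler after each step:

```julia
events = String[]

# Each task records what it does, then calls yield() to hand control back
# to the scheduler so that the other task gets a turn.
function worker(name)
    for i in 1:3
        push!(events, "$name$i")
        yield()
    end
end

task_a = @async worker("A")
task_b = @async worker("B")
wait(task_a)
wait(task_b)

println(events)
```

The entries recorded by the two tasks typically come out interleaved, even though only one task runs at any given time.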
For more details on coroutines and tasks, see the manual.
66. Multithreading
Julia also supports multithreading. Currently, you need to specify the number of O.S. threads upon startup, by setting the JULIA_NUM_THREADS environment variable (or by passing the -t argument in Julia 1.5+). In the first cell, we configured the IJulia kernel so that this environment variable is set:
ENV["JULIA_NUM_THREADS"]
"4"
The actual number of threads started by Julia may be lower than that, as it is limited to the number of available cores on the machine (thanks to hyperthreading, each physical core may run two threads). Here is the number of threads that were actually started:
using Base.Threads
nthreads()
2
Now let’s run 10 tasks across these threads:
@threads for i in 1:10
println("thread #", threadid(), " is starting task #$i")
sleep(rand()) # pretend we're actually working
println("thread #", threadid(), " is finished")
end
thread #1 is starting task #1
thread #2 is starting task #6
thread #2 is finished
thread #2 is starting task #7
thread #1 is finished
thread #1 is starting task #2
thread #2 is finished
thread #2 is starting task #8
thread #1 is finished
thread #1 is starting task #3
thread #1 is finished
thread #1 is starting task #4
thread #2 is finished
thread #2 is starting task #9
thread #1 is finished
thread #1 is starting task #5
thread #1 is finished
thread #2 is finished
thread #2 is starting task #10
thread #2 is finished
Here is a multithreaded version of the estimate_pi()
function. Each thread computes part of the sum, and the parts are added at the end:
function parallel_estimate_pi(n)
s = zeros(nthreads())
nt = n ÷ nthreads()
@threads for t in 1:nthreads()
for i in (1:nt) .+ nt*(t - 1)
@inbounds s[t] += (isodd(i) ? -1 : 1) / (2i + 1)
end
end
return 4.0 * (1.0 + sum(s))
end
@btime parallel_estimate_pi(100_000_000)
128.853 ms (16 allocations: 1.63 KiB)
3.1415926635894196
The @inbounds
macro is an optimization: it tells the Julia compiler not to add any bounds check when accessing the array. It’s safe in this case since the s
array has one element per thread, and t
varies from 1
to nthreads()
, so there is no risk for s[t]
to be out of bounds.
Let’s compare this with the single-threaded implementation:
@btime estimate_pi(100_000_000)
134.263 ms (0 allocations: 0 bytes)
3.141592663589326
If you are running this notebook on Colab, the parallel implementation is probably no faster than the single-threaded one. That’s because the Colab Runtime only has a single CPU, so there is no benefit from multithreading (plus there is a bit of overhead for managing threads). However, on my 8-core machine, using 16 threads, the parallel implementation is about 6 times faster than the single-threaded one.
Julia has a mapreduce()
function which makes it easy to implement functions like parallel_estimate_pi()
:
function parallel_estimate_pi2(n)
4.0 * mapreduce(i -> (isodd(i) ? -1 : 1) / (2i + 1), +, 0:n)
end
parallel_estimate_pi2 (generic function with 1 method)
@btime parallel_estimate_pi2(100_000_000)
106.664 ms (0 allocations: 0 bytes)
3.1415926635897917
The mapreduce() function is well optimized: in the run above it is about 20% faster than parallel_estimate_pi() (106.7 ms versus 128.9 ms), even though Base’s mapreduce() runs on a single thread.
You can also spawn a task using Threads.@spawn
. It will get executed on any one of the running threads (it will not start a new thread):
task = Threads.@spawn begin
println("Thread starting")
sleep(1)
println("Thread stopping")
42 # result
end
println("Hello!")
println("The result is: ", fetch(task))
Hello!
Thread starting
Thread stopping
The result is: 42
The fetch() function waits for the task to finish, and fetches the result. You can also just call wait() if you don’t need the result.
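As a sketch of how Threads.@spawn and fetch() could be combined, here is one possible variant of the π estimate in which each task computes a partial sum and the results are fetched and added at the end. The names partial_sum, spawn_estimate_pi and nchunks are made up for this illustration:

```julia
using Base.Threads

# Sum the terms (-1)^i / (2i + 1) for i in lo:hi (hypothetical helper).
function partial_sum(lo, hi)
    s = 0.0
    for i in lo:hi
        s += (isodd(i) ? -1 : 1) / (2i + 1)
    end
    return s
end

function spawn_estimate_pi(n; nchunks = nthreads())
    # Boundaries that split 1:n into nchunks contiguous ranges.
    bounds = round.(Int, range(0, n; length = nchunks + 1))
    tasks = [Threads.@spawn partial_sum(bounds[k] + 1, bounds[k + 1]) for k in 1:nchunks]
    # fetch() waits for each task and returns its partial sum.
    return 4.0 * (1.0 + sum(fetch, tasks))
end

println(spawn_estimate_pi(10_000_000))
```

Since each task accumulates into its own local variable, no shared array is needed to collect the partial sums.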
Last but not least, you can use channels to synchronize and communicate across tasks, even if they are running across separate threads:
ch = Channel()
task1 = Threads.@spawn begin
for i in 1:5
sleep(rand())
put!(ch, i^2)
end
println("Finished sending!")
close(ch)
end
task2 = Threads.@spawn begin
foreach(v->println("Received $v"), ch)
println("Finished receiving!")
end
wait(task2)
Received 1
Received 4
Received 9
Received 16
Finished sending!
Received 25
Finished receiving!
For more details about multithreading, check out this page.
67. Multiprocessing & Distributed Programming
Julia can spawn multiple Julia processes upon startup if you specify the number of processes via the -p
argument. You can also spawn extra processes from Julia itself:
using Distributed
addprocs(4)
workers() # array of worker process ids
4-element Array{Int64,1}:
2
3
4
5
The main process has id 1:
myid()
1
The @everywhere macro lets you run any code on all processes (the main process and every worker):
@everywhere println("Hi! I'm worker $(myid())")
Hi! I'm worker 1
From worker 4: Hi! I'm worker 4
From worker 3: Hi! I'm worker 3
From worker 2: Hi! I'm worker 2
From worker 5: Hi! I'm worker 5
You can also execute code on a particular worker by using @spawnat
:
@spawnat 3 println("Hi! I'm worker $(myid())")
Future(3, 1, 14, nothing)
If you specify :any
instead of a worker id, Julia chooses the worker for you:
@spawnat :any println("Hi! I'm worker $(myid())")
From worker 3: Hi! I'm worker 3
Future(2, 1, 15, nothing)
The @spawnat macro returns immediately, and its output is a Future object (@everywhere, by contrast, waits until the code has run on every process). You can call fetch() on a Future to wait for the result:
result = @spawnat 3 1+2+3+4
fetch(result)
10
If you import some package in the main process, it is not automatically imported in the workers. For example, the following code fails because the worker does not know what pyimport
is:
using PyCall
result = @spawnat 4 (np = pyimport("numpy"); np.log(10))
try
fetch(result)
catch ex
ex
end
From worker 2: Hi! I'm worker 2
RemoteException(4, CapturedException(UndefVarError(:pyimport), Any[(#121 at macros.jl:87, 1), (#101 at process_messages.jl:290, 1), (run_work_thunk at process_messages.jl:79, 1), (run_work_thunk at process_messages.jl:88, 1), (#94 at task.jl:358, 1)]))
You must use @everywhere
or @spawnat
to import the packages you need in each worker:
@everywhere using PyCall
result = @spawnat 4 (np = pyimport("numpy"); np.log(10))
fetch(result)
2.302585092994046
Similarly, if you define a function in the main process, it is not automatically available in the workers. You must define the function in every worker:
@everywhere addtwo(n) = n + 2
result = @spawnat 4 addtwo(40)
fetch(result)
42
You can pass a Future
to @everywhere
or @spawnat
, as long as you wrap it in a fetch()
function:
M = @spawnat 2 rand(5)
result = @spawnat 3 fetch(M) .* 10.0
fetch(result)
5-element Array{Float64,1}:
4.475589942138973
3.7844448153428067
6.199227766558075
8.66410018066203
3.364462310811107
In this example, worker 2 creates a random array, then worker 3 fetches this array and multiplies each element by 10, then the main process fetches the result and displays it.
68. GPU
Julia has excellent GPU support. As you may know, GPUs are devices which can run thousands of threads in parallel. Each thread is slower and more limited than on a CPU, but there are so many of them that plenty of tasks can be executed much faster on a GPU than on a CPU, provided these tasks can be parallelized.
Let’s check which GPU device is installed:
;nvidia-smi
Thu Jul 2 00:08:11 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
If you’re running on Colab, your runtime will generally have an Nvidia Tesla K80 GPU with 12GB of RAM installed, but sometimes other GPUs (like an Nvidia Tesla T4 with 16GB, or an Nvidia Tesla P100).
If no GPU is detected, go to Runtime > Change runtime type, set Hardware accelerator to GPU, then go to Runtime > Factory reset runtime, then reinstall Julia by running the first cell again, then reload the page and come back here. If you’re running on your own machine, make sure you have a compatible GPU card installed, with the appropriate drivers.
Now let’s create a large matrix and time how long it takes to square it on the CPU:
using BenchmarkTools
M = rand(2^11, 2^11)
function benchmark_matmul_cpu(M)
M * M
return
end
benchmark_matmul_cpu(M) # warm up
@btime benchmark_matmul_cpu($M)
436.690 ms (2 allocations: 32.00 MiB)
Notes:
* For benchmarking, we wrapped the operation in a function which returns nothing
.
* Why do we have a “warm up” line? Well, since Julia compiles code on the fly the first time it is executed, it’s good practice to execute the operation we want to benchmark at least once before starting the benchmark, or else the benchmark will include the compilation time.
* We used $M
instead of M
on the last line. This is a feature of the @btime
macro: it evaluates M
before benchmarking takes place, to avoid the extra delay that is incurred when benchmarking with global variables.
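To make the warm-up point concrete, here is a small sketch that times the first and second call of a freshly defined function with @elapsed. The sum_of_squares function is just an arbitrary example, and the timings will differ from one machine to another:

```julia
# A small function to time (arbitrary example, not from the benchmark above).
sum_of_squares(v) = sum(abs2, v)

v = rand(1_000)
first_call  = @elapsed sum_of_squares(v)  # includes JIT compilation
second_call = @elapsed sum_of_squares(v)  # runs the already-compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

The first call is typically much slower than the second one, since it includes the compilation time.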
Now let’s benchmark this same operation on the GPU:
using CUDA
# Copy the data to the GPU. Creates a CuArray:
M_on_gpu = cu(M)
# Alternatively, create a new random matrix directly on the GPU:
#M_on_gpu = CUDA.CURAND.rand(2^11, 2^11)
function benchmark_matmul_gpu(M)
CUDA.@sync M * M
return
end
benchmark_matmul_gpu(M_on_gpu) # warm up
@btime benchmark_matmul_gpu($M_on_gpu)
Downloading artifact: CUDA10.1
######################################################################### 100.0%
Downloading artifact: CUDNN+CUDA10.1
######################################################################### 100.0%
Downloading artifact: CUTENSOR+CUDA10.1
######################################################################### 100.0%
┌ Warning: `haskey(::TargetIterator, name::String)` is deprecated, use `Target(; name = name) !== nothing` instead.
│ caller = llvm_compat(::VersionNumber) at compatibility.jl:181
└ @ CUDA /root/.julia/packages/CUDA/42B9G/deps/compatibility.jl:181
2.360 ms (9 allocations: 368 bytes)
That’s much faster (185x faster in my test on Colab with an Nvidia Tesla P100 GPU).
Importantly:
* Before the GPU can work on some data, it needs to be copied to the GPU (or generated there directly).
* The CUDA.@sync macro waits for the GPU operation to complete. Without it, the operation would happen in parallel on the GPU, while execution would continue on the CPU. So we would just be timing how long it takes to start the operation, not how long it takes to complete.
* In general, you don’t need CUDA.@sync
, since many operations (including cu()
) call it implicitly, and it’s usually a good idea to let the CPU and GPU work in parallel. Typically, the GPU will be working on the current batch of data while the CPU works on preparing the next batch.
Of course, the speed up will vary depending on the matrix size and the GPU type. Moreover, copying the data from the CPU to the GPU is often the slowest part of the operation, but we only benchmarked the matrix multiplication itself. Let’s see what we get if we include the data transfer in the benchmark:
That’s still much faster than on the CPU.
Let’s check how much RAM we have left on the GPU:
CUDA.memory_status()
Effective GPU memory usage: 99.93% (15.888 GiB/15.899 GiB)
CUDA allocator usage: 15.594 GiB
BinnedPool usage: 15.594 GiB (16.000 MiB allocated, 15.578 GiB cached)
Julia’s Garbage Collector will free CUDA arrays like any other object, when there are no more references to them. However, CUDA.jl uses a memory pool to make allocations faster on the GPU, so don’t be surprised if the allocated memory on the GPU does not go down immediately. Moreover, IJulia keeps a reference to the output of each cell, so if you let any cell output a CuArray, it will only be released when you execute Out[]=0. If you want to force the Garbage Collector to run, you can run GC.gc(). To reclaim memory from the memory pool, use CUDA.reclaim():
GC.gc()
CUDA.reclaim()
16726884352
Many other operations are implemented for CuArray (+, -, etc.), as are dotted operations (.+, exp.(), etc.). Importantly, loop fusion also works on the GPU. For example, if we want to compute M .* M .+ M, without loop fusion the GPU would first compute M .* M and create a temporary array, then it would add M to that array, like this:
function benchmark_without_fusion(M)
P = M .* M
CUDA.@sync P .+ M
return
end
benchmark_without_fusion(M_on_gpu) # warm up
@btime benchmark_without_fusion($M_on_gpu)
676.534 μs (140 allocations: 4.30 KiB)
Instead, loop fusion ensures that the array is only traversed once, without the need for a temporary array:
function benchmark_with_fusion(M)
CUDA.@sync M .* M .+ M
return
end
benchmark_with_fusion(M_on_gpu) # warm up
@btime benchmark_with_fusion($M_on_gpu)
387.141 μs (87 allocations: 3.36 KiB)
That’s much faster (75% faster in my test on Colab). 😃
Lastly, you can actually write your own GPU kernels in Julia! In other words, rather than using GPU operations implemented in the CUDA.jl
package (or others), you can write Julia code that will be compiled for the GPU, and executed there. This can occasionally be useful to speed up some algorithms where the standard kernels don’t suffice. For example, here’s a GPU kernel which implements u .+= v
, where u
and v
are two (large) vectors:
function worker_gpu_add!(u, v)
index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
index ≤ length(u) && (@inbounds u[index] += v[index])
return
end
function gpu_add!(u, v)
numblocks = ceil(Int, length(u) / 256)
@cuda threads=256 blocks=numblocks worker_gpu_add!(u, v)
return u
end
gpu_add! (generic function with 1 method)
This code example is adapted from the CUDA.jl
package’s documentation, which I highly encourage you to check out if you’re interested in writing your own kernels. Here are the key parts to understand this example, starting from the end:
* The gpu_add!()
function first calculates numblocks
, the number of blocks of threads to start, then it uses the @cuda
macro to spawn numblocks
blocks of GPU threads, each with 256 threads, and each thread runs worker_gpu_add!(u, v)
.
* The worker_gpu_add!()
function computes u[index] += v[index]
for a single value of index
: in other words, each thread will just update a single value in the vector! Let’s see how the index is computed:
* The @cuda
macro spawned many blocks of 256 threads each. These blocks are organized in a grid, which is one-dimensional by default, but it can be up to three-dimensional. Therefore each thread and each block have an (x, y, z)
coordinate in this grid. See the diagram in the Nvidia blog post.
* threadIdx().x
returns the current GPU thread’s x
coordinate within its block (one difference with the diagram is that Julia is 1-indexed).
* blockIdx().x
returns the current block’s x
coordinate in the grid.
* blockDim().x
returns the block size along the x
axis (in this example, it’s 256).
* gridDim().x
returns the number of blocks in the grid, along the x
axis (in this example it’s numblocks
).
* So the index
that each thread must update in the array is (blockIdx().x - 1) * blockDim().x + threadIdx().x
.
* As explained earlier, the @inbounds
macro is an optimization that tells Julia that the index is guaranteed to be inbounds, so there’s no need for it to check.
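The index arithmetic can be checked without a GPU. The following sketch simulates the launch on the CPU with plain loops, under the assumption that the loop variables block and thread stand in for blockIdx().x and threadIdx().x, and counts how many times each element would be updated:

```julia
n = 1_000                            # vector length (arbitrary)
threads = 256                        # plays the role of blockDim().x
numblocks = ceil(Int, n / threads)   # plays the role of gridDim().x; 4 here

hits = zeros(Int, n)                 # how many times each element gets updated
for block in 1:numblocks, thread in 1:threads    # blockIdx().x, threadIdx().x
    index = (block - 1) * threads + thread
    index ≤ n && (hits[index] += 1)  # same bounds guard as in the kernel
end

println("blocks launched: ", numblocks)
println("every element updated exactly once: ", all(h -> h == 1, hits))
```

With 4 blocks of 256 threads, 1,024 threads are simulated for 1,000 elements: the bounds guard is what keeps the last 24 threads from writing past the end of the vector.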
Now writing your own GPU kernel won’t seem like something only top experts with advanced C++ skills can do: you can do it too!
Let’s check that the kernel works as expected:
u = rand(2^20)
v = rand(2^20)
u_on_gpu = cu(u)
v_on_gpu = cu(v)
u .+= v
gpu_add!(u_on_gpu, v_on_gpu)
@assert Array(u_on_gpu) ≈ u
Yes, it works well!
Note: the ≈
operator checks whether the operands are approximately equal within the float precision limit.
Let’s benchmark our custom kernel:
function benchmark_custom_assign_add!(u, v)
CUDA.@sync gpu_add!(u, v)
return
end
benchmark_custom_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_custom_assign_add!($u_on_gpu, $v_on_gpu)
98.689 μs (52 allocations: 1.31 KiB)
Let’s see how this compares to CUDA.jl’s implementation:
function benchmark_assign_add!(u, v)
CUDA.@sync u .+= v
return
end
benchmark_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_assign_add!($u_on_gpu, $v_on_gpu)
137.072 μs (70 allocations: 1.89 KiB)
How about that? Our custom kernel is faster than CUDA.jl’s kernel! But to be fair, our kernel would not work with huge vectors, since there’s a limit to the number of blocks & threads you can spawn (see Table 15 in CUDA’s documentation). To support such huge vectors, we need each worker to run a loop like this:
function worker_gpu_add!(u, v)
index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
stride = blockDim().x * gridDim().x
for i = index:stride:length(u)
@inbounds u[i] += v[i]
end
return
end
worker_gpu_add! (generic function with 1 method)
This way, if @cuda
is executed with a smaller number of blocks than needed to have one thread per array item, the workers will loop appropriately.
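Again, this can be checked on the CPU. Here is a sketch that simulates the grid-stride loop with far fewer threads than array items, using a hypothetical simulate_grid_stride() helper in which plain loops stand in for the GPU blocks and threads:

```julia
# CPU simulation of the grid-stride loop (simulate_grid_stride is a hypothetical helper).
function simulate_grid_stride(n; threads = 256, blocks = 2)
    hits = zeros(Int, n)
    stride = threads * blocks                 # blockDim().x * gridDim().x
    for block in 1:blocks, thread in 1:threads
        index = (block - 1) * threads + thread
        for i in index:stride:n               # same loop as in the kernel
            hits[i] += 1
        end
    end
    return hits
end

stride_hits = simulate_grid_stride(10_000)    # only 512 simulated threads for 10,000 items
println(all(h -> h == 1, stride_hits))
```

Each simulated thread handles every 512th element starting from its own index, so every element is still updated exactly once.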
This should get you started! For more info, check out CUDA.jl’s documentation.
69. Command Line Arguments
Command line arguments are available via ARGS
:
ARGS
1-element Array{String,1}:
"/root/.local/share/jupyter/runtime/kernel-4b7aa9c6-4581-4d7b-acea-4e4dfaf036c8.json"
Unlike Python’s sys.argv
, the first element of this array is not the program name. If you need the program name, use PROGRAM_FILE
instead:
PROGRAM_FILE
"/root/.julia/packages/IJulia/DrVMH/src/kernel.jl"
You can get the current module, directory, file or line number:
@__MODULE__, @__DIR__, @__FILE__, @__LINE__
(Main, "/content", "In[406]", 1)
The equivalent of Python’s if __name__ == "__main__"
is:
if abspath(PROGRAM_FILE) == @__FILE__
println("Starting of the program")
end
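Putting these pieces together, here is a minimal sketch of a script (say, a hypothetical add.jl) that sums the integers passed on the command line. The main() function name is just a convention borrowed from Python, not something Julia requires:

```julia
# Hypothetical script (say, add.jl) that sums the integers given on the command line.
function main(args)
    numbers = parse.(Int, args)   # command line arguments are always strings
    total = sum(numbers)
    println("Sum: ", total)
    return total
end

# Only run main() when this file is executed directly, not when it is included.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

Running julia add.jl 1 2 3 would then print Sum: 6, while include("add.jl") from another program would only define main().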
70. Memory Management
Let’s check how many megabytes of RAM are available:
free() = println("Available RAM: ", Sys.free_memory() ÷ 10^6, " MB")
free()
Available RAM: 3120 MB
If a variable holds a large object that you don’t need anymore, you can either wait until the variable falls out of scope, or set it to nothing
. Either way, the memory will only be freed when the Garbage Collector does its magic, which may not be immediate. In general, you don’t have to worry about that, but if you want, you can always call the GC directly:
function use_ram()
M = rand(10000, 10000) # use 800 MB of RAM (10^8 Float64 values, 8 bytes each)
println("sum(M)=$(sum(M))")
end # M will be freed by the GC eventually after this
use_ram()
M = rand(10000, 10000) # use 800 MB of RAM
println("sum(M)=$(sum(M))")
M = nothing
GC.gc() # rarely needed
sum(M)=4.9997184380985916e7
sum(M)=5.000422876376158e7
free()
Available RAM: 1528 MB
Thanks!
I hope you enjoyed this introduction to Julia! I recommend you join the friendly and helpful Julia community on Slack or Discourse.
Cheers!
Aurélien Geron
Ref: Git repo for this post.