Julia Programming Language for Pythonistas – A Practical Tutorial

This Julia programming language tutorial is an introduction to Julia for Python programmers. It goes through the most important Python features (such as functions, basic types, list comprehensions, exceptions, generators, modules, packages, and so on) and shows you how to code them in Julia. By the end of this Julia tutorial, you will have a fair mental model of what coding in Julia is all about.

Julia looks and feels a lot like Python, only much faster. It’s dynamic, expressive, extensible, with batteries included, in particular for Data Science.

1. Running This Code Locally

If you prefer to run this code on your machine, then:

  1. Install Julia
  2. Run the following command in a terminal (or Command Prompt on Windows) to install IJulia (the Jupyter kernel for Julia) and a few packages we will use:
julia -e 'using Pkg
            pkg"add IJulia; precompile;"
            pkg"add BenchmarkTools; precompile;"
            pkg"add PyCall; precompile;"
            pkg"add PyPlot; precompile;"'

Next, go to the directory containing this notebook:

cd /path/to/notebook/directory

Start Jupyter Notebook:

julia -e 'using IJulia; IJulia.notebook()'

Or replace notebook() with jupyterlab() if you prefer JupyterLab.

If you do not already have Jupyter installed, IJulia will propose to install it. If you agree, it will automatically install a private Miniconda (just for Julia), and install Jupyter and Python inside it.

2. Checking the Installation

The versioninfo() function should print your Julia version and some other info about the system (if you ever ask for help or file an issue about Julia, you should always provide this information).

versioninfo()

Output:

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_NUM_THREADS = 4

3. Getting Help

To get help on any module, function, variable, or just about anything else, just type ? followed by what you’re interested in. For example:

?versioninfo
#> search: versioninfo
#>
#> versioninfo(io::IO=stdout; verbose::Bool=false)
#>
#> Print information about the version of Julia in use. The output is controlled with boolean keyword arguments:
#> `verbose`: print all additional information

This works in interactive mode only: in Jupyter, Colab and in the Julia shell (called the REPL).

Here are a few more ways to get help and inspect objects in interactive mode:

Julia                                    Python
?obj                                     help(obj)
dump(obj)                                print(repr(obj))
names(FooModule)                         dir(foo_module)
methodswith(SomeType)                    dir(SomeType)
@which func                              func.__module__
apropos("bar")                           Search for "bar" in docstrings of all installed packages
typeof(obj)                              type(obj)
obj isa SomeType or isa(obj, SomeType)   isinstance(obj, SomeType)
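
As a quick illustration of a few rows of this table, here is a minimal sketch you can paste into the REPL or a notebook cell. The variable x is just an example value (it is not part of the tutorial), and the snippet sticks to functions that live in Base, so it needs no imports.

```julia
x = 3.5                          # example value, chosen only for this sketch

println(typeof(x))               # like Python's type(x)
@assert typeof(x) == Float64

# Two equivalent spellings of Python's isinstance(x, SomeType)
@assert x isa Real
@assert isa(x, AbstractFloat)
@assert !(x isa Integer)

# names(SomeModule) lists the names a module exports, a bit like dir(some_module)
@assert :println in names(Base)

dump(x)                          # prints the type and the value of x
```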

If you ever ask for help or file an issue about Julia, you should generally provide the output of versioninfo().

And of course, you can also learn and get help here:

  1. Learning
  2. Documentation
  3. Questions & Discussions:
    1. Discourse
    2. Slack
    3. Stack Overflow

4. A First Look at Julia

This section will give you an idea of what Julia looks like and what some of its major qualities are: it’s expressive, dynamic, flexible, and most of all, super fast.

Estimating π

Let’s write our first function. It will estimate π using the equation:

π = 4 × (1 – 1/3 + 1/5 – 1/7 + 1/9 – 1/11 + …)

There are much better ways to estimate π, but this one is easy to implement.

function estimate_pi(n)
    s = 1.0
    for i in 1:n
        s += (isodd(i) ? -1 : 1) / (2i + 1)
    end
    4s
end

p = estimate_pi(100_000_000)
println("π ≈ $p")
println("Error is $(p - π)")

#> π ≈ 3.141592663589326
#> Error is 9.999532757376528e-9

Compare this with the equivalent Python 3 code:

import math

def estimate_pi(n):
    s = 1.0
    for i in range(1, n + 1):
        s += (-1 if i % 2 else 1) / (2 * i + 1)
    return 4 * s

p = estimate_pi(100_000_000)
print(f"π ≈ {p}") # f-strings are available in Python 3.6+
print(f"Error is {p - math.pi}")

Pretty similar, right? But notice the small differences:

Julia                 Python
function              def
for i in X ... end    for i in X: ...
1:n                   range(1, n+1)
cond ? a : b          a if cond else b
2i + 1                2 * i + 1
4s                    return 4 * s
println(a, b)         print(a, b, sep="")
print(a, b)           print(a, b, sep="", end="")
"$p"                  f"{p}"
"$(p - π)"            f"{p - math.pi}"

This example shows that:
1. Julia can be just as concise and readable as Python.
2. Indentation in Julia is not meaningful like it is in Python. Instead, blocks end with end.
3. Many math features are built into Julia and need no imports.
4. There’s some mathy syntactic sugar, such as 2i (but you can write 2 * i if you prefer).
5. In Julia, the return keyword is optional at the end of a function. The result of the last expression is returned (4s in this example).
6. Julia loves Unicode and does not hesitate to use Unicode characters like π. However, there are generally plain-ASCII equivalents (e.g., π == pi).
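
To make points 4 to 6 concrete, here is a small sketch. The functions odd_number and circle_area are hypothetical helpers written only for this illustration; the one-line definition form used for circle_area is a shorter way of defining a function that Julia also allows.

```julia
# Hypothetical helpers, written only to illustrate the points above
function odd_number(i)
    2i + 1                       # numeric literal coefficient: same as 2 * i + 1
end                              # no `return`: the last expression is the result

circle_area(r) = π * r^2         # short one-line function definition

@assert odd_number(3) == 7
@assert π == pi                  # plain-ASCII alias for the same constant
@assert circle_area(2) ≈ 4pi
println("Area of the unit circle ≈ $(circle_area(1))")
```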

5. Typing Unicode Characters

Typing Unicode characters is easy: for LaTeX symbols like π, just type \pi and press the Tab key. For emojis like 😃, type \:smiley: and press Tab.

This works in the REPL, in Jupyter, but unfortunately not in Colab (yet?). As a workaround, you can run the following code to print the character you want, then copy/paste it:

using REPL.REPLCompletions: latex_symbols, emoji_symbols
latex_symbols["\\pi"]

#> "π"

Emoji

emoji_symbols["\\:smiley:"]
#> "😃"

In Julia, using Foo.Bar: a, b corresponds to running from foo.bar import a, b in Python.

Julia                  Python
using Foo              from foo import *; import foo
using Foo.Bar          from foo.bar import *; from foo import bar
using Foo.Bar: a, b    from foo.bar import a, b
using Foo: Bar         from foo import bar

More on this later.
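
In the meantime, here is a small sketch of the first and third rows of the table, using two standard-library modules (Dates and Unicode) instead of the placeholder Foo. The Python analogies in the comments are approximate, not exact equivalents.

```julia
using Dates                      # roughly: "from dates import *; import dates"
d = Date(2020, 5, 23)
@assert year(d) == 2020          # exported name, usable directly
@assert Dates.year(d) == 2020    # qualified access through the module name

using Unicode: graphemes         # roughly: "from unicode import graphemes"
@assert length(collect(graphemes("abc"))) == 3
```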

6. Running Python code in Julia

Julia lets you easily run Python code using the PyCall module. We installed it earlier, so we just need to import it:

using PyCall

Now that we have imported PyCall, we can use the pyimport() function to import a Python module directly in Julia! For example, let’s check which Python version we are using:

sys = pyimport("sys")
sys.version
#> "3.6.9 (default, Apr 18 2020, 01:56:04) \n[GCC 8.4.0]"

In fact, let’s run the Python code we discussed earlier (this will take about 15 seconds to run, because Python is so slow… ):

# Run Python code in Julia
py"""
import math

def estimate_pi(n):
    s = 1.0
    for i in range(1, n + 1):
        s += (-1 if i % 2 else 1) / (2 * i + 1)
    return 4 * s

p = estimate_pi(100_000_000)
print(f"π ≈ {p}") # f-strings are available in Python 3.6+
print(f"Error is {p - math.pi}")
"""

As you can see, running arbitrary Python code is as simple as using py-strings (py"..."). Note that py-strings are not part of the Julia language itself: they are defined by the PyCall module (we will see how this works later).

Unfortunately, Python’s print() function writes to the standard output, which is not captured by Colab, so we can’t see the output of this code. That’s okay, we can look at the value of p:

# Python 'p'
py"p"
#> 3.141592663589326

Let’s compare this to the value we calculated above using Julia:

# subtract Julia 'p' from Python 'p'
py"p" - p
#> 0.0

Perfect, they are exactly equal!

As you can see, it’s very easy to mix Julia and Python code. So if there’s a module you really love in Python, you can keep using it as long as you want! For example, let’s use NumPy:

np = pyimport("numpy")
a = np.random.rand(2, 3)
#> 2×3 Array{Float64,2}:
#> 0.326131  0.337986  0.475167
#> 0.537621  0.912136  0.792325

Notice that PyCall automatically converts some Python types to Julia types, including NumPy arrays. That’s really quite convenient! Note that Julia supports multi-dimensional arrays (analogous to NumPy arrays) out of the box. Array{Float64, 2} means that it’s a 2-dimensional array of 64-bit floats.

PyCall also converts Julia arrays to NumPy arrays when needed:

exp_a = np.exp(a)
#> 2×3 Array{Float64,2}:
#> 1.3856   1.40212  1.60828
#> 1.71193  2.48963  2.20852

If you want to use some Julia variable in a py-string, for example exp_a, you can do so by writing $exp_a like this:

py"""
import numpy as np

result = np.log($exp_a)
"""

py"result"

#> 2×3 Array{Float64,2}:
#> 0.326131  0.337986  0.475167
#> 0.537621  0.912136  0.792325

If you want to keep using Matplotlib, it’s best to use the PyPlot module (which we installed earlier), rather than trying to use pyimport("matplotlib"), as PyPlot provides a more straightforward interface with Julia, and it plays nicely with Jupyter and Colab:

using PyPlot

x = range(-5π, 5π, length=100)
plt.plot(x, sin.(x) ./ x) # we'll discuss this syntax in the next section
plt.title("sin(x) / x")
plt.grid(true) # pass a Julia Bool rather than the string "True"
plt.show()

That said, Julia has its own plotting libraries, such as the Plots library, which you may want to check out.

As you can see, Julia’s range() function acts much like NumPy’s linspace() function, when you use the length argument.

However, it acts like Python’s range() function when you use the step argument instead (except the upper bound is inclusive). Julia’s range() function returns an object which behaves just like an array, except it doesn’t actually use any RAM for its elements, it just stores the range parameters. If you want to collect all of the elements into an array, use the collect() function (similar to Python’s list() function):

println(collect(range(10, 80, step=20)))
#> [10, 30, 50, 70]

println(collect(10:20:80)) # 10:20:80 is equivalent to the previous range
#> [10, 30, 50, 70]

println(collect(range(10, 80, length=5))) # similar to NumPy's linspace()
#> [10.0, 27.5, 45.0, 62.5, 80.0]

step = (80-10)/(5-1) # 17.5
println(collect(10:step:80)) # equivalent to the previous range
#> [10.0, 27.5, 45.0, 62.5, 80.0]
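
The following sketch illustrates the claim that a range behaves like an array without storing its elements. The variable names (r, lazy, v) are arbitrary, and the size comparison deliberately avoids quoting exact byte counts, which depend on the platform.

```julia
r = range(10, 80, step=20)       # same range as 10:20:80
@assert r == 10:20:80
@assert length(r) == 4
@assert r[2] == 30               # indexable like an array
@assert 50 in r                  # membership test
@assert sum(r) == 160

lazy = 1:1_000_000               # stores only its endpoints
v = collect(lazy)                # materializes one million integers
@assert sizeof(lazy) < sizeof(v)
@assert lazy == v                # same elements, very different memory use
```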

The equivalent Python code is:

# PYTHON
import numpy as np

print(list(range(10, 80+1, 20)))

# there's no shorthand for range() in Python
print(np.linspace(10, 80, 5))
step = (80-10)/(5-1) # 17.5
print([i*step + 10 for i in range(5)])

Julia                                         Python
np = pyimport("numpy")                        import numpy as np
using PyPlot                                  from pylab import *
1:10                                          range(1, 11)
1:2:10 or range(1, 10, step=2)                range(1, 11, 2)
1.2:0.5:10.3 or range(1.2, 10.3, step=0.5)    np.arange(1.2, 10.3, 0.5)
range(1, 10, length=3)                        np.linspace(1, 10, 3)
collect(1:5) or [i for i in 1:5]              list(range(1, 6)) or [i for i in range(1, 6)]

7. Loop Fusion (Similar to Python’s List comprehension)

Did you notice that we wrote sin.(x) ./ x (not sin(x) / x)? This is equivalent to [sin(i) / i for i in x].

a = sin.(x) ./ x
b = [sin(i) / i for i in x]
@assert a == b

This is called a ‘dot’ operation.

This is not just syntactic sugar: it’s actually a very powerful Julia feature. Indeed, notice that the array only gets traversed once. Even if we chained more than two dotted operations, the array would still only get traversed once. This is called loop fusion.

This can be significantly faster than the equivalent NumPy code, even though NumPy is written in C. Why?

Because, when using NumPy arrays, sin(x) / x first computes a temporary array containing sin(x) and then it computes the final array. Two loops and two arrays instead of one. NumPy is implemented in C, and has been heavily optimized, but if you chain many operations, it still ends up being slower and using more RAM than Julia.

However, all the extra dots can sometimes make the code a bit harder to read. To avoid that, you can write @. before an expression: every operation will be “dotted” automatically, like this:

a = @. sin(x) / x
b = sin.(x) ./ x
@assert a == b
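
Here is a slightly longer sketch of loop fusion with several chained operations. The input x and the formulas are arbitrary examples. The second part uses .= to write the fused result into an existing array, which is one way to avoid allocating a new output array.

```julia
x = range(0.1, 1.0, length=5)

# One fused loop over x, with no intermediate arrays
y = @. 2 * sin(x)^2 + cos(x)
@assert y ≈ [2 * sin(v)^2 + cos(v) for v in x]

# With .= the fused result is written into an existing array
out = similar(y)                 # uninitialized array with y's shape and element type
out .= sqrt.(abs.(y)) .+ 1
@assert out ≈ [sqrt(abs(v)) + 1 for v in y]
```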

Note: Julia’s @assert statement starts with an @ sign, just like @., which means that they are macros.

In Julia, macros are very powerful metaprogramming tools. A macro is evaluated at parse time, and it can inspect the expression that follows it and then transform it, or even replace it. In practice, you will often use macros, but you will rarely define your own. I’ll come back to macros later.
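
As a small preview, the sketch below shows that an expression is ordinary data that can be inspected, and defines a toy macro. The macro @twice is hypothetical and exists only for this illustration; it simply emits the expression that follows it two times.

```julia
# An expression is ordinary data that a macro receives before it is evaluated
ex = :(sin(x) / x)
@assert ex isa Expr
@assert ex.head == :call
@assert ex.args[1] == :/         # the function being called

# Hypothetical macro: runs the expression that follows it two times
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end

counter = 0
@twice counter += 1
@assert counter == 2
```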

8. Julia is fast!

Let’s compare the Julia and Python implementations of the estimate_pi() function:

@time estimate_pi(100_000_000);
#> 0.140922 seconds

To get a more precise benchmark, it’s preferable to use the BenchmarkTools module. Just like Python’s timeit module, it provides tools to benchmark code by running it multiple times. This provides a better estimate of how long each call takes.

using BenchmarkTools

@benchmark estimate_pi(100_000_000)

Output:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     133.074 ms (0.00% GC)
  median time:      137.283 ms (0.00% GC)
  mean time:        137.457 ms (0.00% GC)
  maximum time:     145.218 ms (0.00% GC)
  --------------
  samples:          37
  evals/sample:     1

If this output is too verbose for you, simply use @btime instead:

@btime estimate_pi(100_000_000)
#> 132.646 ms (0 allocations: 0 bytes)

Now let’s time the Python version. Since the call is so slow, we just run it once (it will take about 15 seconds):

py"""
from timeit import timeit
duration = timeit("estimate_pi(100_000_000)", number=1, globals=globals())
"""

py"duration"
#> 14.16427015499994

It looks like Julia is close to 100 times faster than Python in this case! To be fair, PyCall does add some overhead, but even if you run this code in a separate Python shell, you will see that Julia crushes (pure) Python when it comes to speed.

So why is Julia so much faster than Python?

Well, Julia compiles the code on the fly as it runs it.
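
One way to observe this compilation step is to time the first and second call of a freshly defined function. This is only a rough sketch: sum_of_squares is a hypothetical function, and the timings vary from run to run and from machine to machine, so no expected values are shown.

```julia
# Hypothetical function used only for this timing sketch
sum_of_squares(n) = sum(i^2 for i in 1:n)

first_call  = @elapsed sum_of_squares(1_000)   # includes compiling the method
second_call = @elapsed sum_of_squares(1_000)   # reuses the compiled code

println("first call:  $first_call s")
println("second call: $second_call s")
@assert sum_of_squares(10) == 385
```

The first call is typically the slower one, because it pays the one-time compilation cost for that argument type.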

Okay, let’s summarize what we learned so far:

Julia is a dynamic language that looks and feels a lot like Python. You can even execute Python code from it very easily, and pure Julia code runs much faster than pure Python code because it is compiled on the fly. I hope this convinces you to read on!

Next, let’s continue to see how Python’s main constructs can be implemented in Julia.

9. Working with Numbers

i = 42 # 64-bit integer
f = 3.14 # 64-bit float
c = 3.4 + 4.5im # 128-bit complex number

bi = BigInt(2)^1000 # arbitrarily long integer
bf = BigFloat(1) / 7 # arbitrary precision

r = 15//6 * 9//20 # rational number
#> 9//8

And the equivalent Python code:

# PYTHON
i = 42
f = 3.14
c = 3.4 + 4.5j

bi = 2**1000 # integers are seamlessly promoted to long integers
from decimal import Decimal
bf = Decimal(1) / 7

from fractions import Fraction
r = Fraction(15, 6) * Fraction(9, 20)

Dividing integers gives floats, like in Python:

5 / 2
#> 2.5

For integer division, use ÷ or div():

5 ÷ 2
#> 2

Or use the equivalent div() function:

div(5, 2)
#> 2

The % operator is the remainder, not the modulo like in Python. These differ only for negative numbers:

# remainder
57 % 10
#> 7

Julia                    Python
3.4 + 4.5im              3.4 + 4.5j
BigInt(2)^1000           2**1000
BigFloat(3.14)           from decimal import Decimal; Decimal(3.14)
9//8                     from fractions import Fraction; Fraction(9, 8)
5/2 == 2.5               5/2 == 2.5
5÷2 == 2 or div(5, 2)    5//2 == 2
57%10 == 7               57%10 == 7
(-57)%10 == -7           (-57)%10 == 3
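
The difference for negative numbers can be checked directly. The sketch below also uses Base's mod() and fld() functions, which are not covered above: they are the flooring counterparts of % and ÷, and they match the behavior of Python's % and // operators.

```julia
# Truncating pair: ÷ (div) and % (rem); the remainder takes the sign of the dividend
@assert (-57) % 10 == -7
@assert rem(-57, 10) == -7
@assert (-57) ÷ 10 == -5

# Flooring pair, matching Python's // and %: fld and mod
@assert mod(-57, 10) == 3
@assert fld(-57, 10) == -6

# Both pairs satisfy: dividend == quotient * divisor + remainder
@assert -57 == ((-57) ÷ 10) * 10 + (-57) % 10
@assert -57 == fld(-57, 10) * 10 + mod(-57, 10)
```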

10. Strings

Julia strings use double quotes " or triple quotes """, but not single quotes ':

s = "ångström" # Julia strings are UTF-8 encoded by default
println(s)
#> ångström
s = "Julia strings
     can span
     several lines\n\n
     and they support the \"usual\" escapes like
     \x41, \u5bb6, and \U0001f60a!"
println(s)

#>    Julia strings
#>    can span
#>    several lines
#>    
#>    
#>    and they support the "usual" escapes like
#>    A, 家, and 😊!

Use repeat() instead of * to repeat a string, and use * instead of + for concatenation:

s = repeat("tick, ", 10) * "BOOM!"
println(s)

#> tick, tick, tick, tick, tick, tick, tick, tick, tick, tick, BOOM!

The equivalent Python code is:

# Python
s = "tick, " * 10 + "BOOM!"
print(s)
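
A few more checks on concatenation and repetition, as a minimal sketch. The ^ operator and the string() function are not mentioned above: ^ is a shorthand for repeat() on strings, and string() converts its arguments to strings and joins them.

```julia
@assert "tick, " * "BOOM!" == "tick, BOOM!"    # * concatenates
@assert repeat("ab", 3) == "ababab"
@assert "ab"^3 == "ababab"                     # ^ is shorthand for repeat
@assert string("n = ", 42, ", x = ", 1.5) == "n = 42, x = 1.5"
```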

Use join(a, s) instead of s.join(a):

s = join([i for i in 1:4], ", ")
println(s)
#> 1, 2, 3, 4

You can also specify a string for the last join:

s = join([i for i in 1:4], ", ", " and ")
#> "1, 2, 3 and 4"

split() works as you might expect:

split("   one    three     four   ")

#> 3-element Array{SubString{String},1}:
#> "one"
#> "three"
#> "four"

You can specify a separator as well.

split("one,,three,four!", ",")
#> 4-element Array{SubString{String},1}:
#> "one"
#> ""
#> "three"
#> "four!"

Check if a pattern occurs in a string.

occursin("sip", "Mississippi")
#> true

Replace a string with another.

replace("I like coffee", "coffee" => "tea")
#> "I like tea"

Triple quotes work a bit like in Python, but they also remove indentation and ignore the first line feed:

s = """
1. the first line feed is ignored if it immediately follows \"""
2. triple quotes let you use "quotes" easily
3. indentation is ignored
    - up to left-most character
    - ignoring the first line (the one with \""")
4. the final line feed is n̲o̲t̲ ignored
"""
println("<start>")
println(s)
println("<end>")

#> <start>
#> 1. the first line feed is ignored if it immediately follows """
#> 2. triple quotes let you use "quotes" easily
#> 3. indentation is ignored
#>     - up to left-most character
#>     - ignoring the first line (the one with """)
#> 4. the final line feed is n̲o̲t̲ ignored
#>
#> <end>

Let’s see some more examples.

11. String Interpolation

String interpolation uses $variable and $(expression):

total = 1 + 2 + 3
s = "1 + 2 + 3 = $total = $(1 + 2 + 3)"
println(s)

#> 1 + 2 + 3 = 6 = 6

This means you must escape the $ sign:

s = "The car costs \$10,000"
println(s)
#> The car costs $10,000
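
Interpolated expressions can be arbitrary Julia code, including function calls. A short sketch with made-up variables (name, xs, price):

```julia
name = "Julia"
xs = [1, 2, 3]

s = "$name has $(length(name)) letters; xs = $xs, sum = $(sum(xs))"
@assert s == "Julia has 5 letters; xs = [1, 2, 3], sum = 6"

price = 10
@assert "\$$price" == "\$10"     # an escaped $ followed by an interpolated variable
```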

12. Raw Strings

Raw strings use raw"..." instead of the r"..." used in Python.

s = raw"In a raw string, you only need to escape quotes \", but not
        $ or \. There is one exception, however: the backslash \
        must be escaped if it's just before quotes like \\\"."
println(s)

#> In a raw string, you only need to escape quotes ", but not
#> $ or \. There is one exception, however: the backslash \
#> must be escaped if it's just before quotes like \".

Another Example

s = raw"""
Triple quoted raw strings are possible too: $, \, \t, "
  - They handle indentation and the first line feed like regular
    triple quoted strings.
  - You only need to escape triple quotes like \""", and the
    backslash before quotes like \\".
"""
println(s)

#> Triple quoted raw strings are possible too: $, \, \t, "
#>   - They handle indentation and the first line feed like regular
#>     triple quoted strings.
#>   - You only need to escape triple quotes like """, and the
#>     backslash before quotes like \".

13. Characters

Single quotes are used for individual Unicode characters:

a = 'å' # Unicode code point (single quotes)
#> 'å': Unicode U+00E5 (category Ll: Letter, lowercase)

To be more precise:
1. A Julia “character” represents a single Unicode code point (sometimes called a Unicode scalar).
2. Multiple code points may be required to produce a single grapheme, i.e., something that readers would recognize as a single character. Such a sequence of code points is called a “Grapheme cluster”.

For example, the character é can be represented either using the single code point \u00E9, or the grapheme cluster e + \u0301:

s = "café"
println(s, " has ", length(s), " code points")
#> café has 4 code points

Alternately:

s = "cafe\u0301"
println(s, " has ", length(s), " code points")
#> café has 5 code points

Iterating over the string in a for loop yields each code point:

for c in "cafe\u0301"
    display(c)
end

#> 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
#> 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
#> 'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
#> 'e': ASCII/Unicode U+0065 (category Ll: Letter, lowercase)
#> '́': Unicode U+0301 (category Mn: Mark, nonspacing)

Julia represents any individual character like 'é' using 32-bits (4 bytes):

sizeof('é')
#> 4

But strings are represented using the UTF-8 encoding. In this encoding, code points 0 to 127 are represented using one byte, but any code point above 127 is represented using 2 to 4 bytes:

sizeof("a")
#> 1

Special characters:

sizeof("é")
#> 2

One more:

sizeof("家")
#> 3

Size of a grapheme.

sizeof("🏳️‍🌈") # this is a grapheme with 4 code points of 4 + 3 + 3 + 4 bytes
#> 14

Size in bytes of each code point in that grapheme, using a comprehension:

[sizeof(string(c)) for c in "🏳️‍🌈"]
#> 4-element Array{Int64,1}:
#> 4
#> 3
#> 3
#> 4
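
The byte counts above can also be inspected with codeunits() and ncodeunits(), which expose the underlying UTF-8 bytes. In this sketch the characters are written with \u escapes so that the exact code points are unambiguous.

```julia
s = "\u00e9\u5bb6"               # "é家", written with escapes
@assert length(s) == 2           # two code points
@assert ncodeunits(s) == 5       # 2 + 3 bytes in UTF-8
@assert sizeof(s) == 5

# The two UTF-8 bytes that encode é (U+00E9)
@assert collect(codeunits("\u00e9")) == [0xc3, 0xa9]

# Number of UTF-8 bytes needed for each character
@assert [ncodeunits(c) for c in s] == [2, 3]
```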

You can iterate through graphemes instead of code points:

using Unicode

for g in graphemes("e\u0301🏳️‍🌈")
  println(g)
end

#> é
#> 🏳️‍🌈

14. String Indexing

Characters in a string are indexed based on the position of their starting byte in the UTF-8 representation. For example, the character ê in the string "être" is located at index 1, but the character 't' is located at index 3, since the UTF-8 encoding of ê is 2 bytes long:

s = "être"
println(s[1])
println(s[3])
println(s[4])
println(s[5])

#>   ê
#>   t
#>   r
#>   e

If you try to get the character at index 2, you get an exception:

try
    s[2]
catch ex
    ex
end
#> StringIndexError("être", 2)

By the way, notice the exception-handling syntax (we’ll discuss exceptions later):

Julia                       Python
try ... catch ex ... end    try: ... except Exception as ex: ...

You can get a substring easily, using valid character indices:

s[1:3]
#> "êt"

You can iterate through a string, and it will return all the code points:

for c in s
    println(c)
end

#> ê
#> t
#> r
#> e

Or you can iterate through the valid character indices:

for i in eachindex(s)
    println(i, ": ", s[i])
end

#> 1: ê
#> 3: t
#> 4: r
#> 5: e

Benefits of representing strings as UTF-8:
1. All Unicode characters are supported.
2. UTF-8 is fairly compact (at least for Latin scripts).
3. It plays nicely with C libraries which expect ASCII characters only, since ASCII characters correspond to the Unicode code points 0 to 127, which UTF-8 encodes exactly like ASCII.

Drawbacks:
1. UTF-8 uses a variable number of bytes per character, which makes indexing harder.
2. However, if the language tried to hide this by making s[5] search for the 5th character from the start of the string, then code like for i in 1:length(s); s[i]; end would be unexpectedly inefficient, since at each iteration there would be a search from the beginning of the string, leading to O(n²) performance instead of O(n).

Find first occurrence of:

findfirst(isequal('t'), "être")
#> 3

Find last occurrence of:

findlast(isequal('p'), "Mississippi")
#> 10

Find the next occurrence, starting the search at index 2:

findnext(isequal('i'), "Mississippi", 2)
#> 2

Find the next occurrence after that one:

findnext(isequal('i'), "Mississippi", 2 + 1)
#> 5

Find previous occurrence of:

findprev(isequal('i'), "Mississippi", 5 - 1)
#> 2

Other useful string functions: ncodeunits(str), codeunit(str, i), thisind(str, i), nextind(str, i, n=1), prevind(str, i, n=1).
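
Here is a minimal sketch of these index helpers applied to the string "être" from this section (written with a \u escape so the ê is a single code point). It assumes nothing beyond Base.

```julia
s = "\u00eatre"                  # "être"
@assert ncodeunits(s) == 5       # 5 bytes in UTF-8
@assert length(s) == 4           # 4 characters
@assert collect(eachindex(s)) == [1, 3, 4, 5]

@assert isvalid(s, 1) && !isvalid(s, 2)   # index 2 is in the middle of ê
@assert thisind(s, 2) == 1       # start of the character that covers byte 2
@assert nextind(s, 1) == 3       # index of the character after ê
@assert prevind(s, 3) == 1
@assert codeunit(s, 1) == 0xc3   # first byte of ê in UTF-8
@assert s[nextind(s, 1)] == 't'
```

Stepping with nextind() instead of adding 1 to an index avoids the StringIndexError shown earlier.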

15. Regular Expressions in Julia

To create a regular expression in Julia, use the r"..." syntax:

regex = r"c[ao]ff?(?:é|ee)"
#> r"c[ao]ff?(?:é|ee)"

The expression r"..." is equivalent to Regex("...") except the former is evaluated at parse time, while the latter is evaluated at runtime, so unless you need to construct a Regex dynamically, you should prefer r"...".

occursin(regex, "A bit more coffee?")
#> true

Return the pattern match

m = match(regex, "A bit more coffee?")
m.match
#> "coffee"

Offset position.

m.offset
#> 12

Another example.

m = match(regex, "A bit more tea?")
isnothing(m) && println("I suggest coffee instead")
#> I suggest coffee instead

One more.

regex = r"(.*)#(.+)"
line = "f(1) # nice comment"
m = match(regex, line)
code, comment = m.captures
println("code: ", repr(code))
println("comment: ", repr(comment))

#> code: "f(1) "
#> comment: " nice comment"

Captures can also be accessed by index:

m[2]
#> " nice comment"

Show Offsets

m.offsets
#> 2-element Array{Int64,1}:
#> 1
#> 7

Named capture groups:

m = match(r"(?<code>.+)#(?<comment>.+)", line)
m[:comment]
#> " nice comment"

Replace

replace("Want more bread?", r"(?<verb>more|some)" => s"a little")
#> "Want a little bread?"

A slightly involved replace example.

replace("Want more bread?", r"(?<verb>more|less)" => s"\g<verb> and \g<verb>")
#> "Want more and more bread?"
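
match() only returns the first match. To loop over every match, Base provides eachmatch(), which is not covered above. A short sketch with a made-up input line and hypothetical group names (key, value):

```julia
line = "x=1, y=22, z=333"        # made-up input
rx = r"(?<key>\w)=(?<value>\d+)"

# eachmatch yields one RegexMatch per non-overlapping match
pairs_found = [(m[:key], parse(Int, m[:value])) for m in eachmatch(rx, line)]
@assert pairs_found == [("x", 1), ("y", 22), ("z", 333)]
```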

16. Control Flow – if statement

Julia’s if statement works just like in Python, with a few differences:

  1. Julia uses elseif instead of Python’s elif.
  2. Julia’s logic operators are just like in C-like languages: && means and, || means or, ! means not, and so on.

a = 1
if a == 1
    println("One")
elseif a == 2
    println("Two")
else
    println("Other")
end

#> One

Julia also has ⊻ for exclusive or (you can type \xor to get the ⊻ character):

@assert false ⊻ false == false
@assert false ⊻ true == true
@assert true ⊻ false == true
@assert true ⊻ true == false

Oh, and notice that true and false are all lowercase, unlike Python’s True and False.

Since && is lazy (like and in Python), cond && f() is a common shorthand for if cond; f(); end. Think of it as “cond then f()”:

a = 2
a == 1 && println("One")
a == 2 && println("Two")
#> Two

Similarly, cond || f() is a common shorthand for if !cond; f(); end. Think of it as “cond else f()”:

a = 1
a == 1 || println("Not one")
a == 2 || println("Not two")
#> Not two
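
These shorthands are often used as one-line guards at the top of a function. The function describe below is hypothetical and only illustrates the pattern.

```julia
# Hypothetical function showing && and || used as one-line guards
function describe(n)
    n isa Integer || return "not an integer"   # "n is an Integer, else return"
    n < 0 && return "negative"                 # "n < 0, then return"
    iseven(n) ? "even" : "odd"
end

@assert describe(2.5) == "not an integer"
@assert describe(-3) == "negative"
@assert describe(4) == "even"
@assert describe(7) == "odd"
```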

All expressions return a value in Julia, including if statements. For example:

a = 1
result = if a == 1
             "one"
         else
             "two"
         end
result
#> "one"

When an expression cannot return anything, it returns nothing:

a = 1
result = if a == 2
            "two"
          end

isnothing(result)
#> true

nothing is the single instance of the type Nothing:

typeof(nothing)
#> Nothing
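
A common use of nothing is to signal "no result", much like None in Python. The sketch below uses a hypothetical function first_even, plus Base's something(), which returns its first argument that is not nothing.

```julia
# Hypothetical lookup that returns nothing when no element matches
function first_even(xs)
    for x in xs
        iseven(x) && return x
    end
    return nothing
end

@assert first_even([1, 3, 4, 5]) == 4
@assert isnothing(first_even([1, 3, 5]))
@assert first_even([1, 3, 5]) === nothing
@assert something(first_even([1, 3, 5]), 0) == 0   # fall back to a default value
```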

17. For loops

You can use for loops just like in Python, as we saw earlier. However, it’s also possible to create nested loops on a single line:

for a in 1:2, b in 1:3, c in 1:2
    println((a, b, c))
end

#> (1, 1, 1)
#> (1, 1, 2)
#> (1, 2, 1)
#> (1, 2, 2)
#> (1, 3, 1)
#> (1, 3, 2)
#> (2, 1, 1)
#> (2, 1, 2)
#> (2, 2, 1)
#> (2, 2, 2)
#> (2, 3, 1)
#> (2, 3, 2)

The corresponding Python code would look like this:

# Python
from itertools import product

for a, b, c in product(range(1, 3), range(1, 4), range(1, 3)):
    print((a, b, c))

The continue and break keywords work just like in Python. Note that in single-line nested loops, break will exit all loops, not just the inner loop:

for a in 1:2, b in 1:3, c in 1:2
    println((a, b, c))
    (a, b, c) == (2, 1, 1) && break
end

#> (1, 1, 1)
#> (1, 1, 2)
#> (1, 2, 1)
#> (1, 2, 2)
#> (1, 3, 1)
#> (1, 3, 2)
#> (2, 1, 1)

Julia does not support the equivalent of Python’s for/else construct. You need to write something like this:

found = false
for person in ["Joe", "Jane", "Wally", "Jack", "Julia"] # try removing "Wally"
    println("Looking at $person")
    person == "Wally" && (found = true; break)
end
found || println("I did not find Wally.")

#> Looking at Joe
#> Looking at Jane
#> Looking at Wally

#> true
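
When the loop body only searches for an element, the flag variable can often be avoided altogether. This sketch shows a few alternatives built on Base functions, using a list without "Wally":

```julia
people = ["Joe", "Jane", "Jack", "Julia"]      # no "Wally" in this list

@assert !("Wally" in people)                   # membership test
@assert findfirst(isequal("Wally"), people) === nothing
@assert findfirst(isequal("Jack"), people) == 3
@assert any(p -> startswith(p, "Ju"), people)  # test with an arbitrary predicate

"Wally" in people || println("I did not find Wally.")
```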

The equivalent Python code looks like this:

# PYTHON
for person in ["Joe", "Jane", "Wally", "Jack", "Julia"]: # try removing "Wally"
    print(f"Looking at {person}")
    if person == "Wally":
        break
else:
    print("I did not find Wally.")

Julia                                         Python
if cond1 ... elseif cond2 ... else ... end    if cond1: ... elif cond2: ... else: ...
&&                                            and
||                                            or
!                                             not
⊻ (type \xor)                                 ^
true                                          True
false                                         False
cond && f()                                   if cond: f()
cond || f()                                   if not cond: f()
for i in 1:5 ... end                          for i in range(1, 6): ...
for i in 1:5, j in 1:6 ... end                from itertools import product; for i, j in product(range(1, 6), range(1, 7)): ...
while cond ... end                            while cond: ...
continue                                      continue
break                                         break

Now let’s look at data structures, starting with tuples.

18. Tuples

Julia has tuples, very much like Python. They can contain anything:

t = (1, "Two", 3, 4, 5)
#> (1, "Two", 3, 4, 5)

Let’s look at one element:

t[1]
#> 1

Hey! Did you see that? Julia is 1-indexed, like Matlab and other math-oriented programming languages, not 0-indexed like Python and most programming languages. I found it easy to get used to, and in fact I quite like it, but your mileage may vary.

Moreover, the indexing bounds are inclusive. In Python, to get the 1st and 2nd elements of a list or tuple, you would write t[0:2] (or just t[:2]), while in Julia you write t[1:2].

t[1:2]
#> (1, "Two")

Note that end represents the index of the last element in the tuple. So you must write t[end] instead of t[-1]. Similarly, you must write t[end - 1], not t[-2], and so on.

t[end]
#> 5

Last two:

t[end - 1:end]
#> (4, 5)

Like in Python, tuples are immutable:

try
  t[2] = 2
catch ex
  ex
end
#> MethodError(setindex!, ((1, "Two", 3, 4, 5), 2, 2), 0x0000000000006a24)

The syntax for empty and 1-element tuples is the same as in Python:

empty_tuple = ()
one_element_tuple = (42,)
#> (42,)

You can unpack a tuple, just like in Python (it’s called “destructuring” in Julia):

a, b, c, d, e = (1, "Two", 3, 4, 5)
println("a=$a, b=$b, c=$c, d=$d, e=$e")
#> a=1, b=Two, c=3, d=4, e=5

It also works with nested tuples, just like in Python:

(a, (b, c), (d, e)) = (1, ("Two", 3), (4, 5))
println("a=$a, b=$b, c=$c, d=$d, e=$e")
#> a=1, b=Two, c=3, d=4, e=5

However, consider this example:

a, b, c = (1, "Two", 3, 4, 5)
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=3

In Python, this would cause a ValueError: too many values to unpack. In Julia, the extra values in the tuple are just ignored.
If you want to capture the extra values in the variable c, you need to do so explicitly:

t = (1, "Two", 3, 4, 5)
a, b = t[1:2]
c = t[3:end]
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=(3, 4, 5)

Or more concisely:

(a, b), c = t[1:2], t[3:end]
println("a=$a, b=$b, c=$c")
#> a=1, b=Two, c=(3, 4, 5)
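
If you are running a more recent Julia than the 1.4 release used for the outputs in this tutorial, there is a shorter form. The sketch below assumes Julia 1.6 or later, where a trailing variable followed by ... collects the remaining values; on older versions it will not parse.

```julia
# Assumes Julia 1.6 or later
t = (1, "Two", 3, 4, 5)
a, b, c... = t                   # c collects the remaining values
@assert (a, b) == (1, "Two")
@assert c == (3, 4, 5)

# Destructuring also makes swapping two variables a one-liner
a, b = b, a
@assert (a, b) == ("Two", 1)
```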

The corresponding Python code is:

# PYTHON
t = (1, "Two", 3, 4, 5)
a, b, *c = t
print(f"a={a}, b={b}, c={c}")

19. Named Tuples

Julia supports named tuples:

nt = (name="Julia", category="Language", stars=5)
#> (name = "Julia", category = "Language", stars = 5)

Access a field by name:

nt.name
#> "Julia"

Get a full dump of the named tuple’s type and fields:

dump(nt)
#> NamedTuple{(:name, :category, :stars),Tuple{String,String,Int64}}
#> name: String "Julia"
#> category: String "Language"
#> stars: Int64 5
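
A few more operations on named tuples, as a minimal sketch using only Base functions (keys, values, merge). The variable updated is just an illustrative name.

```julia
nt = (name="Julia", category="Language", stars=5)

@assert keys(nt) == (:name, :category, :stars)
@assert values(nt) == ("Julia", "Language", 5)
@assert nt[:stars] == 5          # index by field name
@assert nt[3] == 5               # or by position

name, category, stars = nt       # destructure in field order
@assert stars == 5

# Named tuples are immutable; merge builds a new one with updated fields
updated = merge(nt, (stars=4,))
@assert updated.stars == 4 && nt.stars == 5
```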

The corresponding Python code is:

# Python
from collections import namedtuple

Rating = namedtuple("Rating", ["name", "category", "stars"])
nt = Rating(name="Julia", category="Language", stars=5)
print(nt.name) # prints: Julia
print(nt) # prints: Rating(name='Julia', category='Language', stars=5)

20. Structs

Julia supports structs, which hold multiple named fields, a bit like named tuples:

struct Person
    name
    age
end

Structs have a default constructor, which expects all the field values, in order:

p = Person("Mary", 30)
#> Person("Mary", 30)

p.age
#> 30

You can create other constructors by creating functions with the same name as the struct:

function Person(name)
    Person(name, -1)
end

function Person()
    Person("no name")
end

p = Person()
#> Person("no name", -1)

This creates two constructors: the second calls the first, which calls the default constructor. Notice that you can create multiple functions with the same name but different arguments. We will discuss this later.

These two constructors are called “outer constructors”, since they are defined outside of the definition of the struct. You can also define “inner constructors”:

struct Person2
    name
    age
    function Person2(name)
        new(name, -1)
    end
end

function Person2()
    Person2("no name")
end

p = Person2()
#> Person2("no name", -1)

This time, the outer constructor calls the inner constructor, which calls the new() function. This new() function only works in inner constructors, and of course it creates an instance of the struct.

When you define inner constructors, they replace the default constructor:

try
    Person2("Bob", 40)
catch ex
    ex
end
#> MethodError(Person2, ("Bob", 40), 0x0000000000006a29)

Structs usually have very few inner constructors (often just one), which do the heavy duty work, and the checks. Then they may have multiple outer constructors which are mostly there for convenience.
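
Here is a sketch of that division of labor. The struct Temperature is hypothetical: its single inner constructor performs the check, and a small outer constructor provides a convenient default.

```julia
# Hypothetical struct: the single inner constructor validates its input
struct Temperature
    kelvin
    function Temperature(kelvin)
        kelvin >= 0 || throw(ArgumentError("kelvin must be non-negative"))
        new(kelvin)
    end
end

# Convenience outer constructor: defaults to the freezing point of water
Temperature() = Temperature(273.15)

@assert Temperature(300).kelvin == 300
@assert Temperature().kelvin == 273.15

# Invalid values are rejected by the inner constructor
rejected = try
    Temperature(-1)
    false
catch ex
    ex isa ArgumentError
end
@assert rejected
```

Because every outer constructor ends up calling the inner one, the check cannot be bypassed.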

By default, structs are immutable:

try
    p.name = "Someone"
catch ex
    ex
end
ErrorException("setfield! immutable struct of type Person2 cannot be changed")

However, it is possible to define a mutable struct:

mutable struct Person3
    name
    age
end

p = Person3("Lucy", 79)
p.age += 1
p
Person3("Lucy", 80)

Structs look a lot like Python classes, with instance variables and constructors, but where are the methods? We will discuss this later, in the “Methods” section.
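
For comparison, the closest Python counterpart of an immutable struct is probably a frozen dataclass. The sketch below is only an approximation: the type annotations are required by dataclass (the Julia struct above has none), and default field values stand in for the extra outer constructors.

```python
# PYTHON
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)  # frozen=True makes instances immutable, like a Julia struct
class Person:
    name: str = "no name"  # defaults play the role of the extra outer constructors
    age: int = -1

p = Person()
print(p)  # prints: Person(name='no name', age=-1)

try:
    p.name = "Someone"  # assignment to a field of a frozen dataclass fails
except FrozenInstanceError as ex:
    print("Cannot modify:", ex)
```

Dropping frozen=True gives the rough equivalent of a mutable struct.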

21. Arrays

Let’s create a small array:

a = [1, 4, 9, 16]
4-element Array{Int64,1}:
  1
  4
  9
 16

Indexing and assignments work as you would expect:

a[1] = 10
a[2:3] = [20, 30]
a
4-element Array{Int64,1}:
 10
 20
 30
 16

22. Element Type

Since we used only integers when creating the array, Julia inferred that the array is only meant to hold integers (NumPy arrays behave the same way). Let’s try adding a string:

try
  a[3] = "Three"
catch ex
  ex
end
MethodError(convert, (Int64, "Three"), 0x0000000000006a2a)

Nope! We get a MethodError exception, telling us that Julia could not convert the string "Three" to a 64-bit integer (we will discuss exceptions later). If we want an array that can hold any type, like Python’s lists can, we must prefix the array with Any, which is Julia’s root type (like object in Python):

a = Any[1, 4, 9, 16]
a[3] = "Three"
a
4-element Array{Any,1}:
  1
  4
   "Three"
 16
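
Python’s standard library has a loose analogue of this distinction. The sketch below uses the array module (chosen here only because it needs no third-party package) to show a typed container rejecting a string, while a plain list accepts anything, like Any[...] does in Julia:

```python
# PYTHON
from array import array

a = array("q", [1, 4, 9, 16])  # "q" = signed 64-bit integers, like Julia's Int64
try:
    a[2] = "Three"  # a typed array refuses values of the wrong type
except TypeError as ex:
    print("TypeError:", ex)

b = [1, 4, 9, 16]  # a plain list accepts anything, like Any[...] in Julia
b[2] = "Three"
print(b)  # prints: [1, 4, 'Three', 16]
```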

Prefixing with Float64, or String or any other type works as well:

Float64[1, 4, 9, 16]
4-element Array{Float64,1}:
  1.0
  4.0
  9.0
 16.0

An empty array is automatically an Any array:

a = []
0-element Array{Any,1}

You can use the eltype() function to get an array’s element type (the equivalent of NumPy arrays’ dtype):

eltype([1, 4, 9, 16])
Int64

If you create an array containing objects of different types, Julia will do its best to use a type that can hold all the values as precisely as possible. For example, a mix of integers and floats results in a float array:

[1, 2, 3.0, 4.0]
4-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0

This is similar to NumPy’s behavior:

# PYTHON
np.array([1, 2, 3.0, 4.0]) # => array([1., 2., 3., 4.])

A mix of unrelated types results in an Any array:

[1, 2, "Three", 4]
4-element Array{Any,1}:
 1
 2
  "Three"
 4

If you want to live in a world without type constraints, you can prefix all your arrays with Any, and you will feel like you’re coding in Python. But I don’t recommend it: the compiler can perform a bunch of optimizations when it knows exactly the type and size of the data the program will handle, so it will run much faster. So when you create an empty array but you know the type of the values it will contain, you might as well prefix it with that type (you don’t have to, but it will speed up your program).

23. Push and Pop

To append elements to an array, use the push!() function. By convention, functions whose name ends with a bang ! may modify their arguments:

a = [1]
push!(a, 4)
push!(a, 9, 16)
4-element Array{Int64,1}:
  1
  4
  9
 16

This is similar to the following Python code:

# PYTHON
a = [1]
a.append(4)
a.extend([9, 16]) # or simply a += [9, 16]

And pop!() works like in Python:

pop!(a)
16

Equivalent to:

# PYTHON
a.pop()

There are many more functions you can call on an array. We will see later how to find them.

24. Multidimensional Arrays

Importantly, Julia arrays can be multidimensional, just like NumPy arrays:

M = [1   2   3   4
     5   6   7   8
     9  10  11  12]
3×4 Array{Int64,2}:
 1   2   3   4
 5   6   7   8
 9  10  11  12

Another syntax for this is:

M = [1 2 3 4; 5 6 7 8; 9 10 11 12]
3×4 Array{Int64,2}:
 1   2   3   4
 5   6   7   8
 9  10  11  12

You can index them much like NumPy arrays:

M[2:3, 3:4]
2×2 Array{Int64,2}:
  7   8
 11  12

You can transpose a matrix using the “adjoint” operator ' (for complex matrices, ' also conjugates each element; use transpose(M) to get a plain transpose):

M'
4×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
 1  5   9
 2  6  10
 3  7  11
 4  8  12

As you can see, Julia arrays are closer to NumPy arrays than to Python lists.

Arrays can be concatenated vertically using the vcat() function:

M1 = [1 2
      3 4]
M2 = [5 6
      7 8]
vcat(M1, M2)
4×2 Array{Int64,2}:
 1  2
 3  4
 5  6
 7  8

Alternatively, you can use the [M1; M2] syntax:

[M1; M2]
4×2 Array{Int64,2}:
 1  2
 3  4
 5  6
 7  8

To concatenate arrays horizontally, use hcat():

hcat(M1, M2)
2×4 Array{Int64,2}:
 1  2  5  6
 3  4  7  8

Or you can use the [M1 M2] syntax:

[M1 M2]
2×4 Array{Int64,2}:
 1  2  5  6
 3  4  7  8

You can combine horizontal and vertical concatenation:

M3 = [9 10 11 12]
[M1 M2; M3]
3×4 Array{Int64,2}:
 1   2   5   6
 3   4   7   8
 9  10  11  12

Equivalently, you can call the hvcat() function. The first argument specifies the number of arguments to concatenate in each block row:

hvcat((2, 1), M1, M2, M3)
3×4 Array{Int64,2}:
 1   2   5   6
 3   4   7   8
 9  10  11  12

hvcat() is useful to create a single cell matrix:

hvcat(1, 42)
1×1 Array{Int64,2}:
 42

Or a column vector (i.e., an n×1 matrix = a matrix with a single column):

hvcat((1, 1, 1), 10, 11, 12) # a column vector with values 10, 11, 12
hvcat(1, 10, 11, 12) # equivalent to the previous line
3×1 Array{Int64,2}:
 10
 11
 12

Alternatively, you can transpose a row vector (but hvcat() is a bit faster):

[10 11 12]'
3×1 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
 10
 11
 12

The REPL and IJulia call display() to print the result of the last expression in a cell (except when it is nothing). It is fairly verbose:

display([1, 2, 3, 4])
4-element Array{Int64,1}:
 1
 2
 3
 4

The println() function is more concise, but be careful not to confuse vectors, column vectors and row vectors (printed with commas, semi-colons and spaces, respectively):

println("Vector: ", [1, 2, 3, 4])
println("Column vector: ", hvcat(1, 1, 2, 3, 4))
println("Row vector: ", [1 2 3 4])
println("Matrix: ", [1 2 3; 4 5 6])
Vector: [1, 2, 3, 4]
Column vector: [1; 2; 3; 4]
Row vector: [1 2 3 4]
Matrix: [1 2 3; 4 5 6]

Although column vectors are printed as [1; 2; 3; 4], evaluating [1; 2; 3; 4] will give you a regular vector. That’s because [x;y] concatenates x and y vertically, and if x and y are scalars or vectors, you just get a regular vector.

Julia                     Python
a = [1, 2, 3]             a = [1, 2, 3]
                          or
                          import numpy as np
                          np.array([1, 2, 3])
a[1]                      a[0]
a[end]                    a[-1]
a[2:end-1]                a[1:-1]
push!(a, 5)               a.append(5)
pop!(a)                   a.pop()
M = [1 2 3]               np.array([[1, 2, 3]])
M = [1 2 3]'              np.array([[1, 2, 3]]).T
M = hvcat(1, 1, 2, 3)     np.array([[1], [2], [3]])
M = [1 2 3                M = np.array([[1,2,3], [4,5,6]])
     4 5 6]
or
M = [1 2 3; 4 5 6]
M[1:2, 2:3]               M[0:2, 1:3]
[M1; M2]                  np.r_[M1, M2]
[M1 M2]                   np.c_[M1, M2]
[M1 M2; M3]               np.r_[np.c_[M1, M2], M3]

25. Comprehensions

List comprehensions are available in Julia, just like in Python (they’re usually just called “comprehensions” in Julia):

a = [x^2 for x in 1:4]
4-element Array{Int64,1}:
  1
  4
  9
 16

You can filter elements using an if clause, just like in Python:

a = [x^2 for x in 1:5 if x ∉ (2, 4)]
3-element Array{Int64,1}:
  1
  9
 25

  • a ∉ b is equivalent to !(a in b) (or a not in b in Python). You can type ∉ with \notin followed by Tab.
  • a ∈ b is equivalent to a in b. You can type ∈ with \in followed by Tab.

In Julia, comprehensions can contain nested loops, just like in Python:

a = [(i,j) for i in 1:3 for j in 1:i]
6-element Array{Tuple{Int64,Int64},1}:
 (1, 1)
 (2, 1)
 (2, 2)
 (3, 1)
 (3, 2)
 (3, 3)

Here’s the corresponding Python code:

# PYTHON
a = [(i, j) for i in range(1, 4) for j in range(1, i+1)]

Julia comprehensions can also create multi-dimensional arrays (note the different syntax: there is only one for):

a = [row * col for row in 1:3, col in 1:5]
3×5 Array{Int64,2}:
 1  2  3   4   5
 2  4  6   8  10
 3  6  9  12  15
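
Plain Python has no direct counterpart to this two-dimensional form. The nearest equivalent, sketched below, nests one comprehension inside another and yields a list of lists rather than a true matrix (with NumPy you would wrap the result in np.array):

```python
# PYTHON
a = [[row * col for col in range(1, 6)] for row in range(1, 4)]
for line in a:
    print(line)
# prints:
# [1, 2, 3, 4, 5]
# [2, 4, 6, 8, 10]
# [3, 6, 9, 12, 15]
```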

26. Dictionaries

The syntax for dictionaries is a bit different from Python’s:

d = Dict("tree"=>"arbre", "love"=>"amour", "coffee"=>"café")
println(d["tree"])
arbre
println(get(d, "unknown", "pardon?"))
pardon?
keys(d)
Base.KeySet for a Dict{String,String} with 3 entries. Keys:
  "coffee"
  "tree"
  "love"
values(d)
Base.ValueIterator for a Dict{String,String} with 3 entries. Values:
  "café"
  "arbre"
  "amour"
haskey(d, "love")
true
"love" in keys(d) # this is slower than haskey()
true

The equivalent Python code is of course:

d = {"tree": "arbre", "love": "amour", "coffee": "café"}
d["tree"]
d.get("unknown", "pardon?")
d.keys()
d.values()
"love" in d
"love" in d.keys()

Dict comprehensions work as you would expect:

d = Dict(i=>i^2 for i in 1:5)
Dict{Int64,Int64} with 5 entries:
  4 => 16
  2 => 4
  3 => 9
  5 => 25
  1 => 1

Note that the items (aka “pairs” in Julia) come out in an arbitrary order: Julia’s Dict is hash-based and makes no ordering guarantee. This differs from Python, where dictionaries have preserved insertion order since Python 3.7.

You can easily iterate through the dictionary’s pairs like this:

for (k, v) in d
    println("$k maps to $v")
end
4 maps to 16
2 maps to 4
3 maps to 9
5 maps to 25
1 maps to 1

The equivalent code in Python is:

# PYTHON
d = {i: i**2 for i in range(1, 6)}
for k, v in d.items():
    print(f"{k} maps to {v}")

And you can merge dictionaries like this:

d1 = Dict("tree"=>"arbre", "love"=>"amour", "coffee"=>"café")
d2 = Dict("car"=>"voiture", "love"=>"aimer")

d = merge(d1, d2)
Dict{String,String} with 4 entries:
  "car"    => "voiture"
  "coffee" => "café"
  "tree"   => "arbre"
  "love"   => "aimer"

Notice that the second dictionary has priority in case of conflict (it’s "love" => "aimer", not "love" => "amour").

In Python, this would be:

# PYTHON
d1 = {"tree": "arbre", "love": "amour", "coffee": "café"}
d2 = {"car": "voiture", "love": "aimer"}
d = {**d1, **d2}

Or if you want to update the first dictionary instead of creating a new one:

merge!(d1, d2)
Dict{String,String} with 4 entries:
  "car"    => "voiture"
  "coffee" => "café"
  "tree"   => "arbre"
  "love"   => "aimer"

In Python, that’s:

# PYTHON
d1.update(d2)

In Julia, each pair is an actual Pair object:

p = "tree" => "arbre"
println(typeof(p))
k, v = p
println("$k maps to $v")
Pair{String,String}
tree maps to arbre

Note that any object for which a hash() method is implemented can be used as a key in a dictionary. This includes all the basic types like integers and floats, as well as strings, tuples, etc. But it also includes arrays! In Julia, you have the freedom to use arrays as keys (unlike in Python), but make sure not to mutate these arrays after insertion, or else things will break! Indeed, the pairs will be stored in memory in a location that depends on the hash of the key at insertion time, so if that key changes afterwards, you won’t be able to find the pair anymore:

a = [1, 2, 3]
d = Dict(a => "My array")
println("The dictionary is: $d")
println("Indexing works fine as long as the array is unchanged: ", d[a])
a[1] = 10
println("This is the dictionary now: $d")
try
    println("Key changed, indexing is now broken: ", d[a])
catch ex
    ex
end
The dictionary is: Dict([1, 2, 3] => "My array")
Indexing works fine as long as the array is unchanged: My array
This is the dictionary now: Dict([10, 2, 3] => "My array")

KeyError([10, 2, 3])

However, it’s still possible to iterate through the keys, the values or the pairs:

for pair in d
    println(pair)
end
[10, 2, 3] => "My array"

Julia                                     Python
Dict("tree"=>"arbre", "love"=>"amour")    {"tree": "arbre", "love": "amour"}
d["tree"]                                 d["tree"]
get(d, "unknown", "default")              d.get("unknown", "default")
keys(d)                                   d.keys()
values(d)                                 d.values()
haskey(d, k)                              k in d
Dict(i=>i^2 for i in 1:4)                 {i: i**2 for i in range(1, 5)}
for (k, v) in d                           for k, v in d.items():
merge(d1, d2)                             {**d1, **d2}
merge!(d1, d2)                            d1.update(d2)
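
To make the contrast with Python concrete for array keys: Python refuses mutable lists as dictionary keys outright, so the stale-key problem shown earlier cannot arise. A minimal sketch (converting to a tuple is the usual workaround, not the only one):

```python
# PYTHON
key = [1, 2, 3]
try:
    d = {key: "My array"}  # lists are mutable, hence unhashable
except TypeError as ex:
    print("TypeError:", ex)  # prints: TypeError: unhashable type: 'list'

d = {tuple(key): "My array"}  # an immutable tuple works as a key
key[0] = 10                   # mutating the list does not affect the dictionary
print(d[(1, 2, 3)])           # prints: My array
```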

27. Sets

Let’s create a couple of sets:

odd = Set([1, 3, 5, 7, 9, 11])
prime = Set([2, 3, 5, 7, 11])
Set{Int64} with 5 elements:
  7
  2
  3
  11
  5

The order of sets is not guaranteed, just like in Python.

Use in or ∈ (type \in) to check whether a set contains a given value:

5 ∈ odd
true
5 in odd
true

Both of these expressions are equivalent to:

in(5, odd)
true

Now let’s get the union of these two sets:

odd ∪ prime
Set{Int64} with 7 elements:
  7
  9
  2
  3
  11
  5
  1

∪ is the union symbol, not a U. To type this character, type \cup (it has the shape of a cup). Alternatively, you can just use the union() function:

union(odd, prime)
Set{Int64} with 7 elements:
  7
  9
  2
  3
  11
  5
  1

Now let’s get the intersection using the ∩ symbol (type \cap):

odd ∩ prime
Set{Int64} with 4 elements:
  7
  3
  11
  5

Or use the intersect() function:

intersect(odd, prime)
Set{Int64} with 4 elements:
  7
  3
  11
  5

Next, let’s get the set difference and the symmetric difference between these two sets:

setdiff(odd, prime) # values in odd but not in prime
Set{Int64} with 2 elements:
  9
  1
symdiff(odd, prime) # values that are not in the intersection
Set{Int64} with 3 elements:
  9
  2
  1

Lastly, set comprehensions work just fine:

Set([i^2 for i in 1:4])
Set{Int64} with 4 elements:
  4
  9
  16
  1

The equivalent Python code is:

# PYTHON
odds = {1, 3, 5, 7, 9, 11}
primes = {2, 3, 5, 7, 11}
5 in primes
odds | primes # union
odds.union(primes)
odds & primes # intersection
odds.intersection(primes)
odds - primes # set difference
odds.difference(primes)
odds ^ primes # symmetric difference
odds.symmetric_difference(primes)
{i**2 for i in range(1, 5)}

Note that you can store any hashable object in a Set (i.e., any instance of a type for which the hash() method is implemented). This includes arrays, unlike in Python. Just like for dictionary keys, you can add arrays to sets, but make sure not to mutate them after insertion.

Julia                      Python
Set([1, 3, 5, 7])          {1, 3, 5, 7}
5 in odd                   5 in odd
Set([i^2 for i in 1:4])    {i**2 for i in range(1, 5)}
odd ∪ prime                odd | prime
union(odd, prime)          odd.union(prime)
odd ∩ prime                odd & prime
intersect(odd, prime)      odd.intersection(prime)
setdiff(odd, prime)        odd - prime or odd.difference(prime)
symdiff(odd, prime)        odd ^ prime or odd.symmetric_difference(prime)

28. Enums

To create an enum, use the @enum macro:

@enum Fruit apple=1 banana=2 orange=3

This creates the Fruit enum, with 3 possible values. It also binds the names to the values:

banana
banana::Fruit = 2

Or you can get a Fruit instance using the value:

Fruit(2)
banana::Fruit = 2

And you can get all the instances of the enum easily:

instances(Fruit)
(apple, banana, orange)

Julia                                    Python
@enum Fruit apple=1 banana=2 orange=3    from enum import Enum
                                         class Fruit(Enum):
                                             APPLE = 1
                                             BANANA = 2
                                             ORANGE = 3
Fruit(2) === banana                      Fruit(2) is Fruit.BANANA
instances(Fruit)                         list(Fruit)

29. Object Identity

In the previous example, Fruit(2) and banana refer to the same object, not just two objects that happen to be equal. You can verify using the === operator, which is the equivalent of Python’s is operator:

banana === Fruit(2)
true

You can also check this by looking at their objectid(), which is the equivalent of Python’s id() function:

objectid(banana)
0x360d21ab82c8ee67
objectid(Fruit(2))
0x360d21ab82c8ee67

By contrast, two arrays created separately can be equal without being the same object:

a = [1, 2, 4]
b = [1, 2, 4]
@assert a == b  # a and b are equal
@assert a !== b # but they are not the same object

Julia             Python
a === b           a is b
a !== b           a is not b
objectid(obj)     id(obj)
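
Here is a short Python sketch of the same distinction between equality and identity. The extra variable c is introduced only for this illustration:

```python
# PYTHON
a = [1, 2, 4]
b = [1, 2, 4]
c = a  # a second name for the same list object

assert a == b          # equal contents
assert a is not b      # but two distinct objects
assert c is a          # c and a are the same object
assert id(c) == id(a)  # so their ids match as well

c.append(8)            # mutating through one name is visible through the other
print(a)               # prints: [1, 2, 4, 8]
```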

30. Other Collections

For the Julia equivalent of Python’s other collections, namely defaultdict, deque, OrderedDict, and Counter, check out these libraries:

  • https://github.com/JuliaCollections/DataStructures.jl
  • https://github.com/JuliaCollections/OrderedCollections.jl
  • https://github.com/andyferris/Dictionaries.jl

Now let’s look at various iteration constructs.

Iteration Tools

31. Generator Expressions

Just like in Python, a generator expression resembles a list comprehension, but without the square brackets, and it returns a generator instead of a list. Here’s a much shorter implementation of the estimate_pi() function using a generator expression:

function estimate_pi2(n)
    4 * sum((isodd(i) ? -1 : 1)/(2i+1) for i in 0:n)
end

@assert estimate_pi(100) == estimate_pi2(100)

That’s very similar to the corresponding Python code:

# PYTHON
def estimate_pi2(n):
  return 4 * sum((-1 if i%2==1 else 1)/(2*i+1) for i in range(n+1))

assert estimate_pi(100) == estimate_pi2(100)

zip, enumerate, collect

The zip() function works much like in Python:

for (i, s) in zip(10:13, ["Ten", "Eleven", "Twelve"])
    println(i, ": ", s)
end
10: Ten
11: Eleven
12: Twelve

Notice that the parentheses in for (i, s) are required in Julia, as opposed to Python.

The enumerate() function also works like in Python, except of course it is 1-indexed:

for (i, s) in enumerate(["One", "Two", "Three"])
    println(i, ": ", s)
end
1: One
2: Two
3: Three
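
In Python, enumerate() counts from 0 by default, so the equivalent of the Julia loop above needs the start argument:

```python
# PYTHON
for i, s in enumerate(["One", "Two", "Three"], start=1):
    print(f"{i}: {s}")
# prints:
# 1: One
# 2: Two
# 3: Three
```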

To pull the values of a generator into an array, use collect():

collect(1:5)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

A shorter syntax for that is:

[1:5;]
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

The equivalent Python code is:

# PYTHON
list(range(1, 6))

32. Generators

In Python, you can easily write a generator function to create an object that will behave like an iterator. For example, let’s create a generator for the Fibonacci sequence (where each number is the sum of the two previous numbers):

# PYTHON
def fibonacci(n):
    a, b = 1, 1
    for i in range(n):
        yield a
        a, b = b, a + b

for f in fibonacci(10):
    print(f)

This is also quite easy in Julia:

function fibonacci(n)
    Channel() do ch
        a, b = 1, 1
        for i in 1:n
            put!(ch, a)
            a, b = b, a + b
        end
    end
end

for f in fibonacci(10)
    println(f)
end
1
1
2
3
5
8
13
21
34
55

The Channel type is part of the API for tasks and coroutines. We’ll discuss these later.

Now let’s take a closer look at functions.

33. Functions

Arguments

Julia functions support positional arguments and default values:

function draw_face(x, y, width=3, height=4)
    println("x=$x, y=$y, width=$width, height=$height")
end

draw_face(10, 20, 30)
x=10, y=20, width=30, height=4

However, unlike in Python, positional arguments cannot be passed by name when the function is called:

try
    draw_face(10, 20, width=30)
catch ex
    ex
end
MethodError(var"#draw_face##kw"(), ((width = 30,), draw_face, 10, 20), 0x0000000000006a3e)
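
Python 3.8+ can reproduce this behavior with the positional-only marker /. The sketch below mirrors draw_face(), except that it returns the string instead of printing it, to keep the example easy to check:

```python
# PYTHON (3.8+)
def draw_face(x, y, width=3, height=4, /):
    # Everything before the / can only be passed positionally
    return f"x={x}, y={y}, width={width}, height={height}"

print(draw_face(10, 20, 30))  # prints: x=10, y=20, width=30, height=4

try:
    draw_face(10, 20, width=30)  # naming a positional-only argument fails
except TypeError as ex:
    print("TypeError:", ex)
```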

Julia also supports a variable number of arguments (called “varargs”) using the syntax arg..., which is the equivalent of Python’s *arg:

function copy_files(target_dir, paths...)
    println("target_dir=$target_dir, paths=$paths")
end

copy_files("/tmp", "a.txt", "b.txt")
target_dir=/tmp, paths=("a.txt", "b.txt")

Keyword arguments are supported, after a semicolon ;:

function copy_files2(paths...; confirm=false, target_dir)
    println("paths=$paths, confirm=$confirm, $target_dir")
end

copy_files2("a.txt", "b.txt"; target_dir="/tmp")
paths=("a.txt", "b.txt"), confirm=false, /tmp

Notes:

  • target_dir has no default value, so it is a required argument.
  • The order of the keyword arguments does not matter.

You can have another vararg in the keyword section. It corresponds to Python’s **kwargs:

function copy_files3(paths...; confirm=false, target_dir, options...)
    println("paths=$paths, confirm=$confirm, $target_dir")
    verbose = options[:verbose]
    println("verbose=$verbose")
end

copy_files3("a.txt", "b.txt"; target_dir="/tmp", verbose=true, timeout=60)
paths=("a.txt", "b.txt"), confirm=false, /tmp
verbose=true

The options vararg acts like a dictionary (we discussed dictionaries earlier). The keys are symbols, e.g., :verbose. Symbols are like strings, less flexible but faster. They are typically used as keys or identifiers.
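
For reference, a Python version of copy_files3() might look like the sketch below. The keys of options are plain strings rather than symbols, and returning options is an addition made here only so that the result can be inspected:

```python
# PYTHON
def copy_files3(*paths, confirm=False, target_dir, **options):
    print(f"paths={paths}, confirm={confirm}, {target_dir}")
    verbose = options["verbose"]  # options is a regular dict with string keys
    print(f"verbose={verbose}")
    return options

options = copy_files3("a.txt", "b.txt", target_dir="/tmp", verbose=True, timeout=60)
print(options)  # prints: {'verbose': True, 'timeout': 60}
```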

Julia                                        Python (3.8+ if / is used)

function foo(a, b=2, c=3)                    def foo(a, b=2, c=3, /):
    ...                                          ...
end
foo(1, 2) # positional only                  foo(1, 2) # pos only because of /

function foo(;a=1, b, c=3)                   def foo(*, a=1, b, c=3):
    ...                                          ...
end
foo(c=30, b=2) # keyword only                foo(c=30, b=2) # kw only because of *

function foo(a, b=2; c=3, d)                 def foo(a, b=2, /, *, c=3, d):
    ...                                          ...
end
foo(1; d=4) # pos only; then keyword only    foo(1, d=4) # pos only then kw only

function foo(a, b=2, c...)                   def foo(a, b=2, /, *c):
    ...                                          ...
end
foo(1, 2, 3, 4) # positional only            foo(1, 2, 3, 4) # positional only

function foo(a, b=1, c...; d=1, e, f...)     def foo(a, b=1, /, *c, d=1, e, **f):
    ...                                          ...
end
foo(1, 2, 3, 4, e=5, x=10, y=20)             foo(1, 2, 3, 4, e=5, x=10, y=20)

34. Concise Functions

In Julia, the following definition:

square(x) = x^2
square (generic function with 1 method)

is equivalent to:

function square(x)
    x^2
end
square (generic function with 1 method)

For example, here’s a shorter way to define the estimate_pi() function in Julia:

estimate_pi3(n) = 4 * sum((isodd(i) ? -1 : 1)/(2i+1) for i in 0:n)
estimate_pi3 (generic function with 1 method)

To define a function on one line in Python, you need to use a lambda (but this is generally frowned upon, since the resulting function’s name is "<lambda>"):

# PYTHON
square = lambda x: x**2
assert square.__name__ == "<lambda>"

This leads us to anonymous functions.

35. Anonymous Functions

Just like in Python, you can define anonymous functions:

map(x -> x^2, 1:4)
4-element Array{Int64,1}:
  1
  4
  9
 16

Here is the equivalent Python code:

list(map(lambda x: x**2, range(1, 5)))

Notes:

  • map() returns an array in Julia, instead of an iterator like in Python.
  • You could use a comprehension instead: [x^2 for x in 1:4].

Julia                     Python
x -> x^2                  lambda x: x**2
(x,y) -> x + y            lambda x,y: x + y
() -> println("yes")      lambda: print("yes")

In Python, lambda functions must be simple expressions. They cannot contain multiple statements. In Julia, they can be as long as you want. Indeed, you can create a multi-statement block using the syntax (stmt_1; stmt_2; ...; stmt_n). The return value is the output of the last statement. For example:

map(x -> (println("Number $x"); x^2), 1:4)
Number 1
Number 2
Number 3
Number 4

4-element Array{Int64,1}:
  1
  4
  9
 16

This syntax can span multiple lines:

map(x -> (
  println("Number $x");
  x^2), 1:4)
Number 1
Number 2
Number 3
Number 4

4-element Array{Int64,1}:
  1
  4
  9
 16

But in this case, it’s probably clearer to use the begin ... end syntax instead:

map(x -> begin
        println("Number $x")
        x^2
    end, 1:4)
Number 1
Number 2
Number 3
Number 4

4-element Array{Int64,1}:
  1
  4
  9
 16

Notice that this syntax allows you to drop the semicolons ; at the end of each line in the block.

Yet another way to define an anonymous function is using the function (args) ... end syntax:

map(function (x)
        println("Number $x")
        x^2
    end, 1:4)
Number 1
Number 2
Number 3
Number 4

4-element Array{Int64,1}:
  1
  4
  9
 16

Lastly, if you’re passing the anonymous function as the first argument to a function (as is the case in this example), it’s usually much preferable to define the anonymous function immediately after the function call, using the do syntax, like this:

map(1:4) do x
  println("Number $x")
  x^2
end
Number 1
Number 2
Number 3
Number 4

4-element Array{Int64,1}:
  1
  4
  9
 16

This syntax lets you easily define constructs that feel like language extensions:

function my_for(func, collection)
    for i in collection
        func(i)
    end
end

my_for(1:4) do i
    println("The square of $i is $(i^2)")
end
The square of 1 is 1
The square of 2 is 4
The square of 3 is 9
The square of 4 is 16

In fact, Julia has a similar foreach() function.

The do syntax could be used to write a Domain Specific Language (DSL), for example an infrastructure automation DSL:

function spawn_server(startup_func, server_type)
    println("Starting $server_type server")
    server_id = 1234
    println("Configuring server $server_id...")
    startup_func(server_id)
end

# This is the DSL part
spawn_server("web") do server_id
    println("Creating HTML pages on server $server_id...")
end
Starting web server
Configuring server 1234...
Creating HTML pages on server 1234...

It’s also quite nice for event-driven code:

handlers = []

on_click(handler) = push!(handlers, handler)

click(event) = foreach(handler->handler(event), handlers)

on_click() do event
    println("Mouse clicked at $event")
end

on_click() do event
    println("Beep.")
end

click((x=50, y=20))
click((x=120, y=10))
Mouse clicked at (x = 50, y = 20)
Beep.
Mouse clicked at (x = 120, y = 10)
Beep.
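
Python has no do blocks, but decorators give a comparable feel for registering handlers. In the sketch below, the names report_position and beep are made up for this illustration, and a dict stands in for the named tuple used as the event:

```python
# PYTHON
handlers = []

def on_click(handler):
    handlers.append(handler)
    return handler  # return it so the decorated name stays bound to the function

def click(event):
    for handler in handlers:
        handler(event)

@on_click
def report_position(event):
    print(f"Mouse clicked at {event}")

@on_click
def beep(event):
    print("Beep.")

click({"x": 50, "y": 20})
# prints:
# Mouse clicked at {'x': 50, 'y': 20}
# Beep.
```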

It can also be used to create context managers, for example to automatically close an object after it has been used, even if an exception is raised:

function with_database(func, name)
    println("Opening connection to database $name")
    db = "a db object for database $name"
    try
        func(db)
    finally
        println("Closing connection to database $name")
    end
end

with_database("jobs") do db
    println("I'm working with $db")
    #error("Oops") # try uncommenting this line
end
Opening connection to database jobs
I'm working with a db object for database jobs
Closing connection to database jobs

The equivalent code in Python would look like this:

# PYTHON
class Database:
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        print(f"Opening connection to database {self.name}")
        return f"a db object for database {self.name}"
    def __exit__(self, type, value, traceback):
        print(f"Closing connection to database {self.name}")

with Database("jobs") as db:
    print(f"I'm working with {db}")
    #raise Exception("Oops") # try uncommenting this line

Or you could use contextlib:

from contextlib import contextmanager

@contextmanager
def database(name):
    print(f"Opening connection to database {name}")
    db = f"a db object for database {name}"
    try:
        yield db
    finally:
        print(f"Closing connection to database {name}")

with database("jobs") as db:
    print(f"I'm working with {db}")
    #raise Exception("Oops") # try uncommenting this line
 

36. Piping

If you are used to the object-oriented syntax "a b c".upper().split(), you may feel that writing split(uppercase("a b c")) is a bit backwards. If so, the piping operator |> is for you:

"a b c" |> uppercase |> split
3-element Array{SubString{String},1}:
 "A"
 "B"
 "C"

If you want to pass more than one argument to some of the functions, you can use anonymous functions:

"a b c" |> uppercase |> split |> tokens->join(tokens, ", ")
"A, B, C"

The dotted version of the pipe operator works as you might expect, applying the i-th function of the right array to the i-th value in the left array:

[π/2, "hello", 4] .|> [sin, length, x->x^2]
3-element Array{Real,1}:
  1.0
  5
 16
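
Python has no pipe operator, but the same left-to-right flow can be approximated with a small helper. The pipe() function below is a hypothetical helper written for this sketch, not part of the standard library; the last lines mimic the dotted version with zip():

```python
# PYTHON
import math
from functools import reduce

def pipe(value, *funcs):
    """Feed value through funcs from left to right (hypothetical helper)."""
    return reduce(lambda acc, f: f(acc), funcs, value)

print(pipe("a b c", str.upper, str.split))             # prints: ['A', 'B', 'C']
print(pipe("a b c", str.upper, str.split, ", ".join))  # prints: A, B, C

# Rough counterpart of .|> : apply the i-th function to the i-th value
values = [math.pi / 2, "hello", 4]
funcs = [math.sin, len, lambda x: x**2]
print([f(x) for x, f in zip(values, funcs)])           # prints: [1.0, 5, 16]
```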

37. Composition

Julia also lets you compose functions like mathematicians do, using the composition operator ∘ (\circ in the REPL or Jupyter, but not Colab):

f = exp ∘ sin ∘ sqrt
f(2.0) == exp(sin(sqrt(2.0)))
true
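
Python’s standard library has no composition operator either. One possible sketch, again using a hypothetical helper named compose(), builds the composed function with functools.reduce:

```python
# PYTHON
from functools import reduce
from math import exp, sin, sqrt

def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x))) (hypothetical helper)."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

f = compose(exp, sin, sqrt)
print(f(2.0) == exp(sin(sqrt(2.0))))  # prints: True
```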

38. Methods

Earlier, we discussed structs, which look a lot like Python classes, with instance variables and constructors, but they did not contain any methods (just the inner constructors). In Julia, methods are defined separately, like regular functions:

struct Person
    name
    age
end

function greetings(greeter)
    println("Hi, my name is $(greeter.name), I am $(greeter.age) years old.")
end

p = Person("Alice", 70)
greetings(p)
Hi, my name is Alice, I am 70 years old.

Since the greetings() method in Julia is not bound to any particular type, we can use it with any other type we want, as long as that type has a name and an age (i.e., if it quacks like a duck):

struct City
    name
    country
    age
end

using Dates
c = City("Auckland", "New Zealand", year(now()) - 1840)

greetings(c)
Hi, my name is Auckland, I am 180 years old.

You could code this the same way in Python if you wanted to:

# PYTHON
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class City:
    def __init__(self, name, country, age):
        self.name = name
        self.country = country
        self.age = age

def greetings(greeter):
    print(f"Hi there, my name is {greeter.name}, I am {greeter.age} years old.")

p = Person("Lucy", 70)
greetings(p)

from datetime import date
c = City("Auckland", "New Zealand", date.today().year - 1840)
greetings(c)

However, many Python programmers would use inheritance in this case:

class Greeter:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def greetings(self):
        print(f"Hi there, my name is {self.name}, I am {self.age} years old.")

class Person(Greeter):
    def __init__(self, name, age):
        super().__init__(name, age)

class City(Greeter):
    def __init__(self, name, country, age):
        super().__init__(name, age)
        self.country = country

p = Person("Lucy", 70)
p.greetings()

from datetime import date
c = City("Auckland", "New Zealand", date.today().year - 1840)
c.greetings()

39. Extending a Function

One nice thing about having a class hierarchy is that you can override methods in subclasses to get specialized behavior for each class. For example, in Python you could override the greetings() method like this:

# PYTHON
class Developer(Person):
    def __init__(self, name, age, language):
        super().__init__(name, age)
        self.language = language
    def greetings(self):
        print(f"Hi there, my name is {self.name}, I am {self.age} years old.")
        print(f"My favorite language is {self.language}.")

d = Developer("Amy", 40, "Julia")
d.greetings()

Notice that the expression d.greetings() will call a different method if d is a Person or a Developer. This is called “polymorphism”: the same method call behaves differently depending on the type of the object. The language chooses which actual method implementation to call, based on the type of d: this is called method “dispatch”. More specifically, since it only depends on a single variable, it is called “single dispatch”.

The good news is that Julia can do single dispatch as well:

struct Developer
    name
    age
    language
end

function greetings(dev::Developer)
    println("Hi, my name is $(dev.name), I am $(dev.age) years old.")
    println("My favorite language is $(dev.language).")
end

d = Developer("Amy", 40, "Julia")
greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.

Notice that the dev argument is followed by ::Developer, which means that this method will only be called if the argument has that type.

We have extended the greetings function, so that it now has two different implementations, called methods, each for different argument types: namely, greetings(dev::Developer) for arguments of type Developer, and greetings(greeter) for values of any other type.

You can easily get the list of all the methods of a given function:

methods(greetings)

Two methods for generic function greetings:

  • greetings(dev::Developer) in Main at In[200]:8
  • greetings(greeter) in Main at In[198]:7

You can also get the list of all the methods which take a particular type as argument:

methodswith(Developer)

1-element Array{Method,1}:

  • greetings(dev::Developer) in Main at In[200]:8

When you call the greetings() function, Julia automatically dispatches the call to the appropriate method, depending on the type of the argument. If Julia can determine at compile time what the type of the argument will be, then it optimizes the compiled code so that there’s no choice to be made at runtime. This is called static dispatch, and it can significantly speed up the program. If the argument’s type can’t be determined at compile time, then Julia makes the choice at runtime, just like in Python: this is called dynamic dispatch.
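
Python can also dispatch a free function on the type of its first argument, without any class hierarchy, using functools.singledispatch. The sketch below mirrors the two greetings methods. Dataclasses are used only for brevity, and the functions return strings rather than printing directly, to keep the example easy to check:

```python
# PYTHON
from dataclasses import dataclass
from functools import singledispatch

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Developer:
    name: str
    age: int
    language: str

@singledispatch
def greetings(greeter):  # fallback for any type without a specific method
    return f"Hi, my name is {greeter.name}, I am {greeter.age} years old."

@greetings.register
def _(dev: Developer):  # chosen when the first argument is a Developer
    return (f"Hi, my name is {dev.name}, I am {dev.age} years old.\n"
            f"My favorite language is {dev.language}.")

print(greetings(Person("Alice", 70)))
print(greetings(Developer("Amy", 40, "Julia")))
```

Unlike Julia, this dispatch always happens at runtime and only considers the first argument.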

41. Multiple Dispatch

Julia actually looks at the types of all the positional arguments, not just the first one. This is called multiple dispatch. For example:

multdisp(a::Int64, b::Int64) = 1
multdisp(a::Int64, b::Float64) = 2
multdisp(a::Float64, b::Int64) = 3
multdisp(a::Float64, b::Float64) = 4

multdisp(10, 20) # try changing the arguments to get each possible output
1

Julia always chooses the most specific method it can, so the following method will only be called if the first argument is neither an Int64 nor a Float64:

multdisp(a::Any, b::Int64) = 5

multdisp(10, 20)
1

Julia will raise an exception if there is some ambiguity as to which method is the most specific:

ambig(a::Int64, b) = 1
ambig(a, b::Int64) = 2

try
    ambig(10, 20)
catch ex
    ex
end
MethodError(ambig, (10, 20), 0x0000000000006a68)

To solve this problem, you can explicitly define a method for the ambiguous case:

ambig(a::Int64, b::Int64) = 3
ambig(10, 20)
3
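
The following sketch shows that only the truly ambiguous call fails, while the other combinations dispatch normally. The `clash` function is a hypothetical name chosen to avoid colliding with `ambig` above.

```julia
# Hypothetical function with two equally specific methods for (Int64, Int64)
clash(a::Int64, b) = "first argument is an Int64"
clash(a, b::Int64) = "second argument is an Int64"

# Unambiguous calls work fine
println(clash(1, "x"))
println(clash("x", 2))

# The ambiguous call throws a MethodError
outcome = try
    clash(1, 2)
catch ex
    ex isa MethodError ? "ambiguous call" : rethrow()
end
println(outcome)
```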

So you can have polymorphism in Julia, just like in Python. This means that you can write your algorithms in a generic way, without having to know the exact types of the values you are manipulating, and it will work fine, as long as these types act in the general way you expect (i.e., if they “quack like ducks”). For example:

function how_can_i_help(greeter)
    greetings(greeter)
    println("How can I help?")
end

how_can_i_help(p) # called on a Person
how_can_i_help(d) # called on a Developer
Hi, my name is Alice, I am 70 years old.
How can I help?
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.
How can I help?

42. Calling super( )?

You may have noticed that the greetings(dev::Developer) method could be improved, since it currently duplicates the implementation of the base method greetings(greeter). In Python, you would get rid of this duplication by calling the base class’s greetings() method, using super():

# PYTHON
class Developer(Person):
    def __init__(self, name, age, language):
        super().__init__(name, age)
        self.language = language
    def greetings(self):
        super().greetings() # <== THIS!
        print(f"My favorite language is {self.language}.")

d = Developer("Amy", 40, "Julia")
d.greetings()

In Julia, you can do something pretty similar, although you have to implement your own super() function, as it is not part of the language:

super(dev::Developer) = Person(dev.name, dev.age)

function greetings(dev::Developer)
    greetings(super(dev))
    println("My favorite language is $(dev.language).")
end

greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.

However, this implementation creates a new Person instance when calling super(dev), copying the name and age fields. That’s okay for small objects, but it’s not ideal for larger ones. Instead, you can explicitly call the specific method you want by using the invoke() function:

function greetings(dev::Developer)
    invoke(greetings, Tuple{Any}, dev)
    println("My favorite language is $(dev.language).")
end

greetings(d)
Hi, my name is Amy, I am 40 years old.
My favorite language is Julia.

The invoke() function expects the following arguments:
* The first argument is the function to call.
* The second argument is the type of the desired method’s arguments tuple: Tuple{TypeArg1, TypeArg2, etc.}. In this case we want to call the base function, which takes a single Any argument (the Any type is implicit when no type is specified).
* Lastly, it takes all the arguments to be passed to the method. In this case, there’s just one: dev.
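
As a small sketch of how the second argument selects a method, here is a hypothetical `label` function (not part of the tutorial's code) whose methods each delegate to a less specific one through invoke():

```julia
# Hypothetical function with three methods, from generic to specific
label(x) = "value"
label(x::Number) = "number > " * invoke(label, Tuple{Any}, x)
label(x::Integer) = "integer > " * invoke(label, Tuple{Number}, x)

println(label("hi"))  # only the generic method applies
println(label(1.5))   # Number method, then the generic one
println(label(42))    # Integer method, then Number, then generic
```

Each call to invoke() names the argument types of the method it wants, so no temporary object has to be created.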

As you can see, we managed to get the same advantages Object-Oriented programming offers, without defining classes or using inheritance. This takes a bit of getting used to, but you might come to prefer this style of generic programming. Indeed, OO programming encourages you to bundle data and behavior together, but this is not always a good idea. Let’s look at one example:

# PYTHON
class Rectangle:
    def __init__(self, height, width):
        self.height = height
        self.width = width
    def area(self):
        return self.height * self.width

class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

It makes sense for the Square class to be a subclass of the Rectangle class, since a square is a special type of rectangle. It also makes sense for the Square class to inherit all of the Rectangle class’s behavior, such as the area() method. However, it does not really make sense for rectangles and squares to have the same memory representation: a Rectangle needs two numbers (height and width), while a Square only needs one (length).

It’s possible to work around this issue like this:

# PYTHON
class Rectangle:
    def __init__(self, height, width):
        self.height = height
        self.width = width
    def area(self):
        return self.height * self.width

class Square(Rectangle):
    def __init__(self, length):
        self.length = length
    @property
    def width(self):
        return self.length
    @property
    def height(self):
        return self.length

That’s better: now, each square is only represented using a single number. We’ve inherited the behavior, but not the data.

In Julia, you could code this like so:

struct Rectangle
    width
    height
end

width(rect::Rectangle) = rect.width
height(rect::Rectangle) = rect.height

area(rect) = width(rect) * height(rect)

struct Square
    length
end

width(sq::Square) = sq.length
height(sq::Square) = sq.length
height (generic function with 2 methods)
area(Square(5))
25

Notice that the area() function relies on the getters width() and height(), rather than directly on the fields width and height. This way, the argument can be of any type at all, as long as it has these getters.
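
To illustrate why relying on getters pays off, here is a self-contained sketch. All names (`Box`, `Tile`, `Scaled`, `box_width`, `box_height`, `surface`) are hypothetical stand-ins for the types and functions above. A third type is added afterwards, and the generic function works with it unchanged.

```julia
struct Box        # stores both dimensions
    w
    h
end

struct Tile       # stores a single side length
    side
end

# Getters: the only thing `surface` needs from its argument
box_width(b::Box) = b.w
box_height(b::Box) = b.h
box_width(t::Tile) = t.side
box_height(t::Tile) = t.side

surface(shape) = box_width(shape) * box_height(shape)

# A wrapper type added later, without touching `surface`
struct Scaled
    shape
    factor
end
box_width(s::Scaled) = s.factor * box_width(s.shape)
box_height(s::Scaled) = s.factor * box_height(s.shape)

println(surface(Box(2, 3)))           # 6
println(surface(Tile(5)))             # 25
println(surface(Scaled(Tile(5), 2)))  # 100
```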

43. Abstract Types

One nice thing about the class hierarchy we defined in Python is that it makes it clear that a square is a kind of rectangle. Any new function you define that takes a Rectangle as an argument will automatically accept a Square as well, but no other non-rectangle type. In contrast, our area() function currently accepts anything at all.

In Julia, a concrete type like Square cannot extend another concrete type like Rectangle. However, any type can extend an abstract type. Let’s define some abstract types to create a type hierarchy for our Square and Rectangle types.

abstract type AbstractShape end
abstract type AbstractRectangle <: AbstractShape end  # <: means "subtype of"
abstract type AbstractSquare <: AbstractRectangle end

The <: operator means “subtype of”.

Now we can attach the area() function to the AbstractRectangle type, instead of any type at all:

area(rect::AbstractRectangle) = width(rect) * height(rect)
area (generic function with 2 methods)

Now we can define the concrete types, as subtypes of AbstractRectangle and AbstractSquare:

struct Rectangle_v2 <: AbstractRectangle
  width
  height
end

width(rect::Rectangle_v2) = rect.width
height(rect::Rectangle_v2) = rect.height

struct Square_v2 <: AbstractSquare
  length
end

width(sq::Square_v2) = sq.length
height(sq::Square_v2) = sq.length
height (generic function with 4 methods)

In short, the Julian approach to type hierarchies looks like this:

  • Create a hierarchy of abstract types to represent the concepts you want to implement.
  • Write functions for these abstract types. Much of your implementation can be coded at that level, manipulating abstract concepts.
  • Lastly, create concrete types, and write the methods needed to give them the behavior that is expected by the generic algorithms you wrote.
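
The three steps above can be sketched on a tiny, made-up hierarchy. Every name here (`Animal`, `Bird`, `Dog`, `Duck`, `describe_animal`, and so on) is hypothetical and chosen only for illustration.

```julia
# 1. Abstract types describe the concepts
abstract type Animal end
abstract type Bird <: Animal end

# 2. Generic code is written against the abstract types
describe_animal(a::Animal) = "$(animal_name(a)) says $(animal_sound(a))"
can_fly(::Animal) = false
can_fly(::Bird) = true

# 3. Concrete types supply the methods the generic code relies on
struct Dog <: Animal end
struct Duck <: Bird end

animal_name(::Dog) = "Dog"
animal_sound(::Dog) = "woof"
animal_name(::Duck) = "Duck"
animal_sound(::Duck) = "quack"

println(describe_animal(Dog()))
println(describe_animal(Duck()))
println(can_fly(Duck()))
```

Note that describe_animal() is defined before the getters it calls: Julia resolves them when the function runs, not when it is defined.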

This pattern is used everywhere in Julia’s standard libraries. For example, here are the supertypes of Float64 and Int64:

Base.show_supertypes(Float64)
Float64 <: AbstractFloat <: Real <: Number <: Any
Base.show_supertypes(Int64)
Int64 <: Signed <: Integer <: Real <: Number <: Any

Note: Julia implicitly runs using Core and using Base when starting the REPL. However, the show_supertypes() function is not exported by the Base module, thus you cannot access it by just typing show_supertypes(Float64). Instead, you have to specify the module name: Base.show_supertypes(Float64).

And here is the whole hierarchy of Number types:

function show_hierarchy(root, indent=0)
    println(repeat(" ", indent * 4), root)
    for subtype in subtypes(root)
        show_hierarchy(subtype, indent + 1)
    end
end

show_hierarchy(Number)
Number
    Complex
    Real
        AbstractFloat
            BigFloat
            Float16
            Float32
            Float64
        AbstractIrrational
            Irrational
        FixedPointNumbers.FixedPoint
            FixedPointNumbers.Fixed
            FixedPointNumbers.Normed
        Integer
            Bool
            Signed
                BigInt
                Int128
                Int16
                Int32
                Int64
                Int8
            Unsigned
                UInt128
                UInt16
                UInt32
                UInt64
                UInt8
        Rational
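
You can also walk the hierarchy upwards. The sketch below uses the built-in supertype() function; `supertype_chain` is a hypothetical helper written for this example.

```julia
# Collect a type and all of its supertypes, up to Any
function supertype_chain(T::Type)
    chain = Type[T]
    current = T
    while current !== Any
        current = supertype(current)
        push!(chain, current)
    end
    chain
end

println(join(supertype_chain(Int64), " <: "))
println(isabstracttype(Integer))  # true: Integer cannot be instantiated
println(isconcretetype(Int64))    # true: Int64 is a leaf of the hierarchy
```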

44. Iterator Interface

You will sometimes want to provide a way to iterate over your custom types. In Python, this requires defining the __iter__() method which should return an object which implements the __next__() method. In Julia, you must define at least two functions:
* iterate(::YourIteratorType), which must return either nothing if there are no values in the sequence, or (first_value, iterator_state).
* iterate(::YourIteratorType, state), which must return either nothing if there are no more values, or (next_value, new_iterator_state).

For example, let’s create a simple iterator for the Fibonacci sequence:

struct FibonacciIterator end
import Base.iterate

iterate(f::FibonacciIterator) = (1, (1, 1))

function iterate(f::FibonacciIterator, state)
    new_state = (state[2], state[1] + state[2])
    (new_state[1], new_state)
end
iterate (generic function with 224 methods)

Now we can iterate over a FibonacciIterator instance:

for f in FibonacciIterator()
    println(f)
    f > 10 && break
end
1
1
2
3
5
8
13
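
For a finite sequence, a single method with a default state argument can cover both required signatures. Here is a minimal sketch using a hypothetical `Countdown` type. It writes `Base.iterate` in full instead of importing it first, which is an equivalent way to extend a Base function. The length() and eltype() methods are optional, but they let collect() build a typed array of the right size.

```julia
struct Countdown
    start::Int
end

# One method with a default argument covers both iterate signatures
function Base.iterate(c::Countdown, state=c.start)
    state > 0 ? (state, state - 1) : nothing
end

# Optional methods that describe the sequence to functions like collect()
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

println(collect(Countdown(3)))  # [3, 2, 1]
println(sum(Countdown(4)))      # 10
```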

45. Indexing Interface

You can also create a type that will be indexable like an array (allowing syntax like a[5] = 3). In Python, this requires implementing the __getitem__() and __setitem__() methods. In Julia, you must implement the getindex(A::YourType, i), setindex!(A::YourType, v, i), firstindex(A::YourType) and lastindex(A::YourType) methods.

struct MySquares end

import Base.getindex, Base.firstindex

getindex(::MySquares, i) = i^2
firstindex(::MySquares) = 0

S = MySquares()
S[10]
100
S[begin]
0
getindex(S::MySquares, r::UnitRange) = [S[i] for i in r]
getindex (generic function with 228 methods)
S[1:4]
4-element Array{Int64,1}:
  1
  4
  9
 16
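
The MySquares example does not need setindex! or lastindex. Here is a sketch of a type that implements all four methods, using a hypothetical `Backwards` wrapper that exposes a vector in reverse order:

```julia
# Hypothetical wrapper that exposes a vector in reverse order
struct Backwards
    data::Vector{Int}
end

Base.firstindex(b::Backwards) = 1
Base.lastindex(b::Backwards) = length(b.data)
Base.getindex(b::Backwards, i::Int) = b.data[end - i + 1]
Base.setindex!(b::Backwards, v, i::Int) = (b.data[end - i + 1] = v)

b = Backwards([10, 20, 30])
println(b[begin])  # 30: the first index maps to the last stored element
println(b[end])    # 10: b[end] calls lastindex(b)
b[1] = 99          # calls setindex!(b, 99, 1)
println(b.data)    # [10, 20, 99]
```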

For more details on these interfaces, and to learn how to build full-blown array types with broadcasting and more, check out this page.

46. Creating a Number Type

Let’s create a MyRational struct and try to make it mimic the built-in Rational type:

struct MyRational <: Real
    num # numerator
    den # denominator
end
MyRational(2, 3)
MyRational(2, 3)

It would be more convenient and readable if we could type 2 ⨸ 3 to create a MyRational:

function ⨸(num, den)
    MyRational(num, den)
end
⨸ (generic function with 1 method)
2 ⨸ 3
MyRational(2, 3)

I chose ⨸ because it’s a symbol that Julia’s parser treats as a binary operator, but which is otherwise not used by Julia (see the full list of parsed symbols and their priorities). This particular symbol will have the same priority as multiplication and division.

If you want to know how to type it and check that it is unused, type ?⨸ (copy/paste the symbol):

?⨸
"⨸" can be typed by \odiv

search: ⨸

No documentation found.

⨸ is a Function.

# 1 method for generic function "⨸":
[1] ⨸(num, den) in Main at In[227]:2

Now let’s make it possible to add two MyRational values. We want it to be possible for our MyRational type to be used in existing algorithms which rely on +, so we must create a new method for the Base.+ function:

import Base.+

function +(r1::MyRational, r2::MyRational)
    (r1.num * r2.den + r1.den * r2.num) ⨸ (r1.den * r2.den)
end
+ (generic function with 173 methods)
2 ⨸ 3 + 3 ⨸ 5
MyRational(19, 15)

It’s important to import Base.+ first, or else you would just be defining a new + function in the current module (Main), which would not be called by existing algorithms.

You can easily implement *, ^ and so on, in much the same way.
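
For instance, multiplication and subtraction could look like the sketch below. To keep it self-contained, it uses a hypothetical `Frac` type as a stand-in for MyRational. As with the + method above, the results are not reduced to lowest terms.

```julia
# Hypothetical stand-in for MyRational, to keep the sketch self-contained
struct Frac
    num::Int
    den::Int
end

import Base: *, -

# (a/b) * (c/d) = (a*c) / (b*d)
*(a::Frac, b::Frac) = Frac(a.num * b.num, a.den * b.den)

# (a/b) - (c/d) = (a*d - b*c) / (b*d)
-(a::Frac, b::Frac) = Frac(a.num * b.den - a.den * b.num, a.den * b.den)

println(Frac(2, 3) * Frac(3, 5))  # Frac(6, 15)
println(Frac(2, 3) - Frac(1, 3))  # Frac(3, 9)
```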

Let’s change the way MyRational values are printed, to make them look a bit nicer. For this, we must create a new method for the Base.show(io::IO, x) function:

import Base.show

function show(io::IO, r::MyRational)
    print(io, "$(r.num) ⨸ $(r.den)")
end

2 ⨸ 3 + 3 ⨸ 5
19 ⨸ 15
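
A single show() method is enough to change how a value appears in println(), in string interpolation and in repr(). Here is a short sketch with a hypothetical `Celsius` type:

```julia
struct Celsius
    degrees::Float64
end

# One method controls every plain-text representation of the value
Base.show(io::IO, t::Celsius) = print(io, t.degrees, " °C")

t = Celsius(21.5)
println(t)
println("It is $t outside")
println(repr(t))
```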

We can expand the show() function so it can provide an HTML representation for MyRational values. This will be called by the display() function in Jupyter or Colab:

function show(io::IO, ::MIME"text/html", r::MyRational)
    print(io, "<sup><b>$(r.num)</b></sup>⁄<sub><b>$(r.den)</b></sub>")
end

2 ⨸ 3 + 3 ⨸ 5

19⁄15

Next, we want to be able to perform any operation involving MyRational values and values of other Number types. For example, we may want to multiply integers and MyRational values. One option is to define a new method like this:

import Base.*

function *(r::MyRational, i::Integer)
    (r.num * i) ⨸ r.den
end

2 ⨸ 3 * 5

10⁄3

Since multiplication is commutative, we need the reverse method as well:

function *(i::Integer, r::MyRational)
    r * i # this will call the previous method
end

5 * (2 ⨸ 3) # we need the parentheses since * and ⨸ have the same priority

10⁄3

It’s cumbersome to have to define these methods for every operation. There’s a better way, which we will explore in the next two sections.

47. Conversion

It is possible to provide a way for integers to be automatically converted to MyRational values:

import Base.convert

MyRational(x::Integer) = MyRational(x, 1)

convert(::Type{MyRational}, x::Integer) = MyRational(x)

convert(MyRational, 42)

42⁄1

The Type{MyRational} type is a special type which has a single instance: the MyRational type itself. So this convert() method only accepts MyRational itself as its first argument (and we don’t actually use the first argument, so we don’t even need to give it a name in the function declaration).

Now integers will be automatically converted to MyRational values when you assign them to an array whose element type is MyRational:

a = [2 ⨸ 3] # the element type is MyRational
a[1] = 5    # convert(MyRational, 5) is called automatically
push!(a, 6) # convert(MyRational, 6) is called automatically
println(a)
MyRational[5 ⨸ 1, 6 ⨸ 1]

Conversion will also occur automatically in these cases:
* r::MyRational = 42: assigning an integer to r where r is a local variable with a declared type of MyRational.
* s.b = 42 if s is a struct and b is a field of type MyRational (also when calling new(42) on that struct, assuming b is the first field).
* return 42 if the return type is declared as MyRational (e.g., function f(x)::MyRational ... end).
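
Here is a sketch that exercises each of these cases with a hypothetical `Meters` type (the names `Meters`, `Trip`, `default_distance` and `typed_local` are made up for this example):

```julia
struct Meters
    value::Float64
end

# Teach Julia how to turn any real number into a Meters value
Base.convert(::Type{Meters}, x::Real) = Meters(x)

# Case 1: a struct field with a declared type
struct Trip
    distance::Meters
end
trip = Trip(5)                  # convert(Meters, 5) is called by the constructor

# Case 2: a declared return type
default_distance()::Meters = 3  # the returned 3 is converted

# Case 3: a local variable with a declared type
function typed_local()
    d::Meters = 7               # the assigned 7 is converted
    d
end

# Case 4: an array with a declared element type
stops = Meters[]
push!(stops, 2)                 # the pushed 2 is converted

println(trip.distance)
println(default_distance())
println(typed_local())
println(stops)
```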

However, there is no automatic conversion when calling functions:

function for_my_rationals_only(x::MyRational)
    println("It works:", x)
end

try
    for_my_rationals_only(42)
catch ex
    ex
end
MethodError(for_my_rationals_only, (42,), 0x0000000000006a8f)

48. Promotion

The Base functions +, -, *, /, ^, etc. all use a “promotion” algorithm to convert the arguments to the appropriate type. For example, adding an integer and a float promotes the integer to a float before the addition takes place. These functions use the promote() function for this. For example, given several integers and a float, all integers get promoted to floats:

promote(1, 2, 3, 4.0)
(1.0, 2.0, 3.0, 4.0)

This is why a sum of integers and floats results in a float:

1 + 2 + 3 + 4.0
10.0

The promote() function is also called when creating an array. For example, the following array is a Float64 array:

a = [1, 2, 3, 4.0]
4-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0

What about the MyRational type? Rather than create new methods for the promote() function, the recommended approach is to create a new method for the promote_rule() function. It takes two types and returns the type to convert to:

promote_rule(Float64, Int64)
Float64

Let’s implement a new method for this function, to make sure that any subtype of the Integer type will be promoted to MyRational:

import Base.promote_rule

promote_rule(::Type{MyRational}, ::Type{T}) where {T <: Integer} = MyRational
promote_rule (generic function with 141 methods)

This method definition uses parametric types: the type T can be any type at all, as long as it is a subtype of the Integer abstract type. If you tried to define the method promote_rule(::Type{MyRational}, ::Type{Integer}), it would expect the type Integer itself as the second argument, which would not work, since the promote_rule() function will usually be called with concrete types like Int64 as its arguments.

Let’s check that it works:

promote(5, 2 ⨸ 3)
(5 ⨸ 1, 2 ⨸ 3)

Yep! Now whenever we call +, -, etc., with an integer and a MyRational value, the integer will get automatically promoted to a MyRational value:

5 + 2 ⨸ 3

17⁄3

Under the hood:
* this called +(5, 2 ⨸ 3),
* which called the +(::Number, ::Number) method (thanks to multiple dispatch),
* which called promote(5, 2 ⨸ 3),
* which called promote_rule() with both argument orders, including promote_rule(MyRational, Int64),
* which matched the promote_rule(::Type{MyRational}, ::Type{T}) where {T <: Integer} method,
* which returned MyRational,
* then the +(::Number, ::Number) method called convert(MyRational, 5),
* which called MyRational(5),
* which returned MyRational(5, 1),
* and finally +(::Number, ::Number) called +(MyRational(5, 1), MyRational(2, 3)),
* which returned MyRational(17, 3).

The benefit of this approach is that we only need to implement the +, -, etc. functions for pairs of MyRational values, not with all combinations of MyRational values and integers.
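
The whole mechanism fits in a few lines. The sketch below uses a hypothetical `Tally` type (a simple counter) instead of MyRational, so that it can run on its own. Only one + method is written by hand, and mixed operations with integers work through promotion.

```julia
# Hypothetical number type; it must be a subtype of Number for promotion to kick in
struct Tally <: Real
    count::Int
end

# How to turn an integer into a Tally
Base.convert(::Type{Tally}, x::Integer) = Tally(x)

# Tally wins when combined with any Integer subtype
Base.promote_rule(::Type{Tally}, ::Type{T}) where {T <: Integer} = Tally

# The only arithmetic method we write by hand
Base.:+(a::Tally, b::Tally) = Tally(a.count + b.count)

println(promote(1, Tally(2)))  # both values become Tally
println(Tally(5) + 3)          # the 3 is promoted, then + is called
println(3 + Tally(5))          # argument order does not matter
```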

If your head hurts, it’s perfectly normal. 😉 Writing a new type that is easy to use, flexible and plays nicely with existing types takes a bit of planning and work, but the point is that you will not write these every day, and once you have, they will make your life much easier.

Now let’s handle the case where we want to execute operations with MyRational values and floats. In this case, we naturally want to promote the MyRational value to a float. We first need to define how to convert a MyRational value to any subtype of AbstractFloat:

convert(::Type{T}, x::MyRational) where {T <: AbstractFloat} = T(x.num / x.den)
convert (generic function with 246 methods)

This convert() works with any type T which is a subtype of AbstractFloat. It just computes x.num / x.den and converts the result to type T. Let’s try it:

convert(Float64, 3 ⨸ 2)
1.5

Now let’s define a promote_rule() method which will work for any type T which is a subtype of AbstractFloat, and which will give priority to T over MyRational:

promote_rule(::Type{MyRational}, ::Type{T}) where {T <: AbstractFloat} = T
promote_rule (generic function with 142 methods)
promote(1 ⨸ 2, 4.0)
(0.5, 4.0)

Now we can combine floats and MyRational values easily:

2.25 ^ (1 ⨸ 2)
1.5

49. Parametric Types and Functions

Julia’s Rational type is actually a parametric type which ensures that the numerator and denominator have the same type T, subtype of Integer. Here’s a new version of our rational struct which enforces the same constraint:

struct MyRational2{T <: Integer}
    num::T
    den::T
end

To instantiate this type, we can specify the type T:

MyRational2{BigInt}(2, 3)
MyRational2{BigInt}(2, 3)

Alternatively, we can use the MyRational2 type’s default constructor, with two integers of the same type:

MyRational2(2, 3)
MyRational2{Int64}(2, 3)

If we want to be able to construct a MyRational2 with integers of different types, we must write an appropriate constructor which handles the promotion rule:

function MyRational2(num::Integer, den::Integer)
    MyRational2(promote(num, den)...)
end
MyRational2

This constructor accepts two integers of potentially different types, and promotes them to the same type. Then it calls the default MyRational2 constructor which expects two arguments of the same type. The syntax f(args...) is analogous to Python’s f(*args).

Let’s see if this works:

MyRational2(2, BigInt(3))
MyRational2{BigInt}(2, 3)

Great!
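
The same pattern applies to any parametric struct. Here is a self-contained sketch with a hypothetical `Span` type constrained to Real numbers, including a method that refers to the type parameter through a where clause:

```julia
# Hypothetical parametric type: both bounds share the same type T
struct Span{T <: Real}
    lo::T
    hi::T
end

# Outer constructor: promote mixed argument types to a common type
Span(lo::Real, hi::Real) = Span(promote(lo, hi)...)

# The type parameter T is available inside the method body
span_width(s::Span{T}) where {T <: Real} = s.hi - s.lo
zero_span(::Type{T}) where {T <: Real} = Span(zero(T), zero(T))

println(Span(1, 4))              # both Int64, default constructor
println(Span(1, 2.5))            # promoted to Float64
println(span_width(Span(1, 2.5)))
println(zero_span(Float32))
```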

Note that all parametrized types such as MyRational2{Int64} or MyRational2{BigInt} are subtypes of MyRational2. So if a function accepts a MyRational2 argument, you can pass it an instance of any specific, parametrized type:

function for_any_my_rational2(x::MyRational2)
    println(x)
end

for_any_my_rational2(MyRational2{BigInt}(1, 2))
for_any_my_rational2(MyRational2{Int64}(1, 2))
MyRational2{BigInt}(1, 2)
MyRational2{Int64}(1, 2)

A more explicit (but verbose) syntax for this function is:

function for_any_my_rational2(x::MyRational2{T}) where {T <: Integer}
    println(x)
end
for_any_my_rational2 (generic function with 1 method)

It’s useful to think of types as sets. For example, the Int64 type represents the set of all 64-bit integer values, so 42 isa Int64:
* When x is an instance of some type T, it is an element of the set T represents, and x isa T.
* When U is a subtype of V, U is a subset of V, and U <: V.

The MyRational2 type itself (without any parameter) represents the set of all values of MyRational2{T} for all subtypes T of Integer. In other words, it is the union of all the MyRational2{T} types. This is called a UnionAll type, and indeed the type MyRational2 itself is an instance of the UnionAll type:

@assert MyRational2{BigInt}(2, 3) isa MyRational2{BigInt}
@assert MyRational2{BigInt}(2, 3) isa MyRational2
@assert MyRational2 === (MyRational2{T} where {T <: Integer})
@assert MyRational2{BigInt} <: MyRational2
@assert MyRational2 isa UnionAll

If we dump the MyRational2 type, we can see that it is a UnionAll instance, with a parameter type T, constrained to a subtype of the Integer type (since the upper bound ub is Integer):

dump(MyRational2)
UnionAll
  var: TypeVar
    name: Symbol T
    lb: Union{}
    ub: Integer <: Real
  body: MyRational2{T<:Integer} <: Any
    num::T
    den::T

There’s a lot more to learn about Julia types. When you feel ready to explore this in more depth, check out this page. You can also take a look at the source code of Julia’s rationals.

50. Writing/Reading Files

The do syntax we saw earlier is helpful when using the open() function:

open("test.txt", "w") do f
    write(f, "This is a test.\n")
    write(f, "I repeat, this is a test.\n")
end

open("test.txt") do f
    for line in eachline(f)
        println("[$line]")
    end
end
[This is a test.]
[I repeat, this is a test.]

The open() function automatically closes the file at the end of the block. Notice that the line feeds \n at the end of each line are not returned by the eachline() function. So the equivalent Python code is:

# PYTHON
with open("test.txt", "w") as f:
    f.write("This is a test.\n")
    f.write("I repeat, this is a test.\n")

with open("test.txt") as f:
    for line in f.readlines():
        line = line.rstrip("\n")
        print(f"[{line}]")

Alternatively, you can read the whole file into a string:

open("test.txt") do f
    s = read(f, String)
end
"This is a test.\nI repeat, this is a test.\n"

Or more concisely:

s = read("test.txt", String)
"This is a test.\nI repeat, this is a test.\n"

The Python equivalent is:

# PYTHON
with open("test.txt") as f:
    s = f.read()
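
Here is one more sketch that combines writing, appending and reading. It works in a temporary directory so that it leaves no files behind. The `write_then_read` function and the file name are hypothetical.

```julia
function write_then_read()
    # The temporary directory is deleted when the do block ends
    mktempdir() do dir
        path = joinpath(dir, "notes.txt")
        open(path, "w") do f      # "w" creates or truncates the file
            write(f, "first line\n")
        end
        open(path, "a") do f      # "a" appends to the existing content
            write(f, "second line\n")
        end
        readlines(path)           # one string per line, without the trailing \n
    end
end

println(write_then_read())
```

The readlines() function is the eager counterpart of eachline(): it returns all the lines at once as an array of strings.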

51. Exceptions

Julia’s exceptions behave very much like in Python:

a = [1]
try
    push!(a, 2)
    #throw("Oops") # try uncommenting this line
    push!(a, 3)
catch ex
    println(ex)
    push!(a, 4)
finally
    push!(a, 5)
end
println(a)
[1, 2, 3, 5]

The equivalent Python code is:

# PYTHON
a = [1]
try:
    a.append(2)
    #raise Exception("Oops") # try uncommenting this line
    a.append(3)
except Exception as ex:
    print(ex)
    a.append(4)
finally:
    a.append(5)

print(a)

There is a whole hierarchy of standard exceptions which can be thrown, just like in Python. For example:

choice = 1 # try changing this value (from 1 to 4)
try
    choice == 1 && open("/foo/bar/i_dont_exist.txt")
    choice == 2 && sqrt(-1)
    choice == 3 && push!(a, "Oops")
    println("Everything worked like a charm")
catch ex
    if ex isa SystemError
        println("Oops. System error #$(ex.errnum) ($(ex.prefix))")
    elseif ex isa DomainError
        println("Oh no, I could not compute sqrt(-1)")
    else
        println("I got an unexpected error: $ex")
    end
end
Oops. System error #2 (opening file "/foo/bar/i_dont_exist.txt")
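
The same isa-based pattern can be wrapped in a function. This sketch uses a hypothetical `classify` helper that runs a zero-argument function and reports which standard exception it threw, rethrowing anything it does not recognize:

```julia
# Hypothetical helper: run `action` and name the exception it throws, if any
function classify(action)
    try
        action()
        "no error"
    catch ex
        if ex isa DomainError
            "domain error"
        elseif ex isa BoundsError
            "bounds error"
        elseif ex isa ArgumentError
            "argument error"
        else
            rethrow()  # let unexpected exceptions propagate
        end
    end
end

println(classify(() -> sqrt(-1)))
println(classify(() -> [1, 2, 3][5]))
println(classify(() -> parse(Int, "abc")))
println(classify(() -> 1 + 1))
```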

Compare this with Python’s equivalent code:

# PYTHON
import math

choice = 3 # try changing this value (from 1 to 4)
try:
    if choice == 1:
        open("/foo/bar/i_dont_exist.txt")
    if choice == 2:
        math.sqrt(-1)
    if choice == 3:
        #a.append("Ok") # this would actually work
        raise TypeError("Oops") # so let's fail manually
    print("Everything worked like a charm")
except OSError as ex:
    print(f"Oops. OS error #{ex.errno} ({ex.strerror})")
except ValueError:
    print("Oh no, I could not compute sqrt(-1)")
except Exception as ex:
    print(f"I got an unexpected error: {ex}")

A few things to note here:

  • Julia only allows a single catch block which handles all possible exceptions.
  • obj isa SomeClass is a shorthand for isa(obj, SomeClass) which is equivalent to Python’s isinstance(obj, SomeClass).

Here is a side-by-side summary:

Julia:

try
    ...
catch ex
    if ex isa SomeError
        ...
    else
        ...
    end
finally
    ...
end

Python:

try:
    ...
except SomeException as ex:
    ...
except Exception as ex:
    ...
finally:
    ...

Julia: throw(any_value)
Python: raise SomeException(...)

Julia: obj isa SomeType, or isa(obj, SomeType)
Python: isinstance(obj, SomeType)

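Note that Julia 1.4, the version used in this tutorial, does not support the equivalent of Python’s try / except / else construct (an else clause for try blocks was only added in Julia 1.8). Without it, you need to write something like this: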

catch_exception = true
try
    println("Try something")
    #error("ERROR: Catch me!") # try uncommenting this line
    catch_exception = false
    #error("ERROR: Don't catch me!") # try uncommenting this line
    println("No error occurred")
catch ex
    if catch_exception
        println("I caught this exception: $ex")
    else
        throw(ex)
    end
finally
    println("The end")
end
println("After the end")
Try something
No error occurred
The end
After the end

The equivalent Python code is shorter, but it’s fairly uncommon:

# PYTHON
try:
    print("Try something")
    #raise Exception("Catch me!") # try uncommenting this line
except Exception as ex:
    print(f"I caught this exception: {ex}")
else:
    #raise Exception("Don't catch me!") # try uncommenting this line
    print("No error occurred")
finally:
    print("The end")

print("After the end")

52. Docstrings

It’s good practice to add docstrings to every function you export. The docstring is placed just before the definition of the function:

"Compute the square of number x"
square(x::Number) = x^2
square

You can retrieve a function’s docstring using the @doc macro:

@doc square

Compute the square of number x

The docstring is displayed when asking for help:

?square
search: square Square Square_v2 MySquares AbstractSquare lastdayofquarter

Compute the square of number x

Docstrings follow the Markdown format.
A typical docstring starts with the signature of the function, indented by 4 spaces, so it will get syntax highlighted as Julia code.
It also includes an Examples section with Julia REPL outputs:

"""
    cube(x::Number)

Compute the cube of `x`.

# Examples
```julia-repl
julia> cube(5)
125
julia> cube(im)
0 - 1im
```
"""
cube(x) = x^3
cube

Instead of using julia-repl code blocks for the examples, you can use jldoctest to mark these examples as doctests (similar to Python's doctests).

The help gets nicely formatted:

?cube
search: cube Cdouble
cube(x::Number)

Compute the cube of x.

Examples

julia> cube(5)
125
julia> cube(im)
0 - 1im
When there are several methods for a given function, it is common to give general information about the function in the first method (usually the most generic), and only add docstrings to other methods if they add useful information (without repeating the general info).

Alternatively, you may attach the general information to the function itself:

"""
    foo(x)

Compute the foo of the bar
"""
function foo end  # declares the foo function

# foo(x::Number) behaves normally, no need for a docstring
foo(x::Number) = "baz"

"""
    foo(x::String)

For strings, compute the qux of the bar instead.
"""
foo(x::String) = "qux"
foo
?foo
search: foo floor pointer_from_objref waitforbuttonpress OverflowError
foo(x)

Compute the foo of the bar


foo(x::String)

For strings, compute the qux of the bar instead.

54. Macros

We have seen a few macros already: @which, @assert, @time, @benchmark, @btime and @doc. You guessed it: all macros start with an @ sign.

What is a macro? It is a function which can fully inspect the expression that follows it, and apply any transformation to that code at parse time, before compilation.

This makes it possible for anyone to effectively extend the language in any way they please. Whereas C/C++ macros just do simple text replacement, Julia macros are powerful meta-programming tools.

On the flip side, this also means that each macro has its own syntax and behavior.

A personal opinion: in my experience, languages that provide great flexibility typically attract a community of programmers with a tinkering mindset, who will love to experiment with all the fun features the language has to offer. This is great for creativity, but it can also be a nuisance if the community ends up producing too much experimental code, without much care for code reliability, API stability, or even for simplicity. By all means, let’s be creative, let’s experiment, but with great power comes great responsibility: let’s also value reliability, stability and simplicity.

That said, to give you an idea of what macro definitions look like in Julia, here’s a simple toy macro that replaces a + b expressions with a - b, and leaves other expressions alone.

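macro addtosub(x)
  # Literals and symbols are not Expr objects, so check the type first
  if x isa Expr && x.head == :call && x.args[1] == :+ && length(x.args) == 3
    # esc() makes variables in the result refer to the caller's scope
    esc(Expr(:call, :-, x.args[2], x.args[3]))
  else
    esc(x)
  end
end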

@addtosub 10 + 2
8

In this macro definition, :call, :+ and :- are symbols. These are similar to strings, only more efficient and less flexible. They are typically used as identifiers, such as keys in dictionaries.

If you’re curious, the macro works because the parser converts 10 + 2 to Expr(:call, :+, 10, 2) and passes this expression to the macro (before compilation). The if statement checks that the expression is a function call, where the called function is the + function, with two arguments. If so, then the macro returns a new expression, corresponding to a call to the - function, with the same arguments. So a + b becomes a - b.
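
You can check what the parser produces without writing a macro, by quoting an expression with :( ). This short sketch inspects the expression and performs the same rewrite by hand (the variable names are arbitrary):

```julia
ex = :(10 + 2)        # quote the expression instead of evaluating it

println(ex.head)      # the kind of expression: call
println(ex.args)      # the called function followed by its arguments

# Build the same replacement expression the macro builds
rewritten = Expr(:call, :-, ex.args[2], ex.args[3])
println(rewritten)        # 10 - 2
println(eval(rewritten))  # 8
```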

For more info, check out this page.

55. Special Prefixed Strings

py"..." strings are defined by the PyCall module. Writing py"something" is equivalent to writing @py_str "something". In other words, anyone can write a macro that defines a new kind of prefixed string. For example, if you write the @ok_str macro, it will be called when you write ok"something".

Another example is the Pkg module which defines the @pkg_str macro: this is why you can use pkg"..." to interact with the Pkg module. This is how pkg"add PyCall; precompile;" worked (at the end of the very first cell). This downloaded, installed and precompiled the PyCall module.
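
Defining your own prefixed string only takes a macro whose name ends with _str. Here is a toy sketch with a hypothetical `shout` prefix that converts the string to upper case when the code is parsed:

```julia
# Hypothetical string macro: shout"..." calls @shout_str with the raw string
macro shout_str(s)
    uppercase(s)
end

println(shout"hello")             # HELLO
println(shout"julia" == "JULIA")  # true
```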

56. Modules

In Python, a module must be defined in a dedicated file. In Julia, modules are independent from the file system. You can define several modules per file, or define one module across multiple files, it’s up to you. Let’s create a simple module containing two submodules, each containing a variable and a function:

module ModA
    pi = 3.14
    square(x) = x^2

    module ModB
        e = 2.718
        cube(x) = x^3
    end

    module ModC
        root2 = √2
        relu(x) = max(0, x)
    end
end
Main.ModA

The default module is Main, so whatever we define is put in this module (except when defining a package, as we will see). This is why ModA’s full name is Main.ModA.

We can now access the contents of these modules by providing the full paths:

Main.ModA.ModC.root2
1.4142135623730951

Since our code runs in the Main module, we can leave out the Main. part:

ModA.ModC.root2
1.4142135623730951

Alternatively, you can use import:

import Main.ModA.ModC.root2

root2
1.4142135623730951

Or we can use import with a relative path. In this case, we need to prefix ModA with a dot . to indicate that we want the module ModA located in the current module:

import .ModA.ModC.root2

root2
1.4142135623730951

Alternatively, we can import the submodule:

import .ModA.ModC

ModC.root2
1.4142135623730951

When you want to import more than one name from a module, you can use this syntax:

import .ModA.ModC: root2, relu

This is equivalent to this more verbose syntax:

import .ModA.ModC.root2, .ModA.ModC.relu

Nested modules do not automatically have access to names in enclosing modules. To import names from a parent module, use ..x. From a grand-parent module, use ...x, and so on.

module ModD
    d = 1
    module ModE
        try
            println(d)
        catch ex
            println(ex)
        end
    end
    module ModF
        f = 2
        module ModG
            import ..f
            import ...d
            println(f)
            println(d)
        end
    end
end
UndefVarError(:d)
2
1

Main.ModD

Instead of import, you can use using. It is analogous to Python’s from foo import *. It only gives access to names which were explicitly exported using export (similar to the way from foo import * in Python only imports names listed in the module’s __all__ list):

module ModH
    h1 = 1
    h2 = 2
    export h1
end
Main.ModH

using .ModH

println(h1)

try
    println(h2)
catch ex
    ex
end
1

UndefVarError(:h2)

Note that using Foo not only imports all exported names (like Python’s from foo import *), it also imports Foo itself (similarly, using Foo.Bar imports Bar itself):

ModH
Main.ModH
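To list which names a module exports, you can use the names() function. A minimal sketch, with a hypothetical module ModNames:

```julia
module ModNames          # hypothetical module for this sketch
    visible = 1
    hidden = 2
    export visible
end

println(:visible in names(ModNames))            # true
println(:hidden in names(ModNames))             # false
println(:hidden in names(ModNames, all=true))   # true
```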

Even if a name is not exported, you can always access it using its full path, or using import:

ModH.h2
2

import .ModH.h2

h2
2

You can also import individual names like this:

module ModG
    g1 = 1
    g2 = 2
    export g2
end

using .ModG: g1, g2

println(g1)
println(g2)
1
2

Notice that this syntax gives you access to any name you want, whether or not it was exported. In other words, whether a name is exported or not only affects the using Foo syntax.

Importantly, when you want to extend a function which is defined in a module, you must import the function using import, or you must specify the function’s path:

module ModH
    double(x) = x * 2
    triple(x) = x * 3
end

import .ModH: double
double(x::AbstractString) = repeat(x, 2)

ModH.triple(x::AbstractString) = repeat(x, 3)

println(double(2))
println(double("Two"))

println(ModH.triple(3))
println(ModH.triple("Three"))
4
TwoTwo
9
ThreeThreeThree


WARNING: replacing module ModH.

You must never extend a function imported with using, unless you provide the function’s path:

module ModI
    quadruple(x) = x * 4
    export quadruple
end

using .ModI
ModI.quadruple(x::AbstractString) = repeat(x, 4) # OK
println(quadruple(4))
println(quadruple("Four"))

#quadruple(x::AbstractString) = repeat(x, 4) # uncomment to see the error
16
FourFourFourFour

In Julia 1.4 there is no equivalent of Python’s import foo as x (the import Foo as X syntax was only added in Julia 1.6), but you can do something like this:

import .ModI: quadruple
x = quadruple
quadruple (generic function with 2 methods)
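The same trick works for a whole module: bind it to a constant. A minimal sketch, where LongModuleName and LMN are made-up names:

```julia
module LongModuleName    # hypothetical module for this sketch
    quadruple(x) = x * 4
end

const LMN = LongModuleName         # a constant alias for the module
println(LMN.quadruple(4))          # 16
println(LMN === LongModuleName)    # true
```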

In general, a module named Foo will be defined in a file named Foo.jl (along with its submodules). However, if the module becomes too big for a single file, you can split it into multiple files and include these files in Foo.jl using the include() function.

For example, let’s create three files: Awesome.jl, great.jl and amazing/Fantastic.jl, where:
* Awesome.jl defines the Awesome module and includes the other two files
* great.jl just defines a function
* amazing/Fantastic.jl defines the Fantastic submodule

code_awesome = """
module Awesome
include("great.jl")
include("amazing/Fantastic.jl")
end
"""

code_great = """
great() = "This is great!"
"""

code_fantastic = """
module Fantastic
fantastic = true
end
"""

open(f->write(f, code_awesome), "Awesome.jl", "w")
open(f->write(f, code_great), "great.jl", "w")
mkdir("amazing")
open(f->write(f, code_fantastic), "amazing/Fantastic.jl", "w")
38

If we try to execute import Awesome now, it won’t work since Julia does not search in the current directory by default. Let’s change this:

pushfirst!(LOAD_PATH, ".")
4-element Array{String,1}:
 "."
 "@"
 "@v#.#"
 "@stdlib"

Now when we import the Awesome module, Julia will look for a file named Awesome.jl in the current directory, or for Awesome/src/Awesome.jl, or for Awesome.jl/src/Awesome.jl. If it does not find any of these, it will look in the other places listed in the LOAD_PATH array (we will discuss this in more detail in the “Package Management” section).

import Awesome
println(Awesome.great())
println("Is fantastic? ", Awesome.Fantastic.fantastic)
┌ Info: Precompiling Awesome [top-level]
└ @ Base loading.jl:1260


This is great!
Is fantastic? true

Let’s restore the original LOAD_PATH:

popfirst!(LOAD_PATH)
"."

In short:

| Julia | Python |
|---|---|
| import Foo | import foo |
| import Foo.Bar | from foo import bar |
| import Foo.Bar: a, b | from foo.bar import a, b |
| import Foo.Bar.a, Foo.Bar.b | from foo.bar import a, b |
| import .Foo | from . import foo |
| import ..Foo.Bar | from ..foo import bar |
| import ...Foo.Bar | from ...foo import bar |
| import .Foo: a, b | from .foo import a, b |
| using Foo | from foo import *; import foo |
| using Foo.Bar | from foo.bar import *; from foo import bar |
| using Foo.Bar: a, b | from foo.bar import a, b |

| Extending function Foo.f() | Result |
|---|---|
| import Foo.f (or import Foo: f); f(x::Int64) = ... | OK |
| import Foo; Foo.f(x::Int64) = ... | OK |
| using Foo; Foo.f(x::Int64) = ... | OK |
| import Foo.f (or import Foo: f); Foo.f(x::Int64) = ... | ERROR: Foo not defined |
| using Foo; f(x::Int64) = ... | ERROR: Foo.f must be explicitly imported |
| using Foo: f; f(x::Int64) = ... | ERROR: Foo.f must be explicitly imported |

57. Scopes

Julia has two types of scopes: global and local.

Every module has its own global scope, independent from all other global scopes. There is no overarching global scope.
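For instance, two modules can each define a global with the same name without interfering. This is a minimal sketch with hypothetical modules ScopeA and ScopeB:

```julia
module ScopeA            # hypothetical modules for this sketch
    counter = 1
end

module ScopeB
    counter = 2
    function bump()
        global counter += 1   # modifies ScopeB's global only
    end
end

ScopeB.bump()
println(ScopeA.counter)  # 1
println(ScopeB.counter)  # 3
```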

Modules, macros and types (including structs) can only be defined in a global scope.

Most code blocks, including function, struct, for, while, etc., have their own local scope. For example:

for q in 1:3
    println(q)
end

try
    println(q) # q is not available here
catch ex
    ex
end
1
2
3

UndefVarError(:q)

A local scope inherits from its parent scope:

z = 5
for i in 1:3
    w = 10
    println(i * w * z) # i and w are local, z is from the parent scope
end
50
100
150

An inner scope can assign to a variable in the parent scope, if the parent scope is not global:

for i in 1:3
    s = 0
    for j in 1:5
        s = j # variable s is from the parent scope
    end
    println(s)
end
5
5
5

You can force a variable to be local by using the local keyword:

for i in 1:3
    s = 0
    for j in 1:5
        local s = j # variable s is local now
    end
    println(s)
end
0
0
0

To assign to a global variable, you must declare the variable as global in the local scope:

for i in 1:3
    global p
    p = i
end
p
3

There is one exception to this rule: when executing code directly in the REPL (since Julia 1.5) or in IJulia, you do not need to declare a variable as global if the global variable already exists:

s = 0
for i in 1:3
    s = i # implicitly global s: only in REPL Julia 1.5+ or IJulia
end
s
3

In functions, assigning to a variable which is not explicitly declared as global always makes it local (even in the REPL and IJulia):

s, t = 1, 2 # globals

function foo()
   s = 10 * t # s is local, t is global
end

println(foo())
println(s)
20
1

Just like in Python, functions can capture variables from the enclosing scope (not from the scope the function is called from):

t = 1

foo() = t # foo() captures t from the global scope

function bar()
    t = 5 # this is a new local variable
    println(foo()) # foo() still uses t from the global scope
end

bar()
1

function quz()
    global t
    t = 5 # we change the global t
    println(foo()) # and this affects foo()
end

quz()
5

Closures work much like in Python:

function create_multiplier(n)
    function mul(x)
        x * n # variable n is captured from the parent scope
    end
end

mul2 = create_multiplier(2)
mul2(5)
10

An inner function can modify variables from its parent scope:

function create_counter()
    c = 0
    inc() = c += 1 # this inner function modifies the c from the outer function
end

cnt = create_counter()
println(cnt())
println(cnt())
1
2

Consider the following code, and see if you can figure out why it prints the same result multiple times:

funcs = []
i = 1
while i ≤ 5
    push!(funcs, ()->i^2)
    global i += 1
end
for fn in funcs
    println(fn())
end
36
36
36
36
36

The answer is that there is a single variable i, which is captured by all 5 closures. By the time these closures are executed, the value of i is 6, so the square is 36, for every closure.

If we use a for loop, we don’t have this problem, since a new local variable is created at every iteration:

funcs = []
for i in 1:5
    push!(funcs, ()->i^2)
end
for fn in funcs
    println(fn())
end
1
4
9
16
25

Any local variable created within a for loop, a while loop or a comprehension also gets a new copy at each iteration. So we could code the above example like this:

funcs = []
i = 1
while i ≤ 5  # since we are in a while loop...
    global i
    local j = i # ...and j is created here, it's a new `j` at each iteration
    push!(funcs, ()->j^2)
    i += 1
end
for fn in funcs
    println(fn())
end
1
4
9
16
25

Another way to get the same result is to use a let block, which also creates a new local variable every time it is executed:

funcs = []
i = 0
while i < 5
    let i=i
        push!(funcs, ()->i^2)
    end
    global i += 1
end
for fn in funcs
    println(fn())
end
0
1
4
9
16

This let i=i block defines a new local variable i at every iteration, and initializes it with the value of i from the parent scope. Therefore each closure captures a different local variable i.

Variables in a let block are initialized from left to right, so they can access variables on their left:

a = 1
let a=a+1, b=a
    println("a=$a, b=$b")
end
a=2, b=2

In this example, the local variable a is initialized with the value of a + 1, where a comes from the parent scope (i.e., it’s the global a in this case). However, b is initialized with the value of the local a, since it now hides the variable a from the parent scope.

Default values in function arguments also have this left-to-right scoping logic:

a = 1
foobar(a=a+1, b=a) = println("a=$a, b=$b")
foobar()
foobar(5)
a=2, b=2
a=5, b=5

In this example, the first argument’s default value is a + 1, where a comes from the parent scope (i.e., the global a in this case). However, the second argument’s default value is a, where a in this case is the value of the first argument (not the parent scope’s a).

Note that if blocks and begin blocks do not have their own local scope, they just use the parent scope:

a = 1
if true
    a = 2 # same `a` as above
end
a
2

a = 1
begin
    a = 2  # same `a` as above
end
a
2

57. Package Management

Basic Workflow

The simplest way to write a Julia program is to create a .jl file somewhere and run it using julia. You would usually do this with your favorite editor, but in this notebook we must do this programmatically. For example:

code = """
println("Hello world")
"""

open(f->write(f, code), "my_program1.jl", "w")
23

Then let’s run the program using a shell command:

;julia my_program1.jl
Hello world

If you need to use a package which is not part of the standard library, such as PyCall, you first need to install it using Julia’s package manager Pkg:

using Pkg
Pkg.add("PyCall")
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]

Alternatively, in interactive mode, you can enter the Pkg mode by typing ], then type a command:

]add PyCall
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]

You can also precompile the new package to avoid the compilation delay when the package is first used:

]add PyCall; precompile;
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]
Precompiling project...

One last alternative is to use pkg"..." strings to run commands in your programs:

pkg"add PyCall; precompile;"
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]
Precompiling project...

Now you can import PyCall in any of your Julia programs:

code = """
using PyCall
py"print('1 + 2 =', 1 + 2)"
"""

open(f->write(f, code), "my_program2.jl", "w")
41

;julia my_program2.jl
1 + 2 = 3

You can also add packages by providing their URL (typically on GitHub). This is useful when you want to use a package which is not in the official Julia Package registry, or when you want the very latest version of a package:

]add https://github.com/JuliaLang/Example.jl
    Cloning git-repo `https://github.com/JuliaLang/Example.jl`
   Updating git-repo `https://github.com/JuliaLang/Example.jl`
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
  [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)

You can install a specific package version like this:

]add PyCall@1.91.3
  Resolving package versions...
  Installed PyCall ─ v1.91.3
   Updating `~/.julia/environments/v1.4/Project.toml`
  [438e738f] ↓ PyCall v1.91.4 ⇒ v1.91.3
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [438e738f] ↓ PyCall v1.91.4 ⇒ v1.91.3
   Building PyCall → `~/.julia/packages/PyCall/kAhnQ/deps/build.log`

If you only specify version 1 or version 1.91, Julia will get the latest version with that prefix. For example, ]add PyCall@0.91 would install the latest version 0.91.x.

You can also update a package to its latest version:

]update PyCall
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Updating `~/.julia/environments/v1.4/Project.toml`
  [438e738f] ↑ PyCall v1.91.3 ⇒ v1.91.4
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [438e738f] ↑ PyCall v1.91.3 ⇒ v1.91.4

You can update all packages to their latest versions:

]update
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Updating git-repo `https://github.com/JuliaLang/Example.jl`
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]

If you don’t want a particular package to be updated the next time you call ]update, you can pin it:

]pin PyCall
   Updating `~/.julia/environments/v1.4/Project.toml`
  [438e738f] ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [438e738f] ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲

To unpin the package:

]free PyCall
   Updating `~/.julia/environments/v1.4/Project.toml`
  [438e738f] ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [438e738f] ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4

You can also run the tests defined in a package:

]test Example
    Testing Example
Status `/tmp/jl_2kZjcq/Manifest.toml`
  [7876af07] Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
  [2a0f44e3] Base64
  [8ba89e20] Distributed
  [b77e0a4c] InteractiveUtils
  [56ddb016] Logging
  [d6f4376e] Markdown
  [9a3f8284] Random
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [8dfed614] Test
    Testing Example tests passed

Of course, you can remove a package:

]rm Example
   Updating `~/.julia/environments/v1.4/Project.toml`
  [7876af07] - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [7876af07] - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)

Lastly, you can check which packages are installed using ]status (or ]st for short):

]st
Status `~/.julia/environments/v1.4/Project.toml`
  [6e4b80f9] BenchmarkTools v0.5.0
  [052768ef] CUDA v1.0.2
  [7073ff75] IJulia v1.21.2
  [438e738f] PyCall v1.91.4
  [d330b81b] PyPlot v2.9.0

For more Pkg commands, type ]help.

| Julia (in interactive mode) | Python (in a terminal) |
|---|---|
| ]status | pip freeze or conda list |
| ]add Foo | pip install foo or conda install foo |
| ]add Foo@1.2 | pip install foo==1.2 or conda install foo=1.2 |
| ]update Foo | pip install --upgrade foo or conda update foo |
| ]pin Foo | foo==<version> in requirements.txt or foo=<version> in environment.yml |
| ]free Foo | foo in requirements.txt or foo in environment.yml |
| ]test Foo | python -m unittest foo |
| ]rm Foo | pip uninstall foo or conda remove foo |
| ]help | pip --help |

This workflow is fairly simple, but it means that all of your programs will be using the same version of each package. This is analogous to installing packages using pip install without using virtual environments.
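The Pkg commands shown in this section also exist as regular functions in the Pkg module, which is convenient in scripts. The sketch below wraps them in a hypothetical function, demo_pkg_workflow, so that nothing is downloaded or removed until you call it yourself:

```julia
using Pkg

# Function equivalents of the Pkg REPL commands used in this section.
# `demo_pkg_workflow` is a made-up name; calling it needs network access
# and modifies the active environment.
function demo_pkg_workflow(name::AbstractString)
    Pkg.add(name)      # ]add Foo
    Pkg.update(name)   # ]update Foo
    Pkg.pin(name)      # ]pin Foo
    Pkg.free(name)     # ]free Foo
    Pkg.test(name)     # ]test Foo
    Pkg.rm(name)       # ]rm Foo
    Pkg.status()       # ]status
end
```

For example, demo_pkg_workflow("Example") would walk through the whole cycle with the Example package.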

58. Projects

If you want to have multiple projects, each with different libraries and library versions, you should define projects. These are analogous to Python virtual environments.

A project is just a directory containing a Project.toml file and a Manifest.toml file:

my_project/
    Project.toml
    Manifest.toml
  • Project.toml is similar to a requirements.txt file (for pip) or environment.yml (for conda): it lists the dependencies of the project, and compatibility constraints (e.g., SomeDependency = 2.5).
  • Manifest.toml is an automatically generated file which lists the exact versions and unique IDs (UUIDs) of all the packages that Julia found, based on Project.toml. It includes all the implicit dependencies of the project’s packages. This is useful to reproduce an environment precisely. It is analogous to the output of pip freeze.

By default, the active project is located in ~/.julia/environments/v#.# (where #.# is the Julia version you are using, such as 1.4). You can set a different project when starting Julia:

# BASH
julia --project=/path/to/my_project

Or you can set the JULIA_PROJECT environment variable:

# BASH
export JULIA_PROJECT=/path/to/my_project
julia

Or you can just activate a project directly in Julia (this is analogous to running source my_project/env/bin/activate when using virtualenv):

Pkg.activate("my_project")
 Activating new environment at `/content/my_project/Project.toml`

The my_project directory does not exist yet, but it gets created automatically, along with the Project.toml and Manifest.toml files, when you first add a package:

]add PyCall
  Resolving package versions...
   Updating `/content/my_project/Project.toml`
  [438e738f] + PyCall v1.91.4
   Updating `/content/my_project/Manifest.toml`
  [8f4d0f93] + Conda v1.4.1
  [682c06a0] + JSON v0.21.0
  [1914dd2f] + MacroTools v0.5.5
  [69de0a69] + Parsers v1.0.6
  [438e738f] + PyCall v1.91.4
  [81def892] + VersionParsing v1.2.0
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [de0858da] + Printf
  [9a3f8284] + Random
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [8dfed614] + Test
  [4ec0a83e] + Unicode

You can also add a package via its URL:

]add https://github.com/JuliaLang/Example.jl
   Updating git-repo `https://github.com/JuliaLang/Example.jl`
  Resolving package versions...
   Updating `/content/my_project/Project.toml`
  [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
   Updating `/content/my_project/Manifest.toml`
  [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)

Let’s also add a package with a specific version:

]add Example@0.3
  Resolving package versions...
  Installed Example ─ v0.3.3
   Updating `/content/my_project/Project.toml`
  [7876af07] ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3
   Updating `/content/my_project/Manifest.toml`
  [7876af07] ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3

Now the Project.toml and Manifest.toml files were created:

;find my_project
my_project
my_project/Manifest.toml
my_project/Project.toml
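If you want to check from code which project is currently active, Base.active_project() returns the path of its Project.toml. The sketch below uses a throwaway temporary directory rather than my_project, so it installs nothing and leaves no files behind:

```julia
using Pkg

project_dir = mktempdir()         # throwaway directory, removed when Julia exits
Pkg.activate(project_dir)         # like ]activate: nothing is installed
active = Base.active_project()    # path to the active project's Project.toml
println(active)

Pkg.activate()                    # no argument: back to the default project
```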

Notice that the packages we added to the project were not placed in the my_project directory itself. They were saved in the ~/.julia/packages directory, the compiled files were placed in the ~/.julia/compiled directory, logs were written to ~/.julia/logs and so on.

If several projects use the same package, it will only be downloaded and built once (well, once per version). The ~/.julia/packages directory can hold multiple versions of the same package, so it’s fine if different projects use different versions of the same package. There will be no conflict, no “dependency hell”.

The Project.toml just says that the project depends on PyCall and Example, and it specifies the UUID of each of these packages:

print(read("my_project/Project.toml", String))
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

UUIDs are useful to avoid name conflicts. If several people name their package CoolStuff, then the UUID will clarify which one we are referring to.

The Manifest.toml file is much longer, since it contains all the packages which PyCall and Example depend on, along with their versions (except for the standard library packages), and the dependency graph. This file should never be modified manually:

print(read("my_project/Manifest.toml", String))
# This file is machine-generated - editing it directly is not advised

[[Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[Conda]]
deps = ["JSON", "VersionParsing"]
git-tree-sha1 = "7a58bb32ce5d85f8bf7559aa7c2842f9aecf52fc"
uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
version = "1.4.1"

[[Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[Distributed]]
deps = ["Random", "Serialization", "Sockets"]
uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"

[[Example]]
git-tree-sha1 = "276fa06109ac5c80035cff711b0a18ad5b3117cc"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.3.3"

[[InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

[[JSON]]
deps = ["Dates", "Mmap", "Parsers", "Unicode"]
git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
version = "0.21.0"

[[Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[LinearAlgebra]]
deps = ["Libdl"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[[Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"

[[MacroTools]]
deps = ["Markdown", "Random"]
git-tree-sha1 = "f7d2e3f654af75f01ec49be82c231c382214223a"
uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
version = "0.5.5"

[[Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

[[Mmap]]
uuid = "a63ad114-7e13-5084-954f-fe012c677804"

[[Parsers]]
deps = ["Dates", "Test"]
git-tree-sha1 = "20ef902ea02f7000756a4bc19f7b9c24867c6211"
uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
version = "1.0.6"

[[Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"

[[PyCall]]
deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
git-tree-sha1 = "3a3fdb9000d35958c9ba2323ca7c4958901f115d"
uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
version = "1.91.4"

[[Random]]
deps = ["Serialization"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[[Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[[Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"

[[Test]]
deps = ["Distributed", "InteractiveUtils", "Logging", "Random"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

[[VersionParsing]]
git-tree-sha1 = "80229be1f670524750d905f8fc8148e5a8c4537f"
uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
version = "1.2.0"

Note that Manifest.toml contains the precise version of the Example package that was installed, but the Project.toml file does not specify that version 0.3 is required. That’s because Julia cannot know whether your project is supposed to work only with versions 0.3.x, or whether it could work with other versions as well. So if you want to specify a version constraint for the Example package, you must add it manually in Project.toml. You would normally use your favorite editor to do this, but in this notebook we’ll update Project.toml programmatically:

append_config = """

[compat]
Example = "0.3"
"""

open(f->write(f, append_config), "my_project/Project.toml", "a")
26

Here is the updated Project.toml file:

print(read("my_project/Project.toml", String))
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

[compat]
Example = "0.3"

Now if we try to replace Example 0.3 with version 0.2, we get an error:

try
    pkg"add Example@0.2"
catch ex
    ex
end
  Resolving package versions...

Pkg.Resolve.ResolverError("empty intersection between Example@0.2 and project compatibility 0.3", nothing)
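To make the constraint concrete: for a 0.x package, the compat entry "0.3" accepts any version from 0.3.0 up to, but not including, 0.4.0. The helper below is hypothetical (it is not part of Pkg) and just spells out that range using Julia's built-in version numbers:

```julia
# Hypothetical helper: does version `v` fall in the range implied by compat "0.3"?
satisfies_compat_0_3(v::VersionNumber) = v"0.3.0" <= v < v"0.4.0"

println(satisfies_compat_0_3(v"0.3.3"))  # true
println(satisfies_compat_0_3(v"0.2.0"))  # false
println(satisfies_compat_0_3(v"0.4.0"))  # false
```

This is why version 0.3.3 was accepted while the request for version 0.2 was rejected.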

Now you can run a program based on this project, and it will be able to use all the packages which have been added to this project, with their specific versions. If you import a package which was not explicitly added to this project, Julia will fall back to the default project:

code = """
import PyCall # found in the project
import PyPlot # not found, so falls back to default project
println("Success!")
"""

open(f->write(f, code), "my_program3.jl", "w")
117

;julia --project=my_project my_program3.jl
Success!

59. Packages

Falling back to the default project is fine, as long as you run the code on your own machine, but if you want to share your code with other people, it would be brittle to count on packages installed in their default project. Instead, if you plan to share your code, you should clearly specify which packages it depends on, and use only these packages. Such a shareable project is called a package.

A package is a regular project (as defined above), but with a few extras:
* the Project.toml file must specify a name, a version and a uuid.
* there must be a src/PackageName.jl file containing a module named PackageName.
* you generally want to specify the authors and description, and maybe also the license, repository (e.g., the package’s GitHub URL), and some keywords, but all of these are optional.

It is very easy to create a new package using the ]generate command. To define the authors field, Pkg will look up the user.name and user.email git config entries, so let’s define them before we generate the package:

;git config --global user.name "Alice Bob"
;git config --global user.email "[email protected]"
]generate MyPackages/Hello
 Generating  project Hello:
    MyPackages/Hello/Project.toml
    MyPackages/Hello/src/Hello.jl

This generated the MyPackages/Hello/Project.toml file (along with the enclosing directories) and the MyPackages/Hello/src/Hello.jl file. Let’s take a look at the Project.toml file:

print(read("MyPackages/Hello/Project.toml", String))
name = "Hello"
uuid = "b1200148-98bf-43d1-9bb1-85f7b4552217"
authors = ["Alice Bob "]
version = "0.1.0"

Notice that the project has no dependencies yet, but it has a name, a unique UUID, and a version (plus an author).

Note: if Pkg does not find your name or email in the git config, it falls back to environment variables (GIT_AUTHOR_NAME, GIT_COMMITTER_NAME, USER, USERNAME, NAME and GIT_AUTHOR_EMAIL, GIT_COMMITTER_EMAIL, EMAIL).

And let’s look at the src/Hello.jl file:

print(read("MyPackages/Hello/src/Hello.jl", String))
module Hello

greet() = print("Hello World!")

end # module

Let’s try to use the greet() function from the Hello package:

try
    import Hello
    Hello.greet()
catch ex
    ex
end
ArgumentError("Package Hello not found in current path:\n- Run `import Pkg; Pkg.add(\"Hello\")` to install the Hello package.\n")

Julia could not find the Hello package. When you’re working on a package, don’t forget to activate it first!

]activate MyPackages/Hello
 Activating environment at `/content/MyPackages/Hello/Project.toml`

import Hello
Hello.greet()
┌ Info: Precompiling Hello [b1200148-98bf-43d1-9bb1-85f7b4552217]
└ @ Base loading.jl:1260
Hello World!

It works!

If the Hello package depends on other packages, we must add them:

]add PyCall Example
  Resolving package versions...
  Installed Example ─ v0.5.3
   Updating `/content/MyPackages/Hello/Project.toml`
  [7876af07] + Example v0.5.3
  [438e738f] + PyCall v1.91.4
   Updating `/content/MyPackages/Hello/Manifest.toml`
  [8f4d0f93] + Conda v1.4.1
  [7876af07] + Example v0.5.3
  [682c06a0] + JSON v0.21.0
  [1914dd2f] + MacroTools v0.5.5
  [69de0a69] + Parsers v1.0.6
  [438e738f] + PyCall v1.91.4
  [81def892] + VersionParsing v1.2.0
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [de0858da] + Printf
  [9a3f8284] + Random
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [8dfed614] + Test
  [4ec0a83e] + Unicode

You must not use any package which has not been added to the project. If you do, you will get a warning.

Once you are happy with your package, you can deploy it to GitHub (or anywhere else). Then you can add it to your own projects just like any other package.

If you want to make your package available to the world via the official Julia registry, you just need to send a Pull Request to https://github.com/JuliaRegistries/General. However, it’s highly recommended to automate this using the Registrator.jl GitHub app.

If you want to use other registries (including private registries), check out this page.

Also check out the PkgTemplates package, which provides more sophisticated templates for creating new packages, for example with continuous integration, code coverage tests, etc.

60. Fixing Issues in a Dependency

Sometimes you may run into an issue inside one of the packages your project depends on. When this happens, you can use Pkg’s dev command to fix the issue. For example, let’s pretend the Example package has a bug:

]dev Example
    Cloning git-repo `https://github.com/JuliaLang/Example.jl.git`
  Resolving package versions...
   Updating `/content/MyPackages/Hello/Project.toml`
  [7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]
   Updating `/content/MyPackages/Hello/Manifest.toml`
  [7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]

This command cloned the repo into ~/.julia/dev/Example:

;ls -l "~/.julia/dev"
total 4
drwxr-xr-x 7 root root 4096 Jul  2 00:06 Example

It also updated the Hello package’s Manifest.toml file to ensure the package now uses the Example clone. You can see this using ]status:

]st
Project Hello v0.1.0
Status `/content/MyPackages/Hello/Project.toml`
  [7876af07] Example v0.5.4 [`~/.julia/dev/Example`]
  [438e738f] PyCall v1.91.4

So you would now go ahead and edit the clone and fix the bug. Of course, you would also want to send a PR to the package’s owners so the source package gets fixed. Once that happens, you can go back to the official Example package easily:

]free Example
   Updating `/content/MyPackages/Hello/Project.toml`
  [7876af07] ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3
   Updating `/content/MyPackages/Hello/Manifest.toml`
  [7876af07] ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3
]st
Project Hello v0.1.0
Status `/content/MyPackages/Hello/Project.toml`
  [7876af07] Example v0.5.3
  [438e738f] PyCall v1.91.4

61. Instantiating a Project

If you want to run someone else’s project and you want to make sure you are using the exact same package versions, you can clone the project, and assuming it has a Manifest.toml file, you can activate the project and run ]instantiate to install all the appropriate packages. For example, let’s instantiate the Registrator.jl project:

;git clone https://github.com/JuliaRegistries/Registrator.jl
Cloning into 'Registrator.jl'...
]activate Registrator.jl
 Activating environment at `/content/Registrator.jl/Project.toml`
]instantiate
  Installed TableTraits ───────────────── v1.0.0
  Installed AutoHashEquals ────────────── v0.2.0
  Installed Hiccup ────────────────────── v0.2.2
  Installed DataAPI ───────────────────── v1.2.0
  Installed Lazy ──────────────────────── v0.14.0
  Installed WebSockets ────────────────── v1.5.2
  Installed JSON2 ─────────────────────── v0.3.1
  Installed HTTP ──────────────────────── v0.8.14
  Installed IniFile ───────────────────── v0.5.0
  Installed ZMQ ───────────────────────── v1.2.0
  Installed GitForge ──────────────────── v0.1.5
  Installed AssetRegistry ─────────────── v0.1.0
  Installed TimeToLive ────────────────── v0.3.0
  Installed DataValueInterfaces ───────── v1.0.0
  Installed IteratorInterfaceExtensions ─ v1.0.0
  Installed ZeroMQ_jll ────────────────── v4.3.2+2
  Installed Tables ────────────────────── v1.0.4
  Installed Mux ───────────────────────── v0.7.1
  Installed Parsers ───────────────────── v1.0.2
  Installed MbedTLS_jll ───────────────── v2.16.0+2
  Installed Mustache ──────────────────── v1.0.2
  Installed Pidfile ───────────────────── v1.1.0
  Installed GitHub ────────────────────── v5.1.5
  Installed RegistryTools ─────────────── v1.5.0
######################################################################### 100.0%
######################################################################### 100.0%

Usually, that’s all you need to know about projects and packages, but let’s look a bit under the hood, so you can handle less common cases.

62. Load Path

When you import a package, Julia searches for it in the environments listed in the LOAD_PATH array. An environment can be a project or a directory containing a bunch of packages directly. By default, the LOAD_PATH array contains three elements:

LOAD_PATH
3-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"

Here’s what these elements mean:
* "@" represents the active project, if any: that’s the project activated via --project, JULIA_PROJECT, ]activate or Pkg.activate().
* "@v#.#" represents the default shared project for the version of Julia we are running. That’s why it is used by default when there is no active project.
* "@stdlib" represents the standard library. This is not a project: it’s a directory containing many packages.

If you want to see the actual paths, you can call Base.load_path():

Base.load_path()
3-element Array{String,1}:
 "/content/Registrator.jl/Project.toml"
 "/root/.julia/environments/v1.4/Project.toml"
 "/usr/local/share/julia/stdlib/v1.4"

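To see how each symbolic entry maps to a real path, you can expand the entries one by one. This is only a sketch: it relies on Base.load_path_expand(), an internal helper that Base.load_path() itself uses, so treat its availability in your Julia version as an assumption:

```julia
# Print each LOAD_PATH entry next to the path it expands to.
# Base.load_path_expand is internal (not public API); it returns `nothing`
# when an entry does not resolve to anything, e.g. "@" with no active project.
for entry in LOAD_PATH
    expanded = Base.load_path_expand(entry)
    println(rpad(repr(entry), 12), " => ", something(expanded, "(not found)"))
end
```

Entries that expand to nothing are simply skipped by Base.load_path(), which is why it can return fewer elements than LOAD_PATH contains.
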
You can change the load path if you want to. For example, if you want Julia to look only in the active project and in the standard library, without looking in the default project, then you can set the JULIA_LOAD_PATH environment variable to "@:@stdlib".

If you try to run my_program3.jl this way, it will successfully import PyCall, but it will fail to import PyPlot, since it is not listed in Project.toml (however, it would successfully import any package from the standard library):

try
    withenv("JULIA_LOAD_PATH"=>"@:@stdlib") do
        run(`julia --project=my_project my_program3.jl`)
    end
catch ex
    ex
end
ERROR: LoadError: ArgumentError: Package PyPlot not found in current path:
- Run `import Pkg; Pkg.add("PyPlot")` to install the PyPlot package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] exec_options(::Base.JLOptions) at ./client.jl:288
 [4] _start() at ./client.jl:484
in expression starting at /content/my_program3.jl:2





ProcessFailedException(Base.Process[Process(`julia --project=my_project my_program3.jl`, ProcessExited(1))])

You can also modify the LOAD_PATH array programmatically, for example to make all the packages in the my_packages/ directory available to the project:

push!(LOAD_PATH, "my_packages")
4-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"
 "my_packages"

Now any package added to this directory will be directly available to us:

]generate my_packages/Hello2
 Generating  project Hello2:
    my_packages/Hello2/Project.toml
    my_packages/Hello2/src/Hello2.jl
using Hello2
Hello2.greet()
┌ Info: Precompiling Hello2 [b76a3422-75bc-4a82-ad3b-dff89fdf93f4]
└ @ Base loading.jl:1260


Hello World!

This is a convenience for development, as we didn’t have to push this package to a repository or even add it to the project. However, it’s just for development: once you’re happy with your package, make sure to push it to a repo, and add it to the project normally.
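
A package directory on the LOAD_PATH does not even need a Project.toml file: a src/ file defining a module of the same name is enough. Here is a self-contained sketch that uses a temporary directory and a hypothetical package named Hello3, written by hand rather than with ]generate:

```julia
# Create <tmp>/Hello3/src/Hello3.jl by hand (Hello3 is a hypothetical package name).
pkgs = mktempdir()
mkpath(joinpath(pkgs, "Hello3", "src"))
write(joinpath(pkgs, "Hello3", "src", "Hello3.jl"), """
module Hello3
greet() = "Hello from Hello3!"
end
""")

# Make every package in that directory importable, then import ours.
push!(LOAD_PATH, pkgs)
import Hello3
println(Hello3.greet())

# Remove the directory from the load path again once we are done.
filter!(p -> p != pkgs, LOAD_PATH)
```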

63. Depots

As we saw earlier, new packages you add to a project are placed in the ~/.julia/packages directory, logs are placed in ~/.julia/logs, and so on.

A directory like ~/.julia which contains Pkg related content is called a depot. Julia installs all new packages in the default depot, which is the first directory in the DEPOT_PATH array (this array can be modified manually in Julia, or set via the JULIA_DEPOT_PATH environment variable):

DEPOT_PATH
3-element Array{String,1}:
 "/root/.julia"
 "/usr/local/local/share/julia"
 "/usr/local/share/julia"

The default depot needs to be writeable for the current user, since that’s where new packages will be written to (as well as logs and other stuff). The other depots can be read-only: they’re typically used for private package registries.

You can occasionally run the ]gc command, which will remove all unused package versions (Pkg will use the logs to locate existing projects).

In summary: when some code runs using Foo or import Foo, the LOAD_PATH is used to determine which specific package Foo refers to, while the DEPOT_PATH is used to determine where it is. The exception is when the LOAD_PATH contains directories which directly contain packages: for these packages, the DEPOT_PATH is not used.
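
As a quick illustration of this split, the following sketch prints the default depot and asks Julia where it would load a package from. It uses Base.find_package() on Dates, a standard library package picked arbitrarily for the example:

```julia
# The first depot is where new packages, logs, etc. are written.
default_depot = first(DEPOT_PATH)
println("Default depot:   ", default_depot)
println("Package sources: ", joinpath(default_depot, "packages"))

# Ask Julia which file `import Dates` would load (Dates is just an example).
path = Base.find_package("Dates")
println("Dates would be loaded from: ", path)
```

Since Dates comes from the "@stdlib" entry of the LOAD_PATH, which is a plain directory of packages, the printed file lives in the standard library directory rather than in a depot.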

64. Parallel Computing

Julia supports coroutines (aka green threads), multithreading (without a GIL, unlike CPython!), multiprocessing and distributed computing.

65. Coroutines

Let’s go back to the fibonacci() generator function:

function fibonacci(n)
    Channel() do ch
        a, b = 1, 1
        for i in 1:n
            put!(ch, a)
            a, b = b, a + b
        end
    end
end

for f in fibonacci(10)
    println(f)
end
1
1
2
3
5
8
13
21
34
55

Under the hood, Channel() do ... end creates a Channel object, and spawns an asynchronous Task to execute the code in the do ... end block. The task is scheduled to execute immediately, but when it calls the put!() function on the channel to yield a value, it blocks until another task calls the take!() function to grab that value. You do not see the take!() function explicitly in this code example, since it is executed automatically in the for loop, in the main task. To demonstrate this, we can just call the take!() function 10 times to get all the items from the channel:

ch = fibonacci(10)
for i in 1:10
    println(take!(ch))
end
1
1
2
3
5
8
13
21
34
55

This channel is bound to the task, therefore it is automatically closed when the task ends. So if we try to get one more element, we will get an exception:

try
    take!(ch)
catch ex
    ex
end
InvalidStateException("Channel is closed.", :closed)

Here is a more explicit version of the fibonacci() function:

function fibonacci(n)
  function generator_func(ch, n)
    a, b = 1, 1
    for i in 1:n
        put!(ch, a)
        a, b = b, a + b
    end
  end
  ch = Channel()
  task = @task generator_func(ch, n) # creates a task without starting it
  bind(ch, task) # the channel will be closed when the task ends
  schedule(task) # start running the task asynchronously
  ch
end
fibonacci (generic function with 1 method)

And here is a more explicit version of the for loop:

ch = fibonacci(10)
while isopen(ch)
  value = take!(ch)
  println(value)
end
1
1
2
3
5
8
13
21
34
55

Note that asynchronous tasks (also called “coroutines” or “green threads”) are not actually run in parallel: they cooperate to alternate execution. Some functions, such as put!(), take!(), and many I/O functions, interrupt the current task’s execution, at which point it lets Julia’s scheduler decide which task should resume its execution. This is just like Python’s coroutines.
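
To make this cooperation visible, here is a small sketch in which two tasks record their steps in a shared array and explicitly call yield() after each step. The worker() function and the events array are names made up for this illustration:

```julia
# Two tasks take turns: each one records a step, then yields to the scheduler.
events = String[]

function worker(name, steps)
    for i in 1:steps
        push!(events, "$name$i")
        yield()                  # hand control back to the scheduler
    end
end

# @async creates and schedules a task without blocking the current one.
tasks = [@async(worker(name, 3)) for name in ("A", "B")]
foreach(wait, tasks)             # wait until both tasks are done
println(events)
```

No lock is needed around push!() here because tasks created with @async run on the same thread as their parent, so only one of them executes at any given time.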

For more details on coroutines and tasks, see the manual.

66. Multithreading

Julia also supports multithreading. Currently, you need to specify the number of O.S. threads upon startup, by setting the JULIA_NUM_THREADS environment variable (or using the -t argument in Julia 1.5+). In the first cell, we configured the IJulia kernel so that this environment variable is set:

ENV["JULIA_NUM_THREADS"]
"4"

The actual number of threads started by Julia may be lower than that, as it is limited to the number of available cores on the machine (thanks to hyperthreading, each physical core may run two threads). Here is the number of threads that were actually started:

using Base.Threads
nthreads()
2

Now let’s run 10 tasks across these threads:

@threads for i in 1:10
    println("thread #", threadid(), " is starting task #$i")
    sleep(rand()) # pretend we're actually working
    println("thread #", threadid(), " is finished")
end
thread #1 is starting task #1
thread #2 is starting task #6
thread #2 is finished
thread #2 is starting task #7
thread #1 is finished
thread #1 is starting task #2
thread #2 is finished
thread #2 is starting task #8
thread #1 is finished
thread #1 is starting task #3
thread #1 is finished
thread #1 is starting task #4
thread #2 is finished
thread #2 is starting task #9
thread #1 is finished
thread #1 is starting task #5
thread #1 is finished
thread #2 is finished
thread #2 is starting task #10
thread #2 is finished

Here is a multithreaded version of the estimate_pi() function. Each thread computes part of the sum, and the parts are added at the end:

function parallel_estimate_pi(n)
    s = zeros(nthreads())
    nt = n ÷ nthreads()
    @threads for t in 1:nthreads()
        for i in (1:nt) .+ nt*(t - 1)
          @inbounds s[t] += (isodd(i) ? -1 : 1) / (2i + 1)
        end
    end
    return 4.0 * (1.0 + sum(s))
end

@btime parallel_estimate_pi(100_000_000)
  128.853 ms (16 allocations: 1.63 KiB)





3.1415926635894196

The @inbounds macro is an optimization: it tells the Julia compiler not to add any bounds check when accessing the array. It’s safe in this case since the s array has one element per thread, and t varies from 1 to nthreads(), so there is no risk for s[t] to be out of bounds.

Let’s compare this with the single-threaded implementation:

@btime estimate_pi(100_000_000)
  134.263 ms (0 allocations: 0 bytes)





3.141592663589326

If you are running this notebook on Colab, the parallel implementation is probably no faster than the single-threaded one. That’s because the Colab Runtime only has a single CPU, so there is no benefit from multithreading (plus there is a bit of overhead for managing threads). However, on my 8-core machine, using 16 threads, the parallel implementation is about 6 times faster than the single-threaded one.
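
Note that parallel_estimate_pi() only covers the first nthreads() * (n ÷ nthreads()) terms, so a few terms are dropped when n is not a multiple of the number of threads. One possible alternative, sketched below under the assumption that one task per chunk is acceptable, splits the range with Iterators.partition() and lets each task return its own partial sum. The names term() and spawn_estimate_pi() are made up for this example:

```julia
using Base.Threads

# One term of the series used by estimate_pi().
term(i) = (isodd(i) ? -1 : 1) / (2i + 1)

function spawn_estimate_pi(n; ntasks = nthreads())
    # Split 1:n into at most `ntasks` chunks; the last chunk may be shorter.
    chunks = Iterators.partition(1:n, cld(n, ntasks))
    # Each task sums its own chunk and returns the partial result.
    tasks = map(chunks) do chunk
        Threads.@spawn sum(term, chunk)
    end
    return 4.0 * (1.0 + sum(fetch, tasks))
end

println(spawn_estimate_pi(1_000_000))
```

Because each task keeps its partial sum in a local variable instead of writing to a shared array, there is no need for @inbounds or for indexing by thread.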

Julia has a mapreduce() function which makes it easy to implement functions like parallel_estimate_pi():

function parallel_estimate_pi2(n)
    4.0 * mapreduce(i -> (isodd(i) ? -1 : 1) / (2i + 1), +, 0:n)
end
parallel_estimate_pi2 (generic function with 1 method)
@btime parallel_estimate_pi2(100_000_000)
  106.664 ms (0 allocations: 0 bytes)





3.1415926635897917

Note that Base’s mapreduce() runs on a single thread: it is simply well optimized, so in this run on Colab it’s about 20% faster than parallel_estimate_pi().

You can also spawn a task using Threads.@spawn. It will get executed on any one of the running threads (it will not start a new thread):

task = Threads.@spawn begin
    println("Thread starting")
    sleep(1)
    println("Thread stopping")
    42 # result
end

println("Hello!")

println("The result is: ", fetch(task))

Hello!
Thread starting
Thread stopping
The result is: 42

The fetch() function waits for the task to finish, and fetches the result. You can also just call wait() if you don’t need the result.

Last but not least, you can use channels to synchronize and communicate across tasks, even if they are running across separate threads:

ch = Channel()
task1 = Threads.@spawn begin
    for i in 1:5
        sleep(rand())
        put!(ch, i^2)
    end
    println("Finished sending!")
    close(ch)
end

task2 = Threads.@spawn begin
    foreach(v->println("Received $v"), ch)
    println("Finished receiving!")
end

wait(task2)
Received 1
Received 4
Received 9
Received 16
Finished sending!
Received 25
Finished receiving!
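
The same idea scales to a small pool of workers. The sketch below is one possible layout, not the only one: a producer fills a buffered jobs channel, three consumer tasks share it, and a results channel collects the squares. The channel sizes and the number of consumers are arbitrary choices for this example:

```julia
using Base.Threads

jobs    = Channel{Int}(8)    # buffered, typed channels
results = Channel{Int}(8)

# One producer feeds the jobs channel, then closes it.
producer = Threads.@spawn begin
    foreach(i -> put!(jobs, i), 1:20)
    close(jobs)
end

# Three consumers iterate over the same channel; each job is taken exactly once.
consumers = map(1:3) do _
    Threads.@spawn for j in jobs
        put!(results, j^2)
    end
end

# Close `results` once every consumer is done, so that collect() can finish.
closer = Threads.@spawn begin
    foreach(wait, consumers)
    close(results)
end

squares = sort(collect(results))   # arrival order is not deterministic, so sort
println(squares)
```

Closing each channel when its producers are done is what lets the for loops and collect() terminate instead of blocking forever.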

For more details about multithreading, check out this page.

67. Multiprocessing & Distributed Programming

Julia can spawn multiple Julia processes upon startup if you specify the number of processes via the -p argument. You can also spawn extra processes from Julia itself:

using Distributed
addprocs(4)
workers() # array of worker process ids
4-element Array{Int64,1}:
 2
 3
 4
 5

The main process has id 1:

myid()
1

The @everywhere macro lets you run any code on all workers:

@everywhere println("Hi! I'm worker $(myid())")
Hi! I'm worker 1
      From worker 4:    Hi! I'm worker 4
      From worker 3:    Hi! I'm worker 3
      From worker 2:    Hi! I'm worker 2
      From worker 5:    Hi! I'm worker 5

You can also execute code on a particular worker by using @spawnat:

@spawnat 3 println("Hi! I'm worker $(myid())")
Future(3, 1, 14, nothing)

If you specify :any instead of a worker id, Julia chooses the worker for you:

@spawnat :any println("Hi! I'm worker $(myid())")
      From worker 3:    Hi! I'm worker 3





Future(2, 1, 15, nothing)

Both @everywhere and @spawnat return immediately. The output of @spawnat is a Future object. You can call fetch() on this object to wait for the result:

result = @spawnat 3 1+2+3+4
fetch(result)
10

If you import some package in the main process, it is not automatically imported in the workers. For example, the following code fails because the worker does not know what pyimport is:

using PyCall

result = @spawnat 4 (np = pyimport("numpy"); np.log(10))

try
    fetch(result)
catch ex
    ex
end
      From worker 2:    Hi! I'm worker 2





RemoteException(4, CapturedException(UndefVarError(:pyimport), Any[(#121 at macros.jl:87, 1), (#101 at process_messages.jl:290, 1), (run_work_thunk at process_messages.jl:79, 1), (run_work_thunk at process_messages.jl:88, 1), (#94 at task.jl:358, 1)]))

You must use @everywhere or @spawnat to import the packages you need in each worker:

@everywhere using PyCall

result = @spawnat 4 (np = pyimport("numpy"); np.log(10))

fetch(result)
2.302585092994046

Similarly, if you define a function in the main process, it is not automatically available in the workers. You must define the function in every worker:

@everywhere addtwo(n) = n + 2
result = @spawnat 4 addtwo(40)
fetch(result)
42

You can pass a Future to @everywhere or @spawnat, as long as you wrap it in a fetch() function:

M = @spawnat 2 rand(5)
result = @spawnat 3 fetch(M) .* 10.0
fetch(result)
5-element Array{Float64,1}:
 4.475589942138973
 3.7844448153428067
 6.199227766558075
 8.66410018066203
 3.364462310811107

In this example, worker 2 creates a random array, then worker 3 fetches this array and multiplies each element by 10, then the main process fetches the result and displays it.
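
For common patterns you rarely need to manage futures by hand: the Distributed standard library also provides pmap() and the @distributed macro. Here is a minimal sketch that starts two extra local workers just for the example and reuses the series from estimate_pi(). The term() helper is a name made up for this illustration:

```julia
using Distributed

# Start two local worker processes just for this example.
new_workers = addprocs(2)

# Functions must be defined on every process that will call them.
@everywhere term(i) = (isodd(i) ? -1 : 1) / (2i + 1)

# pmap() sends each item to an available worker and returns the results in order.
firstterms = pmap(term, 1:4)
println(firstterms)

# @distributed with a reducer splits the range across the workers
# and combines the partial sums with (+).
partial = @distributed (+) for i in 1:1_000_000
    term(i)
end
estimate = 4.0 * (1.0 + partial)
println(estimate)

rmprocs(new_workers)   # shut the extra workers down again
```

As a rule of thumb, pmap() suits a modest number of expensive calls, while @distributed suits many cheap iterations combined by a reducer.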

68. GPU

Julia has excellent GPU support. As you may know, GPUs are devices which can run thousands of threads in parallel. Each thread is slower and more limited than on a CPU, but there are so many of them that plenty of tasks can be executed much faster on a GPU than on a CPU, provided these tasks can be parallelized.

Let’s check which GPU device is installed:

;nvidia-smi
Thu Jul  2 00:08:11 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

If you’re running on Colab, your runtime will generally have an Nvidia Tesla K80 GPU with 12GB of RAM installed (but sometimes other GPUs, like an Nvidia Tesla T4 16GB or an Nvidia Tesla P100).

If no GPU is detected, go to Runtime > Change runtime type, set Hardware accelerator to GPU, then go to Runtime > Factory reset runtime, reinstall Julia by running the first cell again, then reload the page and come back here. If you’re running on your own machine, make sure you have a compatible GPU card installed, with the appropriate drivers.

Now let’s create a large matrix and time how long it takes to square it on the CPU:

using BenchmarkTools

M = rand(2^11, 2^11)

function benchmark_matmul_cpu(M)
    M * M
    return
end

benchmark_matmul_cpu(M) # warm up
@btime benchmark_matmul_cpu($M)
  436.690 ms (2 allocations: 32.00 MiB)

Notes:
* For benchmarking, we wrapped the operation in a function which returns nothing.
* Why do we have a “warm up” line? Well, since Julia compiles code on the fly the first time it is executed, it’s good practice to execute the operation we want to benchmark at least once before starting the benchmark, or else the benchmark will include the compilation time.
* We used $M instead of M on the last line. This is a feature of the @btime macro: it evaluates M before benchmarking takes place, to avoid the extra delay that is incurred when benchmarking with global variables.

Now let’s benchmark this same operation on the GPU:

using CUDA

# Copy the data to the GPU. Creates a CuArray:
M_on_gpu = cu(M)

# Alternatively, create a new random matrix directly on the GPU:
#M_on_gpu = CUDA.CURAND.rand(2^11, 2^11)

function benchmark_matmul_gpu(M)
    CUDA.@sync M * M
    return
end

benchmark_matmul_gpu(M_on_gpu) # warm up
@btime benchmark_matmul_gpu($M_on_gpu)
Downloading artifact: CUDA10.1
######################################################################### 100.0%
Downloading artifact: CUDNN+CUDA10.1
######################################################################### 100.0%
Downloading artifact: CUTENSOR+CUDA10.1
######################################################################### 100.0%
┌ Warning: `haskey(::TargetIterator, name::String)` is deprecated, use `Target(; name = name) !== nothing` instead.
│   caller = llvm_compat(::VersionNumber) at compatibility.jl:181
└ @ CUDA /root/.julia/packages/CUDA/42B9G/deps/compatibility.jl:181
  2.360 ms (9 allocations: 368 bytes)

That’s much faster (185x faster in my test on Colab with an NVidia Tesla P100 GPU).

Importantly:
* Before the GPU can work on some data, it needs to be copied to the GPU (or generated there directly).
* The CUDA.@sync macro waits for the GPU operation to complete. Without it, the operation would happen in parallel on the GPU, while execution would continue on the CPU. So we would just be timing how long it takes to start the operation, not how long it takes to complete.
* In general, you don’t need CUDA.@sync, since many operations (including cu()) call it implicitly, and it’s usually a good idea to let the CPU and GPU work in parallel. Typically, the GPU will be working on the current batch of data while the CPU works on preparing the next batch.

Of course, the speed up will vary depending on the matrix size and the GPU type. Moreover, copying the data from the CPU to the GPU is often the slowest part of the operation, but we only benchmarked the matrix multiplication itself. Let’s see what we get if we include the data transfer in the benchmark:

That’s still much faster than on the CPU.

Let’s check how much RAM we have left on the GPU:

CUDA.memory_status()
Effective GPU memory usage: 99.93% (15.888 GiB/15.899 GiB)
CUDA allocator usage: 15.594 GiB
BinnedPool usage: 15.594 GiB (16.000 MiB allocated, 15.578 GiB cached)

Julia’s Garbage Collector will free CUDA arrays like any other objects, when there are no more references to them. However, CUDA.jl uses a memory pool to make allocations faster on the GPU, so don’t be surprised if the allocated memory on the GPU does not go down immediately. Moreover, IJulia keeps a reference to the output of each cell, so if you let any cell output a CuArray, it will only be released when you execute Out[]=0. If you want to force the Garbage Collector to run, you can run GC.gc(). To reclaim memory from the memory pool, use CUDA.reclaim():

GC.gc()
CUDA.reclaim()
16726884352

Many other operations are implemented for CuArray (+, -, etc.) and dotted operations (.+, exp.(), etc). Importantly, loop fusion also works on the GPU. For example, if we want to compute M .* M .+ M, without loop fusion the GPU would first compute M .* M and create a temporary array, then it would add M to that array, like this:

function benchmark_without_fusion(M)
    P = M .* M
    CUDA.@sync P .+ M
    return
end

benchmark_without_fusion(M_on_gpu) # warm up
@btime benchmark_without_fusion($M_on_gpu)
  676.534 μs (140 allocations: 4.30 KiB)

Instead, loop fusion ensures that the array is only traversed once, without the need for a temporary array:

function benchmark_with_fusion(M)
    CUDA.@sync M .* M .+ M
    return
end

benchmark_with_fusion(M_on_gpu) # warm up
@btime benchmark_with_fusion($M_on_gpu)
  387.141 μs (87 allocations: 3.36 KiB)

That’s much faster (75% faster in my test on Colab). 😃

Lastly, you can actually write your own GPU kernels in Julia! In other words, rather than using GPU operations implemented in the CUDA.jl package (or others), you can write Julia code that will be compiled for the GPU, and executed there. This can occasionally be useful to speed up some algorithms where the standard kernels don’t suffice. For example, here’s a GPU kernel which implements u .+= v, where u and v are two (large) vectors:

function worker_gpu_add!(u, v)
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    index ≤ length(u) && (@inbounds u[index] += v[index])
    return
end

function gpu_add!(u, v)
    numblocks = ceil(Int, length(u) / 256)
    @cuda threads=256 blocks=numblocks worker_gpu_add!(u, v)
    return u
end
gpu_add! (generic function with 1 method)

This code example is adapted from the CUDA.jl package’s documentation, which I highly encourage you to check out if you’re interested in writing your own kernels. Here are the key parts to understand this example, starting from the end:
* The gpu_add!() function first calculates numblocks, the number of blocks of threads to start, then it uses the @cuda macro to spawn numblocks blocks of GPU threads, each with 256 threads, and each thread runs worker_gpu_add!(u, v).
* The worker_gpu_add!() function computes u[index] += v[index] for a single value of index: in other words, each thread will just update a single value in the vector! Let’s see how the index is computed:
* The @cuda macro spawned many blocks of 256 threads each. These blocks are organized in a grid, which is one-dimensional by default, but it can be up to three-dimensional. Therefore each thread and each block have an (x, y, z) coordinate in this grid (see the grid, block and thread diagram in the Nvidia blog post).
* threadIdx().x returns the current GPU thread’s x coordinate within its block (one difference with the diagram is that Julia is 1-indexed).
* blockIdx().x returns the current block’s x coordinate in the grid.
* blockDim().x returns the block size along the x axis (in this example, it’s 256).
* gridDim().x returns the number of blocks in the grid, along the x axis (in this example it’s numblocks).
* So the index that each thread must update in the array is (blockIdx().x - 1) * blockDim().x + threadIdx().x.
* As explained earlier, the @inbounds macro is an optimization that tells Julia that the index is guaranteed to be inbounds, so there’s no need for it to check.
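
If you do not have a GPU at hand, the index arithmetic can be checked on the CPU. The following sketch is only a simulation: plain loops stand in for blocks and threads, and the names hits, threads and numblocks are chosen for this illustration. It uses a vector length that is not a multiple of 256, to show why the index ≤ length(u) guard matters:

```julia
# CPU-only simulation of the index arithmetic: no GPU is involved here.
n = 1000                      # vector length, deliberately not a multiple of 256
threads = 256                 # plays the role of blockDim().x
numblocks = ceil(Int, n / threads)

hits = zeros(Int, n)          # how many simulated threads touched each element
for block in 1:numblocks, thread in 1:threads
    index = (block - 1) * threads + thread
    index ≤ n && (hits[index] += 1)   # same guard as in worker_gpu_add!()
end

println(numblocks, " blocks, ", numblocks * threads, " threads, ",
        count(h -> h == 1, hits), " elements updated exactly once")
```

Here 4 blocks of 256 threads give 1024 simulated threads for 1000 elements, so the last 24 threads are stopped by the guard instead of writing out of bounds.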

Now writing your own GPU kernel won’t seem like something only top experts with advanced C++ skills can do: you can do it too!

Let’s check that the kernel works as expected:

u = rand(2^20)
v = rand(2^20)

u_on_gpu = cu(u)
v_on_gpu = cu(v)

u .+= v
gpu_add!(u_on_gpu, v_on_gpu)

@assert Array(u_on_gpu) ≈ u

Yes, it works well!

Note: the ≈ operator checks whether the operands are approximately equal within the float precision limit.

Let’s benchmark our custom kernel:

function benchmark_custom_assign_add!(u, v)
    CUDA.@sync gpu_add!(u, v)
    return
end

benchmark_custom_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_custom_assign_add!($u_on_gpu, $v_on_gpu)
  98.689 μs (52 allocations: 1.31 KiB)

Let’s see how this compares to CUDA.jl’s implementation:

function benchmark_assign_add!(u, v)
    CUDA.@sync u .+= v
    return
end

benchmark_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_assign_add!($u_on_gpu, $v_on_gpu)
  137.072 μs (70 allocations: 1.89 KiB)

How about that? Our custom kernel is faster than CUDA.jl’s kernel! But to be fair, our kernel would not work with huge vectors, since there’s a limit to the number of blocks & threads you can spawn (see Table 15 in CUDA’s documentation). To support such huge vectors, we need each worker to run a loop like this:

function worker_gpu_add!(u, v)
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = blockDim().x * gridDim().x
    for i = index:stride:length(u)
        @inbounds u[i] += v[i]
    end
    return
end
worker_gpu_add! (generic function with 1 method)

This way, if @cuda is executed with a smaller number of blocks than needed to have one thread per array item, the workers will loop appropriately.
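
Again, the stride logic can be checked on the CPU. In this simulation sketch, plain loops stand in for blocks and threads, and the sizes are arbitrary values picked so that there are far fewer simulated threads than elements:

```julia
# CPU-only simulation of the grid-stride loop: no GPU is involved here.
n = 10_000
threads, numblocks = 256, 4           # only 1024 simulated threads for 10,000 elements
stride = threads * numblocks          # plays the role of blockDim().x * gridDim().x

hits = zeros(Int, n)                  # how many times each element gets updated
for block in 1:numblocks, thread in 1:threads
    index = (block - 1) * threads + thread
    for i in index:stride:n           # same loop as in the kernel above
        hits[i] += 1
    end
end

println("every element updated exactly once: ", all(h -> h == 1, hits))
```

Each simulated thread handles elements index, index + stride, index + 2stride, and so on, so the threads never overlap and together they cover the whole vector.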

This should get you started! For more info, check out CUDA.jl’s documentation.

69. Command Line Arguments

Command line arguments are available via ARGS:

ARGS
1-element Array{String,1}:
 "/root/.local/share/jupyter/runtime/kernel-4b7aa9c6-4581-4d7b-acea-4e4dfaf036c8.json"

Unlike Python’s sys.argv, the first element of this array is not the program name. If you need the program name, use PROGRAM_FILE instead:

PROGRAM_FILE
"/root/.julia/packages/IJulia/DrVMH/src/kernel.jl"

You can get the current module, directory, file or line number:

@__MODULE__, @__DIR__, @__FILE__, @__LINE__
(Main, "/content", "In[406]", 1)

The equivalent of Python’s if __name__ == "__main__" is:

if abspath(PROGRAM_FILE) == @__FILE__
    println("Starting of the program")
end
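
Putting these pieces together, a script is often organized around a main() function that receives the arguments explicitly, which also makes it easy to call from the REPL or from tests. This is just one possible layout. The file name add.jl in the usage message is hypothetical:

```julia
# The logic lives in main(), which takes the arguments explicitly.
function main(args)
    if isempty(args)
        println("usage: julia add.jl NUMBER...")   # add.jl is a hypothetical file name
        return nothing
    end
    total = sum(parse(Float64, a) for a in args)   # ARGS elements are always strings
    println("Sum: ", total)
    return total
end

# Only run main() when this file is executed as a script, not when it is include()d.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

For example, calling main(["1.5", "2.5"]) from the REPL prints the sum and returns 4.0.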

70. Memory Management

Let’s check how many megabytes of RAM are available:

free() = println("Available RAM: ", Sys.free_memory() ÷ 10^6, " MB")

free()
Available RAM: 3120 MB

If a variable holds a large object that you don’t need anymore, you can either wait until the variable falls out of scope, or set it to nothing. Either way, the memory will only be freed when the Garbage Collector does its magic, which may not be immediate. In general, you don’t have to worry about that, but if you want, you can always call the GC directly:

function use_ram()
    M = rand(10000, 10000) # use 800 MB of RAM (10^8 Float64 values, 8 bytes each)
    println("sum(M)=$(sum(M))")
end # M will be freed by the GC eventually after this

use_ram()

M = rand(10000, 10000) # use 800 MB of RAM (10^8 Float64 values, 8 bytes each)
println("sum(M)=$(sum(M))")
M = nothing

GC.gc() # rarely needed
sum(M)=4.9997184380985916e7
sum(M)=5.000422876376158e7
free()
Available RAM: 1528 MB
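
To estimate how much memory an array needs before allocating it, you can do the arithmetic on a smaller one. A short sketch, using a 1000×1000 matrix so it runs anywhere:

```julia
# A Float64 takes 8 bytes, so an n×n matrix of them takes 8n² bytes.
n = 1_000
M = rand(n, n)
println("sizeof(M)      = ", sizeof(M), " bytes")             # raw data only
println("summarysize(M) = ", Base.summarysize(M), " bytes")   # data plus object overhead
println("A 10000×10000 matrix would need ", 8 * 10_000^2 ÷ 10^6, " MB")
```

sizeof() only counts the raw element data, while Base.summarysize() also includes the array object’s own overhead, which is negligible for large arrays.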

Thanks!

I hope you enjoyed this introduction to Julia! I recommend you join the friendly and helpful Julia community on Slack or Discourse.

Cheers!

Aurélien Geron

Ref: Git repo for this post.
