added firststeps + reorganized website a bit
417
material/1_mon/firststeps/firststeps_handout.qmd
Normal file
@@ -0,0 +1,417 @@
---
code-annotations: select
---
# First Steps

## Getting started

::: callout-tip
The [Julia manual](https://docs.julialang.org/en/v1/manual/getting-started/) is excellent!
:::

At this point we assume that you have Julia 1.9 installed, VSCode ready, and the VSCode Julia extension installed. There are some more [recommended settings in VSCode](vscode.qmd) which are not necessary, but helpful.

We further recommend not using the small "play" button on the top right (which opens a new Julia process every time you change something), but rather opening a new Julia REPL (`ctrl`+`shift`+`p` => `>Julia: Start REPL`) which you keep open as long as possible.

::: callout-tip
VSCode automatically loads the `Revise.jl` package, which watches all your actively loaded packages/files and updates the method instances whenever it detects a change. This is quite similar to `%autoreload 2` in Python. If you use VSCode, you don't need to think about it; if you prefer a command line, you should load `Revise.jl` in your `startup.jl` file.
:::


## Syntax differences Python/R/MatLab

### In the beginning there was `nothing`
`nothing`, but also `NaN` and also `missing`.

Each of those has a specific purpose, but most likely we will only need `a = nothing` and `b = NaN`.

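To make the distinction concrete, here is a minimal sketch of how the three values behave. It only uses functions from Julia's `Base`; the variable names are arbitrary.

```julia
a = nothing        # "no value here", e.g. what a function without a return value gives back
b = NaN            # a Float64 that is "not a number", e.g. the result of 0/0
c = missing        # a missing data point (the only value of type `Missing`)

isnothing(a)       # true
isnan(b)           # true
ismissing(c)       # true

typeof(a), typeof(b), typeof(c)   # (Nothing, Float64, Missing)

b + 1              # NaN: NaN propagates through arithmetic
c + 1              # missing: so does missing
b == b             # false: NaN is never equal to itself
c == c             # missing: comparisons with missing are themselves missing

total = sum(skipmissing([1, missing, 3]))  # 4: missing entries have to be skipped explicitly
```
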
### Control Structures

**Matlab User?** Syntax will be *very* familiar.

**R User?** Forget about all the `{}` brackets.

**Python User?** We don't need no indentation, and we also have 1-based indexing.

``` julia
myarray = zeros(6) # <1>
for k = 1:length(myarray) # <2>
    if iseven(k)
        myarray[k] = sum(myarray[1:k]) # <3>
    elseif k == 5
        myarray = myarray .- 1 # <4>
    else
        myarray[k] = 5
    end # <5>
end
```

1. initialize a vector (check with `typeof(myarray)`)
2. Control-Structure for-loop. 1-based indexing!
3. **MatLab**: Notice the `[` brackets to index Arrays!
4. **Python/R**: `.` always means elementwise
5. **Python/R**: `end` after each control sequence

### Functions
```julia
function myfunction(a,b=123;keyword1="defaultkeyword") #<1>
    if keyword1 == "defaultkeyword"
        c = a+b
    else
        c = a*b
    end
    return c
end
methods(myfunction) # <2>
myfunction(0)
myfunction(1;keyword1 = "notdefault")
myfunction(0,5)
myfunction(0,5;keyword1 = "notdefault")
```
1. everything before the `;` => positional, after => `kwargs`
2. returns two methods, due to the `b=123` optional positional argument

```julia
anonym = (x,y) -> x+y
anonym(3,4)
```

```julia
myshortfunction(x) = x^2
function mylongfunction(x)
    return x^2
end
```

#### elementwise-function / broadcasting
Julia is very neat when it comes to applying functions elementwise (also called broadcasting). (Matlab users know this already.)

```julia
a = [1,2,3,4]
b = sqrt(a) # <1>
c = sqrt.(a) # <2>
```
1. Error - there is no method defined for the `sqrt` of a `Vector`
2. the small `.` applies the function to all elements of the container `a` - this works as "expected"

::: callout-important
Broadcasting is very powerful, as Julia can get a huge performance boost from chaining many operations without having to save temporary arrays. For example:
```julia
a = [1,2,3,4,5]
b = [6,7,8,9,10]

c = (a.^2 .+ sqrt.(a) .+ log.(a.*b))./5
```

In many languages (Matlab, Python, R) you would need to do the following:
```
1. temp1 = a.*b
2. temp2 = log.(temp1)
3. temp3 = a.^2
4. temp4 = sqrt.(a)
5. temp5 = temp3 .+ temp4
6. temp6 = temp5 .+ temp2
7. output = temp6./5
```
Thus, we need to allocate ~7x the memory of the vector (not all at the same time, though).

In Julia, the elementwise code above rather translates to:

```julia
c = similar(a, Float64) # <1>
for k = 1:length(a)
    c[k] = (a[k]^2 + sqrt(a[k]) + log(a[k]*b[k]))/5
end
```
1. Function to initialize an `undef` array with the same size as `a`; we request `Float64` elements because `sqrt` and `log` return floats

The `temp` memory we need at each iteration is simply `c[k]`.
And a nice side effect: by doing this, we get rid of any specialized "vectorized" function, e.g. to do `sum`, or `+`, or whatever. Those are typically the built-in `C` functions in Python/Matlab/R that really speed things up. In Julia **we do not need built-in functions for speed**.
:::



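Two related conveniences are worth knowing when writing chained broadcasts like the one in the callout above. The following is a minimal sketch using the same `a` and `b`: the `@.` macro adds the dots for you, and `.=` writes the result into an array that already exists. The names `c1`, `c2` and `c3` are only chosen for this illustration.

```julia
a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

# Explicit dots on every operation, as in the callout above
c1 = (a.^2 .+ sqrt.(a) .+ log.(a .* b)) ./ 5

# `@.` puts a dot on every function call and operator in the expression
c2 = @. (a^2 + sqrt(a) + log(a * b)) / 5

# `.=` stores the result in an existing array instead of allocating a new one
c3 = zeros(length(a))
c3 .= (a.^2 .+ sqrt.(a) .+ log.(a .* b)) ./ 5

c1 ≈ c2 ≈ c3   # all three variants compute the same values
```
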
## Style-conventions

| | |
| -- | -- |
| variables | lowercase, lower_case |
| Types, Modules | UpperCamelCase |
| functions, macros | lowercase |
| inplace / side-effects | `endwith!()` (the name ends with `!`) |

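The `!` convention from the last table row can be seen with `sort` and `sort!` from `Base`. This is a small sketch; `double!` is a hypothetical function written only to show the convention in user code.

```julia
v = [3, 1, 2]

sorted = sort(v)   # returns a sorted copy; `v` is still [3, 1, 2]
sort!(v)           # sorts in place; `v` is now [1, 2, 3]

# A hypothetical user-defined function following the same convention
function double!(x::Vector)
    x .*= 2        # modifies the caller's array
    return x
end

double!(v)         # `v` is now [2, 4, 6]
```

Note that the `!` is only a naming convention: Julia does not enforce it, it simply warns the reader that an argument will be modified.
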
# Task 1.
OK, that was a lot of introduction, but I think you are ready for your first interactive task.

## Wait - how do I even run things in Julia/VSCode?
Typically, you work in a Julia script with a `.jl` extension, e.g. `scriptname.jl`.

You concurrently have a REPL open, so that you do not have to reload all packages etc. every time. Further, you typically have `Revise.jl` running in the background to automatically update your custom packages / modules (more on that later).

You can select some code and execute it using `ctrl` + `enter`. You can also create code blocks using `#---` and run a whole code block using `alt`+`enter`.

1. Open a new script `statistic_functions.jl` in VSCode in a folder of your choice.

2. Implement a function called `rse_sum`^[rse = research software engineering. We could use `sum` in a principled way, but it requires some knowledge you likely don't have right now]. The following test should return `true`: `rse_sum(1:36) == 666`. You should further make use of a for-loop.

3. Implement a second function called `rse_mean`, which calculates the mean of the provided vector. Make sure to use the `rse_sum` function! Test it using `rse_mean(-15:17) == 1`.

4. Next, implement a standard deviation function `rse_std`: $\sqrt{\frac{\sum_i (x_i-mean(x))^2}{n-1}}$. This time you should use elementwise/broadcasting operators. Test it with `rse_std(1:3) == 1`.

5. Finally, we will implement `rse_tstat`, returning the t-value (with `length(x)-1` degrees of freedom) for the test that the provided array has a mean of 0. Test it with `rse_tstat(2:3) == 5`. Add the keyword argument `σ` that allows the user to optionally provide a pre-calculated standard deviation.

Well done! You now have all functions defined with which we will continue our journey.

# Julia Basics - II
### Strings

```julia
character = 'a'
str = "abc"
str[3] # <1>
```
1. returns `'c'` (a `Char`)

##### characters
```julia
'a':'f' #<1>
collect('a':'f') # <2>
join('a':'f') # <3>
```
1. a `StepRange` between characters
2. a `Vector{Char}`
3. a `String`

##### concatenation

```julia
a = "one"
b = "two"
ab = a * b # <1>
```
1. Indeed, `*` and not `+`: in algebra, `+` implies commutativity (`a+b == b+a`), which obviously does not hold for string concatenation. Multiplication does not have to commute: `a*b != b*a`, at least for matrices.

##### substrings
```julia
str = "long string"
substr = SubString(str, 1, 4) # "long"
whereis_str = findfirst("str",str) # 6:8
```

##### regexp
```julia
str = "any WORD written in CAPITAL?"
occursin(r"[A-Z]+", str) # <1>
m = match(r"[A-Z]+",str) # <2>
```
1. Returns `true`. Note the small `r` prefix in `r"regular expression"` - nifty!
2. Returns a `::RegexMatch` - access via `m.match` & `m.offset` (index) - or `m.captures` / `m.offsets` if you defined capture-groups

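As a small illustration of the `RegexMatch` fields mentioned above, the following sketch reuses the same string. The second pattern with two capture groups and the variable names `m2` and `allcaps` are made up for this example.

```julia
str = "any WORD written in CAPITAL?"

m = match(r"[A-Z]+", str)
m.match        # "WORD"
m.offset       # 5, the index of the first matched character

# Capture groups: an upper-case word followed by a lower-case word
m2 = match(r"([A-Z]+) ([a-z]+)", str)
m2.captures    # ["WORD", "written"]
m2.offsets     # [5, 10]

# `match` only returns the first hit; `eachmatch` iterates over all of them
allcaps = [x.match for x in eachmatch(r"[A-Z]+", str)]   # ["WORD", "CAPITAL"]
```
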
##### Interpolation
```julia
a = 123
str = "this is a: $a; this is 2*a: $(2*a)"
```

## Scopes
Loops (like functions) introduce their own local scope. In a script, assigning to an existing global variable inside a loop creates a new local variable instead of updating the global one.

``` julia
a = 0
for k = 1:10
    a = 1
end
a #<1>
```
1. a = 0! - in a script; but a = 1 in the REPL!

In the REPL, such assignments update the global variable, for interactive convenience.

::: callout-tip
Putting this code into a function automatically resolves this issue
```julia
function myfun()
    a = 0
    for k = 1:10
        a = 1
    end
    return a
end
myfun() # <1>
```
1. returns 1 now in both REPL and include("myscript.jl")

:::

#### explicit global / local

``` julia
a = 0
global b
b = 0
for k = 1:10
    local a
    global b
    a = 1
    b = 1
end
a #<1>
b #<2>
```

1. a = 0
2. b = 1


#### Modifying containers works in any case
```julia
a = zeros(10)
for k = 1:10
    a[k] = k
end
a #<1>
```
1. This works "correctly" in the `REPL` as well as in a script, because we modify the content of `a`, not `a` itself

## Types
Types play a super important role in Julia for several main reasons:

1) They allow for specialization e.g. `+(a::Int64,b::Float64)` might have a different (faster?) implementation compared to `+(a::Float64,b::Float64)`
2) They allow for generalization using `abstract` types
3) They act as containers, structuring your programs and tools

Everything in Julia has a type! Check this out:
```julia
typeof(1)            # Int64 (on a 64-bit system)
typeof(1.0)          # Float64
typeof(sum)          # typeof(sum): every function has its own type
typeof([1])          # Vector{Int64}
typeof([(1,2),"5"])  # Vector{Any}
```
----

We will discuss two types of types:

1) **`composite`** types
2) `abstract` types.

::: {.callout-tip collapse="true"}
## Click me for even more types!
There is a third type, `primitive type` - but we will practically never use them.
Not much to say at this level, they are types like `Float64`. You could define your own one, e.g.
```julia
primitive type Float128 <: AbstractFloat 128 end
```

And there are two more, `Singleton types` and `Parametric types` - which (at least the latter), you might use at some point. But not in this tutorial.

:::


### composite types
You can think of these types as containers for your variables, which allow for specialization.
```julia
struct SimulationResults
    parameters::Vector
    results::Vector
end

s = SimulationResults([1,2,3],[5,6,7,8,9,10,NaN])

# extend `Base.print` (instead of shadowing it with a new `print` in `Main`)
function Base.print(s::SimulationResults)
    println("The following simulation was run:")
    println("Parameters: ",s.parameters)
    println("And we got results!")
    println("Results: ",s.results)
end

print(s)

function SimulationResults(parameters) # <1>
    results = run_simulation(parameters)
    return SimulationResults(parameters,results)
end

function run_simulation(x)
    return cumsum(repeat(x,2))
end

s = SimulationResults([1,2,3])
print(s)
```
1. in case not all fields are directly defined, we can provide an outer constructor (there are also inner constructors, but we will not discuss them here)


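Abstract types, the second kind named in the list of types above, can be sketched in a similar way. All names below (`AbstractResult`, `CoarseResult`, `FineResult`, `nvalues`, `describe`) are hypothetical and only serve to show how generalization and specialization interact.

```julia
abstract type AbstractResult end     # cannot be instantiated, only subtyped

struct CoarseResult <: AbstractResult
    values::Vector{Float64}
end

struct FineResult <: AbstractResult
    values::Vector{Float64}
    stepsize::Float64
end

# Generalization: one method that works for every subtype of AbstractResult
nvalues(r::AbstractResult) = length(r.values)

# Specialization: a generic fallback plus a more specific method for FineResult
describe(r::AbstractResult) = "a result with $(nvalues(r)) values"
describe(r::FineResult) = "a fine result with $(nvalues(r)) values and step size $(r.stepsize)"

describe(CoarseResult([1.0, 2.0]))          # uses the generic fallback
describe(FineResult([1.0, 2.0, 3.0], 0.1))  # uses the specialized method
```

Julia always picks the most specific method that matches the argument types, so new subtypes work with `nvalues` immediately and only need their own `describe` method if they should behave differently.
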
::: callout-warning
Once defined, a type definition in the global scope of the REPL cannot be redefined without restarting the Julia REPL! This is annoying, but there are some tricks around it (e.g. defining the type in a module (see below), and then reloading the module).
:::

# Task 2

1. Implement a type `StatResult` with fields for `x`, `n`, `std` and `tvalue`
2. Implement an outer constructor that can run `StatResult(2:10)` and return the full type including the calculated t-values.
3. Implement a method of `length` for `StatResult`, so that multiple dispatch picks it up for your type
4. **Optional:** If you have time, optimize the functions, so that mean, sum, length, std etc. are not calculated multiple times - you might want to rewrite your type. Note: This is a bit tricky :)

# Julia Basics III
## Modules
```julia
module MyStatsPackage
include("src/statistic_functions.jl")
export SimulationResults #<1>
export rse_tstat
end

using .MyStatsPackage # leading `.`: the module is defined in this session, not an installed package
```
1. This makes the `SimulationResults` type immediately available after running `using .MyStatsPackage`. To use the other "internal" functions, one would use `MyStatsPackage.rse_sum`.
```julia
import .MyStatsPackage

MyStatsPackage.rse_tstat(1:10)

import .MyStatsPackage: rse_sum
rse_sum(1:10)
```

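The difference between exported and internal names can be tried out without any files. The following self-contained sketch uses a hypothetical module `MiniStats` with two made-up functions; it is not part of the tasks.

```julia
module MiniStats                    # hypothetical module, defined directly in the session
export mini_mean                    # available without prefix after `using .MiniStats`

mini_sum(x) = sum(x)                # not exported, so it stays "internal"
mini_mean(x) = mini_sum(x) / length(x)
end

using .MiniStats                    # leading `.` because the module is not an installed package

mini_mean(1:3)                      # 2.0, exported name
MiniStats.mini_sum(1:3)             # 6, internal name needs the module prefix
```
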
## Macros
Macros allow programmers to edit the actual code **before** it is run. We will pretty much just use them, without learning how they work.

```julia
@which cumsum
@which(cumsum)
a = "123"
@show a
```

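A few more macros from `Base` that you will likely meet early on are sketched below. The variable names are arbitrary, and `@macroexpand` is included only to illustrate that a macro rewrites code before it runs.

```julia
x = 3

# `@show` prints the expression together with its value, and returns the value
y = @show x^2

# `@assert` throws an AssertionError if the condition is false
@assert y == 9

# `@elapsed` returns the run time of an expression in seconds
t = @elapsed sum(rand(1000))

# `@macroexpand` returns the code a macro call is rewritten to, without running it
ex = @macroexpand @show x
```
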
# Cheatsheets

## meta-tools

<!-- maybe move to own file "cheatsheets?" -->

|                        | Julia                  | Python                 |
|------------------------|------------------------|------------------------|
| Documentation | `?obj` | `help(obj)` |
| Object content | `dump(obj)` | `print(repr(obj))` |
| Exported functions | `names(FooModule)` | `dir(foo_module)` |
| List function signatures with that name | `methods(myFun)` | |
| List functions for specific type | `methodswith(SomeType)` | `dir(SomeType)` |
| Where is ...? | `@which func` | `func.__module__` |
| What is ...? | `typeof(obj)` | `type(obj)` |
| Is it really a ...? | `isa(obj, SomeType)` | `isinstance(obj, SomeType)` |

## debugging

| | |
|--|--|
| `@run sum(5+1)` | run debugger, stop at error/breakpoints |
| `@enter sum(5+1)` | enter debugger, don't start code yet |
| `@show variable` | prints: variable = variablecontent |
| `@debug variable` | logs a debug message that is only shown when debug logging is enabled; very convenient in combination with `ENV["JULIA_DEBUG"] = ToBeDebuggedModule` (could be `Main` as well) |

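The `@show` and `@debug` rows can be tried directly in the REPL. This is a minimal sketch assuming the code runs in `Main`; the message text and the key `total` are made up for the example.

```julia
ENV["JULIA_DEBUG"] = "Main"          # enable debug messages for code running in `Main`

x = [1, 2, 3]
@show x                              # prints: x = [1, 2, 3]
@debug "state of x" x total=sum(x)   # only logged because JULIA_DEBUG includes Main
```

Without the `ENV["JULIA_DEBUG"]` line the `@debug` call is silently skipped, which is what makes it convenient to leave such statements in your code.
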