At this point, we assume that you have Julia 1.9 installed, VSCode ready, and the VSCode Julia Language Support extension installed. There are some more [recommended settings in VSCode](vscode.qmd) which are not necessary, but helpful.
We further recommend not using the small "play" button on the top right (which opens a new Julia process every time you change something), but rather opening a new Julia REPL (`ctrl`+`shift`+`p` => `>Julia: Start Repl`) which you keep open as long as possible.
VSCode automatically loads the `Revise.jl` package, which watches all your actively loaded packages/files and updates the method definitions whenever it detects a change. This is quite similar to `%autoreload 2` in Python. If you use VSCode, you don't need to think about it; if you prefer the command line, you should load `Revise.jl` in your `startup.jl` file.
Broadcasting is very powerful, as Julia can get a huge performance boost from chaining many operations, without needing to store temporary arrays. For example:
And a nice side effect: by doing this, we get rid of any specialized "serialized" function, e.g. to do sum, or + or whatever. Those are typically the built-in `C` functions in Python/Matlab/R that really speed things up. In Julia **we do not need built-in functions for speed**.
Ok, that was a lot of introduction, but I think you are ready for your first interactive task.
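A minimal sketch of such a chained broadcast, with made-up vectors `x` and `y` (the names and values are assumptions for illustration only):

```julia
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

# All dotted operations are fused into a single loop,
# so no temporary arrays are created for `x .^ 2` or `2 .* y`.
z = sqrt.(x .^ 2 .+ 2 .* y)

# The `@.` macro adds the dots to every call and operator for us.
z2 = @. sqrt(x^2 + 2 * y)

# Writing into a preallocated array avoids allocating the result as well.
out = similar(x)
out .= sqrt.(x .^ 2 .+ 2 .* y)
```

All three variants compute the same values; they only differ in how the dots are written and where the result is stored.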
## Wait - how do I even run things in Julia/VSCode?
Typically, you work in a Julia script with a name ending in `.jl`, e.g. `scriptname.jl`.
You concurrently have a REPL open, so that you do not have to reload all packages etc. every time. Further, you typically have `Revise.jl` running in the background to automatically update your custom packages / modules (more on that later).
You can select some code and execute it using `ctrl` + `enter`. You can also create code blocks using `#---` and run a whole code block using `alt`+`enter`.
1. Open a new script `statistic_functions.jl` in VSCode in a folder of your choice.
2. Implement a function called `rse_sum`^[rse = research software engineering, we could use `sum` in a principled way, but it requires some knowledge you likely don't have right now]. The following test should return `true`: `rse_sum(1:36) == 666`. You should further make use of a for-loop.
3. Implement a second function called `rse_mean`, which calculates the mean of the provided vector. Make sure to use the `rse_sum` function! Test it using `rse_mean(-15:17) == 1`
4. Next, implement a standard deviation function `rse_std`: $\sqrt{\frac{\sum(x-mean(x))^2}{n-1}}$. This time you should use elementwise/broadcasting operators. Test it with `rse_std(1:3) == 1`
5. Finally, we will implement `rse_tstat`, returning the t-value with `length(x)-1` DF for testing whether the provided array actually has a mean of 0. Test it with `rse_tstat(2:3) == 5`. Add the keyword argument `σ` that allows the user to optionally provide a pre-calculated standard deviation.
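If keyword arguments are new to you, here is a minimal sketch of the syntax. The function `center` and its keyword `offset` are hypothetical and deliberately unrelated to the task:

```julia
# Hypothetical helper, only to illustrate keyword-argument syntax.
# Everything after the `;` is a keyword argument. Its default value
# may be computed from the positional arguments (here: the first element).
center(x; offset = first(x)) = x .- offset

a = center([3, 4, 5])             # uses the default, offset = 3
b = center([3, 4, 5]; offset = 1) # the caller provides the value
```

The default is only evaluated when the caller does not pass the keyword, which is useful when the default is costly to compute.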
Well done! You now have all functions defined with which we will continue our journey.
# Julia Basics - II
### Strings
```julia
character = 'a'
str = "abc"
str[3] # <1>
```
1. returns the character `'c'`
##### characters
```julia
'a':'f' #<1>
collect('a':'f') # <2>
join('a':'f') # <3>
```
1. a `StepRange` of characters
2. a `Vector{Char}`
3. a `String`
##### concatenation
```julia
a = "one"
b = "two"
ab = a * b # <1>
```
1. Indeed, `*` and not `+`. In algebra, plus implies that `a+b == b+a`, which obviously is not true for string concatenation. Multiplication does not have to commute: in general `a*b != b*a`, at least for matrices.
##### substrings
```julia
str = "long string"
substr = SubString(str, 1, 4) # "long"
whereis_str = findfirst("str",str) # 6:8, the index range of the first match
```
##### regexp
```julia
str = "any WORD written in CAPITAL?"
occursin(r"[A-Z]+", str) # <1>
m = match(r"[A-Z]+",str) # <2>
```
1. Returns `true`. Note the small `r` prefix in `r"regular expression"`, which creates a `Regex` - nifty!
2. Returns a `RegexMatch` - access via `m.match` & `m.offset` (index) - or `m.captures` / `m.offsets` if you defined capture groups
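To make the capture-group fields concrete, here is a small sketch on the same string. The pattern with two groups is our own choice for illustration:

```julia
str = "any WORD written in CAPITAL?"

# Two capture groups, each matching a run of word characters
m = match(r"(\w+) written in (\w+)", str)

m.match     # "WORD written in CAPITAL", the whole match
m.offset    # 5, the index where the whole match starts
m.captures  # ["WORD", "CAPITAL"], one entry per capture group
m.offsets   # [5, 21], the start index of each capture group
```

If the pattern does not match at all, `match` returns `nothing`, so it is worth checking the result before accessing its fields.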
##### Interpolation
```julia
a = 123
str = "this is a: $a; this 2*a: $(2*a)" # "this is a: 123; this 2*a: 246"
```
## Scopes
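Most blocks (functions, loops, `let`, ...) open their own local scope. In a script, an assignment inside such a block creates a new local variable, even if a global variable of the same name exists.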
``` julia
a = 0
for k = 1:10
    a = 1
end
a #<1>
```
1. `a == 0` in a script, but `a == 1` in the REPL!
In the REPL, such assignments modify the global variable instead, for debugging convenience.
::: callout-tip
Putting this code into a function automatically resolves this issue
```julia
function myfun()
    a = 0
    for k = 1:10
        a = 1
    end
    return a
end
myfun() # <1>
```
1. returns `1` now, both in the REPL and via `include("myscript.jl")`
:::
#### explicit global / local
``` julia
a = 0
global b
b = 0
for k = 1:10
    local a
    global b
    a = 1
    b = 1
end
a #<1>
b #<2>
```
1. a = 0
2. b = 1
#### Modifying containers works in any case
```julia
a = zeros(10)
for k = 1:10
    a[k] = k
end
a #<1>
```
1. This works "correctly" in the `REPL` as well as in a script, because we modify the content of `a`, not `a` itself
## Types
Types play a super important role in Julia for several main reasons:
1) They allow for specialization, e.g. `+(a::Int64,b::Float64)` might have a different (faster?) implementation compared to `+(a::Float64,b::Float64)`
2) They allow for generalization using `abstract` types
3) They act as containers, structuring your programs and tools
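The first two points can be sketched with a small, hypothetical function `describe` (the name and the returned strings are made up for illustration): Julia picks the most specific method that fits the types of the arguments.

```julia
# Hypothetical function, only to illustrate dispatch on types.
describe(x::Float64)       = "exactly a Float64"  # specialization on a concrete type
describe(x::AbstractFloat) = "some other float"   # generalization via an abstract type
describe(x::Number)        = "some number"        # even more general fallback

r1 = describe(1.0)    # Float64 hits the most specific method
r2 = describe(1.0f0)  # Float32 is an AbstractFloat, but not a Float64
r3 = describe(1)      # an integer is only a Number
```

The third point, types as containers, is what the composite types further down are about.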
Everything in Julia has a type! Check this out:
```julia
typeof(1)
typeof(1.0)
typeof(sum)
typeof([1])
typeof([(1,2),"5"])
```
----
We will discuss two types of types:
1) **`composite`** types
2) `abstract` types.
::: {.callout-tip collapse="true"}
## Click me for even more types!
There is a third kind, the `primitive type`, but we will practically never use it.
Not much to say at this level: these are types like `Float64`. You could define your own, e.g.
```julia
primitive type Float128 <: AbstractFloat 128 end
```
And there are two more, `Singleton types` and `Parametric types` - which (at least the latter), you might use at some point. But not in this tutorial.
:::
### composite types
You can think of these types as containers for your variables, which allow you to specialize functions on them.
```julia
struct SimulationResults
    parameters::Vector
    results::Vector
end
s = SimulationResults([1,2,3],[5,6,7,8,9,10,NaN])
# extend `Base.print` explicitly, so this also works if `print` was already used in the session
function Base.print(s::SimulationResults)
    println("The following simulation was run:")
    println("Parameters: ",s.parameters)
    println("And we got results!")
    println("Results: ",s.results)
end
print(s)
function SimulationResults(parameters) # <1>
    results = run_simulation(parameters)
    return SimulationResults(parameters,results)
end
function run_simulation(x)
    return cumsum(repeat(x,2))
end
s = SimulationResults([1,2,3])
print(s)
```
1. In case not all fields are directly provided, we can define an outer constructor (there are also inner constructors, but we will not discuss them here)
::: callout-warning
Once defined, a type definition in the global scope of the REPL cannot be redefined without restarting the Julia REPL! This is annoying, but there are some tricks around it, e.g. defining the type in a module (see below) and then reloading the module.
:::
# Task 2
1. Implement a type `StatResult` with fields for `x`, `n`, `std` and `tvalue`
2. Implement an outer constructor that can run `StatResult(2:10)` and return the full type including the calculated t-values.
3. Implement a `length` method for `StatResult` (i.e. extend `Base.length`) to practice multiple dispatch
4. **Optional:** If you have time, optimize the functions, so that mean, sum, length, std etc. are not calculated multiple times - you might want to rewrite your type. Note: This is a bit tricky :)
# Julia Basics - III
## Modules
```julia
module MyStatsPackage
include("src/statistic_functions.jl")
export SimulationResults #<1>
export rse_tstat
end
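using .MyStatsPackage # leading dot: the module was defined in this session; drop the dot once it is an installed package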
```
1. This makes the `SimulationResults` type immediately available after running `using MyStatsPackage`. To use the other "internal" functions, one would use `MyStatsPackage.rse_sum`.
```julia
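# if the module was defined in the current session rather than installed as a package, write `.MyStatsPackage` (with a leading dot) instead
import MyStatsPackage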
MyStatsPackage.rse_tstat(1:10)
import MyStatsPackage: rse_sum
rse_sum(1:10)
```
## Macros
Macros allow programmers to edit the actual code **before** it is run. We will pretty much just use them, without learning how they work.
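As a small taste of what using macros looks like, here is a sketch with three macros that ship with Julia. The variable names are arbitrary:

```julia
x = 3

# `@show` prints the expression together with its value, then returns the value
y = @show x + 1

# `@elapsed` returns the run time of an expression in seconds
t = @elapsed sum(1:1_000)

# `@macroexpand` returns the code a macro generates, without running it
ex = @macroexpand @assert x == 3
```

Macro calls are easy to spot by the leading `@`. Looking at `ex` shows that `@assert` was rewritten into ordinary Julia code before anything was executed.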