Revised with 1.6.1
parent 6acd882163
commit f855c1c2ea
170
DataFrames/01__Environment_setup.ipynb
Normal file
@@ -0,0 +1,170 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Environment setup for data frames tutorial\n",
    "\n",
    "## Bogumił Kamiński"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Welcome to this introduction to DataFrames.jl!\n",
    "\n",
    "This set of Jupyter notebooks is intended to give you an overview of the functionality of DataFrames.jl, based on practical examples.\n",
    "\n",
    "You can find overviews of the functionality of DataFrames.jl (organized by task type rather than as exercises, unlike this tutorial) in the following locations:\n",
    "* the official manual at https://juliadata.github.io/DataFrames.jl/stable/\n",
    "* a tutorial going through all the functionality of DataFrames.jl at https://github.com/bkamins/Julia-DataFrames-Tutorial\n",
    "\n",
    "We also assume that you have a basic knowledge of the Julia language and the Julia ecosystem. There are great tutorials on this topic at [JuliaAcademy](https://juliaacademy.com/), so I encourage you to check them out.\n",
    "\n",
    "As this is a hands-on tutorial, you can expect the examples to be implemented the way I would write them in an actual project."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notebooks were prepared under Julia 1.5.3 and tested under Julia 1.6.1. If you have a different version of Julia installed, change the kernel using the *Kernel/Change kernel* menu option (assuming you are on Julia 1.x, all examples should work without problems)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "v\"1.6.1\""
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "VERSION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Jupyter Notebook automatically activates the project environment if one is found in the working directory.\n",
    "\n",
    "So first let us check that the Project.toml and Manifest.toml files are present (they should be if you cloned the repository of this tutorial)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2-element BitVector:\n",
       " 1\n",
       " 1"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "isfile.([\"Project.toml\", \"Manifest.toml\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should get `1` printed (meaning `true`) in both entries of the vector.\n",
    "\n",
    "Now we are sure that you are going to use exactly the same versions of the packages that I use when running this tutorial.\n",
    "\n",
    "Let us check which packages (and which versions) we will use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "] status"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These notebooks should work with DataFrames versions 0.22 and 1.1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the command above warns that some of the packages are not downloaded, run the `instantiate` command in the following cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "] instantiate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, we will use the following packages:\n",
    "\n",
    "Package | Description\n",
    ":-|:-\n",
    "DataFrames.jl | the core package that is the subject of this tutorial; it is used for data manipulation; the exact version in use is listed in the `] status` output above\n",
    "CSV.jl | a package for reading and writing CSV files\n",
    "FreqTables.jl | a very useful package for creating frequency tables\n",
    "GLM.jl | a package for fitting Generalized Linear Models (as no data science tutorial would be complete without building some predictive model)\n",
    "PyPlot.jl | a package for plotting; there are many options in the Julia ecosystem to choose from; in this tutorial we use PyPlot.jl as it is based on Matplotlib, so if you have experience with the Python data science technology stack it should be familiar\n",
    "Pipe.jl | a package that makes chaining of operations super powerful (which is something you probably know from `%>%` in R)\n",
    "Arrow.jl | a package for working with data in Apache Arrow format\n",
    "Unitful.jl | a package for working with physical units (like kg, cm, ...)"
   ]
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "kernelspec": {
   "display_name": "Julia 1.6.1",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
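
The notebook added in this commit checks for the environment files and then uses the `] status` and `] instantiate` package-mode commands. For readers who prefer plain function calls, the same steps can be sketched with the `Pkg` standard library. This is only a sketch and not part of the commit: it assumes Julia is started in the directory that holds the tutorial's `Project.toml` and `Manifest.toml`, and the names `project_dir`, `required`, and `missing_files` are illustrative.

```julia
using Pkg

# Assumption: the current working directory is the cloned tutorial repository.
project_dir = pwd()
required = ["Project.toml", "Manifest.toml"]

# Collect the environment files that are not present (illustrative helper variable).
missing_files = filter(f -> !isfile(joinpath(project_dir, f)), required)

if isempty(missing_files)
    Pkg.activate(project_dir)  # use the environment described by Project.toml
    Pkg.instantiate()          # download the package versions recorded in Manifest.toml
    Pkg.status()               # list the packages of the active environment
else
    @warn "Environment files not found; clone the tutorial repository first" missing_files
end
```

Calling `Pkg.instantiate()` is harmless when everything is already downloaded, so it can be run unconditionally rather than only after `] status` reports missing packages.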