Overview
This course is oriented towards data analysts as well as research scientists. Julia is a rapidly emerging programming language with a strong focus on numerical accuracy, scientific computing and statistics. It owes most of its reputation to its speed of execution combined with its ease of programming. Less often emphasized, though equally true, is that:
- Julia has a wealth of built-in and external tools for distributed and parallel computing,
- it facilitates the construction of user-defined data structures,
- it makes metaprogramming easy, and therefore also lets you define your own domain-specific languages (DSLs),
- it allows interacting with several other programming languages such as C, Python and R,
- it provides a multiple-dispatch programming paradigm, which in many ways helps you organize your code and makes you a better programmer and software engineer.
Requirements
Some familiarity with programming is desirable, but not essential. The aim of the course is to teach you the basics of the Julia programming language in a self-contained fashion.
Course Outline
Introduction to Julia
- What niche Julia fills
- How Julia can help you with data analysis
- What you can expect to get out of this course
- Getting started with Julia’s REPL
- Alternative environments for Julia development: Juno, IJulia and Sublime-IJulia
- The Julia ecosystem: documentation and package search
- Getting more help: Julia forums and Julia community
Strings: Hello World
- Introduction to the Julia REPL and batch execution via “Hello World”
- Julia string types
Scalar Types
- What is a variable? Why do we use a name and a type for it?
- Integers
- Floating point numbers
- Complex numbers
- Rational numbers
Arrays
- Vectors
- Matrices
- Multi-dimensional arrays
- Heterogeneous arrays (cell arrays)
- Comprehensions
Other Elementary Types
- Tuples
- Ranges
- Dictionaries
- Symbols
Building Your Own Types
- Abstract types
- Composite types
- Parametric composite types
Functions
- How to define a function in Julia
- Julia functions as methods operating on types
- Multiple dispatch
- How multiple dispatch differs from traditional object-oriented programming
- Parametric functions
- Functions changing their input
- Anonymous functions
- Optional function arguments
- Required function arguments
Constructors
- Inner constructors
- Outer constructors
Control Flow
- Compound expressions and scoping
- Conditional evaluation
- Loops
- Exception handling
- Tasks
Code Organization
- Modules
- Packages
Metaprogramming
- Symbols
- Expressions
- Quoting
- Internal representation
- Parsing
- Evaluation
- Interpolation
Reading and Writing Data
- Filesystem
- Data I/O
- Lower-level data I/O
- Data frames
Distributions and Statistics
- Defining distributions
- Interface for evaluating and sampling from distributions
- Mean, variance and covariance
- Hypothesis testing
- Generalized linear models: a linear regression example
Plotting
- Plotting packages: Gadfly, Winston, Gaston, PyPlot, Plotly, Vega
- Introduction to Gadfly
- Interact and Gadfly
Parallel Computing
- Introduction to Julia’s message passing implementation
- Remote calling and fetching
- Parallel map (pmap)
- Parallel for
- Scheduling via tasks
- Distributed arrays