About

Published: 2022-07-09

Julia has been called the programming language of the 21st century for scientific computing, data science, and machine learning. It is a high-level, high-performance, dynamic language that is typically faster than other scripting languages, thanks to design decisions such as type stability achieved through method specialization via multiple dispatch. Julia code can be both concise and efficient, so clear performance gains need not come at the cost of readability. In addition, Julia’s environments are fully reproducible, and the language makes it easy to express both object-oriented and functional programming patterns.

This tutorial provides an introduction to key data science tools in Julia, such as data management with Arrow.jl and Tables.jl and (generalized) linear mixed models with GLMM.jl and MixedModels.jl. Unlike many widely used R packages, which often rely on C, C++, or Fortran for performance, all the packages we describe are written entirely in Julia, illustrating the language’s potential to overcome the two-language problem.

This tutorial will appeal to anyone interested in learning more about Julia and the Julia packages already available for statistics and data science. In addition to lectures, participants will engage in hands-on exercises. For example, participants will bring a dataset of their choice along with an existing script, written in another language (R or Python), that performs certain data analyses. During the tutorial, participants will translate their work to Julia in order to compare run times and ease of programming.