By the end of 2019, it is predicted that there will be over 60 million Jupyter Notebooks on GitHub, the world’s largest host of source code. In 2015 the site hosted just 270,000.
Project Jupyter has developed a form of computational notebook that is a free, open-source, interactive web tool. Computational notebooks have been around for decades, but Jupyter Notebooks have gained enormous popularity recently as they now support dozens of programming languages and have created passionate communities of users in a broad range of disciplines.
Jupyter Notebooks are documents that contain both executable code and rich text elements, such as links, equations and different ways of visualising data via graphs, tables and figures. Because of the mix of code and text, notebooks are an ideal place to bring together results and an analysis description.
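To make the mix of code and text concrete, the following is a minimal sketch of how a notebook can be represented as a plain JSON document holding a list of cells, some containing Markdown text and some containing code. It follows the general shape of the version 4 notebook file format, but the helper function names and the rainfall figures are hypothetical and chosen purely for illustration.

```python
import json


def markdown_cell(text):
    """Build a rich-text (Markdown) cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Build an executable code cell that has not been run yet."""
    return {
        "cell_type": "code",
        "execution_count": None,  # None means the cell has not been executed
        "metadata": {},
        "outputs": [],            # results would be stored here after a run
        "source": source,
    }


def build_notebook(cells):
    """Wrap a list of cells in a minimal version 4 notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Hypothetical example: a description followed by the code it describes.
notebook = build_notebook([
    markdown_cell("# Rainfall summary\nMean rainfall for an *illustrative* site."),
    code_cell("rainfall_mm = [12.0, 30.5, 8.2]\nsum(rainfall_mm) / len(rainfall_mm)"),
])

# A notebook file (.ipynb) is this structure serialised as JSON text.
notebook_json = json.dumps(notebook, indent=1)

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Because the description and the code sit side by side in one file, sharing the file shares both the analysis and its explanation.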
Researchers in Australia now have access to a dedicated Jupyter Notebooks environment, SWAN (Service for Web-based ANalysis), via AARNet’s CloudStor. Since its launch in December 2018, this service has seen growing user numbers every month, with over 4,000 users in its first year.
Initially gaining popularity in programming and computer science, Jupyter Notebooks are the tool of choice for an ever-increasing number of researchers. A recent survey of SWAN users revealed users in linguistics, environmental science, water engineering, astronomy, history, physical geography and environmental geoscience, telecommunications, cognitive neuroscience, computer science, applied statistics, maritime engineering, business and management.
Health and medical researchers are using Jupyter Notebooks for data manipulation, data cleaning, machine learning and visualisation. The benefits for researchers include speed, support for deep learning, and the ease of working with massive datasets, all within one environment. The availability and sharing of pre-written code enables researchers with limited coding knowledge to comfortably navigate and manipulate large datasets for analysis and publication, and also allows quick and effective reproduction of code.
In the humanities, researchers are using Jupyter Notebooks for text mining and exploring large digital collections held by cultural institutions. One powerful advantage of the online tool is that it brings the computational power of notebooks to the data, so large datasets can be accessed without being moved around. Researchers with little or no coding experience can now easily explore and analyse data held in the National Library of Australia (and CloudStor) using Jupyter Notebooks created and openly shared by Associate Professor Tim Sherratt via the ‘GLAM Workbench’.
This spirit of creativity, openness and sharing is part of the success of the Jupyter Notebooks community and one that AARNet is keen to support in 2020 and beyond.
Authors: Dr Sara King, eResearch Analyst at AARNet and Genevieve Rosewall, Health and Medical Community Liaison at AARNet
Oct 26, 2020