Publish Interactive Data Visualizations for Free with Python and Marimo

Working in data science, it can be hard to share insights from complex datasets using only static figures. All of the facets that describe the shape and meaning of interesting data aren’t always captured in a handful of pre-generated figures. While we have powerful technologies available for presenting interactive figures, where a viewer can rotate, filter, zoom, and generally explore complex data, they always come with tradeoffs.

Here I present my experience using a recently released Python library, marimo, which opens up exciting new opportunities for publishing interactive visualizations across the entire field of data science.

Interactive Data Visualization

The tradeoffs to consider when selecting an approach for presenting data visualizations can be broken into three categories:

Capabilities: what visualizations and interactivity am I able to present to the user?
Publication Cost: what resources are needed for displaying this visualization to users (e.g. running servers, hosting websites)?
Ease of Use: how much of a new skillset / codebase do I need to learn upfront?

JavaScript is the foundation of portable interactivity. Every user has a web browser installed on their computer and there are many different frameworks available for displaying any degree of interactivity or visualization you might imagine (for example, this gallery of amazing things people have made with three.js). Since the application is running on the user’s computer, no costly servers are needed. However, a significant drawback for the data science community is ease of use, as JS doesn’t have many of the high-level (i.e. easy-to-use) libraries that data scientists use for data manipulation, plotting, and interactivity.

Python provides a useful point of comparison. Because of its continually growing popularity, some have called this the “Era of Python”. For data scientists in particular, Python stands alongside R as one of the foundational languages for quickly and effectively wielding complex data. While Python may be easier to use than JavaScript, there are fewer options for presenting interactive visualizations. Some popular projects providing interactivity and visualization have been Flask, Dash, and Streamlit (also worth mentioning: bokeh, HoloViews, altair, and plotly). The biggest tradeoff for using Python has been the cost of publishing, that is, delivering the tool to users. In the same way that shinyapps require a running computer to serve up the visualization, these Python-based frameworks have been exclusively server-based. This is by no means prohibitive for authors with a budget to spend, but it does limit the number of users who can take advantage of a particular project.

Pyodide is an intriguing middle ground: Python code running directly in the web browser using WebAssembly (WASM). There are resource limitations (only one thread and 2GB of memory) that make this impractical for doing the heavy lifting of data science. However, this can be more than sufficient for building visualizations and updating them based on user input. Because it runs in the browser, no servers are required for hosting. Tools that use Pyodide as a foundation are interesting to explore because they give data scientists an opportunity to write Python code which runs directly on users’ computers without their having to install or run anything outside of the web browser.
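Because the same code may run on a laptop or inside the browser, it can help to check at runtime which environment is active. The following is a minimal sketch, relying on the fact that Pyodide is built with Emscripten and reports sys.platform as "emscripten". The function names and the row limits are hypothetical choices for illustration and are not part of Pyodide or marimo.

```python
import sys


def running_in_browser() -> bool:
    """Return True when the interpreter is a WASM (Emscripten) build such as Pyodide."""
    # Pyodide is compiled with Emscripten, which is reflected in sys.platform.
    return sys.platform == "emscripten"


def max_rows_to_load(browser_limit: int = 100_000, local_limit: int = 10_000_000) -> int:
    """Pick a row budget for loading data (both defaults are illustrative assumptions)."""
    return browser_limit if running_in_browser() else local_limit


if __name__ == "__main__":
    print("In browser:", running_in_browser())
    print("Row budget:", max_rows_to_load())
```

A check like this lets one codebase load a smaller or pre-aggregated dataset in the browser while keeping the full data available when run locally.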

As an aside, I have previously explored one project that takes this approach: stlite, an in-browser implementation of Streamlit that lets you deploy these flexible and powerful apps to a broad range of users. However, a core limitation is that Streamlit itself is distinct from stlite (the port of Streamlit to WASM), which means that not all features are supported and that advancement of the project depends on two separate groups working along compatible lines.

Introducing: Marimo

This brings us to Marimo.

The first public announcements of marimo were in January 2024, so the project is very new, and it has a unique combination of features:

The interface resembles a Jupyter notebook, which will be familiar to users.
Execution of cells is reactive, so that updating one cell will rerun all cells which depend on its output.
User input can be captured with a flexible set of UI components.
Notebooks can be quickly converted into apps, hiding the code and showing only the input/output elements.
Apps can be run locally or converted into static webpages using WASM/Pyodide.
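The reactive execution described in the list above can be pictured as a dependency graph between cells. The sketch below is a toy model written with the standard library only; it is not marimo's implementation or API, and the cell names and variables are hypothetical. Each cell declares which variables it defines and which it reads, and a change to one cell marks every cell downstream of it as needing a rerun.

```python
from collections import deque

# Each hypothetical "cell" declares the variables it defines and the ones it reads.
CELLS = {
    "load":   {"defines": {"df"},        "reads": set()},
    "slider": {"defines": {"threshold"}, "reads": set()},
    "filter": {"defines": {"subset"},    "reads": {"df", "threshold"}},
    "plot":   {"defines": {"chart"},     "reads": {"subset"}},
    "notes":  {"defines": {"text"},      "reads": set()},
}


def cells_to_rerun(changed: str, cells: dict) -> set:
    """Return every cell that depends, directly or indirectly, on `changed`."""
    stale = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for name, cell in cells.items():
            # A cell is stale if it reads anything the current cell defines.
            if name != changed and name not in stale and cell["reads"] & cells[current]["defines"]:
                stale.add(name)
                queue.append(name)
    return stale


if __name__ == "__main__":
    # Moving the slider invalidates the filter and, through it, the plot.
    print(sorted(cells_to_rerun("slider", CELLS)))
```

In this model, moving the slider reruns the filter and plot cells but leaves the data-loading and notes cells alone, which is what keeps an interactive display responsive.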

marimo balances the tradeoffs of technology in a way that is well suited to the skill set of the typical data scientist:

Capabilities: user input and visual display features are rather extensive, supporting user input via Altair and Plotly plots.
Publication Cost: deploying as static webpages is basically free, with no servers required.
Ease of Use: for users familiar with Python notebooks, marimo will feel very familiar and be easy to pick up.

Publishing Marimo Apps on the Web

The best place to start with marimo is by reading their extensive documentation.

As a simple example of the type of display that can be useful in data science, consisting of explanatory text interspersed with interactive displays, I have created a barebones GitHub repository. Try it out yourself here.

Example publication created with marimo (image created by author)

Using just a little bit of code, users can:

Attach source datasets
Generate visualizations with flexible interactivity
Write narrative text describing their findings
Publish to the web for free (e.g. using GitHub Pages)

For more details, read their documentation on web publishing and the template repository for deploying to GitHub Pages.

Public App / Private Data

This new technology offers an exciting new opportunity for collaboration: publish the app publicly to the world, while users can only see the specific datasets that they have permission to access.

Rather than building a dedicated data backend for every app, user data can be stored in a generic backend which can be securely authenticated and accessed using a Python client library, all contained within the user’s web browser. For example, the user is given an OAuth login link that will authenticate them with the backend and allow the app to temporarily access input data.
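To make the idea of an OAuth login link concrete, here is a minimal sketch of how such a link could be built for a browser-only app. It assumes the authorization code flow with PKCE (RFC 7636), a common choice for clients that cannot keep a secret; the text above does not say which flow any particular backend uses. The endpoint, client ID, redirect URI, and scope are all hypothetical placeholders.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Hypothetical values: a real backend publishes its own endpoint and client ID.
AUTHORIZE_URL = "https://auth.example.org/oauth2/authorize"
CLIENT_ID = "my-public-notebook-app"
REDIRECT_URI = "https://example.github.io/my-app/"


def make_pkce_pair() -> tuple:
    """Create a PKCE code verifier and its S256 code challenge."""
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the unpadded URL-safe base64 encoding of the SHA-256 digest.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_login_link(challenge: str, state: str, scope: str = "read:datasets") -> str:
    """Assemble the URL the user opens to log in (the scope name is an assumption)."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    print(build_login_link(challenge, state=secrets.token_urlsafe(16)))
```

The verifier stays in the browser session and is sent only when the returned code is exchanged for a short-lived token, so no long-lived credential needs to be embedded in the public app.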

As a proof of concept, I built a simple visualization app which connects to the Cirro data platform, which is used at my institution to manage scientific data. Full disclosure: I was part of the team that built this platform before it spun out as an independent company. In this way users can:

Load the public visualization app, hosted on GitHub Pages
Connect securely to their private data store
Load the appropriate dataset for display
Share a link which will direct authorized collaborators to the same data
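The last step in the list above, sharing a link to the same data, can be sketched by placing dataset identifiers in the URL query string. This is an illustrative sketch using only the standard library; the app address and the parameter names are hypothetical and do not describe how the proof-of-concept app or the Cirro platform encodes its links.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical address of the published app.
APP_URL = "https://example.github.io/my-app/"


def make_share_link(project: str, dataset: str) -> str:
    """Encode identifiers (never credentials or the data itself) in the query string."""
    return f"{APP_URL}?{urlencode({'project': project, 'dataset': dataset})}"


def read_share_link(url: str) -> dict:
    """Recover the identifiers; each collaborator must still log in themselves."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items()}


if __name__ == "__main__":
    link = make_share_link("lab A", "run/42")
    print(link)
    print(read_share_link(link))
```

Because the link carries only identifiers, anyone who opens it without permission on the backend sees the public app but none of the private data.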

Try it out yourself here.

Example visualization app sourcing user-controlled data (image created by author)

As a data scientist, this approach of publishing free and open-source visualization apps which can be used to interact with private datasets is extremely exciting. Building and publishing a new app can take hours and days instead of weeks and years, letting researchers quickly share their insights with collaborators and then publish them to the wider world.