Image of runners being monitored by a data application
Generated with AI by me using Microsoft Designer

How to Create a Data Web App for Free with Evidence

Fran Lozano
5 min read · Jun 8, 2024

Recently, I had the privilege of meeting David Gasquez and learning about his project, Datadex. This initiative collects and shares a catalog of open datasets, making them accessible for anyone to analyze. Inspired to contribute, I had several discussions with David about data and how to easily share discoveries and analyses from public datasets. He introduced me to several tools, including Evidence, which I realized could be a fantastic way to go beyond simply creating catalogs of open data and actually share stories and insights derived from the data.

One of the ideas I loved most about Datadex was how David managed to automate the ingestion, transformation, and publication of open data without spending a single euro. He used GitHub Actions to automate workflows and Hugging Face to expose the datasets. This concept of an end-to-end flow, from ingestion to visualization, at zero cost and easy to share, inspired me to start a new proof of concept using Evidence. Before we dive into the details, let me provide a bit of context.

What is Evidence?

Evidence is a JavaScript framework for creating data apps with Markdown and SQL. You can build a web application with pre-existing components like bar charts and dropdowns, and integrate them using SQL within your markdown. The final output is a static website that can be hosted on platforms like Netlify or Vercel.

The three things I liked most about Evidence are:

  1. Universal SQL: A query engine powered by DuckDB’s WebAssembly that allows you to query different data sources quickly and directly from your browser.
  2. Static site: Evidence generates a static site by running all the queries at build time, and it caches the data in Parquet files that are queried from the browser, which is extremely fast.
  3. Templated pages: You can generate pages from data, so one single markdown file could generate a page for every data category (e.g. customer, city, etc.).

How page templates work in the web application
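
To make the idea of build-time queries and templated pages more concrete, here is a minimal Python sketch. It is not how Evidence is implemented: it uses the standard library's sqlite3 as a stand-in for DuckDB, and the table, sample rows, template text, and the build_pages function are all made up for illustration. It only shows the principle that one template plus one query, run once at build time, can produce a page per race.

```python
import sqlite3

# Hypothetical sample data; the real project reads a DuckDB database.
ROWS = [
    ("Race A", "Senior", 120),
    ("Race A", "Veteran", 45),
    ("Race B", "Senior", 98),
]

# One template, reused for every race (placeholder names are made up).
TEMPLATE = "# {race}\n\nTotal finishers: {total}\n"


def build_pages(rows):
    """Run the query once, at build time, and return {filename: page text}."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE results (race TEXT, category TEXT, finishers INTEGER)")
    con.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    pages = {}
    query = "SELECT race, SUM(finishers) FROM results GROUP BY race ORDER BY race"
    for race, total in con.execute(query):
        # One output file per distinct value in the data.
        filename = race.lower().replace(" ", "-") + ".md"
        pages[filename] = TEMPLATE.format(race=race, total=total)
    con.close()
    return pages


if __name__ == "__main__":
    for name, text in build_pages(ROWS).items():
        print(name)
        print(text)
```

Because the query runs only while building, the resulting pages can be served as plain static files with no database server behind them.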

For this project, I opted for Evidence over other options like Streamlit or Rill because I wanted to deploy something without having to set up a virtual machine on a cloud provider or use the cloud services of those libraries. Evidence was the only one that offered me that possibility by deploying as a static website. I also liked the fact that I could develop the application in Markdown and use SQL for data preparation.

I believe Evidence is an excellent option for analysts who lack Python skills or for people like me who prefer not to delve into frontend development. For more complex applications, developers can still extend it with custom Svelte components.

Use case

One of my favorite hobbies is running. Apart from training, I love participating in races to test my current fitness level and get that adrenaline rush from running alongside others who share the same hobby. I live in a region of Spain called Cuenca, which hosts a series of 26 races in different towns throughout the region. The athletics club in my town organizes one of these 26 races, and they’re always keen to know the metrics and statistics of participants in each race to see if they’ve improved compared to the previous year. The company that provides the timing chips publishes the final times, but participation statistics aren’t public. So, I decided to develop an application that gives every town access to the statistics of the race it organizes.

Advertising poster of the competition, organized by the regional government.

With a Python script, I extracted the raw data from the website that publishes the results, transformed it (using dbt to process a single model isn’t really necessary, right?), and created a DuckDB database with the cleaned data. You can query the data using SQL here.
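
The extract, transform, and load steps can be sketched roughly as follows. This is not the actual script from the project: the raw rows, column names, and helper functions are hypothetical, and sqlite3 from the standard library stands in for DuckDB so the example runs without extra dependencies. The scraping step is replaced by a hard-coded list.

```python
import sqlite3

# Hypothetical raw rows as a scraper might return them: (name, category, finish time).
RAW = [
    (" Ana ", "senior", "00:42:10"),
    ("Luis", "VETERAN", "01:03:05"),
    ("Marta", "Senior", ""),  # no time recorded
]


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds; return None for an empty value."""
    if not hms:
        return None
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def clean(raw):
    """Trim names, normalise category casing, and convert times to seconds."""
    return [(name.strip(), cat.strip().title(), to_seconds(t)) for name, cat, t in raw]


def load(rows, path=":memory:"):
    """Write the cleaned rows to a database and return the open connection."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE results (name TEXT, category TEXT, seconds INTEGER)")
    con.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    con.commit()
    return con


if __name__ == "__main__":
    con = load(clean(RAW))
    query = (
        "SELECT category, COUNT(*), COUNT(seconds) "
        "FROM results GROUP BY category ORDER BY category"
    )
    # Prints runners and recorded times per category.
    for row in con.execute(query):
        print(row)
```

Storing times as seconds rather than text keeps later SQL aggregations, such as averages per category, straightforward.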

Thanks to DuckDB Wasm and DuckDB Web Shell, it is possible to query databases stored anywhere on the internet. Credit to David for showing me this tool.

The database is quite small, occupying only 3.26 MB, so I decided to publish it directly on GitHub because:

  1. It wouldn’t be updated very frequently (at most once a week).
  2. It would only be read at build time to generate the Evidence cache layer.
  3. It allowed deploying the application without the need to prepare the data or run any scripts.

There are free alternatives for storing these databases, such as Hugging Face (with 50 GB of ephemeral space) or MotherDuck (with 10 GB of storage), which would allow storing larger datasets with a higher update frequency. With Netlify’s starter plan, any website developed with Evidence can be published for free and quite easily (be aware of the 100 GB bandwidth limit!). If data updates are needed, these can be automated with GitHub Actions, which includes 2,000 free minutes per month.
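
To get a feel for how far these free tiers stretch, here is a back-of-the-envelope estimate in Python. The limits and the database size are the figures quoted in this article; everything else is an assumption made up for the calculation: that each visit downloads about as much data as the whole database (the real cached files may be smaller or larger), that an update workflow takes 5 minutes, and that both limits apply per month.

```python
# Figures quoted in the article.
BANDWIDTH_LIMIT_GB = 100          # Netlify starter plan
ACTIONS_MINUTES_PER_MONTH = 2000  # GitHub Actions free minutes
DATABASE_MB = 3.26                # size of the DuckDB file

# Assumptions made up for this estimate.
MB_PER_VISIT = DATABASE_MB  # pessimistic: each visit downloads the whole dataset
MINUTES_PER_BUILD = 5       # hypothetical duration of one update workflow
BUILDS_PER_MONTH = 5        # weekly updates, rounded up


def visits_within_bandwidth(limit_gb, mb_per_visit):
    """Number of whole visits that fit in the limit (1 GB taken as 1000 MB)."""
    return int(limit_gb * 1000 // mb_per_visit)


def actions_minutes_used(builds, minutes_per_build):
    """Total workflow minutes consumed by the given number of builds."""
    return builds * minutes_per_build


if __name__ == "__main__":
    print(visits_within_bandwidth(BANDWIDTH_LIMIT_GB, MB_PER_VISIT))
    print(actions_minutes_used(BUILDS_PER_MONTH, MINUTES_PER_BUILD))
```

Under these assumptions, the bandwidth allows roughly 30,000 visits and the weekly builds use 25 of the 2,000 minutes, so a small project like this one stays comfortably inside both limits.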

Everything needed to generate the website is stored in GitHub: queries, markdown files, and data (as a DuckDB database). Thanks to templated pages, I was able to generate one page for every race with just one single markdown file. Dropdowns work really well to filter the data, and I used them to filter by age category.

Dropdowns are very useful to filter the data dynamically.

You can find the web application here.

Conclusion

In my previous project, which visualized data from coches.net, what was missing was a way to share the results in a web application. Sure, you could download the data locally, set up a Dash server on your machine, and then visualize the data. But let’s be realistic: if you truly want the data to be used and explored, you need to provide a web application for it (whether through a BI tool or an ad-hoc application).

With Evidence, it’s very easy to create data products using the versatility of DuckDB, the simplicity of Markdown, and the power of SQL. And all without breaking the bank!

You can find the full code of the application on GitHub. Do you think Evidence was a good choice for this use case? I would love to read your opinions.

Stay curious!
