themusiclab.org

A widely popular citizen science website where we ran more than 2 million experiments in the first 2 years

A screenshot of themusiclab.org

themusiclab.org is a website where participants can take part in online experiments. It hosts an ever-growing number of “games” that participants can play, through which we collect data on how the mind perceives music.

As of today, not even 2 years after we launched the site, we have run more than 2 million experiments, with thousands more coming in each day.

The first paper to use data from the website has been published in the journal Science, with new articles and experiments already in the works.

my involvement

I got involved with this project very early on and helped to get the first version running even before I interned at the lab. Throughout my time as an intern and later as a full-time employee, I added multiple new features and games to the site, reworked large portions of the code to reduce complexity, and adjusted the front- and back-end to support autoscaling so the site can handle huge increases in traffic. I also created the data extraction and preprocessing pipeline used to handle the more than 100 GB of complex, often nested and irregular JSON data generated by jsPsych on the site.
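To give a flavour of what "nested and irregular" means in practice, here is a minimal sketch of one common preprocessing step: flattening each trial into a single-level row with dotted keys, so that trials with different shapes can end up in one table. This is not the actual pipeline code. The function names (`flatten`, `trials_to_rows`) and the fields in the sample session are hypothetical and chosen for illustration only.

```python
import json
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict with dotted keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalars (str, int, float, bool, None) are stored under the current key.
        return {prefix: value}
    flat: dict[str, Any] = {}
    # Note: empty dicts and lists contribute no keys in this sketch.
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


def trials_to_rows(raw: str) -> list[dict[str, Any]]:
    """Parse one session's JSON array of trials into a list of flat rows."""
    return [flatten(trial) for trial in json.loads(raw)]


# Hypothetical session data; the field names are assumptions, not the site's schema.
raw_session = json.dumps([
    {"trial_type": "audio-button-response", "rt": 1532, "response": {"rating": 3}},
    {"trial_type": "survey", "responses": [{"q": "age", "a": 27}]},
])
rows = trials_to_rows(raw_session)
```

Rows produced this way have different key sets from trial to trial, so a real pipeline would still need to align columns and handle missing values before analysis.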

technology used

themusiclab.org is powered by an early version of pushkin, a work-in-progress framework for large-scale online research. The frontend is built using react, and individual experiments / games are created using jsPsych. One special experiment is powered by psychTestR and shiny. The backend is split into multiple Docker 🐳 containers, the majority of them running node.js. Containers communicate via rabbitmq 🐇 and data is mainly stored in a postgresql 🐘 database. The whole site runs on AWS, using ECS to run the actual containers, RDS to store data, Route 53 and EC2 Elastic Load Balancers for DNS and routing, S3 to store static content and CloudFront to serve it. The site automatically scales up and down to accommodate enormous variations in traffic.
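The reason for putting a message queue between containers is that the part receiving web requests never has to wait for the database: it publishes a message and returns, while a separate worker consumes messages and writes them to storage. The sketch below illustrates that pattern only. It is not the site's code (the real services run node.js), and it uses Python standard-library stand-ins: `queue.Queue` in place of rabbitmq and an in-memory SQLite database in place of postgresql. All function, table and field names are hypothetical.

```python
import json
import queue
import sqlite3

# Stand-ins (assumptions): queue.Queue for the message broker,
# in-memory SQLite for the relational database.
broker: queue.Queue = queue.Queue()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE responses (user_id TEXT, experiment TEXT, payload TEXT)")


def api_handle_submission(user_id: str, experiment: str, data: dict) -> None:
    """API side: publish the submission and return without touching the database."""
    message = {"user_id": user_id, "experiment": experiment, "data": data}
    broker.put(json.dumps(message))


def worker_drain() -> int:
    """Worker side: consume all pending messages and persist each one."""
    stored = 0
    while True:
        try:
            message = json.loads(broker.get_nowait())
        except queue.Empty:
            break  # Nothing left to process.
        db.execute(
            "INSERT INTO responses VALUES (?, ?, ?)",
            (message["user_id"], message["experiment"], json.dumps(message["data"])),
        )
        stored += 1
    db.commit()
    return stored


# Two hypothetical submissions arrive, then a worker processes the backlog.
api_handle_submission("u1", "hypothetical-game", {"rt": 950})
api_handle_submission("u2", "hypothetical-game", {"rt": 1210})
stored_count = worker_drain()
```

Because the two sides only share the queue, each can be scaled independently: more API containers during a traffic spike, more workers when the backlog grows.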

A diagram showing different parts of themusiclab.org's AWS infrastructure

The above diagram illustrates the backend architecture of themusiclab.org, mostly hosted on AWS, with some on-premises data storage and processing.