Essentia.js: a JavaScript library for music and audio analysis on the web

Authors: Albin Correya, Dmitry Bogdanov, Luis Joglar-Ongay, Xavier Serra


ESSENTIA.JS: A JAVASCRIPT LIBRARY FOR MUSIC AND AUDIO
ANALYSIS ON THE WEB

Albin Correya¹, Dmitry Bogdanov¹, Luis Joglar-Ongay¹,², Xavier Serra¹
¹ Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
² SonoSuite, Barcelona, Spain
albin.correya@upf.edu, dmitry.bogdanov@upf.edu

ABSTRACT

Open-source software libraries for audio/music analysis and feature extraction have a significant impact on the development of Audio Signal Processing and Music Information Retrieval (MIR) systems. Despite the abundance of such tools on the native computing platforms, there is a lack of an extensive and easy-to-use reference library for audio feature extraction on the Web. In this paper, we present Essentia.js, an open-source JavaScript (JS) library for audio and music analysis on both web clients and JS-based servers. Along with the Web Audio API, it can be used for efficient and robust real-time audio feature extraction in web browsers. Essentia.js is modular, lightweight, and easy to use, deploy, maintain and integrate into the existing plethora of JS libraries and Web technologies. It is powered by a WebAssembly back-end of the Essentia C++ library, which facilitates a JS interface to a wide range of low-level and high-level audio features. It also provides a higher-level JS API and add-on MIR utility modules along with extensive documentation, usage examples, and tutorials. We benchmark the proposed library on two popular web browsers, the Node.js engine, and Android devices, comparing it to the native performance of Essentia and the Meyda JS library.

© Albin Correya, Dmitry Bogdanov, Luis Joglar-Ongay, Xavier Serra. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Albin Correya, Dmitry Bogdanov, Luis Joglar-Ongay, Xavier Serra. “Essentia.js: A JavaScript Library for Music and Audio Analysis on the Web”, 21st International Society for Music Information Retrieval Conference, Montréal, Canada, 2020.

1. INTRODUCTION

The Web has become one of the most used computing platforms with billions of web pages served daily, and JS is an essential part of building modern web applications. Using HTML, CSS, and JS, developers can make dynamic, interactive, and responsive web pages by implementing custom client-side scripts. At the same time, developers can also use cross-platform run-time engines like Node.js¹ to write server-side code in JS. The flexibility of using JS on both server and client ends of web applications arguably makes it one of the most used computer programming languages in recent years [2]. JS is also widely used as an entry-level programming language by thinkers from design, art, computer graphics, architecture, and spaces in between where audio processing and analysis can be relevant.

With the adoption of both HTML5 and the W3C Web Audio API specifications [14], modern web browsers are capable of audio processing, synthesis, and analysis without any third-party dependencies on proprietary software. This allows developers to move most of the audio processing code from the server to the client and can provide better scalability and deployment, provided that the web client has the computational resources for the required processing. The Web Audio API provides a JS interface to various predefined nodes for audio processing, synthesis, and analysis. Even though the provided capabilities are limited, the API includes the ScriptProcessorNode for developers to add custom JS code for audio processing.² The design of this node has been criticized by the audio developer community since it runs the JS audio processing code on the main UI thread, which can result in unreliable performance and audio glitching [15]. As an alternative, AudioWorklet [10] has been introduced to mitigate this design issue to a great extent by running custom audio processing code on a dedicated audio thread. It also allows asynchronous bi-directional communication between the audio thread and the control thread of the Web Audio API using message ports.

Despite all of these recent developments in Web Audio technologies, the Audio Signal Processing and MIR communities still lack reliable and modular software tools and libraries that could be easily used for building audio and music analysis applications for web browsers and JS run-time engines. To the best of our knowledge, Meyda [11] and JS-Xtract [18] are the few available JS libraries that are ready to use and have implementations of a limited set of MIR audio features.³ The lack of more extensive alternatives is not surprising, given that writing a new JS audio analysis library from scratch would require a lot of effort. Also, JS code for audio processing is prone to performance issues due to just-in-time (JIT) compilation and garbage collection overhead, which can be critical for real-time audio and music analysis tasks. For these reasons, researchers and developers often rely on server-side audio processing solutions using the existing native MIR tools for writing the required web applications.

¹ https://nodejs.org
² https://www.w3.org/TR/webaudio/#scriptprocessornode
³ As of May 2020, Meyda only has 20 MIR algorithms.
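As a rough illustration of the AudioWorklet design mentioned earlier in this section, the sketch below runs a per-block analysis on the audio thread and reports the result to the control thread through the processor's message port. It is a minimal sketch rather than Essentia.js code: the processor name 'rms-processor', the RmsProcessor class, and the choice of RMS as the feature are assumptions made for the example, and the fallback base class exists only so that the file can also be loaded outside an AudioWorkletGlobalScope.

```javascript
// Use the real base class inside an AudioWorkletGlobalScope; otherwise fall back
// to a stub with a no-op message port (assumption made for this example only).
const BaseProcessor = typeof AudioWorkletProcessor !== 'undefined'
  ? AudioWorkletProcessor
  : class { constructor() { this.port = { postMessage() {} }; } };

// Hypothetical processor: computes the RMS of each incoming audio block.
class RmsProcessor extends BaseProcessor {
  process(inputs) {
    // inputs[0] is the first input, inputs[0][0] its first channel (a Float32Array).
    const channel = inputs[0] && inputs[0][0];
    if (channel && channel.length > 0) {
      let sumOfSquares = 0;
      for (let i = 0; i < channel.length; i++) {
        sumOfSquares += channel[i] * channel[i];
      }
      // Send the feature value to the control thread via the message port.
      this.port.postMessage({ rms: Math.sqrt(sumOfSquares / channel.length) });
    }
    // Returning true keeps the processor alive for the next block.
    return true;
  }
}

// registerProcessor only exists inside an AudioWorkletGlobalScope.
if (typeof registerProcessor === 'function') {
  registerProcessor('rms-processor', RmsProcessor);
}
```

On the control thread, such a module would typically be loaded with audioContext.audioWorklet.addModule(), wrapped in an AudioWorkletNode created with the same processor name, and read through node.port.onmessage.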
Over the last two decades, the existing software tools for audio analysis have been mostly written in C/C++, Java and Python and delivered as standalone applications, host application plug-ins, or as software library packages. Software libraries with a Python API, such as Essentia [7], Librosa [23], Madmom [6], Yaafe [22] and Aubio [8], have been especially popular within the MIR community due to rapid prototyping and a rich environment for scientific computations. Meanwhile, the libraries with a native C/C++ back-end offered faster analysis [24] and were often required for writing industrial audio applications. Various web applications for audio processing and analysis have been developed using these tools. Spotify API⁴ (formerly Echonest), Freesound API [13] and AcousticBrainz [25] are examples of web services providing precomputed audio features to the end users via a REST API. Besides, numerous web applications were built for aiding tasks such as crowdsourcing audio annotations [9, 12], audio listening tests [19, 26] and music education platforms [1, 21], to mention a few. All these services manage their audio analysis on the server, which may require significant effort and resources to scale to many users.

With the recent arrival of WebAssembly (WASM) support on most modern web browsers [16], one can safely port the existing C/C++ audio processing and analysis code into the Web Audio ecosystem using compiler toolchains such as Emscripten.⁵ WASM is a low-level assembly-like language with a compact binary format that runs with near-native performance on modern web browsers or any WebAssembly-based stacks without compromising security, portability and load time. WASM code was found to be comparatively faster than JS code [17], generates no garbage, and can run within an AudioWorkletProcessor.⁶ This makes it an ideal solution to the problems that were previously hindering us from building efficient and reliable JS MIR libraries for the web platform [20]. However, taking full advantage of all these features can be challenging because it requires understanding concurrent programming wrapped with several JS APIs and dealing with cross-compilation and linking of the native code. Even for experienced developers, compiling native code to WASM targets, generating JS bindings, and integrating them in a regular JS processing code pipeline can be cumbersome. Hence, it is essential that an ideal JS MIR software library should encapsulate and abstract all these steps and be packaged as a compact build which is easy to use and extendable using a high-level JS API. Besides the JS API, the potential users might also need utility tools that are often required while building MIR-based projects, such as plotting audio features on an HTML page, ready-to-use feature extractors, and possible integration with web-based machine learning frameworks.

In [24], the authors evaluated a wide range of MIR software libraries in terms of coverage, effort, presentation, and time-lag, and found Essentia⁷ [7] to be the overall best performer with respect to these criteria. Essentia is an open-source library for audio and music analysis released under the AGPLv3 license providing a wide range of optimized algorithms (over 250 algorithms) that are successfully used in various academic and industrial large-scale applications. Essentia includes both low-level and high-level audio features, along with some ready-to-use feature extractors. It also provides an object-oriented interface to fine-tune each algorithm in detail. Given all these advantages and that the code repository is consistently maintained by its developers, it is a good potential choice for porting to a WASM target for the web platform.

In this paper, we present Essentia.js, an open-source JS library for audio and music analysis on the web, released under the AGPLv3 license. It allows building audio analysis and MIR applications for web browsers and JS engines such as Node.js. It provides straightforward integration with the latest W3C Web Audio API specification, allowing efficient real-time audio feature extraction in web browsers. Web applications written using the proposed library can also be cross-compiled to native targets such as PCs, smartphones, and IoT devices using JS frameworks like Electron⁸ and React Native.⁹

The rest of the paper is organized as follows. Section 2 outlines the design choices, software architecture and various components of Essentia.js. An overview of potential use-cases and usage examples is given in Section 3. A detailed benchmarking of Essentia.js across and against various platforms and alternative JS libraries can be found in Section 4. Finally, we conclude in Section 5.

Figure 1: Overview of the Essentia.js library in terms of its abstraction levels.

⁴ https://developer.spotify.com/documentation
⁵ https://emscripten.org
⁶ https://www.w3.org/TR/webaudio/#audioworkletprocessor
⁷ https://essentia.upf.edu
⁸ https://www.electronjs.org
⁹ https://reactnative.dev

2. ESSENTIA.JS

Essentia.js is much more than just JS bindings to the Essentia C++ library. It was developed with coherent design and functional objectives that are necessary for building an efficient and easy-to-use MIR library for the Web. In this
section, we discuss our design choices, the architecture, and various components of Essentia.js. Figure 1 shows an overview of these components.

2.1 Design and Functionality

We chose the following goals and design decisions for developing the library:

• User-friendly API and utility tools. The Web is a ubiquitous computing platform, and an ideal JS MIR library should provide a simple, user-friendly API while being highly configurable for advanced use cases. Essentia.js ships with a simple JS API and add-on utility modules with a fast learning curve for new users. The main JS API is composed of a singleton class with methods implementing most of the functionality (each method is an algorithm in Essentia). These methods are automatically generated from the upstream C++ code and documentation using code templates and scripting as described in Sections 2.2 and 2.3. We also provide add-on modules with helper classes for feature extraction and visualisation that can be used for rapid prototyping of web applications. A quick preview of the JS API can be seen in Listing 1.

• Modularity and extensibility. Just like Essentia itself, the Essentia.js codebase is modular by design. It contains a large number of configurable algorithms that can be arranged into the desired audio processing chains. The add-on utility modules shipped with the library leverage this functionality to build custom feature extractors.

• Web standards compliance. Web browsers provide a standard set of tools for loading and processing audio files using the HTML5 Audio element¹⁰ and the Web Audio API. Essentia.js relies on these standard features for loading audio files or for streaming real-time audio from various device sources. It also provides seamless integration with the latest tools in the Web Audio ecosystem such as AudioWorklets, Web Workers,¹¹ WASM and SharedArrayBuffer. In addition, JS conforms to the ECMAScript specification¹² and it is evolving fast with new features added to the language every year. For backward and forward compatibility of our JS code, we wrote our JS API using Typescript (Section 2.3).

• Lightweight and few dependencies. It is important for a JS library to be lightweight since the load times of JS code can directly impact the UI/UX and performance of web applications. This includes having fewer dependencies, which also makes the library much more maintainable. Taking this into account, the Essentia WASM backend is built without any third-party software dependencies of the Essentia library except for Kiss FFT.¹³ It includes the majority of the algorithms in Essentia,¹⁴ while the few excluded algorithms can still be integrated into the WASM backend by compiling and linking with the required third-party dependencies using our build tools (Section 2.5). Besides, all the JS code in the library is passed through a code compression process to achieve lightweight builds for web browsers. With all these efforts we were able to achieve builds of Essentia.js, including the WASM backend and the JS API, as small as approximately 2.5MB. We also provide tools for custom lightweight builds of the library that only include a subset of the selected algorithms to further reduce the build size (Section 2.5).

• Reproducibility. Using the WASM backend of Essentia ensures identical analysis results across various devices and native platforms on which Essentia has been previously extensively used and tested. Remarkably, Essentia.js allows reproducing a large amount of existing code and research based on Essentia as well as, to a certain extent, other libraries. In particular, it is possible to use Essentia.js to reproduce common input audio representations for the existing machine learning models, enabling their usage in web applications.

• Easy installation. Essentia.js is easy to install and integrate with new or existing web projects. It is available both as a package on NPM¹⁵ and as static builds on our public GitHub repository. In addition, we also provide content delivery network (CDN) links through open-source web services.

• Extensive documentation. We provide a complete API reference together with the instructions to get started, tutorials, and interactive web application examples.¹⁶ The documentation is built automatically using JSDoc¹⁷ and the algorithm reference is generated from the upstream Essentia C++ documentation using Python scripts.

2.2 Essentia WASM backend

As already mentioned, the core of the library is powered by the Essentia WASM backend. It contains a lightweight WASM build of the Essentia C++ library along with custom JS bindings for using it in JS. This backend is generated in multiple steps.

Firstly, the Essentia C++ library is compiled to LLVM assembly¹⁸ using Emscripten. Emscripten [28] is an LLVM-to-JS compiler which provides a wide range of tools for compiling the C/C++ code-base or the intermediate LLVM builds into various targets such as asm.js¹⁹ and WASM. Secondly, we need a custom C++ interface in order to generate the corresponding JS bindings which would allow us to access the algorithms in Essentia on the JS side. We used Embind [4] for generating this C++ interface that binds Essentia native code to JS.

¹⁰ https://www.w3.org/html/wiki/Elements/audio
¹¹ https://w3c.github.io/workers/
¹² http://ecma-international.org/ecma-262
¹³ https://github.com/mborgerding/kissfft
¹⁴ As of May 2020, over 200 algorithms are supported.
¹⁵ https://www.npmjs.com
¹⁶ https://mtg.github.io/essentia.js
¹⁷ https://jsdoc.app
¹⁸ https://llvm.org
¹⁹ http://asmjs.org
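To show what a WASM target looks like from the JS side at the lowest level, the following sketch instantiates a tiny hand-assembled WebAssembly module with the standard WebAssembly JS API and calls its exported function. This is only an illustration under simplifying assumptions: the 41-byte module (exporting a hypothetical add function) is written by hand, whereas the Essentia WASM backend and its JS glue code are generated by Emscripten and Embind.

```javascript
// Hand-assembled WebAssembly binary that exports add(a, b) on 32-bit integers.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                               // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00,                               // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate synchronously (acceptable for a module this small).
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

// Exported WASM functions are callable like ordinary JS functions.
const add = wasmInstance.exports.add;
console.log(add(2, 3)); // 5
```

Generated bindings hide this level of detail: the glue code loads the binary, manages memory, and exposes C++ classes and functions as JS objects.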
Writing custom JS bindings for all Essentia algorithms can be tedious considering their large number. Hence, we created Python scripts to automate the generation of the required C++ code for the C++ wrapper from the upstream library Python bindings. Using these scripts, it is possible to configure which algorithms to include in the bindings by their name and category. Therefore, it is possible to create extremely lightweight builds of the library with only a few specific algorithms required for a particular application. The Essentia WASM backend is built by compiling the generated wrapper C++ code and linking with the pre-compiled Essentia LLVM using Emscripten.
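The idea of selecting algorithms by name and category for a custom build can be sketched as follows. The catalogue entries, the selectAlgorithms function, and its options are hypothetical and written in JS purely for illustration; the actual Essentia.js build tooling relies on Python scripts and may organise this step differently.

```javascript
// Illustrative catalogue; a real build would read names and categories
// from the upstream Essentia bindings instead of a hard-coded list.
const catalogue = [
  { name: 'RMS', category: 'Statistics' },
  { name: 'Windowing', category: 'Standard' },
  { name: 'Spectrum', category: 'Standard' },
  { name: 'PitchYinProbabilistic', category: 'Pitch' },
  { name: 'SuperFluxExtractor', category: 'Rhythm' },
];

// Hypothetical helper: keep an algorithm if it is requested by name
// or if it belongs to one of the requested categories.
function selectAlgorithms(entries, { names = [], categories = [] } = {}) {
  const wantedNames = new Set(names);
  const wantedCategories = new Set(categories);
  return entries
    .filter((entry) => wantedNames.has(entry.name) || wantedCategories.has(entry.category))
    .map((entry) => entry.name)
    .sort();
}

// Example: a small build with one algorithm picked by name and one category.
const selected = selectAlgorithms(catalogue, { names: ['RMS'], categories: ['Pitch'] });
console.log(selected);
```

Only the selected names would then be passed on to the wrapper generation step, which is what keeps a custom build small.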
The Essentia WASM backend provides compact WASM binary files along with the JS bindings to over 200 Essentia algorithms. We provide these binaries and JS glue code for both asynchronous and synchronous import of the Essentia WASM backend, covering a wide range of use cases. The build for asynchronous import can be instantly loaded into an HTML page. The synchronous-import build supports the new ES6 style class imports characteristic of modern JS libraries. This build can also be seamlessly integrated with AudioWorklet and Web Workers for performance-demanding web applications.

2.3 High-level JS API

Even though it is possible to use the Essentia WASM backend with its bindings directly, they have limitations due to the specifics of using Embind with Essentia: a user needs to manually specify all parameter values for the algorithms because default values are not supported. This is cumbersome, and to solve this issue we developed a high-level JS API written using Typescript [5]. Typescript is a typed superset of JS that can be compiled to various ECMA targets of JS. In addition, this gives us the benefit of having a typed JS API which can provide better exception handling. Again, we used custom Python scripts and code templates to automatically generate the Typescript wrapper in a similar way to the C++ wrapper for the WASM backend. The high-level JS API of Essentia.js provides a singleton class Essentia with all the algorithms and helper functions encapsulated as its methods. All the algorithm methods are configurable similarly to Essentia's C++/Python API itself. Listing 1 shows an example of using the high-level API of Essentia.js.

2.4 Add-on utility modules

Essentia.js is shipped with a few add-on modules to facilitate common MIR tasks. These add-on modules are written entirely in Typescript using the Essentia.js high-level JS API. Currently, we provide two add-on modules:

• essentia.js-extractor contains predefined feature extractors for common MIR tasks, computing BPM, beat positions, pitch, predominant melody, key, chords, chroma features, MFCC, etc. Each extractor implements the entire processing chain starting from audio input and outputs the resulting track-level or frame-level features. These extractors are configurable as well.

• essentia.js-plot provides helper functions for visualization of MIR features, allowing both real-time and offline plotting in a given HTML element. It uses the Plotly.js data visualization library, which has a design layout and functionalities much like those of Matplotlib,²⁰ and is easily configurable. Currently, we provide object-oriented classes for plotting basic MIR features like melody/pitch contours, spectrograms, chroma, MFCC, etc. The module is functionally similar to the display module in Librosa [23].

A full reference of the modules can be found in the documentation of the library. Both modules can be easily extended with more functionalities as per the demands of the user community.

2.5 Build and packaging tools

We provide tools for custom builds and packaging of Essentia.js for developers and end users:

• Command-line interface (CLI). We provide a CLI for building Essentia.js using some customised shell scripts.

• Docker. We provide a Docker image with static builds of Essentia with Emscripten and other development dependencies required for building Essentia.js.

• Web application. We also host a website for building custom Essentia.js online.²¹ It allows users to select a specific set of algorithms and build settings.

The official Essentia.js builds are bundled using Rollup²² and then packaged and hosted using NPM.

Figure 2: Screenshot of an example web application that uses Essentia.js and its add-on modules.

²⁰ https://matplotlib.org
²¹ https://mtg.github.io/essentia.js-builder
²² https://rollupjs.org

3. GETTING STARTED

In this section, we outline several usage examples and application scenarios for getting started with Essentia.js.

The library can be imported into a web application using the following methods:

• HTML <script> tag. The simplest way to use Essentia.js is by using it with the HTML <script> tag.
• NPM. Essentia.js can also be installed from NPM using the command npm install essentia.js.

• ES6 class imports. Essentia.js can be imported using the ES6 class style imports in JS. This allows integrating the library into any modern JS framework. Listing 1 shows an example of using ES6 style imports for an offline feature extraction task.

• CDN. We also provide CDN links for instantly serving Essentia.js online using free third-party web services such as Jsdelivr²³ and Unpkg.²⁴

There are a lot of potential web applications that can be built with Essentia.js. The library provides algorithms for typical sound and music analysis tasks, including spectral, tonal, and rhythmic characterization. In particular, it is suitable for onset detection, beat tracking and tempo estimation, melody extraction, key and chord estimation, sound and music classification, cover song similarity, loudness metering, and audio problem detection, among others. Figure 2 shows the screenshot of an example web application that we include with the library. Below we outline some of the common application use cases of the library. We provide an extensive collection of analysis examples on our website.²⁵

3.1 Offline feature extraction

Many MIR use cases rely on offline audio analysis and feature extraction. Listing 1 shows a simple JS example of using the library for offline analysis of pitch and onsets. For features computed on overlapping frames, Essentia.js provides the FrameGenerator method similarly to Essentia's Python API. Frames generated by this method can be used as an input to other algorithms in the processing chain. The offline feature extraction can be run inside a Web Worker to improve the efficiency in performance-demanding web applications.

    // Imports Essentia WASM backend
    import {EssentiaWASM} from 'essentia-wasm.module.js';
    // Imports Essentia.js core API
    import Essentia from 'essentia.js-core.es.js';

    // Creates Essentia.js instance
    const essentia = new Essentia(EssentiaWASM);

    // Instance of Web Audio API AudioContext
    const audioContext = new AudioContext();
    // URL of an audio file
    let audioURL = "https://freesound.org/data/previews/328/328857_230356-lq.mp3";
    // Decode audio file as Float32 typed array
    // (arguments: URL, audioContext, channel number)
    const audioData = await essentia.getAudioChannelDataFromURL(audioURL, audioContext, 0);

    // Convert audioData array into vector
    const audioVector = essentia.arrayToVector(audioData);

    // Onset detection with SuperFluxExtractor algorithm
    let bt = essentia.SuperFluxExtractor(audioVector);
    console.log(bt.onsets);

    // Pitch estimation with PitchYinProbabilistic algorithm
    // (arguments: signal, frameSize, hopSize)
    let pyYin = essentia.PitchYinProbabilistic(audioVector, 4096, 256);
    console.log(pyYin.pitch);

    // Shutdown Essentia.js instance and free memory
    essentia.shutdown();
    essentia.delete();

Listing 1: A simple example of offline audio feature extraction using Essentia.js via ES6 style imports.

3.2 Real-time feature extraction

Essentia.js can be used for efficient real-time audio/music analysis in web browsers along with the Web Audio API. This can be done by using the ScriptProcessorNode or the newly introduced AudioWorklet in the Web Audio API:

• ScriptProcessorNode allows users to provide custom JS code for audio feature extraction in its onaudioprocess callback. Even though the ScriptProcessorNode is deprecated according to the current W3C Web Audio API specifications, it is still widely used by developers because of its cross-browser support.

• AudioWorklet design pattern [10] allows users to write high-performance real-time audio analysis on a dedicated audio thread. Users can write custom analysis code by extending the AudioWorkletProcessor and further abstract it as a node in the Web Audio API graph using AudioWorkletNode.²⁶ Currently, the only limitation is that it is only supported in the latest Firefox and Chromium-based web browsers.

3.3 Machine learning applications

In recent years, machine learning (ML) techniques, especially deep learning, have been widely used in a wide range of MIR tasks. With the support of WebGL and WASM, modern web browsers are also capable of running ML applications. Essentia.js can be easily integrated with popular JS ML frameworks such as TensorFlow.js [27] and Onnx.js²⁷ for training and inference. Pre-trained audio ML models using features computed with Essentia as an input (e.g., mel-spectrograms, Constant-Q transform, or chroma) can be easily ported and used for inference in web browsers. In particular, Essentia now ships with a collection of pre-trained TensorFlow models for music audio tagging and classification [3]. These models can be run for inference using the Essentia.js and TensorFlow.js libraries. Our essentia.js-extractor add-on module provides a mel-spectrogram extractor for computing the inputs to these models.

²³ https://www.jsdelivr.com
²⁴ https://unpkg.com
²⁵ https://mtg.github.io/essentia.js/examples
²⁶ https://www.w3.org/TR/webaudio/#audioworkletnode
²⁷ https://github.com/Microsoft/onnxjs

4. BENCHMARK

We tested the performance of Essentia.js in terms of the analysis time for common MIR audio features on various
platforms, and compared it to the native Essentia library. In addition, we measured the analysis times for features available in Meyda and compared them to their Essentia.js counterparts. To this end, we built a set of test suites using the JS library benchmark.js and implemented the equivalent features using both libraries. In our benchmark we measure the time it takes for the entire processing chain to compute a feature given a 5-second audio segment as an input. The code used by Essentia.js is equivalent to the one for Essentia used in Python. The benchmarking of the Python implementation was done using the library pytest with the benchmark extension. We provide the code and website to reproduce these experiments online.²⁸

The results are reported in Figure 3. They include tests on five different environments:

• Linux with Chrome 84.0.4147.89 run with disabled extensions.
• Linux with Firefox 78.0.2 in private browsing mode.
• Android 9 (LineageOS 16) with Chrome 84.0.4147.89 in incognito mode.
• Android 9 (LineageOS 16) with Firefox Nightly 200727 06:00
• Linux with Node.js v.13.13.0.

The Linux computer used for all runs is a 2017 DELL XPS-15 with a 2.80GHz x 8 Intel Core i7-7700HQ processor, 16GB of RAM and Ubuntu 19.04 as OS. The mobile phone is a Xiaomi Redmi Note 7 Pro with a Snapdragon Octa-core 1.7 GHz processor and 6GB RAM. All the tests were done with the same 5-second audio file.

Figure 3: Average analysis times (in seconds) for common audio features on a 5-second music clip, for (a) Essentia.js and (b) Meyda. "Python (Linux)" corresponds to the analysis baseline using native Essentia with Python bindings.

As we can see from Figure 3, Essentia.js algorithms were considerably slower when running on Node.js and on Firefox and Chrome on Android compared to Firefox and Chrome on Linux. Interestingly, Node.js performed worse than Firefox and Chrome on Android, which was not expected. This is probably because different vendors have slightly different implementations of WASM support in their platforms or due to some other reasons yet to be found. In addition, WASM is a relatively new technology in active development.²⁹ Many proposals for improving its performance, such as SIMD optimizations and multi-thread support, are under way. We also aim to do detailed benchmarking of real-time use cases and of using the Web Audio API AudioWorklets in our future work.

5. CONCLUSIONS

We have presented Essentia.js, an open-source JavaScript library for music/audio analysis on the Web. It is based on the Essentia C++ library which is commonly used in MIR, ported to JS via WASM, and compatible with the latest technologies in the Web Audio ecosystem. To the best of our knowledge, this is the most comprehensive library for audio analysis and MIR which can be run on web browsers as well as JS server applications. We hope that the library will contribute to the creation of new online music technology tools in educational, industrial, and creative contexts. The source code of the library is publicly available in our Github repository.³⁰ Everyone is encouraged to contribute to the library.

In our future work, we will focus on improving the performance of the library along with expanding the add-on modules, particularly on providing pre-trained audio ML models for audio analysis, classification, and synthesis on the web client. We also aim to develop interesting web applications that go beyond typical MIR tasks to attract and build a diverse user community. Detailed information about the library is available at the official web page.³¹ It contains the complete documentation, usage examples and tutorials needed for one to get started.

²⁸ https://mtg.github.io/essentia.js-benchmarks
²⁹ https://webassembly.org/roadmap/
³⁰ https://github.com/MTG/essentia.js
³¹ https://essentia.upf.edu/essentiajs
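As a starting point for the frame-wise offline analysis described in Section 3.1, the sketch below slices a signal into overlapping frames and computes one RMS value per frame. It uses plain JS only: generateFrames and rms are hypothetical helpers standing in for the library's FrameGenerator and RMS algorithms, the test signal is synthetic, and Essentia's own frame generation may pad or centre frames differently.

```javascript
// Hypothetical helper: split a Float32Array into overlapping frames.
// Frames are views on the original buffer; incomplete trailing frames are dropped.
function generateFrames(signal, frameSize, hopSize) {
  const frames = [];
  for (let start = 0; start + frameSize <= signal.length; start += hopSize) {
    frames.push(signal.subarray(start, start + frameSize));
  }
  return frames;
}

// Hypothetical helper: root mean square of one frame.
function rms(frame) {
  let sumOfSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumOfSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumOfSquares / frame.length);
}

// Synthetic test signal: one second of a 440 Hz sine with amplitude 0.5.
const sampleRate = 44100;
const signal = new Float32Array(sampleRate);
for (let n = 0; n < signal.length; n++) {
  signal[n] = 0.5 * Math.sin((2 * Math.PI * 440 * n) / sampleRate);
}

// Frame-wise analysis with 50% overlap.
const frames = generateFrames(signal, 2048, 1024);
const rmsPerFrame = frames.map(rms);
console.log(frames.length, rmsPerFrame[0]);
```

With the library itself, each frame would instead be passed to the chosen Essentia.js algorithms, and the loop could run inside a Web Worker to keep the UI thread free.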
6. REFERENCES

[1] MusicCritic: An automatic assessment system for musical exercises. https://musiccritic.upf.edu. Accessed: 2020-09-04.

[2] Stack Overflow Annual Developer Survey. https://insights.stackoverflow.com/survey. Accessed: 2020-15-04.

[3] Pablo Alonso-Jiménez, Dmitry Bogdanov, Jordi Pons, and Xavier Serra. Tensorflow audio models in Essentia. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020), pages 266–270, 2020.

[4] Chad Austin. CppCon 2014: Embind and Emscripten: Blending C++11, JavaScript, and the Web Browser. https://www.youtube.com/watch?v=Dsgws5zJiwk. Accessed: 2020-15-04.

[5] Gavin Bierman, Martín Abadi, and Mads Torgersen. Understanding typescript. In European Conference on Object-Oriented Programming (ECOOP 2014), pages 257–281. Springer, 2014.

[6] Sebastian Böck, Filip Korzeniowski, Jan Schlüter, Florian Krebs, and Gerhard Widmer. Madmom: A new python audio and music signal processing library. In ACM International Conference on Multimedia (MM 2016), pages 1174–1178, 2016.

[7] Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Justin Salamon, José Zapata, and Xavier Serra. Essentia: An audio analysis library for music information retrieval. In International Society for Music Information Retrieval Conference (ISMIR 2013), pages 493–498, 2013.

[8] Paul M Brossier. The aubio library at MIREX 2006. In Music Information Retrieval Evaluation Exchange (MIREX 2006), 2006.

[9] Mark Cartwright, Ayanna Seals, Justin Salamon, Alex Williams, Stefanie Mikloska, Duncan MacConnell, Edith Law, Juan P Bello, and Oded Nov. Seeing sound: Investigating the effects of visualizations and complexity on crowdsourced audio annotations. Proceedings of the ACM on Human-Computer Interaction, 1(CSCW):1–21, 2017.

Freesound datasets: a platform for the creation of open audio datasets. In International Society for Music Information Retrieval Conference (ISMIR 2017), pages 486–93.

[13] Frederic Font, Gerard Roma, and Xavier Serra. Freesound technical demo. In ACM International Conference on Multimedia (MM 2013), pages 411–412, New York, NY, USA, 2013.

[14] W3C Audio Working Group. W3C Web Audio API specifications. https://www.w3.org/TR/webaudio. Accessed: 2020-15-04.

[15] W3C Technical Architecture Group. Web Audio API Design Review. https://github.com/w3ctag/design-reviews/blob/master/2013/07/WebAudio.md. Accessed: 2020-05-04.

[16] Andreas Haas, Andreas Rossberg, Derek L Schuff, Ben L Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. Bringing the web up to speed with webassembly. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017), pages 185–200, 2017.

[17] David Herrera, Hanfeng Chen, Erick Lavoie, and Laurie Hendren. Numerical computing on the web: Benchmarking for the future. In ACM SIGPLAN International Symposium on Dynamic Languages (DLS 2018), pages 88–100, 2018.

[18] Nicholas Jillings, Jamie Bullock, and Ryan Stables. JS-Xtract: A realtime audio feature extraction library for the Web. In International Society for Music Information Retrieval Conference (ISMIR 2016) Late Breaking Demo, 2016.

[19] Nicholas Jillings, David Moffat, Brecht De Man, and Joshua D. Reiss. Web Audio Evaluation Tool: A browser-based listening test environment. In Sound and Music Computing Conference (SMC 2015), 2015.

[20] Jari Kleimola and Oliver Larkin. Web audio modules. In Sound and Music Computing Conference (SMC 2015), 2015.

[21] Anand Mahadevan, Jason Freeman, Brian Magerko, and Juan Carlos Martinez. Earsketch: Teaching computational music remixing in an online web audio
based learning environment. In Web Audio Conference
[10] Hongchan Choi. AudioWorklet: The future of web au- (WAC 2015), 2015.
dio. Ann Arbor, MI: Michigan Publishing, University
[22] Benoit Mathieu, Slim Essid, Thomas Fillon, Jacques
of Michigan Library, 2018.
Prado, and Gaël Richard. Yaafe, an easy to use and
[11] Jakub Fiala, Nevo Segal, and Hugh A. Rawlinson. efficient audio feature extraction software. In Interna-
Meyda: an audio feature extraction library for the Web tional Society for Music Information Retrieval Confer-
Audio API. In Web Audio Conference (WAC 2015), ence (ISMIR 2010), pages 441–446, 2010.
pages 1–6, 2015.
[23] Brian McFee, Colin Raffel, Dawen Liang, Daniel PW
[12] Eduardo Fonseca, Jordi Pons Puig, Xavier Favory, Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto.
Frederic Font Corbera, Dmitry Bogdanov, Andres Fer- librosa: Audio and music signal analysis in python. In
raro, Sergio Oramas, Alastair Porter, and Xavier Serra. Python in Science Conference (SciPy 2015), 2015.
[24] David Moffat, David Ronan, and Joshua D. Reiss. An
evaluation of audio feature extraction toolboxes. In In-
ternational Conference on Digital Audio Effects (DAFx
2015), pages 1–7, 2015.

[25] Alastair Porter, Dmitry Bogdanov, Robert Kaye, Ro-
man Tsukanov, and Xavier Serra. Acousticbrainz: a
community platform for gathering music information
obtained from audio. In International Society for Music
Information Retrieval Conference (ISMIR 2015), 2015.

[26] Michael Schoeffler, Fabian-Robert Stöter, Bernd Edler,
and Jürgen Herre. Towards the next generation of web-
based experiments: A case study assessing basic au-
dio quality following the ITU-R recommendation BS.
1534 (MUSHRA). In Web Audio Conference (WAC
2015), pages 1–6, 2015.

[27] Daniel Smilkov, Nikhil Thorat, Yannick Assogba, Ann
Yuan, Nick Kreeger, Ping Yu, Kangyi Zhang, Shanqing
Cai, Eric Nielsen, David Soergel, et al. Tensorflow. js:
Machine learning for the web and beyond. In Confer-
ence on Systems and Machine Learning (SysML 2019),
2019.

[28] Alon Zakai. Emscripten: an llvm-to-javascript com-
piler. In ACM SIGPLAN Conference on Object-
Oriented Programming, Systems, Languages, and Ap-
plications (OOPSLA 2011), pages 301–312, 2011.