pyAudioAnalysis: A Python Library for Audio Feature Extraction and Analysis

Summary
pyAudioAnalysis is an open-source Python library for audio analysis. It covers feature extraction, classification, and segmentation of audio data, and exposes the underlying signal processing and machine learning steps through Python wrappers and a command-line interface.
Introduction
pyAudioAnalysis covers the following audio analysis tasks:
- Extracting audio features and representations, such as MFCCs, spectrograms, and chromagrams.
- Training, parameter tuning, and evaluation of audio segment classifiers.
- Classifying unknown sounds and detecting specific audio events.
- Supervised and unsupervised segmentation, including speaker diarization.
- Training and applying audio regression models, for example for emotion recognition.
- Dimensionality reduction for visualizing audio data and content similarities.
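To make the idea of a spectrogram concrete, here is a minimal sketch in plain Python. It is not how pyAudioAnalysis computes its features; the helper names (`frame_signal`, `magnitude_spectrum`, `spectrogram`), the naive DFT, and the synthetic 1 kHz tone are all assumptions chosen for illustration. The signal is cut into short overlapping frames and each frame is turned into a magnitude spectrum.

```python
import cmath
import math


def frame_signal(samples, window, step):
    """Split a list of samples into overlapping short-term frames."""
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, step)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, window, step):
    """One magnitude spectrum per frame (rows = time, columns = frequency)."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, window, step)]


# Synthetic 1 kHz tone sampled at 8 kHz (illustrative values, not library defaults).
sample_rate = 8000
window = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

spec = spectrogram(tone, window=window, step=32)

# The strongest bin of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / window
print(len(spec), "frames; peak at", peak_hz, "Hz")
```

A real pipeline would use an FFT and a window function instead of the naive DFT; the sketch only shows the framing-then-transform structure behind a spectrogram.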
Installation
Getting started with pyAudioAnalysis is straightforward. Follow these steps to install the library:
- Clone the repository:
git clone https://github.com/tyiannak/pyAudioAnalysis.git
- Install dependencies: Navigate into the cloned directory and install the required packages.
cd pyAudioAnalysis
pip install -r ./requirements.txt
- Install the library:
pip install -e .
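After the steps above, a quick way to confirm the package is visible to the current Python interpreter is to ask the import system to locate it. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of pyAudioAnalysis.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("pyAudioAnalysis"):
    print("pyAudioAnalysis is importable")
else:
    print("pyAudioAnalysis was not found; re-run the install steps")
```

The check only locates the package; it does not import it or verify that the dependencies from requirements.txt are present.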
Examples
pyAudioAnalysis offers easy-to-use Python wrappers for its audio analysis tasks, alongside command-line support for all of its functionality.
Audio Classification Example
This Python code snippet demonstrates how to train an audio segment classifier using WAV files organized into folders (each representing a different class), and then use the trained classifier to categorize an unknown audio WAV file.
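from pyAudioAnalysis import audioTrainTest as aT
# Train an SVM on the two class folders (1.0 s mid-term window and step) and save it as "svmSMtemp"
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
# Classify an unknown WAV file with the saved model
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")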
The call returns the index of the winning class, the per-class probabilities, and the class names. The result would be similar to:
(0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])
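The tuple above can be unpacked to get a readable label. The sketch below assumes, based on the sample output, that the first element is the index of the winning class; the helper name `winning_label` is hypothetical, and a plain list stands in for the NumPy array so the example runs without extra dependencies.

```python
def winning_label(result):
    """Return (label, probability) for the winning class of a result tuple."""
    class_id, probabilities, class_names = result
    index = int(class_id)  # the index arrives as a float, e.g. 0.0
    return class_names[index], float(probabilities[index])


# Values copied from the sample output above.
result = (0.0, [0.90156761, 0.09843239], ["music", "speech"])

label, confidence = winning_label(result)
print(label, round(confidence, 3))  # prints: music 0.902
```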
Command-Line Usage
For command-line enthusiasts, pyAudioAnalysis provides direct access to its features. For instance, to extract the spectrogram of an audio signal stored in a WAV file, you can use:
python audioAnalysis.py fileSpectrogram -i data/doremi.wav
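The same command can be driven from a script, for example to process several files. The following is a sketch under stated assumptions: the helper names `spectrogram_command` and `run_spectrogram` are hypothetical, and the script is assumed to be run from the directory that contains audioAnalysis.py. Only the `fileSpectrogram -i` invocation shown above is used.

```python
import subprocess


def spectrogram_command(wav_path, script="audioAnalysis.py"):
    """Build the argument list for the fileSpectrogram command shown above."""
    return ["python", script, "fileSpectrogram", "-i", wav_path]


def run_spectrogram(wav_path):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(spectrogram_command(wav_path), check=True)


# Build (but do not run) the command for each file in a list.
wav_files = ["data/doremi.wav"]
commands = [spectrogram_command(path) for path in wav_files]
print(" ".join(commands[0]))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.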
For more detailed examples and comprehensive tutorials, refer to the official pyAudioAnalysis Wiki.
Why Use pyAudioAnalysis
pyAudioAnalysis stands out as a robust choice for audio analysis due to several key advantages:
- Comprehensive Functionality: It covers a broad spectrum of audio analysis tasks, from basic feature extraction to advanced machine learning applications like classification, segmentation, and regression, all within a single library.
- Ease of Use: The library is designed with user-friendliness in mind, offering simple Python wrappers and extensive command-line support, making it accessible to both beginners and experienced developers.
- Machine Learning Integration: It wraps machine learning models, such as the SVM used in the classification example above, so that users can train, evaluate, and apply classifiers and regression models to audio problems.
- Maintenance and Resources: The repository posts news and updates, and the project is documented through a wiki and several articles listed under Links.
- Open-Source and Extensible: Being an open-source project, pyAudioAnalysis is freely available, encouraging community contributions and allowing for customization and extension to suit specific research or application needs.
Links
Explore pyAudioAnalysis further through these resources:
- GitHub Repository: https://github.com/tyiannak/pyAudioAnalysis
- Official Wiki: https://github.com/tyiannak/pyAudioAnalysis/wiki
- Audio Handling Basics Article: https://hackernoon.com/audio-handling-basics-how-to-process-audio-files-using-python-cli-jo283u3y
- Intro to Audio Analysis Article: https://hackernoon.com/intro-to-audio-analysis-recognizing-sounds-using-machine-learning-qy2r3ufl
- Music Mood Lighting Use Case: https://hackernoon.com/how-to-use-machine-learning-to-color-your-lighting-based-on-music-mood-bi163u8l
- Research Publication: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0144610