{"name":"Kapre: Keras Audio Preprocessors for Real-time GPU Processing","description":"Kapre is a powerful Python library that provides Keras layers for real-time audio preprocessing directly on GPUs. It enables efficient computation of STFT, Melspectrograms, and other audio features within your deep learning models. This integration simplifies model deployment, allows for DSP parameter optimization, and ensures consistency compared to traditional pre-computation or custom implementations.","github":"https://github.com/keunwoochoi/kapre","url":"https://osrepos.com/repo/keunwoochoi-kapre","source":"osrepos.com","sourceDescription":"This repository profile is provided by osrepos.com, an open source repository discovery platform.","repositoryProfile":"https://osrepos.com/repo/keunwoochoi-kapre","generatedFor":"open source discovery and AI-assisted research","markdown":"https://osrepos.com/repo/keunwoochoi-kapre.md","json":"https://osrepos.com/repo/keunwoochoi-kapre.json","topics":["Python","Keras","TensorFlow","Audio Processing","Melspectrogram","Deep Learning","Audio Preprocessing","Spectrogram"],"keywords":["Python","Keras","TensorFlow","Audio Processing","Melspectrogram","Deep Learning","Audio Preprocessing","Spectrogram"],"stars":null,"summary":"Kapre is a powerful Python library that provides Keras layers for real-time audio preprocessing directly on GPUs. It enables efficient computation of STFT, Melspectrograms, and other audio features within your deep learning models. This integration simplifies model deployment, allows for DSP parameter optimization, and ensures consistency compared to traditional pre-computation or custom implementations.","content":"## Introduction\n\nKapre is a powerful Python library that offers Keras Audio Preprocessors, allowing you to compute essential audio features like STFT, ISTFT, Melspectrogram, and more, directly on the GPU in real-time. 
Designed for Python 3.8+ with type hints, Kapre integrates seamlessly into your deep learning workflow, making audio feature extraction an integral part of your Keras models.\n\n## Installation\n\nInstall Kapre with pip:\n\nsh\npip install kapre\n\n\n## Examples\n\nIntegrating Kapre into your Keras model is straightforward. Here's a one-shot example demonstrating how to add STFT and other processing layers to a sequential model:\n\npython\nfrom tensorflow.keras.models import Sequential\nfrom tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, GlobalAveragePooling2D, Dense, Softmax\nfrom kapre import STFT, Magnitude, MagnitudeToDecibel\nfrom kapre.composed import get_melspectrogram_layer, get_log_frequency_spectrogram_layer\n\n# 6 channels (!), a 1-second audio signal at 44.1 kHz, as an example.\ninput_shape = (44100, 6)\nsr = 44100\nmodel = Sequential()\n# An STFT layer\nmodel.add(STFT(n_fft=2048, win_length=2018, hop_length=1024,\n               window_name=None, pad_end=False,\n               input_data_format='channels_last', output_data_format='channels_last',\n               input_shape=input_shape))\nmodel.add(Magnitude())\nmodel.add(MagnitudeToDecibel())  # these three layers can be replaced with get_stft_magnitude_layer()\n# Alternatively, you may want to use a melspectrogram layer\n# melgram_layer = get_melspectrogram_layer()\n# or log-frequency layer\n# log_stft_layer = get_log_frequency_spectrogram_layer()\n\n# add more layers as you want\nmodel.add(Conv2D(32, (3, 3), strides=(2, 2)))\nmodel.add(BatchNormalization())\nmodel.add(ReLU())\nmodel.add(GlobalAveragePooling2D())\nmodel.add(Dense(10))\nmodel.add(Softmax())\n\n# Compile the model\nmodel.compile('adam', 'categorical_crossentropy') # if single-label classification\n\n# train it with raw audio sample inputs\n# for example, you may have functions that load your data as below.\nx = load_x() # e.g., x.shape = (10000, 44100, 6), i.e. (batch, time, channel) to match input_shape\ny = load_y() # e.g., y.shape = (10000, 10) if it's 
10-class classification\n# then..\nmodel.fit(x, y)\n# Done!\n\n\nFor more examples and detailed usage, refer to the [example folder](https://github.com/keunwoochoi/kapre/tree/master/examples){target=\"_blank\"} in the GitHub repository.\n\n## Why Use Kapre?\n\nKapre has several advantages over pre-computing audio features or writing your own preprocessing layers:\n\n### Versus Pre-computation\n\n*   You can optimize DSP parameters as part of model training, because preprocessing lives inside the model.\n*   Model deployment becomes simpler and more consistent.\n*   Your code and model have fewer dependencies.\n\n### Versus Your Own Implementation\n\n*   **Quick and Easy**: Integrate complex audio processing with minimal effort.\n*   **Consistency**: Handles 1D/2D TensorFlow batch shapes consistently and is data-format agnostic (`channels_first` and `channels_last`).\n*   **Less Error Prone**: Kapre layers are tested against Librosa, which guards against mistakes in tricky operations like STFT and decibel conversion.\n*   **Extended APIs**: Provides functionality beyond the default `tf.signal` implementations, such as a perfectly invertible `STFT` and `InverseSTFT` pair, and a Mel-spectrogram with more options.\n*   **Reproducibility**: Available on pip with versioned releases.\n\n## Links\n\n*   **GitHub Repository**: [https://github.com/keunwoochoi/kapre](https://github.com/keunwoochoi/kapre){target=\"_blank\"}\n*   **API Documentation**: [https://kapre.readthedocs.io](https://kapre.readthedocs.io){target=\"_blank\"}\n*   **Citation**: If you use Kapre in your work, please cite the following paper:\n\n    \n    @inproceedings{choi2017kapre,\n      title={Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras},\n      author={Choi, Keunwoo and Joo, Deokjin and Kim, Juho},\n      booktitle={Machine Learning for Music Discovery Workshop at 34th International Conference on Machine 
Learning},\n      year={2017},\n      organization={ICML}\n    }","metrics":{"detailViews":1,"githubClicks":3},"dates":{"published":null,"modified":"2026-05-03T12:28:19.000Z"}}