fastFM: A High-Performance Python Library for Factorization Machines

Summary
fastFM is a Python library for Factorization Machines that provides fast implementations of several optimization routines. It follows the scikit-learn API, so it fits into existing machine learning workflows. The library supports regression, classification, and ranking problems, and its speed-critical code is written in C and wrapped with Cython.
Introduction
fastFM is a high-performance Python library dedicated to Factorization Machines (FMs), a powerful class of models widely used in recommender systems and other machine learning tasks. It provides efficient implementations of various optimization routines, including Stochastic Gradient Descent (SGD), Coordinate Descent (CD), and Markov Chain Monte Carlo (MCMC) for Bayesian inference.
Designed with the familiar scikit-learn API, fastFM makes it easy for Python developers to integrate Factorization Machines into their workflows. The library's performance-critical code is written in C and wrapped with Cython, ensuring fast execution. fastFM supports a range of problems, including regression, classification, and ranking.
Bayer, I. "fastFM: A Library for Factorization Machines" Journal of Machine Learning Research 17, pp. 1-5 (2016)
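To make the model concrete, here is a minimal NumPy sketch of how a second-order Factorization Machine scores one sample. It uses the standard FM formulation (a bias, linear weights, and factorized pairwise interactions) and is not fastFM's internal C code. The names `fm_predict`, `fm_predict_naive`, `w0`, `w`, and `V` are chosen for illustration only. The sketch also shows the well-known rewrite that computes the pairwise term in time linear in the number of features, and compares it against the explicit double sum.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM score for one dense feature vector x.

    w0: global bias, w: (n_features,) linear weights,
    V: (n_features, rank) factor matrix (illustrative names).
    """
    linear = w0 + x @ w
    # Pairwise term in O(n_features * rank):
    # 0.5 * sum_f [ (sum_i V[i, f] * x_i)^2 - sum_i V[i, f]^2 * x_i^2 ]
    xv = x @ V
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score via the explicit double sum over feature pairs i < j."""
    n = len(x)
    score = w0 + x @ w
    for i in range(n):
        for j in range(i + 1, n):
            score += (V[i] @ V[j]) * x[i] * x[j]
    return score

rng = np.random.default_rng(0)
x = rng.normal(size=6)
w0, w, V = 0.5, rng.normal(size=6), rng.normal(size=(6, 2))

fast = fm_predict(x, w0, w, V)
naive = fm_predict_naive(x, w0, w, V)
```

Both functions return the same value up to floating-point error; the second dimension of `V` plays the role of the `rank` parameter used in the fastFM example further down.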
Installation
fastFM can be installed from a prebuilt binary or from source. Supported platforms are Linux (Ubuntu 14.04 LTS) and OS X Mavericks.
Binary Install (64-bit only)
The simplest way to get started is via pip:
pip install fastFM
Source Install
For a source installation, make sure your Python build and your operating system have the same bitness (both 32-bit or both 64-bit).
# Install cblas and the python-dev headers (Linux only).
# - cblas can be installed with libatlas-base-dev or libopenblas-dev (Ubuntu)
sudo apt-get install python-dev libopenblas-dev
# Clone the repo including submodules
git clone --recursive https://github.com/ibayer/fastFM.git
# Enter the root directory
cd fastFM
# Install Python dependencies (Cython>=0.22, numpy, pandas, scipy, scikit-learn)
pip install -r ./requirements.txt
# Compile the C extension.
make # build with default python version (python)
PYTHON=python3 make # build with custom python version (python3)
# Install fastFM
pip install .
Examples
Because fastFM follows the scikit-learn API, a model is created, fitted, and used for prediction in the usual way. Here's a quick regression example using the als module:
from fastFM import als

# X_train, y_train, and X_test are placeholders for your own data.
fm = als.FMRegression(n_iter=1000, init_stdev=0.1, rank=2, l2_reg_w=0.1, l2_reg_V=0.5)
fm.fit(X_train, y_train)
y_pred = fm.predict(X_test)
More detailed usage instructions and tutorials can be found in the online documentation and on arXiv.
Why Use fastFM?
fastFM offers several advantages for working with Factorization Machines:
- High Performance: Speed-critical code is written in C and wrapped with Cython, which keeps training and prediction fast on large datasets.
- Scikit-learn API: Its compatibility with the widely adopted scikit-learn API ensures ease of use and integration into existing machine learning pipelines.
- Versatile Solvers: The library offers a range of optimization routines, including SGD, CD, and MCMC, catering to different problem types and preferences.
- Comprehensive Task Support: Whether you're tackling regression, classification, or ranking problems, fastFM provides the necessary tools.
- Bayesian Inference: The MCMC solver performs Bayesian inference, so predictions are based on posterior samples rather than a single point estimate.
- Open to Contributions: The project welcomes contributions and provides guidelines for developers.
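To illustrate what an SGD solver for a Factorization Machine does, here is a small NumPy sketch that fits an FM regressor to synthetic data with per-sample gradient steps on the squared error. It is an illustration of the general technique under stated assumptions, not fastFM's C implementation: the function names, the learning rate, and the synthetic "true" model are all hypothetical, and dense arrays are used for brevity.

```python
import numpy as np

def fm_predict(X, w0, w, V):
    """Second-order FM scores for the rows of a dense matrix X."""
    XV = X @ V
    return w0 + X @ w + 0.5 * np.sum(XV ** 2 - (X ** 2) @ (V ** 2), axis=1)

def fm_sgd_step(x, y, w0, w, V, lr, l2_w, l2_V):
    """One SGD update on the squared error of a single sample (x, y)."""
    xv = x @ V
    err = fm_predict(x[None, :], w0, w, V)[0] - y
    # d(pred)/d(V[i, f]) = x_i * xv[f] - V[i, f] * x_i**2
    grad_V = err * (np.outer(x, xv) - V * (x ** 2)[:, None]) + l2_V * V
    grad_w = err * x + l2_w * w
    return w0 - lr * err, w - lr * grad_w, V - lr * grad_V

rng = np.random.default_rng(0)
n_samples, n_features, rank = 200, 5, 2
X = rng.integers(0, 2, size=(n_samples, n_features)).astype(float)
# Synthetic targets generated by a hypothetical "true" FM.
y = fm_predict(X, 0.3, rng.normal(size=n_features),
               0.5 * rng.normal(size=(n_features, rank)))

# Start from zero linear weights and small random factors.
w0, w = 0.0, np.zeros(n_features)
V = 0.1 * rng.normal(size=(n_features, rank))
mse_before = np.mean((fm_predict(X, w0, w, V) - y) ** 2)

for epoch in range(30):
    for i in rng.permutation(n_samples):
        w0, w, V = fm_sgd_step(X[i], y[i], w0, w, V,
                               lr=0.01, l2_w=0.0, l2_V=0.0)

mse_after = np.mean((fm_predict(X, w0, w, V) - y) ** 2)
```

Comparing `mse_after` with `mse_before` shows how much the fit improved. The `l2_w` and `l2_V` arguments mirror the idea behind the l2_reg_w and l2_reg_V parameters in the fastFM example, although the library's exact update rules may differ.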