HybrA-Filterbanks

Auditory-inspired filterbanks for deep learning

Welcome to HybrA-Filterbanks, a PyTorch library providing state-of-the-art auditory-inspired filterbanks for audio processing and deep learning applications.

Overview

This library contains the official implementations of:

ISAC (arXiv:2505.07709): Invertible and Stable Auditory filterbank with Customizable kernels for ML integration
HybrA (arXiv:2408.17358): Hybrid Auditory filterbank that extends ISAC with learnable filters
ISACSpec: Spectrogram variant with temporal averaging for robust feature extraction
ISACCC: Cepstral coefficient extractor for speech recognition applications

Key Features

✨ PyTorch Integration: All filterbanks are implemented as nn.Module subclasses, so they can be used like any other layer in a neural network

🎯 Auditory Modeling: Based on human auditory perception principles (mel, ERB, and Bark scales)

⚡ Fast Implementation: Optimized using FFT-based circular convolution

🔧 Flexible Configuration: Customizable kernel sizes, frequency ranges, and scales

📊 Frame Theory: Built-in functions for frame bounds, condition numbers, and stability analysis

🎨 Visualization: Rich plotting capabilities for filter responses and time-frequency representations
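
To make the auditory-scale idea above concrete, here is a minimal, dependency-free sketch of how filter center frequencies can be spaced uniformly on the mel scale. It assumes the widely used formula mel = 2595 * log10(1 + f / 700); the exact formula and frequency range used inside the library are not stated here, and the helper names `hz_to_mel`, `mel_to_hz`, and `mel_center_frequencies` are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mel (common 2595 * log10(1 + f / 700) formula; an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_frequencies(num_channels, fmin, fmax):
    """Center frequencies in Hz, equally spaced on the mel axis between fmin and fmax."""
    m_lo, m_hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (m_hi - m_lo) / (num_channels - 1)
    return [mel_to_hz(m_lo + k * step) for k in range(num_channels)]

# 40 channels up to the Nyquist frequency of a 16 kHz signal
centers = mel_center_frequencies(num_channels=40, fmin=0.0, fmax=8000.0)

# Neighbouring channels are close together at low frequencies and far apart at high ones
low_gap = centers[1] - centers[0]
high_gap = centers[-1] - centers[-2]
print(f"gap between first two channels: {low_gap:.1f} Hz")
print(f"gap between last two channels:  {high_gap:.1f} Hz")
```

The growing gap between channels mirrors the decreasing frequency resolution of human hearing at high frequencies, which is the motivation for auditory scales in general.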

Installation

All releases are published on PyPI. Install the current version by running:

pip install hybra

Quick Start

Basic ISAC Filterbank

import torch
from hybra import ISAC

# Create ISAC filterbank
filterbank = ISAC(
    kernel_size=128,
    num_channels=40,
    fs=16000,
    L=16000,
    scale='mel'
)

# Process audio signal
x = torch.randn(1, 16000)  # Random signal for demo
coefficients = filterbank(x)
reconstructed = filterbank.decoder(coefficients)

# Visualize
filterbank.plot_response()
filterbank.ISACgram(x, log_scale=True)
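
The Key Features list describes the implementation as FFT-based circular convolution. As a rough illustration of that idea, and not the library's actual code, the sketch below checks the convolution theorem on a short signal: circular convolution computed directly matches the result of multiplying the two spectra and transforming back. It uses a naive DFT from the standard library so that it runs without dependencies (in PyTorch one would typically reach for `torch.fft.fft` and `torch.fft.ifft`); the function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, used here as a stand-in for an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def circular_conv_direct(x, h):
    """y[i] = sum_m h[m] * x[(i - m) mod N], evaluated term by term."""
    n = len(x)
    return [sum(h[m] * x[(i - m) % n] for m in range(len(h))) for i in range(n)]

def circular_conv_fft(x, h):
    """Same result via the convolution theorem: multiply spectra, transform back."""
    n = len(x)
    h_padded = list(h) + [0.0] * (n - len(h))  # zero-pad the kernel to the signal length
    spectrum = [a * b for a, b in zip(dft(x), dft(h_padded))]
    return [v.real for v in dft(spectrum, inverse=True)]

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]  # toy signal
h = [0.5, 0.25, -0.25]                           # toy kernel, shorter than the signal

direct = circular_conv_direct(x, h)
via_fft = circular_conv_fft(x, h)
max_diff = max(abs(a - b) for a, b in zip(direct, via_fft))
print(f"largest difference between the two methods: {max_diff:.2e}")
```

With a real FFT the frequency-domain route costs O(N log N) instead of O(N * kernel_size), which pays off for long signals and many channels.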

HybrA with Learnable Filters

from hybra import HybrA

# Create hybrid filterbank with learnable components
hybrid_fb = HybrA(
    kernel_size=128,
    learned_kernel_size=23,
    num_channels=40,
    fs=16000,
    L=16000
)

# Forward pass (supports gradients)
x = torch.randn(1, 16000, requires_grad=True)
y = hybrid_fb(x)

# Check condition number for stability
print(f"Condition number: {hybrid_fb.condition_number():.2f}")
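
The condition number printed above is the ratio B/A of the upper and lower frame bounds; a value of 1 means the filterbank is a tight frame, and large values indicate numerically unstable inversion. As a hedged sketch of the underlying computation, assume an undecimated filterbank (stride 1) acting by circular convolution. In that special case the frame bounds are the minimum and maximum over frequency of the summed squared magnitude responses. How `condition_number()` handles decimation inside the library is not covered here, and `frame_bounds` is a hypothetical helper.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def frame_bounds(filters, signal_length):
    """Frame bounds (A, B) of an undecimated filterbank on signals of the given length."""
    response = [0.0] * signal_length
    for h in filters:
        padded = list(h) + [0.0] * (signal_length - len(h))
        for k, value in enumerate(dft(padded)):
            response[k] += abs(value) ** 2  # accumulate |h_hat_k|^2 per frequency bin
    return min(response), max(response)

# Toy example 1: Haar low-pass/high-pass pair, whose responses sum to a constant
s = 2 ** -0.5
A_tight, B_tight = frame_bounds([[s, s], [s, -s]], signal_length=16)
cond_tight = B_tight / A_tight

# Toy example 2: a single two-tap filter, which weights frequencies unevenly
A_single, B_single = frame_bounds([[1.0, 0.5]], signal_length=16)
cond_single = B_single / A_single

print(f"Haar pair:     A={A_tight:.3f}, B={B_tight:.3f}, condition number={cond_tight:.3f}")
print(f"single filter: A={A_single:.3f}, B={B_single:.3f}, condition number={cond_single:.3f}")
```

The Haar pair is tight (A = B), while the single filter attenuates high frequencies and therefore has a condition number well above 1.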

ISAC Spectrograms and MFCCs

from hybra import ISACSpec, ISACCC

# Spectrogram with temporal averaging
spectrogram = ISACSpec(
    num_channels=40,
    fs=16000,
    L=16000,
    power=2.0,
    is_log=True
)

# MFCC-like cepstral coefficients
mfcc_extractor = ISACCC(
    num_channels=40,
    num_cc=13,
    fs=16000,
    L=16000
)

x = torch.randn(1, 16000)
spec = spectrogram(x)
mfccs = mfcc_extractor(x)
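
For readers unfamiliar with cepstral coefficients: MFCC-style features are commonly obtained by log-compressing the band energies of each frame and applying a DCT-II across the channels, keeping only the first few coefficients. The sketch below shows that generic recipe for a single frame using only the standard library. It is an illustration of the textbook procedure, not a description of how ISACCC is implemented, and the helper names and the `floor` parameter are hypothetical.

```python
import math

def dct_ii(values):
    """Unnormalized DCT-II: c[k] = sum_n v[n] * cos(pi * k * (n + 0.5) / N)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]

def cepstral_coefficients(band_energies, num_cc, floor=1e-8):
    """Log-compress one frame of band energies and keep the first num_cc DCT coefficients."""
    log_energies = [math.log(max(e, floor)) for e in band_energies]  # floor avoids log(0)
    return dct_ii(log_energies)[:num_cc]

# Hypothetical frame: 40 band energies with a smooth spectral tilt
tilted_frame = [math.exp(-0.1 * k) for k in range(40)]
cc_tilted = cepstral_coefficients(tilted_frame, num_cc=13)

# A perfectly flat spectrum has all of its information in coefficient 0
flat_frame = [2.0] * 40
cc_flat = cepstral_coefficients(flat_frame, num_cc=13)

print("tilted frame, first 3 coefficients:", [round(c, 3) for c in cc_tilted[:3]])
print("flat frame, first 3 coefficients:  ", [round(c, 3) for c in cc_flat[:3]])
```

Coefficient 0 tracks the overall log-energy, while higher coefficients describe progressively finer detail of the spectral envelope, which is why truncating to around 13 coefficients yields a compact feature vector.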

The filterbanks can also be embedded in a larger model, e.g., as an encoder/decoder pair around a mask-estimating network.

HybrA model example

import torch
import torch.nn as nn
import torchaudio
from hybra import HybrA

class Net(nn.Module):
    def __init__(self):
        super().__init__()

        self.linear_before = nn.Linear(40, 400)

        self.gru = nn.GRU(
            input_size=400,
            hidden_size=400,
            num_layers=2,
            batch_first=True,
        )

        self.linear_after = nn.Linear(400, 600)
        self.linear_after2 = nn.Linear(600, 600)
        self.linear_after3 = nn.Linear(600, 40)

    def forward(self, x):
        # (batch, channels, time) -> (batch, time, channels) for the linear and GRU layers
        x = x.permute(0, 2, 1)
        x = torch.relu(self.linear_before(x))
        x, _ = self.gru(x)
        x = torch.relu(self.linear_after(x))
        x = torch.relu(self.linear_after2(x))
        # Sigmoid keeps the mask values in (0, 1)
        x = torch.sigmoid(self.linear_after3(x))
        # Back to (batch, channels, time)
        x = x.permute(0, 2, 1)

        return x

class HybridfilterbankModel(nn.Module):
    def __init__(self):
        super().__init__()

        self.nsnet = Net()
        self.fb = HybrA()

    def forward(self, x):
        x = self.fb(x)
        # Log-power features, floored at 1e-8 to avoid log10(0)
        power = torch.clamp(x.abs() ** 2, min=1e-8)
        mask = self.nsnet(torch.log10(power))
        return self.fb.decoder(x*mask)

if __name__ == '__main__':
    audio, fs = torchaudio.load('your_audio.wav')
    model = HybridfilterbankModel()
    model(audio)

Citation

If you find our work valuable, please cite:

@article{HaiderTight2024,
  title={Hold me Tight: Trainable and stable hybrid auditory filterbanks for speech enhancement},
  author={Haider, Daniel and Perfler, Felix and Lostanlen, Vincent and Ehler, Martin and Balazs, Peter},
  journal={arXiv preprint arXiv:2408.17358},
  year={2024}
}
@article{HaiderISAC2025,
  title={ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration},
  author={Haider, Daniel and Perfler, Felix and Balazs, Peter and Hollomey, Clara and Holighaus, Nicki},
  journal={arXiv preprint arXiv:2505.07709},
  year={2025}
}
