HybrA-Filterbanks

About

This repository contains the official implementations of HybrA and ISAC. ISAC is an invertible and stable auditory filterbank with customizable kernel size, and HybrA extends ISAC with an additional set of learnable kernels. Both filterbanks are implemented as PyTorch nn.Module subclasses and can therefore be integrated into any neural network. As the mathematical foundation for constructing ISAC and HybrA, the repository also contains a collection of fast frame-theoretic functions, such as the computation of frame bounds, aliasing terms, and regularizers for tightening.
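
To illustrate what a frame bound computation involves, here is a minimal, dependency-free sketch for the simplest case: a filterbank applied with stride 1 (no decimation) and circular convolution. In that setting the frame bounds are the minimum and maximum over frequency of the summed squared magnitude responses. The names `frequency_response` and `frame_bounds` are hypothetical and not part of the hybra API; the routines in this repository are faster and also cover the aliasing terms that this sketch ignores.

```python
import cmath
import math


def frequency_response(h, n_fft):
    """Naive DFT of the filter h, implicitly zero-padded to n_fft samples."""
    if len(h) > n_fft:
        raise ValueError("filter must not be longer than n_fft")
    return [
        sum(h[n] * cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(len(h)))
        for k in range(n_fft)
    ]


def frame_bounds(filters, n_fft):
    """Frame bounds (A, B) of a stride-1 filterbank on signals of length n_fft.

    Assumes circular convolution and no subsampling, so the frame operator
    is diagonal in the Fourier domain with entries sum_k |H_k[w]|^2.
    """
    responses = [frequency_response(h, n_fft) for h in filters]
    total = [sum(abs(r[k]) ** 2 for r in responses) for k in range(n_fft)]
    return min(total), max(total)


# Haar low-pass / high-pass pair: the summed responses are flat (tight frame).
inv_sqrt2 = 1.0 / math.sqrt(2.0)
haar = [[inv_sqrt2, inv_sqrt2], [inv_sqrt2, -inv_sqrt2]]
A, B = frame_bounds(haar, n_fft=8)
print(A, B)  # both values are close to 2
```

A ratio B / A close to 1 indicates a nearly tight, well-conditioned filterbank, which is what regularizers for tightening aim for.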

Installation

We publish all releases on PyPI. You can install the current version by running:

pip install hybra
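
To confirm that the installation succeeded, the installed version can be queried with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name used in the pip command above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("hybra"))
```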

Usage

Construct an ISAC and HybrA filterbank, and plot the filter frequency responses. Transform an input audio signal into the corresponding learnable time-frequency representation, and plot it.

ISAC / HybrA example
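import torch
import torchaudio
from hybra import ISAC, HybrA, ISACgram

# Load the audio as a tensor of shape (channels, time).
x, fs = torchaudio.load("your_audio.wav")
x = x.to(torch.float32).unsqueeze(0)  # add a leading batch dimension
L = x.shape[-1]  # signal length in samples

# Build the ISAC filterbank and plot its frequency responses.
isac_fb = ISAC(kernel_size=1024, num_channels=128, L=L, fs=fs)
isac_fb.plot_response()

# Analysis (encoder) followed by synthesis (decoder).
y = isac_fb(x)
x_tilde = isac_fb.decoder(y)

# Plot the time-frequency representation.
ISACgram(y, isac_fb.fc, L=L, fs=fs)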
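
Since the filterbank is invertible, a natural sanity check is to compare the input with its reconstruction from the decoder. The following dependency-free sketch computes the reconstruction signal-to-noise ratio in dB for two flat sequences of real samples. The helper `reconstruction_snr_db` is hypothetical and not part of hybra, and it assumes that both signals have the same length.

```python
import math


def reconstruction_snr_db(x, x_hat):
    """SNR in dB of the reconstruction x_hat relative to the reference x."""
    if len(x) != len(x_hat):
        raise ValueError("signals must have the same length")
    signal = sum(a * a for a in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    if noise == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / noise)


reference = [1.0, 0.0]
reconstruction = [0.9, 0.0]
print(reconstruction_snr_db(reference, reconstruction))  # roughly 20 dB
```

With the tensors from the example above, one could pass `x.flatten().tolist()` and `x_tilde.flatten().tolist()`, provided the decoder output is real-valued and has the same number of samples as the input.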

It is also straightforward to include them in any model, e.g., as an encoder/decoder pair.

HybrA model example
import torch
import torch.nn as nn
import torchaudio
from hybra import HybrA


class Net(nn.Module):
    """Small recurrent network that predicts a mask over 40 filterbank channels."""

    def __init__(self):
        super().__init__()

        self.linear_before = nn.Linear(40, 400)

        self.gru = nn.GRU(
            input_size=400,
            hidden_size=400,
            num_layers=2,
            batch_first=True,
        )

        self.linear_after = nn.Linear(400, 600)
        self.linear_after2 = nn.Linear(600, 600)
        self.linear_after3 = nn.Linear(600, 40)

    def forward(self, x):
        # (batch, channels, time) -> (batch, time, channels)
        x = x.permute(0, 2, 1)
        x = torch.relu(self.linear_before(x))
        x, _ = self.gru(x)
        x = torch.relu(self.linear_after(x))
        x = torch.relu(self.linear_after2(x))
        # The sigmoid keeps the mask values in (0, 1).
        x = torch.sigmoid(self.linear_after3(x))
        # Back to (batch, channels, time).
        x = x.permute(0, 2, 1)

        return x


class HybridfilterbankModel(nn.Module):
    def __init__(self):
        super().__init__()

        self.nsnet = Net()
        self.fb = HybrA()

    def forward(self, x):
        x = self.fb(x)
        # Log-power features, floored at 1e-8 to avoid log10(0).
        log_power = torch.log10(torch.clamp(x.abs() ** 2, min=1e-8))
        mask = self.nsnet(log_power)
        return self.fb.decoder(x * mask)


if __name__ == '__main__':
    audio, fs = torchaudio.load('your_audio.wav')
    model = HybridfilterbankModel()
    model(audio)
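
The input to the mask network in the model above is the base-10 logarithm of the squared coefficient magnitudes, floored at 1e-8 so that silent coefficients do not produce minus infinity. The scalar sketch below shows the effect of that floor on single coefficients; the helper `log_power` is hypothetical and only mirrors the expression used in the model.

```python
import math


def log_power(coefficient, floor=1e-8):
    """log10 of the squared magnitude of a coefficient, clamped from below at `floor`."""
    return math.log10(max(abs(coefficient) ** 2, floor))


# A zero coefficient hits the floor, so the feature is bounded below by log10(1e-8).
print(log_power(0.0))
# A unit-magnitude coefficient maps to 0.
print(log_power(1 + 0j))
# A coefficient of magnitude 5 maps to log10(25).
print(log_power(3 + 4j))
```

With this floor, the features handed to the network never drop below about -8, which corresponds to -80 dB in power.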

Citation

If you find our work valuable, please cite:

@article{HaiderTight2024,
  title={Hold me Tight: Trainable and stable hybrid auditory filterbanks for speech enhancement},
  author={Haider, Daniel and Perfler, Felix and Lostanlen, Vincent and Ehler, Martin and Balazs, Peter},
  journal={arXiv preprint arXiv:2408.17358},
  year={2024}
}
@article{HaiderISAC2025,
  title={ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration},
  author={Haider, Daniel and Perfler, Felix and Balazs, Peter and Hollomey, Clara and Holighaus, Nicki},
  journal={arXiv preprint arXiv:2505.07709},
  year={2025}
}