hybra
- class hybra.HybrA(kernel_size: int = 128, learned_kernel_size: int = 23, num_channels: int = 40, stride: int | None = None, fc_max: float | int | None = None, fs: int = 16000, L: int = 16000, supp_mult: float = 1, scale: str = 'mel', tighten: bool = False, det_init: bool = False, verbose: bool = True)
Bases: Module
Constructor for a HybrA filterbank.
- Parameters:
- __init__(kernel_size: int = 128, learned_kernel_size: int = 23, num_channels: int = 40, stride: int | None = None, fc_max: float | int | None = None, fs: int = 16000, L: int = 16000, supp_mult: float = 1, scale: str = 'mel', tighten: bool = False, det_init: bool = False, verbose: bool = True)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- property condition_number
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hybra.ISAC(kernel_size: int | None = 128, num_channels: int = 40, fc_max: float | int | None = None, stride: int | None = None, fs: int = 16000, L: int = 16000, supp_mult: float = 1, scale: str = 'mel', tighten=False, is_encoder_learnable=False, fit_decoder=False, is_decoder_learnable=False, verbose: bool = True)
Bases: Module
Constructor for an ISAC filterbank.
- Parameters:
- __init__(kernel_size: int | None = 128, num_channels: int = 40, fc_max: float | int | None = None, stride: int | None = None, fs: int = 16000, L: int = 16000, supp_mult: float = 1, scale: str = 'mel', tighten=False, is_encoder_learnable=False, fit_decoder=False, is_decoder_learnable=False, verbose: bool = True)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- property condition_number
- property condition_number_decoder
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hybra.ISACMelSpectrogram(kernel_size: int | None = None, num_channels: int = 40, fc_max: float | int | None = None, stride: int | None = None, fs: int = 16000, L: int = 16000, bw_multiplier: float = 1, scale: str = 'erb', is_encoder_learnable=False, is_averaging_kernel_learnable=False, is_log=False)
Bases: Module
- __init__(kernel_size: int | None = None, num_channels: int = 40, fc_max: float | int | None = None, stride: int | None = None, fs: int = 16000, L: int = 16000, bw_multiplier: float = 1, scale: str = 'erb', is_encoder_learnable=False, is_averaging_kernel_learnable=False, is_log=False)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
hybra.utils
- hybra.utils.ISACgram(coefficients, fc=None, L=None, fs=None, fc_max=None, log_scale=True, vmin=None, cmap='inferno')
Plot the ISAC coefficients with optional log scaling and colorbar.
- Parameters:
coefficients (numpy.ndarray or torch.Tensor) – Filterbank coefficients.
fc (numpy.ndarray) – Center frequencies.
L (int) – Signal length.
fs (int) – Sampling rate.
fc_max (float or None) – Max frequency to display.
log_scale (bool) – Apply log scaling to coefficients.
vmin (float or None) – Minimum value for dynamic range clipping.
cmap (str) – Matplotlib colormap name.
- hybra.utils.alias(w: Tensor, d: int, Ls: int | None = None, diag_only: bool = False) → Tensor
Computes the norm of the aliasing terms.
- Parameters:
w (torch.Tensor) – Impulse responses of the filterbank as a 2-D tensor of shape [num_channels, sig_length].
d (int) – Decimation factor; must divide the filter length.
- Returns:
A: Energy of the aliasing terms.
- hybra.utils.audfilters(kernel_size: int | None = None, num_channels: int = 96, fc_max: float | int | None = None, fs: int = 16000, L: int = 16000, supp_mult: float = 1, scale: str = 'mel') → tuple[Tensor, int, int, int | float, int | float, int, int, int]
Generate FIR filter kernels with length kernel_size equidistantly spaced on auditory frequency scales.
- Parameters:
kernel_size (int) – Size of the filter kernels (equals maximum window length).
num_channels (int) – Number of channels.
fc_max (int) – Maximum frequency (in Hz) that should lie on the auditory scale.
fs (int) – Sampling rate.
L (int) – Signal length.
supp_mult (float) – Support multiplier.
scale (str) – Auditory scale.
- Returns:
kernels (torch.Tensor): Generated kernels.
d (int): Downsampling rates.
fc (list): Center frequencies.
fc_min (int, float): First transition frequency.
fc_max (int, float): Second transition frequency.
kernel_min (int): Minimum kernel size.
kernel_size (int): Maximum kernel size.
L (int): Admissible signal length.
- Return type:
- hybra.utils.audspace(fmin: float | int | Tensor, fmax: float | int | Tensor, num_channels: int, scale: str = 'erb')
Computes a vector of values equidistantly spaced on the selected auditory scale.
- Parameters:
- Returns:
y (ndarray): Array of frequencies equidistantly scaled on the auditory scale.
- Return type:
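The idea of equidistant spacing on an auditory scale can be sketched with the standard library alone: map both band edges to auditory units, take equal steps there, and map each point back to Hz. The ERB-rate formula and its constants (21.4, 0.00437) are an assumption taken from the general psychoacoustics literature, and the helper names `erb_rate`, `erb_rate_inverse` and `audspace_sketch` are hypothetical; the library's own scaling may differ.

```python
import math


def erb_rate(freq_hz: float) -> float:
    """Hz -> ERB-rate units (assumed textbook formula, not taken from hybra)."""
    return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)


def erb_rate_inverse(aud: float) -> float:
    """ERB-rate units -> Hz (inverse of the assumed formula above)."""
    return (10.0 ** (aud / 21.4) - 1.0) / 0.00437


def audspace_sketch(fmin: float, fmax: float, num_channels: int) -> list:
    """Return num_channels frequencies equally spaced in ERB-rate units."""
    if num_channels < 2:
        raise ValueError("num_channels must be at least 2")
    lo, hi = erb_rate(fmin), erb_rate(fmax)
    step = (hi - lo) / (num_channels - 1)
    # Equal steps on the auditory axis, mapped back to Hz.
    return [erb_rate_inverse(lo + k * step) for k in range(num_channels)]


centers = audspace_sketch(100.0, 8000.0, 8)
print([round(f, 1) for f in centers])
```

Because the inverse mapping is exponential, the resulting frequencies are densely packed at the low end and increasingly far apart towards fmax.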
- hybra.utils.audspace_mod(fc_low: float | int | Tensor, fc_high: float | int | Tensor, fs: int, num_channels: int, scale: str = 'erb')
Generate num_channels frequency samples that are equidistant on the modified auditory scale.
- hybra.utils.audtofreq(aud: float | int | Tensor, scale: str = 'erb', fs: int | None = None)
- hybra.utils.audtofreq_mod(aud: float | int | Tensor, fc_low: float | int | Tensor, fc_high: float | int | Tensor, scale='erb', fs=None)
Inverse of freqtoaud_mod to map auditory scale back to frequency.
- hybra.utils.bwtofc(bw: float | int | Tensor, scale='erb')
Computes the center frequency corresponding to a given critical bandwidth.
- Parameters:
- Returns:
Center frequency corresponding to the given bandwidth.
- Return type:
ndarray or float
- hybra.utils.can_tight(w: Tensor, d: int, Ls: int) → Tensor
Computes the canonical tight filterbank of w (time domain) using the polyphase representation.
- Parameters:
w (torch.Tensor) – Impulse responses of the filterbank as a 2-D tensor of shape [num_channels, signal_length].
d (int) – Decimation factor; must divide signal_length.
- Returns:
W (torch.Tensor): Canonical tight filterbank of w, with shape [num_channels, signal_length].
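For intuition, the undecimated special case (d = 1) can be written down without the polyphase machinery: the frame operator is then diagonal in the Fourier domain, so applying its inverse square root amounts to dividing every frequency response by the square root of the summed power response. The sketch below covers only that special case, with circular convolution on Ls samples; `can_tight_undecimated` and the naive `dft`/`idft` helpers are hypothetical names and do not reproduce the library's implementation.

```python
import cmath
import math


def dft(x, n):
    """Naive length-n DFT of x, zero-padded to n samples."""
    padded = list(x) + [0.0] * (n - len(x))
    return [sum(padded[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
            for m in range(n)]


def idft(X):
    """Naive inverse DFT."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * math.pi * m * t / n) for m in range(n)) / n
            for t in range(n)]


def can_tight_undecimated(filters, Ls):
    """Canonical tight filterbank for stride d = 1 (special case only)."""
    responses = [dft(g, Ls) for g in filters]
    # Summed power response: the eigenvalues of the frame operator for d = 1.
    power = [sum(abs(r[m]) ** 2 for r in responses) for m in range(Ls)]
    if min(power) < 1e-12:
        raise ValueError("filters do not form a frame (power response vanishes)")
    tight = []
    for r in responses:
        scaled = [r[m] / math.sqrt(power[m]) for m in range(Ls)]
        # Real input filters yield real output up to rounding error.
        tight.append([v.real for v in idft(scaled)])
    return tight


tight = can_tight_undecimated([[1.0, 0.5], [0.3, -1.0]], 8)
new_power = [sum(abs(dft(g, 8)[m]) ** 2 for g in tight) for m in range(8)]
print([round(p, 6) for p in new_power])
```

After tightening, the summed power response is flat, which is what makes the filterbank tight in this setting.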
- hybra.utils.condition_number(w: Tensor, d: int, Ls: int | None = None) → Tensor
Computes the condition number of a filterbank.
- Parameters:
w (torch.Tensor) – Impulse responses of the filterbank as a 2-D tensor of shape [num_channels, signal_length].
d (int) – Decimation factor (stride); must divide signal_length.
- Returns:
kappa (torch.Tensor): Condition number.
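The condition number of a filterbank is the ratio B/A of its upper and lower frame bounds. For stride d = 1 those bounds are simply the maximum and minimum of the summed power response over the frequency bins, which the stdlib-only sketch below computes. The general decimated case needs the polyphase representation and is not reproduced here; `frame_bounds_undecimated` and `dft` are hypothetical helper names.

```python
import cmath
import math


def dft(x, n):
    """Naive length-n DFT of x, zero-padded to n samples."""
    padded = list(x) + [0.0] * (n - len(x))
    return [sum(padded[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
            for m in range(n)]


def frame_bounds_undecimated(filters, Ls):
    """Frame bounds (A, B) for stride d = 1 and circular convolution on Ls samples."""
    responses = [dft(g, Ls) for g in filters]
    power = [sum(abs(r[m]) ** 2 for r in responses) for m in range(Ls)]
    return min(power), max(power)


# Two-channel Haar filters: the summed power response is constant, so A == B.
s = 1.0 / math.sqrt(2.0)
A, B = frame_bounds_undecimated([[s, s], [s, -s]], 8)
kappa = B / A

# A single short low-pass filter is far from tight.
A2, B2 = frame_bounds_undecimated([[1.0, 0.5]], 8)
kappa2 = B2 / A2
print(kappa, kappa2)
```

A condition number close to 1 indicates a nearly tight, numerically well-behaved filterbank; large values indicate frequency regions that are represented much more weakly than others.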
- hybra.utils.fctobw(fc: float | int | Tensor, scale='erb')
Computes the critical bandwidth of a filter at a given center frequency.
- Parameters:
- Returns:
Critical bandwidth at each center frequency.
- Return type:
ndarray or float
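As an illustration of the relationship between center frequency and critical bandwidth, the sketch below uses the widely cited equivalent rectangular bandwidth approximation ERB(fc) = 24.7 (4.37 fc / 1000 + 1) and its algebraic inverse. The formula and the helper names `fc_to_bw_sketch` and `bw_to_fc_sketch` are assumptions for illustration; the constants the library uses for each scale are not stated in this reference.

```python
def fc_to_bw_sketch(fc_hz: float) -> float:
    """Center frequency in Hz -> ERB bandwidth in Hz (assumed textbook formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def bw_to_fc_sketch(bw_hz: float) -> float:
    """ERB bandwidth in Hz -> center frequency in Hz (inverse of the above)."""
    return (bw_hz / 24.7 - 1.0) * 1000.0 / 4.37


for fc in (100.0, 1000.0, 4000.0):
    bw = fc_to_bw_sketch(fc)
    print(fc, round(bw, 2), round(bw_to_fc_sketch(bw), 2))
```

The bandwidth grows linearly with the center frequency in this model, so the two mappings are exact inverses of each other.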
- hybra.utils.fir_tightener3000(w: Tensor, supp: int, d: int, eps: float = 1.01, Ls: int | None = None)
Iterative tightening procedure with fixed support for a given filterbank w.
- Parameters:
w (torch.Tensor) – Impulse responses of the filterbank as a 2-D tensor of shape [num_channels, signal_length].
supp (int) – Desired support of the resulting filterbank.
d (int) – Decimation factor; must divide the filter length.
eps (float) – Desired precision for the condition number.
Ls (int or None) – System length (if not already given by w). If set, the resulting filterbank is padded with zeros to length Ls.
- Returns:
Filterbank with condition number eps and support length supp. If length=supp then the resulting filterbank is the canonical tight filterbank of w.
- hybra.utils.frame_bounds(w: Tensor, d: int, Ls: int | None = None) → Tuple[Tensor, Tensor]
Computes the frame bounds of a filterbank given in impulse responses using the polyphase representation.
- Parameters:
w (torch.Tensor) – Impulse responses of the filterbank as a 2-D tensor of shape [num_channels, length].
d (int) – Decimation (or downsampling) factor; must divide the filter length.
- Returns:
A, B: Frame bounds
- Return type:
- hybra.utils.freqtoaud(freq: float | int | Tensor, scale: str = 'erb', fs: int | None = None)
Converts frequencies (Hz) to auditory scale units.
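To make the conversion concrete, the sketch below implements one common closed form each for the ERB-rate and mel scales, together with their inverses. The constants (21.4 and 0.00437 for ERB, 2595 and 700 for mel) and the helper names `freq_to_aud_sketch` and `aud_to_freq_sketch` are assumptions taken from the general literature rather than from hybra, so the library may scale its auditory units differently.

```python
import math


def freq_to_aud_sketch(freq_hz: float, scale: str = "erb") -> float:
    """Map a frequency in Hz to auditory units (assumed textbook formulas)."""
    if scale == "erb":
        return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)
    if scale == "mel":
        return 2595.0 * math.log10(1.0 + freq_hz / 700.0)
    raise ValueError(f"unsupported scale: {scale}")


def aud_to_freq_sketch(aud: float, scale: str = "erb") -> float:
    """Map auditory units back to Hz (inverse of freq_to_aud_sketch)."""
    if scale == "erb":
        return (10.0 ** (aud / 21.4) - 1.0) / 0.00437
    if scale == "mel":
        return 700.0 * (10.0 ** (aud / 2595.0) - 1.0)
    raise ValueError(f"unsupported scale: {scale}")


for scale in ("erb", "mel"):
    aud = freq_to_aud_sketch(1000.0, scale)
    print(scale, round(aud, 3), round(aud_to_freq_sketch(aud, scale), 3))
```

Both mappings are logarithmic in frequency, which is why equal steps in auditory units correspond to ever wider steps in Hz.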
- hybra.utils.freqtoaud_mod(freq: float | int | Tensor, fc_low: float | int | Tensor, fc_high: float | int | Tensor, scale='erb', fs=None)
Modified auditory scale function with linear region below fc_crit.
- hybra.utils.frequency_correlation(w: Tensor, d: int, Ls: int | None = None, diag_only: bool = False) → Tensor
Computes the frequency correlation functions (vectorized version).
- Parameters:
w (torch.Tensor) – Impulse responses, shape (J, K).
d (int) – Decimation factor.
Ls (int or None) – FFT length (default: nearest multiple of d ≥ 2K-1).
diag_only (bool) – If True, only return the diagonal (i.e., the PSD).
- Returns:
G (torch.Tensor): Complex tensor of shape (d, Ls) with the frequency correlations.
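A plain-Python sketch of what a (d, Ls) table of frequency correlations can look like: row j correlates each frequency response with a copy of itself shifted by j·Ls/d bins and sums over channels. Row 0 is then the summed power response, and the remaining rows are the aliasing terms introduced by decimation. The shift direction, the absence of any normalization, and the names `frequency_correlation_sketch` and `dft` are assumptions for illustration, not the library's conventions.

```python
import cmath
import math


def dft(x, n):
    """Naive length-n DFT of x, zero-padded to n samples."""
    padded = list(x) + [0.0] * (n - len(x))
    return [sum(padded[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
            for m in range(n)]


def frequency_correlation_sketch(filters, d, Ls):
    """Return a d x Ls nested list of channel-summed frequency correlations."""
    if Ls % d != 0:
        raise ValueError("d must divide Ls")
    responses = [dft(g, Ls) for g in filters]
    shift = Ls // d
    G = []
    for j in range(d):
        # Correlate each response with itself shifted by j * Ls / d bins.
        row = [sum(r[m] * r[(m + j * shift) % Ls].conjugate() for r in responses)
               for m in range(Ls)]
        G.append(row)
    return G


# Haar filters with stride 2: flat diagonal term and vanishing alias term.
s = 1.0 / math.sqrt(2.0)
G = frequency_correlation_sketch([[s, s], [s, -s]], d=2, Ls=4)
print([round(abs(v), 6) for v in G[0]], [round(abs(v), 6) for v in G[1]])
```

For this orthogonal two-channel example the alias row vanishes, which is the frequency-domain signature of perfect reconstruction at stride 2.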
- hybra.utils.modulate(g: Tensor, fc: float | int | Tensor, fs: int)
Modulate filters.
- Parameters:
g (list of torch.Tensor) – Filters.
fc (list) – Center frequencies.
fs (int) – Sampling rate.
- Returns:
Modulated filters.
- Return type:
g_mod (list of torch.Tensor)
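Modulation shifts a prototype filter to a center frequency by multiplying it with a complex exponential. The sketch below does this for a single filter and checks where the spectrum of the result peaks; the time origin at sample 0, the sign of the exponent, and the names `modulate_sketch` and `hann` are assumptions, and the library operates on tensors of several filters rather than on one Python list.

```python
import cmath
import math


def modulate_sketch(g, fc, fs):
    """Shift filter g to center frequency fc (Hz) at sampling rate fs (Hz)."""
    return [g[n] * cmath.exp(2j * math.pi * fc * n / fs) for n in range(len(g))]


def hann(n):
    """Periodic Hann window used as a low-pass prototype."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


N = 64
fs = 64  # chosen so that DFT bin m corresponds to m Hz
proto = hann(N)
g_mod = modulate_sketch(proto, 8.0, fs)

# Magnitude spectrum of the modulated filter via a naive DFT.
spectrum = [abs(sum(g_mod[n] * cmath.exp(-2j * math.pi * m * n / N) for n in range(N)))
            for m in range(N)]
peak_bin = max(range(N), key=spectrum.__getitem__)
print(peak_bin)
```

Modulation leaves the envelope of the filter untouched and only moves its passband, so a single prototype shape can serve every channel.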
- hybra.utils.plot_response(g, fs, scale='mel', plot_scale=False, fc_min=None, fc_max=None, kernel_min=None, decoder=False)
Plotting routine for the frequency scale and the frequency responses of the filters.
- Parameters:
g (numpy.ndarray) – Filters.
fs (int) – Sampling rate for plotting Hz.
scale (str) – Auditory scale.
plot_scale (bool) – Plot the scale or not.
fc_min (float) – Lower transition frequency in Hz.
fc_max (float) – Upper transition frequency in Hz.
kernel_min (int) – Minimum kernel size.
decoder (bool) – Plot for the synthesis fb.