Model Configuration
spectrans.config.models
Full model configuration schemas for spectrans.
This module provides the configuration models for complete spectrans models, including all of their components and parameters. These are the top-level configurations that are typically loaded from YAML files.
Classes:
| Name | Description |
|---|---|
| ModelConfig | Base configuration for all spectrans models. |
| FNetModelConfig | Configuration for FNet transformer models. |
| GFNetModelConfig | Configuration for Global Filter Network models. |
| AFNOModelConfig | Configuration for Adaptive Fourier Neural Operator models. |
| LSTModelConfig | Configuration for Linear Spectral Transform models. |
| SpectralAttentionModelConfig | Configuration for Spectral Attention transformer models. |
Notes
Full model configurations compose layer, block, and other component configurations into complete model architectures. They are typically what is loaded from YAML configuration files.
Examples:
>>> from spectrans.config.models import FNetModelConfig
>>> config = FNetModelConfig(
...     hidden_dim=768,
...     num_layers=12,
...     sequence_length=512,
... )
>>> print(config.model_type)
fnet
Classes
ModelConfig
Bases: BaseModel
Base configuration for all spectrans models.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_type | str | Type identifier for the model. |
| hidden_dim | int | Hidden dimension size, must be positive. |
| num_layers | int | Number of transformer layers, must be positive. |
| sequence_length | int | Maximum input sequence length, must be positive. |
| dropout | float | Global dropout probability, defaults to 0.0. |
| vocab_size | int \| None | Vocabulary size for token embeddings, optional. |
| num_classes | int \| None | Number of output classes for classification, optional. |
| ffn_hidden_dim | int \| None | Hidden dimension for feedforward network, optional. |
| use_positional_encoding | bool | Whether to use positional encoding, defaults to True. |
| positional_encoding_type | PositionalEncodingType | Type of positional encoding ('sinusoidal', 'learned', 'rotary', 'alibi', 'none'), defaults to 'sinusoidal'. |
| norm_eps | float | Layer normalization epsilon, defaults to 1e-12. |
| output_type | OutputHeadType | Type of output head ('classification', 'regression', 'sequence', 'lm', 'none'), defaults to 'classification'. |
| gradient_checkpointing | bool | Whether to use gradient checkpointing, defaults to False. |
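The positivity constraints and defaults listed for ModelConfig can be illustrated with a standard-library stand-in. This is only a sketch: `ModelConfigSketch` is a hypothetical dataclass that mirrors a subset of the documented fields, whereas the real class derives from `BaseModel`, so its validation errors and any additional checks may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in mirroring some documented ModelConfig fields."""

    hidden_dim: int
    num_layers: int
    sequence_length: int
    dropout: float = 0.0
    vocab_size: Optional[int] = None
    num_classes: Optional[int] = None
    use_positional_encoding: bool = True
    positional_encoding_type: str = "sinusoidal"
    norm_eps: float = 1e-12
    output_type: str = "classification"
    gradient_checkpointing: bool = False

    def __post_init__(self) -> None:
        # The attribute table states that these three sizes must be positive.
        for name in ("hidden_dim", "num_layers", "sequence_length"):
            value = getattr(self, name)
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")


config = ModelConfigSketch(hidden_dim=768, num_layers=12, sequence_length=512)
```

Only the three required sizes need to be passed; every other field falls back to the default given in the table.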
FNetModelConfig
Bases: ModelConfig
Configuration for FNet transformer models.
FNet models use Fourier mixing layers instead of attention.
Attributes:
| Name | Type | Description |
|---|---|---|
| use_real_fft | bool | Whether to use real FFT for efficiency, defaults to True. |
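One reason a real FFT can be cheaper is that the spectrum of a real-valued signal is conjugate-symmetric, so roughly half of the bins carry no extra information. The sketch below checks this with a naive O(n²) DFT written for illustration; it is not how spectrans computes its transforms, and the `dft` helper is hypothetical.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform, for illustration only."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
spectrum = dft(signal)
n = len(signal)

# For real input, bin k is the complex conjugate of bin n - k, so the
# upper half of the spectrum is redundant.
is_symmetric = all(
    abs(spectrum[k] - spectrum[n - k].conjugate()) < 1e-9 for k in range(1, n)
)

# A real FFT therefore only needs to store n // 2 + 1 bins.
num_real_fft_bins = n // 2 + 1
```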
GFNetModelConfig
Bases: ModelConfig
Configuration for Global Filter Network models.
GFNet models use learnable global filters in the frequency domain.
Attributes:
| Name | Type | Description |
|---|---|---|
| filter_activation | str | Activation function for filters ('sigmoid' or 'tanh'), defaults to 'sigmoid'. |
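The idea of a global filter in the frequency domain can be sketched as transform, scale each frequency bin, and transform back. The code below is an assumption-laden illustration: it uses a naive DFT instead of an FFT, real-valued gates instead of whatever parameterization spectrans uses, and it assumes the activation is applied to the raw filter weights. The names `dft`, `sigmoid`, and `global_filter` are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) DFT; the inverse includes the 1/n normalization."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_filter(sequence, raw_filter):
    """Scale each frequency bin by an activated filter weight (assumed usage)."""
    spectrum = dft(sequence)
    gated = [sigmoid(w) * s for w, s in zip(raw_filter, spectrum)]
    return [v.real for v in dft(gated, inverse=True)]


sequence = [1.0, -1.0, 2.0, 0.5, 0.0, 1.5, -0.5, 1.0]
# Large positive raw weights make sigmoid(w) close to 1, so the filter is
# close to the identity and the output should be close to the input.
near_identity = global_filter(sequence, [20.0] * len(sequence))
```

With 'sigmoid' each gate lies in (0, 1), so a filter of this form can only attenuate a frequency; 'tanh' would also allow a sign flip.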
AFNOModelConfig
Bases: ModelConfig
Configuration for Adaptive Fourier Neural Operator models.
AFNO models use adaptive Fourier mode truncation for efficient token mixing.
Attributes:
| Name | Type | Description |
|---|---|---|
| n_modes | int \| None | Number of Fourier modes to retain in sequence dimension, optional. |
| modes_seq | int \| None | Number of Fourier modes in sequence dimension (alias for n_modes), optional. |
| modes_hidden | int \| None | Number of Fourier modes in hidden dimension, optional. |
| compression_ratio | float | Compression ratio for modes_hidden when using n_modes, defaults to 0.5. |
| mlp_ratio | float | MLP expansion ratio in frequency domain, defaults to 2.0. |
LSTModelConfig
Bases: ModelConfig
Configuration for Linear Spectral Transform models.
LST models use linear spectral transforms (DCT/DST/Hadamard) for sequence mixing, achieving O(n log n) complexity through fast transform algorithms.
Attributes:
| Name | Type | Description |
|---|---|---|
| transform_type | TransformLSTType | Type of spectral transform to use, defaults to 'dct'. |
| use_conv_bias | bool | Whether to use bias in spectral convolution, defaults to True. |
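To make the O(n log n) claim concrete, here is a minimal sketch of one such fast algorithm, the fast Walsh-Hadamard transform. It is a generic textbook butterfly, not code from spectrans, and `fwht` is a hypothetical name; it performs log2(n) passes of n/2 add/subtract pairs.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform; length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:  # log2(n) passes
        for start in range(0, n, 2 * h):  # n / 2 butterflies per pass
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


signal = [1, 0, 1, 0, 0, 1, 1, 0]
transformed = fwht(signal)
# Applying the unnormalized transform twice scales the input by n.
round_trip = [v // len(signal) for v in fwht(transformed)]
```

The DCT and DST reach the same asymptotic cost through FFT-based algorithms rather than this butterfly.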
SpectralAttentionModelConfig
Bases: ModelConfig
Configuration for Spectral Attention transformer models.
Spectral attention models use Random Fourier Features (RFF) to approximate attention with linear complexity O(n) instead of quadratic O(n²).
Attributes:
| Name | Type | Description |
|---|---|---|
| num_features | int \| None | Number of random Fourier features, defaults to None (uses hidden_dim). |
| kernel_type | KernelType | Type of kernel ('gaussian', 'softmax'), defaults to 'gaussian'. |
| use_orthogonal | bool | Whether to use orthogonal random features, defaults to True. |
| num_heads | int | Number of attention heads, defaults to 8. |
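The approximation behind `num_features` and `kernel_type` can be sketched with plain random Fourier features for a unit-bandwidth Gaussian kernel. This is an illustrative sketch, not the spectrans implementation: it draws independent Gaussian frequencies rather than the orthogonal features that `use_orthogonal` refers to, and all function names are hypothetical.

```python
import math
import random


def random_fourier_features(x, weights, offsets):
    """Map x to z(x) so that z(x) . z(y) approximates a Gaussian kernel."""
    scale = math.sqrt(2.0 / len(weights))
    return [
        scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, offsets)
    ]


def gaussian_kernel(x, y):
    """exp(-||x - y||^2 / 2), i.e. unit bandwidth."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


rng = random.Random(0)
dim, num_features = 4, 4000
# Frequencies from N(0, I) and phases from U(0, 2*pi) target the unit-bandwidth kernel.
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]
offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]

x = [0.3, -0.2, 0.5, 0.1]
y = [0.1, 0.4, -0.3, 0.2]
exact = gaussian_kernel(x, y)
approx = sum(
    a * b
    for a, b in zip(
        random_fourier_features(x, weights, offsets),
        random_fourier_features(y, weights, offsets),
    )
)
```

Because the kernel becomes an inner product of feature vectors, key and value summaries can be accumulated once and reused for every query, which is where the O(n) cost comes from. More features reduce the approximation error at the price of more computation per token.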
FNOTransformerConfig
Bases: ModelConfig
Configuration for Fourier Neural Operator transformer models.
FNO models use spectral convolutions in the Fourier domain to learn mappings between function spaces with O(n log n) complexity.
Attributes:
| Name | Type | Description |
|---|---|---|
| modes | int | Number of Fourier modes to retain (frequency truncation), defaults to 32. |
| mlp_ratio | float | Expansion ratio for the MLP in FNO blocks, defaults to 2.0. |
| use_2d | bool | Whether to use 2D spectral convolutions for spatial data, defaults to False. |
| spatial_dim | int \| None | Spatial dimension when using 2D convolutions (sequence_length = spatial_dim²), optional. |
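The relation between `sequence_length` and `spatial_dim` for 2D convolutions can be checked before a config is built. The helper below is hypothetical and only encodes the stated relation sequence_length = spatial_dim²; whether spectrans infers a missing `spatial_dim` or requires it explicitly is not specified here, so the inference step is an assumption.

```python
import math
from typing import Optional


def resolve_spatial_dim(sequence_length: int, spatial_dim: Optional[int] = None) -> int:
    """Return a spatial_dim consistent with sequence_length == spatial_dim ** 2."""
    if spatial_dim is None:
        # Assumed convenience: infer the side length from a square sequence.
        spatial_dim = math.isqrt(sequence_length)
    if spatial_dim * spatial_dim != sequence_length:
        raise ValueError(
            f"sequence_length={sequence_length} is not spatial_dim**2 "
            f"for spatial_dim={spatial_dim}"
        )
    return spatial_dim


side = resolve_spatial_dim(196)  # e.g. a 14 x 14 grid of patches
```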
WaveletTransformerConfig
Bases: ModelConfig
Configuration for Wavelet Transformer models.
Wavelet transformers use discrete wavelet transforms (DWT) for sequence mixing, providing multi-resolution analysis with O(n) complexity.
Attributes:
| Name | Type | Description |
|---|---|---|
| wavelet | WaveletType | Type of wavelet to use ('db4', 'sym6', 'coif3', etc.), defaults to 'db4'. |
| levels | int | Number of decomposition levels (typically 1-5), defaults to 3. |
| mixing_mode | str | How to mix wavelet coefficients ('pointwise', 'channel', 'level'), defaults to 'pointwise'. |
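What `levels` controls, and why the total cost is O(n), can be seen in a small multi-level decomposition. The sketch uses the Haar wavelet because it fits in a few lines; it is not the default 'db4' and not the spectrans implementation, and the function names are hypothetical. Each level halves the approximation, so the work is n/2 + n/4 + ... pairs.

```python
import math


def haar_step(values):
    """One Haar analysis level: pairwise scaled sums and differences."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_dwt(values, levels):
    """Return the coarsest approximation and the detail coefficients per level."""
    if len(values) % (2 ** levels):
        raise ValueError("length must be divisible by 2 ** levels")
    approx, details = list(values), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
approx, details = haar_dwt(signal, levels=3)
coefficient_count = len(approx) + sum(len(d) for d in details)
```

The decomposition yields as many coefficients as input samples, grouped by resolution, and those per-level groups are what a mixing step would operate on.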
HybridModelConfig
Bases: ModelConfig
Configuration for Hybrid Spectral-Spatial Transformer models.
Hybrid models alternate between different mixing strategies (e.g., spectral and spatial) across layers, combining strengths of multiple approaches.
Attributes:
| Name | Type | Description |
|---|---|---|
| spectral_type | str | Type of spectral mixing ('fourier', 'wavelet', 'afno', 'gfnet'), defaults to 'fourier'. |
| spatial_type | str | Type of spatial mixing ('attention', 'spectral_attention', 'lst'), defaults to 'attention'. |
| alternation_pattern | str | How to alternate layers ('even_spectral', 'alternate', 'custom'), defaults to 'even_spectral'. |
| num_heads | int | Number of attention heads for spatial layers, defaults to 8. |
| spectral_config | dict \| None | Additional configuration for spectral layers, optional. |
| spatial_config | dict \| None | Additional configuration for spatial layers, optional. |
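A layer plan for the default `alternation_pattern` might look like the sketch below. The semantics are an assumption based on the name alone: 'even_spectral' is read as "even-indexed layers are spectral, odd-indexed layers are spatial". The `layer_kinds` helper is hypothetical, and the 'alternate' and 'custom' patterns are left out because their behavior is not described here.

```python
def layer_kinds(num_layers: int, alternation_pattern: str = "even_spectral"):
    """Return 'spectral' or 'spatial' for each layer index (assumed semantics)."""
    if alternation_pattern != "even_spectral":
        raise NotImplementedError("only 'even_spectral' is sketched here")
    # Assumption: even-indexed layers (0, 2, ...) are spectral, the rest spatial.
    return ["spectral" if i % 2 == 0 else "spatial" for i in range(num_layers)]


plan = layer_kinds(4)
```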