Configuration System
spectrans.config
Configuration system for spectral transformer models.
This module provides configuration management for spectrans models and components using Pydantic for type safety and validation. The configuration system supports YAML-based configuration files, programmatic configuration, and dynamic model building.
Modules:
| Name | Description |
|---|---|
| builder | YAML configuration loading and model building. |
| core | Base configuration classes. |
| layers | Layer-specific configuration schemas. |
| models | Model configuration schemas. |
Classes:
| Name | Description |
|---|---|
| AFNOMixingConfig | Configuration for AFNO mixing layers. |
| AFNOModelConfig | Configuration for AFNO models. |
| AttentionLayerConfig | Base configuration for attention layers. |
| BaseLayerConfig | Base configuration for all layers. |
| ConfigBuilder | Builder for creating models from configuration. |
| ConfigurationError | Exception for configuration errors. |
| DCTAttentionConfig | Configuration for DCT attention layers. |
| FilterLayerConfig | Configuration for filter-based layers. |
| FNetModelConfig | Configuration for FNet models. |
| FNOTransformerConfig | Configuration for FNO transformer models. |
| FourierMixingConfig | Configuration for Fourier mixing layers. |
| GFNetModelConfig | Configuration for GFNet models. |
| GlobalFilterMixingConfig | Configuration for global filter mixing. |
| HadamardAttentionConfig | Configuration for Hadamard attention. |
| HybridModelConfig | Configuration for hybrid models. |
| LSTAttentionConfig | Configuration for LST attention. |
| LSTModelConfig | Configuration for LST models. |
| MixedTransformAttentionConfig | Configuration for mixed transform attention. |
| ModelConfig | Base configuration for all models. |
| SpectralAttentionConfig | Configuration for spectral attention. |
| SpectralAttentionModelConfig | Configuration for spectral attention models. |
| SpectralKernelAttentionConfig | Configuration for spectral kernel attention. |
| UnitaryLayerConfig | Configuration for unitary layers. |
| WaveletMixing2DConfig | Configuration for 2D wavelet mixing. |
| WaveletMixingConfig | Configuration for wavelet mixing. |
| WaveletTransformerConfig | Configuration for wavelet transformers. |
Functions:
| Name | Description |
|---|---|
| build_component_from_config | Build a component from configuration dictionary. |
| build_model_from_config | Build a model from configuration dictionary. |
| load_yaml_config | Load configuration from YAML file. |
Examples:
Loading and building from YAML:
>>> from spectrans.config import ConfigBuilder
>>>
>>> builder = ConfigBuilder()
>>> model = builder.build_model("configs/fnet.yaml")
>>> print(model.num_parameters())
Programmatic configuration:
>>> from spectrans.config import FNetModelConfig, build_model_from_config
>>>
>>> config = FNetModelConfig(
... hidden_dim=768,
... num_layers=12,
... vocab_size=50000,
... sequence_length=512
... )
>>> model = build_model_from_config(config.model_dump())
Custom layer configuration:
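>>> from spectrans.config import GlobalFilterMixingConfig, build_component_from_config
>>>
>>> layer_config = GlobalFilterMixingConfig(
... hidden_dim=512,
... sequence_length=1024,
... activation="sigmoid",
... filter_regularization=0.01
... )
>>> # "global_filter" is an assumed component type name; use the name registered for this layer.
>>> layer = build_component_from_config("global_filter", layer_config.model_dump())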
Notes
Configuration System Design:
The configuration system uses Pydantic for:
- Type validation and coercion
- Default value management
- Nested configuration structures
- JSON/YAML serialization
Configuration hierarchy:
- Base classes (BaseLayerConfig, ModelConfig)
- Specialized layer configs (mixing, attention, etc.)
- Model configurations
- Builder system for instantiation
All configurations support:
- Validation of parameter ranges
- Type checking at configuration time
- Serialization to/from YAML and JSON
- Programmatic and file-based configuration
See Also
spectrans.config.builder : Configuration building utilities.
spectrans.config.models : Model configuration schemas.
spectrans.config.layers : Layer configuration schemas.
Classes
ConfigBuilder
Type-safe builder for spectrans models and components.
The ConfigBuilder provides methods to load YAML configurations and construct models/components with full type safety and validation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict_validation | bool | Whether to use strict validation mode, defaults to True. | True |
Methods:
| Name | Description |
|---|---|
| load_yaml | Load and parse YAML configuration file. |
| build_model | Build a model from a YAML configuration file. |
| build_model_from_dict | Build a model from a configuration dictionary. |
| build_layer | Build a layer from configuration dictionary. |
| validate_config | Validate a configuration dictionary without building components. |
Source code in spectrans/config/builder.py
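The dispatch step behind a method such as build_model_from_dict can be pictured as a registry lookup keyed on model_type. The following is a minimal sketch under that assumption, not the library's implementation: MODEL_REGISTRY, build_model_sketch, and the stub factory are hypothetical names, and ConfigurationError is redefined locally so the snippet runs without spectrans installed.

```python
from typing import Any, Callable, Dict


class ConfigurationError(Exception):
    """Local stand-in for the documented exception."""


# Hypothetical registry: model_type -> factory accepting the remaining fields.
MODEL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "fnet": lambda **params: ("FNetStub", params),
}


def build_model_sketch(config_dict: Dict[str, Any]) -> Any:
    """Look up a factory by model_type and call it with the other fields."""
    params = dict(config_dict)  # copy so the caller's dict is not mutated
    model_type = params.pop("model_type", None)
    if model_type not in MODEL_REGISTRY:
        raise ConfigurationError(f"unknown model_type: {model_type!r}")
    try:
        return MODEL_REGISTRY[model_type](**params)
    except (TypeError, ValueError) as exc:
        # Surface construction failures as a single configuration error type.
        raise ConfigurationError(str(exc)) from exc


model = build_model_sketch({"model_type": "fnet", "hidden_dim": 768})
```

Wrapping lower-level exceptions in one error type mirrors the Raises tables in this section, where every builder method reports ConfigurationError.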
Functions
load_yaml
Load and parse YAML configuration file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str \| Path | Path to the YAML configuration file. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Parsed configuration dictionary. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the file cannot be loaded or parsed. |
Source code in spectrans/config/builder.py
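The load-or-raise contract described for load_yaml can be illustrated with a small loader that converts I/O and parse failures into ConfigurationError. This is only a sketch: it parses JSON with the standard library instead of YAML so that it has no third-party dependency, and load_config_sketch and the local ConfigurationError are hypothetical stand-ins rather than spectrans code.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


class ConfigurationError(Exception):
    """Local stand-in for the documented exception."""


def load_config_sketch(config_path: Union[str, Path]) -> Dict[str, Any]:
    """Read a config file and return its top-level mapping."""
    path = Path(config_path)
    try:
        # JSON is used here as a stdlib substitute for YAML parsing.
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # OSError covers missing or unreadable files; ValueError covers parse errors.
        raise ConfigurationError(f"cannot load {path}: {exc}") from exc
    if not isinstance(data, dict):
        raise ConfigurationError(f"{path} must contain a mapping at the top level")
    return data
```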
build_model
Build a model from a YAML configuration file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str \| Path | Path to the YAML configuration file. | required |
Returns:
| Type | Description |
|---|---|
| Any | The constructed model instance. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the model cannot be built from the configuration. |
Source code in spectrans/config/builder.py
build_model_from_dict
Build a model from a configuration dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_dict | dict[str, Any] | Configuration dictionary containing model parameters. | required |
Returns:
| Type | Description |
|---|---|
| Any | The constructed model instance. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the model cannot be built from the configuration. |
Source code in spectrans/config/builder.py
build_layer
Build a layer from configuration dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layer_type | str | Type of layer to build. | required |
| config_dict | dict[str, Any] | Configuration dictionary containing layer parameters. | required |
Returns:
| Type | Description |
|---|---|
| Any | The constructed layer instance. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the layer cannot be built from the configuration. |
Source code in spectrans/config/builder.py
validate_config
Validate a configuration dictionary without building components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_dict | dict[str, Any] | Configuration dictionary to validate. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Validated configuration dictionary. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the configuration is invalid. |
Source code in spectrans/config/builder.py
ConfigurationError
Bases: Exception
Exception raised for configuration-related errors.
AttentionLayerConfig
Bases: BaseLayerConfig
Configuration for attention-based layers.
Attributes:
| Name | Type | Description |
|---|---|---|
| num_heads | int | Number of attention heads, must be positive, defaults to 8. |
| head_dim | int \| None | Dimension per head, defaults to None (auto-computed). |
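The text says head_dim is auto-computed when left as None but does not give the rule. A common convention, assumed here and not confirmed by the source, is hidden_dim // num_heads with hidden_dim required to be divisible by num_heads. The helper name resolve_head_dim is hypothetical.

```python
from typing import Optional


def resolve_head_dim(hidden_dim: int, num_heads: int = 8,
                     head_dim: Optional[int] = None) -> int:
    """Return the explicit head_dim, or derive one from hidden_dim and num_heads."""
    if num_heads <= 0:
        raise ValueError("num_heads must be positive")
    if head_dim is not None:
        return head_dim  # an explicit value always wins
    # Assumed rule: split the hidden dimension evenly across heads.
    if hidden_dim % num_heads != 0:
        raise ValueError("hidden_dim must be divisible by num_heads")
    return hidden_dim // num_heads


print(resolve_head_dim(512))  # 64
```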
BaseLayerConfig
Bases: BaseModel
Base configuration for all neural network layers.
Attributes:
| Name | Type | Description |
|---|---|---|
| hidden_dim | int | Hidden dimension size, must be positive. |
| dropout | float | Dropout probability, must be between 0.0 and 1.0, defaults to 0.0. |
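The two constraints listed for BaseLayerConfig can be mimicked with a standard-library dataclass to show what validation at construction time looks like. This is a stand-in for illustration only: the real class is a Pydantic model, and LayerConfigSketch is a hypothetical name.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LayerConfigSketch:
    """Stdlib stand-in mirroring the documented BaseLayerConfig constraints."""

    hidden_dim: int
    dropout: float = 0.0

    def __post_init__(self) -> None:
        # hidden_dim must be positive.
        if self.hidden_dim <= 0:
            raise ValueError("hidden_dim must be positive")
        # dropout must lie in the closed interval [0.0, 1.0].
        if not 0.0 <= self.dropout <= 1.0:
            raise ValueError("dropout must be between 0.0 and 1.0")


config = LayerConfigSketch(hidden_dim=512)
print(asdict(config))  # {'hidden_dim': 512, 'dropout': 0.0}
```

Invalid values fail when the object is created, so a bad configuration never reaches model construction.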
FilterLayerConfig
Bases: BaseLayerConfig
Configuration for layers using learnable spectral filters.
Attributes:
| Name | Type | Description |
|---|---|---|
| sequence_length | int | Input sequence length, must be positive. |
| learnable_filters | bool | Whether filters are learnable, defaults to True. |
| fft_norm | FFTNorm | FFT normalization mode, defaults to "ortho". |
| filter_init_std | float | Standard deviation for filter initialization, defaults to 0.02. |
UnitaryLayerConfig
Bases: BaseLayerConfig
Configuration for layers that preserve energy/unitarity.
Attributes:
| Name | Type | Description |
|---|---|---|
| norm_eps | float | Epsilon for numerical stability in normalization, defaults to 1e-5. |
| energy_tolerance | float | Tolerance for energy preservation checks, defaults to 1e-4. |
| fft_norm | FFTNorm | FFT normalization mode, defaults to "ortho". |
AFNOMixingConfig
Bases: BaseLayerConfig
Configuration for Adaptive Fourier Neural Operator mixing layers.
Attributes:
| Name | Type | Description |
|---|---|---|
| max_sequence_length | int | Maximum sequence length for mode truncation, must be positive. |
| modes_seq | int \| None | Number of Fourier modes in sequence dimension, defaults to max_sequence_length // 2. |
| modes_hidden | int \| None | Number of Fourier modes in hidden dimension, defaults to hidden_dim // 2. |
| mlp_ratio | float | MLP expansion ratio in frequency domain, defaults to 2.0. |
| activation | ActivationType | Activation function for MLP, defaults to "gelu". |
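The two mode defaults documented above reduce to simple integer arithmetic. As a small sketch (resolve_afno_modes is a hypothetical helper, not a spectrans function), the fallback rule can be written as:

```python
from typing import Optional, Tuple


def resolve_afno_modes(hidden_dim: int, max_sequence_length: int,
                       modes_seq: Optional[int] = None,
                       modes_hidden: Optional[int] = None) -> Tuple[int, int]:
    """Apply the documented defaults for unset mode counts."""
    if modes_seq is None:
        modes_seq = max_sequence_length // 2  # documented default
    if modes_hidden is None:
        modes_hidden = hidden_dim // 2  # documented default
    return modes_seq, modes_hidden


print(resolve_afno_modes(hidden_dim=512, max_sequence_length=1024))  # (512, 256)
```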
DCTAttentionConfig
Bases: AttentionLayerConfig
Configuration for DCT-based attention layer.
Attributes:
| Name | Type | Description |
|---|---|---|
| dct_type | int | Type of DCT transform (typically 2), defaults to 2. |
| learnable_scale | bool | Whether to use learnable diagonal scaling, defaults to True. |
FourierMixingConfig
Bases: UnitaryLayerConfig
Configuration for standard Fourier mixing layers.
Attributes:
| Name | Type | Description |
|---|---|---|
| keep_complex | bool | If True, keeps complex values from FFT. If False (default), takes only the real part as in original FNet. |
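To make the keep_complex switch concrete, the sketch below applies a 2D discrete Fourier transform over a small (sequence, hidden) matrix and either returns the complex spectrum or keeps only its real part, as FNet does. It uses a naive O(n²) DFT from the standard library purely for readability, with "ortho" scaling assumed to match the documented fft_norm default; a real layer would use an FFT. The names dft2_ortho and fourier_mix are hypothetical.

```python
import cmath
import math


def dft2_ortho(matrix):
    """Naive 2D DFT with orthonormal (1/sqrt(rows*cols)) scaling."""
    rows, cols = len(matrix), len(matrix[0])
    norm = math.sqrt(rows * cols)
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2.0 * math.pi * (u * r / rows + v * c / cols)
                    total += matrix[r][c] * cmath.exp(1j * angle)
            out_row.append(total / norm)
        out.append(out_row)
    return out


def fourier_mix(matrix, keep_complex=False):
    """Mix tokens with a 2D DFT; drop the imaginary part unless keep_complex."""
    spectrum = dft2_ortho(matrix)
    if keep_complex:
        return spectrum
    return [[z.real for z in row] for row in spectrum]


mixed = fourier_mix([[1.0, 1.0], [1.0, 1.0]])
```

For the constant 2×2 input, all energy lands in the zero-frequency entry, which is why `mixed[0][0]` is close to 2.0 and the remaining entries are close to 0.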
GlobalFilterMixingConfig
Bases: FilterLayerConfig
Configuration for global filter mixing layers.
Attributes:
| Name | Type | Description |
|---|---|---|
| activation | ActivationType | Activation function for filters, defaults to "sigmoid". |
HadamardAttentionConfig
Bases: AttentionLayerConfig
Configuration for Hadamard-based attention layer.
Attributes:
| Name | Type | Description |
|---|---|---|
| scale_by_sqrt | bool | Whether to scale by sqrt(n), defaults to True. |
| learnable_scale | bool | Whether to use learnable diagonal scaling, defaults to True. |
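The effect of scale_by_sqrt is easiest to see on the transform itself. The sketch below is a plain fast Walsh-Hadamard transform; it assumes that "scale by sqrt(n)" means dividing by sqrt(n), which makes the transform orthonormal (norm-preserving and its own inverse). The function name fwht is hypothetical and this is not the layer's actual code.

```python
import math


def fwht(values, scale_by_sqrt=True):
    """Fast Walsh-Hadamard transform of a power-of-two-length sequence."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    if scale_by_sqrt:
        # Assumed meaning of scale_by_sqrt: divide by sqrt(n) for orthonormality.
        out = [v / math.sqrt(n) for v in out]
    return out


print(fwht([1.0, 2.0, 3.0, 4.0]))  # [5.0, -1.0, -2.0, 0.0]
```

With the scaling on, the squared norm is unchanged (1 + 4 + 9 + 16 = 25 + 1 + 4 + 0 = 30) and applying fwht twice recovers the input.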
LSTAttentionConfig
Bases: AttentionLayerConfig
Configuration for Linear Spectral Transform Attention.
Attributes:
| Name | Type | Description |
|---|---|---|
| transform_type | TransformLSTType | Type of spectral transform ('dct', 'dst', 'hadamard', 'mixed'), defaults to 'dct'. |
| learnable_scale | bool | Whether to use learnable diagonal scaling, defaults to True. |
| normalize | bool | Whether to normalize transform output, defaults to True. |
| use_bias | bool | Whether to use bias in projections, defaults to True. |
MixedTransformAttentionConfig
Bases: AttentionLayerConfig
Configuration for mixed transform attention layer.
Attributes:
| Name | Type | Description |
|---|---|---|
| use_fft | bool | Whether to use FFT transforms, defaults to True. |
| use_dct | bool | Whether to use DCT transforms, defaults to True. |
| use_hadamard | bool | Whether to use Hadamard transforms, defaults to True. |
SpectralAttentionConfig
Bases: AttentionLayerConfig
Configuration for Spectral Attention with Random Fourier Features.
Attributes:
| Name | Type | Description |
|---|---|---|
| num_features | int \| None | Number of random Fourier features, defaults to None (uses head_dim). |
| kernel_type | KernelType | Type of kernel ('gaussian' or 'softmax'), defaults to 'softmax'. |
| use_orthogonal | bool | Whether to use orthogonal random features, defaults to True. |
| feature_redraw | bool | Whether to redraw features during training, defaults to False. |
| use_bias | bool | Whether to use bias in projections, defaults to True. |
SpectralKernelAttentionConfig
Bases: AttentionLayerConfig
Configuration for spectral kernel attention.
Attributes:
| Name | Type | Description |
|---|---|---|
| kernel_type | SpectralKernelType | Type of spectral kernel ('gaussian', 'polynomial', 'spectral'), defaults to 'gaussian'. |
| rank | int \| None | Rank for low-rank approximation, defaults to None (uses min(64, head_dim)). |
| num_features | int \| None | Number of features for approximation, defaults to None. |
WaveletMixing2DConfig
Bases: BaseModel
Configuration model for WaveletMixing2D layer.
Attributes:
| Name | Type | Description |
|---|---|---|
| channels | int | Number of input channels, must be positive. |
| wavelet | WaveletType | Wavelet family name, defaults to "db4". |
| levels | int | Number of decomposition levels, must be between 1 and 6, defaults to 2. |
| mixing_mode | Literal['subband', 'channel'] | 2D mixing operation mode, defaults to "subband". |
Methods:
| Name | Description |
|---|---|
| validate_wavelet | Validate that wavelet name is supported. |
WaveletMixingConfig
Bases: BaseLayerConfig
Configuration model for WaveletMixing layer.
Attributes:
| Name | Type | Description |
|---|---|---|
| wavelet | WaveletType | Wavelet family name, defaults to "db4". |
| levels | int | Number of decomposition levels, must be between 1 and 6, defaults to 3. |
| mixing_mode | Literal['pointwise', 'subband'] | Mixing operation mode, defaults to "pointwise". |
Methods:
| Name | Description |
|---|---|
| validate_wavelet | Validate that wavelet name is supported. |
AFNOModelConfig
Bases: ModelConfig
Configuration for Adaptive Fourier Neural Operator models.
AFNO models use adaptive Fourier mode truncation for efficient token mixing.
Attributes:
| Name | Type | Description |
|---|---|---|
| n_modes | int \| None | Number of Fourier modes to retain in sequence dimension, optional. |
| modes_seq | int \| None | Number of Fourier modes in sequence dimension (alias for n_modes), optional. |
| modes_hidden | int \| None | Number of Fourier modes in hidden dimension, optional. |
| compression_ratio | float | Compression ratio for modes_hidden when using n_modes, defaults to 0.5. |
| mlp_ratio | float | MLP expansion ratio in frequency domain, defaults to 2.0. |
FNetModelConfig
Bases: ModelConfig
Configuration for FNet transformer models.
FNet models use Fourier mixing layers instead of attention.
Attributes:
| Name | Type | Description |
|---|---|---|
| use_real_fft | bool | Whether to use real FFT for efficiency, defaults to True. |
FNOTransformerConfig
Bases: ModelConfig
Configuration for Fourier Neural Operator transformer models.
FNO models use spectral convolutions in the Fourier domain to learn mappings between function spaces with O(n log n) complexity.
Attributes:
| Name | Type | Description |
|---|---|---|
| modes | int | Number of Fourier modes to retain (frequency truncation), defaults to 32. |
| mlp_ratio | float | Expansion ratio for the MLP in FNO blocks, defaults to 2.0. |
| use_2d | bool | Whether to use 2D spectral convolutions for spatial data, defaults to False. |
| spatial_dim | int \| None | Spatial dimension when using 2D convolutions (sequence = spatial_dim²), optional. |
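The relation sequence = spatial_dim² implies a consistency check between sequence_length and spatial_dim whenever use_2d is set. A minimal sketch of such a check follows; the helper name resolve_spatial_dim is hypothetical, and inferring spatial_dim from a perfect-square sequence length when it is omitted is an assumption, not documented behavior.

```python
import math
from typing import Optional


def resolve_spatial_dim(sequence_length: int, use_2d: bool = False,
                        spatial_dim: Optional[int] = None) -> Optional[int]:
    """Check (or infer) spatial_dim so that sequence_length == spatial_dim ** 2."""
    if not use_2d:
        return None  # 1D mode: spatial_dim is not used
    if spatial_dim is None:
        # Assumption: infer the side length when the sequence is a perfect square.
        spatial_dim = math.isqrt(sequence_length)
    if spatial_dim * spatial_dim != sequence_length:
        raise ValueError("sequence_length must equal spatial_dim ** 2 when use_2d is True")
    return spatial_dim


print(resolve_spatial_dim(196, use_2d=True))  # 14
```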
GFNetModelConfig
Bases: ModelConfig
Configuration for Global Filter Network models.
GFNet models use learnable global filters in the frequency domain.
Attributes:
| Name | Type | Description |
|---|---|---|
| filter_activation | str | Activation function for filters ('sigmoid' or 'tanh'), defaults to 'sigmoid'. |
HybridModelConfig
Bases: ModelConfig
Configuration for Hybrid Spectral-Spatial Transformer models.
Hybrid models alternate between different mixing strategies (e.g., spectral and spatial) across layers, combining strengths of multiple approaches.
Attributes:
| Name | Type | Description |
|---|---|---|
| spectral_type | str | Type of spectral mixing ('fourier', 'wavelet', 'afno', 'gfnet'), defaults to 'fourier'. |
| spatial_type | str | Type of spatial mixing ('attention', 'spectral_attention', 'lst'), defaults to 'attention'. |
| alternation_pattern | str | How to alternate layers ('even_spectral', 'alternate', 'custom'), defaults to 'even_spectral'. |
| num_heads | int | Number of attention heads for spatial layers, defaults to 8. |
| spectral_config | dict \| None | Additional configuration for spectral layers, optional. |
| spatial_config | dict \| None | Additional configuration for spatial layers, optional. |
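The source does not define what each alternation_pattern value does. Reading 'even_spectral' literally, one plausible interpretation is that even-indexed layers (0, 2, 4, ...) use the spectral mixer and odd-indexed layers use the spatial one. The sketch below encodes only that assumed reading; layer_schedule is a hypothetical helper and the other patterns are deliberately left out.

```python
from typing import List


def layer_schedule(num_layers: int, spectral_type: str = "fourier",
                   spatial_type: str = "attention") -> List[str]:
    """Assumed 'even_spectral' layout: spectral on even indices, spatial on odd."""
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    return [spectral_type if index % 2 == 0 else spatial_type
            for index in range(num_layers)]


print(layer_schedule(4))  # ['fourier', 'attention', 'fourier', 'attention']
```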
LSTModelConfig
Bases: ModelConfig
Configuration for Linear Spectral Transform models.
LST models use linear spectral transforms (DCT/DST/Hadamard) for sequence mixing, achieving O(n log n) complexity through fast transform algorithms.
Attributes:
| Name | Type | Description |
|---|---|---|
| transform_type | TransformLSTType | Type of spectral transform to use, defaults to "dct". |
| use_conv_bias | bool | Whether to use bias in spectral convolution, defaults to True. |
ModelConfig
Bases: BaseModel
Base configuration for all spectrans models.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_type | str | Type identifier for the model. |
| hidden_dim | int | Hidden dimension size, must be positive. |
| num_layers | int | Number of transformer layers, must be positive. |
| sequence_length | int | Maximum input sequence length, must be positive. |
| dropout | float | Global dropout probability, defaults to 0.0. |
| vocab_size | int \| None | Vocabulary size for token embeddings, optional. |
| num_classes | int \| None | Number of output classes for classification, optional. |
| ffn_hidden_dim | int \| None | Hidden dimension for feedforward network, optional. |
| use_positional_encoding | bool | Whether to use positional encoding, defaults to True. |
| positional_encoding_type | PositionalEncodingType | Type of positional encoding ('sinusoidal', 'learned', 'rotary', 'alibi', 'none'), defaults to 'sinusoidal'. |
| norm_eps | float | Layer normalization epsilon, defaults to 1e-12. |
| output_type | OutputHeadType | Type of output head ('classification', 'regression', 'sequence', 'lm', 'none'), defaults to 'classification'. |
| gradient_checkpointing | bool | Whether to use gradient checkpointing, defaults to False. |
SpectralAttentionModelConfig
Bases: ModelConfig
Configuration for Spectral Attention transformer models.
Spectral attention models use Random Fourier Features (RFF) to approximate attention with linear complexity O(n) instead of quadratic O(n²).
Attributes:
| Name | Type | Description |
|---|---|---|
| num_features | int \| None | Number of random Fourier features, defaults to None (uses hidden_dim). |
| kernel_type | KernelType | Type of kernel ('gaussian', 'softmax'), defaults to 'gaussian'. |
| use_orthogonal | bool | Whether to use orthogonal random features, defaults to True. |
| num_heads | int | Number of attention heads, defaults to 8. |
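The Random Fourier Features idea behind these options can be shown in a few lines: a random feature map φ is built so that φ(x)·φ(y) approximates a Gaussian kernel, which lets attention be computed as φ(Q)(φ(K)ᵀV) without forming the n×n score matrix. The sketch below covers only the kernel approximation step, uses plain (non-orthogonal) Gaussian projections, and relies solely on the standard library; make_rff and gaussian_kernel are hypothetical names, not spectrans functions.

```python
import math
import random


def make_rff(dim, num_features, seed=0):
    """Build a feature map phi with phi(x).phi(y) ~ exp(-||x - y||^2 / 2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from a standard normal, phases uniformly on [0, 2*pi).
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return features


def gaussian_kernel(x, y):
    """Exact Gaussian kernel with unit bandwidth."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


phi = make_rff(dim=4, num_features=2000)
x = [0.1, -0.2, 0.3, 0.0]
y = [0.0, 0.1, 0.2, -0.1]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = gaussian_kernel(x, y)
print(abs(approx - exact))  # small; the error shrinks roughly as 1/sqrt(num_features)
```

Raising num_features tightens the approximation at the cost of more computation per token, which is the trade-off the num_features setting controls.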
WaveletTransformerConfig
Bases: ModelConfig
Configuration for Wavelet Transformer models.
Wavelet transformers use discrete wavelet transforms (DWT) for sequence mixing, providing multi-resolution analysis with O(n) complexity.
Attributes:
| Name | Type | Description |
|---|---|---|
| wavelet | WaveletType | Type of wavelet to use ('db4', 'sym6', 'coif3', etc.), defaults to 'db4'. |
| levels | int | Number of decomposition levels (typically 1-5), defaults to 3. |
| mixing_mode | str | How to mix wavelet coefficients ('pointwise', 'channel', 'level'), defaults to 'pointwise'. |
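To illustrate what the levels setting means and why a DWT costs O(n), the sketch below runs a multi-level decomposition with the Haar wavelet. Haar is swapped in for the default 'db4' because it needs no boundary handling; the structure (each level halves the approximation and emits one detail band) is the same. The function name haar_dwt is hypothetical.

```python
import math


def haar_dwt(signal, levels=3):
    """Multi-level orthonormal Haar DWT: returns (approximation, detail bands)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2 ** levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail band: scaled differences; approximation: scaled sums.
        details.append([(a - b) / math.sqrt(2.0) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
    return approx, details


approx, details = haar_dwt([1.0] * 8, levels=3)
print([len(band) for band in details])  # [4, 2, 1]
```

Each level processes half as many samples as the previous one, so the total work is n/2 + n/4 + ... < n, which gives the linear complexity mentioned above.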
Functions
build_component_from_config
Build component from configuration dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| component_type | str | Type of component to build. | required |
| config_dict | dict[str, Any] | Configuration dictionary. | required |
Returns:
| Type | Description |
|---|---|
| Any | The constructed component. |
Source code in spectrans/config/builder.py
build_model_from_config
Build model from configuration dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_dict | dict[str, Any] | Configuration dictionary. | required |
Returns:
| Type | Description |
|---|---|
| Any | The constructed model. |
Source code in spectrans/config/builder.py
load_yaml_config
Load YAML configuration file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str \| Path | Path to the configuration file. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Parsed configuration dictionary. |