Cosine Transforms¶
spectrans.transforms.cosine ¶
Discrete Cosine and Sine Transform implementations.
This module implements the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) families, which are orthogonal transforms widely used in signal processing, image compression, and spectral neural networks. The DCT, DCT2D, and DST classes take a normalized flag that selects between orthonormal and unnormalized scaling.
For smooth, natural signals the DCT and DST concentrate most of the energy in a few low-order coefficients (energy compaction). With orthonormal scaling they also preserve vector norms, which supports stable training in neural networks.
Classes:
| Name | Description |
|---|---|
| `DCT` | Discrete Cosine Transform Type-II (most common DCT variant). |
| `DCT2D` | 2D Discrete Cosine Transform for image-like data. |
| `DST` | Discrete Sine Transform Type-II. |
| `MDCT` | Modified Discrete Cosine Transform for audio processing. |
Examples:
Basic DCT usage:
>>> import torch
>>> from spectrans.transforms.cosine import DCT
>>> dct = DCT(normalized=True)
>>> signal = torch.randn(32, 512)
>>> dct_coeffs = dct.transform(signal, dim=-1)
>>> reconstructed = dct.inverse_transform(dct_coeffs, dim=-1)
2D DCT for image processing:
>>> from spectrans.transforms.cosine import DCT2D
>>> dct2d = DCT2D(normalized=True)
>>> image = torch.randn(32, 64, 64) # Batch of 64x64 images
>>> dct_image = dct2d.transform(image, dim=(-2, -1))
DST for sine-based analysis:
>>> from spectrans.transforms.cosine import DST
>>> dst = DST(normalized=True)
>>> dst_coeffs = dst.transform(signal, dim=-1)
MDCT for overlapped transforms:
>>> from spectrans.transforms.cosine import MDCT
>>> mdct = MDCT(block_size=1024, window="sine")
>>> audio_signal = torch.randn(1, 16384)  # Length is a multiple of block_size // 2
>>> overlapped_coeffs = mdct.transform(audio_signal, dim=-1)
Notes
Mathematical Formulations:
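DCT Type-II (most common):
\[X[k] = \alpha_k \sum_{n=0}^{N-1} x[n] \cos\left(\frac{\pi (2n+1) k}{2N}\right), \quad k = 0, \ldots, N-1\]
Where \(\alpha_k = \sqrt{\frac{1}{N}}\) if \(k=0\), \(\alpha_k = \sqrt{\frac{2}{N}}\) if \(k>0\) (for orthonormal normalization)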
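As a point of reference, the DCT-II sum with the \(\alpha_k\) scaling can be evaluated directly in \(O(N^2)\) time. The sketch below is dependency-free pure Python; the function name `dct2_direct` is hypothetical and is not part of spectrans, which operates on torch tensors.

```python
import math

def dct2_direct(x, normalized=True):
    """Direct O(N^2) DCT-II of a 1-D sequence (reference sketch only)."""
    n_points = len(x)
    coeffs = []
    for k in range(n_points):
        # X[k] = sum_n x[n] * cos(pi * (2n + 1) * k / (2N))
        total = sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
            for n in range(n_points)
        )
        if normalized:
            # Orthonormal scaling: sqrt(1/N) for k = 0, sqrt(2/N) otherwise.
            total *= math.sqrt((1.0 if k == 0 else 2.0) / n_points)
        coeffs.append(total)
    return coeffs

signal = [1.0, 2.0, 3.0, 4.0]
coeffs = dct2_direct(signal)
```

With orthonormal scaling, `coeffs[0]` equals the sum of the samples divided by \(\sqrt{N}\), and the squared norm of `coeffs` matches that of `signal`.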
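DST Type-II (the variant implemented by the DST class):
\[X[k] = \beta_k \sum_{n=0}^{N-1} x[n] \sin\left(\frac{\pi (2n+1)(k+1)}{2N}\right), \quad k = 0, \ldots, N-1\]
Where \(\beta_k = \sqrt{\frac{1}{N}}\) if \(k=N-1\), \(\beta_k = \sqrt{\frac{2}{N}}\) if \(k<N-1\) (for orthonormal normalization)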
Orthogonality Properties:
- With orthonormal normalization, the DCT and DST matrices are orthogonal: \(\mathbf{T}^T \mathbf{T} = \mathbf{I}\)
- Perfect reconstruction: \(\mathbf{x} = \text{DCT}^{-1}(\text{DCT}(\mathbf{x}))\)
- Energy conservation: \(\|\text{DCT}(\mathbf{x})\|^2 = \|\mathbf{x}\|^2\) (with proper normalization)
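These three properties can be checked numerically on a small example. The following sketch builds the orthonormal DCT-II matrix explicitly in pure Python; the helper names (`dct2_matrix`, `matvec`, `transpose`) are illustrative and are not part of spectrans.

```python
import math

def dct2_matrix(n_points):
    """Rows are the orthonormal DCT-II basis vectors."""
    matrix = []
    for k in range(n_points):
        alpha = math.sqrt((1.0 if k == 0 else 2.0) / n_points)
        matrix.append([
            alpha * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
            for n in range(n_points)
        ])
    return matrix

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

T = dct2_matrix(8)
T_t = transpose(T)

# Gram matrix T^T T: entry (i, j) is the dot product of columns i and j of T.
gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in T_t] for ri in T_t]

x = [0.5, -1.0, 2.0, 0.0, 3.5, -2.5, 1.0, 4.0]
coeffs = matvec(T, x)        # forward transform (DCT-II)
x_rec = matvec(T_t, coeffs)  # inverse is the transpose (DCT-III)
```

Here `gram` is the identity up to rounding, `x_rec` reproduces `x`, and the squared norms of `x` and `coeffs` agree.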
Computational Complexity:
- DCT/DST: \(O(N \log N)\) via FFT-based algorithms
- Direct computation: \(O(N^2)\)
Implementation Details:
- Uses FFT-based algorithms for \(O(N \log N)\) complexity
- Supports both normalized and unnormalized variants
- Proper handling of boundary conditions for different transform types
- Gradient-compatible for neural network training
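One common way to obtain \(O(N \log N)\) cost is to compute the DCT-II from a single FFT of the mirrored, length-\(2N\) sequence. The sketch below illustrates that technique under stated assumptions: it is not necessarily the algorithm spectrans uses, it is unnormalized, it uses a small recursive radix-2 FFT so \(N\) must be a power of two, and all function names are hypothetical.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def dct2_fft(x):
    """Unnormalized DCT-II via one FFT of the mirrored sequence [x, reversed(x)]."""
    n_points = len(x)
    spectrum = fft(list(x) + list(reversed(x)))
    # A half-sample phase shift turns the mirrored spectrum into cosine sums.
    return [
        0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n_points)) * spectrum[k]).real
        for k in range(n_points)
    ]

def dct2_direct(x):
    """Unnormalized O(N^2) DCT-II, used here only as a cross-check."""
    n_points = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
            for n in range(n_points))
        for k in range(n_points)
    ]

x = [0.5, -1.0, 2.0, 0.0, 3.5, -2.5, 1.0, 4.0]
fast = dct2_fft(x)
slow = dct2_direct(x)
```

Both routes produce the same coefficients up to rounding; only the cost differs.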
Performance Characteristics:
- In-place computation where possible
- GPU accelerated through CUDA kernels
- Proper scaling and normalization
- Batch processing support
References
Nasir Ahmed, T. Natarajan, and K. R. Rao. 1974. Discrete cosine transform. IEEE Transactions on Computers, C-23(1):90-93.
K. R. Rao and P. Yip. 1990. Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston.
William B. Pennebaker and Joan L. Mitchell. 1993. JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold, New York.
See Also
- spectrans.transforms.base : Base classes for orthogonal transforms
- spectrans.transforms.fourier : Related Fourier transform implementations
- spectrans.layers.mixing : Neural layers using DCT/DST transforms
Classes¶
DCT ¶
Bases: OrthogonalTransform
Discrete Cosine Transform (Type-II).
The DCT-II is the most commonly used DCT variant, often referred to as simply "the DCT". It's widely used in signal compression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `normalized` | `bool` | Whether to use orthonormal normalization. | `True` |
Methods:
| Name | Description |
|---|---|
| `transform` | Apply DCT-II transform. |
| `inverse_transform` | Apply inverse DCT (DCT-III). |
Source code in spectrans/transforms/cosine.py
Functions¶
transform ¶
Apply DCT-II transform.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor. | required |
| `dim` | `int` | Dimension along which to apply DCT. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | DCT coefficients. |
inverse_transform ¶
Apply inverse DCT (DCT-III).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | DCT coefficients. | required |
| `dim` | `int` | Dimension along which to apply inverse DCT. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | Reconstructed signal. |
DST ¶
Bases: OrthogonalTransform
Discrete Sine Transform (Type-II).
The DST-II is analogous to the DCT-II but uses sine functions.
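To make the analogy concrete, here is a dependency-free sketch of an orthonormal DST-II and its inverse (DST-III, the transpose). The scaling used, \(\sqrt{2/N}\) for all rows except the last and \(\sqrt{1/N}\) for the last, is the standard orthonormal convention and is an assumption about what `normalized=True` means in this class; the function names are hypothetical.

```python
import math

def dst2_matrix(n_points):
    """Rows are orthonormal DST-II basis vectors (assumed convention)."""
    matrix = []
    for k in range(n_points):
        # The last row (k = N - 1) alternates +1/-1 and needs sqrt(1/N).
        beta = math.sqrt((1.0 if k == n_points - 1 else 2.0) / n_points)
        matrix.append([
            beta * math.sin(math.pi * (2 * n + 1) * (k + 1) / (2 * n_points))
            for n in range(n_points)
        ])
    return matrix

def dst2(x):
    """Forward DST-II: multiply by the basis matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in dst2_matrix(len(x))]

def dst3(coeffs):
    """Inverse of the orthonormal DST-II: multiply by the transposed matrix."""
    columns = zip(*dst2_matrix(len(coeffs)))
    return [sum(a * b for a, b in zip(col, coeffs)) for col in columns]

x = [0.5, -1.0, 2.0, 0.0, 3.5, -2.5]
coeffs = dst2(x)
x_rec = dst3(coeffs)
```

As with the DCT, the round trip recovers `x` and the squared norm is preserved.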
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `normalized` | `bool` | Whether to use orthonormal normalization. | `True` |
Methods:
| Name | Description |
|---|---|
| `transform` | Apply DST-II transform. |
| `inverse_transform` | Apply inverse DST (DST-III). |
Functions¶
transform ¶
Apply DST-II transform.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor. | required |
| `dim` | `int` | Dimension along which to apply DST. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | DST coefficients. |
inverse_transform ¶
Apply inverse DST (DST-III).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | DST coefficients. | required |
| `dim` | `int` | Dimension along which to apply inverse DST. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | Reconstructed signal. |
DCT2D ¶
Bases: SpectralTransform2D
2D Discrete Cosine Transform.
Applies DCT-II along both spatial dimensions, commonly used in image compression (e.g., JPEG).
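Because the 2D transform is separable, it can be sketched as a 1D DCT-II over every row followed by a 1D DCT-II over every column. The pure-Python example below assumes orthonormal scaling and uses hypothetical helper names; it illustrates the idea rather than the library's tensor implementation.

```python
import math

def dct2_1d(x):
    """Orthonormal 1-D DCT-II (direct O(N^2) evaluation)."""
    n_points = len(x)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n_points)
        * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
              for n in range(n_points))
        for k in range(n_points)
    ]

def dct2_2d(image):
    """Separable 2-D DCT-II on a list of equal-length rows."""
    # Pass 1: transform along the last axis (each row).
    rows = [dct2_1d(row) for row in image]
    # Pass 2: transform along the first axis (each column), then transpose back.
    cols = [dct2_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

flat = [[1.0] * 4 for _ in range(4)]   # constant 4x4 image
flat_coeffs = dct2_2d(flat)

image = [[1.0, 2.0, 3.0, 4.0], [0.0, -1.0, 2.0, 5.0]]
image_coeffs = dct2_2d(image)
```

A constant image puts all of its energy in the single DC coefficient at index `[0][0]`, which is the behaviour block-based codecs such as JPEG exploit.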
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `normalized` | `bool` | Whether to use orthonormal normalization. | `True` |
Methods:
| Name | Description |
|---|---|
| `transform` | Apply 2D DCT. |
| `inverse_transform` | Apply inverse 2D DCT. |
Functions¶
transform ¶
Apply 2D DCT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor. | required |
| `dim` | `tuple[int, int]` | Dimensions along which to apply 2D DCT. | `(-2, -1)` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | 2D DCT coefficients. |
inverse_transform ¶
Apply inverse 2D DCT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | 2D DCT coefficients. | required |
| `dim` | `tuple[int, int]` | Dimensions along which to apply inverse 2D DCT. | `(-2, -1)` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | Reconstructed signal. |
MDCT ¶
Bases: OrthogonalTransform
Modified Discrete Cosine Transform.
The MDCT is a lapped transform based on DCT-IV with 50% overlap, commonly used in audio compression (MP3, AAC).
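The 50% overlap works because the aliasing introduced in each block cancels when neighbouring blocks are overlap-added (time-domain alias cancellation). The sketch below demonstrates this with a sine window in pure Python. It assumes the textbook MDCT definition with a \(2/N\) inverse scaling (where \(N\) is half the block size) and zero-pads half a block on each side; the library's scaling and padding may differ, and all function names are hypothetical.

```python
import math

def sine_window(block_size):
    # Satisfies the Princen-Bradley condition w[n]^2 + w[n + N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / block_size) for n in range(block_size)]

def mdct_block(block, window):
    """Windowed MDCT of one block of 2N samples -> N coefficients."""
    half = len(block) // 2
    n0 = 0.5 + half / 2
    return [
        sum(window[n] * block[n] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
            for n in range(2 * half))
        for k in range(half)
    ]

def imdct_block(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    half = len(coeffs)
    n0 = 0.5 + half / 2
    return [
        window[n] * (2.0 / half)
        * sum(coeffs[k] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
              for k in range(half))
        for n in range(2 * half)
    ]

def mdct_roundtrip(signal, block_size):
    """Analyse with hop = block_size // 2, then resynthesise by overlap-add."""
    half = block_size // 2
    window = sine_window(block_size)
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - block_size + 1, half):
        coeffs = mdct_block(padded[start:start + block_size], window)
        for n, value in enumerate(imdct_block(coeffs, window)):
            out[start + n] += value  # overlap-add cancels the aliasing
    return out[half:-half]

signal = [0.3, -1.2, 0.7, 2.0, -0.5, 0.1, 0.9, -0.4]
reconstructed = mdct_roundtrip(signal, block_size=4)
```

Each block of `block_size` samples yields only `block_size // 2` coefficients, so the transform is critically sampled despite the overlap.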
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `block_size` | `int` | Size of the transform block (must be even). | required |
| `window` | `str` | Window function to use: "sine" or "vorbis". | `"sine"` |
Methods:
| Name | Description |
|---|---|
| `transform` | Apply MDCT. |
| `inverse_transform` | Apply inverse MDCT. |
Functions¶
transform ¶
Apply MDCT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor. Length along `dim` must be a multiple of `half_block` (half of `block_size`). | required |
| `dim` | `int` | Dimension along which to apply MDCT. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | MDCT coefficients. |
inverse_transform ¶
Apply inverse MDCT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | MDCT coefficients. | required |
| `dim` | `int` | Dimension along which to apply inverse MDCT. | `-1` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | Reconstructed signal with overlap-add. |