Cosine Transforms¶

spectrans.transforms.cosine ¶

Discrete Cosine and Sine Transform implementations.

This module implements the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) families, which are orthogonal transforms widely used in signal processing, image compression, and spectral neural networks. `DCT`, `DST`, and `DCT2D` take a `normalized` flag that selects orthonormal scaling.

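For smooth, strongly correlated signals, the DCT and DST concentrate most of the energy in a few low-frequency coefficients (energy compaction). With orthonormal scaling they also preserve the L2 norm, so a transform layer does not rescale activations or gradients.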

Classes:

Name	Description
`DCT`	Discrete Cosine Transform Type-II (most common DCT variant).
`DCT2D`	2D Discrete Cosine Transform for image-like data.
`DST`	Discrete Sine Transform Type-II.
`MDCT`	Modified Discrete Cosine Transform for audio processing.

Examples:

Basic DCT usage:

>>> import torch
>>> from spectrans.transforms.cosine import DCT
>>> dct = DCT(normalized=True)
>>> signal = torch.randn(32, 512)
>>> dct_coeffs = dct.transform(signal, dim=-1)
>>> reconstructed = dct.inverse_transform(dct_coeffs, dim=-1)

2D DCT for image processing:

>>> from spectrans.transforms.cosine import DCT2D
>>> dct2d = DCT2D(normalized=True)
>>> image = torch.randn(32, 64, 64)  # Batch of 64x64 images
>>> dct_image = dct2d.transform(image, dim=(-2, -1))

DST for sine-based analysis:

>>> from spectrans.transforms.cosine import DST
>>> dst = DST(normalized=True)
>>> dst_coeffs = dst.transform(signal, dim=-1)

MDCT for overlapped transforms:

>>> from spectrans.transforms.cosine import MDCT
>>> mdct = MDCT(block_size=1024, window="sine")
>>> audio_signal = torch.randn(4, 8192)  # Length must be a multiple of block_size // 2
>>> overlapped_coeffs = mdct.transform(audio_signal)

Notes

Mathematical Formulations:

DCT Type-II (most common):

\[ \text{DCT}[k] = \alpha_k \sum_{n=0}^{N-1} \mathbf{x}[n] \cos\left(\frac{\pi(2n+1)k}{2N}\right) \]

Where \(\alpha_k = \sqrt{\frac{1}{N}}\) if \(k=0\), \(\alpha_k = \sqrt{\frac{2}{N}}\) if \(k>0\) (for orthonormal normalization)
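As a minimal sketch of the orthonormal DCT-II definition above, the matrix can be built directly from the formula. The helper name `dct2_matrix` is hypothetical and not part of `spectrans`; the module's own `_create_dct_matrix` is not shown here and may differ in detail.

```python
import math
import torch

def dct2_matrix(n: int, dtype=torch.float64) -> torch.Tensor:
    """Hypothetical helper: row k holds alpha_k * cos(pi * (2n + 1) * k / (2N))."""
    k = torch.arange(n, dtype=dtype).unsqueeze(1)    # frequency index, shape (N, 1)
    idx = torch.arange(n, dtype=dtype).unsqueeze(0)  # sample index, shape (1, N)
    mat = torch.cos(math.pi * (2 * idx + 1) * k / (2 * n))
    mat = mat * math.sqrt(2.0 / n)    # alpha_k = sqrt(2/N) for k > 0
    mat[0] = mat[0] / math.sqrt(2.0)  # alpha_0 = sqrt(1/N)
    return mat

# A constant signal should land entirely in the k = 0 (DC) coefficient
x = torch.ones(8, dtype=torch.float64)
coeffs = dct2_matrix(8) @ x
```

For this length-8 constant signal the DC coefficient equals \(\sqrt{8}\) and every other coefficient is zero up to rounding.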

DST Type-II:

\[ \text{DST}[k] = \sum_{n=0}^{N-1} \mathbf{x}[n] \sin\left(\frac{\pi(2n+1)(k+1)}{2N}\right), \quad k = 0, \ldots, N-1 \]

Orthogonality Properties:

With orthonormal normalization, the DCT and DST matrices are orthogonal: \(\mathbf{T}^T \mathbf{T} = \mathbf{I}\)
Perfect reconstruction: \(\mathbf{x} = \text{DCT}^{-1}(\text{DCT}(\mathbf{x}))\)
Energy conservation: \(\|\text{DCT}(\mathbf{x})\|^2 = \|\mathbf{x}\|^2\) (with proper normalization)
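The three properties above can be checked numerically. This is a sketch that builds the orthonormal DCT-II matrix from the textbook formula rather than calling the library, so the variable names are illustrative only.

```python
import math
import torch

n = 16
k = torch.arange(n, dtype=torch.float64).unsqueeze(1)
idx = torch.arange(n, dtype=torch.float64).unsqueeze(0)
T = math.sqrt(2.0 / n) * torch.cos(math.pi * (2 * idx + 1) * k / (2 * n))
T[0] /= math.sqrt(2.0)  # alpha_0 = sqrt(1/N)

torch.manual_seed(0)
x = torch.randn(4, n, dtype=torch.float64)
coeffs = x @ T.T   # forward DCT-II of each row of x
x_rec = coeffs @ T  # inverse (DCT-III) is the transpose of the forward matrix

orthogonal = torch.allclose(T.T @ T, torch.eye(n, dtype=torch.float64))
reconstructs = torch.allclose(x_rec, x)
energy_kept = torch.allclose(coeffs.pow(2).sum(-1), x.pow(2).sum(-1))
```

All three flags should come out `True`; dropping the \(\alpha_k\) scaling breaks the orthogonality and energy checks.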

Computational Complexity:

FFT-based algorithms: \(O(N \log N)\) per length-\(N\) vector
Direct matrix multiplication: \(O(N^2)\) per length-\(N\) vector
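One well-known route to \(O(N \log N)\) is the even/odd reordering usually attributed to Makhoul, which computes an unnormalized DCT-II from a single length-\(N\) FFT. The sketch below is an illustration under that assumption, with hypothetical function names; it is not taken from this module.

```python
import math
import torch

def dct2_fft(x: torch.Tensor) -> torch.Tensor:
    """Unnormalized DCT-II along the last dim using one length-N FFT."""
    n = x.shape[-1]
    # Reorder: even-indexed samples first, then odd-indexed samples reversed
    v = torch.cat([x[..., ::2], x[..., 1::2].flip(-1)], dim=-1)
    spectrum = torch.fft.fft(v, dim=-1)
    k = torch.arange(n, dtype=x.dtype, device=x.device)
    angle = -math.pi * k / (2 * n)
    # Real part of spectrum * exp(i * angle)
    return spectrum.real * torch.cos(angle) - spectrum.imag * torch.sin(angle)

def dct2_direct(x: torch.Tensor) -> torch.Tensor:
    """Unnormalized DCT-II by explicit O(N^2) matrix multiplication."""
    n = x.shape[-1]
    k = torch.arange(n, dtype=x.dtype).unsqueeze(1)
    idx = torch.arange(n, dtype=x.dtype).unsqueeze(0)
    mat = torch.cos(math.pi * (2 * idx + 1) * k / (2 * n))
    return x @ mat.T

torch.manual_seed(0)
x = torch.randn(4, 16, dtype=torch.float64)
matches = torch.allclose(dct2_fft(x), dct2_direct(x))
```

Both functions return the same coefficients; orthonormal scaling can be applied afterwards by multiplying coefficient \(k\) by \(\alpha_k\).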

Implementation Details:

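The `DCT` and `DST` source listings on this page multiply by an explicit \(N \times N\) transform matrix, which costs \(O(N^2)\) per transformed vector; an FFT-based \(O(N \log N)\) path is not shown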
Supports both normalized and unnormalized variants
Proper handling of boundary conditions for different transform types
Gradient-compatible for neural network training

Performance Characteristics:

In-place computation where possible
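Runs on the GPU when the input is a CUDA tensor: transform matrices are created on the input's device and dtype, and the work is done by `torch.matmul`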
Proper scaling and normalization
Batch processing support

References

Nasir Ahmed, T. Natarajan, and K. R. Rao. 1974. Discrete cosine transform. IEEE Transactions on Computers, C-23(1):90-93.

K. R. Rao and P. Yip. 1990. Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston.

William B. Pennebaker and Joan L. Mitchell. 1993. JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold, New York.

Classes¶

DCT ¶

DCT(normalized: bool = True)

Bases: OrthogonalTransform

Discrete Cosine Transform (Type-II).

The DCT-II is the most commonly used DCT variant, often referred to as simply "the DCT". It underlies JPEG image compression and many audio and video codecs.

Parameters:

Name	Type	Description	Default
`normalized`	`bool`	Whether to use orthonormal normalization.	`True`

Methods:

Name	Description
`transform`	Apply DCT-II transform.
`inverse_transform`	Apply inverse DCT (DCT-III).

Source code in spectrans/transforms/cosine.py

def __init__(self, normalized: bool = True):
    super().__init__()
    self.normalized = normalized

Functions¶

transform ¶

transform(x: Tensor, dim: int = -1) -> Tensor

Apply DCT-II transform.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor.	required
`dim`	`int`	Dimension along which to apply DCT.	`-1`

Returns:

Type	Description
`Tensor`	DCT coefficients.

Source code in spectrans/transforms/cosine.py

def transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply DCT-II transform.

    Parameters
    ----------
    x : Tensor
        Input tensor.
    dim : int, default=-1
        Dimension along which to apply DCT.

    Returns
    -------
    Tensor
        DCT coefficients.
    """
    n = x.shape[dim]

    # Create DCT matrix
    dct_matrix = self._create_dct_matrix(n, x.device, x.dtype)

    # Apply DCT via matrix multiplication
    if dim == -1 or dim == x.ndim - 1:
        result = torch.matmul(x, dct_matrix.T)
    else:
        # Move dimension to last position
        x_moved = x.transpose(dim, -1)
        result = torch.matmul(x_moved, dct_matrix.T)
        result = result.transpose(dim, -1)

    return result
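To make the dimension handling concrete, the sketch below repeats the steps of the `else` branch on a small tensor and compares them with an explicit index formula. A random matrix stands in for the transform matrix, since the argument holds for any \(N \times N\) matrix.

```python
import torch

torch.manual_seed(0)
x = torch.randn(2, 5, 3, dtype=torch.float64)   # transform the middle axis (dim=1)
matrix = torch.randn(5, 5, dtype=torch.float64)  # stand-in for an N x N transform matrix

# Same steps as the `else` branch: move `dim` last, multiply, move it back
moved = x.transpose(1, -1)                                # shape (2, 3, 5)
result = torch.matmul(moved, matrix.T).transpose(1, -1)   # shape (2, 5, 3)

# Reference: result[b, k, c] = sum_n matrix[k, n] * x[b, n, c]
expected = torch.einsum("kn,bnc->bkc", matrix, x)
same = torch.allclose(result, expected)
```

The transposes only relabel axes, so the transform acts on `dim` and leaves every other axis untouched.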

inverse_transform ¶

inverse_transform(x: Tensor, dim: int = -1) -> Tensor

Apply inverse DCT (DCT-III).

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	DCT coefficients.	required
`dim`	`int`	Dimension along which to apply inverse DCT.	`-1`

Returns:

Type	Description
`Tensor`	Reconstructed signal.

Source code in spectrans/transforms/cosine.py

def inverse_transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply inverse DCT (DCT-III).

    Parameters
    ----------
    x : Tensor
        DCT coefficients.
    dim : int, default=-1
        Dimension along which to apply inverse DCT.

    Returns
    -------
    Tensor
        Reconstructed signal.
    """
    n = x.shape[dim]

    # Create inverse DCT matrix (DCT-III)
    idct_matrix = self._create_idct_matrix(n, x.device, x.dtype)

    # Apply inverse DCT via matrix multiplication
    if dim == -1 or dim == x.ndim - 1:
        result = torch.matmul(x, idct_matrix.T)
    else:
        # Move dimension to last position
        x_moved = x.transpose(dim, -1)
        result = torch.matmul(x_moved, idct_matrix.T)
        result = result.transpose(dim, -1)

    return result

DST ¶

DST(normalized: bool = True)

Bases: OrthogonalTransform

Discrete Sine Transform (Type-II).

The DST-II is analogous to the DCT-II but uses sine functions.
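A minimal sketch of an orthonormal DST-II matrix under the common textbook convention, where row \(k\) holds \(\sin\left(\frac{\pi(2n+1)(k+1)}{2N}\right)\). The helper `dst2_matrix` is hypothetical; whether `_create_dst_matrix` uses exactly this scaling is not shown on this page.

```python
import math
import torch

def dst2_matrix(n: int, dtype=torch.float64) -> torch.Tensor:
    """Hypothetical helper: orthonormal DST-II matrix of size N x N."""
    k = torch.arange(1, n + 1, dtype=dtype).unsqueeze(1)  # k + 1 = 1..N
    idx = torch.arange(n, dtype=dtype).unsqueeze(0)       # sample index 0..N-1
    mat = math.sqrt(2.0 / n) * torch.sin(math.pi * (2 * idx + 1) * k / (2 * n))
    mat[-1] /= math.sqrt(2.0)  # last row is (+1, -1, +1, ...), so it needs sqrt(1/N)
    return mat

S = dst2_matrix(8)
is_orthogonal = torch.allclose(S @ S.T, torch.eye(8, dtype=torch.float64))

# An alternating signal is proportional to the last basis row
alternating = torch.tensor([1.0, -1.0] * 4, dtype=torch.float64)
coeffs = S @ alternating
```

Where the DCT-II treats its first (constant) row specially, the DST-II does so for its last (alternating) row. The alternating signal therefore maps to a single nonzero coefficient, \(\sqrt{8}\), in the last position.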

Parameters:

Name	Type	Description	Default
`normalized`	`bool`	Whether to use orthonormal normalization.	`True`

Methods:

Name	Description
`transform`	Apply DST-II transform.
`inverse_transform`	Apply inverse DST (DST-III).

Source code in spectrans/transforms/cosine.py

def __init__(self, normalized: bool = True):
    super().__init__()
    self.normalized = normalized

Functions¶

transform ¶

transform(x: Tensor, dim: int = -1) -> Tensor

Apply DST-II transform.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor.	required
`dim`	`int`	Dimension along which to apply DST.	`-1`

Returns:

Type	Description
`Tensor`	DST coefficients.

Source code in spectrans/transforms/cosine.py

def transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply DST-II transform.

    Parameters
    ----------
    x : Tensor
        Input tensor.
    dim : int, default=-1
        Dimension along which to apply DST.

    Returns
    -------
    Tensor
        DST coefficients.
    """
    n = x.shape[dim]

    # Create DST matrix
    dst_matrix = self._create_dst_matrix(n, x.device, x.dtype)

    # Apply DST via matrix multiplication
    if dim == -1 or dim == x.ndim - 1:
        result = torch.matmul(x, dst_matrix.T)
    else:
        # Move dimension to last position
        x_moved = x.transpose(dim, -1)
        result = torch.matmul(x_moved, dst_matrix.T)
        result = result.transpose(dim, -1)

    return result

inverse_transform ¶

inverse_transform(x: Tensor, dim: int = -1) -> Tensor

Apply inverse DST (DST-III).

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	DST coefficients.	required
`dim`	`int`	Dimension along which to apply inverse DST.	`-1`

Returns:

Type	Description
`Tensor`	Reconstructed signal.

Source code in spectrans/transforms/cosine.py

def inverse_transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply inverse DST (DST-III).

    Parameters
    ----------
    x : Tensor
        DST coefficients.
    dim : int, default=-1
        Dimension along which to apply inverse DST.

    Returns
    -------
    Tensor
        Reconstructed signal.
    """
    n = x.shape[dim]

    # Create inverse DST matrix (DST-III)
    idst_matrix = self._create_idst_matrix(n, x.device, x.dtype)

    # Apply inverse DST via matrix multiplication
    if dim == -1 or dim == x.ndim - 1:
        result = torch.matmul(x, idst_matrix.T)
    else:
        # Move dimension to last position
        x_moved = x.transpose(dim, -1)
        result = torch.matmul(x_moved, idst_matrix.T)
        result = result.transpose(dim, -1)

    return result

DCT2D ¶

DCT2D(normalized: bool = True)

Bases: SpectralTransform2D

2D Discrete Cosine Transform.

Applies DCT-II along both spatial dimensions, commonly used in image compression (e.g., JPEG).

Parameters:

Name	Type	Description	Default
`normalized`	`bool`	Whether to use orthonormal normalization.	`True`

Methods:

Name	Description
`transform`	Apply 2D DCT.
`inverse_transform`	Apply inverse 2D DCT.

Source code in spectrans/transforms/cosine.py

def __init__(self, normalized: bool = True):
    super().__init__()
    self.normalized = normalized
    self.dct = DCT(normalized=normalized)

Functions¶

transform ¶

transform(x: Tensor, dim: tuple[int, int] = (-2, -1)) -> Tensor

Apply 2D DCT.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor.	required
`dim`	`tuple[int, int]`	Dimensions along which to apply 2D DCT.	`(-2, -1)`

Returns:

Type	Description
`Tensor`	2D DCT coefficients.

Source code in spectrans/transforms/cosine.py

def transform(self, x: Tensor, dim: tuple[int, int] = (-2, -1)) -> Tensor:
    """Apply 2D DCT.

    Parameters
    ----------
    x : Tensor
        Input tensor.
    dim : tuple[int, int], default=(-2, -1)
        Dimensions along which to apply 2D DCT.

    Returns
    -------
    Tensor
        2D DCT coefficients.
    """
    # Apply DCT along first dimension
    result = self.dct.transform(x, dim=dim[0])
    # Apply DCT along second dimension
    result = self.dct.transform(result, dim=dim[1])
    return result
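Because the 2D transform is separable, applying the 1D DCT along one axis and then the other is the same as computing \(\mathbf{M} \mathbf{X} \mathbf{M}^T\), and the order of the two passes does not matter. The sketch below checks this with an orthonormal DCT-II matrix built from the formula; it does not call `DCT2D` itself.

```python
import math
import torch

n = 8
k = torch.arange(n, dtype=torch.float64).unsqueeze(1)
idx = torch.arange(n, dtype=torch.float64).unsqueeze(0)
M = math.sqrt(2.0 / n) * torch.cos(math.pi * (2 * idx + 1) * k / (2 * n))
M[0] /= math.sqrt(2.0)  # orthonormal DCT-II matrix

torch.manual_seed(0)
image = torch.randn(n, n, dtype=torch.float64)
cols_first = (M @ image) @ M.T   # DCT along dim=-2, then along dim=-1
rows_first = M @ (image @ M.T)   # DCT along dim=-1, then along dim=-2
order_independent = torch.allclose(cols_first, rows_first)

# A flat 8x8 block (as in JPEG-style coding) keeps only its DC coefficient
flat = torch.full((n, n), 3.0, dtype=torch.float64)
flat_coeffs = M @ flat @ M.T
```

For the flat block, the only nonzero coefficient is `flat_coeffs[0, 0]`, equal to 8 times the pixel value (24 here).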

inverse_transform ¶

inverse_transform(x: Tensor, dim: tuple[int, int] = (-2, -1)) -> Tensor

Apply inverse 2D DCT.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	2D DCT coefficients.	required
`dim`	`tuple[int, int]`	Dimensions along which to apply inverse 2D DCT.	`(-2, -1)`

Returns:

Type	Description
`Tensor`	Reconstructed signal.

Source code in spectrans/transforms/cosine.py

def inverse_transform(self, x: Tensor, dim: tuple[int, int] = (-2, -1)) -> Tensor:
    """Apply inverse 2D DCT.

    Parameters
    ----------
    x : Tensor
        2D DCT coefficients.
    dim : tuple[int, int], default=(-2, -1)
        Dimensions along which to apply inverse 2D DCT.

    Returns
    -------
    Tensor
        Reconstructed signal.
    """
    # Apply inverse DCT along second dimension
    result = self.dct.inverse_transform(x, dim=dim[1])
    # Apply inverse DCT along first dimension
    result = self.dct.inverse_transform(result, dim=dim[0])
    return result

MDCT ¶

MDCT(block_size: int, window: str = 'sine')

Bases: OrthogonalTransform

Modified Discrete Cosine Transform.

The MDCT is a lapped transform based on DCT-IV with 50% overlap, commonly used in audio compression (MP3, AAC).

Parameters:

Name	Type	Description	Default
`block_size`	`int`	Size of the transform block (must be even).	required
`window`	`str`	Window function to use: "sine" or "vorbis".	`"sine"`
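Both window names refer to standard power-complementary windows. The sketch below uses their usual textbook definitions and checks the Princen-Bradley condition \(w[n]^2 + w[n + N/2]^2 = 1\) that lapped reconstruction relies on. `mdct_window` is a hypothetical helper; the module's `_get_window` is not shown and may differ.

```python
import math
import torch

def mdct_window(block_size: int, kind: str = "sine") -> torch.Tensor:
    """Hypothetical helper returning a length-`block_size` analysis window."""
    n = torch.arange(block_size, dtype=torch.float64)
    sine = torch.sin(math.pi * (n + 0.5) / block_size)
    if kind == "sine":
        return sine
    if kind == "vorbis":
        return torch.sin(0.5 * math.pi * sine**2)
    raise ValueError(f"Unknown window: {kind}")

half = 8
ones = torch.ones(half, dtype=torch.float64)
pb_ok = {}
for kind in ("sine", "vorbis"):
    w = mdct_window(2 * half, kind)
    # Squared halves of the window must sum to one sample by sample
    pb_ok[kind] = torch.allclose(w[:half] ** 2 + w[half:] ** 2, ones)
```

Both entries of `pb_ok` are `True`. The Vorbis window rolls off more smoothly at the block edges than the sine window.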

Methods:

Name	Description
`transform`	Apply MDCT.
`inverse_transform`	Apply inverse MDCT (not implemented; raises `NotImplementedError`).

Source code in spectrans/transforms/cosine.py

def __init__(self, block_size: int, window: str = "sine"):
    super().__init__()
    if block_size % 2 != 0:
        raise ValueError("Block size must be even for MDCT")

    self.block_size = block_size
    self.half_block = block_size // 2
    self.window_type = window

Functions¶

transform ¶

transform(x: Tensor, dim: int = -1) -> Tensor

Apply MDCT.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor. Length along `dim` must be a multiple of `half_block` (`block_size // 2`).	required
`dim`	`int`	Dimension along which to apply MDCT.	`-1`

Returns:

Type	Description
`Tensor`	MDCT coefficients.

Source code in spectrans/transforms/cosine.py

def transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply MDCT.

    Parameters
    ----------
    x : Tensor
        Input tensor. Length along dim must be multiple of half_block.
    dim : int, default=-1
        Dimension along which to apply MDCT.

    Returns
    -------
    Tensor
        MDCT coefficients.
    """
    n = x.shape[dim]
    if n % self.half_block != 0:
        raise ValueError(f"Input length {n} must be multiple of {self.half_block}")

    # Number of blocks
    num_blocks = (n - self.half_block) // self.half_block

    # Get window
    window = self._get_window(self.block_size, x.device, x.dtype)

    # Prepare output
    output_shape = list(x.shape)
    output_shape[dim] = num_blocks * self.half_block
    output = torch.zeros(output_shape, device=x.device, dtype=x.dtype)

    # Process overlapping blocks
    for i in range(num_blocks):
        start = i * self.half_block
        end = start + self.block_size

        # Extract and window block
        if dim == -1:
            block = x[..., start:end] * window
        else:
            indices = torch.arange(start, end, device=x.device)
            block = torch.index_select(x, dim, indices)
            # Normalize negative dims so the window broadcasts along `dim`
            block = block * window.reshape([-1] + [1] * (x.ndim - 1 - dim % x.ndim))

        # Apply DCT-IV (simplified using DCT-II)
        block_dct = self._dct4(block, dim=dim)

        # Store result
        out_start = i * self.half_block
        out_end = out_start + self.half_block

        if dim == -1:
            output[..., out_start:out_end] = block_dct[..., : self.half_block]
        else:
            # Handle arbitrary dimension
            indices = torch.arange(out_start, out_end, device=x.device)
            output.index_copy_(
                dim,
                indices,
                torch.index_select(
                    block_dct, dim, torch.arange(self.half_block, device=x.device)
                ),
            )

    return output
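The block bookkeeping in this listing can be traced on a small input. The sketch below reproduces only the framing arithmetic (no windowing or DCT-IV) and shows that `Tensor.unfold` yields the same overlapping blocks as the loop's `start:end` slices.

```python
import torch

block_size, half_block = 8, 4
x = torch.arange(16.0)                       # n = 16, a multiple of half_block
n = x.shape[-1]
num_blocks = (n - half_block) // half_block  # same formula as the listing

# (start, end) of each overlapping block, hop = half_block
spans = [(i * half_block, i * half_block + block_size) for i in range(num_blocks)]
# → [(0, 8), (4, 12), (8, 16)]

# Vectorized equivalent of the slicing loop: shape (num_blocks, block_size)
frames = x.unfold(-1, block_size, half_block)
output_length = num_blocks * half_block      # 12 coefficients for 16 samples
```

Each block shares half of its samples with its neighbor, and each contributes `half_block` coefficients, so the output is one half-block shorter than the input.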

inverse_transform ¶

inverse_transform(x: Tensor, dim: int = -1) -> Tensor

Apply inverse MDCT. Not yet implemented: this method currently raises `NotImplementedError`.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	MDCT coefficients.	required
`dim`	`int`	Dimension along which to apply inverse MDCT.	`-1`

Returns:

Type	Description
`Tensor`	Reconstructed signal with overlap-add.

Source code in spectrans/transforms/cosine.py

def inverse_transform(self, x: Tensor, dim: int = -1) -> Tensor:
    """Apply inverse MDCT.

    Parameters
    ----------
    x : Tensor
        MDCT coefficients.
    dim : int, default=-1
        Dimension along which to apply inverse MDCT.

    Returns
    -------
    Tensor
        Reconstructed signal with overlap-add.
    """
    # Inverse MDCT implementation would require overlap-add reconstruction
    # This is complex and beyond the scope of this basic implementation
    raise NotImplementedError("Inverse MDCT requires overlap-add reconstruction")
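For orientation, here is a sketch of what overlap-add reconstruction could look like. It uses the direct \(O(N^2)\) textbook MDCT definition with a sine window, not this module's `_dct4` path. It also keeps a separate frame axis rather than the flat coefficient layout of `transform`. All function names are hypothetical.

```python
import math
import torch

def mdct_basis(m: int) -> torch.Tensor:
    """cos(pi/M * (n + 1/2 + M/2) * (k + 1/2)), shape (M, 2M)."""
    n = torch.arange(2 * m, dtype=torch.float64).unsqueeze(0)
    k = torch.arange(m, dtype=torch.float64).unsqueeze(1)
    return torch.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))

def sine_window(m: int) -> torch.Tensor:
    n = torch.arange(2 * m, dtype=torch.float64)
    return torch.sin(math.pi * (n + 0.5) / (2 * m))

def mdct(x: torch.Tensor, m: int) -> torch.Tensor:
    frames = x.unfold(-1, 2 * m, m)  # (..., num_frames, 2M), hop M
    return (frames * sine_window(m)) @ mdct_basis(m).T  # (..., num_frames, M)

def imdct(coeffs: torch.Tensor, m: int) -> torch.Tensor:
    # Inverse per frame, windowed again; the 2/M scale pairs with a
    # window whose squared halves sum to one (Princen-Bradley)
    frames = (2.0 / m) * (coeffs @ mdct_basis(m)) * sine_window(m)
    num_frames = frames.shape[-2]
    out = torch.zeros(*coeffs.shape[:-2], (num_frames + 1) * m, dtype=coeffs.dtype)
    for i in range(num_frames):
        # Overlap-add: time-domain aliasing cancels between neighboring frames
        out[..., i * m : i * m + 2 * m] += frames[..., i, :]
    return out

torch.manual_seed(0)
x = torch.randn(2, 64, dtype=torch.float64)
y = imdct(mdct(x, 8), 8)
interior_ok = torch.allclose(y[..., 8:-8], x[..., 8:-8])
```

Only samples covered by two frames are recovered exactly. The first and last `m` samples are covered by a single frame and stay aliased unless the signal is padded by half a block on each side.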