Kernel Functions¶

spectrans.kernels ¶

Kernel functions for spectral transformers.

This module provides kernel functions and feature maps used in spectral attention mechanisms and other kernel-based methods. It includes both explicit kernel evaluations and implicit representations through random feature maps.

These kernels let attention be approximated with complexity linear in sequence length, using random feature expansions and spectral decompositions.

Modules:

Name	Description
`base`	Base classes and interfaces for kernel functions.
`rff`	Random Fourier Features implementations.
`spectral`	Spectral kernel functions and decompositions.

Classes:

Name	Description
`CosineKernel`	Cosine similarity kernel.
`FourierKernel`	Kernel defined in Fourier domain.
`GaussianRFFKernel`	Gaussian kernel with RFF approximation.
`KernelFunction`	Abstract base class for kernel functions.
`KernelType`	Type literal for kernel selection.
`LaplacianRFFKernel`	Laplacian kernel with RFF approximation.
`LearnableSpectralKernel`	Spectral kernel with learnable parameters.
`OrthogonalRandomFeatures`	Orthogonal variant of random features.
`PolynomialKernel`	Polynomial kernel implementation.
`PolynomialSpectralKernel`	Polynomial kernel with spectral decomposition.
`RFFAttentionKernel`	RFF designed for attention mechanisms.
`RandomFeatureMap`	Abstract base class for random feature approximations.
`ShiftInvariantKernel`	Base class for shift-invariant kernels.
`SpectralKernel`	Base class for spectral kernels.
`TruncatedSVDKernel`	Kernel approximation via truncated SVD.

Examples:

Using Gaussian RFF kernel:

>>> import torch
>>> from spectrans.kernels import GaussianRFFKernel
>>> kernel = GaussianRFFKernel(input_dim=64, num_features=256, sigma=1.0)
>>> x = torch.randn(32, 100, 64)
>>> features = kernel(x)
>>> assert features.shape == (32, 100, 256)

Using learnable spectral kernel:

>>> from spectrans.kernels import LearnableSpectralKernel
>>> kernel = LearnableSpectralKernel(input_dim=64, rank=16)
>>> K = kernel.compute(x, x)
>>> assert K.shape == (32, 100, 100)

Notes

Kernel approximation makes attention mechanisms with linear complexity possible through random feature expansions and spectral decompositions. Random Fourier Features, based on Bochner's theorem, approximate shift-invariant kernels via the factorization \(k(\mathbf{x}, \mathbf{y}) \approx \varphi(\mathbf{x})^T \varphi(\mathbf{y})\), where \(\varphi\) maps inputs to a finite-dimensional feature space.

Spectral decomposition methods compute kernels through low-rank approximations obtained from an eigendecomposition, while orthogonal feature variants apply orthogonalized random projections to reduce approximation variance. For random features, the approximation error decreases as \(O(1/\sqrt{D})\), where \(D\) is the number of random features.
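The \(O(1/\sqrt{D})\) behaviour can be observed with a small dependency-free sketch. It does not use spectrans or PyTorch; the helper names (`sample_rff_params`, `rff_features`, `rff_kernel`) are hypothetical, and it assumes the standard construction with frequencies drawn from \(\mathcal{N}(0, I/\sigma^2)\) and phases drawn uniformly from \([0, 2\pi)\).

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma**2))


def sample_rff_params(input_dim, num_features, sigma=1.0, seed=0):
    """Draw frequencies omega ~ N(0, I / sigma^2) and phases b ~ U[0, 2 pi)."""
    rng = random.Random(seed)
    omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(input_dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return omegas, phases


def rff_features(x, omegas, phases):
    """Feature map phi(x)_j = sqrt(2 / D) * cos(omega_j . x + b_j)."""
    scale = math.sqrt(2.0 / len(omegas))
    return [scale * math.cos(sum(w * a for w, a in zip(omega, x)) + b)
            for omega, b in zip(omegas, phases)]


def rff_kernel(x, y, omegas, phases):
    """Approximate k(x, y) by the inner product phi(x)^T phi(y)."""
    phi_x = rff_features(x, omegas, phases)
    phi_y = rff_features(y, omegas, phases)
    return sum(p * q for p, q in zip(phi_x, phi_y))


x = [0.3, -0.2, 0.5]
y = [0.1, 0.4, -0.1]
exact = gaussian_kernel(x, y)
for num_features in (64, 1024, 8192):
    omegas, phases = sample_rff_params(3, num_features, seed=0)
    approx = rff_kernel(x, y, omegas, phases)
    print(num_features, abs(approx - exact))
```

The printed errors fluctuate with the seed, so the trend is only visible on average; quadrupling \(D\) roughly halves the typical error.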

References

Ali Rahimi and Benjamin Recht. 2007. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (NeurIPS 2007), pages 1177-1184.

Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, and Adrian Weller. 2021. Rethinking attention with performers. In Proceedings of the International Conference on Learning Representations (ICLR).

Classes¶

CosineKernel ¶

CosineKernel(eps: float = 1e-08)

Bases: KernelFunction

Cosine similarity kernel.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \|\mathbf{y}\|}\).

Parameters:

Name	Type	Description	Default
`eps`	`float`	Small value for numerical stability.	`1e-8`

Attributes:

Name	Type	Description
`eps`	`float`	Numerical stability parameter.

Methods:

Name	Description
`compute`	Compute cosine similarity kernel matrix.

Source code in spectrans/kernels/base.py

def __init__(self, eps: float = 1e-8):
    self.eps = eps

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute cosine similarity kernel matrix.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute cosine similarity kernel matrix.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    x_norm = torch.norm(x, dim=-1, keepdim=True)  # (..., n, 1)
    y_norm = torch.norm(y, dim=-1, keepdim=True)  # (..., m, 1)

    x_normalized = x / (x_norm + self.eps)
    y_normalized = y / (y_norm + self.eps)

    return torch.matmul(x_normalized, y_normalized.transpose(-2, -1))
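The effect of `eps` is easiest to see on single vectors. The following is a plain-Python sketch of the same normalization, not the library code; `cosine_kernel` is a hypothetical helper operating on lists of floats.

```python
import math


def cosine_kernel(x, y, eps=1e-8):
    """Cosine similarity with eps added to each norm, as in compute()."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / ((norm_x + eps) * (norm_y + eps))


# Rescaling an input leaves the value (almost) unchanged.
parallel = cosine_kernel([1.0, 2.0], [2.0, 4.0])
# Orthogonal vectors give zero.
orthogonal = cosine_kernel([1.0, 0.0], [0.0, 1.0])
# A zero vector yields 0.0 instead of a division by zero.
degenerate = cosine_kernel([0.0, 0.0], [1.0, 1.0])
```

Because `eps` is added to the norms, the result for parallel vectors is very slightly below 1, and the shortfall grows when the inputs have norms close to `eps`.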

KernelFunction ¶

Bases: ABC

Abstract base class for kernel functions.

A kernel function \(k(\mathbf{x}, \mathbf{y})\) defines a similarity measure between inputs \(\mathbf{x}\) and \(\mathbf{y}\) and is expected to be positive semi-definite. This interface supports both explicit kernel evaluation and feature map representations.

Methods:

Name	Description
`compute`	Compute kernel values between x and y.
`gram_matrix`	Compute Gram matrix \(K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)\).
`is_positive_definite`	Check if the kernel yields a positive definite Gram matrix.

Functions¶

compute `abstractmethod` ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute kernel values between x and y.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input tensor of shape (..., n, d).	required
`y`	`Tensor`	Second input tensor of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m) where element \((i,j)\) contains \(k(\mathbf{x}_i, \mathbf{y}_j)\).

Source code in spectrans/kernels/base.py

@abstractmethod
def compute(self, x: Tensor, y: Tensor) -> Tensor:
    r"""Compute kernel values between x and y.

    Parameters
    ----------
    x : Tensor
        First input tensor of shape (..., n, d).
    y : Tensor
        Second input tensor of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m) where element $(i,j)$
        contains $k(\mathbf{x}_i, \mathbf{y}_j)$.
    """
    pass

gram_matrix ¶

gram_matrix(x: Tensor) -> Tensor

Compute Gram matrix \(K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)\).

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Gram matrix of shape (..., n, n).

Source code in spectrans/kernels/base.py

def gram_matrix(self, x: Tensor) -> Tensor:
    r"""Compute Gram matrix $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Gram matrix of shape (..., n, n).
    """
    return self.compute(x, x)

is_positive_definite ¶

is_positive_definite(x: Tensor, eps: float = 1e-06) -> bool

Check if the kernel yields a positive definite Gram matrix.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required
`eps`	`float`	Tolerance for eigenvalue positivity check.	`1e-6`

Returns:

Type	Description
`bool`	True if all eigenvalues of the Gram matrix are greater than `eps`.

Source code in spectrans/kernels/base.py

def is_positive_definite(self, x: Tensor, eps: float = 1e-6) -> bool:
    """Check if the kernel yields a positive definite Gram matrix.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).
    eps : float, default=1e-6
        Tolerance for eigenvalue positivity check.

    Returns
    -------
    bool
        True if all eigenvalues of the Gram matrix are greater than eps.
    """
    gram = self.gram_matrix(x)
    eigenvalues = torch.linalg.eigvalsh(gram)
    return bool(torch.all(eigenvalues > eps).item())
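Note that the check is strict: a valid positive semi-definite kernel can still return False when the inputs make the Gram matrix rank-deficient. The sketch below illustrates this with a linear kernel on two points, using the closed-form eigenvalues of a symmetric 2x2 matrix. All helper names are hypothetical and nothing here calls spectrans.

```python
import math


def linear_gram(points):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]


def eigenvalues_2x2_symmetric(m):
    """Eigenvalues of [[a, b], [b, d]] in ascending order."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius


def is_positive_definite_2x2(m, eps=1e-6):
    """Same criterion as is_positive_definite: every eigenvalue exceeds eps."""
    return all(value > eps for value in eigenvalues_2x2_symmetric(m))


independent = linear_gram([[1.0, 0.0], [0.0, 1.0]])  # identity matrix
collinear = linear_gram([[1.0, 2.0], [2.0, 4.0]])    # rank-one matrix

print(is_positive_definite_2x2(independent))
print(is_positive_definite_2x2(collinear))
```

The collinear case has one zero eigenvalue, so it fails the strict test even though the linear kernel itself is positive semi-definite.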

PolynomialKernel ¶

PolynomialKernel(degree: int = 2, alpha: float = 1.0, coef0: float = 0.0)

Bases: KernelFunction

Polynomial kernel.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = (\alpha \langle \mathbf{x}, \mathbf{y} \rangle + c)^d\).

Parameters:

Name	Type	Description	Default
`degree`	`int`	Polynomial degree.	`2`
`alpha`	`float`	Scaling of inner product.	`1.0`
`coef0`	`float`	Constant term.	`0.0`

Attributes:

Name	Type	Description
`degree`	`int`	The polynomial degree.
`alpha`	`float`	Inner product scaling.
`coef0`	`float`	Constant coefficient.

Methods:

Name	Description
`compute`	Compute polynomial kernel matrix.

Source code in spectrans/kernels/base.py

def __init__(
    self,
    degree: int = 2,
    alpha: float = 1.0,
    coef0: float = 0.0,
):
    self.degree = degree
    self.alpha = alpha
    self.coef0 = coef0

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute polynomial kernel matrix.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute polynomial kernel matrix.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    inner_product = torch.matmul(x, y.transpose(-2, -1))
    return (self.alpha * inner_product + self.coef0) ** self.degree
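As a worked example of the formula, the following plain-Python sketch evaluates the polynomial kernel on single vectors. `polynomial_kernel` is a hypothetical stand-in that mirrors the parameter names and defaults of the class.

```python
def polynomial_kernel(x, y, degree=2, alpha=1.0, coef0=0.0):
    """Evaluate (alpha * <x, y> + coef0) ** degree for two vectors."""
    inner = sum(a * b for a, b in zip(x, y))
    return (alpha * inner + coef0) ** degree


x = [1.0, 2.0]
y = [3.0, 4.0]

homogeneous = polynomial_kernel(x, y)               # (1 * 11 + 0) ** 2
with_offset = polynomial_kernel(x, y, coef0=1.0)    # (1 * 11 + 1) ** 2
flipped = polynomial_kernel([-1.0, -2.0], y)        # sign of x is lost
```

With the default `coef0=0.0` and an even degree, the kernel cannot distinguish `x` from `-x`; a positive `coef0` restores that distinction by mixing in lower-order terms.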

RandomFeatureMap ¶

RandomFeatureMap(input_dim: int, num_features: int, kernel_scale: float = 1.0, seed: int | None = None)

Bases: Module, ABC

Abstract base class for random feature map approximations.

Random feature maps provide finite-dimensional approximations to kernel functions through the mapping:

\(k(\mathbf{x}, \mathbf{y}) \approx \varphi(\mathbf{x})^T \varphi(\mathbf{y})\)

This enables linear-time computation of kernel operations.

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Dimension of input vectors.	required
`num_features`	`int`	Number of random features (D).	required
`kernel_scale`	`float`	Scaling parameter for the kernel.	`1.0`
`seed`	`int \| None`	Random seed for reproducibility.	`None`

Attributes:

Name	Type	Description
`input_dim`	`int`	Input dimension.
`num_features`	`int`	Number of random features.
`kernel_scale`	`float`	Kernel scaling parameter.

Methods:

Name	Description
`forward`	Apply feature map to input.
`kernel_approximation`	Approximate kernel matrix using feature maps.

Source code in spectrans/kernels/base.py

def __init__(
    self,
    input_dim: int,
    num_features: int,
    kernel_scale: float = 1.0,
    seed: int | None = None,
):
    super().__init__()
    self.input_dim = input_dim
    self.num_features = num_features
    self.kernel_scale = kernel_scale

    if seed is not None:
        # Note: this seeds PyTorch's global RNG, not a generator local to this module.
        torch.manual_seed(seed)

Functions¶

forward `abstractmethod` ¶

forward(x: Tensor) -> Tensor

Apply feature map to input.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Feature mapped tensor of shape (..., n, D) where D is the number of random features.

Source code in spectrans/kernels/base.py

@abstractmethod
def forward(self, x: Tensor) -> Tensor:
    """Apply feature map to input.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Feature mapped tensor of shape (..., n, D) where D
        is the number of random features.
    """
    pass

kernel_approximation ¶

kernel_approximation(x: Tensor, y: Tensor) -> Tensor

Approximate kernel matrix using feature maps.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Approximated kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py

def kernel_approximation(self, x: Tensor, y: Tensor) -> Tensor:
    """Approximate kernel matrix using feature maps.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Approximated kernel matrix of shape (..., n, m).
    """
    phi_x = self.forward(x)  # (..., n, D)
    phi_y = self.forward(y)  # (..., m, D)
    return torch.matmul(phi_x, phi_y.transpose(-2, -1))
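`kernel_approximation` materializes the full (n, m) matrix, so by itself it is still quadratic in sequence length. The linear-time benefit comes from reordering the products when the kernel matrix is applied to values. The sketch below shows that reordering with plain Python lists; the matrices, sizes and helper names are hypothetical, and it is not taken from the spectrans attention code.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]


rng = random.Random(0)
n, m, D, d_v = 6, 5, 4, 3  # hypothetical sizes
phi_q = [[rng.random() for _ in range(D)] for _ in range(n)]     # (n, D)
phi_k = [[rng.random() for _ in range(D)] for _ in range(m)]     # (m, D)
values = [[rng.random() for _ in range(d_v)] for _ in range(m)]  # (m, d_v)

# Quadratic order: build the (n, m) kernel matrix, then apply it to values.
quadratic = matmul(matmul(phi_q, transpose(phi_k)), values)

# Linear order: first summarize keys and values in a (D, d_v) matrix.
linear = matmul(phi_q, matmul(transpose(phi_k), values))
```

Both orders give the same (n, d_v) result up to rounding, but the second never forms an n-by-m intermediate, so its cost grows as O((n + m) D d_v) rather than O(n m (D + d_v)).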

ShiftInvariantKernel ¶

ShiftInvariantKernel(bandwidth: float = 1.0)

Bases: KernelFunction

Base class for shift-invariant (stationary) kernels.

Shift-invariant kernels depend only on the difference \(\mathbf{x} - \mathbf{y}\), i.e., \(k(\mathbf{x}, \mathbf{y}) = k(\mathbf{x} - \mathbf{y}, \mathbf{0}) = \kappa(\mathbf{x} - \mathbf{y})\) for some function \(\kappa\).

These kernels admit Random Fourier Features approximation via Bochner's theorem.
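For reference, the identity behind this statement can be written out as follows. It assumes a real-valued, continuous kernel normalized so that \(\kappa(\mathbf{0}) = 1\), in which case Bochner's theorem guarantees that \(p\) is a probability density; the symbols \(p\) and \(\boldsymbol{\omega}\) are notation introduced here for illustration.

```latex
\[
\kappa(\mathbf{x} - \mathbf{y})
  = \int_{\mathbb{R}^d} p(\boldsymbol{\omega})\,
    e^{i \boldsymbol{\omega}^T (\mathbf{x} - \mathbf{y})}\, d\boldsymbol{\omega}
  = \mathbb{E}_{\boldsymbol{\omega} \sim p}
    \left[ \cos\left( \boldsymbol{\omega}^T (\mathbf{x} - \mathbf{y}) \right) \right]
\]
```

The imaginary part vanishes because the kernel is real and symmetric. Drawing \(D\) frequencies from \(p\) and averaging the cosine then gives a Monte Carlo estimate of the kernel.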

Parameters:

Name	Type	Description	Default
`bandwidth`	`float`	Kernel bandwidth parameter (inverse of length scale).	`1.0`

Attributes:

Name	Type	Description
`bandwidth`	`float`	The bandwidth parameter.

Methods:

Name	Description
`evaluate_difference`	Evaluate kernel on difference vectors.
`compute`	Compute kernel matrix for shift-invariant kernel.
`spectral_density`	Fourier transform of the kernel (spectral density).

Source code in spectrans/kernels/base.py

def __init__(self, bandwidth: float = 1.0):
    self.bandwidth = bandwidth

Functions¶

evaluate_difference `abstractmethod` ¶

evaluate_difference(diff: Tensor) -> Tensor

Evaluate kernel on difference vectors.

Parameters:

Name	Type	Description	Default
`diff`	`Tensor`	Difference vectors \(\mathbf{x} - \mathbf{y}\) of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Kernel values \(\kappa(\text{diff})\) of shape (...).

Source code in spectrans/kernels/base.py

@abstractmethod
def evaluate_difference(self, diff: Tensor) -> Tensor:
    r"""Evaluate kernel on difference vectors.

    Parameters
    ----------
    diff : Tensor
        Difference vectors $\mathbf{x} - \mathbf{y}$ of shape (..., d).

    Returns
    -------
    Tensor
        Kernel values $\kappa(\text{diff})$ of shape (...).
    """
    pass

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute kernel matrix for shift-invariant kernel.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute kernel matrix for shift-invariant kernel.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    # Compute pairwise differences
    x_expanded = x.unsqueeze(-2)  # (..., n, 1, d)
    y_expanded = y.unsqueeze(-3)  # (..., 1, m, d)
    diff = x_expanded - y_expanded  # (..., n, m, d)

    # Evaluate kernel on differences
    return self.evaluate_difference(diff)

spectral_density `abstractmethod` ¶

spectral_density(omega: Tensor) -> Tensor

Fourier transform of the kernel (spectral density).

For shift-invariant kernels, this defines the sampling distribution for Random Fourier Features.

Parameters:

Name	Type	Description	Default
`omega`	`Tensor`	Frequency vectors of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Spectral density values of shape (...).

Source code in spectrans/kernels/base.py

@abstractmethod
def spectral_density(self, omega: Tensor) -> Tensor:
    """Fourier transform of the kernel (spectral density).

    For shift-invariant kernels, this defines the sampling
    distribution for Random Fourier Features.

    Parameters
    ----------
    omega : Tensor
        Frequency vectors of shape (..., d).

    Returns
    -------
    Tensor
        Spectral density values of shape (...).
    """
    pass

GaussianRFFKernel ¶

GaussianRFFKernel(input_dim: int, num_features: int, sigma: float = 1.0, use_cos_sin: bool = False, orthogonal: bool = False, trainable: bool = False, seed: int | None = None)

Bases: ShiftInvariantKernel, RandomFeatureMap

Gaussian (RBF) kernel with Random Fourier Features approximation.

Implements the Gaussian kernel using RFF.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{y}\|^2}{2\sigma^2}\right)\).

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Dimension of input vectors.	required
`num_features`	`int`	Number of random Fourier features.	required
`sigma`	`float`	Kernel bandwidth (standard deviation).	`1.0`
`use_cos_sin`	`bool`	If True, use both cos and sin features (doubles feature dimension).	`False`
`orthogonal`	`bool`	If True, use orthogonal random features.	`False`
`trainable`	`bool`	If True, make random parameters trainable.	`False`
`seed`	`int \| None`	Random seed for reproducibility.	`None`

Attributes:

Name	Type	Description
`omega`	`Parameter or Tensor`	Random frequencies of shape (input_dim, num_features).
`bias`	`Parameter or Tensor`	Random phase shifts of shape (num_features,).

Methods:

Name	Description
`forward`	Apply random Fourier feature map.
`evaluate_difference`	Evaluate Gaussian kernel on difference vectors.
`spectral_density`	Spectral density for Gaussian kernel (Gaussian distribution).

Source code in spectrans/kernels/rff.py

def __init__(
    self,
    input_dim: int,
    num_features: int,
    sigma: float = 1.0,
    use_cos_sin: bool = False,
    orthogonal: bool = False,
    trainable: bool = False,
    seed: int | None = None,
):
    ShiftInvariantKernel.__init__(self, bandwidth=1.0 / sigma)
    RandomFeatureMap.__init__(self, input_dim, num_features, kernel_scale=sigma, seed=seed)

    self.sigma = sigma
    self.use_cos_sin = use_cos_sin
    self.orthogonal = orthogonal
    self.trainable = trainable

    # Effective number of output features
    self.output_features = num_features * 2 if use_cos_sin else num_features

    # Initialize random parameters
    self._initialize_parameters()

Functions¶

forward ¶

forward(x: Tensor) -> Tensor

Apply random Fourier feature map.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Feature mapped tensor of shape (..., n, D) where D is self.output_features.

Source code in spectrans/kernels/rff.py

def forward(self, x: Tensor) -> Tensor:
    """Apply random Fourier feature map.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Feature mapped tensor of shape (..., n, D) where D is
        self.output_features.
    """
    # Linear projection: (..., n, d) @ (d, num_features) -> (..., n, num_features)
    projection = torch.matmul(x, self.omega)

    # Add phase shifts
    projection = projection + self.bias

    if self.use_cos_sin:
        # Use both cos and sin features
        cos_features = torch.cos(projection)
        sin_features = torch.sin(projection)
        features = torch.cat([cos_features, sin_features], dim=-1)
        # Normalization factor for cos+sin
        scale = math.sqrt(1.0 / self.num_features)
    else:
        # Use only cos features
        features = torch.cos(projection)
        # Normalization factor for cos only
        scale = math.sqrt(2.0 / self.num_features)

    return features * scale
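The two normalization factors follow from standard trigonometric identities. The derivation below assumes the phases \(b\) are drawn uniformly from \([0, 2\pi]\), which is the usual Random Fourier Features choice; `_initialize_parameters` is not shown on this page, so that is an assumption rather than a statement about the library.

```latex
\begin{align*}
\text{cos only:}\quad
  & \mathbb{E}_{b \sim U[0, 2\pi]}
    \left[ 2 \cos(\boldsymbol{\omega}^T \mathbf{x} + b)
             \cos(\boldsymbol{\omega}^T \mathbf{y} + b) \right]
    = \cos\left( \boldsymbol{\omega}^T (\mathbf{x} - \mathbf{y}) \right) \\
\text{cos and sin:}\quad
  & \cos(\boldsymbol{\omega}^T \mathbf{x} + b) \cos(\boldsymbol{\omega}^T \mathbf{y} + b)
    + \sin(\boldsymbol{\omega}^T \mathbf{x} + b) \sin(\boldsymbol{\omega}^T \mathbf{y} + b)
    = \cos\left( \boldsymbol{\omega}^T (\mathbf{x} - \mathbf{y}) \right)
\end{align*}
```

In the cosine-only case the factor 2 is needed because the product equals \(\tfrac{1}{2}\left[\cos(\boldsymbol{\omega}^T(\mathbf{x} - \mathbf{y})) + \cos(\boldsymbol{\omega}^T(\mathbf{x} + \mathbf{y}) + 2b)\right]\) and the second term averages to zero over \(b\); this gives the scale \(\sqrt{2/D}\). In the cos-and-sin case the identity holds exactly for every \(b\), so averaging over \(D\) frequencies only needs the scale \(\sqrt{1/D}\).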

evaluate_difference ¶

evaluate_difference(diff: Tensor) -> Tensor

Evaluate Gaussian kernel on difference vectors.

Parameters:

Name	Type	Description	Default
`diff`	`Tensor`	Difference vectors of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Kernel values of shape (...).

Source code in spectrans/kernels/rff.py

def evaluate_difference(self, diff: Tensor) -> Tensor:
    """Evaluate Gaussian kernel on difference vectors.

    Parameters
    ----------
    diff : Tensor
        Difference vectors of shape (..., d).

    Returns
    -------
    Tensor
        Kernel values of shape (...).
    """
    squared_norm = torch.sum(diff**2, dim=-1)
    return torch.exp(-squared_norm / (2 * self.sigma**2))

spectral_density ¶

spectral_density(omega: Tensor) -> Tensor

Spectral density for Gaussian kernel (Gaussian distribution).

Parameters:

Name	Type	Description	Default
`omega`	`Tensor`	Frequency vectors of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Spectral density values of shape (...).

Source code in spectrans/kernels/rff.py

def spectral_density(self, omega: Tensor) -> Tensor:
    """Spectral density for Gaussian kernel (Gaussian distribution).

    Parameters
    ----------
    omega : Tensor
        Frequency vectors of shape (..., d).

    Returns
    -------
    Tensor
        Spectral density values of shape (...).
    """
    d = omega.shape[-1]
    norm_squared = torch.sum(omega**2, dim=-1)
    # Unnormalized Fourier transform of the Gaussian kernel
    # (integrates to (2 * pi) ** d over omega, not to 1)
    result: Tensor = (2 * math.pi * self.sigma**2) ** (d / 2) * torch.exp(
        -0.5 * self.sigma**2 * norm_squared
    )
    return result

LaplacianRFFKernel ¶

LaplacianRFFKernel(input_dim: int, num_features: int, sigma: float = 1.0, use_cos_sin: bool = False, trainable: bool = False, seed: int | None = None)

Bases: ShiftInvariantKernel, RandomFeatureMap

Laplacian kernel with Random Fourier Features approximation.

Implements the Laplacian kernel using RFF with Cauchy distribution.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{y}\|_1}{\sigma}\right)\).
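The link between this kernel and the Cauchy distribution can be checked numerically. The sketch below is plain Python and independent of spectrans; it assumes each frequency coordinate is drawn from a Cauchy distribution with scale \(1/\sigma\) via inverse-CDF sampling, and the helper names are hypothetical.

```python
import math
import random


def laplacian_kernel(x, y, sigma=1.0):
    """Exact Laplacian kernel exp(-||x - y||_1 / sigma)."""
    return math.exp(-sum(abs(a - b) for a, b in zip(x, y)) / sigma)


def sample_cauchy_frequencies(input_dim, num_features, sigma=1.0, seed=0):
    """Draw i.i.d. Cauchy(0, 1 / sigma) coordinates by inverting the CDF."""
    rng = random.Random(seed)
    return [[math.tan(math.pi * (rng.random() - 0.5)) / sigma
             for _ in range(input_dim)]
            for _ in range(num_features)]


def cauchy_rff_kernel(x, y, omegas):
    """Monte Carlo estimate: average of cos(omega . (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    total = sum(math.cos(sum(w * d for w, d in zip(omega, diff)))
                for omega in omegas)
    return total / len(omegas)


x = [0.3, -0.2]
y = [0.1, 0.4]
omegas = sample_cauchy_frequencies(input_dim=2, num_features=20000, seed=0)
exact = laplacian_kernel(x, y)
approx = cauchy_rff_kernel(x, y, omegas)
print(exact, approx)
```

The Cauchy distribution has heavy tails, but the estimate stays well behaved because each cosine term is bounded by 1 in magnitude.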

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Dimension of input vectors.	required
`num_features`	`int`	Number of random Fourier features.	required
`sigma`	`float`	Kernel bandwidth parameter.	`1.0`
`use_cos_sin`	`bool`	If True, use both cos and sin features.	`False`
`trainable`	`bool`	If True, make random parameters trainable.	`False`
`seed`	`int \| None`	Random seed for reproducibility.	`None`

Methods:

Name	Description
`forward`	Apply random Fourier feature map.
`evaluate_difference`	Evaluate Laplacian kernel on difference vectors.
`spectral_density`	Spectral density for Laplacian kernel (Cauchy distribution).

Source code in spectrans/kernels/rff.py

def __init__(
    self,
    input_dim: int,
    num_features: int,
    sigma: float = 1.0,
    use_cos_sin: bool = False,
    trainable: bool = False,
    seed: int | None = None,
):
    ShiftInvariantKernel.__init__(self, bandwidth=1.0 / sigma)
    RandomFeatureMap.__init__(self, input_dim, num_features, kernel_scale=sigma, seed=seed)

    self.sigma = sigma
    self.use_cos_sin = use_cos_sin
    self.trainable = trainable

    self.output_features = num_features * 2 if use_cos_sin else num_features

    self._initialize_parameters()

Functions¶

forward ¶

forward(x: Tensor) -> Tensor

Apply random Fourier feature map.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Feature mapped tensor of shape (..., n, D).

Source code in spectrans/kernels/rff.py

def forward(self, x: Tensor) -> Tensor:
    """Apply random Fourier feature map.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Feature mapped tensor of shape (..., n, D).
    """
    projection = torch.matmul(x, self.omega) + self.bias

    if self.use_cos_sin:
        cos_features = torch.cos(projection)
        sin_features = torch.sin(projection)
        features = torch.cat([cos_features, sin_features], dim=-1)
        scale = math.sqrt(1.0 / self.num_features)
    else:
        features = torch.cos(projection)
        scale = math.sqrt(2.0 / self.num_features)

    return features * scale

evaluate_difference ¶

evaluate_difference(diff: Tensor) -> Tensor

Evaluate Laplacian kernel on difference vectors.

Parameters:

Name	Type	Description	Default
`diff`	`Tensor`	Difference vectors of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Kernel values of shape (...).

Source code in spectrans/kernels/rff.py

def evaluate_difference(self, diff: Tensor) -> Tensor:
    """Evaluate Laplacian kernel on difference vectors.

    Parameters
    ----------
    diff : Tensor
        Difference vectors of shape (..., d).

    Returns
    -------
    Tensor
        Kernel values of shape (...).
    """
    l1_norm = torch.sum(torch.abs(diff), dim=-1)
    return torch.exp(-l1_norm / self.sigma)

spectral_density ¶

spectral_density(omega: Tensor) -> Tensor

Spectral density for Laplacian kernel (Cauchy distribution).

Parameters:

Name	Type	Description	Default
`omega`	`Tensor`	Frequency vectors of shape (..., d).	required

Returns:

Type	Description
`Tensor`	Spectral density values of shape (...).

Source code in spectrans/kernels/rff.py

def spectral_density(self, omega: Tensor) -> Tensor:
    """Spectral density for Laplacian kernel (Cauchy distribution).

    Parameters
    ----------
    omega : Tensor
        Frequency vectors of shape (..., d).

    Returns
    -------
    Tensor
        Spectral density values of shape (...).
    """
    d = omega.shape[-1]
    # Product of 1D Cauchy-shaped factors with scale 1 / sigma
    # (as written, each factor integrates to 2 rather than 1)
    density = torch.ones_like(omega[..., 0])
    for i in range(d):
        density = density * (
            2 * self.sigma / (math.pi * (1 + (self.sigma * omega[..., i]) ** 2))
        )
    return density

OrthogonalRandomFeatures ¶

OrthogonalRandomFeatures(input_dim: int, num_features: int, kernel_type: Literal['gaussian', 'laplacian'] = 'gaussian', sigma: float = 1.0, use_hadamard: bool = False, trainable: bool = False, seed: int | None = None)

Bases: RandomFeatureMap

Orthogonal Random Features for kernel approximation.

Uses structured orthogonal matrices to reduce approximation variance compared to standard i.i.d. Gaussian features.
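One common way to build such a matrix is to orthogonalize a Gaussian matrix and then rescale each row to the norm of an independent Gaussian vector, so that rows keep Gaussian-like lengths but point in mutually orthogonal directions. The sketch below does this with Gram-Schmidt in plain Python. It is an illustration under that assumption, not the library's `_initialize_parameters`, and `orthogonal_gaussian_block` is a hypothetical name.

```python
import math
import random


def orthogonal_gaussian_block(dim, seed=0):
    """Return a dim x dim matrix with orthogonal rows of chi-distributed norm."""
    rng = random.Random(seed)
    basis = []
    for _ in range(dim):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Modified Gram-Schmidt: remove components along earlier rows.
        for q in basis:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])

    rows = []
    for q in basis:
        # Norm of a fresh Gaussian vector, so row lengths match the i.i.d. case.
        length = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        rows.append([length * a for a in q])
    return rows


block = orthogonal_gaussian_block(8, seed=1)
```

When more features than input dimensions are needed, independent blocks of this kind are typically stacked and the result truncated to `num_features` rows.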

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Dimension of input vectors.	required
`num_features`	`int`	Number of random features.	required
`kernel_type`	`Literal['gaussian', 'laplacian']`	Type of kernel to approximate.	`"gaussian"`
`sigma`	`float`	Kernel bandwidth parameter.	`1.0`
`use_hadamard`	`bool`	If True, use fast Hadamard transform.	`False`
`trainable`	`bool`	If True, make scaling parameters trainable.	`False`
`seed`	`int \| None`	Random seed.	`None`

Methods:

Name	Description
`forward`	Apply orthogonal random feature map.

Source code in spectrans/kernels/rff.py

def __init__(
    self,
    input_dim: int,
    num_features: int,
    kernel_type: Literal["gaussian", "laplacian"] = "gaussian",
    sigma: float = 1.0,
    use_hadamard: bool = False,
    trainable: bool = False,
    seed: int | None = None,
):
    super().__init__(input_dim, num_features, kernel_scale=sigma, seed=seed)

    self.kernel_type = kernel_type
    self.sigma = sigma
    self.use_hadamard = use_hadamard
    self.trainable = trainable

    self._initialize_parameters()

Functions¶

forward ¶

forward(x: Tensor) -> Tensor

Apply orthogonal random feature map.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Feature mapped tensor of shape (..., n, D).

Source code in spectrans/kernels/rff.py

def forward(self, x: Tensor) -> Tensor:
    """Apply orthogonal random feature map.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Feature mapped tensor of shape (..., n, D).
    """
    if self.use_hadamard:
        # Pad input if necessary
        if x.shape[-1] < self.d_padded:
            padding = self.d_padded - x.shape[-1]
            x = F.pad(x, (0, padding))

        # Apply HD HD HD structure
        z = x
        for i in range(3):
            if hasattr(self, "diagonals"):
                diag = self.diagonals[i]
            else:
                diag = getattr(self, f"diagonal_{i}")
            z = z * diag
            z = self._hadamard_transform(z)

        # Truncate to desired number of features
        projection = z[..., : self.num_features]
    else:
        projection = torch.matmul(x, self.projection)

    # Add bias and apply cosine
    projection = projection + self.bias
    features = torch.cos(projection)

    # Normalize
    scale = math.sqrt(2.0 / self.num_features)
    return features * scale
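The `_hadamard_transform` helper is not shown on this page. A minimal sketch of a fast Walsh-Hadamard transform and of one "HD" block is given below in plain Python; it is unnormalized and requires a power-of-two length, both of which are assumptions about what the library does with `d_padded`. The function names are hypothetical.

```python
def hadamard_transform(values):
    """Unnormalized fast Walsh-Hadamard transform in O(n log n)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hd_block(x, signs):
    """One HD block: multiply by a diagonal of +/-1 signs, then apply H."""
    return hadamard_transform([s * v for s, v in zip(signs, x)])


x = [1.0, 2.0, 3.0, 4.0]
once = hadamard_transform(x)
twice = hadamard_transform(once)  # equals len(x) * x for the unnormalized H
mixed = hd_block(x, [1.0, -1.0, 1.0, -1.0])
```

Applying the transform twice returns the input scaled by its length, which is a convenient sanity check; dividing by the square root of the length after each pass would make the transform orthonormal.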

RFFAttentionKernel ¶

RFFAttentionKernel(input_dim: int, num_features: int, kernel_type: Literal['softmax', 'relu', 'elu'] = 'softmax', use_orthogonal: bool = True, redraw: bool = False, seed: int | None = None)

Bases: RandomFeatureMap

Random Fourier Features specifically designed for attention mechanisms.

Implements positive random features for use in linear attention, following the Performer architecture.

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Dimension of input vectors (typically head_dim).	required
`num_features`	`int`	Number of random features.	required
`kernel_type`	`Literal['softmax', 'relu', 'elu']`	Type of kernel approximation.	`"softmax"`
`use_orthogonal`	`bool`	If True, use orthogonal random features.	`True`
`redraw`	`bool`	If True, redraw random features at each forward pass.	`False`
`seed`	`int \| None`	Random seed.	`None`

Methods:

Name	Description
`forward`	Apply random feature map for attention.

Source code in spectrans/kernels/rff.py

def __init__(
    self,
    input_dim: int,
    num_features: int,
    kernel_type: Literal["softmax", "relu", "elu"] = "softmax",
    use_orthogonal: bool = True,
    redraw: bool = False,
    seed: int | None = None,
):
    super().__init__(input_dim, num_features, seed=seed)

    self.kernel_type = kernel_type
    self.use_orthogonal = use_orthogonal
    self.redraw = redraw

    if not redraw:
        self._initialize_parameters()

Functions¶

forward ¶

forward(x: Tensor) -> Tensor

Apply random feature map for attention.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Positive feature mapped tensor of shape (..., n, D).

Source code in spectrans/kernels/rff.py

def forward(self, x: Tensor) -> Tensor:
    """Apply random feature map for attention.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Positive feature mapped tensor of shape (..., n, D).
    """
    if self.redraw:
        # Redraw random features (useful for training)
        projection = (
            self._sample_orthogonal_gaussian()
            if self.use_orthogonal
            else torch.randn(self.input_dim, self.num_features, device=x.device)
        )
        projection = projection / math.sqrt(self.input_dim)
    else:
        projection = self.projection

    # Linear projection
    z = torch.matmul(x, projection)

    if self.kernel_type == "softmax":
        # Positive features for softmax kernel approximation
        # $\varphi(\mathbf{x}) = \exp(\mathbf{x}^T \omega - \|\mathbf{x}\|^2/2) / \sqrt{D}$, with $D$ = num_features
        x_norm_sq = torch.sum(x**2, dim=-1, keepdim=True) / 2
        features = torch.exp(z - x_norm_sq)
        scale = 1.0 / math.sqrt(self.num_features)

    elif self.kernel_type == "relu":
        # ReLU kernel: $\max(0, \mathbf{x}^T \omega)$
        features = F.relu(z)
        scale = math.sqrt(2.0 / self.num_features)

    else:  # elu
        # ELU kernel for smooth approximation
        features = F.elu(z) + 1
        scale = 1.0 / math.sqrt(self.num_features)

    return features * scale
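The softmax branch relies on the identity \(\mathbb{E}_{\boldsymbol{\omega} \sim \mathcal{N}(0, I)}\left[e^{\boldsymbol{\omega}^T \mathbf{x} - \|\mathbf{x}\|^2/2}\, e^{\boldsymbol{\omega}^T \mathbf{y} - \|\mathbf{y}\|^2/2}\right] = e^{\mathbf{x}^T \mathbf{y}}\). The plain-Python sketch below checks it numerically. It assumes unscaled standard normal frequencies, whereas the redraw branch above additionally divides the projection by `sqrt(input_dim)`, so it is an illustration of the identity rather than a reproduction of the class. The vectors `q` and `k` and the helper name are hypothetical.

```python
import math
import random


def positive_features(x, omegas):
    """phi(x)_j = exp(omega_j . x - ||x||^2 / 2) / sqrt(D); always positive."""
    half_sq_norm = 0.5 * sum(a * a for a in x)
    scale = 1.0 / math.sqrt(len(omegas))
    return [scale * math.exp(sum(w * a for w, a in zip(omega, x)) - half_sq_norm)
            for omega in omegas]


rng = random.Random(0)
dim, num_features = 4, 20000
omegas = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]

q = [0.2, -0.1, 0.3, 0.0]
k = [0.1, 0.25, -0.2, 0.15]

phi_q = positive_features(q, omegas)
phi_k = positive_features(k, omegas)
approx = sum(a * b for a, b in zip(phi_q, phi_k))
exact = math.exp(sum(a * b for a, b in zip(q, k)))
print(exact, approx)
```

Because every feature is positive, the resulting attention weights cannot become negative, which is the property that distinguishes this map from the trigonometric features used elsewhere in the module. The estimator's variance grows quickly with the input norms, so inputs are usually kept small or rescaled.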

FourierKernel ¶

FourierKernel(rank: int, input_dim: int, learnable_filter: bool = True, filter_type: Literal['gaussian', 'butterworth', 'ideal'] = 'gaussian', cutoff_freq: float = 0.5)

Bases: Module, SpectralKernel

Kernel defined in Fourier domain.

Defines the kernel through a spectral filter applied to the inputs in frequency space.

Parameters:

Name	Type	Description	Default
`rank`	`int`	Number of Fourier modes.	required
`input_dim`	`int`	Input dimension.	required
`learnable_filter`	`bool`	Whether filter is learnable.	`True`
`filter_type`	`Literal['gaussian', 'butterworth', 'ideal']`	Type of spectral filter.	`"gaussian"`
`cutoff_freq`	`float`	Normalized cutoff frequency.	`0.5`

Attributes:

Name	Type	Description
`filter`	`Parameter or Tensor`	Spectral filter of shape (rank,).

Methods:

Name	Description
`compute`	Compute Fourier kernel.

Source code in spectrans/kernels/spectral.py

def __init__(
    self,
    rank: int,
    input_dim: int,
    learnable_filter: bool = True,
    filter_type: Literal["gaussian", "butterworth", "ideal"] = "gaussian",
    cutoff_freq: float = 0.5,
):
    # Use super() to initialize nn.Module (first in MRO)
    super().__init__()
    # Manually set attributes that SpectralKernel.__init__ would set
    self.rank = rank
    self.normalize = True

    self.input_dim = input_dim
    self.filter_type = filter_type
    self.cutoff_freq = cutoff_freq

    # Initialize spectral filter
    filter_vals = self._init_filter()

    if learnable_filter:
        self.filter = nn.Parameter(filter_vals)
    else:
        self.register_buffer("filter", filter_vals)

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute Fourier kernel.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute Fourier kernel.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    # Compute FFT of inputs
    x_freq = safe_rfft(x, dim=-1)
    y_freq = safe_rfft(y, dim=-1)

    # Truncate to rank modes
    x_freq = x_freq[..., : self.rank]
    y_freq = y_freq[..., : self.rank]

    # Apply spectral filter
    x_filtered = x_freq * self.filter
    y_filtered = y_freq * self.filter

    # Compute kernel in frequency domain:
    # K(x, y) = mean_k Re(X_filtered[k] * conj(Y_filtered[k])); no inverse FFT is applied
    kernel_freq = x_filtered.unsqueeze(-2) * y_filtered.unsqueeze(-3).conj()

    # Average over frequency dimension
    kernel: Tensor = kernel_freq.real.mean(dim=-1)

    return kernel
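
The computation above can be traced by hand on small inputs. Below is a minimal sketch in plain Python, assuming a naive DFT in place of `safe_rfft` and a made-up Gaussian-shaped filter (the real filter comes from `_init_filter`, which is not shown here). `rfft` and `fourier_kernel` are hypothetical helper names, not part of the library.

```python
import cmath
import math


def rfft(signal):
    """Naive one-sided DFT: the first len(signal) // 2 + 1 modes of a real signal."""
    n = len(signal)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(signal))
        for k in range(n // 2 + 1)
    ]


def fourier_kernel(xs, ys, spectral_filter):
    """K[i][j] = mean_k Re((f_k X_i[k]) * conj(f_k Y_j[k]))."""
    rank = len(spectral_filter)

    def filtered(row):
        # Truncate to `rank` modes, then weight each mode by the filter.
        return [f * c for f, c in zip(spectral_filter, rfft(row)[:rank])]

    xf = [filtered(row) for row in xs]
    yf = [filtered(row) for row in ys]
    return [
        [sum((a * b.conjugate()).real for a, b in zip(xi, yj)) / rank for yj in yf]
        for xi in xf
    ]


# Two length-4 signals: a pure mode-1 cosine and a constant (mode 0 only).
xs = [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
# Assumed filter shape for illustration only; equals 1 at k = 0.
gaussian_filter = [math.exp(-0.5 * (k / 2.0) ** 2) for k in range(3)]
K = fourier_kernel(xs, xs, gaussian_filter)
```

The two signals occupy different frequency modes, so their cross term `K[0][1]` vanishes up to rounding, while the diagonal stays positive. A real FFT of a length-4 input yields only 3 modes, so the filter in this sketch cannot usefully be longer than that.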

LearnableSpectralKernel ¶

LearnableSpectralKernel(input_dim: int, rank: int, init_scale: float = 1.0, trainable_eigenvectors: bool = True, normalize: bool = True)

Bases: Module, SpectralKernel

Spectral kernel with learnable eigenvalues and eigenvectors.

Parameters:

Name	Type	Description	Default
`input_dim`	`int`	Input dimension.	required
`rank`	`int`	Number of spectral components.	required
`init_scale`	`float`	Initialization scale.	`1.0`
`trainable_eigenvectors`	`bool`	Whether eigenvectors are trainable.	`True`
`normalize`	`bool`	Whether to normalize.	`True`

Attributes:

Name	Type	Description
`eigenvectors`	`Parameter`	Learnable eigenvectors of shape (input_dim, rank).
`eigenvalues`	`Parameter`	Learnable eigenvalues of shape (rank,).

Methods:

Name	Description
`compute`	Compute learnable spectral kernel.
`extract_features`	Extract spectral features.
`forward`	Forward pass for nn.Module compatibility.
`orthogonalize_eigenvectors`	Re-orthonormalize eigenvectors via QR decomposition.

Source code in spectrans/kernels/spectral.py

def __init__(
    self,
    input_dim: int,
    rank: int,
    init_scale: float = 1.0,
    trainable_eigenvectors: bool = True,
    normalize: bool = True,
):
    nn.Module.__init__(self)
    SpectralKernel.__init__(self, rank, normalize)

    self.input_dim = input_dim
    self.trainable_eigenvectors = trainable_eigenvectors

    # Initialize eigenvectors (orthogonal)
    eigenvectors = torch.randn(input_dim, rank) * init_scale
    eigenvectors, _ = torch.linalg.qr(eigenvectors)

    if trainable_eigenvectors:
        self.eigenvectors = nn.Parameter(eigenvectors)
    else:
        self.register_buffer("eigenvectors", eigenvectors)

    # Initialize eigenvalues (positive, decreasing)
    eigenvalues = torch.linspace(1.0, 0.1, rank) * init_scale
    self.eigenvalues = nn.Parameter(eigenvalues)

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute learnable spectral kernel.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute learnable spectral kernel.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    # Project to eigenspace
    x_proj = torch.matmul(x, self.eigenvectors)  # (..., n, r)
    y_proj = torch.matmul(y, self.eigenvectors)  # (..., m, r)

    # Apply eigenvalue weighting
    x_weighted = x_proj * torch.sqrt(torch.abs(self.eigenvalues) + 1e-8)
    y_weighted = y_proj * torch.sqrt(torch.abs(self.eigenvalues) + 1e-8)

    # Compute kernel
    kernel = torch.matmul(x_weighted, y_weighted.transpose(-2, -1))

    if self.normalize:
        # L2-normalize each row of the kernel matrix
        kernel = F.normalize(kernel, p=2, dim=-1)

    return kernel

extract_features ¶

extract_features(x: Tensor) -> Tensor

Extract spectral features.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input of shape (..., n, d).	required

Returns:

Type	Description
`Tensor`	Spectral features of shape (..., n, r).

Source code in spectrans/kernels/spectral.py

def extract_features(self, x: Tensor) -> Tensor:
    """Extract spectral features.

    Parameters
    ----------
    x : Tensor
        Input of shape (..., n, d).

    Returns
    -------
    Tensor
        Spectral features of shape (..., n, r).
    """
    # Project to eigenspace
    features = torch.matmul(x, self.eigenvectors)

    # Weight by eigenvalues
    features = features * torch.sqrt(torch.abs(self.eigenvalues) + 1e-8)

    return features
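
Going by the source shown above, the inner product of two feature sets equals the un-normalized output of `compute`, since both weight the projections by the same square-rooted eigenvalues. Here is a small plain-Python sketch of that relationship; the matrices, the `matmul` and `transpose` helpers, and the chosen eigenvalues are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def extract_features(x, eigenvectors, eigenvalues, eps=1e-8):
    """Project onto the eigenvectors, then scale column k by sqrt(|lambda_k| + eps)."""
    projected = matmul(x, eigenvectors)
    weights = [math.sqrt(abs(lam) + eps) for lam in eigenvalues]
    return [[p * w for p, w in zip(row, weights)] for row in projected]


# Hypothetical values: input_dim = 3, rank = 2, orthonormal columns.
eigenvectors = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
eigenvalues = [1.0, 0.25]

x = [[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]]  # n = 2 points
y = [[2.0, 0.0, 1.0]]                    # m = 1 point

phi_x = extract_features(x, eigenvectors, eigenvalues)
phi_y = extract_features(y, eigenvectors, eigenvalues)

# Un-normalized kernel matrix of shape (n, m).
kernel = matmul(phi_x, transpose(phi_y))
```

With these values the first entry is `1.0 * (1 * 2) + 0.25 * (2 * 0) = 2`, up to the small `eps` term. Working with the `(n, r)` features directly avoids forming the `(n, m)` matrix when only products with it are needed.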

forward ¶

forward(x: Tensor, y: Tensor | None = None) -> Tensor

Forward pass for nn.Module compatibility.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor | None`	Second input of shape (..., m, d). If None, returns features.	`None`

Returns:

Type	Description
`Tensor`	Spectral features of shape (..., n, r) if `y` is None; otherwise the kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def forward(self, x: Tensor, y: Tensor | None = None) -> Tensor:
    """Forward pass for nn.Module compatibility.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor | None, default=None
        Second input. If None, returns features.

    Returns
    -------
    Tensor
        Spectral features of shape (..., n, r) if y is None; otherwise the
        kernel matrix of shape (..., n, m).
    """
    if y is None:
        return self.extract_features(x)
    else:
        return self.compute(x, y)

orthogonalize_eigenvectors ¶

orthogonalize_eigenvectors() -> None

Re-orthonormalize eigenvectors via QR decomposition. Has no effect when the eigenvectors are not trainable.

Source code in spectrans/kernels/spectral.py

def orthogonalize_eigenvectors(self) -> None:
    """Re-orthonormalize eigenvectors via QR decomposition."""
    if self.trainable_eigenvectors:
        with torch.no_grad():
            Q, _ = torch.linalg.qr(self.eigenvectors)
            self.eigenvectors.data = Q

PolynomialSpectralKernel ¶

PolynomialSpectralKernel(rank: int, degree: int = 2, coef0: float = 1.0, alpha: float = 1.0, normalize: bool = True)

Bases: SpectralKernel

Polynomial kernel with spectral decomposition.

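Computes \((\alpha\,\mathbf{X}\mathbf{Y}^T + c)^d\), where \(\alpha\) is `alpha`, \(c\) is `coef0`, and \(d\) is `degree`. `compute` evaluates this expression directly, while `compute_attention` evaluates it on rank-reduced inputs.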

Parameters:

Name	Type	Description	Default
`rank`	`int`	Rank of spectral approximation.	required
`degree`	`int`	Polynomial degree.	`2`
`coef0`	`float`	Constant coefficient.	`1.0`
`alpha`	`float`	Scaling factor.	`1.0`
`normalize`	`bool`	Whether to normalize.	`True`

Attributes:

Name	Type	Description
`degree`	`int`	Polynomial degree.
`coef0`	`float`	Constant term.
`alpha`	`float`	Scale factor.

Methods:

Name	Description
`compute`	Compute polynomial spectral kernel.
`compute_attention`	Compute attention weights using spectral decomposition.

Source code in spectrans/kernels/spectral.py

def __init__(
    self,
    rank: int,
    degree: int = 2,
    coef0: float = 1.0,
    alpha: float = 1.0,
    normalize: bool = True,
):
    super().__init__(rank, normalize)
    self.degree = degree
    self.coef0 = coef0
    self.alpha = alpha

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute polynomial spectral kernel.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute polynomial spectral kernel.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    # Standard polynomial kernel
    inner = torch.matmul(x, y.transpose(-2, -1))
    kernel = (self.alpha * inner + self.coef0) ** self.degree

    if self.normalize:
        # Normalize by the product of the input norms, ||x_i|| * ||y_j||
        x_norm = torch.norm(x, dim=-1, keepdim=True)
        y_norm = torch.norm(y, dim=-1, keepdim=True)
        norm_matrix = torch.matmul(x_norm, y_norm.transpose(-2, -1))
        kernel = kernel / (norm_matrix + 1e-8)

    return kernel
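
A worked scalar example makes the arithmetic in `compute` concrete. This is a plain-Python sketch for a single pair of vectors that mirrors the formula above; the function name `polynomial_kernel` and the input values are illustrative assumptions.

```python
import math


def polynomial_kernel(x, y, alpha=1.0, coef0=1.0, degree=2, normalize=True, eps=1e-8):
    """(alpha * <x, y> + coef0) ** degree, optionally divided by ||x|| * ||y||."""
    inner = sum(a * b for a, b in zip(x, y))
    value = (alpha * inner + coef0) ** degree
    if normalize:
        x_norm = math.sqrt(sum(a * a for a in x))
        y_norm = math.sqrt(sum(b * b for b in y))
        value = value / (x_norm * y_norm + eps)
    return value


# <x, y> = 1*3 + 2*4 = 11, so the raw kernel is (11 + 1) ** 2 = 144.
raw = polynomial_kernel([1.0, 2.0], [3.0, 4.0], normalize=False)
# ||x|| * ||y|| = sqrt(5) * 5, so the normalized value is 144 / (5 * sqrt(5)).
scaled = polynomial_kernel([1.0, 2.0], [3.0, 4.0])
```

In this example the normalized value is about 12.88, so dividing by the norm product rescales the kernel but does not confine it to the range [-1, 1].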

compute_attention ¶

compute_attention(q: Tensor, k: Tensor) -> Tensor

Compute attention weights using spectral decomposition.

Parameters:

Name	Type	Description	Default
`q`	`Tensor`	Queries of shape (..., n, d).	required
`k`	`Tensor`	Keys of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Attention weights of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def compute_attention(self, q: Tensor, k: Tensor) -> Tensor:
    """Compute attention weights using spectral decomposition.

    Parameters
    ----------
    q : Tensor
        Queries of shape (..., n, d).
    k : Tensor
        Keys of shape (..., m, d).

    Returns
    -------
    Tensor
        Attention weights of shape (..., n, m).
    """
    # Low-rank approximation via SVD
    # Q = U_q S_q V_q^T, K = U_k S_k V_k^T

    # Compute QK^T approximately
    q_reduced = self._reduce_rank(q)  # (..., n, r)
    k_reduced = self._reduce_rank(k)  # (..., m, r)

    # Polynomial kernel in reduced space
    inner = torch.matmul(q_reduced, k_reduced.transpose(-2, -1))
    attention = (self.alpha * inner + self.coef0) ** self.degree

    if self.normalize:
        attention = F.softmax(attention, dim=-1)

    return attention
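
The normalization step above turns polynomial scores into rows that sum to one. The following plain-Python sketch covers only that scoring-plus-softmax step. It deliberately leaves out the rank reduction, because `_reduce_rank` is not shown in this section; `softmax` and `polynomial_attention` are hypothetical helper names.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def polynomial_attention(q, k, alpha=1.0, coef0=1.0, degree=2):
    """Softmax over (alpha * <q_i, k_j> + coef0) ** degree for each query row."""
    scores = [
        [(alpha * sum(a * b for a, b in zip(qi, kj)) + coef0) ** degree for kj in k]
        for qi in q
    ]
    return [softmax(row) for row in scores]


q = [[1.0, 0.0], [0.0, 1.0]]              # n = 2 queries
k = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # m = 3 keys
weights = polynomial_attention(q, k)       # shape (n, m)
```

For the first query the raw scores are 4, 2.25, and 1, so the weights decrease from the first key to the last; the second query produces the mirror image.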

SpectralKernel ¶

SpectralKernel(rank: int, normalize: bool = True)

Bases: KernelFunction

Base class for spectral kernel functions.

Spectral kernels use eigendecomposition or spectral analysis for efficient kernel computation.

Parameters:

Name	Type	Description	Default
`rank`	`int`	Rank of spectral approximation.	required
`normalize`	`bool`	Whether to normalize kernel values.	`True`

Attributes:

Name	Type	Description
`rank`	`int`	Approximation rank.
`normalize`	`bool`	Normalization flag.

Methods:

Name	Description
`spectral_decomposition`	Compute spectral decomposition of input.

Source code in spectrans/kernels/spectral.py

def __init__(self, rank: int, normalize: bool = True):
    self.rank = rank
    self.normalize = normalize

Functions¶

spectral_decomposition ¶

spectral_decomposition(x: Tensor) -> tuple[Tensor, Tensor]

Compute spectral decomposition of input.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., n, d).	required

Returns:

Name	Type	Description
`eigenvectors`	`Tensor`	Eigenvectors of shape (..., n, rank).
`eigenvalues`	`Tensor`	Eigenvalues of shape (..., rank).

Source code in spectrans/kernels/spectral.py

def spectral_decomposition(self, x: Tensor) -> tuple[Tensor, Tensor]:
    """Compute spectral decomposition of input.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    eigenvectors : Tensor
        Eigenvectors of shape (..., n, rank).
    eigenvalues : Tensor
        Eigenvalues of shape (..., rank).
    """
    # Compute Gram matrix
    gram = torch.matmul(x, x.transpose(-2, -1))

    # Eigendecomposition
    eigenvalues, eigenvectors = torch.linalg.eigh(gram)

    # Keep the top-k eigenpairs (eigh returns eigenvalues in ascending order)
    eigenvalues = eigenvalues[..., -self.rank :]
    eigenvectors = eigenvectors[..., -self.rank :]

    if self.normalize:
        # Normalize by the sum of the retained eigenvalues
        # (equals the Gram trace only when every eigenvalue is kept)
        trace = eigenvalues.sum(dim=-1, keepdim=True)
        eigenvalues = eigenvalues / (trace + 1e-8)

    return eigenvectors, eigenvalues
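
The effect of `rank` and `normalize` on the returned eigenvalues is easiest to see on a two-point input, where the Gram matrix is 2 x 2 and its eigenvalues have a closed form. This is a plain-Python sketch under that assumption, not the library's `torch.linalg.eigh` path; the helper names are hypothetical.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] in ascending order."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return [mean - radius, mean + radius]


def top_eigenvalues(x, rank, normalize=True, eps=1e-8):
    """Top-`rank` eigenvalues of the Gram matrix x x^T for a two-row input."""
    a = sum(v * v for v in x[0])
    b = sum(u * v for u, v in zip(x[0], x[1]))
    c = sum(v * v for v in x[1])
    kept = symmetric_2x2_eigenvalues(a, b, c)[-rank:]
    if normalize:
        # Divide by the sum of the eigenvalues that were kept.
        total = sum(kept)
        kept = [lam / (total + eps) for lam in kept]
    return kept


x = [[2.0, 0.0], [1.0, 1.0]]      # Gram matrix [[4, 2], [2, 2]]
both = top_eigenvalues(x, rank=2)  # fractions of the full spectrum
top1 = top_eigenvalues(x, rank=1)  # a single retained eigenvalue
```

With `rank=1` the single retained eigenvalue normalizes to approximately 1 whatever its original size, because the divisor is the sum of the kept eigenvalues rather than the full trace.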

TruncatedSVDKernel ¶

TruncatedSVDKernel(rank: int, normalize: bool = True, use_randomized: bool = False)

Bases: SpectralKernel

Kernel approximation via truncated SVD.

Uses SVD to compute a rank-`rank` approximation of the linear kernel matrix \(\mathbf{X}\mathbf{Y}^T\).

Parameters:

Name	Type	Description	Default
`rank`	`int`	Truncation rank.	required
`normalize`	`bool`	Whether to normalize.	`True`
`use_randomized`	`bool`	Use randomized SVD for large matrices.	`False`

Attributes:

Name	Type	Description
`use_randomized`	`bool`	Whether to use randomized algorithms.

Methods:

Name	Description
`compute`	Compute kernel via truncated SVD.

Source code in spectrans/kernels/spectral.py

def __init__(
    self,
    rank: int,
    normalize: bool = True,
    use_randomized: bool = False,
):
    super().__init__(rank, normalize)
    self.use_randomized = use_randomized

Functions¶

compute ¶

compute(x: Tensor, y: Tensor) -> Tensor

Compute kernel via truncated SVD.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	First input of shape (..., n, d).	required
`y`	`Tensor`	Second input of shape (..., m, d).	required

Returns:

Type	Description
`Tensor`	Approximate kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/spectral.py

def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute kernel via truncated SVD.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Approximate kernel matrix of shape (..., n, m).
    """
    # Compute the full linear kernel matrix x y^T
    kernel_full = torch.matmul(x, y.transpose(-2, -1))

    if self.use_randomized:
        # Randomized SVD (faster for large matrices)
        kernel_approx = self._randomized_svd_approximation(kernel_full)
    else:
        # Standard SVD
        U, S, Vt = torch.linalg.svd(kernel_full, full_matrices=False)

        # Truncate to rank
        U_r = U[..., : self.rank]
        S_r = S[..., : self.rank]
        Vt_r = Vt[..., : self.rank, :]

        # Reconstruct
        kernel_approx = torch.matmul(U_r * S_r.unsqueeze(-2), Vt_r)

    if self.normalize:
        # Normalize rows
        row_norms = kernel_approx.norm(dim=-1, keepdim=True)
        kernel_approx = kernel_approx / (row_norms + 1e-8)

    return kernel_approx
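
The standard (non-randomized) branch above keeps the leading `rank` singular triplets of the linear kernel matrix. As a reminder of what that truncation guarantees, here is the Eckart-Young result for the un-normalized matrix. It is a general linear-algebra fact rather than something taken from the library's documentation, and \(r\) stands for `rank`:

\[
\mathbf{K} = \mathbf{X}\mathbf{Y}^T = \sum_{i} \sigma_i \mathbf{u}_i \mathbf{v}_i^T,
\qquad
\mathbf{K}_r = \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^T,
\qquad
\|\mathbf{K} - \mathbf{K}_r\|_2 = \sigma_{r+1}.
\]

Here \(\sigma_{r+1}\) is taken as zero once \(r\) reaches the number of nonzero singular values. Because \(\mathbf{X}\mathbf{Y}^T\) has rank at most \(\min(n, m, d)\), choosing `rank` at or above that bound leaves the matrix unchanged, up to floating-point error, before the row normalization is applied.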
