Skip to content

Base Kernel Classes

spectrans.kernels.base

Base interfaces and classes for kernel functions and random feature maps.

This module defines the abstract base classes for kernel functions used in spectral attention mechanisms and other kernel-based methods. It provides interfaces for both explicit kernel evaluations and implicit feature map representations through random features.

The kernel framework supports various approximation techniques including Random Fourier Features (RFF), polynomial kernels, and spectral kernels, enabling computation of attention mechanisms with linear complexity.

Classes:

Name Description
KernelFunction

Abstract base class for kernel functions \(k(\mathbf{x}, \mathbf{y})\).

RandomFeatureMap

Abstract base class for random feature approximations.

ShiftInvariantKernel

Base class for shift-invariant (stationary) kernels.

Examples:

Implementing a custom kernel:

>>> import torch
>>> from spectrans.kernels.base import KernelFunction
>>> class LinearKernel(KernelFunction):
...     def compute(self, x, y):
...         return torch.matmul(x, y.transpose(-2, -1))

Using a random feature map:

>>> from spectrans.kernels.base import RandomFeatureMap
>>> class CustomFeatureMap(RandomFeatureMap):
...     def __init__(self, input_dim, num_features):
...         super().__init__(input_dim, num_features)
...         # Initialize random parameters
...     def forward(self, x):
...         # Return feature mapped tensor
...         pass
Notes

For shift-invariant kernels, Bochner's theorem states that any positive definite shift-invariant kernel can be represented as the Fourier transform of a non-negative measure:

\[ k(\mathbf{x} - \mathbf{y}) = \int p(\omega) \exp(i\omega^T(\mathbf{x}-\mathbf{y})) d\omega \]

This representation enables Random Fourier Features approximation through Monte Carlo sampling, where the kernel is approximated by the inner product of explicit feature maps \(k(\mathbf{x}, \mathbf{y}) \approx \varphi(\mathbf{x})^T \varphi(\mathbf{y})\). The feature map takes the form \(\varphi(\mathbf{x}) = \sqrt{\frac{2}{D}} [\cos(\omega_1^T\mathbf{x} + b_1), \ldots, \cos(\omega_D^T\mathbf{x} + b_D)]\) where \(\omega_i\) are sampled from \(p(\omega)\) and \(b_i\) from \(\text{Uniform}[0, 2\pi]\).

The approximation error decreases with \(O(1/\sqrt{D})\) where \(D\) is the number of random features.

References

Ali Rahimi and Benjamin Recht. 2007. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (NeurIPS 2007), pages 1177-1184.

Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, and Adrian Weller. 2021. Rethinking attention with performers. In Proceedings of the International Conference on Learning Representations (ICLR).

See Also

spectrans.kernels.rff : Random Fourier Features implementation. spectrans.kernels.spectral : Spectral kernel functions.

Classes

KernelFunction

Bases: ABC

Abstract base class for kernel functions.

A kernel function \(k(\mathbf{x}, \mathbf{y})\) defines a similarity measure between inputs \(\mathbf{x}\) and \(\mathbf{y}\), satisfying positive semi-definiteness properties. This interface supports both explicit kernel evaluation and feature map representations.

Methods:

Name Description
compute

Compute kernel values between x and y.

gram_matrix

Compute Gram matrix \(K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)\).

is_positive_definite

Check if the kernel yields a positive definite Gram matrix.

Functions
compute abstractmethod
compute(x: Tensor, y: Tensor) -> Tensor

Compute kernel values between x and y.

Parameters:

Name Type Description Default
x Tensor

First input tensor of shape (..., n, d).

required
y Tensor

Second input tensor of shape (..., m, d).

required

Returns:

Type Description
Tensor

Kernel matrix of shape (..., n, m) where element \((i,j)\) contains \(k(\mathbf{x}_i, \mathbf{y}_j)\).

Source code in spectrans/kernels/base.py
@abstractmethod
def compute(self, x: Tensor, y: Tensor) -> Tensor:
    r"""Compute kernel values between x and y.

    Parameters
    ----------
    x : Tensor
        First input tensor of shape (..., n, d).
    y : Tensor
        Second input tensor of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m) where element $(i,j)$
        contains $k(\mathbf{x}_i, \mathbf{y}_j)$.
    """
    pass
gram_matrix
gram_matrix(x: Tensor) -> Tensor

Compute Gram matrix \(K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)\).

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (..., n, d).

required

Returns:

Type Description
Tensor

Gram matrix of shape (..., n, n).

Source code in spectrans/kernels/base.py
def gram_matrix(self, x: Tensor) -> Tensor:
    r"""Compute Gram matrix $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Gram matrix of shape (..., n, n).
    """
    return self.compute(x, x)
is_positive_definite
is_positive_definite(x: Tensor, eps: float = 1e-06) -> bool

Check if the kernel yields a positive definite Gram matrix.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (..., n, d).

required
eps float

Tolerance for eigenvalue positivity check.

1e-6

Returns:

Type Description
bool

True if all eigenvalues of Gram matrix are > eps.

Source code in spectrans/kernels/base.py
def is_positive_definite(self, x: Tensor, eps: float = 1e-6) -> bool:
    """Check if the kernel yields a positive definite Gram matrix.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).
    eps : float, default=1e-6
        Tolerance for eigenvalue positivity check.

    Returns
    -------
    bool
        True if all eigenvalues of Gram matrix are > eps.
    """
    gram = self.gram_matrix(x)
    eigenvalues = torch.linalg.eigvalsh(gram)
    return bool(torch.all(eigenvalues > eps).item())

RandomFeatureMap

RandomFeatureMap(input_dim: int, num_features: int, kernel_scale: float = 1.0, seed: int | None = None)

Bases: Module, ABC

Abstract base class for random feature map approximations.

Random feature maps provide finite-dimensional approximations to kernel functions through the mapping:

.. math:: k(\mathbf{x}, \mathbf{y}) \approx \varphi(\mathbf{x})^T \varphi(\mathbf{y})

This enables linear-time computation of kernel operations.

Parameters:

Name Type Description Default
input_dim int

Dimension of input vectors.

required
num_features int

Number of random features (D).

required
kernel_scale float

Scaling parameter for the kernel.

1.0
seed int | None

Random seed for reproducibility.

None

Attributes:

Name Type Description
input_dim int

Input dimension.

num_features int

Number of random features.

kernel_scale float

Kernel scaling parameter.

Methods:

Name Description
forward

Apply feature map to input.

kernel_approximation

Approximate kernel matrix using feature maps.

Source code in spectrans/kernels/base.py
def __init__(
    self,
    input_dim: int,
    num_features: int,
    kernel_scale: float = 1.0,
    seed: int | None = None,
):
    super().__init__()
    self.input_dim = input_dim
    self.num_features = num_features
    self.kernel_scale = kernel_scale

    if seed is not None:
        torch.manual_seed(seed)
Functions
forward abstractmethod
forward(x: Tensor) -> Tensor

Apply feature map to input.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (..., n, d).

required

Returns:

Type Description
Tensor

Feature mapped tensor of shape (..., n, D) where D is the number of random features.

Source code in spectrans/kernels/base.py
@abstractmethod
def forward(self, x: Tensor) -> Tensor:
    """Apply feature map to input.

    Parameters
    ----------
    x : Tensor
        Input tensor of shape (..., n, d).

    Returns
    -------
    Tensor
        Feature mapped tensor of shape (..., n, D) where D
        is the number of random features.
    """
    pass
kernel_approximation
kernel_approximation(x: Tensor, y: Tensor) -> Tensor

Approximate kernel matrix using feature maps.

Parameters:

Name Type Description Default
x Tensor

First input of shape (..., n, d).

required
y Tensor

Second input of shape (..., m, d).

required

Returns:

Type Description
Tensor

Approximated kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py
def kernel_approximation(self, x: Tensor, y: Tensor) -> Tensor:
    """Approximate kernel matrix using feature maps.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Approximated kernel matrix of shape (..., n, m).
    """
    phi_x = self.forward(x)  # (..., n, D)
    phi_y = self.forward(y)  # (..., m, D)
    return torch.matmul(phi_x, phi_y.transpose(-2, -1))

ShiftInvariantKernel

ShiftInvariantKernel(bandwidth: float = 1.0)

Bases: KernelFunction

Base class for shift-invariant (stationary) kernels.

Shift-invariant kernels depend only on the difference \(\mathbf{x} - \mathbf{y}\), i.e., \(k(\mathbf{x}, \mathbf{y}) = k(\mathbf{x} - \mathbf{y}, \mathbf{0})\) \(= \kappa(\mathbf{x} - \mathbf{y})\) for some function \(\kappa\).

These kernels admit Random Fourier Features approximation via Bochner's theorem.

Parameters:

Name Type Description Default
bandwidth float

Kernel bandwidth parameter (inverse of length scale).

1.0

Attributes:

Name Type Description
bandwidth float

The bandwidth parameter.

Methods:

Name Description
evaluate_difference

Evaluate kernel on difference vectors.

compute

Compute kernel matrix for shift-invariant kernel.

spectral_density

Fourier transform of the kernel (spectral density).

Source code in spectrans/kernels/base.py
def __init__(self, bandwidth: float = 1.0):
    self.bandwidth = bandwidth
Functions
evaluate_difference abstractmethod
evaluate_difference(diff: Tensor) -> Tensor

Evaluate kernel on difference vectors.

Parameters:

Name Type Description Default
diff Tensor

Difference vectors \(\mathbf{x} - \mathbf{y}\) of shape (..., d).

required

Returns:

Type Description
Tensor

Kernel values \(\kappa(\text{diff})\) of shape (...).

Source code in spectrans/kernels/base.py
@abstractmethod
def evaluate_difference(self, diff: Tensor) -> Tensor:
    r"""Evaluate kernel on difference vectors.

    Parameters
    ----------
    diff : Tensor
        Difference vectors $\mathbf{x} - \mathbf{y}$ of shape (..., d).

    Returns
    -------
    Tensor
        Kernel values $\kappa(\text{diff})$ of shape (...).
    """
    pass
compute
compute(x: Tensor, y: Tensor) -> Tensor

Compute kernel matrix for shift-invariant kernel.

Parameters:

Name Type Description Default
x Tensor

First input of shape (..., n, d).

required
y Tensor

Second input of shape (..., m, d).

required

Returns:

Type Description
Tensor

Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py
def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute kernel matrix for shift-invariant kernel.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    # Compute pairwise differences
    x_expanded = x.unsqueeze(-2)  # (..., n, 1, d)
    y_expanded = y.unsqueeze(-3)  # (..., 1, m, d)
    diff = x_expanded - y_expanded  # (..., n, m, d)

    # Evaluate kernel on differences
    return self.evaluate_difference(diff)
spectral_density abstractmethod
spectral_density(omega: Tensor) -> Tensor

Fourier transform of the kernel (spectral density).

For shift-invariant kernels, this defines the sampling distribution for Random Fourier Features.

Parameters:

Name Type Description Default
omega Tensor

Frequency vectors of shape (..., d).

required

Returns:

Type Description
Tensor

Spectral density values of shape (...).

Source code in spectrans/kernels/base.py
@abstractmethod
def spectral_density(self, omega: Tensor) -> Tensor:
    """Fourier transform of the kernel (spectral density).

    For shift-invariant kernels, this defines the sampling
    distribution for Random Fourier Features.

    Parameters
    ----------
    omega : Tensor
        Frequency vectors of shape (..., d).

    Returns
    -------
    Tensor
        Spectral density values of shape (...).
    """
    pass

PolynomialKernel

PolynomialKernel(degree: int = 2, alpha: float = 1.0, coef0: float = 0.0)

Bases: KernelFunction

Polynomial kernel.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = (\alpha \langle \mathbf{x}, \mathbf{y} \rangle + c)^d\).

Parameters:

Name Type Description Default
degree int

Polynomial degree.

2
alpha float

Scaling of inner product.

1.0
coef0 float

Constant term.

0.0

Attributes:

Name Type Description
degree int

The polynomial degree.

alpha float

Inner product scaling.

coef0 float

Constant coefficient.

Methods:

Name Description
compute

Compute polynomial kernel matrix.

Source code in spectrans/kernels/base.py
def __init__(
    self,
    degree: int = 2,
    alpha: float = 1.0,
    coef0: float = 0.0,
):
    self.degree = degree
    self.alpha = alpha
    self.coef0 = coef0
Functions
compute
compute(x: Tensor, y: Tensor) -> Tensor

Compute polynomial kernel matrix.

Parameters:

Name Type Description Default
x Tensor

First input of shape (..., n, d).

required
y Tensor

Second input of shape (..., m, d).

required

Returns:

Type Description
Tensor

Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py
def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute polynomial kernel matrix.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    inner_product = torch.matmul(x, y.transpose(-2, -1))
    return (self.alpha * inner_product + self.coef0) ** self.degree

CosineKernel

CosineKernel(eps: float = 1e-08)

Bases: KernelFunction

Cosine similarity kernel.

The kernel function is: \(k(\mathbf{x}, \mathbf{y}) =\) \(\frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \|\mathbf{y}\|}\).

Parameters:

Name Type Description Default
eps float

Small value for numerical stability.

1e-8

Attributes:

Name Type Description
eps float

Numerical stability parameter.

Methods:

Name Description
compute

Compute cosine similarity kernel matrix.

Source code in spectrans/kernels/base.py
def __init__(self, eps: float = 1e-8):
    self.eps = eps
Functions
compute
compute(x: Tensor, y: Tensor) -> Tensor

Compute cosine similarity kernel matrix.

Parameters:

Name Type Description Default
x Tensor

First input of shape (..., n, d).

required
y Tensor

Second input of shape (..., m, d).

required

Returns:

Type Description
Tensor

Kernel matrix of shape (..., n, m).

Source code in spectrans/kernels/base.py
def compute(self, x: Tensor, y: Tensor) -> Tensor:
    """Compute cosine similarity kernel matrix.

    Parameters
    ----------
    x : Tensor
        First input of shape (..., n, d).
    y : Tensor
        Second input of shape (..., m, d).

    Returns
    -------
    Tensor
        Kernel matrix of shape (..., n, m).
    """
    x_norm = torch.norm(x, dim=-1, keepdim=True)  # (..., n, 1)
    y_norm = torch.norm(y, dim=-1, keepdim=True)  # (..., m, 1)

    x_normalized = x / (x_norm + self.eps)
    y_normalized = y / (y_norm + self.eps)

    return torch.matmul(x_normalized, y_normalized.transpose(-2, -1))