Random Fourier Features¶
spectrans.kernels.rff ¶
Random Fourier Features (RFF) for kernel approximation.
This module implements Random Fourier Features (RFF), a technique for approximating shift-invariant kernels with explicit, finite-dimensional feature maps. Kernel operations that would normally take time quadratic in the number of points can then be computed in time linear in the number of points.
The implementation supports various kernel types including Gaussian (RBF), Laplacian, and other shift-invariant kernels. It also includes orthogonal random features that reduce approximation variance.
Classes:
| Name | Description |
|---|---|
| GaussianRFFKernel | Gaussian/RBF kernel with RFF approximation. |
| LaplacianRFFKernel | Laplacian kernel with RFF approximation. |
| OrthogonalRandomFeatures | Orthogonal variant of random features for better approximation. |
| RFFAttentionKernel | Specialized RFF for attention mechanisms. |
Examples:
Basic Gaussian RFF usage:
>>> import torch
>>> from spectrans.kernels.rff import GaussianRFFKernel
>>> kernel = GaussianRFFKernel(input_dim=64, num_features=256, sigma=1.0)
>>> x = torch.randn(32, 100, 64) # (batch, sequence, dim)
>>> features = kernel.feature_map(x)
>>> assert features.shape == (32, 100, 256)
Computing approximate kernel matrix:
>>> y = torch.randn(32, 50, 64)
>>> K_approx = kernel.kernel_approximation(x, y)
>>> assert K_approx.shape == (32, 100, 50)
Using orthogonal features:
>>> from spectrans.kernels.rff import OrthogonalRandomFeatures
>>> orf = OrthogonalRandomFeatures(input_dim=64, num_features=256)
>>> features = orf(x)
Notes
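For a shift-invariant kernel \(k(\mathbf{x}, \mathbf{y}) = \kappa(\mathbf{x} - \mathbf{y})\) with Fourier transform \(p(\omega)\), Bochner's theorem gives:
\[k(\mathbf{x}, \mathbf{y}) = \int p(\omega) \exp\left(i \omega^T (\mathbf{x} - \mathbf{y})\right) d\omega = \mathbb{E}_{\omega \sim p}\left[\exp\left(i \omega^T (\mathbf{x} - \mathbf{y})\right)\right]\]
The RFF approximation samples frequencies \(\omega_1, \ldots, \omega_D \sim p(\omega)\) and phase shifts \(b_1, \ldots, b_D \sim \text{Uniform}[0, 2\pi]\), and uses:
\[\varphi(\mathbf{x}) = \sqrt{\frac{2}{D}} \left[\cos(\omega_1^T \mathbf{x} + b_1), \ldots, \cos(\omega_D^T \mathbf{x} + b_D)\right]^T\]
This gives \(k(\mathbf{x}, \mathbf{y}) \approx \varphi(\mathbf{x})^T \varphi(\mathbf{y})\) with approximation error \(O(1/\sqrt{D})\), where \(D\) is the number of random features.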
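For the Gaussian kernel \(\exp\left(-\|\mathbf{x} - \mathbf{y}\|^2 / (2\sigma^2)\right)\): \(p(\omega) = \mathcal{N}(0, \sigma^{-2} I)\), so a larger bandwidth \(\sigma\) gives lower frequencies.
For the Laplacian kernel \(\exp\left(-\|\mathbf{x} - \mathbf{y}\|_1 / \sigma\right)\): each coordinate of \(\omega\) is drawn independently from \(\text{Cauchy}(0, 1/\sigma)\).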
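The estimator described in these notes can be sketched in a few lines of plain PyTorch. This is a minimal illustration, not the module's implementation: the function name `gaussian_rff` and all sizes are hypothetical. It assumes the Gaussian kernel \(\exp(-\|\mathbf{x} - \mathbf{y}\|^2 / (2\sigma^2))\), for which the frequencies have standard deviation \(1/\sigma\), and cosine features with uniform random phases.

```python
import math
import torch

def gaussian_rff(x, omega, bias):
    """Hypothetical helper: sqrt(2/D) * cos(x @ omega + bias) over the last dim."""
    num_features = omega.shape[1]
    return math.sqrt(2.0 / num_features) * torch.cos(x @ omega + bias)

torch.manual_seed(0)
d, D, sigma = 8, 4096, 1.5

# Frequencies for exp(-||x - y||^2 / (2 sigma^2)): omega ~ N(0, sigma^{-2} I).
omega = torch.randn(d, D) / sigma
# Phase shifts drawn uniformly from [0, 2*pi).
bias = 2 * math.pi * torch.rand(D)

x = torch.randn(5, d)
y = torch.randn(7, d)

# Inner products of feature vectors approximate the kernel matrix.
K_approx = gaussian_rff(x, omega, bias) @ gaussian_rff(y, omega, bias).T
K_exact = torch.exp(-torch.cdist(x, y) ** 2 / (2 * sigma**2))
max_err = (K_approx - K_exact).abs().max().item()
```

Increasing `D` should shrink `max_err` roughly like \(1/\sqrt{D}\), at the cost of a wider feature matrix.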
References
Ali Rahimi and Benjamin Recht. 2007. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (NeurIPS 2007), pages 1177-1184.
Felix X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, and Sanjiv Kumar. 2016. Orthogonal random features. In Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pages 1975-1983.
Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, and Adrian Weller. 2021. Rethinking attention with performers. In Proceedings of the International Conference on Learning Representations (ICLR).
See Also
spectrans.kernels.base : Base kernel interfaces.
spectrans.layers.attention.spectral : Spectral attention using RFF.
Classes¶
GaussianRFFKernel ¶
GaussianRFFKernel(input_dim: int, num_features: int, sigma: float = 1.0, use_cos_sin: bool = False, orthogonal: bool = False, trainable: bool = False, seed: int | None = None)
Bases: ShiftInvariantKernel, RandomFeatureMap
Gaussian (RBF) kernel with Random Fourier Features approximation.
Implements the Gaussian kernel using RFF.
The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{y}\|^2}{2\sigma^2}\right)\).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of input vectors. | required |
| num_features | int | Number of random Fourier features. | required |
| sigma | float | Kernel bandwidth (standard deviation). | 1.0 |
| use_cos_sin | bool | If True, use both cos and sin features (doubles feature dimension). | False |
| orthogonal | bool | If True, use orthogonal random features. | False |
| trainable | bool | If True, make random parameters trainable. | False |
| seed | int \| None | Random seed for reproducibility. | None |

Attributes:
| Name | Type | Description |
|---|---|---|
| omega | Parameter or Tensor | Random frequencies of shape (input_dim, num_features). |
| bias | Parameter or Tensor | Random phase shifts of shape (num_features,). |

Methods:
| Name | Description |
|---|---|
| forward | Apply random Fourier feature map. |
| evaluate_difference | Evaluate Gaussian kernel on difference vectors. |
| spectral_density | Spectral density for Gaussian kernel (Gaussian distribution). |
Source code in spectrans/kernels/rff.py
Functions¶
forward ¶
Apply random Fourier feature map.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (..., n, d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Feature mapped tensor of shape (..., n, D) where D is self.output_features. |
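The output width D depends on the feature variant. As a rough sketch of the paired cos/sin form that `use_cos_sin` refers to, the snippet below builds features of width twice the number of frequencies. The helper `cos_sin_features` is hypothetical, and the exact scaling and layout used by `forward` are assumptions rather than something this page confirms.

```python
import math
import torch

def cos_sin_features(x, omega):
    """Hypothetical helper: [cos(x @ omega), sin(x @ omega)] / sqrt(m), width 2 * m."""
    m = omega.shape[1]
    proj = x @ omega
    return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1) / math.sqrt(m)

torch.manual_seed(0)
d, m, sigma = 16, 2048, 2.0
omega = torch.randn(d, m) / sigma   # Gaussian-kernel frequencies, std 1 / sigma

x = torch.randn(4, 10, d)           # (batch, n, d)
phi = cos_sin_features(x, omega)    # (batch, n, 2 * m)

# cos^2 + sin^2 = 1, so phi(x) . phi(x) equals k(x, x) = 1 up to rounding.
self_sim = (phi * phi).sum(dim=-1)

# Batched kernel approximation compared with the exact Gaussian kernel.
K_approx = phi @ phi.transpose(-1, -2)
K_exact = torch.exp(-torch.cdist(x, x) ** 2 / (2 * sigma**2))
max_err = (K_approx - K_exact).abs().max().item()
```

The paired form needs no phase shifts and reproduces the diagonal of the kernel matrix exactly, which is one reason to prefer it when the doubled width is affordable.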
evaluate_difference ¶
Evaluate Gaussian kernel on difference vectors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| diff | Tensor | Difference vectors of shape (..., d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Kernel values of shape (...). |
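To make the shape contract concrete, here is a small sketch of the exact Gaussian kernel evaluated on difference vectors, reducing the last dimension. The function `gaussian_kernel_on_diff` is a hypothetical stand-in written from the kernel formula above, not the method's source.

```python
import torch

def gaussian_kernel_on_diff(diff, sigma=1.0):
    """Hypothetical helper: exp(-||diff||^2 / (2 sigma^2)) over the last dimension."""
    return torch.exp(-diff.pow(2).sum(dim=-1) / (2 * sigma**2))

x = torch.tensor([[0.0, 0.0], [1.0, 1.0]])
y = torch.tensor([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])

# Broadcast to all pairwise differences: shape (2, 3, 2).
diff = x[:, None, :] - y[None, :, :]
K = gaussian_kernel_on_diff(diff, sigma=1.0)   # shape (2, 3)
```

A zero difference maps to a kernel value of 1, and values decay with squared Euclidean distance.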
spectral_density ¶
Spectral density for Gaussian kernel (Gaussian distribution).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| omega | Tensor | Frequency vectors of shape (..., d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Spectral density values of shape (...). |
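For reference, the density of \(\mathcal{N}(0, \sigma^{-2} I)\), which is the spectral density of the Gaussian kernel with bandwidth \(\sigma\), can be written out directly. The sketch below uses a hypothetical function `gaussian_spectral_density` and checks it against `torch.distributions.Normal`. Whether the library's method applies the same normalization is an assumption.

```python
import math
import torch

def gaussian_spectral_density(omega, sigma=1.0):
    """Hypothetical helper: density of N(0, sigma^{-2} I) at omega of shape (..., d)."""
    d = omega.shape[-1]
    norm_const = (sigma**2 / (2 * math.pi)) ** (d / 2)
    return norm_const * torch.exp(-0.5 * sigma**2 * omega.pow(2).sum(dim=-1))

torch.manual_seed(0)
omega = torch.randn(6, 3)
p = gaussian_spectral_density(omega, sigma=2.0)   # shape (6,)

# Cross-check: product of independent 1-D normals with std 1 / sigma.
reference = (
    torch.distributions.Normal(0.0, 1.0 / 2.0).log_prob(omega).sum(dim=-1).exp()
)
```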
LaplacianRFFKernel ¶
LaplacianRFFKernel(input_dim: int, num_features: int, sigma: float = 1.0, use_cos_sin: bool = False, trainable: bool = False, seed: int | None = None)
Bases: ShiftInvariantKernel, RandomFeatureMap
Laplacian kernel with Random Fourier Features approximation.
Implements the Laplacian kernel using RFF with Cauchy distribution.
The kernel function is: \(k(\mathbf{x}, \mathbf{y}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{y}\|_1}{\sigma}\right)\).
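As an illustration of the Cauchy sampling mentioned above, the following sketch approximates the Laplacian kernel with cosine features whose frequencies are drawn coordinate-wise from a Cauchy distribution with scale \(1/\sigma\). The helper `rff` and all sizes are hypothetical, and the use of random phase shifts is an assumption about the feature form.

```python
import math
import torch

torch.manual_seed(0)
d, D, sigma = 4, 8192, 2.0

# Each coordinate of each frequency vector is Cauchy(0, 1 / sigma).
omega = torch.distributions.Cauchy(0.0, 1.0 / sigma).sample((d, D))
bias = 2 * math.pi * torch.rand(D)

def rff(x):
    """Hypothetical helper: cosine random features with random phases."""
    return math.sqrt(2.0 / D) * torch.cos(x @ omega + bias)

x = torch.randn(6, d)
y = torch.randn(5, d)

K_approx = rff(x) @ rff(y).T
# Exact Laplacian kernel uses the L1 distance between points.
l1 = (x[:, None, :] - y[None, :, :]).abs().sum(dim=-1)
K_exact = torch.exp(-l1 / sigma)
max_err = (K_approx - K_exact).abs().max().item()
```

Cauchy samples are heavy-tailed, so a few frequencies are very large. The features stay bounded because of the cosine, which keeps the estimator's variance finite.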
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of input vectors. | required |
| num_features | int | Number of random Fourier features. | required |
| sigma | float | Kernel bandwidth parameter. | 1.0 |
| use_cos_sin | bool | If True, use both cos and sin features. | False |
| trainable | bool | If True, make random parameters trainable. | False |
| seed | int \| None | Random seed for reproducibility. | None |

Methods:
| Name | Description |
|---|---|
| forward | Apply random Fourier feature map. |
| evaluate_difference | Evaluate Laplacian kernel on difference vectors. |
| spectral_density | Spectral density for Laplacian kernel (Cauchy distribution). |
Functions¶
forward ¶
Apply random Fourier feature map.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (..., n, d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Feature mapped tensor of shape (..., n, D). |
evaluate_difference ¶
Evaluate Laplacian kernel on difference vectors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| diff | Tensor | Difference vectors of shape (..., d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Kernel values of shape (...). |
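A short sketch of the exact Laplacian kernel on difference vectors, written from the kernel formula for this class. The function `laplacian_kernel_on_diff` is hypothetical and only illustrates the L1 reduction over the last dimension.

```python
import torch

def laplacian_kernel_on_diff(diff, sigma=1.0):
    """Hypothetical helper: exp(-||diff||_1 / sigma) over the last dimension."""
    return torch.exp(-diff.abs().sum(dim=-1) / sigma)

# Three difference vectors with L1 norms 0, 2, and 7.
diff = torch.tensor([[0.0, 0.0], [1.0, -1.0], [3.0, 4.0]])
k = laplacian_kernel_on_diff(diff, sigma=2.0)   # shape (3,)
```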
spectral_density ¶
Spectral density for Laplacian kernel (Cauchy distribution).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| omega | Tensor | Frequency vectors of shape (..., d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Spectral density values of shape (...). |
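The Cauchy spectral density factorizes over coordinates. A minimal sketch, assuming scale \(1/\sigma\) per coordinate (the value implied by the kernel \(\exp(-\|\mathbf{x} - \mathbf{y}\|_1 / \sigma)\)) and a hypothetical function name, with a cross-check against `torch.distributions.Cauchy`:

```python
import math
import torch

def cauchy_spectral_density(omega, sigma=1.0):
    """Hypothetical helper: product of Cauchy(0, 1/sigma) densities over the last dim."""
    per_coord = sigma / (math.pi * (1.0 + (sigma * omega) ** 2))
    return per_coord.prod(dim=-1)

torch.manual_seed(0)
omega = torch.randn(5, 3)
p = cauchy_spectral_density(omega, sigma=2.0)   # shape (5,)

# Cross-check against the library distribution with scale 1 / sigma = 0.5.
reference = (
    torch.distributions.Cauchy(0.0, 0.5).log_prob(omega).sum(dim=-1).exp()
)
```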
OrthogonalRandomFeatures ¶
OrthogonalRandomFeatures(input_dim: int, num_features: int, kernel_type: Literal['gaussian', 'laplacian'] = 'gaussian', sigma: float = 1.0, use_hadamard: bool = False, trainable: bool = False, seed: int | None = None)
Bases: RandomFeatureMap
Orthogonal Random Features for kernel approximation.
Uses structured orthogonal matrices to reduce approximation variance compared to standard i.i.d. Gaussian features.
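One common way to build such features, following the construction in Yu et al. (2016), is to take a random orthogonal matrix from a QR decomposition and rescale its columns so each has the norm of a Gaussian vector. The sketch below is an assumption about the general technique, not this class's code. The function name is hypothetical and the Hadamard variant is not covered.

```python
import math
import torch

def orthogonal_gaussian_frequencies(d, num_features, sigma=1.0):
    """Hypothetical helper: (d, num_features) frequencies, orthogonal within each d x d block."""
    blocks = []
    for _ in range(math.ceil(num_features / d)):
        q, r = torch.linalg.qr(torch.randn(d, d))
        # Sign correction so q is uniformly distributed over orthogonal matrices.
        q = q * torch.sign(torch.diagonal(r))
        # Column norms of a Gaussian matrix follow a chi distribution with d dof.
        norms = torch.randn(d, d).norm(dim=0)
        blocks.append(q * norms)
    return torch.cat(blocks, dim=1)[:, :num_features] / sigma

torch.manual_seed(0)
d, D, sigma = 8, 2048, 1.5
omega = orthogonal_gaussian_frequencies(d, D, sigma)
bias = 2 * math.pi * torch.rand(D)

x = torch.randn(6, d)
phi = math.sqrt(2.0 / D) * torch.cos(x @ omega + bias)
K_approx = phi @ phi.T
K_exact = torch.exp(-torch.cdist(x, x) ** 2 / (2 * sigma**2))
max_err = (K_approx - K_exact).abs().max().item()

# Columns inside the first block are mutually orthogonal.
gram = omega[:, :d].T @ omega[:, :d]
off_diag = (gram - torch.diag(torch.diagonal(gram))).abs().max().item()
```

Each column still has the same marginal distribution as an i.i.d. Gaussian frequency, so the estimator stays unbiased while the orthogonality within blocks reduces its variance.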
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of input vectors. | required |
| num_features | int | Number of random features. | required |
| kernel_type | Literal['gaussian', 'laplacian'] | Type of kernel to approximate. | "gaussian" |
| sigma | float | Kernel bandwidth parameter. | 1.0 |
| use_hadamard | bool | If True, use fast Hadamard transform. | False |
| trainable | bool | If True, make scaling parameters trainable. | False |
| seed | int \| None | Random seed. | None |

Methods:
| Name | Description |
|---|---|
| forward | Apply orthogonal random feature map. |
Functions¶
forward ¶
Apply orthogonal random feature map.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (..., n, d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Feature mapped tensor of shape (..., n, D). |
RFFAttentionKernel ¶
RFFAttentionKernel(input_dim: int, num_features: int, kernel_type: Literal['softmax', 'relu', 'elu'] = 'softmax', use_orthogonal: bool = True, redraw: bool = False, seed: int | None = None)
Bases: RandomFeatureMap
Random Fourier Features specifically designed for attention mechanisms.
Implements positive random features for use in linear attention, following the Performer architecture.
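For the softmax case, the Performer paper uses positive features of the form \(\exp(\omega^T \mathbf{x} - \|\mathbf{x}\|^2 / 2) / \sqrt{D}\) with \(\omega \sim \mathcal{N}(0, I)\), whose inner products estimate \(\exp(\mathbf{x}^T \mathbf{y})\). The sketch below illustrates that idea under stated assumptions. The function `positive_softmax_features` is hypothetical, and the class may differ in scaling, numerical stabilization, and the `relu` and `elu` variants.

```python
import math
import torch

def positive_softmax_features(x, omega):
    """Hypothetical helper: exp(x @ omega - ||x||^2 / 2) / sqrt(D), strictly positive."""
    num_features = omega.shape[1]
    sq_norm = x.pow(2).sum(dim=-1, keepdim=True) / 2
    return torch.exp(x @ omega - sq_norm) / math.sqrt(num_features)

torch.manual_seed(0)
d, D = 8, 8192
omega = torch.randn(d, D)            # omega ~ N(0, I)

# Small-norm inputs keep the estimator's variance low in this demo.
q = 0.2 * torch.randn(5, d)
k = 0.2 * torch.randn(6, d)

phi_q = positive_softmax_features(q, omega)
phi_k = positive_softmax_features(k, omega)

approx = phi_q @ phi_k.T             # estimates exp(q . k)
exact = torch.exp(q @ k.T)
rel_err = ((approx - exact).abs() / exact).max().item()
```

Because every feature is positive, the estimated attention weights cannot go negative, which trigonometric features do not guarantee.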
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of input vectors (typically head_dim). | required |
| num_features | int | Number of random features. | required |
| kernel_type | Literal['softmax', 'relu', 'elu'] | Type of kernel approximation. | "softmax" |
| use_orthogonal | bool | If True, use orthogonal random features. | True |
| redraw | bool | If True, redraw random features at each forward pass. | False |
| seed | int \| None | Random seed. | None |

Methods:
| Name | Description |
|---|---|
| forward | Apply random feature map for attention. |
Functions¶
forward ¶
Apply random feature map for attention.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (..., n, d). | required |

Returns:
| Type | Description |
|---|---|
| Tensor | Positive feature mapped tensor of shape (..., n, D). |
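Positive features of this shape are typically consumed by a linear attention routine that never forms the n × n attention matrix. The following sketch shows that reordering with stand-in positive tensors. `linear_attention` is a hypothetical helper, not part of this module, and the random tensors only stand in for the output of a positive feature map.

```python
import torch

def linear_attention(phi_q, phi_k, v):
    """Hypothetical helper: normalized attention in O(n * D * d_v) time."""
    kv = phi_k.transpose(-2, -1) @ v            # (..., D, d_v)
    z = phi_k.sum(dim=-2)                       # (..., D)
    numerator = phi_q @ kv                      # (..., n, d_v)
    denominator = (phi_q * z.unsqueeze(-2)).sum(dim=-1, keepdim=True)  # (..., n, 1)
    return numerator / denominator

torch.manual_seed(0)
batch, n, D, d_v = 2, 10, 32, 4
# Stand-ins for positive feature maps of queries and keys.
phi_q = torch.rand(batch, n, D) + 0.1
phi_k = torch.rand(batch, n, D) + 0.1
v = torch.randn(batch, n, d_v)

out_linear = linear_attention(phi_q, phi_k, v)

# Quadratic reference: build the full attention matrix and normalize each row.
scores = phi_q @ phi_k.transpose(-2, -1)        # (batch, n, n)
weights = scores / scores.sum(dim=-1, keepdim=True)
out_quadratic = weights @ v
```

Both paths compute the same result. The linear path only changes the order of the matrix products, which is what makes the cost scale linearly in sequence length.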