Boltz.Layers
API Reference
Boltz.Layers.ClassTokens Type
ClassTokens(dim; init=zeros32)
Appends class tokens to an input with embedding dimension dim
for use in many vision transformer models.
Boltz.Layers.ConvNormActivation Type
ConvNormActivation(kernel_size::Dims, in_chs::Integer, hidden_chs::Dims{N},
activation; norm_layer=nothing, conv_kwargs=(;), norm_kwargs=(;),
last_layer_activation::Bool=false) where {N}
Construct a Chain of convolutional layers with normalization and activation functions.
Arguments
- kernel_size: size of the convolutional kernel
- in_chs: number of input channels
- hidden_chs: dimensions of the hidden layers
- activation: activation function
Keyword Arguments
- norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used.
- conv_kwargs: keyword arguments for the convolutional layers
- norm_kwargs: keyword arguments for the normalization layers
- last_layer_activation: set to true to apply the activation function to the last layer
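As a rough usage sketch (not taken from the package documentation), the constructor can be called with the documented positional arguments and a conv_kwargs tuple. The channel counts, the input size, and the choice of pad=1 are illustrative assumptions; pad is assumed to be accepted by the underlying Lux convolution.

```julia
using Boltz, Lux, Random

# Two 3×3 convolutions mapping 3 -> 8 -> 16 channels. With the default
# last_layer_activation=false, tanh is applied after the first layer only.
# pad=1 is forwarded to the convolutions through conv_kwargs (assumed value).
model = Boltz.Layers.ConvNormActivation((3, 3), 3, (8, 16), tanh; conv_kwargs=(; pad=1))

ps, st = Lux.setup(Random.default_rng(), model)

x = randn(Float32, 16, 16, 3, 2)   # width × height × channels × batch
y, _ = model(x, ps, st)
size(y)
```

With a 3×3 kernel and a padding of 1 the spatial size should be preserved, so only the channel dimension changes.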
Boltz.Layers.DynamicExpressionsLayer Type
DynamicExpressionsLayer(operator_enum::OperatorEnum, expressions::Node...;
eval_options::EvalOptions=EvalOptions())
DynamicExpressionsLayer(operator_enum::OperatorEnum,
expressions::AbstractVector{<:Node}; kwargs...)
Wraps a DynamicExpressions.jl Node into a Lux layer and allows the constant nodes to be updated using any of the AD backends.
For details about these expressions, refer to the DynamicExpressions.jl documentation.
Arguments
- operator_enum: OperatorEnum from DynamicExpressions.jl
- expressions: Node from DynamicExpressions.jl or AbstractVector{<:Node}
Keyword Arguments
- turbo: Use LoopVectorization.jl for faster evaluation (Deprecated)
- bumper: Use Bumper.jl for faster evaluation (Deprecated)
- eval_options: EvalOptions from DynamicExpressions.jl
These options are simply forwarded to DynamicExpressions.jl's eval_tree_array and eval_grad_tree_array functions.
Extended Help
Example
julia> operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos]);
julia> x1 = Node(; feature=1);
julia> x2 = Node(; feature=2);
julia> expr_1 = x1 * cos(x2 - 3.2)
x1 * cos(x2 - 3.2)
julia> expr_2 = x2 - x1 * x2 + 2.5 - 1.0 * x1
((x2 - (x1 * x2)) + 2.5) - (1.0 * x1)
julia> layer = Layers.DynamicExpressionsLayer(operators, expr_1, expr_2);
julia> ps, st = Lux.setup(Random.default_rng(), layer)
((layer_1 = (layer_1 = (params = Float32[3.2],), layer_2 = (params = Float32[2.5, 1.0],)), layer_2 = NamedTuple()), (layer_1 = (layer_1 = NamedTuple(), layer_2 = NamedTuple()), layer_2 = NamedTuple()))
julia> x = [1.0f0 2.0f0 3.0f0
4.0f0 5.0f0 6.0f0]
2×3 Matrix{Float32}:
1.0 2.0 3.0
4.0 5.0 6.0
julia> layer(x, ps, st)[1] ≈ Float32[0.6967068 -0.4544041 -2.8266668; 1.5 -4.5 -12.5]
true
julia> ∂x, ∂ps, _ = Zygote.gradient(Base.Fix1(sum, abs2) ∘ first ∘ layer, x, ps, st);
julia> ∂x ≈ Float32[-14.0292 54.206482 180.32669; -0.9995737 10.7700815 55.6814]
true
julia> ∂ps.layer_1.layer_1.params ≈ Float32[-6.451908]
true
julia> ∂ps.layer_1.layer_2.params ≈ Float32[-31.0, 90.0]
true
Boltz.Layers.HamiltonianNN Type
HamiltonianNN{FST}(model; autodiff=nothing) where {FST}
Constructs a Hamiltonian Neural Network [1]. This neural network is useful for learning symmetries and conservation laws by supervision on the gradients of the trajectories. It takes as input a concatenated vector of length 2n containing the position (of size n) and momentum (of size n) of the particles. It then returns the time derivatives for position and momentum.
Arguments
- FST: If true, then the type of the state returned by the model must be the same as the type of the input state. See the documentation on StatefulLuxLayer for more information.
- model: A Lux.AbstractLuxLayer neural network that returns the Hamiltonian of the system. The model must return a "batched scalar", i.e. all the dimensions of the output except the last one must be equal to 1. The last dimension must be equal to the batch size of the input.
Keyword Arguments
- autodiff: The autodiff framework to be used for the internal Hamiltonian computation. The default is nothing, which selects the best possible backend available. The available options are AutoForwardDiff and AutoZygote.
Autodiff Backends
autodiff | Package Needed | Notes |
---|---|---|
AutoZygote | Zygote.jl | Preferred backend. Chosen if Zygote is loaded and autodiff is nothing. |
AutoForwardDiff | | Chosen if Zygote is not loaded and autodiff is nothing. |
Note
This layer uses nested autodiff. Please refer to the manual entry on Nested Autodiff for more information and known limitations.
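To make the input/output convention concrete, here is a minimal plain-Julia sketch of Hamilton's equations for a state z = [q; p]. It is not the layer's implementation: fd_gradient is a hypothetical finite-difference stand-in for the autodiff backend, and H is a hand-written harmonic-oscillator Hamiltonian standing in for the neural network.

```julia
# Central finite-difference gradient; a hypothetical stand-in for the autodiff backend.
function fd_gradient(f, z; h=1e-6)
    g = similar(z)
    for i in eachindex(z)
        e = zeros(eltype(z), length(z))
        e[i] = h
        g[i] = (f(z .+ e) - f(z .- e)) / (2h)
    end
    return g
end

# Hamilton's equations for z = [q; p]: dq/dt = ∂H/∂p and dp/dt = -∂H/∂q.
function hamiltonian_vector_field(H, z)
    n = length(z) ÷ 2
    g = fd_gradient(H, z)
    return vcat(g[(n + 1):(2n)], -g[1:n])
end

# Harmonic oscillator, standing in for the learned Hamiltonian.
H(z) = sum(abs2, z) / 2

dz = hamiltonian_vector_field(H, [1.0, 0.5])   # approximately [0.5, -1.0]
```

The first half of the result is the time derivative of the position and the second half that of the momentum, matching the concatenated layout of the input.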
Boltz.Layers.MLP Type
MLP(in_dims::Integer, hidden_dims::Dims{N}, activation=NNlib.relu; norm_layer=nothing,
dropout_rate::Real=0.0f0, dense_kwargs=(;), norm_kwargs=(;),
last_layer_activation=false) where {N}
Construct a multi-layer perceptron (MLP) with dense layers, optional normalization layers, and dropout.
Arguments
- in_dims: number of input dimensions
- hidden_dims: dimensions of the hidden layers
- activation: activation function (stacked after the normalization layer if present, otherwise after the dense layer)
Keyword Arguments
- norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used.
- dropout_rate: dropout rate (default: 0.0f0)
- dense_kwargs: keyword arguments for the dense layers
- norm_kwargs: keyword arguments for the normalization layers
- last_layer_activation: set to true to apply the activation function to the last layer
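The norm_layer signature is easiest to read from an example. The sketch below is an illustration under assumptions: batchnorm is a hypothetical helper written for this example, and the layer widths and batch size are arbitrary.

```julia
using Boltz, Lux, Random

# Hypothetical norm_layer following the documented signature
# f(i, dims, act; kwargs...). The layer index i is ignored here.
batchnorm(i, dims, act; kwargs...) = Lux.BatchNorm(dims, act; kwargs...)

# 4 inputs, two hidden layers of width 16, and 2 outputs.
model = Boltz.Layers.MLP(4, (16, 16, 2), tanh; norm_layer=batchnorm)

ps, st = Lux.setup(Random.default_rng(), model)

x = randn(Float32, 4, 8)        # features × batch
y, st_new = model(x, ps, st)
size(y)
```

The last entry of hidden_dims acts as the output dimension, so the result has one column per batch element and as many rows as that last entry.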
Boltz.Layers.MultiHeadSelfAttention Type
MultiHeadSelfAttention(in_planes::Int, number_heads::Int; use_qkv_bias::Bool=false,
attention_dropout_rate::T=0.0f0, projection_dropout_rate::T=0.0f0)
Multi-head self-attention layer
Arguments
- in_planes: number of input channels
- number_heads: number of heads
- use_qkv_bias: whether to use bias in the layer to get the query, key and value
- attention_dropout_rate: dropout probability after the self-attention layer
- projection_dropout_rate: dropout probability after the projection layer
Boltz.Layers.PatchEmbedding Type
PatchEmbedding(image_size, patch_size, in_channels, embed_planes;
norm_layer=Returns(Lux.NoOpLayer()), flatten=true)
Constructs a patch embedding layer with the given image size, patch size, input channels, and embedding planes. The patch size must be a divisor of the image size.
Arguments
image_size
: image size as a tuplepatch_size
: patch size as a tuplein_channels
: number of input channelsembed_planes
: number of embedding planes
Keyword Arguments
- norm_layer: Takes the embedding planes as input and returns a layer that normalizes the embedding planes. Defaults to no normalization.
- flatten: set to true to flatten the output of the convolutional layer
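The divisibility requirement also determines how many patches an image is split into. The following plain-Julia sketch uses hypothetical sizes and assumes non-overlapping patches (the usual ViT convention, which is not stated above).

```julia
image_size = (224, 224)   # hypothetical image size
patch_size = (16, 16)     # hypothetical patch size

# The patch size must divide the image size along every dimension.
@assert all(iszero, image_size .% patch_size)

# Assuming non-overlapping patches, the patch count is the product of the
# per-dimension ratios.
patches_per_dim = image_size .÷ patch_size
number_patches = prod(patches_per_dim)
```

A count computed this way is the kind of value a positional embedding would typically be sized with.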
Boltz.Layers.PeriodicEmbedding Type
PeriodicEmbedding(idxs, periods)
Create an embedding periodic in some inputs with specified periods. Input indices not in idxs are passed through unchanged, but inputs in idxs are moved to the end of the output and replaced with their sines, followed by their cosines (scaled appropriately to have the specified periods). This smooth embedding preserves phase information and enforces periodicity.
For example, layer = PeriodicEmbedding([2, 3], [3.0, 1.0]) will create a layer periodic in the second input with period 3.0 and periodic in the third input with period 1.0. In this case, layer([a, b, c, d], ps, st) == ([a, d, sinpi(2 / 3.0 * b), sinpi(2 / 1.0 * c), cospi(2 / 3.0 * b), cospi(2 / 1.0 * c)], st).
Arguments
- idxs: Indices of the periodic inputs
- periods: Periods of the periodic inputs, in the same order as in idxs
Inputs
- x must be an AbstractArray with issubset(idxs, axes(x, 1))
- st must be a NamedTuple where st.k = 2 ./ periods, but on the same device as x
Returns
- AbstractArray of size (size(x, 1) + length(idxs), ...) where ... are the other dimensions of x.
- st, unchanged
Example
julia> layer = Layers.PeriodicEmbedding([2], [4.0])
PeriodicEmbedding([2], [4.0])
julia> ps, st = Lux.setup(Random.default_rng(), layer);
julia> all(layer([1.1, 2.2, 3.3], ps, st)[1] .==
[1.1, 3.3, sinpi(2 / 4.0 * 2.2), cospi(2 / 4.0 * 2.2)])
true
Boltz.Layers.PositiveDefinite Type
PositiveDefinite(model, x0; ψ, r)
PositiveDefinite(model; in_dims, ψ, r)
Constructs a Lyapunov-Net [2], which is positive definite about x0 whenever ψ and r meet certain conditions described below.
For a model ϕ, PositiveDefinite(ϕ, x0; ψ, r)(x, ps, st) = ψ(ϕ(x, ps, st) - ϕ(x0, ps, st)) + r(x, x0). This results in a model which maps x0 to 0 and any other input to a positive number (i.e., a model which is positive definite about x0) whenever ψ is positive definite about zero and r returns a positive number for any non-equal inputs and zero for equal inputs.
Arguments
- model: the underlying model being transformed into a positive definite function
- x0: The unique input that will be mapped to zero instead of a positive number
Keyword Arguments
- in_dims: the number of input dimensions if x0 is not provided; uses x0 = zeros(in_dims)
- ψ: a positive definite function (about zero); optional, with a built-in default
- r: a bivariate function such that r(x0, x0) = 0 and r(x, x0) > 0 whenever x ≠ x0; optional, with a built-in default
Inputs
- x: will be passed directly into model, so must meet the input requirements of that argument
Returns
- The output of the positive definite model
- The state of the positive definite model. If the underlying model changes its state, the state will be updated first according to the call with the input x0, then according to the call with the input x.
States
- st: a NamedTuple containing the state of the underlying model and the x0 value
Parameters
- Same as the underlying model
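The construction can be illustrated without any Lux machinery. The sketch below is a plain-function analogue of ψ(ϕ(x) - ϕ(x0)) + r(x, x0); positive_definite, ϕ, ψ, and r are all hypothetical names chosen for this example, and parameters and states are omitted.

```julia
# Plain-function analogue of ψ(ϕ(x) - ϕ(x0)) + r(x, x0); no parameters or states.
positive_definite(ϕ, x0; ψ, r) = x -> ψ(ϕ(x) .- ϕ(x0)) + r(x, x0)

ϕ(x) = [sin(x[1]) + x[2], x[1] * x[2]]   # hypothetical stand-in for a network
ψ(y) = sum(abs2, y)                      # positive definite about zero
r(x, y) = sum(abs2, x .- y)              # zero only when x == y

x0 = [1.0, 2.0]
V = positive_definite(ϕ, x0; ψ, r)

V(x0)           # exactly zero at x0
V([0.0, 0.0])   # strictly positive away from x0
```

The r term is what guarantees strict positivity away from x0: ψ alone vanishes wherever ϕ(x) happens to equal ϕ(x0).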
Boltz.Layers.ShiftTo Type
ShiftTo(model, in_val, out_val)
Vertically shifts the output of model to output out_val when the input is in_val.
For a model ϕ, ShiftTo(ϕ, in_val, out_val)(x, ps, st) = ϕ(x, ps, st) + Δϕ, where Δϕ = out_val - ϕ(in_val, ps, st).
Arguments
- model: the underlying model being shifted
- in_val: The input that will be mapped to out_val
- out_val: The value that the output will be shifted to when the input is in_val
Inputs
- x: will be passed directly into model, so must meet the input requirements of that argument
Returns
- The output of the shifted model
- The state of the shifted model. If the underlying model changes its state, the state will be updated first according to the call with the input in_val, then according to the call with the input x.
States
- st: a NamedTuple containing the state of the underlying model and the in_val and out_val values
Parameters
- Same as the underlying model
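The shift is a constant offset computed from one extra evaluation of the model. A plain-function sketch, with the hypothetical names shift_to and ϕ and without parameters or states, might look like this:

```julia
# Plain-function analogue of ϕ(x) + Δϕ with Δϕ = out_val - ϕ(in_val).
shift_to(ϕ, in_val, out_val) = x -> ϕ(x) .+ (out_val .- ϕ(in_val))

ϕ(x) = [x[1]^2 + 1.0, 3.0 * x[2]]   # hypothetical stand-in for a network

shifted = shift_to(ϕ, [1.0, 1.0], [0.0, 0.0])

shifted([1.0, 1.0])   # [0.0, 0.0], since in_val maps to out_val
shifted([2.0, 0.0])   # [5.0, 0.0] - [2.0, 3.0] = [3.0, -3.0]
```

Because Δϕ does not depend on x, the shifted function has the same derivatives with respect to x as the original.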
Boltz.Layers.SplineLayer Type
SplineLayer(in_dims, grid_min, grid_max, grid_step, basis::Type{Basis};
train_grid::Union{Val, Bool}=Val(false), init_saved_points=nothing)
Constructs a spline layer with the given basis function.
Arguments
- in_dims: input dimensions of the layer. This must be a tuple of integers; to construct a flat vector of saved_points, pass in ().
- grid_min: minimum value of the grid.
- grid_max: maximum value of the grid.
- grid_step: step size of the grid.
- basis: basis function to use for the interpolation. Currently only the basis functions from DataInterpolations.jl are supported:
  - ConstantInterpolation
  - LinearInterpolation
  - QuadraticInterpolation
  - QuadraticSpline
  - CubicSpline
Keyword Arguments
- train_grid: whether to train the grid or not.
- init_saved_points: values of the function at multiples of the time step. Initialized by default to a random vector sampled from the unit normal. Alternatively, can take a function with the signature init_saved_points(rng, in_dims, grid_min, grid_max, grid_step).
Warning
Currently this layer is limited since it relies on DataInterpolations.jl which doesn't work with GPU arrays. This will be fixed in the future by extending support to different basis functions.
Boltz.Layers.TensorProductLayer Type
TensorProductLayer(basis_fns, out_dim::Int; init_weight = randn32)
Constructs the Tensor Product Layer, which takes as input an array of n tensor product bases.
Arguments
- basis_fns: Array of TensorProductBasis, whose length corresponds to the dimension of the input.
- out_dim: Dimension of the output.
Keyword Arguments
- init_weight: Initializer for the weight matrix. Defaults to randn32.
Limited Backend Support
Support for backends apart from CPU and CUDA is limited and slow due to limited support for kron in the backend.
Boltz.Layers.ViPosEmbedding Type
ViPosEmbedding(embedding_size, number_patches; init = randn32)
Positional embedding layer used by many vision transformer-like models.
Boltz.Layers.VisionTransformerEncoder Type
VisionTransformerEncoder(in_planes, depth, number_heads; mlp_ratio = 4.0f0,
dropout = 0.0f0)
Transformer as used in the base ViT architecture [3].
Arguments
- in_planes: number of input channels
- depth: number of attention blocks
- number_heads: number of attention heads
Keyword Arguments
- mlp_ratio: ratio of the hidden dimension of the MLP to the number of input channels
- dropout_rate: dropout rate
Boltz.Layers.ConvBatchNormActivation Method
ConvBatchNormActivation(kernel_size::Dims, (in_filters, out_filters)::Pair{Int, Int},
depth::Int, act::F; use_norm::Bool=true, conv_kwargs=(;),
last_layer_activation::Bool=true, norm_kwargs=(;)) where {F}
This function is a convenience wrapper around ConvNormActivation that constructs a chain with norm_layer set to Lux.BatchNorm if use_norm is true and nothing otherwise. In most cases, users should use ConvNormActivation directly for a more flexible interface.
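A rough usage sketch of the wrapper follows, based only on the signature above. The filter counts, depth, input size, and pad=1 are illustrative assumptions, and depth is assumed to mean the number of convolutional layers, each producing out_filters channels.

```julia
using Boltz, Lux, Random

# Two 3×3 convolution + BatchNorm + tanh blocks, going from 3 to 8 channels.
# pad=1 is forwarded to the convolutions through conv_kwargs (assumed value).
model = Boltz.Layers.ConvBatchNormActivation((3, 3), 3 => 8, 2, tanh; conv_kwargs=(; pad=1))

ps, st = Lux.setup(Random.default_rng(), model)

x = randn(Float32, 16, 16, 3, 2)   # width × height × channels × batch
y, st_new = model(x, ps, st)
size(y)
```

Note that last_layer_activation defaults to true here, unlike in ConvNormActivation, so the activation is also applied after the final block.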