Boltz.Layers API Reference 
Boltz.Layers.ClassTokens Type
ClassTokens(dim; init=zeros32)

Appends class tokens to an input with embedding dimension dim for use in many vision transformer models.
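For illustration, a minimal usage sketch; the input layout (dim, patches, batch) and the resulting shape are assumptions based on the description above:

```julia
using Boltz, Lux, Random

layer = Layers.ClassTokens(768)                 # embedding dimension 768
ps, st = Lux.setup(Random.default_rng(), layer)

x = randn(Float32, 768, 196, 4)                 # assumed layout: (dim, patches, batch)
y, _ = layer(x, ps, st)
size(y)                                         # expected (768, 197, 4): one class token appended
```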
Boltz.Layers.ConvNormActivation Type
ConvNormActivation(kernel_size::Dims, in_chs::Integer, hidden_chs::Dims{N},
    activation; norm_layer=nothing, conv_kwargs=(;), norm_kwargs=(;),
    last_layer_activation::Bool=false) where {N}

Construct a Chain of convolutional layers with normalization and activation functions.
Arguments
- kernel_size: size of the convolutional kernel
- in_chs: number of input channels
- hidden_chs: dimensions of the hidden layers
- activation: activation function
Keyword Arguments
- norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used
- conv_kwargs: keyword arguments for the convolutional layers
- norm_kwargs: keyword arguments for the normalization layers
- last_layer_activation: set to true to apply the activation function to the last layer
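A hedged construction sketch; the channel progression and the BatchNorm-based norm_layer closure below are illustrative choices, not prescribed defaults:

```julia
using Boltz, Lux

# Three 3×3 conv layers: 3 -> 16 -> 32 -> 64 channels, each followed by BatchNorm and relu
model = Layers.ConvNormActivation(
    (3, 3), 3, (16, 32, 64), relu;
    norm_layer=(i, dims, act; kwargs...) -> BatchNorm(dims, act; kwargs...),
)
```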
Boltz.Layers.DynamicExpressionsLayer Type
DynamicExpressionsLayer(operator_enum::OperatorEnum, expressions::Node...;
    eval_options::EvalOptions=EvalOptions())
DynamicExpressionsLayer(operator_enum::OperatorEnum,
    expressions::AbstractVector{<:Node}; kwargs...)

Wraps a DynamicExpressions.jl Node into a Lux layer and allows the constant nodes to be updated using any of the AD Backends.
For details about these expressions, refer to the DynamicExpressions.jl documentation.
Arguments
- operator_enum: OperatorEnum from DynamicExpressions.jl
- expressions: Node from DynamicExpressions.jl or AbstractVector{<:Node}
Keyword Arguments
- turbo: Use LoopVectorization.jl for faster evaluation (Deprecated)
- bumper: Use Bumper.jl for faster evaluation (Deprecated)
- eval_options: EvalOptions from DynamicExpressions.jl
These options are simply forwarded to DynamicExpressions.jl's eval_tree_array and eval_grad_tree_array functions.
Extended Help
Example
julia> operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos]);
julia> x1 = Node(; feature=1);
julia> x2 = Node(; feature=2);
julia> expr_1 = x1 * cos(x2 - 3.2)
x1 * cos(x2 - 3.2)
julia> expr_2 = x2 - x1 * x2 + 2.5 - 1.0 * x1
((x2 - (x1 * x2)) + 2.5) - (1.0 * x1)
julia> layer = Layers.DynamicExpressionsLayer(operators, expr_1, expr_2);
julia> ps, st = Lux.setup(Random.default_rng(), layer)
((layer_1 = (layer_1 = (params = Float32[3.2],), layer_2 = (params = Float32[2.5, 1.0],)), layer_2 = NamedTuple()), (layer_1 = (layer_1 = NamedTuple(), layer_2 = NamedTuple()), layer_2 = NamedTuple()))
julia> x = [1.0f0 2.0f0 3.0f0
            4.0f0 5.0f0 6.0f0]
2×3 Matrix{Float32}:
 1.0  2.0  3.0
 4.0  5.0  6.0
julia> layer(x, ps, st)[1] ≈ Float32[0.6967068 -0.4544041 -2.8266668; 1.5 -4.5 -12.5]
true
julia> ∂x, ∂ps, _ = Zygote.gradient(Base.Fix1(sum, abs2) ∘ first ∘ layer, x, ps, st);
julia> ∂x ≈ Float32[-14.0292 54.206482 180.32669; -0.9995737 10.7700815 55.6814]
true
julia> ∂ps.layer_1.layer_1.params ≈ Float32[-6.451908]
true
julia> ∂ps.layer_1.layer_2.params ≈ Float32[-31.0, 90.0]
true

Boltz.Layers.HamiltonianNN Type
HamiltonianNN{FST}(model; autodiff=nothing) where {FST}

Constructs a Hamiltonian Neural Network [1]. This neural network is useful for learning symmetries and conservation laws by supervision on the gradients of the trajectories. It takes as input a concatenated vector of length 2n containing the position (of size n) and momentum (of size n) of the particles. It then returns the time derivatives for position and momentum.
Arguments
- FST: If true, then the type of the state returned by the model must be the same as the type of the input state. See the documentation on StatefulLuxLayer for more information.
- model: A Lux.AbstractLuxLayer neural network that returns the Hamiltonian of the system. The model must return a "batched scalar", i.e. all the dimensions of the output except the last one must be equal to 1. The last dimension must be equal to the batchsize of the input.
Keyword Arguments
- autodiff: The autodiff framework to be used for the internal Hamiltonian computation. The default is nothing, which selects the best possible backend available. The available options are AutoForwardDiff and AutoZygote.
Autodiff Backends
| autodiff | Package Needed | Notes |
|---|---|---|
| AutoZygote | Zygote.jl | Preferred Backend. Chosen if Zygote is loaded and autodiff is nothing. |
| AutoForwardDiff | | Chosen if Zygote is not loaded and autodiff is nothing. |
| AutoEnzyme | Enzyme.jl | This is the only backend that works with Reactant. mode is ignored for now. |
Note
This layer uses nested autodiff. Please refer to the manual entry on Nested Autodiff for more information and known limitations.
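A minimal construction sketch; the MLP widths are illustrative, and the output dimension of 1 is chosen to satisfy the batched-scalar requirement above:

```julia
using Boltz, Lux, Random, Zygote
using ADTypes: AutoZygote

# Hamiltonian of a system with n = 2: the input is [q; p] of length 2n = 4
hnn = Layers.HamiltonianNN{true}(Layers.MLP(4, (16, 16, 1), gelu); autodiff=AutoZygote())
ps, st = Lux.setup(Random.default_rng(), hnn)

x = randn(Float32, 4, 8)      # batch of 8 states
dxdt, st = hnn(x, ps, st)     # time derivatives of position and momentum, same size as x
```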
Boltz.Layers.MLP Type
MLP(in_dims::Integer, hidden_dims::Dims{N}, activation=NNlib.relu; norm_layer=nothing,
    dropout_rate::Real=0.0f0, dense_kwargs=(;), norm_kwargs=(;),
    last_layer_activation=false) where {N}

Construct a multi-layer perceptron (MLP) with dense layers, optional normalization layers, and dropout.
Arguments
- in_dims: number of input dimensions
- hidden_dims: dimensions of the hidden layers
- activation: activation function (applied after the normalization layer if present, otherwise after the dense layer)
Keyword Arguments
- norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used
- dropout_rate: dropout rate (default: 0.0f0)
- dense_kwargs: keyword arguments for the dense layers
- norm_kwargs: keyword arguments for the normalization layers
- last_layer_activation: set to true to apply the activation function to the last layer
- residual_connection: set to true to apply a residual connection to the MLP
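A short usage sketch with illustrative sizes:

```julia
using Boltz, Lux, Random

mlp = Layers.MLP(4, (32, 32, 2), tanh; dropout_rate=0.1f0)
ps, st = Lux.setup(Random.default_rng(), mlp)

x = randn(Float32, 4, 16)      # 16 samples with 4 features each
y, st = mlp(x, ps, st)         # y has size (2, 16)
```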
Boltz.Layers.MultiHeadSelfAttention Type
MultiHeadSelfAttention(in_planes::Int, number_heads::Int; use_qkv_bias::Bool=false,
    attention_dropout_rate::T=0.0f0, projection_dropout_rate::T=0.0f0)

Multi-head self-attention layer
Arguments
- in_planes: number of input channels
- number_heads: number of heads
- use_qkv_bias: whether to use bias in the layer computing the query, key, and value
- attention_dropout_rate: dropout probability after the self-attention layer
- projection_dropout_rate: dropout probability after the projection layer
Deprecated
Use MultiHeadAttention from Lux instead.
Boltz.Layers.PatchEmbedding Type
PatchEmbedding(image_size, patch_size, in_channels, embed_planes;
    norm_layer=Returns(Lux.NoOpLayer()), flatten=true)

Constructs a patch embedding layer with the given image size, patch size, input channels, and embedding planes. The patch size must be a divisor of the image size.
Arguments
- image_size: image size as a tuple
- patch_size: patch size as a tuple
- in_channels: number of input channels
- embed_planes: number of embedding planes
Keyword Arguments
- norm_layer: Takes the embedding planes as input and returns a layer that normalizes the embedding planes. Defaults to no normalization.
- flatten: set to true to flatten the output of the convolutional layer
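A brief sketch for a typical ViT-style configuration; the expected output shape is inferred from the flatten=true description above:

```julia
using Boltz, Lux, Random

patchify = Layers.PatchEmbedding((224, 224), (16, 16), 3, 768)
ps, st = Lux.setup(Random.default_rng(), patchify)

x = randn(Float32, 224, 224, 3, 2)     # WHCN image batch
y, _ = patchify(x, ps, st)             # expected size (768, 196, 2): 14 × 14 = 196 patches
```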
Boltz.Layers.PeriodicEmbedding Type
PeriodicEmbedding(idxs, periods)

Create an embedding periodic in some inputs with specified periods. Input indices not in idxs are passed through unchanged, but inputs in idxs are moved to the end of the output and replaced with their sines, followed by their cosines (scaled appropriately to have the specified periods). This smooth embedding preserves phase information and enforces periodicity.
For example, layer = PeriodicEmbedding([2, 3], [3.0, 1.0]) will create a layer periodic in the second input with period 3.0 and periodic in the third input with period 1.0. In this case, layer([a, b, c, d], st) == ([a, d, sinpi(2 / 3.0 * b), sinpi(2 / 1.0 * c), cospi(2 / 3.0 * b), cospi(2 / 1.0 * c)], st).
Arguments
- idxs: Indices of the periodic inputs
- periods: Periods of the periodic inputs, in the same order as in idxs
Inputs
- x must be an AbstractArray with issubset(idxs, axes(x, 1))
- st must be a NamedTuple where st.k = 2 ./ periods, but on the same device as x
Returns
- AbstractArray of size (size(x, 1) + length(idxs), ...) where ... are the other dimensions of x.
- st, unchanged
Example
julia> layer = Layers.PeriodicEmbedding([2], [4.0])
PeriodicEmbedding([2], [4.0])
julia> ps, st = Lux.setup(Random.default_rng(), layer);
julia> all(layer([1.1, 2.2, 3.3], ps, st)[1] .==
           [1.1, 3.3, sinpi(2 / 4.0 * 2.2), cospi(2 / 4.0 * 2.2)])
true

Boltz.Layers.PhysicsSelfAttentionIrregularMesh Type
PhysicsSelfAttentionIrregularMesh(
    dim; nheads=8, dim_head=64, dropout=0.0f0, slice_num=64
)

Physics self-attention layer used in neural PDE solvers. See [2] and [3] for more details.
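A construction-only sketch with illustrative sizes; the expected input layout is not documented here, so only setup is shown:

```julia
using Boltz, Lux, Random

attn = Layers.PhysicsSelfAttentionIrregularMesh(128; nheads=4, dim_head=32, slice_num=32)
ps, st = Lux.setup(Random.default_rng(), attn)
```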
Boltz.Layers.PositiveDefinite Type
PositiveDefinite(model, x0; ψ, r)
PositiveDefinite(model; in_dims, ψ, r)

Constructs a Lyapunov-Net [4], which is positive definite about x0 whenever ψ and r meet certain conditions described below.
For a model ϕ, PositiveDefinite(ϕ, x0; ψ, r)(x, ps, st) = ψ(ϕ(x, ps, st) - ϕ(x0, ps, st)) + r(x, x0). This results in a model which maps x0 to 0 and any other input to a positive number (i.e., a model which is positive definite about x0) whenever ψ is positive definite about zero and r returns a positive number for any non-equal inputs and zero for equal inputs.
Arguments
- model: the underlying model being transformed into a positive definite function
- x0: The unique input that will be mapped to zero instead of a positive number
Keyword Arguments
- in_dims: the number of input dimensions if x0 is not provided; uses x0 = zeros(in_dims)
- ψ: a positive definite function (about zero); defaults to the sum of squares, ψ(x) = sum(abs2, x)
- r: a bivariate function such that r(x0, x0) = 0 and r(x, x0) > 0 whenever x ≠ x0; defaults to the squared Euclidean distance, r(x, x0) = sum(abs2, x .- x0)
Inputs
- x: will be passed directly into model, so must meet the input requirements of that argument
Returns
- The output of the positive definite model 
- The state of the positive definite model. If the underlying model changes its state, the state will be updated first according to the call with the input x0, then according to the call with the input x.
States
- st: a NamedTuple containing the state of the underlying model and the x0 value
Parameters
- Same as the underlying model
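An illustrative sketch using the keyword form, so x0 = zeros(in_dims); the wrapped MLP and its sizes are arbitrary choices:

```julia
using Boltz, Lux, Random

pd = Layers.PositiveDefinite(Layers.MLP(2, (16, 1), tanh); in_dims=2)
ps, st = Lux.setup(Random.default_rng(), pd)

y0, _ = pd(zeros(Float32, 2, 1), ps, st)   # x0 = zeros(2) is mapped to 0
y, _  = pd(randn(Float32, 2, 1), ps, st)   # any other input is mapped to a positive value
```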
Boltz.Layers.ShiftTo Type
ShiftTo(model, in_val, out_val)

Vertically shifts the output of model to output out_val when the input is in_val.
For a model ϕ, ShiftTo(ϕ, in_val, out_val)(x, ps, st) = ϕ(x, ps, st) + Δϕ, where Δϕ = out_val - ϕ(in_val, ps, st).
Arguments
- model: the underlying model whose output is being shifted
- in_val: The input that will be mapped to out_val
- out_val: The value that the output will be shifted to when the input is in_val
Inputs
- x: will be passed directly into model, so must meet the input requirements of that argument
Returns
- The output of the shifted model 
- The state of the shifted model. If the underlying model changes its state, the state will be updated first according to the call with the input in_val, then according to the call with the input x.
States
- st: a NamedTuple containing the state of the underlying model and the in_val and out_val values
Parameters
- Same as the underlying model
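A hedged sketch; the vector shapes assumed for in_val and out_val below are illustrative and must match the wrapped model's input and output:

```julia
using Boltz, Lux, Random

# Shift a small MLP so that the input [0, 0] is mapped exactly to [1]
shifted = Layers.ShiftTo(Layers.MLP(2, (8, 1), tanh), zeros(Float32, 2), Float32[1.0])
ps, st = Lux.setup(Random.default_rng(), shifted)

y, _ = shifted(zeros(Float32, 2), ps, st)   # y ≈ [1] by construction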
Boltz.Layers.SplineLayer Type
SplineLayer(in_dims, grid_min, grid_max, grid_step, basis::Type{Basis};
    train_grid::Union{Val, Bool}=Val(false), init_saved_points=nothing)

Constructs a spline layer with the given basis function.
Arguments
- in_dims: input dimensions of the layer. This must be a tuple of integers; to construct a flat vector of saved_points, pass in ().
- grid_min: minimum value of the grid.
- grid_max: maximum value of the grid.
- grid_step: step size of the grid.
- basis: basis function to use for the interpolation. Currently only the basis functions from DataInterpolations.jl are supported:
  - ConstantInterpolation
  - LinearInterpolation
  - QuadraticInterpolation
  - QuadraticSpline
  - CubicSpline
Keyword Arguments
- train_grid: whether to train the grid or not.
- init_saved_points: values of the function at multiples of the time step. Initialized by default to a random vector sampled from the unit normal. Alternatively, can take a function with the signature init_saved_points(rng, in_dims, grid_min, grid_max, grid_step).
Warning
Currently this layer is limited since it relies on DataInterpolations.jl which doesn't work with GPU arrays. This will be fixed in the future by extending support to different basis functions.
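A construction-only sketch, assuming DataInterpolations.jl is loaded so the extension providing this layer is available; the grid bounds and step are illustrative:

```julia
using Boltz, Lux, Random
using DataInterpolations: LinearInterpolation

# A flat vector of saved points (in_dims = ()) on the grid 0.0:0.1:1.0 with linear interpolation
spline = Layers.SplineLayer((), 0.0f0, 1.0f0, 0.1f0, LinearInterpolation)
ps, st = Lux.setup(Random.default_rng(), spline)
```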
Boltz.Layers.TensorProductLayer Type
TensorProductLayer(basis_fns, out_dim::Int; init_weight = randn32)

Constructs the Tensor Product Layer, which takes as input an array of n tensor product basis functions [B_1, B_2, ..., B_n] and a data point x, and computes

z_i = W_{i,:} ⊙ [B_1(x_1) ⊗ B_2(x_2) ⊗ ... ⊗ B_n(x_n)]

where W is the layer's weight matrix.
Arguments
- basis_fns: Array of TensorProductBasis [B_1, ..., B_n], where n corresponds to the dimension of the input.
- out_dim: Dimension of the output.
Keyword Arguments
- init_weight: Initializer for the weight matrix. Defaults to- randn32.
Limited Backend Support
Support for backends apart from CPU and CUDA is limited and slow due to limited support for kron in the backend.
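A construction-only sketch, assuming the basis constructors from Boltz's Basis module (e.g. Basis.Chebyshev, Basis.Fourier) are used as the TensorProductBasis inputs; the basis choices and sizes are illustrative:

```julia
using Boltz, Lux, Random

# Two-dimensional input: a Chebyshev basis for the first input and a Fourier basis for the second
layer = Layers.TensorProductLayer([Basis.Chebyshev(4), Basis.Fourier(4)], 3)
ps, st = Lux.setup(Random.default_rng(), layer)
```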
Boltz.Layers.ViPosEmbedding Type
ViPosEmbedding(embedding_size, number_patches; init = randn32)

Positional embedding layer used by many vision transformer-like models.
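A construction-only sketch with illustrative sizes:

```julia
using Boltz, Lux, Random

# Learnable positional embeddings for 196 patches plus a class token
posembed = Layers.ViPosEmbedding(768, 197)
ps, st = Lux.setup(Random.default_rng(), posembed)
```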
Boltz.Layers.VisionTransformerEncoder Type
VisionTransformerEncoder(in_planes, depth, number_heads; mlp_ratio = 4.0f0,
    dropout = 0.0f0)

Transformer as used in the base ViT architecture [5].
Arguments
- in_planes: number of input channels
- depth: number of attention blocks
- number_heads: number of attention heads
Keyword Arguments
- mlp_ratio: ratio of the hidden dimension of the MLP block to the number of input channels
- dropout_rate: dropout rate
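A short sketch with illustrative sizes; the (in_planes, tokens, batch) input layout is an assumption consistent with the other vision layers above:

```julia
using Boltz, Lux, Random

# 6 transformer blocks, embedding dimension 256, 8 attention heads
encoder = Layers.VisionTransformerEncoder(256, 6, 8)
ps, st = Lux.setup(Random.default_rng(), encoder)

x = randn(Float32, 256, 197, 2)     # assumed layout: (in_planes, tokens, batch)
y, _ = encoder(x, ps, st)           # same shape as x
```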
Boltz.Layers.ConvBatchNormActivation Method
ConvBatchNormActivation(kernel_size::Dims, (in_filters, out_filters)::Pair{Int, Int},
    depth::Int, act::F; use_norm::Bool=true, conv_kwargs=(;),
    last_layer_activation::Bool=true, norm_kwargs=(;)) where {F}

This function is a convenience wrapper around ConvNormActivation that constructs a chain with norm_layer set to Lux.BatchNorm if use_norm is true and nothing otherwise. In most cases, users should use ConvNormActivation directly for a more flexible interface.
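A brief construction sketch with illustrative sizes:

```julia
using Boltz, Lux, Random

# Three stacked 3×3 convolutions taking 3 input channels to 32, each followed by BatchNorm and relu
block = Layers.ConvBatchNormActivation((3, 3), 3 => 32, 3, relu)
ps, st = Lux.setup(Random.default_rng(), block)
```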