Boltz.Layers
API Reference
Boltz.Layers.ClassTokens Type
ClassTokens(dim; init=zeros32)
Appends class tokens to an input with embedding dimension dim
for use in many vision transformer models.
Boltz.Layers.ConvNormActivation Type
ConvNormActivation(kernel_size::Dims, in_chs::Integer, hidden_chs::Dims{N},
activation; norm_layer=nothing, conv_kwargs=(;), norm_kwargs=(;),
last_layer_activation::Bool=false) where {N}
Construct a Chain of convolutional layers with normalization and activation functions.
Arguments
kernel_size: size of the convolutional kernel
in_chs: number of input channels
hidden_chs: dimensions of the hidden layers
activation: activation function
Keyword Arguments
norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used.
conv_kwargs: keyword arguments for the convolutional layers
norm_kwargs: keyword arguments for the normalization layers
last_layer_activation: set to true to apply the activation function to the last layer
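The way in_chs, hidden_chs, and last_layer_activation interact can be sketched in plain Julia. The helper conv_plan below is hypothetical and not part of Boltz; it assumes the convolutions are chained as in_chs => hidden_chs[1], hidden_chs[1] => hidden_chs[2], and so on, and that every layer except possibly the last is followed by the activation.

```julia
# Hypothetical helper (not part of Boltz): list the in => out channel pair of each
# convolution and whether that layer is followed by the activation.
# Assumption: layers are chained in order, so each output feeds the next input.
function conv_plan(in_chs::Integer, hidden_chs::Dims; last_layer_activation::Bool=false)
    ins = (in_chs, hidden_chs[1:(end - 1)]...)
    n = length(hidden_chs)
    return [(i, ins[i] => hidden_chs[i], i < n || last_layer_activation) for i in 1:n]
end

plan = conv_plan(3, (16, 32, 64))
for (i, chs, has_act) in plan
    println("layer ", i, ": ", chs, has_act ? " + activation" : "")
end
```

A norm_layer function would be called once per entry of such a plan, receiving the layer index i and the channel count dims.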
Boltz.Layers.DynamicExpressionsLayer Type
DynamicExpressionsLayer(operator_enum::OperatorEnum, expressions::Node...;
eval_options::EvalOptions=EvalOptions())
DynamicExpressionsLayer(operator_enum::OperatorEnum,
expressions::AbstractVector{<:Node}; kwargs...)
Wraps a DynamicExpressions.jl Node into a Lux layer and allows the constant nodes to be updated using any of the AD Backends.
For details about these expressions, refer to the DynamicExpressions.jl documentation.
Arguments
operator_enum: OperatorEnum from DynamicExpressions.jl
expressions: Node from DynamicExpressions.jl or AbstractVector{<:Node}
Keyword Arguments
turbo: Use LoopVectorization.jl for faster evaluation (Deprecated)
bumper: Use Bumper.jl for faster evaluation (Deprecated)
eval_options: EvalOptions from DynamicExpressions.jl
These options are simply forwarded to DynamicExpressions.jl's eval_tree_array and eval_grad_tree_array functions.
Extended Help
Example
julia> operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos]);
julia> x1 = Node(; feature=1);
julia> x2 = Node(; feature=2);
julia> expr_1 = x1 * cos(x2 - 3.2)
x1 * cos(x2 - 3.2)
julia> expr_2 = x2 - x1 * x2 + 2.5 - 1.0 * x1
((x2 - (x1 * x2)) + 2.5) - (1.0 * x1)
julia> layer = Layers.DynamicExpressionsLayer(operators, expr_1, expr_2);
julia> ps, st = Lux.setup(Random.default_rng(), layer)
((layer_1 = (layer_1 = (params = Float32[3.2],), layer_2 = (params = Float32[2.5, 1.0],)), layer_2 = NamedTuple()), (layer_1 = (layer_1 = NamedTuple(), layer_2 = NamedTuple()), layer_2 = NamedTuple()))
julia> x = [1.0f0 2.0f0 3.0f0
4.0f0 5.0f0 6.0f0]
2×3 Matrix{Float32}:
1.0 2.0 3.0
4.0 5.0 6.0
julia> layer(x, ps, st)[1] ≈ Float32[0.6967068 -0.4544041 -2.8266668; 1.5 -4.5 -12.5]
true
julia> ∂x, ∂ps, _ = Zygote.gradient(Base.Fix1(sum, abs2) ∘ first ∘ layer, x, ps, st);
julia> ∂x ≈ Float32[-14.0292 54.206482 180.32669; -0.9995737 10.7700815 55.6814]
true
julia> ∂ps.layer_1.layer_1.params ≈ Float32[-6.451908]
true
julia> ∂ps.layer_1.layer_2.params ≈ Float32[-31.0, 90.0]
true
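The forward values in the example above can be checked by hand. The following is a minimal plain-Julia sketch, not the layer's implementation: it rewrites expr_1 and expr_2 as ordinary functions (the names evaluate, c1, and c2 are illustrative) and assumes, as the example output suggests, that each column of x is one sample and each row is one feature.

```julia
# Plain-Julia versions of expr_1 and expr_2; c1 and c2 hold the constants that
# the layer exposes as trainable parameters.
expr_1(x1, x2, c1) = x1 * cos(x2 - c1[1])
expr_2(x1, x2, c2) = x2 - x1 * x2 + c2[1] - c2[2] * x1

# Each column of x is one sample; row 1 is feature x1 and row 2 is feature x2.
function evaluate(x::AbstractMatrix, c1, c2)
    x1, x2 = x[1, :], x[2, :]
    y1 = expr_1.(x1, x2, Ref(c1))
    y2 = expr_2.(x1, x2, Ref(c2))
    return permutedims(hcat(y1, y2))   # one output row per expression
end

x = [1.0f0 2.0f0 3.0f0; 4.0f0 5.0f0 6.0f0]
y = evaluate(x, Float32[3.2], Float32[2.5, 1.0])
```

The constants 3.2 and (2.5, 1.0) are the same values that appear in ps in the example, which is why the gradient with respect to ps has one and two entries respectively.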
Boltz.Layers.HamiltonianNN Type
HamiltonianNN{FST}(model; autodiff=nothing) where {FST}
Constructs a Hamiltonian Neural Network (Greydanus et al., 2019). This neural network is useful for learning symmetries and conservation laws by supervision on the gradients of the trajectories. It takes as input a concatenated vector of length 2n containing the position (of size n) and momentum (of size n) of the particles. It then returns the time derivatives for position and momentum.
Arguments
FST: If true, then the type of the state returned by the model must be the same as the type of the input state. See the documentation on StatefulLuxLayer for more information.
model: A Lux.AbstractLuxLayer neural network that returns the Hamiltonian of the system. The model must return a "batched scalar", i.e. all the dimensions of the output except the last one must be equal to 1. The last dimension must be equal to the batch size of the input.
Keyword Arguments
autodiff: The autodiff framework to be used for the internal Hamiltonian computation. The default is nothing, which selects the best possible backend available. The available options are AutoForwardDiff and AutoZygote.
Autodiff Backends
autodiff | Package Needed | Notes |
---|---|---|
AutoZygote | Zygote.jl | Preferred backend. Chosen if Zygote is loaded and autodiff is nothing. |
AutoForwardDiff | | Chosen if Zygote is not loaded and autodiff is nothing. |
Note
This layer uses nested autodiff. Please refer to the manual entry on Nested Autodiff for more information and known limitations.
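The mapping from a Hamiltonian to time derivatives follows Hamilton's equations, dq/dt = ∂H/∂p and dp/dt = -∂H/∂q. The sketch below illustrates that mapping in plain Julia under two loudly stated assumptions: a harmonic-oscillator function stands in for the neural network, and a central finite difference stands in for the autodiff backend. The names hamiltonian, fd_gradient, and hamiltonian_dynamics are illustrative only.

```julia
# Harmonic oscillator H(q, p) = (q^2 + p^2) / 2, standing in for the neural network.
hamiltonian(z) = sum(abs2, z) / 2

# Central finite-difference gradient; a stand-in for the AD backend.
function fd_gradient(f, z::AbstractVector; h=1e-6)
    g = similar(z, Float64)
    for i in eachindex(z)
        e = zeros(length(z))
        e[i] = h
        g[i] = (f(z .+ e) - f(z .- e)) / (2h)
    end
    return g
end

# z = [q; p] has length 2n. Hamilton's equations: dq/dt = ∂H/∂p, dp/dt = -∂H/∂q.
function hamiltonian_dynamics(f, z::AbstractVector)
    n = length(z) ÷ 2
    g = fd_gradient(f, z)
    return vcat(g[(n + 1):(2n)], -g[1:n])
end

dz = hamiltonian_dynamics(hamiltonian, [1.0, 0.5])   # q = 1.0, p = 0.5
```

Because training differentiates through this gradient computation, the real layer needs nested autodiff, which is what the note above refers to.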
Boltz.Layers.MLP Type
MLP(in_dims::Integer, hidden_dims::Dims{N}, activation=NNlib.relu; norm_layer=nothing,
dropout_rate::Real=0.0f0, dense_kwargs=(;), norm_kwargs=(;),
last_layer_activation=false) where {N}
Construct a multi-layer perceptron (MLP) with dense layers, optional normalization layers, and dropout.
Arguments
in_dims: number of input dimensions
hidden_dims: dimensions of the hidden layers
activation: activation function (stacked after the normalization layer if present, otherwise after the dense layer)
Keyword Arguments
norm_layer: Function with signature f(i::Integer, dims::Integer, act::F; kwargs...). i is the location of the layer in the model, dims is the channel dimension of the input, and act is the activation function. kwargs are forwarded from the norm_kwargs input. The function should return a normalization layer. Defaults to nothing, which means no normalization layer is used.
dropout_rate: dropout rate (default: 0.0f0)
dense_kwargs: keyword arguments for the dense layers
norm_kwargs: keyword arguments for the normalization layers
last_layer_activation: set to true to apply the activation function to the last layer
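As a rough illustration of how hidden_dims and last_layer_activation shape the network, here is a minimal plain-Julia forward pass. It is a sketch rather than the Boltz implementation: init_mlp and mlp_forward are hypothetical helpers, relu is defined locally, and normalization and dropout are left out entirely.

```julia
using Random

relu(x) = max(x, zero(x))

# Hypothetical helper: build (weight, bias) pairs for in_dims -> hidden_dims[1] -> ...
function init_mlp(rng, in_dims::Integer, hidden_dims::Dims)
    sizes = (in_dims, hidden_dims...)
    return [(randn(rng, Float32, sizes[i + 1], sizes[i]), zeros(Float32, sizes[i + 1]))
            for i in 1:length(hidden_dims)]
end

# Dense layer followed by the activation; the last layer is left linear unless
# last_layer_activation is true. Normalization and dropout are omitted.
function mlp_forward(params, x, act; last_layer_activation::Bool=false)
    n = length(params)
    for (i, (W, b)) in enumerate(params)
        x = W * x .+ b
        if i < n || last_layer_activation
            x = act.(x)
        end
    end
    return x
end

params = init_mlp(Random.default_rng(), 4, (8, 8, 2))
y = mlp_forward(params, rand(Float32, 4, 5), relu)   # 5 samples with 4 features each
```

The last entry of hidden_dims therefore acts as the output dimension, here 2.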
Boltz.Layers.MultiHeadSelfAttention Type
MultiHeadSelfAttention(in_planes::Int, number_heads::Int; use_qkv_bias::Bool=false,
attention_dropout_rate::T=0.0f0, projection_dropout_rate::T=0.0f0)
Multi-head self-attention layer
Arguments
in_planes: number of input channels
number_heads: number of heads
Keyword Arguments
use_qkv_bias: whether to use bias in the layer to get the query, key and value
attention_dropout_rate: dropout probability after the self-attention layer
projection_dropout_rate: dropout probability after the projection layer
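For readers unfamiliar with the computation, the following plain-Julia sketch shows scaled dot-product self-attention split across heads for a single sample. It is not the Boltz implementation: the (in_planes, tokens) layout, the weight shapes, and the names softmax_cols and self_attention are assumptions made for illustration, and biases and both dropout steps are omitted.

```julia
using Random

# Column-wise softmax.
function softmax_cols(a::AbstractMatrix)
    e = exp.(a .- maximum(a; dims=1))
    return e ./ sum(e; dims=1)
end

# x has size (in_planes, tokens). Wqkv has size (3 * in_planes, in_planes) and
# Wproj has size (in_planes, in_planes). Biases and dropout are omitted.
function self_attention(x, Wqkv, Wproj, number_heads::Int)
    in_planes, tokens = size(x)
    head_dim = in_planes ÷ number_heads
    qkv = Wqkv * x
    q = qkv[1:in_planes, :]
    k = qkv[(in_planes + 1):(2 * in_planes), :]
    v = qkv[(2 * in_planes + 1):end, :]
    out = similar(x)
    for h in 1:number_heads
        rows = ((h - 1) * head_dim + 1):(h * head_dim)
        # scores[i, j]: similarity of key i with query j, scaled by sqrt(head_dim).
        scores = (k[rows, :]' * q[rows, :]) ./ sqrt(Float32(head_dim))
        weights = softmax_cols(scores)          # each column sums to 1
        out[rows, :] = v[rows, :] * weights
    end
    return Wproj * out
end

rng = Random.default_rng()
x = randn(rng, Float32, 8, 5)                    # 8 channels, 5 tokens
y = self_attention(x, randn(rng, Float32, 24, 8), randn(rng, Float32, 8, 8), 2)
```

In this sketch in_planes must be divisible by number_heads so that every head receives the same number of channels.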
Boltz.Layers.PatchEmbedding Type
PatchEmbedding(image_size, patch_size, in_channels, embed_planes;
norm_layer=Returns(Lux.NoOpLayer()), flatten=true)
Constructs a patch embedding layer with the given image size, patch size, input channels, and embedding planes. The patch size must be a divisor of the image size.
Arguments
image_size: image size as a tuple
patch_size: patch size as a tuple
in_channels: number of input channels
embed_planes: number of embedding planes
Keyword Arguments
norm_layer: Takes the embedding planes as input and returns a layer that normalizes the embedding planes. Defaults to no normalization.
flatten: set to true to flatten the output of the convolutional layer
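The divisibility requirement determines how many patches an image yields. A small plain-Julia sketch of that arithmetic follows; patch_grid is a hypothetical helper, not a Boltz function.

```julia
# Hypothetical helper: number of patches per axis and in total.
function patch_grid(image_size::Dims, patch_size::Dims)
    all(iszero, image_size .% patch_size) ||
        throw(ArgumentError("patch_size must divide image_size"))
    grid = image_size .÷ patch_size
    return grid, prod(grid)
end

grid, n = patch_grid((224, 224), (16, 16))
println(grid, " -> ", n, " patches")
```

The total count is the kind of value one would expect to pass as number_patches to a positional embedding that follows the patch embedding.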
Boltz.Layers.PeriodicEmbedding Type
PeriodicEmbedding(idxs, periods)
Create an embedding periodic in some inputs with specified periods. Input indices not in idxs are passed through unchanged, but inputs in idxs are moved to the end of the output and replaced with their sines, followed by their cosines (scaled appropriately to have the specified periods). This smooth embedding preserves phase information and enforces periodicity.
For example, layer = PeriodicEmbedding([2, 3], [3.0, 1.0]) will create a layer periodic in the second input with period 3.0 and periodic in the third input with period 1.0. In this case, layer([a, b, c, d], ps, st) == ([a, d, sinpi(2 / 3.0 * b), sinpi(2 / 1.0 * c), cospi(2 / 3.0 * b), cospi(2 / 1.0 * c)], st).
Arguments
idxs: Indices of the periodic inputs
periods: Periods of the periodic inputs, in the same order as in idxs
Inputs
x: must be an AbstractArray with issubset(idxs, axes(x, 1))
st: must be a NamedTuple where st.k = 2 ./ periods, but on the same device as x
Returns
AbstractArray of size (size(x, 1) + length(idxs), ...) where ... are the other dimensions of x.
st, unchanged
Example
julia> layer = Layers.PeriodicEmbedding([2], [4.0])
PeriodicEmbedding([2], [4.0])
julia> ps, st = Lux.setup(Random.default_rng(), layer);
julia> all(layer([1.1, 2.2, 3.3], ps, st)[1] .==
[1.1, 3.3, sinpi(2 / 4.0 * 2.2), cospi(2 / 4.0 * 2.2)])
true
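The reordering described above can be reproduced in a few lines of plain Julia for the vector case. This is a sketch under the stated semantics, not the layer's code; periodic_embed is an illustrative name.

```julia
# x is a vector of inputs; idxs marks the periodic entries.
function periodic_embed(x::AbstractVector, idxs, periods)
    other = setdiff(eachindex(x), idxs)      # indices passed through unchanged
    k = 2 ./ periods                          # matches st.k in the layer state
    return vcat(x[other], sinpi.(k .* x[idxs]), cospi.(k .* x[idxs]))
end

y = periodic_embed([1.1, 2.2, 3.3], [2], [4.0])
```

The output has length(x) + length(idxs) entries, in line with the size given under Returns.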
Boltz.Layers.SplineLayer Type
SplineLayer(in_dims, grid_min, grid_max, grid_step, basis::Type{Basis};
train_grid::Union{Val, Bool}=Val(false), init_saved_points=nothing)
Constructs a spline layer with the given basis function.
Arguments
in_dims: input dimensions of the layer. This must be a tuple of integers; to construct a flat vector of saved_points, pass in ().
grid_min: minimum value of the grid.
grid_max: maximum value of the grid.
grid_step: step size of the grid.
basis: basis function to use for the interpolation. Currently only the basis functions from DataInterpolations.jl are supported: ConstantInterpolation, LinearInterpolation, QuadraticInterpolation, QuadraticSpline, and CubicSpline.
Keyword Arguments
train_grid: whether to train the grid or not.
init_saved_points: values of the function at multiples of the time step. Initialized by default to a random vector sampled from the unit normal. Alternatively, can take a function with the signature init_saved_points(rng, in_dims, grid_min, grid_max, grid_step).
Warning
Currently this layer is limited since it relies on DataInterpolations.jl which doesn't work with GPU arrays. This will be fixed in the future by extending support to different basis functions.
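To make the role of the grid and saved_points concrete, here is a hand-written piecewise-linear interpolation in plain Julia. It only illustrates the idea; the layer itself delegates to DataInterpolations.jl. The helper linear_spline is hypothetical, and the sketch assumes one saved point per grid node with in_dims = ().

```julia
# Piecewise-linear interpolation through saved_points on a uniform grid.
# Assumption: saved_points holds exactly one value per grid node.
function linear_spline(saved_points::AbstractVector, grid::AbstractRange, t::Real)
    first(grid) <= t <= last(grid) || throw(DomainError(t, "t lies outside the grid"))
    i = min(floor(Int, (t - first(grid)) / step(grid)) + 1, length(grid) - 1)
    w = (t - grid[i]) / step(grid)           # fractional position within the cell
    return (1 - w) * saved_points[i] + w * saved_points[i + 1]
end

grid = 0.0:0.5:2.0                            # grid_min:grid_step:grid_max
saved_points = [0.0, 1.0, 4.0, 9.0, 16.0]     # one value per grid node
y = linear_spline(saved_points, grid, 0.75)
```

In this picture the saved points play the role of the trainable values, and the grid is fixed unless train_grid is enabled.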
Boltz.Layers.TensorProductLayer Type
TensorProductLayer(basis_fns, out_dim::Int; init_weight = randn32)
Constructs the Tensor Product Layer, which takes as input an array of n tensor product bases.
Arguments
basis_fns: Array of TensorProductBasis, one for each dimension of the input.
out_dim: Dimension of the output.
Keyword Arguments
init_weight: Initializer for the weight matrix. Defaults to randn32.
Limited Backend Support
Support for backends apart from CPU and CUDA is limited and slow due to limited support for kron in the backend.
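The reference to kron suggests a Kronecker product of per-dimension basis evaluations contracted with a weight matrix. The plain-Julia sketch below works under exactly that assumption and should not be read as the layer's definition; the two bases (monomials and cosines) and the function tensor_product_forward are hypothetical stand-ins.

```julia
using LinearAlgebra

# Two hypothetical one-dimensional bases, each returning a vector of basis values.
monomials(x, n) = [x^k for k in 0:(n - 1)]
cosines(x, n) = [cos(k * x) for k in 0:(n - 1)]

# Assumption: the output is W times the Kronecker product of the basis vectors,
# with one basis per input dimension.
function tensor_product_forward(W::AbstractMatrix, bases, x::AbstractVector)
    features = reduce(kron, [b(xi) for (b, xi) in zip(bases, x)])
    return W * features
end

bases = (x -> monomials(x, 3), x -> cosines(x, 2))
W = ones(2, 3 * 2)                            # out_dim = 2, 3 * 2 product features
z = tensor_product_forward(W, bases, [2.0, 0.0])
```

The number of product features grows as the product of the basis sizes, so the weight matrix width increases quickly with the input dimension.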
Boltz.Layers.ViPosEmbedding Type
ViPosEmbedding(embedding_size, number_patches; init = randn32)
Positional embedding layer used by many vision transformer-like models.
Boltz.Layers.VisionTransformerEncoder Type
VisionTransformerEncoder(in_planes, depth, number_heads; mlp_ratio = 4.0f0,
dropout = 0.0f0)
Transformer as used in the base ViT architecture (Dosovitskiy et al., 2020).
Arguments
in_planes: number of input channels
depth: number of attention blocks
number_heads: number of attention heads
Keyword Arguments
mlp_ratio: ratio of the MLP hidden dimension to the number of input channels
dropout_rate: dropout rate
Boltz.Layers.ConvBatchNormActivation Method
ConvBatchNormActivation(kernel_size::Dims, (in_filters, out_filters)::Pair{Int, Int},
depth::Int, act::F; use_norm::Bool=true, conv_kwargs=(;),
last_layer_activation::Bool=true, norm_kwargs=(;)) where {F}
This function is a convenience wrapper around ConvNormActivation that constructs a chain with norm_layer set to Lux.BatchNorm if use_norm is true and nothing otherwise. In most cases, users should use ConvNormActivation directly for a more flexible interface.
Bibliography
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. and others (2020). An image is worth 16x16 words: Transformers for image recognition at scale, arXiv preprint arXiv:2010.11929.
Greydanus, S.; Dzamba, M. and Yosinski, J. (2019). Hamiltonian neural networks. Advances in neural information processing systems 32.