Dense layer

Dense class

keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    lora_rank=None,
    **kwargs
)

Just your regular densely-connected NN layer.

Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).

Note: If the input to the layer has a rank greater than 2, Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 0 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
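For illustration, here is a minimal sketch of both the common 2D case and the higher-rank case described above (shapes and values are arbitrary):

import keras
import numpy as np

# 2D input: (batch_size, input_dim) -> (batch_size, units)
layer = keras.layers.Dense(units=4, activation="relu")
x = np.random.random((2, 8)).astype("float32")
print(layer(x).shape)  # (2, 4)

# Higher-rank input: the kernel has shape (d1, units) and is applied
# along the last axis, so (batch_size, d0, d1) -> (batch_size, d0, units)
layer_3d = keras.layers.Dense(units=4)
x_3d = np.random.random((2, 3, 8)).astype("float32")
print(layer_3d(x_3d).shape)  # (2, 3, 4)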

Arguments

  • units: Positive integer, dimensionality of the output space.
  • activation: Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).
  • use_bias: Boolean, whether the layer uses a bias vector.
  • kernel_initializer: Initializer for the kernel weights matrix.
  • bias_initializer: Initializer for the bias vector.
  • kernel_regularizer: Regularizer function applied to the kernel weights matrix.
  • bias_regularizer: Regularizer function applied to the bias vector.
  • activity_regularizer: Regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: Constraint function applied to the kernel weights matrix.
  • bias_constraint: Constraint function applied to the bias vector.
  • lora_rank: Optional integer. If set, the layer's forward pass will implement LoRA (Low-Rank Adaptation) with the provided rank. LoRA sets the layer's kernel to non-trainable and replaces it with a delta over the original kernel, obtained by multiplying two lower-rank trainable matrices. This can be useful to reduce the computation cost of fine-tuning large dense layers. You can also enable LoRA on an existing Dense layer by calling layer.enable_lora(rank), as in the sketch after this list.
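
A minimal, illustrative sketch of the LoRA option (layer sizes and rank are arbitrary):

import keras

# Enable LoRA on an existing Dense layer; the layer must be built first.
layer = keras.layers.Dense(units=16)
layer.build((None, 32))    # creates the (32, 16) kernel
layer.enable_lora(4)       # kernel becomes non-trainable; two low-rank
                           # trainable matrices of rank 4 are added

# Equivalently, request LoRA at construction time:
lora_layer = keras.layers.Dense(units=16, lora_rank=4)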

Input shape

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
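
For example, a small sketch of the common 2D case in a model (layer sizes are illustrative):

import keras

model = keras.Sequential([
    keras.Input(shape=(20,)),                   # input: (batch_size, 20)
    keras.layers.Dense(64, activation="relu"),  # -> (batch_size, 64)
    keras.layers.Dense(10),                     # -> (batch_size, 10)
])
model.summary()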