Learn the scale: Custom data scaling layers in Keras

Custom scaling layer

Here’s an example of how to build, in Keras, a network whose first layer learns one scaling factor per input variable and feeds three dense layers:

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

We set the number of input variables and fix the random seed for reproducibility:

n_variables = 10
random_seed = 100
kernel_initializer = tf.keras.initializers.GlorotUniform(seed=random_seed)

The custom scaling layer inherits from the general Keras Layer class:

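class ScalingLayer(tf.keras.layers.Layer):

    def __init__(self):
        super().__init__()

    def build(self, input_shape):
        # One trainable factor per input variable.
        # Initialising to ones (an assumption here) makes the layer start as the identity.
        self.scaling_factors = self.add_weight(shape=(int(input_shape[-1]),),
                                               initializer='ones',
                                               trainable=True)

    def call(self, inputs):
        # Element-wise multiplication, broadcast over the batch dimension.
        return tf.multiply(inputs, self.scaling_factors)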
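To make the element-wise operation in `call` concrete, here is a minimal NumPy sketch of the same arithmetic outside Keras. The array values and the names `batch` and `scaled` are illustrative assumptions, not part of the model. The point is only that a weight vector of shape `(n_variables,)` broadcasts across the batch dimension, so the i-th variable of every sample is multiplied by the same i-th factor.

```python
import numpy as np

# Hypothetical batch: 2 samples, 3 variables on very different scales.
batch = np.array([[1.0, 100.0, 0.01],
                  [2.0, 200.0, 0.02]])

# One factor per variable, shape (3,), mirroring shape=(input_shape[-1],).
scaling_factors = np.array([1.0, 0.01, 100.0])

# Broadcasting applies the same per-variable factor to every sample,
# which is the arithmetic tf.multiply(inputs, self.scaling_factors) performs.
scaled = batch * scaling_factors
print(scaled)
```

With these hand-picked factors all three variables end up on a comparable scale; in the Keras model the factors are not chosen by hand but adjusted by the optimizer along with the other weights.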

Stack layers:

input_layer = Input(shape=(n_variables,))
scaling_layer = ScalingLayer()(input_layer)
layer_2 = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(scaling_layer)
layer_3 = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(layer_2)
output_layer = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(layer_3)
model = Model(input_layer, output_layer)

Calling model.summary() lists every layer together with its number of trainable parameters:

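Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 10)]              0         

 scaling_layer (ScalingLaye  (None, 10)                10        
 r)

 dense (Dense)               (None, 10)                110       

 dense_1 (Dense)             (None, 10)                110       

 dense_2 (Dense)             (None, 10)                110       

=================================================================
Total params: 340 (1.33 KB)
Trainable params: 340 (1.33 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________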
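The parameter counts in the summary can be checked by hand. The short sketch below redoes that arithmetic in plain Python; the variable names `dense_params`, `scaling_params` and `total_params` are just illustrative.

```python
n_variables = 10  # same value as in the model above

# A Dense layer with n inputs and n units has an n x n kernel plus n biases.
dense_params = n_variables * n_variables + n_variables

# The scaling layer holds one factor per input variable.
scaling_params = n_variables

# One scaling layer followed by three Dense layers.
total_params = scaling_params + 3 * dense_params
print(dense_params, total_params)  # 110 340
```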

Custom centering-scaling layer

Here’s an example of how to build the same network with a first layer that learns both a scaling factor and a centering offset per input variable.

The custom centering-scaling layer inherits from the general Keras Layer class:

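class CenteringScalingLayer(tf.keras.layers.Layer):

    def __init__(self):
        super().__init__()

    def build(self, input_shape):
        # One trainable scale and one trainable offset per input variable.
        # The 'ones' and 'zeros' initializers are assumptions; they make the
        # layer start as the identity.
        self.scaling_factors = self.add_weight(shape=(int(input_shape[-1]),),
                                               initializer='ones',
                                               trainable=True)
        self.centering_factors = self.add_weight(shape=(int(input_shape[-1]),),
                                                 initializer='zeros',
                                                 trainable=True)

    def call(self, inputs):
        # Scale each variable, then shift it.
        return tf.multiply(inputs, self.scaling_factors) + self.centering_factors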
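One way to read this layer is as a per-variable affine transformation, x * s + c, which contains the familiar z-score standardization as a special case: choosing s = 1/std and c = -mean/std gives (x - mean)/std. The NumPy sketch below checks that identity on made-up data. The data, the seed, and the variable names are assumptions for illustration, and nothing here implies that training actually converges to these particular values.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 1000 samples of 3 variables with different means and spreads.
data = rng.normal(loc=[5.0, -2.0, 100.0], scale=[1.0, 0.1, 20.0], size=(1000, 3))

mean = data.mean(axis=0)
std = data.std(axis=0)

# Pick the layer's weights so that x * s + c equals (x - mean) / std.
scaling_factors = 1.0 / std
centering_factors = -mean / std

# Same arithmetic as CenteringScalingLayer.call.
transformed = data * scaling_factors + centering_factors
standardized = (data - mean) / std

print(np.allclose(transformed, standardized))
```

A layer with only scaling factors cannot express this shift, which is the practical difference between the two custom layers.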

Stack layers:

input_layer = Input(shape=(n_variables,))
scaling_layer = CenteringScalingLayer()(input_layer)
layer_2 = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(scaling_layer)
layer_3 = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(layer_2)
output_layer = Dense(n_variables, activation='tanh', kernel_initializer=kernel_initializer)(layer_3)
model = Model(input_layer, output_layer)

Calling model.summary() lists every layer together with its number of trainable parameters:

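Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 10)]              0         

 centering_scaling_layer (C  (None, 10)                20        
 enteringScalingLayer)

 dense_3 (Dense)             (None, 10)                110       

 dense_4 (Dense)             (None, 10)                110       

 dense_5 (Dense)             (None, 10)                110       

=================================================================
Total params: 350 (1.37 KB)
Trainable params: 350 (1.37 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________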

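The centering_scaling_layer has 20 trainable parameters, 10 more than the scaling-only layer: one scaling factor and one centering offset for each of the 10 input variables.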