
TensorFlow's quantization-aware training (QAT) can further improve the accuracy of a quantized model. However, the tooling currently supports only a limited set of layer patterns. For example, a common BatchNorm layer can only be quantized correctly when it directly follows a convolution layer. A BatchNorm layer that does not directly follow a convolution layer must be excluded from quantization-aware training. This post records a concrete way to do this for HRNet.
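The reason a BatchNorm layer directly after a convolution quantizes cleanly is that its affine transform can be folded into the convolution's weights and bias, leaving a single linear op to quantize; a standalone BatchNorm has no preceding kernel to fold into. A minimal numeric sketch of the folding idea (plain NumPy, a single scalar "channel" standing in for a 1x1 conv):

```python
import numpy as np

# Fold BatchNorm (gamma, beta, mean, var) into conv parameters (w, b):
# BN(conv(x)) == conv'(x)  with  w' = w * gamma / sqrt(var + eps)
#                          and   b' = (b - mean) * gamma / sqrt(var + eps) + beta
rng = np.random.default_rng(0)
x = rng.normal(size=100)             # inputs for one channel
w, b = 1.5, 0.3                      # conv weight and bias
gamma, beta, mean, var, eps = 0.8, -0.1, 0.2, 1.7, 1e-5

# BatchNorm applied to the conv output.
y_bn = gamma * (w * x + b - mean) / np.sqrt(var + eps) + beta

# Equivalent folded convolution.
scale = gamma / np.sqrt(var + eps)
w_f, b_f = w * scale, (b - mean) * scale + beta
y_fold = w_f * x + b_f

assert np.allclose(y_bn, y_fold)     # the two computations agree
```

The per-channel values here are made up for illustration; in a real network the folding happens per output channel of the convolution.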

First, create a quantize config class that quantizes nothing.

import tensorflow_model_optimization as tfmot


class NoOpQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Use this config object if the layer has nothing to be quantized for
    quantization aware training."""

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        # Does not quantize output, since we return an empty list.
        return []

    def get_config(self):
        return {}

Next, write a builder function that returns the annotated layer.

For example, for the UpSampling2D layer, which is currently not supported:

from tensorflow.keras import layers


def quant_aware_upsampling2d(size):
    """TFMOT does not support the UpSampling2D layer, so a workaround is
    provided: annotate it with the no-op config so it is left un-quantized."""
    return tfmot.quantization.keras.quantize_annotate_layer(
        layers.UpSampling2D(size=size), NoOpQuantizeConfig())

Finally, wrap the model in a quantize scope for training.

with tfmot.quantization.keras.quantize_scope(
    {'NoOpQuantizeConfig': NoOpQuantizeConfig,
     'quant_aware_identity': quant_aware_identity,
     'quant_aware_upsampling2d': quant_aware_upsampling2d}):
    model = tfmot.quantization.keras.quantize_model(model)

After that, the model is ready for quantization-aware training.