Contrastive Training Improves Zero-Shot Classification of Semi-structured Documents

Introduction

  • What is the name of the LayoutBERT paper?

    Contrastive Training Improves Zero-Shot Classification of Semi-structured Documents

    AWS AI Labs

  • What are the main contributions of the LayoutBERT paper?

    • explores the zero-shot classification setting
    • pairwise contrastive objective for pretraining and finetuning
    • unsupervised pretraining with pseudolabels

Method

  • LayoutBERT pseudolabels are a subset of tokens extracted from a document

  • LayoutBERT uses two encoders, a text encoder and a document encoder that incorporates visual information
  • The LayoutBERT architecture encodes text and 2D positional embeddings
    • like LayoutLM with fewer positional embeddings and no visual information
  • LayoutBERT finetuning uses the class labels as the text input
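
The pairwise setup above can be sketched roughly as follows. This is hypothetical code, not the paper's implementation: the InfoNCE-style loss, cosine similarity, and the temperature value are my assumptions about what a two-encoder contrastive objective typically looks like.

```python
import numpy as np

def _normalize(x):
    # L2-normalize rows so dot products become cosine similarities
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def pairwise_contrastive_loss(doc_emb, text_emb, temperature=0.07):
    """InfoNCE-style loss: the i-th document and i-th label text form a
    positive pair; every other text in the batch acts as a negative."""
    d, t = _normalize(doc_emb), _normalize(text_emb)
    logits = d @ t.T / temperature  # (batch, batch) similarity matrix
    # row-wise log-softmax, numerically stabilized by subtracting the row max
    m = logits.max(axis=1, keepdims=True)
    log_probs = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    n = len(d)
    return -log_probs[np.arange(n), np.arange(n)].mean()  # positives on the diagonal

def zero_shot_classify(doc_emb, label_embs):
    """Zero-shot prediction: pick the label whose text embedding is most
    similar to the document embedding."""
    sims = _normalize(doc_emb) @ _normalize(label_embs).T
    return sims.argmax(axis=1)
```

Matched document/label pairs yield a lower loss than mismatched ones, which is what finetuning with the class labels as the text exploits.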

Results

  • LayoutBERT zero-shot performance results

  • LayoutBERT supervised finetuning performance results: contrastive finetuning does the best

Conclusions

This paper provides a simple boost to baseline methods for structured document classification. It’s clear that the benefit of the method comes from the information included in the text descriptions of the labels. I would have liked to see the label content analyzed for the datasets that are evaluated.

Reference

https://arxiv.org/pdf/2210.05613.pdf