Sequence Labeling With Transformers
Practical NLP often operates on long texts, and annotations for sequence labeling tasks typically come as character offsets into the raw text. Pre-trained transformer models impose their own tokenization, so the offset annotations must be aligned with the model's tokens, and long texts must be segmented into windows that stay consistent with those annotations.
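As a minimal sketch of both steps, the snippet below assumes a Hugging Face fast tokenizer (here "bert-base-cased", chosen only for illustration) and a toy document with hypothetical character-offset spans. It uses return_offsets_mapping to recover per-token character offsets and a stride-based sliding window (return_overflowing_tokens with truncation) to segment the text, then projects each span onto token-level BIO tags per window. The span data, window sizes, and B-/I- logic are illustrative simplifications, not a production alignment scheme.

```python
from transformers import AutoTokenizer

# Illustrative document and character-offset annotations: (start, end, label).
text = "Alice moved to Berlin in 2019."
spans = [(0, 5, "PER"), (15, 21, "LOC")]

# Fast tokenizers expose character offsets for every token; truncation with a
# stride turns a long text into overlapping windows, one encoding per window.
tok = AutoTokenizer.from_pretrained("bert-base-cased")
enc = tok(
    text,
    return_offsets_mapping=True,
    return_overflowing_tokens=True,  # emit one encoding per window
    truncation=True,
    max_length=16,  # tiny window for demonstration; e.g. 512 in practice
    stride=4,       # character of overlap is controlled in tokens, not chars
)

# Project character-level spans onto token-level BIO tags for each window.
for chunk_idx, offsets in enumerate(enc["offset_mapping"]):
    tags = []
    for start, end in offsets:
        if start == end:  # special tokens ([CLS], [SEP]) map to empty offsets
            tags.append("O")
            continue
        label = "O"
        for s, e, name in spans:
            if start >= s and end <= e:  # token lies fully inside the span
                # First sub-token of the span gets B-, continuations get I-.
                # (A window that starts mid-span would also get I- here; a
                # stricter scheme might restart with B- at window boundaries.)
                label = ("B-" if start == s else "I-") + name
        tags.append(label)
    tokens = tok.convert_ids_to_tokens(enc["input_ids"][chunk_idx])
    print(chunk_idx, list(zip(tokens, tags)))
```

Because the windows overlap, tokens in the stride region appear in two encodings with the same tags; at prediction time the duplicated positions are typically resolved by keeping one window's labels or averaging logits.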