Optimal Approach for Fine-Tuning LayoutLMv3 for Token Classification with 80 Labels

hugobee · May 26, 2025, 11:29am

Hello everyone,

I’m trying to extract medical information from PDF files using LayoutLMv3 for token classification.

I’ve successfully fine-tuned the model for a few different kinds of tokens (name, date of birth, patient ID, etc.), but now I want to scale up to around 80 different labels.

I’m wondering if it’s better to train one model for all labels or to decompose the task into multiple specialized models (like just models of around 10 labels). Any advice or experiences would be greatly appreciated!

Has anyone encountered a similar issue or have any advice on the best approach? Thanks in advance for your help!

Have a good day,

Hugo

John6666 · May 26, 2025, 1:13pm

if it’s better to train one model for all labels or to decompose the task into multiple specialized models (like just models of around 10 labels)

Looking at the dataset used to train LayoutLMv2, a label count of 20 or fewer per model seems more appropriate. I think v3 probably has similar characteristics.

Small models are often not well suited to handling many labels at once, so it is safer to split the task across multiple models. Even if you keep training a single model, it is a good idea to save the current working weights somewhere.
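
As a rough illustration of splitting the labels, here is a minimal sketch that chunks 80 fields into groups of at most 20 and builds one label map per group. It assumes BIO-style tags, and the field names (`FIELD_00` ...) and helper functions are hypothetical placeholders, not anything from your dataset or from the Transformers library.

```python
from typing import Dict, List


def split_fields(fields: List[str], max_per_model: int = 20) -> List[List[str]]:
    """Chunk the field names into consecutive groups of at most max_per_model."""
    return [fields[i:i + max_per_model] for i in range(0, len(fields), max_per_model)]


def bio_label_map(group: List[str]) -> Dict[str, int]:
    """Build a label2id map: 'O' plus B-/I- tags for each field (BIO is an assumption)."""
    labels = ["O"]
    for field in group:
        labels += [f"B-{field}", f"I-{field}"]
    return {label: idx for idx, label in enumerate(labels)}


def remap_tags(tags: List[str], label2id: Dict[str, int]) -> List[int]:
    """Map tags to ids for one model; tags that belong to other groups become 'O'."""
    return [label2id.get(tag, label2id["O"]) for tag in tags]


# Hypothetical field names standing in for the real 80 labels.
fields = [f"FIELD_{i:02d}" for i in range(80)]
groups = split_fields(fields, max_per_model=20)
label_maps = [bio_label_map(group) for group in groups]

# The same annotated tokens, seen from the first model's point of view.
example_tags = ["O", "B-FIELD_00", "I-FIELD_00", "B-FIELD_45"]
print(len(groups), len(label_maps[0]))
print(remap_tags(example_tags, label_maps[0]))
```

Each map could then be used as the `label2id` (and its inverse as `id2label`) when loading a separate token classification model per group. Grouping related fields together rather than slicing them in order may work better, but that is a guess you would need to check on your own data.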

hugobee · May 26, 2025, 2:57pm

Thank you for your response! I’m going to try that.
