Hello everyone,
I’m trying to extract medical information from PDF files using LayoutLMv3 for token classification.
I’ve successfully fine-tuned the model for a few different kinds of tokens (name, date of birth, patient ID, etc.), but now I want to scale up to around 80 different labels.
I’m wondering whether it’s better to train one model for all labels or to split the task across several specialized models (for example, models with around 10 labels each).
Has anyone faced a similar problem? Any advice or experience would be greatly appreciated. Thanks in advance for your help!
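To make the two options concrete, here is a minimal sketch of the label spaces each one implies. It assumes a BIO tagging scheme and uses placeholder field names (FIELD_00 to FIELD_79); both are assumptions for illustration, not details of the actual setup.

```python
# Sketch only: BIO tagging and the FIELD_xx names are assumptions.

def bio_labels(fields):
    """Return the BIO label list for the given fields: "O" plus B-/I- per field."""
    labels = ["O"]
    for field in fields:
        labels += [f"B-{field}", f"I-{field}"]
    return labels


def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical field names standing in for the ~80 real labels.
FIELDS = [f"FIELD_{i:02d}" for i in range(80)]

# Option 1: one model covering every field.
single_labels = bio_labels(FIELDS)
single_label2id = {label: idx for idx, label in enumerate(single_labels)}
single_id2label = {idx: label for label, idx in single_label2id.items()}

# Option 2: several specialized models, about 10 fields each.
grouped = [bio_labels(group) for group in chunk(FIELDS, 10)]

print(len(single_labels))             # 161 labels for the single model
print(len(grouped), len(grouped[0]))  # 8 models, 21 labels each
```

Mappings like these are the kind of thing that would be passed as `num_labels`, `id2label` and `label2id` when loading a token classification head. With the split option, each model only sees its own fields during training, and the predictions of all models have to be merged afterwards, including any tokens that two models both claim.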
Have a good day,
Hugo