Looking for an MXNet implementation of a BERT-based transfer learning sample (preferably multi-GPU), where the end layer is customized for a specific use case. I am interested in using my own dataset, which contains 10 different classes based on topic/theme.
Do you mean fine-tuning?
Yes, fine-tuning it for a custom application. Unfortunately, I did not find a good sample for the MXNet framework.
Have you checked the gluon-nlp site? The SQuAD example may be a good starting point for you. But I don’t think it uses multiple GPUs.
@w_a_r_b_e, you can find:
- a tutorial on gluon-nlp for fine-tuning for sentence pair classification: http://gluon-nlp.mxnet.io/examples/sentence_embedding/bert.html
- a list of scripts for different types of fine-tuning: https://github.com/dmlc/gluon-nlp/tree/master/scripts/bert
- a tutorial I wrote on fine-tuning BERT for sentiment analysis:
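To make the "customized end layer" idea concrete: for a 10-class topic classifier, the usual approach in the tutorials above is to take BERT's pooled `[CLS]` output (768-dim for BERT-base) and attach a single dense layer with 10 outputs, which in gluon-nlp would be an `mx.gluon.nn.Dense(10)` block. Below is a minimal numpy sketch of just that head, to illustrate the shapes involved; the names `NUM_CLASSES`, `HIDDEN`, and `classify` are my own, not from any library.

```python
import numpy as np

NUM_CLASSES = 10   # the 10 topic/theme classes in the dataset
HIDDEN = 768       # BERT-base pooled [CLS] output size

rng = np.random.default_rng(0)
W = rng.standard_normal((HIDDEN, NUM_CLASSES)) * 0.02  # head weights
b = np.zeros(NUM_CLASSES)                              # head bias

def classify(pooled):
    """Map a batch of pooled BERT outputs to class probabilities.

    In gluon-nlp this dense projection would be a trainable
    mx.gluon.nn.Dense(NUM_CLASSES) layer on top of the BERT encoder.
    """
    logits = pooled @ W + b
    # Numerically stable softmax over the class dimension
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

batch = rng.standard_normal((4, HIDDEN))  # stand-in for BERT pooled output
probs = classify(batch)
print(probs.shape)  # (4, 10)
```

During fine-tuning, only this head is new; the pretrained BERT weights and the head are then trained jointly on the 10-class data. For multi-GPU, the gluon pattern is to pass a list of contexts and split each batch across them with `gluon.utils.split_and_load`.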
Super helpful. Thank you!