I was wondering if I can use AllenNLP to load a very big dataset and train a T5 model on the data I get from it. 1) About building the dataset reader: I have read the documentation and the idea is clear, but I did not find information about indexers, and I need more guidance on tokenization. Is there anything in the documentation that can help me start working on that?
2) Can I use T5Tokenizer from transformers instead? I just can't get a handle on how to think about it yet.
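For context, this is roughly the config I have in mind, assuming the `pretrained_transformer` tokenizer and indexer types wrap the Hugging Face tokenizer under the hood (the reader type is a placeholder for my own reader; I'm not sure the rest is right):

```jsonnet
{
  "dataset_reader": {
    // "my_seq2seq_reader" is a placeholder for a custom reader I'd write
    "type": "my_seq2seq_reader",
    "tokenizer": {
      "type": "pretrained_transformer",
      "model_name": "t5-small"
    },
    "token_indexers": {
      "tokens": {
        "type": "pretrained_transformer",
        "model_name": "t5-small"
      }
    }
  }
}
```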
3) In the past there was an example of using BERT, but I could not find it now; it also seems there have been some recent improvements.