#Paper Review

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Split BERT's word embedding and (relative) position embedding into two separate vectors and compute attention with each! (sketch below)

Posted on Fri, Jun 25, 2021 NLP Paper Review
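A minimal single-head sketch of the disentangled-attention idea summarized above, assuming toy shapes and a simple clamped relative-position bucketing; the class name and layer layout are illustrative, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class DisentangledAttentionSketch(nn.Module):
    """Toy single-head sketch of DeBERTa-style disentangled attention:
    content and relative-position information live in separate vectors, and the
    attention score sums content-to-content, content-to-position, and
    position-to-content terms."""

    def __init__(self, dim, max_rel_pos=128):
        super().__init__()
        self.q_c = nn.Linear(dim, dim)   # content query
        self.k_c = nn.Linear(dim, dim)   # content key
        self.v_c = nn.Linear(dim, dim)   # content value
        self.q_r = nn.Linear(dim, dim)   # query projection for relative-position vectors
        self.k_r = nn.Linear(dim, dim)   # key projection for relative-position vectors
        self.rel_emb = nn.Embedding(2 * max_rel_pos, dim)
        self.max_rel_pos = max_rel_pos
        self.scale = (3 * dim) ** -0.5   # 1/sqrt(3d): three score terms are summed

    def forward(self, h):                # h: (batch, seq, dim) content hidden states
        n = h.size(1)
        pos = torch.arange(n, device=h.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_rel_pos, self.max_rel_pos - 1) + self.max_rel_pos
        p = self.rel_emb(rel)                                   # (n, n, dim) relative-position vectors
        qc, kc, vc = self.q_c(h), self.k_c(h), self.v_c(h)
        c2c = qc @ kc.transpose(-1, -2)                         # content-to-content
        c2p = torch.einsum('bid,ijd->bij', qc, self.k_r(p))     # content-to-position
        p2c = torch.einsum('bjd,jid->bij', kc, self.q_r(p))     # position-to-content
        attn = torch.softmax((c2c + c2p + p2c) * self.scale, dim=-1)
        return attn @ vc
```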

ZeRO-Infinity

DeepSpeed ZeRO-Infinity

Posted on Sun, May 30, 2021 Paper Review MLDL Framework

DExperts: On-the-Fly Controlled Text Generation with Experts and Anti-Experts

Detoxification & sentiment-controlled generation via language model finetuning (sketch below)

Posted on Fri, May 14, 2021 NLP Paper Review
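A minimal sketch of the DExperts decoding rule: the base LM's next-token logits are shifted by the difference between an expert (finetuned on desirable text) and an anti-expert (finetuned on undesirable text). The checkpoints below are stand-ins, not the paper's finetuned models, and `alpha` is an illustrative steering strength.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoints: in the paper the expert / anti-expert are GPT-2 models
# finetuned on non-toxic / toxic (or positive / negative) text.
tok = AutoTokenizer.from_pretrained("gpt2-large")
base = AutoModelForCausalLM.from_pretrained("gpt2-large")
expert = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in for the finetuned expert
anti = AutoModelForCausalLM.from_pretrained("gpt2")     # stand-in for the finetuned anti-expert

@torch.no_grad()
def dexperts_step(input_ids, alpha=2.0):
    """One decoding step: base logits steered by (expert - anti-expert)."""
    z_base = base(input_ids).logits[:, -1, :]
    z_exp = expert(input_ids).logits[:, -1, :]
    z_anti = anti(input_ids).logits[:, -1, :]
    z = z_base + alpha * (z_exp - z_anti)       # DExperts ensemble of logits
    return torch.softmax(z, dim=-1)

ids = tok("The movie was", return_tensors="pt").input_ids
probs = dexperts_step(ids)
print(tok.decode(probs.argmax(-1)))             # greedy next token under the steered distribution
```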

GeDi: Generative Discriminator Guided Sequence Generation

Guiding GPT-2 (XL, 1.5B) and GPT-3 (175B) generation with a 110M GPT (sketch below)

Posted on Sat, May 1, 2021 NLP Paper Review
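A sketch of the single-step reweighting behind GeDi, under simplifying assumptions: a small class-conditional LM scores each candidate next token under the desired and undesired class, Bayes rule turns that into a per-token class probability, and the big LM's distribution is biased toward on-class tokens. The function name, `prior`, and `omega` are illustrative; the paper additionally accumulates class evidence over the whole prefix.

```python
import torch

def gedi_reweight(lm_logits, pos_logits, neg_logits, prior=0.5, omega=30.0):
    """Reweight a large LM's next-token logits with a small class-conditional LM.

    lm_logits:  (vocab,) next-token logits from the big LM (e.g. GPT-2 XL / GPT-3)
    pos_logits: (vocab,) next-token logits from the small CC-LM given the desired class
    neg_logits: (vocab,) next-token logits from the same CC-LM given the undesired class
    Returns reweighted next-token probabilities.
    """
    log_p_pos = torch.log_softmax(pos_logits, dim=-1)
    log_p_neg = torch.log_softmax(neg_logits, dim=-1)
    # Bayes rule per candidate token: P(class=pos | x_t) proportional to prior * P(x_t | pos)
    log_joint = torch.stack([log_p_pos + torch.log(torch.tensor(prior)),
                             log_p_neg + torch.log(torch.tensor(1.0 - prior))])
    p_pos_given_token = torch.softmax(log_joint, dim=0)[0]          # (vocab,)
    # Weighted decoding: P_w(x_t) ~ P_LM(x_t) * P(pos | x_t)^omega
    guided = torch.log_softmax(lm_logits, dim=-1) + omega * torch.log(p_pos_given_token + 1e-10)
    return torch.softmax(guided, dim=-1)
```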

Longformer

Going beyond BERT's max length of 512 up to 4096: a Transformer whose attention cost is O(n) in sequence length (sketch below)

Posted on Sat, Mar 27, 2021 NLP Paper Review
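A small sketch of the sparsity pattern that makes the above possible: each token attends only to a local window plus a few global tokens. Building the pattern as a dense boolean mask, as here, is just for illustration; a real O(n) implementation computes only the banded entries instead of masking an n x n matrix. The function name and window size are assumptions, not the library's API.

```python
import torch

def sliding_window_mask(seq_len, window, global_idx=()):
    """Boolean mask for Longformer-style sparse attention.

    Each token attends to tokens within `window` positions of itself, plus a few
    'global' tokens (e.g. [CLS]) that attend to and are attended by everything,
    so cost scales as O(seq_len * window) rather than O(seq_len ** 2).
    """
    i = torch.arange(seq_len)
    mask = (i[None, :] - i[:, None]).abs() <= window   # local band
    for g in global_idx:                               # global attention tokens
        mask[g, :] = True
        mask[:, g] = True
    return mask

# 4096-token mask with a +-256 local window and a global [CLS] token at position 0
mask = sliding_window_mask(4096, 256, global_idx=(0,))
print(mask.float().mean())   # fraction of attended pairs, far below the dense 1.0
```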

exBERT: Extending Pre-trained Models with Domain-specific Vocabulary Under Constrained Training Resources

Attach a new vocabulary and a (relatively) small, parallel BERT module to an existing BERT and train: domain adaptation (DAPT) works very well, with a consistent improvement of about 5-6%p! (sketch below)

Posted on Fri, Mar 19, 2021 NLP Paper Review
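A rough sketch of the exBERT setup described above, under heavy simplification: the pretrained BERT stays frozen, a small extension encoder with its own domain vocabulary runs in parallel, and the two hidden states are merged with a learned gate. The single merge point, the module sizes, and the assumption that both tokenizations yield the same sequence length are all simplifications for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class ExBERTSketch(nn.Module):
    """Frozen off-the-shelf BERT + small parallel 'extension' encoder with a
    domain vocabulary; hidden states merged via a learned sigmoid gate."""

    def __init__(self, ext_vocab_size, ext_dim=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():       # original model stays frozen
            p.requires_grad = False
        hid = self.bert.config.hidden_size
        self.ext_emb = nn.Embedding(ext_vocab_size, ext_dim)
        layer = nn.TransformerEncoderLayer(d_model=ext_dim, nhead=4, batch_first=True)
        self.ext_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.proj = nn.Linear(ext_dim, hid)
        self.gate = nn.Linear(hid, 1)

    def forward(self, input_ids, ext_input_ids):
        # Assumes input_ids and ext_input_ids are aligned to the same length (simplification).
        h_bert = self.bert(input_ids).last_hidden_state               # (b, n, hid)
        h_ext = self.proj(self.ext_encoder(self.ext_emb(ext_input_ids)))
        g = torch.sigmoid(self.gate(h_bert))                          # learned mixing weight
        return g * h_bert + (1 - g) * h_ext
```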