- 
        Multimodal End-to-End Sparse Model for Emotion RecognitionExisting multimodal solutions incorporate a two-phase pipeline which results in suboptimal performance. We introduces a sparse deep learning model that allows end-to-end learning from raw text, audio, & video altogether with only a single GPU device. 
- 
        CrossNER: Evaluating Cross-Domain Named Entity RecognitionThis blog highlights the motivation of cross-domain NER, introduces a newly collected CrossNER dataset, and provides an analysis of domain-adaptive pre-training methods for the cross-domain NER task. Finally, we discuss the potential research directions on the NER domain adaptation task. 
- 
        Language Model as Few-Shot Learners for Task-Oriented Dialogue SystemsEvaluating the Few-Shot ability of Language Models in Task-Oriented Dialogue Systems