Decomposable Attention Model for Novelty Detection

Abstract

Detecting whether a document contains enough new information to be deemed novel is of immense significance in this age of data duplication. Existing techniques for document-level novelty detection mostly operate at the lexical level and cannot address semantic-level redundancy. These techniques usually rely on handcrafted features extracted from the documents in a rule-based or traditional feature-based machine learning setup. Here, we present an effective approach based on a neural attention mechanism that detects document-level novelty without any manual feature engineering. We contend that a simple alignment of texts between the source and target document(s) can identify whether a target document is novel. Our deep neural architecture elicits inference knowledge from a large-scale natural language inference dataset, which proves crucial to the novelty detection task. Our approach outperforms the standard baselines and recent work on document-level novelty detection by a margin of 3% in terms of accuracy.
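The alignment idea in the abstract can be illustrated with a small sketch of the soft-alignment ("attend") step commonly used in decomposable attention models. This is not the paper's implementation: the toy vectors, the function names, and in particular the `mismatch` score are hypothetical stand-ins chosen for illustration. A real model would use learned embeddings and trained feed-forward layers for the compare and aggregate steps.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_align(target, source):
    """For each target vector, build an attention-weighted average of the
    source vectors (the 'attend' step). Dot-product scores are an assumption;
    a trained model would first pass the vectors through a learned layer."""
    dim = len(source[0])
    aligned = []
    for t in target:
        weights = softmax([dot(t, s) for s in source])
        aligned.append(
            [sum(w * s[d] for w, s in zip(weights, source)) for d in range(dim)]
        )
    return aligned


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def mismatch(target, source):
    """Hypothetical illustration only: average (1 - cosine) between each
    target vector and its aligned source summary. Higher values mean the
    source explains the target less well. The paper uses a trained neural
    classifier instead of a hand-written score like this."""
    aligned = soft_align(target, source)
    return sum(1.0 - cosine(t, a) for t, a in zip(target, aligned)) / len(target)


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" (assumed values, not real data).
    source_doc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    redundant_target = [[1.0, 0.0, 0.0]]  # overlaps with the source
    novel_target = [[0.0, 0.0, 1.0]]      # orthogonal to the source
    print("redundant:", mismatch(redundant_target, source_doc))
    print("novel:", mismatch(novel_target, source_doc))
```

In this toy setup, a target vector that overlaps with the source aligns closely with its attention-weighted summary, while a target vector unrelated to every source vector does not. The abstract's claim is that this kind of alignment signal, learned rather than hand-written, can separate redundant documents from novel ones.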

Images

Decomposable Attention for Novelty Detection

Full paper

The full paper was published in the Cambridge University Press journal Natural Language Engineering.

GitHub repository

The code for replicating the results in the paper is available in the GitHub repository.

Decomposable Attention Model for Novelty Detection

Decomposable attention model for faster and better detection of semantic redundancy in text documents
