Document Structure Learning
Automatic methods for extracting and learning the hierarchical organization of documents. Rather than relying on predefined structural annotations, these approaches learn structure from raw text in a data-driven manner.
Document structure learning spans techniques from dependency parsing (extracting grammatical relationships between words) through rhetorical structure discovery to higher-level discourse organization. In fake news detection, learned document structures offer a signal for distinguishing authentic from fabricated content: real news typically exhibits more coherent, better-organized hierarchical structure than hastily written misinformation.
Key papers
- Learning Hierarchical Discourse-level Structure for Fake News Detection — learns discourse-level dependency trees via unsupervised inter-sentential attention, achieving 82.19% accuracy on fake news classification
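To make the idea of deriving a dependency tree from inter-sentential attention concrete, here is a minimal sketch in plain Python. It is not the paper's method: the function names (`attention_parents`, `tree_depth`), the toy sentence vectors, the scaled dot-product scoring, and the rule that each sentence attaches only to an earlier one (so the result is always a tree rooted at the first sentence) are all simplifying assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_parents(sentence_vecs):
    """Return parents[i]: index of the parent of sentence i, or -1 for the root.

    Assumption (illustrative only): sentence 0 is the root, and every later
    sentence attaches to the earlier sentence it attends to most strongly.
    """
    dim = len(sentence_vecs[0])
    parents = [-1]
    for i in range(1, len(sentence_vecs)):
        # Scaled dot-product score between sentence i and each earlier sentence.
        scores = [
            sum(a * b for a, b in zip(sentence_vecs[i], sentence_vecs[j]))
            / math.sqrt(dim)
            for j in range(i)
        ]
        weights = softmax(scores)  # attention distribution over candidate parents
        parents.append(max(range(i), key=weights.__getitem__))
    return parents


def tree_depth(parents):
    """Length of the longest path from any sentence up to the root."""
    def depth(i):
        d = 0
        while parents[i] != -1:
            i = parents[i]
            d += 1
        return d
    return max(depth(i) for i in range(len(parents)))


# Hypothetical 2-dimensional sentence embeddings for a four-sentence document.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
parents = attention_parents(vecs)
print(parents, tree_depth(parents))
```

Properties of the induced tree, such as its depth, are the kind of structural signal a downstream classifier might consume. In a learned model the sentence vectors and attention parameters would be trained end to end rather than fixed by hand as they are here.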
Related topics
- Discourse Structure (specific focus on discourse organization)
- Tree-structured neural networks (neural architectures for hierarchical data)