Document Structure Learning¶

Automatic methods for extracting and learning the hierarchical organization of documents. Rather than relying on predefined structural annotations, these approaches learn structure from raw text in a data-driven manner.

Document structure learning encompasses techniques ranging from dependency parsing (extracting grammatical relationships) to rhetorical structure discovery and higher-level discourse organization. In fake news detection, learned document structures offer a signal for distinguishing authentic from fabricated content: real news tends to exhibit more coherent, better-organized hierarchical structure than hastily written misinformation.
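To make the idea of inducing a structure from raw text more concrete, the sketch below builds a small dependency tree over sentences using attention-like scores between them. It is only an illustration under strong assumptions, not the method of any specific paper: the sentence vectors are hand-made toy values rather than learned encodings, the scores are plain dot products rather than trained attention, and the names `induce_tree` and `tree_depth` are hypothetical. The first sentence is treated as the root, and each later sentence attaches to the earlier sentence it attends to most, which guarantees an acyclic result.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def induce_tree(sentence_vectors):
    """Return (parents, attention) for a list of sentence vectors.

    parents[i] is the index of the parent of sentence i, or -1 for the root.
    Assumption for this sketch: sentence 0 is the root and every other
    sentence picks its parent among the sentences that precede it.
    """
    parents = [-1]
    attention = [[]]  # the root attends to nothing
    for i in range(1, len(sentence_vectors)):
        scores = [dot(sentence_vectors[i], sentence_vectors[j]) for j in range(i)]
        weights = softmax(scores)
        attention.append(weights)
        # Attach to the earlier sentence with the highest attention weight.
        parents.append(max(range(i), key=weights.__getitem__))
    return parents, attention


def tree_depth(parents):
    """Length of the longest root-to-node path, a simple structural feature."""
    def depth(i):
        d = 0
        while parents[i] != -1:
            i = parents[i]
            d += 1
        return d
    return max(depth(i) for i in range(len(parents)))


# Toy two-dimensional "sentence embeddings" (made up for illustration).
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
parents, attention = induce_tree(vectors)
print(parents)
print(tree_depth(parents))
```

Properties of the induced tree, such as its depth, could then be passed to a classifier as structural features. Which features are actually informative is an empirical question that this sketch does not address.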

Key papers¶

Learning Hierarchical Discourse-level Structure for Fake News Detection — learns discourse-level dependency trees via unsupervised inter-sentential attention, achieving 82.19% accuracy on fake news classification

Related topics¶

Discourse Structure (specific focus on discourse organization)
Tree structured neural networks (neural architectures for hierarchical data)