
Graph-Text Paper Series

We introduce a series of six papers on graph-to-text and text-to-graph generation:

  • Method Papers:
    • (1) Top-1 supervised system at the WebNLG 2020 Challenge,
    • (2) Unsupervised cycle training for graph-to-text and text-to-graph generation (Oral @ INLG 2020 Workshop),
    • (3) Unsupervised one-to-many cycle training (AISTATS 2021).
  • Dataset Paper:
    • Distantly supervised dataset (COLING 2020).
  • Text-to-Graph Sub-Method Papers:
    • (1) Document-level relation extraction,
    • (2) Document-level named entity recognition (NAACL 2019).

What are Graph-to-Text and Text-to-Graph Tasks?

Text-to-Graph (T2G)

Text-to-Graph (T2G) conversion.

Text-to-Graph (T2G) is the task of parsing graph-structured knowledge from text. An important application is the construction of knowledge graphs or databases from text documents.
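As a toy illustration of the task's input and output (not the method of any paper below), T2G maps a sentence to a set of (subject, relation, object) triples. A minimal pattern-based sketch, where the `birthPlace` relation and the regex are illustrative assumptions:

```python
import re

def toy_text_to_graph(text):
    """Toy T2G: extract (subject, relation, object) triples with one
    hand-written pattern. Real systems use learned NER and relation
    extraction models instead of regexes."""
    triples = []
    # Matches sentences like "X was born in Y."
    for m in re.finditer(r"(\w[\w ]*?) was born in (\w[\w ]*)", text):
        triples.append((m.group(1), "birthPlace", m.group(2)))
    return triples

print(toy_text_to_graph("Alan Turing was born in London."))
# [('Alan Turing', 'birthPlace', 'London')]
```

Real T2G systems replace the hand-written pattern with learned models, but the input/output contract is the same.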

Graph-to-Text (G2T)

Graph-to-Text (G2T) conversion.

Graph-to-Text (G2T) is the task of generating a text description from graph-structured data. It can be used to verbalize knowledge graphs, which is useful for intelligent bots.
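Conversely, a toy G2T system maps triples back to text. The sketch below uses hand-written templates purely for illustration; the relation names and templates are assumptions, and real systems (including the ones below) learn this mapping instead:

```python
# Toy G2T: verbalize (subject, relation, object) triples with templates.
TEMPLATES = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} works as a {o}.",
}

def toy_graph_to_text(triples):
    sentences = []
    for s, r, o in triples:
        # Fall back to a generic verbalization for unknown relations.
        template = TEMPLATES.get(r, "{s} {r} {o}.")
        sentences.append(template.format(s=s, r=r, o=o))
    return " ".join(sentences)

print(toy_graph_to_text([("Alan Turing", "birthPlace", "London")]))
# Alan Turing was born in London.
```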

Paper Series (by Amazon AI, Shanghai)

Method Paper #1 (Top 1 @WebNLG 2020 Challenge)

Our 1st model, P2, approaches the supervised G2T task with a plan-and-pretrain approach based on the T5 model. It ranked first on the leaderboard of the WebNLG 2020 Challenge, held at the INLG conference.

TextGen P2: A Plan-and-Pretrain Approach for Knowledge Graph-to-Text Generation
Qipeng Guo, Zhijing Jin, Ning Dai, Xipeng Qiu, Xiangyang Xue, David Wipf, Zheng Zhang
INLG 2020 Workshop / Paper / Code
Ranked #1 on the WebNLG 2020 Challenge leaderboard

Method Paper #2 (INLG 2020 Workshop)

Our 2nd model, CycleGT, addresses unsupervised G2T and T2G. We formulate the two tasks as a joint learning problem via cycle training, and its performance is on par with supervised baselines. This paper was an oral presentation at the WebNLG workshop at INLG 2020.
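The core idea of cycle training can be sketched as follows. Each direction bootstraps the other on unpaired data: predicted graphs supervise the G2T model, and predicted texts supervise the T2G model. This is only a toy sketch of the training loop, not the actual CycleGT implementation; the model and trainer callables are placeholders:

```python
def cycle_train(texts, graphs, g2t, t2g, train_g2t, train_t2g, epochs=10):
    """Sketch of unsupervised cycle training over unpaired text and
    graph corpora. g2t/t2g are the two generation models; train_g2t
    and train_t2g update them from (graph, text) pairs."""
    for _ in range(epochs):
        # Text cycle: text -> predicted graph; reconstruct the text.
        pseudo_graphs = [t2g(t) for t in texts]
        train_g2t(list(zip(pseudo_graphs, texts)))  # supervise G2T
        # Graph cycle: graph -> predicted text; reconstruct the graph.
        pseudo_texts = [g2t(g) for g in graphs]
        train_t2g(list(zip(graphs, pseudo_texts)))  # supervise T2G
```

No gold text-graph pairs are needed: each model's predictions serve as pseudo-labels for training the other direction.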

TextGen IE CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training
Qipeng Guo*, Zhijing Jin*, Xipeng Qiu, Weinan Zhang, David Wipf, Zheng Zhang
INLG 2020 Workshop (Oral) / Paper / Code

Method Paper #3 (AISTATS 2021)

Our 3rd model, CycleCVAE, extends the cycle training framework to one-to-many mappings between graphs and text. In contrast to cycle training methods that assume a one-to-one mapping (e.g., one graph corresponds to one text description), our work enables one-to-many mappings (e.g., one graph corresponds to multiple text descriptions) by augmenting cycle training with a conditional variational autoencoder (CVAE).
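To see why a latent variable enables one-to-many generation, consider a toy generator that conditions on both the graph and a latent code z. This is only an illustration of the idea, not the CycleCVAE model; the triple and the verbalization variants are made up:

```python
def toy_cvae_g2t(triple, z):
    """Toy one-to-many G2T: the latent code z selects among several
    valid verbalizations of the same triple. In a real CVAE, z is a
    continuous latent sampled from a learned posterior/prior."""
    s, r, o = triple
    variants = [
        f"{s} was born in {o}.",
        f"{o} is the birthplace of {s}.",
    ]
    return variants[z % len(variants)]

triple = ("Alan Turing", "birthPlace", "London")
# Different latent samples yield different, equally valid texts.
print(toy_cvae_g2t(triple, z=0))
print(toy_cvae_g2t(triple, z=1))
```

Without z, a deterministic generator can only ever produce one of these texts per graph; sampling z restores the one-to-many mapping.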

Our CycleCVAE paper was accepted at AISTATS 2021.

TextGen IE Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings
Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu, Zheng Zhang, David Wipf
AISTATS 2021 / Paper / Code

Dataset Paper (COLING 2020)

We build a distantly supervised dataset, GenWiki, for G2T and T2G conversion. GenWiki contains 1.3 million content-sharing texts and graphs, and can be used for unsupervised or distantly supervised training. The GenWiki paper is published at COLING 2020.

TextGen IE GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation
Zhijing Jin*, Qipeng Guo*, Xipeng Qiu, Zheng Zhang
COLING 2020 / Paper / Poster / Code

Sub-Method Paper #1

A component of our CycleGT/CycleCVAE framework is the T2G task. T2G is usually done in two steps: named entity recognition (NER) and relation extraction (RE). In our paper “Relation of the Relations” (RoR), we address relation extraction for textual documents: given the entities in a document, our RoR model extracts the relations among them. We use graph neural networks (GNNs) to model both the relations among multiple entities and the meta-level interdependencies among multiple relations.

Our model outperforms state-of-the-art approaches by +1.12% on the ACE05 dataset and +2.55% on SemEval 2018 Task 7.2.

IE Relation of the Relations: A New Paradigm of the Relation Extraction Problem
Zhijing Jin*, Yongyi Yang*, Xipeng Qiu, Zheng Zhang
arXiv 2020 / Paper / Code

Sub-Method Paper #2 (NAACL 2019)

Besides relation extraction, the other component of T2G is named entity recognition (NER). Our earlier paper, GraphIE, performs document-level NER using graph neural networks (GNNs). The GraphIE paper appeared at NAACL 2019.

IE GraphIE: A Graph-Based Framework for Information Extraction
Yujie Qian, Enrico Santus, Zhijing Jin, Jiang Guo, Regina Barzilay
NAACL 2019 / Paper / Code


Since many of our papers use graph neural networks (GNNs), we have also written a short survey on GNNs for NLP.

NLP Graph Neural Net Applications for Natural Language Processing
Xipeng Qiu, Zhijing Jin, Xiangkun Hu
Paper 2020

The researchers behind the above papers are from Amazon AI Shanghai (China), the Max Planck Institute for Intelligent Systems, Tübingen (Germany), Fudan University (China), Tsinghua University (China), Shanghai Jiao Tong University (China), and the Massachusetts Institute of Technology (US).