Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
Accepted at Empirical Methods for Natural Language Processing (EMNLP) 2019 NLP experienced a major change in the previous months. Previously, each NLP task defined a neural model and trained this model on the given task. But in recent months, various papers (ELMo , ULMFiT , GPT , BERT , GPT2 ) showed that it is possible to pre-train a NLP model on a language modelling task (more on this below) and then use this model as a starting point to fine-tune to further tasks. This has been labelled as an important turning point for NLP by many (, , , inter alia).