Chemical reaction prediction with hybrid graph-SMILES transformers
Abstract: In this work, we propose a novel forward reaction prediction method. It is an end-to-end neural machine translation method that takes as input the molecular graphs of the reactants and reagents and outputs the SMILES representation of the products. Named Hybrid Graph-SMILES Transformer (HGST), the method addresses the weaknesses of existing reaction predictors while taking advantage of their strengths. Unlike sequence-based models, HGST needs neither SMILES randomization nor data augmentation to achieve new state-of-the-art results on USPTO benchmarks. This is because the molecular graphs it takes as input are permutation invariant, which also makes training and inference faster. Although stereo-isomeric information encoded in SMILES could be lost in graphs, our model solves this issue by directly incorporating the necessary stereo-isomeric information into the atom and bond features of the molecular graphs. Unlike graph-based models, HGST does not need a predefined atom-mapping and does not have to solve the permutation invariance problem associated with graph prediction, because it predicts a SMILES string.
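The permutation-invariance argument can be illustrated with a toy example. The sketch below is not HGST code: `graph_signature` is a hypothetical helper, and the hand-written atom and bond lists stand in for a real molecular graph. It only shows that two valid SMILES strings for ethanol differ as sequences, while a summary computed from the graph is the same for both atom orderings.

```python
from collections import Counter

def graph_signature(atoms, bonds):
    """Order-independent summary: multiset of (atom, sorted neighbour atoms).

    Hypothetical helper for illustration; a real encoder would use learned
    atom and bond features instead of element symbols.
    """
    nbrs = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        nbrs[i].append(atoms[j])
        nbrs[j].append(atoms[i])
    return Counter((atoms[i], tuple(sorted(nbrs[i]))) for i in nbrs)

# Two valid SMILES strings for ethanol, and the graphs in the same atom order.
smiles_a, smiles_b = "CCO", "OCC"
ethanol_a = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_b = (["O", "C", "C"], [(0, 1), (1, 2)])

same_string = smiles_a == smiles_b
same_graph = graph_signature(*ethanol_a) == graph_signature(*ethanol_b)
print("strings equal:", same_string)
print("graph summaries equal:", same_graph)
```

A sequence model sees `CCO` and `OCC` as different inputs, which is why such models are typically trained on randomized SMILES; a graph input avoids that step.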
The main ingredient of HGST is its molecular graph attention encoder, which allows intra- and inter-molecule, as well as long- and short-range, information exchange between atoms. This aspect is crucial for elucidating the reaction mechanism. HGST includes an explainer that automatically highlights the atoms attended to, providing interpretability from the model's perspective and showing which atoms are important and involved in the reactions. These reaction centers are often provided as inputs to graph-based models and are very difficult to identify in sequence-based models because of the complex syntax of SMILES. In our model, however, we obtain probable reaction center identifications and potential electron movements simply as a byproduct of our modelling choices.
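As a rough illustration of attention across atoms of several molecules, the sketch below computes single-head scaled dot-product attention over a flat list of atoms and ranks atoms by the attention they receive. It is a minimal sketch under stated assumptions, not the HGST encoder or explainer: the atom labels, the two-dimensional embeddings, and the helper names `attention_weights` and `most_attended` are all made up for this example.

```python
import math

def attention_weights(queries, keys):
    """Row-wise softmax of scaled dot products (one attention head)."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def most_attended(weights, labels, top=2):
    """Rank atoms by the total attention they receive from all atoms."""
    received = [sum(row[j] for row in weights) for j in range(len(labels))]
    order = sorted(range(len(labels)), key=lambda j: -received[j])
    return [labels[j] for j in order[:top]]

# Atoms of two molecules in one list, so every atom can attend to every
# other atom, within its own molecule or across molecules.
labels = ["mol1:C", "mol1:O", "mol2:N", "mol2:C"]
embeddings = [[0.1, 0.0], [2.0, 0.0], [2.0, 0.1], [0.0, 0.1]]  # toy values

weights = attention_weights(embeddings, embeddings)
highlighted = most_attended(weights, labels)
print(highlighted)
```

Because attention is computed over all atoms at once, an atom in one molecule can place weight on an atom in another, which is the kind of inter-molecule exchange the abstract describes.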
Speaker: Prudencio Tossou
Directional graph networks
Abstract: In order to overcome the expressive limitations of graph neural networks (GNNs), we propose the first method that exploits vector flows over graphs to develop globally consistent directional and asymmetric aggregation functions. We show that our directional graph networks (DGNs) generalize convolutional neural networks (CNNs) when applied on a grid. Whereas recent theoretical works focus on understanding local neighborhoods, local structures and local isomorphism with no global information flow, our novel theoretical framework allows directional convolutional kernels in any graph.
First, by defining a vector field in the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages onto the field. Then we propose the use of the Laplacian eigenvectors as such a vector field, and we show that the method generalizes CNNs on an n-dimensional grid and is provably more discriminative than standard GNNs with respect to the 1-Weisfeiler-Lehman (1-WL) test. Finally, we bring the power of CNN data augmentation to graphs by providing a means of doing reflection, rotation, and distortion on the underlying directional field.
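The idea of aggregating along a vector field can be sketched on a path graph, where the first non-trivial Laplacian eigenvector has a closed form. This is an illustrative sketch, not the authors' implementation: the field is taken as the difference of eigenvector values across each edge, and the per-node L1 normalisation and the function names are assumptions made for this example.

```python
import math

def path_graph(n):
    """Adjacency list of an undirected path 0-1-...-(n-1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

def first_eigenvector_path(n):
    """First non-trivial Laplacian eigenvector of a path graph (closed form)."""
    return [math.cos(math.pi * (i + 0.5) / n) for i in range(n)]

def directional_aggregate(adj, phi, x, eps=1e-12):
    """Return (smoothing, derivative) of the node signal x along grad(phi)."""
    smooth, deriv = [], []
    for i, nbrs in adj.items():
        field = [phi[j] - phi[i] for j in nbrs]   # field on edges i -> j
        norm = sum(abs(f) for f in field) + eps   # assumed L1 normalisation
        w = [f / norm for f in field]
        smooth.append(sum(abs(wj) * x[j] for wj, j in zip(w, nbrs)))
        deriv.append(sum(wj * (x[j] - x[i]) for wj, j in zip(w, nbrs)))
    return smooth, deriv

n = 5
adj = path_graph(n)
phi = first_eigenvector_path(n)
x = [float(i) for i in range(n)]  # a linear signal along the path

smooth, deriv = directional_aggregate(adj, phi, x)
print([round(d, 6) for d in deriv])
```

On this toy path the derivative of the linear signal is approximately -1 at every node, much like a 1-D gradient filter in a CNN; the sign depends on the arbitrary sign of the eigenvector. Reversing the signal flips the sign of the derivative, which an unweighted mean over neighbours cannot detect at interior nodes.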
We evaluate our method on different standard benchmarks and observe a relative error reduction of 8% on the CIFAR10 graph dataset and 11% to 32% on the molecular ZINC dataset. We further observe significant improvements on real-world biochemical datasets comprising HIV inhibition (OGB-MolHIV) and biological activity classification of 400,000 compounds on 128 tasks (OGB-MolPCBA). An important outcome of this work is that it enables the translation of any physical or biological problem with intrinsic directional axes into a graph network formalism with an embedded directional field. This statement is validated by the strong empirical evidence on molecular property prediction datasets.
Speaker: Dominique Beaini