He is a Deep Learning researcher and, most recently, co-founder and CTO of NEAR.AI, a San Francisco-based company that teaches machines to write code. Previously, he was an Engineering Manager at Google Research, leading a team working on Natural Language Understanding projects.
Topic: Attention is All You Need
Short Description: Most current sequence processing relies on recurrent or convolutional models in an encoder-decoder configuration, and the best-performing models connect the encoder and decoder through an attention mechanism. I’ll describe my recent work at Google on the Transformer, a simple network architecture based solely on attention mechanisms. Experiments on neural machine translation tasks show improvements over the existing best results at a fraction of the training cost of previous state-of-the-art recurrent models.
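The attention mechanism at the core of the Transformer can be sketched in a few lines. This is a minimal, illustrative NumPy version of scaled dot-product attention (the function name and toy shapes are my own choices, not from the talk): each query is compared against all keys, the similarities are normalized with a softmax, and the result is a weighted sum of values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative sketch: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ V                               # weighted sum of values

# Toy example: 2 queries attending over 3 key/value pairs of dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4)
```

In an encoder-decoder setup, the queries would come from the decoder and the keys/values from the encoder; the Transformer stacks such attention layers in place of recurrence entirely.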