How to optimize models and hardware for fast inference in high-load projects
After training a model, you need to put it into production. In this talk we will discuss the steps required to optimize a model for production deployment. We will look at the advantages and disadvantages of serving models in a private cloud, on AWS / Google Cloud, or on your own GPU servers. We will also cover OS, software, and hardware optimization, using as an example the development of Nomeroff Net, the vehicle number plate recognition system at AUTO.RIA.com.
Prerequisites for attendees: an understanding of convolutional neural network internals.