The Bag of Words Torn Open: Image Retrieval goes Deep (ENG)
This talk will introduce an automated method for fine-tuning a convolutional neural network (CNN) for image retrieval from a large collection of unordered images. State-of-the-art retrieval and Structure-from-Motion (SfM) methods are used to obtain 3D models, which in turn guide the selection of training data for CNN fine-tuning. Hard-positive and hard-negative examples are shown to enhance the final performance in particular-object retrieval with compact codes. Remarkably, the proposed method is on par with existing 256D compact representations even when using only 32D image vectors. To motivate the proposed work, the talk will also give a comprehensive review of state-of-the-art image-retrieval systems that use compact codes built from a bag-of-words image representation.
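As an illustration of what selecting hard-negative examples can look like, the following is a minimal sketch and not the procedure presented in the talk. It assumes that every image already has a descriptor from the current network and the ID of the 3D model it was assigned to by SfM, and it treats as hard negatives the images most similar to the query that belong to a different 3D model. The function name `mine_hard_negatives` and its parameters are hypothetical.

```python
import numpy as np


def mine_hard_negatives(descriptors, model_ids, query_idx, k=5):
    """Return indices of the k images most similar to the query that
    come from a different 3D model (assumed definition of hard negatives).

    descriptors: (N, D) array of image descriptors (hypothetical input).
    model_ids:   (N,) array with the 3D-model ID of each image (hypothetical).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    model_ids = np.asarray(model_ids)

    # L2-normalize so that the dot product equals cosine similarity.
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    unit = descriptors / np.maximum(norms, 1e-12)
    sims = unit @ unit[query_idx]

    # Keep only images from other 3D models; this also excludes the query.
    candidates = np.flatnonzero(model_ids != model_ids[query_idx])

    # Sort the remaining candidates by decreasing similarity to the query.
    ranked = candidates[np.argsort(-sims[candidates], kind="stable")]
    return ranked[:k]
```

Images that rank high here look similar to the query under the current descriptor yet depict a different object, which is what makes them informative training examples; an analogous selection restricted to the query's own 3D model would yield candidate positives.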