Over the last decades, the scientific community has studied information retrieval and content-based image classification as sophisticated alternatives to the often ineffective search by keywords or other textual evidence. A key application of this technology is Computer-Aided Diagnosis (CAD): procedures that assist medical personnel in interpreting medical images. Automated screening for certain diseases, such as Diabetic Retinopathy and Skin Cancer, is a pressing need, because the incidence of those diseases is increasing much faster than the number of specialists trained to perform the screening.
Deep Learning Architectures (DLA), such as Deep Neural Networks and Deep Belief Networks, are among the most prominent solutions for pattern recognition in images. Nevertheless, those solutions require estimating a huge number of parameters, which demands large training sets and substantial computational resources.
The most advanced representations based upon the Bag of Visual Words (BoVW) model are competing with DLA. Those cutting-edge representations are also “deep” in a sense, since they are based upon many layers of feature extraction. However, most of the time, there is no learning involved in the lower levels. This makes BoVW models less flexible than DLA, but also much less greedy in terms of computing resources and annotated data. Cutting-edge BoVW representations result in very large feature vectors (up to hundreds of thousands of dimensions), which we collectively call Jumbo Vectors. Although a complete analytical justification of both DLA and BoVW for image classification is still lacking, the high dimensionality of the latter (in combination with the capacity limitation of Support Vector Machines) seems to be a necessary, but by no means sufficient, condition for their success.
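As an illustration of the basic BoVW idea only, the sketch below hard-assigns each local descriptor to its nearest codeword and sum-pools the assignments into a normalized histogram. It is a minimal example under stated assumptions: the two-dimensional descriptors, the three-word codebook, and the function names are hypothetical, and the cutting-edge representations discussed above use far richer coding and pooling that yield much larger vectors.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bovw_histogram(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword, then sum-pool."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1.0
    # L2-normalize so images with different numbers of descriptors are comparable.
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm > 0 else counts

# Hypothetical toy data: a 3-word codebook and 4 local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [0.1, 0.9]]
histogram = bovw_histogram(descriptors, codebook)
```

Note that no parameter of this encoding step is learned from labels, which is the sense in which the lower levels involve no learning; only the final classifier is trained on annotated data.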
Our aim is to advance the state of the art in Computer-Aided Diagnosis (CAD) for the screening of pathologies based upon medical images. Our target applications are the early screening of Melanoma and Diabetic Retinopathy.
Melanoma is the leading cause of death due to skin cancer. Its prognosis is very good when it is found early, but deteriorates rapidly as the disease progresses; therefore, early screening is critical.
Diabetic Retinopathy is a leading cause of blindness worldwide. Early diagnosis also plays a critical role in the expected treatment outcomes.
More often than not, the limiting factor for CAD research is the availability of annotated data. For both studies we have cooperated with medical personnel in order to secure enough data to train and test the classification models. The Retinopathy datasets, described in , are already publicly available. For the Melanoma screening we have secured two datasets: one with 747 dermatoscopy images, 187 of them melanomas, and another with 437 clinical images, 125 of them melanomas.
It is clear that such small datasets are largely insufficient to estimate the millions of parameters involved in Deep Learning Architectures (DLA). In our preliminary experiments on the melanoma datasets, the Jumbo Vector approach showed encouraging results, while a modest 4-layer network failed completely due to extreme overfitting. The alternative we want to explore to make DLA competitive with Jumbo Vectors is to employ other annotated data, such as the standard computer vision datasets PASCAL VOC, ImageNet, ImageCLEF and MediaEval, to train the lower layers of the architecture, in a transfer learning scheme. The Amazon Web Services (AWS) Grant therefore appears as a fantastic opportunity to explore the competitiveness of DLA in such a scheme. We also intend to explore parallel implementations of the Jumbo Vector methods, which likewise have many costly but embarrassingly parallel processing steps.
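The transfer learning idea can be sketched in a few lines: the lower layer is kept frozen, as if its weights had been trained on a large external dataset, and only a small top layer is fitted on the scarce target data. This is a toy illustration and not the proposed architecture; the weight values, the four-sample target set, and all function names are hypothetical, and a real study would use a deep network and a proper framework.

```python
import math

def relu_layer(x, weights):
    """Frozen lower layer: fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, f):
    """Top-layer probability for one feature vector."""
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

def train_top_layer(features, labels, epochs=200, lr=0.5):
    """Fit a logistic top layer by stochastic gradient descent."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            err = score(w, b, f) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b

# Hypothetical weights standing in for a lower layer pretrained elsewhere.
pretrained_lower = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]

# Hypothetical tiny target dataset (inputs and binary labels).
samples = [[1.0, 0.2], [0.9, 0.1], [0.2, 1.0], [0.1, 0.8]]
labels = [1, 1, 0, 0]

# The lower layer is never updated; only the top layer sees the labels.
frozen_features = [relu_layer(x, pretrained_lower) for x in samples]
w, b = train_top_layer(frozen_features, labels)
predictions = [int(score(w, b, f) > 0.5) for f in frozen_features]
```

Because only the top layer's weights and bias are estimated from the target data, the number of parameters fitted on the small dataset stays small, which is the property that motivates the scheme.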
Expected outcomes (24-month timeframe)
- Deep Learning Architectures implementation in the AWS environment. Most of the necessary code is already available, although the parameterization work is delicate;
- Parallel implementation of the Fisher Vectors and BossaNova frameworks in the AWS environment. The sequential source code is completely implemented, and a parallel version is already under way;
- Tests on the Melanoma screening application (annotated datasets already acquired);
- Tests on the Diabetic Retinopathy application (annotated datasets already acquired);
- A comprehensive, original study of transfer learning for DLA. We expect this study to have great potential for scientific impact.
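Regarding the parallel implementation listed above, an embarrassingly parallel step is one where each image can be processed independently, so the work reduces to a map over the image collection. The sketch below shows that pattern only. The `encode_image` function is a hypothetical stand-in for a costly per-image encoding, and a thread pool is used purely to keep the example portable; CPU-bound work would more plausibly be spread over processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

def encode_image(pixels):
    """Hypothetical stand-in for a costly, independent per-image encoding."""
    total = sum(pixels)
    return [p / total for p in pixels] if total else list(pixels)

def encode_all(images, workers=4):
    """Encode every image independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_image, images))

# Hypothetical toy "images" given as short lists of pixel values.
images = [[1, 1, 2], [0, 0, 0], [3, 1, 0]]
encoded = encode_all(images)
```

Since no image depends on the result of another, the same function could be dispatched to any number of workers without changing the output.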
 Ramon Pires, Herbert F. Jelinek, Jacques Wainer, Siome Goldenstein, Eduardo Valle, Anderson Rocha. “Assessing the Need for Referral in Automatic Diabetic Retinopathy Detection”. IEEE Transactions on Biomedical Engineering
 Ning Situ, Xiaojing Yuan, Ji Chen, G. Zouridakis. “Malignant melanoma detection by Bag-of-Features classification”. 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2008), pp. 3110–3113, 20-25 Aug. 2008.
 Hinton, G. E. Deep Belief Networks. http://scholarpedia.org/article/Deep_belief_networks. 2009.
 Csurka, G., Bray, C., Dance, C., and Fan, L. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, European Conference on Computer Vision (ECCV), pages 1–22, 2004.
 Avila, S.; Thome, N.; Cord, M.; Valle, E.; Araújo, A. Pooling in Image Representation: The Visual Codeword Point of View. Computer Vision and Image Understanding (CVIU), volume 117, issue 5, p. 453-465, 2013.
 The PASCAL Visual Object Classes, http://pascallin.ecs.soton.ac.uk/challenges/VOC/.
 ImageNet, http://www.image-net.org/.
 ImageCLEF – The CLEF Cross Language Image Retrieval Track, http://www.imageclef.org/.
 MediaEval Benchmarking Initiative for Multimedia Evaluation, http://multimediaeval.org/