New dissimilarity measures for image phylogeny reconstruction

Image phylogeny is the problem of reconstructing the structure that represents the history of generation of semantically similar images (e.g., near-duplicate images). Typical image phylogeny approaches break the problem into two steps: (1) estimating the dissimilarity between each pair of images and (2) reconstructing the phylogeny structure.

In this article, the authors propose new approaches to the standard formulation of the dissimilarity measure employed in image phylogeny, aiming at improving the reconstruction of the tree structure that represents the generational relationships between semantically similar images. These new formulations exploit a different method of color adjustment, local gradients to estimate pixel differences and mutual information as a similarity measure.
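To give a feel for how mutual information can act as a similarity measure between two images, here is a minimal sketch. It is not the authors' implementation: the function name `mutual_information`, the flat-list image representation and the 8-bin quantization are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=8, max_value=256):
    """Mutual information (in bits) between two grayscale images.

    Both images are flat lists of ints in [0, max_value) with equal length
    (an assumed representation, chosen only to keep the sketch small).
    """
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must be non-empty and have the same size")

    # Quantize intensities into a small number of bins.
    qa = [v * bins // max_value for v in img_a]
    qb = [v * bins // max_value for v in img_b]

    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram
    hist_a = Counter(qa)          # marginal histogram of image A
    hist_b = Counter(qb)          # marginal histogram of image B

    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) / (p(a) * p(b)) simplifies to count * n / (hist_a * hist_b).
        mi += (count / n) * math.log2(count * n / (hist_a[a] * hist_b[b]))
    return mi
```

Higher values mean the pixel intensities of one image are more predictable from the other; two unrelated images score close to zero.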

The proposed formulations remarkably outperform their counterparts in the literature. This allows a much better analysis of the kinship relationships in a set of images and a more accurate deployment of phylogeny solutions to traitor tracing, copyright enforcement and digital forensics problems.

Filipe Costa, Alberto Oliveira, Pasquale Ferrara, Zanoni Dias, Siome Goldenstein and Anderson Rocha. Pattern Analysis and Applications (2017). doi:10.1007/s10044-017-0616-9

Posted in blog, publications, science

Temporal robust features for violence detection

Automatically detecting violence in videos is paramount for enforcing the law and providing society with better policies for safer public places. In addition, it may be essential for protecting minors from accessing inappropriate content online, and for helping parents choose suitable movie titles for their children. However, this is an open problem, as the very definition of violence is subjective and may vary from one society to another. Detecting such nuances in video footage with no human supervision is very challenging.

In this paper, the authors explore a fast end-to-end Bag-of-Visual-Words (BoVW)-based framework for violence classification. They adapt Temporal Robust Features (TRoF), a fast spatio-temporal interest point detector and descriptor that is custom-tailored for detecting inappropriate content such as violence. The method holds promise for fast and effective classification in other recognition tasks (e.g., pornography and other inappropriate material). When compared to more complex counterparts for violence detection, the method shows similar classification quality while being several times more efficient in terms of runtime and memory footprint.

The explored three-layered BoVW-based framework for video violence classification

Daniel Moreira, Sandra Avila, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Eduardo Valle, Siome Goldenstein, Anderson Rocha. Temporal Robust Features for Violence Detection. In: IEEE Intl. Winter Conference on Applications of Computer Vision (WACV), 2017, Santa Rosa. p. 1-9.

Posted in blog, publications, science

Our research about pornography detection on the News

Our research about pornography detection, recently published in Neurocomputing, is getting the attention of many news agencies in Brazil. It's a good indicator of the impact this work has on our modern society. All of them are listed below (in Portuguese):

Posted in blog, extra, media

Talk: Multimedia Integrity Analytics

Today Prof. Anderson Rocha will give a talk at the University of Kentucky about Multimedia Integrity Analytics. The talk is part of the university's weekly seminars and presentations.

Abstract: Currently, multimedia objects can be easily created, stored, (re)transmitted, and edited for good or bad. In this sense, there has been an increasing interest in finding the structure of temporal evolution within a set of documents and how documents are related to one another over time. This process, also known in the literature as Multimedia Phylogeny, aims at finding the phylogeny tree(s) that best explain the creation process of a set of related documents (e.g., images/videos) and their ancestry relationships. Solutions to this problem have direct applications in forensics, security, copyright enforcement, news tracking services and other areas. In this talk, we will explore solutions for reconstructing the evolutionary tree(s) associated with a set of visual documents, more specifically images and videos. This can help experts track the source of child pornography image broadcasting or the chain of image and video distribution over time, which is extremely useful for complex media provenance tasks. Finally, we will also discuss how to implement such solutions for large-scale setups considering millions of documents at the same time.

The full set of slides is available here.

Posted in blog, Keynotes, science, talk

Empirical Evaluation of Resampling Procedures for Optimising SVM Hyperparameters

Tuning the regularisation and kernel hyperparameters is a vital step in optimising the generalisation performance of kernel methods, such as the support vector machine (SVM). This is most often performed by minimising a resampling/cross-validation based model selection criterion; however, there seems to be little practical guidance on the most suitable form of resampling.

In this article, the authors present the results of an extensive empirical evaluation of resampling procedures for SVM hyperparameter selection, designed to address this gap in the machine learning literature. They tested 17 different resampling procedures on 121 binary classification data sets in order to select the best SVM hyperparameters.

The conclusion is that a 2-fold procedure should be used for data sets with more than 1000 points. In these cases, the user may expect a difference of −0.0031 to 0.0031 in the classifier's error rate compared with a 5-fold procedure, which the authors believe is the limit of what one should consider an irrelevant change in the error rate. For smaller data sets, they could not detect any significant difference (on average) between 5-fold and computationally more costly procedures such as 10-fold, 5 to 20 times repeated bootstrap, or 2 times repeated 5-fold. For these smaller data sets, the authors consider a 3-fold procedure appropriate.
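The rule of thumb above can be sketched in a few lines of plain Python. This is only an illustration, not the authors' experimental code: the helper names `choose_n_folds`, `kfold_splits` and `select_hyperparameters` are hypothetical, and `fold_error` stands in for whatever routine trains an SVM on the training indices and returns its error rate on the held-out indices.

```python
import random


def choose_n_folds(n_samples, threshold=1000):
    """2 folds for data sets with more than `threshold` points, 3 otherwise."""
    return 2 if n_samples > threshold else 3


def kfold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def select_hyperparameters(candidates, fold_error, n_samples, seed=0):
    """Return the candidate with the lowest mean error across the folds.

    `fold_error(candidate, train_idx, test_idx)` is a user-supplied callback
    (an assumption of this sketch) that returns an error rate.
    """
    splits = list(kfold_splits(n_samples, choose_n_folds(n_samples), seed))

    def mean_error(candidate):
        return sum(fold_error(candidate, tr, te) for tr, te in splits) / len(splits)

    return min(candidates, key=mean_error)
```

In practice `candidates` would be a grid of (C, gamma) pairs, and the callback would wrap the SVM library of your choice.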

WAINER, Jacques; CAWLEY, Gavin. Empirical evaluation of resampling procedures for optimising SVM hyperparameters. Journal of Machine Learning Research, 2016.

Posted in blog, publications, science

RECOD’s Diabetic Retinopathy research on the news

Our research on Diabetic Retinopathy has gotten the attention of a well-known website about health and wellness in Brazil. The article (in Portuguese) highlights the importance of diabetic retinopathy treatment and how society can benefit from our proposed solution. As this is ongoing research, a new article with our recent discoveries is under preparation. Stay tuned!

(credits to: Gustavo Arrais)

Posted in blog, extra, media

Empirical comparison of cross-validation and internal metrics for tuning SVM hyperparameters

Hyperparameter tuning is a mandatory step for building a support vector machine classifier. In this article, the authors study some methods based on metrics of the training set itself, and not the performance of the classifier on a different test set – the usual cross-validation approach. Then, they compare cross-validation (5-fold) with Xi-alpha, radius-margin bound, generalized approximate cross validation, maximum discrepancy and distance between two classes on 110 public binary data sets.

The authors demonstrate that cross-validation is the method that results in the best selection of hyperparameters, but it is also one of the methods with the highest execution times. On the other hand, distance between two classes (DBTC) is the fastest and the second-best ranked method. The authors also argue that DBTC is a reasonable alternative to cross-validation when training/hyperparameter-selection times are an issue, and that the loss in accuracy when using DBTC is reasonably small.
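For readers curious why an internal metric like DBTC is so cheap, here is a rough sketch. It assumes DBTC is the squared distance between the two class means in the feature space of an RBF kernel, and that the kernel width is picked by maximizing that distance; the post does not spell out the formula, so treat both as assumptions. The function names are hypothetical.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))


def dbtc(pos, neg, gamma):
    """Squared distance between the class means in kernel feature space
    (assumed definition of DBTC for this sketch)."""
    return (mean_kernel(pos, pos, gamma)
            + mean_kernel(neg, neg, gamma)
            - 2.0 * mean_kernel(pos, neg, gamma))


def pick_gamma(pos, neg, grid):
    """Choose the kernel width in `grid` that maximizes the class distance."""
    return max(grid, key=lambda g: dbtc(pos, neg, g))
```

No classifier is trained at any point: each candidate only needs kernel evaluations on the training set, whereas cross-validation has to train one SVM per fold and per candidate.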

Edson Duarte, Jacques Wainer, Empirical comparison of cross-validation and internal metrics for tuning SVM hyperparameters, Pattern Recognition Letters, Volume 88, 1 March 2017, Pages 6-11, ISSN 0167-8655.

Posted in blog, publications, science

Video pornography detection through deep learning techniques and motion information

In this paper, the authors deal with a growing issue of our connected society: automated filtering of sensitive media (pornographic, violent, gory, etc.). A range of applications has increased societal interest in the problem, e.g., detecting inappropriate behavior via surveillance cameras, or curtailing the exchange of sexually charged instant messages, also known as “sexting”, by minors. In addition, law enforcers may use pornography filters as a first sieve when looking for child pornography in the forensic examination of computers or Internet content. The main application, however, remains preventing the upload of or access to undesired content for certain demographics (e.g., minors) or environments (e.g., schools, workplaces).

In spite of the success of deep learning techniques in the computer vision arena, the literature applying them to pornography detection is very scarce. In this work, the authors design and develop deep learning-based approaches to automatically extract discriminative spatio-temporal characteristics for filtering pornographic content in videos. The evaluation of the proposed techniques shows that combining deep learning with both static and motion information considerably improves pornography detection, not only over the current scientific state of the art but also over off-the-shelf software solutions.

The contributions of this paper are three-fold:

i) A novel method for classifying pornographic videos, using convolutional neural networks along with static and motion information;

ii) A new technique for exploring the motion information contained in the MPEG motion vectors;

iii) A study of different forms of combining the static and motion information extracted from questioned videos.
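As a rough illustration of what combining static and motion information can look like, here is a minimal late-fusion sketch. The post does not say which fusion schemes the paper studies, so late fusion is an assumption picked for simplicity; the function names, the equal weighting and the 0.5 threshold are hypothetical as well.

```python
def late_fusion(static_scores, motion_scores, static_weight=0.5):
    """Weighted average of per-snippet scores from two independent models.

    `static_scores` come from a model that looks at still frames and
    `motion_scores` from one that looks at motion (both assumed to be
    probabilities in [0, 1], aligned snippet by snippet).
    """
    if len(static_scores) != len(motion_scores) or not static_scores:
        raise ValueError("score lists must be non-empty and aligned")
    w = static_weight
    return [w * s + (1.0 - w) * m for s, m in zip(static_scores, motion_scores)]


def classify_video(static_scores, motion_scores, threshold=0.5, static_weight=0.5):
    """Flag a video when the mean fused score reaches the threshold."""
    fused = late_fusion(static_scores, motion_scores, static_weight)
    return sum(fused) / len(fused) >= threshold
```

Other schemes merge the two sources earlier, for instance by concatenating features before classification; the sketch above only covers the simplest case of merging final scores.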

Mauricio Perez, Sandra Avila, Daniel Moreira, Daniel Moraes, Vanessa Testoni, Eduardo Valle, Siome Goldenstein, Anderson Rocha, Video pornography detection through deep learning techniques and motion information, Neurocomputing, Volume 230, 22 March 2017, Pages 279-293, ISSN 0925-2312.

Posted in blog, publications, science

RECOD wins international competition for melanoma classification

A team of RECOD researchers won the melanoma classification task at the “Skin Lesion Analysis towards Melanoma Detection” challenge promoted by the International Skin Imaging Collaboration (ISIC).

RECOD took third place (among 23 participants) in skin lesion classification for two lesions (melanoma and seborrheic keratosis), and fifth place in skin lesion segmentation. For the specific task of melanoma detection, the most important in this research area, RECOD took first place. RECOD’s participation in those tasks is detailed in a technical report (submitted before the official ranking was announced).

The results will be presented by Prof. Eduardo Valle at the upcoming International Symposium on Biomedical Imaging (ISBI 2017), where the RECOD team will also present a paper about Transfer Learning and Deep Learning for skin lesion classification.

The team was composed of professors Eduardo Valle and Sandra Avila, post-doc researcher Lin Tzy Li, Ph.D. student Michel Fornaciali, and M.Sc. students Afonso Menegola and Julia Tavares, all RECOD members.

Prof. Eduardo Valle and Michel Fornaciali were recipients of the Google Research Awards for Latin America 2016, with a project related to the automatic screening of melanoma. More details can be found at Unicamp News (in Portuguese).

RECOD Titans Melanoma Team

From left to right: Julia Tavares, Prof. Sandra Avila, Michel Fornaciali, Prof. Eduardo Valle, Dr. Lin Tzy Li, and Afonso Menegola

Posted in awards, blog, publications, science

Talk: The brave new world of open-set recognition

In the fourth of a series of four talks at NTU Singapore, Prof. Anderson Rocha (RECOD) explores the research field of Open-set Recognition. The talk comprises four parts, each lasting approximately 45 minutes, totalling three hours. Parts 1 and 2 were delivered on Day #1 (March 6th), while the others will be delivered on Day #2 (March 9th).

Abstract: Coinciding with the rise of large-scale statistical learning within the visual computing, forensics and security areas, there has been a dramatic improvement in methods for automated image recognition in a myriad of applications, ranging from categorization and object detection to forensics and human biometrics, among many others. Despite this progress, a tremendous gap exists between the performance of automated methods in the laboratory and the performance of those same methods in the field. A major contributing factor to this is the way in which machine learning algorithms are typically evaluated: without the expectation that a class unknown to the algorithm at training time will be experienced at test time during operational deployment.

The purpose of this talk is to introduce the audience to this difficult problem in statistical learning, specifically in the context of visual computing, information forensics and security applications. Examples from other areas will also be given for completeness. A number of different topics will be explored, including supervised machine learning, probabilistic models, kernel machines, statistical extreme value theory, and case studies for applications related to the analysis of images.
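To make the open-set setting concrete, here is a small sketch of a nearest-neighbor classifier with a reject option. It is only loosely in the spirit of distance-ratio approaches to open-set recognition and is not taken from the talk: the function name `open_set_nn`, the data layout and the 0.7 ratio threshold are all assumptions.

```python
import math


def open_set_nn(sample, training, ratio_threshold=0.7):
    """Return a known label, or None when the sample looks 'unknown'.

    `training` is a list of (vector, label) pairs with at least two distinct
    labels. The sample is accepted only if it is clearly closer to one class
    than to the runner-up class.
    """
    nearest = {}  # label -> distance to the closest training point of that label
    for vector, label in training:
        distance = math.dist(sample, vector)
        if distance < nearest.get(label, math.inf):
            nearest[label] = distance

    ranked = sorted(nearest.items(), key=lambda item: item[1])
    if len(ranked) < 2:
        raise ValueError("need training points from at least two classes")

    (best_label, d1), (_, d2) = ranked[0], ranked[1]
    if d2 == 0.0:
        return None  # equally (and perfectly) close to two classes: ambiguous
    return best_label if d1 / d2 <= ratio_threshold else None
```

A closed-set classifier would always answer with one of the training labels; returning `None` is what lets this one admit that a test sample may belong to a class it has never seen.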


Part 1: An introduction to the open set recognition problem

  • General introduction: where do we find open set problems in visual computing, information forensics and security?
  • Decision models in machine learning
  • Theoretical background: the risk of the unknown
  • The compact abating probability model (Scheirer et al., T-PAMI 2014)
  • The Open-set Nearest Neighbors classifier (Mendes Jr. et al., Machine Learning, 2017)

10-minute break

Part 2: Algorithms that minimize the risk of the unknown

  • Kernel Density Estimation
  • 1-Class Support Vector Machines (SVMs)
  • Support Vector Data Description
  • 1-vs-Set Machine (Scheirer et al. T-PAMI 2013)
  • PI-SVM (Jain et al. ECCV 2014)
  • W-SVM (Scheirer et al. T-PAMI 2014)
  • Decision Boundary Carving (Costa et al. 2014)
  • Open-set Nearest Neighbors (Mendes Jr. et al. 2017)

Part 3: Case studies related to visual computing and other areas

  • Image Classification/Recognition
  • Visual Information Retrieval
  • Detection problems (e.g., pedestrian, objects)
  • Face Recognition
  • Scene Analysis for Surveillance
  • Source Camera Attribution
  • Authorship Attribution

10-minute break

Part 4: Research opportunities and trends

  • The open set recognition problem and new feature characterization methods (e.g., deep learning)
  • Integrating open set solutions with the image characterization process directly (strongly generalizable image characterization)
  • Opportunities for novelty detection and automatic addition of classes (online adaptation)
  • Bringing the user into the loop (relevance feedback)
  • Final considerations

If you are interested in the talk’s content, the complete set of slides is available here.

Time: 2.00pm – 3.30pm (seated by 1.50pm)

Venue: Demo Room, ROSE Lab, Research Techno Plaza (RTP), Level 4, Border X Block, 50 Nanyang Drive, 637553

Posted in blog, Keynotes, science, talk