
Alberto Cano, Ph.D.

Assistant Professor | VCU College of Engineering

Engineering East Hall, Room E4251, Richmond, VA, UNITED STATES

Dr. Cano specializes in machine learning, data mining, classification, big data, and high-performance computing.



Alberto Cano is an Assistant Professor with the Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, United States, where he heads the High-Performance Data Mining laboratory. His research is focused on machine learning, data mining, big data, evolutionary computation, general-purpose computing on graphics processing units, and distributed computing.

Areas of Expertise (5)

High Performance Computing

Machine Learning

Data Mining

Big Data


Accomplishments (1)

Amazon Machine Learning Award (professional)


Hate Speech Detection on Amazon Reviews using Data Stream Mining on Spark and AWS

Education (5)

University of Granada, Spain: Ph.D., Computer Science 2014

University of Cordoba, Spain: M.Sc., Intelligent Systems 2013

University of Granada, Spain: M.Sc., Soft Computing and Intelligent Systems 2011

University of Cordoba, Spain: B.Sc., Computer Science 2010

University of Cordoba, Spain: B.Sc., Computer Engineering 2008

Research Grants (2)

Industry Sponsored Research

Hamilton Beach Brands Inc.

Hate Speech Detection on Amazon Reviews using Data Stream Mining on Spark and AWS

Amazon Machine Learning Awards

Courses (2)

CMSC 508 - Database Theory

CMSC 603 - High Performance Distributed Systems

Selected Articles (7)

Kappa Updated Ensemble for Drifting Data Stream Mining | Machine Learning

A. Cano and B. Krawczyk


Learning from data streams in the presence of concept drift is among the biggest challenges of contemporary machine learning. Algorithms designed for such scenarios must take into account the potentially unbounded size of data, its constantly changing nature, and the requirement for real-time processing. Ensemble approaches for data stream mining have gained significant popularity, due to their high predictive capabilities and effective mechanisms for alleviating concept drift. In this paper, we propose a new ensemble method named Kappa Updated Ensemble (KUE). It is a combination of online and block-based ensemble approaches that uses the Kappa statistic for dynamic weighting and selection of base classifiers. In order to achieve a higher diversity among base learners, each of them is trained using a different subset of features and updated with new instances with a given probability following a Poisson distribution. Furthermore, we update the ensemble with new classifiers only when they contribute positively to the improvement of the quality of the ensemble. Finally, each base classifier in KUE is capable of abstaining from taking part in voting, thus increasing the overall robustness of KUE. An extensive experimental study shows that KUE is capable of outperforming state-of-the-art ensembles on standard and imbalanced drifting data streams while having a low computational complexity. Moreover, we analyze the use of Kappa vs. accuracy to drive the criterion to select and update the classifiers, the contribution of the abstaining mechanism, the contribution of the diversification of classifiers, and the contribution of the hybrid architecture to update the classifiers in an online manner.
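
As a rough illustration of why the abstract contrasts Kappa with accuracy, the sketch below computes Cohen's Kappa for a batch of predictions. It is a minimal, generic example and not the KUE implementation; the function name cohen_kappa and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between labels and predictions, corrected for chance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement p_o is plain accuracy.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected chance agreement p_e comes from the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single class everywhere); returning 0 is a convention assumed here.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy imbalanced batch (hypothetical data): a classifier that always predicts
# the majority class reaches 90% accuracy but shows no agreement beyond chance.
labels = [0] * 9 + [1]
majority_only = [0] * 10
accuracy = sum(t == p for t, p in zip(labels, majority_only)) / len(labels)
kappa = cohen_kappa(labels, majority_only)
```

On this toy batch the majority-only predictor gets high accuracy and a Kappa of zero, which suggests why a chance-corrected score can be a more informative weighting signal on imbalanced streams.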


Multi-label Punitive kNN with Self-Adjusting Memory for Drifting Data Streams | ACM Transactions on Knowledge Discovery from Data

M. Roseberry, B. Krawczyk, and A. Cano


In multi-label learning, data may simultaneously belong to more than one class. When multi-label data arrives as a stream, the challenges associated with multi-label learning are joined by those of data stream mining, including the need for algorithms that are fast and flexible, able to match both the speed and evolving nature of the stream. This paper presents a punitive k nearest neighbors algorithm with a self-adjusting memory (MLSAMPkNN) for multi-label, drifting data streams. The memory adjusts in size to contain only the current concept and a novel punitive system identifies and penalizes errant data examples early, removing them from the window. By retaining and using only data that are both current and beneficial, MLSAMPkNN is able to adapt quickly and efficiently to changes within the data stream while still maintaining a low computational complexity. Additionally, the punitive removal mechanism offers increased robustness to various data-level difficulties present in data streams, such as class imbalance and noise. The experimental study compares the proposal to 24 algorithms using 30 real-world and 15 artificial multi-label data streams on six multi-label metrics, evaluation time, and memory consumption. The superior performance of the proposed method is validated through non-parametric statistical analysis, proving both high accuracy and low time complexity. MLSAMPkNN is a versatile classifier, capable of returning excellent performance in diverse stream scenarios.


Evolving Rule-Based Classifiers with Genetic Programming on GPUs for Drifting Data Streams | Pattern Recognition

A. Cano and B. Krawczyk


Designing efficient algorithms for mining massive high-speed data streams has become one of the contemporary challenges for the machine learning community. Such models must display the highest possible accuracy and the ability to swiftly adapt to any kind of change, while at the same time being characterized by low time and memory complexities. However, little attention has been paid to designing learning systems that will allow us to gain a better understanding of incoming data. There are few proposals on how to design interpretable classifiers for drifting data streams, and most of them are characterized by a significant trade-off between accuracy and interpretability. In this paper, we show that it is possible to have all of these desirable properties in one model. We introduce ERulesD2S: an evolving rule-based classifier for drifting data streams. By using grammar-guided genetic programming, we are able to obtain accurate sets of rules per class that are able to adapt to changes in the stream without a need for an explicit drift detector. Additionally, we augment our learning model with new proposals for rule propagation and data stream sampling, in order to maintain a balance between learning and forgetting of concepts. To improve the efficiency of mining massive and non-stationary data, we implement ERulesD2S parallelized on GPUs. A thorough experimental study on 30 datasets proves that ERulesD2S is able to efficiently adapt to any type of concept drift and outperform state-of-the-art rule-based classifiers, while using a small number of rules. At the same time, ERulesD2S is highly competitive with other single and ensemble learners in terms of accuracy and computational complexity, while offering fully interpretable classification rules. Additionally, we show that ERulesD2S can scale up efficiently to high-dimensional data streams, while offering very fast update and classification times. Finally, we present the learning capabilities of ERulesD2S for sparsely labeled data streams.


Interpretable Multi-view Early Warning System adapted to Underrepresented Student Populations | IEEE Transactions on Learning Technologies

A. Cano and J.D. Leonard


Early warning systems have been progressively implemented in higher education institutions to predict student performance. However, they usually fail to effectively integrate the many information sources available at universities to make more accurate and timely predictions, they often lack decision-making reasoning to explain the predictions, and they are generally biased toward the general student body, ignoring the idiosyncrasies of underrepresented student populations (determined by socio-demographic factors such as race, gender, residency, or status as freshman, transfer, adult, or first-generation students) that traditionally have greater difficulties and performance gaps. This paper presents a multiview early warning system built with comprehensible Genetic Programming classification rules adapted to specifically target underrepresented and underperforming student populations. The system integrates many student information repositories using multiview learning to improve the accuracy and timing of the predictions. Three interfaces have been developed to provide personalized and aggregated comprehensible feedback to students, instructors, and staff to facilitate early intervention and student support. Experimental results, validated with statistical analysis, indicate that this multiview learning approach outperforms traditional classifiers. Learning outcomes will help instructors and policy-makers to deploy strategies to increase retention and improve academics.


A survey on graphic processing unit computing for large-scale data mining | Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

A. Cano


General purpose computation using Graphic Processing Units (GPUs) is a well‐established research area focusing on high‐performance computing solutions for massively parallelizable and time‐consuming problems. Classical methodologies in machine learning and data mining cannot handle processing of massive and high‐speed volumes of information in the context of the big data era. GPUs have successfully improved the scalability of data mining algorithms to address significantly larger dataset sizes in many application areas. The popularization of distributed computing frameworks for big data mining opens up new opportunities for transformative solutions combining GPUs and distributed frameworks. This survey analyzes current trends in the use of GPU computing for large‐scale data mining, discusses GPU architecture advantages for handling volume and velocity of data, identifies limitation factors hampering the scalability of the problems, and discusses open issues and future directions.


Distributed Nearest Neighbor Classification for Large-Scale Multi-label Data on Spark | Future Generation Computer Systems

J. Gonzalez-Lopez, S. Ventura, and A. Cano


Modern data is characterized by its ever-increasing volume and complexity, particularly when data instances belong to many categories simultaneously. This learning paradigm is known as multi-label classification and one of its most renowned methods is the multi-label k nearest neighbor (Ml-knn). The traditional implementations of this method are not feasible for large-scale multi-label data due to its complexity and memory restrictions. We propose a distributed Ml-knn implementation based on the MapReduce programming model, implemented on Apache Spark. We compare three strategies for distributed nearest neighbor search: 1) iteratively broadcasting instances, 2) using a distributed tree-based index structure, and 3) building hash tables to group instances. The experimental study evaluates the trade-off between the quality of the predictions and runtimes on 22 benchmark datasets, and compares the scalability using different sizes of data. The results indicate that the tree-based index strategy outperforms the other approaches, having a speedup of up to 266x for the largest dataset, while achieving an accuracy equivalent to the exact methods. This strategy enables Ml-knn to scale efficiently with respect to the size of the problem.


MIRSVM: Multi-Instance Support Vector Machine with Bag Representatives | Pattern Recognition

G. Melki, A. Cano, and S. Ventura


Multiple-instance learning (MIL) is a variation of supervised learning, where samples are represented by labeled bags, each containing sets of instances. The individual labels of the instances within a bag are unknown, and labels are assigned based on a multi-instance assumption. One of the major complexities associated with this type of learning is the ambiguous relationship between a bag’s label and the instances it contains. This paper proposes a novel support vector machine (SVM) multiple-instance formulation and presents an algorithm with a bag-representative selector that trains the SVM based on bag-level information, named MIRSVM. The contribution is able to identify instances that highly impact classification, i.e. bag-representatives, for both positive and negative bags, while finding the optimal class separation hyperplane. Unlike other multi-instance SVM methods, this approach eliminates possible class imbalance issues by allowing both positive and negative bags to have at most one representative, which constitute the most contributing instances to the model. The experimental study evaluates and compares the performance of this proposal against 11 state-of-the-art multi-instance methods over 15 datasets, and the results are validated through non-parametric statistical analysis. The results indicate that bag-based learners outperform the instance-based and wrapper methods, and they show MIRSVM’s overall superior performance against other multi-instance SVM models, with an average accuracy of 82.6%, which is 2.5% better than the best performing state-of-the-art MI classifier.
