Dr. M. Capretz is a Professor of Software Engineering in the Department of Electrical and Computer Engineering at Western University. She served as Associate Vice-Provost (Acting), Graduate and Postdoctoral Studies, from July 2015 to June 2016; as Associate Dean (Acting), Research and Graduate, in the Faculty of Engineering from July 2010 to June 2011; and as Associate Chair (Graduate) in the Department of Electrical and Computer Engineering from July 2008 to June 2013. Prior to joining Western University, Dr. M. Capretz was an Assistant Professor in the Software Engineering Laboratory at the University of Aizu (Japan).
Dr. M. Capretz has been involved in software development, research, and teaching in software engineering for more than 30 years. Her industry experience includes working as a software engineer from 1984 to 1988 at the Technological Center for Informatics (CTI) in Campinas, SP, Brazil, and as a systems analyst from 1981 to 1984 at a computer company in São Paulo, Brazil.
Dr. M. Capretz is a senior member of IEEE and a member of ACM (Association for Computing Machinery). She is also an Associate Scientist with the Lawson Health Research Institute.
Education (3)
University of Durham: Ph.D., Software Engineering 1992
UNICAMP - Universidade Estadual de Campinas: MESc., Electrical Engineering 1988
UNICAMP - Universidade Estadual de Campinas: B.Sc., Computer Science 1981
Affiliations (4)
- Professional Engineers Ontario: Licensed Member
- IEEE: Senior Member
- ACM (Association for Computing Machinery): Member
- Lawson Health Research Institute: Associate Scientist
Languages (1)
- Brazilian Portuguese
Research Grants (4)
DATA ANALYTICS FOR ONLINE CONTENT MANAGEMENT
NSERC CRD/Pelmorex Media
The objective of this research project is to devise a generic and extendable software framework for online content management. Two main components will be developed: (i) integrated data analytics services that combine click streams with other data such as weather and user location, and (ii) a supply-side advertising platform that includes an automated process for pricing online inventory.
BIG DATA ANALYTICS FOR ENERGY MANAGEMENT
NSERC CRD/London Hydro
This project explores and advances Big Data analytics in the context of energy management. Smart meters collect energy consumption data following the Green Button standard. The objective of this research project is to devise a comprehensive software framework for monitoring energy consumption, analyzing the data provided by smart meters, and developing energy-related software applications. The two main components of this project are Energy Analytics Services and Energy Benchmarking Services.
CLOUD COMPUTING PLATFORM FOR SUSTAINABILITY MANAGEMENT
NSERC CRD/Powersmiths International Corporation
The main objective of this project is to create a software platform to manage and improve buildings' resource consumption while advancing cloud computing technologies. The project will explore the synergy between the high-performance and availability requirements of building systems and cloud computing features through the creation of an extensible cloud-based platform for sustainability management.
BIG DATA VISUALIZATION FOR SMART BUILDINGS
OCE VIP II/Powersmiths International Corporation
The challenges to be addressed in this project include Big Data visualization specific to sensor data, simple interface design for accessing complex underlying Big Data analytics, integration of virtual reality and real-time metering, and assembly of the storage and processing capabilities needed for sensor Big Data.
Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments.
In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm, which allows for massively parallel and distributed execution over a large number of computing nodes...
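The MapReduce pattern described above can be sketched in a few lines. This is a single-process illustration with a made-up meter-reading example, not the API of Hadoop or any particular framework; a real system runs the map and reduce tasks in parallel across many nodes, which is the source of the scalability mentioned above.

```python
from collections import defaultdict
from itertools import chain

def map_reduce(records, mapper, reducer):
    """Minimal MapReduce: map emits key/value pairs, a shuffle groups
    them by key, and reduce aggregates each group."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapper(r) for r in records):
        groups[key].append(value)                 # shuffle/group phase
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical example: total consumption per meter from (meter_id, kWh) pairs.
readings = [("m1", 2.0), ("m2", 1.5), ("m1", 3.0)]
totals = map_reduce(readings,
                    mapper=lambda r: [(r[0], r[1])],
                    reducer=lambda k, vs: sum(vs))
print(totals)  # {'m1': 5.0, 'm2': 1.5}
```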
Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for the development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy prediction accuracy is explored. Two machine-learning approaches, neural networks (NN) and support vector regression (SVR), were considered together with three data granularities: daily, hourly, and 15-minute. The approach has been applied to a large entertainment venue located in Ontario, Canada. Daily data intervals resulted in higher consumption prediction accuracy than hourly or 15-min readings, which can be explained by the inability of the hourly and 15-min models to capture random variations. With daily data, the NN model achieved better accuracy than the SVR; however, with hourly and 15-min data, there was no definitive dominance of one approach over another. Accuracy of daily peak demand prediction was significantly higher than accuracy of consumption prediction.
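As an illustration of why temporal granularity matters, the following sketch aggregates simulated 15-minute readings into hourly and daily series and shows how relative variability shrinks as the interval coarsens. The data are synthetic, not the study's measurements, and the smoothing effect shown is only one of the factors behind the daily models' higher accuracy.

```python
import random
import statistics

random.seed(0)
# Hypothetical month of 15-minute smart-meter readings (kWh per interval):
# a flat base load plus random event-driven variation (96 intervals/day).
readings = [10 + 5 * random.random() for _ in range(96 * 30)]

def aggregate(series, group):
    """Sum consecutive readings into coarser intervals."""
    return [sum(series[i:i + group]) for i in range(0, len(series), group)]

hourly = aggregate(readings, 4)    # 4 x 15 min = 1 hour
daily = aggregate(readings, 96)    # 96 x 15 min = 1 day

def cv(series):
    """Coefficient of variation: relative spread of the series."""
    return statistics.pstdev(series) / statistics.mean(series)

# Random variation averages out as intervals coarsen, so the daily
# series is much smoother than the hourly or 15-minute ones.
print(cv(readings), cv(hourly), cv(daily))
```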
The emergence of Big Data has had profound impacts on how data are stored and processed. As technologies created to process continuous streams of data with low latency, Complex Event Processing (CEP) and Stream Processing (SP) have often been related to the Big Data velocity dimension and used in this context. Many modern CEP and SP systems leverage cloud environments to provide the low latency and scalability required by Big Data applications, yet validating these systems at the required scale is a research problem per se. Cloud computing simulators have been used as a tool to facilitate reproducible and repeatable experiments in clouds. Nevertheless, existing simulators are mostly based on simple application and simulation models that are not appropriate for CEP or for SP. This article presents CEPSim, a simulator for CEP and SP systems in cloud environments. CEPSim proposes a query model based on Directed Acyclic Graphs (DAGs) and introduces a simulation algorithm based on a novel abstraction called event sets. CEPSim is highly customizable and can be used to analyse the performance and scalability of user-defined queries and to evaluate the effects of various query processing strategies. Experimental results show that CEPSim can simulate existing systems in large Big Data scenarios with accuracy and precision.
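The DAG-based query model can be illustrated informally. The sketch below is a loose, hypothetical rendering of the idea only: the operator names, selectivities, and count-based propagation are invented for illustration and do not reproduce CEPSim's actual design or its event-set abstraction.

```python
# A query as a DAG of operators: each operator forwards a fraction of the
# events it receives (its selectivity), and event *counts* flow through
# the graph instead of individual events, which is what makes simulating
# large Big Data scenarios tractable.
graph = {                      # operator -> downstream operators
    "source": ["filter"],
    "filter": ["window"],
    "window": ["sink"],
    "sink": [],
}
selectivity = {"source": 1.0, "filter": 0.5, "window": 0.1, "sink": 1.0}

def propagate(events_in, graph, selectivity, start="source"):
    """Propagate an event count through the DAG in topological order."""
    counts = {op: 0.0 for op in graph}
    counts[start] = events_in
    for op in ["source", "filter", "window", "sink"]:  # topological order
        out = counts[op] * selectivity[op]
        for succ in graph[op]:
            counts[succ] += out
    return counts

result = propagate(1000.0, graph, selectivity)
print(result["sink"])  # 50.0 events survive the filter and window stages
```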
The use of Complex Event Processing (CEP) and Stream Processing (SP) systems to process high-volume, high-velocity Big Data has renewed interest in procedures for managing these systems. In particular, self-management and adaptation of runtime platforms have been common research themes, as most of these systems run under dynamic conditions. Nevertheless, the research landscape in this area is still young and fragmented. Most research is performed in the context of specific systems, and it is difficult to generalize the results obtained to other contexts. To enable generic and reusable CEP/SP system management procedures and self-management policies, this research introduces the Attributed Graph Rewriting for Complex Event Processing Management (AGeCEP) formalism. AGeCEP represents queries in a language- and technology-agnostic fashion using attributed graphs. Query reconfiguration capabilities are expressed through standardized attributes, which are defined based on a novel classification of CEP query operators. By leveraging this representation, AGeCEP also proposes graph rewriting rules to define consistent reconfigurations of queries. To demonstrate AGeCEP feasibility, this research has used it to design an autonomic manager and to define a selected set of self-management policies. Finally, experiments demonstrate that AGeCEP can indeed be used to develop algorithms that can be integrated into diverse CEP systems.
Buildings are responsible for a significant amount of total global energy consumption and as a result account for a substantial portion of overall carbon emissions. Moreover, buildings have great potential for helping to meet energy efficiency targets. Hence, energy saving goals that target buildings can make a significant contribution to reducing environmental impact. Today's smart buildings achieve energy efficiency by monitoring energy usage with the aim of detecting and diagnosing abnormal energy consumption behaviour. This research proposes a generic collective contextual anomaly detection (CCAD) framework that uses a sliding-window approach and integrates historic sensor data along with generated and contextual features to train an autoencoder to recognize normal consumption patterns. Subsequently, by determining a threshold that optimizes sensitivity and specificity, the framework identifies abnormal consumption behaviour. The research compares two models trained with different features using real-world data provided by Powersmiths, located in Brampton, Ontario, Canada.
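The detection scheme (reconstruct a window, score it by reconstruction error, flag scores above a threshold) can be sketched as follows. To keep the sketch dependency-free, a simple mean-profile reconstruction stands in for the trained autoencoder, and the numbers and threshold rule are made up; the paper instead optimizes the threshold for sensitivity and specificity on real data.

```python
import statistics

# Toy 24-hour consumption windows believed to be "normal" (all made up).
normal_windows = [[50 + h for h in range(24)] for _ in range(10)]
# One window with a large midday spike.
anomalous = [50 + h + (40 if h == 12 else 0) for h in range(24)]

# "Training": learn the average hourly profile over normal windows
# (a stand-in for the autoencoder's learned reconstruction).
profile = [statistics.mean(w[h] for w in normal_windows) for h in range(24)]

def reconstruction_error(window):
    """Mean squared error between a window and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(window, profile)) / len(window)

# Threshold: worst error on normal data plus a margin (illustrative rule).
threshold = max(reconstruction_error(w) for w in normal_windows) + 1.0

def is_anomalous(window):
    return reconstruction_error(window) > threshold

print(is_anomalous(anomalous))  # True: the midday spike exceeds the threshold
```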
In recent years, advances in sensor technologies and the expansion of smart meters have resulted in massive growth of energy data sets. These Big Data have created new opportunities for energy prediction, but at the same time, they impose new challenges for traditional technologies. On the other hand, new approaches for handling and processing these Big Data have emerged, such as MapReduce, Spark, Storm, and Oxdata H2O. This paper explores how findings from machine learning with Big Data can benefit energy consumption prediction. An approach based on local learning with support vector regression (SVR) is presented. Although local learning itself is not a novel concept, it has great potential in the Big Data domain because it reduces computational complexity. The local SVR approach presented here is compared to traditional SVR and to deep neural networks with an H2O machine learning platform for Big Data. Local SVR outperformed both SVR and H2O deep learning in terms of prediction accuracy and computation time. Especially significant was the reduction in training time: local SVR training was an order of magnitude faster than SVR or H2O deep learning.
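The local-learning idea is straightforward to sketch: instead of one global model over all n points, fit a small model on only the k nearest training points of each query, so training cost depends on k rather than n. Here a k-nearest-neighbour average stands in for the local SVR so the sketch needs no libraries, and the data are synthetic.

```python
def local_predict(train_x, train_y, query, k=5):
    """Local learning: predict from a model fit only on the k training
    points nearest to the query (here, their average target)."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k

# Synthetic load curve: consumption roughly proportional to temperature.
xs = [float(t) for t in range(100)]
ys = [2.0 * t + 5.0 for t in xs]

pred = local_predict(xs, ys, 50.5)
print(pred)  # 105.0: the average target of the 5 nearest points
```

Because each local fit touches only k points, swapping the average for an SVR trained on those k points keeps per-query training far cheaper than one SVR over the full data set, which is the computational advantage the abstract describes.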
The demand for knowledge extraction has been increasing. With the growing amount of data being generated by global data sources (e.g., social media and mobile apps) and the popularization of context-specific data (e.g., the Internet of Things), companies and researchers need to connect all these data and extract valuable information. Machine learning has been gaining much attention in data mining, spurring the development of new solutions. This paper proposes an architecture for a flexible and scalable machine learning as a service. An open source solution was implemented and presented. As a case study, a forecast of electricity demand was generated using real-world sensor and weather data by running different algorithms at the same time.
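The "different algorithms at the same time" aspect can be sketched with a thread pool that trains competing models concurrently and keeps the one with the lowest error. The two models below are trivial stand-ins on synthetic data; this is only an illustration of the idea, not the architecture from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

# Synthetic demand data: (feature, demand) pairs following y = 3x + 1.
data = [(x, 3 * x + 1) for x in range(10)]

def mean_model(data):
    """Baseline: predict the mean demand; returns (name, squared error)."""
    ys = [y for _, y in data]
    mean = sum(ys) / len(ys)
    return ("mean", sum((y - mean) ** 2 for _, y in data))

def linear_model(data):
    """Stand-in 'trained' model that matches the generating rule exactly."""
    return ("linear", sum((y - (3 * x + 1)) ** 2 for x, y in data))

# Run both candidate algorithms concurrently and score them.
with ThreadPoolExecutor() as pool:
    results = dict(pool.map(lambda f: f(data), [mean_model, linear_model]))

best = min(results, key=results.get)   # model with the lowest error
print(best)  # linear
```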