The Internet is inundated with a constant stream of information from government, business, industry and, largely, the media. Public perception is essentially based on the knowledge and views released via these channels. However, a vast quantity of data is readily available for public consumption, and datasets are easily accessible in their raw forms. Yet in most cases, the schema of these datasets is missing, incomplete or inaccurate, leaving much of the world’s open data on the cutting room floor.
Intrigued by society’s trust in the online platform, Ken Pu, PhD, Associate Professor of Computer Science in the Faculty of Science, is investigating ways to sift through this vast information and develop technology that enables users to explore and better understand the open data world. Greater accessibility to open data would allow consumers to formulate their own informed opinions about the news and events shaping their nations, without relying on summaries from news media and other outlets.
Significant value, as well as latent concerns about how the world operates, is buried in anonymity. Dr. Pu’s latest research aims to map open data at the federal level in Canada, the U.S. and Great Britain to improve society's understanding of how all levels of government operate. The goal is to enable citizens to form their own opinions and to be more aware of, and more comfortable with, how the government is performing, ensuring greater accountability.
In the Software Quality Research Lab, Dr. Pu is looking at ways to humanize open data and give users more control over it. He aims to build an online roadmap that lets mobile users source open data pervasively using touch screens and voice recognition. He is also focused on expanding data processing on mobile devices, with the goal of enabling open data exploration without relying on traditional computer hardware.
Before joining UOIT as an Assistant Professor in 2006, Dr. Pu worked in Silicon Valley for two years as a Software Engineer with IBM. In 2011, he was appointed Associate Professor, and in 2013 he took on a two-year term as Undergraduate Program Director in Computer Science. Dr. Pu completed his Bachelor of Applied Science in Engineering Science, his Master of Applied Science in Electrical and Computer Engineering, and his Doctorate in Computer Science, all at the University of Toronto.
Industry Expertise (8)
Areas of Expertise (6)
Open Data Over the Web
Pervasive and Mobile Devices
Code as Databases
Queries as Programs
Human Database Interaction
University of Toronto: PhD, Computer Science 2006
University of Toronto: MASc, Electrical and Computer Engineering 2000
University of Toronto: BASc, Engineering Science 1998
- Institute of Electrical and Electronics Engineers (IEEE)
- Association for Computing Machinery
Event Appearances (6)
Towards Efficient Feedback Control in Streaming Computer Vision Pipelines
12th Asian Conference on Computer Vision (ACCV'14) Singapore
Using Document Space for Relational Search
IEEE Conference on Information Reuse and Integration San Francisco, California
A Stream Algebra for Computer Vision Pipelines
2nd Workshop on Web-scale Vision and Social Media Columbus, Ohio
Road Boundary Detection in Challenging Scenarios
2012 9th IEEE International Conference on Advanced Video and Signal-Based Surveillance Beijing, China
Authoring Relational Queries on the Mobile Devices
9th International Conference on Mobile Web Information Systems Niagara Falls, Ontario
Selection of Features for Surname Classification
2011 IEEE International Conference on Information Reuse and Integration Las Vegas, Nevada
Research Grants (1)
NSERC Discovery Grant $17,000
Led by Dr. Pu, this five-year research project investigates the impact of human-computer interaction hardware on database interaction. Specifically, he aims to address the issues of data visualization and interactive query answering.
Principles of Computer Science
CSCI 2010U, 2nd Year Undergraduate Course
Software System Development and Integration
CSCI 2020U, 2nd Year Undergraduate Course
Computer Architecture I
CSCI 2050U, 2nd Year Undergraduate Course
Database Systems and Concepts
CSCI 3030U, 3rd Year Undergraduate Course
Analysis and Design of Algorithms
CSCI 3070U, 3rd Year Undergraduate Course
CSCI 3055U, 3rd Year Undergraduate Course
CSCI 4020U, 4th Year Undergraduate Course
Survey of Computer Science
CSCI 5010G, Graduate Course
Topics in Information Systems
CSCI 5730G, Graduate Course
CSCI 5740G, Graduate Course
Advanced Topics in Information Systems
CSCI 6720G, Graduate Course
Automatic detection of road boundaries in traffic surveillance imagery can greatly aid subsequent traffic analysis tasks, such as vehicle flow, erratic driving, and stranded vehicles. This paper develops an online technique for identifying the dominant road boundary in video sequences captured by traffic cameras under challenging environmental and lighting conditions, e.g., unlit highways captured at night.
Stream processing is currently an active research direction in computer vision. This is due to the existence of many computer vision algorithms that can be expressed as a pipeline of operations, and the increasing demand for online systems that process image and video streams. Recently, a formal stream algebra has been proposed as an abstract framework that mathematically describes computer vision pipelines. The algebra defines a set of concurrent operators that can describe a pipeline of vision tasks, with image and video streams as operands. In this paper, we extend this algebra framework by developing a formal and abstract description of feedback control in computer vision pipelines.
Central to many applications involving moving objects is the task of processing k-nearest neighbor (k-NN) queries. Most of the existing approaches to this problem are designed for the centralized setting where query processing takes place on a single server; it is difficult, if not impossible, for them to scale to a distributed setting to handle the vast volume of data and concurrent queries that are increasingly common in those applications. To address this problem, we propose a suite of solutions that can support scalable distributed processing of k-NN queries.
In this paper, we present a family of methods and algorithms to efficiently integrate text indexing and keyword search from information retrieval to support search in relational databases. We propose a bi-directional transformation that maps relational database instances to document collections. The transformation is shown to be a homomorphism of keyword search. Thus, any search of tuple networks by a keyword query can be efficiently executed as a search for documents, and vice versa. By this construction, we demonstrate that indexing and search technologies developed for documents can naturally be reduced and integrated into relational database systems.
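As a rough illustration of the idea in the abstract above, the sketch below turns each tuple of a toy relation into a small "document" and answers keyword queries through an inverted index. It is a minimal sketch only: the data, the function names (`tuple_to_document`, `build_index`, `search`) and the one-tuple-per-document mapping are assumptions made for illustration, and it does not handle the tuple networks (joined tuples) that the paper addresses.

```python
from collections import defaultdict

# Hypothetical toy relation: (table name, primary key) -> attribute values.
ROWS = {
    ("city", 1): {"name": "Toronto", "province": "Ontario"},
    ("city", 2): {"name": "Oshawa", "province": "Ontario"},
    ("university", 1): {"name": "University of Toronto", "city": "Toronto"},
}


def tuple_to_document(attrs):
    """Flatten one tuple's attribute values into a list of lowercase tokens."""
    return " ".join(str(value) for value in attrs.values()).lower().split()


def build_index(rows):
    """Build an inverted index: token -> set of tuple identifiers."""
    index = defaultdict(set)
    for key, attrs in rows.items():
        for token in tuple_to_document(attrs):
            index[token].add(key)
    return index


def search(index, query):
    """Return identifiers of tuples whose document contains every query keyword."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index = build_index(ROWS)
print(search(index, "toronto"))
print(search(index, "ontario oshawa"))
```

Because each document keeps the identifier of the tuple it came from, a document hit can be mapped straight back to a row, which is the direction of the correspondence this simplified version relies on.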
A basic step in integration is the identification of linkage points, i.e., finding attributes that are shared (or related) between data sources and that can be used to match records or entities across sources. This is usually performed using a match operator that associates attributes of one database with those of another. However, the massive growth in the amount and variety of unstructured and semi-structured data on the Web has created new challenges for this task.
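One simple way to picture a linkage point is as a pair of attributes whose value sets overlap heavily. The sketch below scores every attribute pair across two toy sources with Jaccard similarity and keeps the pairs above a threshold. The sources, the attribute names, the threshold and the choice of Jaccard similarity are all assumptions for illustration; the abstract does not say which similarity measure or match operator the authors use.

```python
def normalize(values):
    """Lowercase and strip values so trivially different spellings compare equal."""
    return {str(value).strip().lower() for value in values}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_linkage_points(source_a, source_b, threshold=0.5):
    """Return (attr_a, attr_b, score) for attribute pairs with enough value overlap."""
    pairs = []
    for attr_a, values_a in source_a.items():
        for attr_b, values_b in source_b.items():
            score = jaccard(normalize(values_a), normalize(values_b))
            if score >= threshold:
                pairs.append((attr_a, attr_b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical sources: attribute name -> list of observed values.
SOURCE_A = {"city": ["Toronto", "Oshawa", "Ottawa"], "rank": ["1", "2", "3"]}
SOURCE_B = {"town": ["toronto", "ottawa", "Hamilton"], "code": ["x", "y"]}

print(candidate_linkage_points(SOURCE_A, SOURCE_B))
```

Here `city` and `town` share two of four distinct values, so they surface as a candidate linkage point, while the unrelated attribute pairs score zero and are dropped.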