Ken Pu, PhD - University of Ontario Institute of Technology. Oshawa, ON, CA

Ken Pu, PhD

Associate Professor, Computer Science | University of Ontario Institute of Technology

Oshawa, ON, CANADA

Increasing accessibility of open data to improve Internet transparency and accountability



The Internet is inundated by a constant stream of information from government, business, industry and, above all, the media. Public perception is largely based on the knowledge and views released through these channels. Yet a vast quantity of data is readily available for public consumption, and datasets are easily accessible in their raw forms. In most cases, however, the schema of these datasets is missing, incomplete or inaccurate, leaving much of the world’s open data on the cutting room floor.

Intrigued by society’s trust in the online platform, Ken Pu, PhD, Associate Professor of Computer Science in the Faculty of Science, is investigating ways to sift through this vast body of information and develop technology that enables users to explore and better understand the open data world. Greater accessibility to open data would allow consumers to formulate their own expert opinions about the news and events shaping their nations, without relying on summaries from news media and other outlets.

Significant value, as well as latent concerns about how the world operates, is buried in anonymity. Dr. Pu’s latest research aims to map open data at the federal level in Canada, the U.S. and Great Britain to improve society's understanding of how all levels of government operate. The goal is to enable citizens to form their own opinions and to be more aware of, and more comfortable with, how the government is performing, thereby ensuring greater accountability.

In the Software Quality Research Lab, Dr. Pu is looking at ways to humanize open data and give users more control over it. He aims to build an online roadmap that lets mobile users source open data pervasively using touch screens and voice recognition. He is also focused on expanding data processing on mobile devices, with the goal of enabling open data exploration without relying on traditional computer hardware.

Before joining UOIT as an Assistant Professor in 2006, Dr. Pu worked in Silicon Valley for two years as a Software Engineer with IBM. In 2011, he was appointed Associate Professor, and in 2013 he took on a two-year term as Undergraduate Program Director in Computer Science. Dr. Pu completed his Bachelor of Applied Science in Engineering Science, his Master of Applied Science in Electrical and Computer Engineering, and his Doctorate in Computer Science, all at the University of Toronto.

Industry Expertise (8)

Computer Hardware

Computer Software

Computer Networking

Computer/Network Security


Program Development


Social Media

Areas of Expertise (6)

Open Data Over the Web

Pervasive and Mobile Devices

Code as Databases

Queries as Programs

Human Database Interaction

Data Science

Education (3)

University of Toronto: PhD, Computer Science 2006

University of Toronto: MASc, Electrical and Computer Engineering 2000

University of Toronto: BASc, Engineering Science 1998

Affiliations (2)

  • Institute of Electrical and Electronics Engineers (IEEE)
  • Association for Computing Machinery

Event Appearances (6)

Towards Efficient Feedback Control in Streaming Computer Vision Pipelines

12th Asian Conference on Computer Vision (ACCV'14)  Singapore


Using Document Space for Relational Search

IEEE Conference on Information Reuse and Integration  San Francisco, California


A Stream Algebra for Computer Vision Pipelines

2nd Workshop on Web-scale Vision and Social Media  Columbus, Ohio


Road Boundary Detection in Challenging Scenarios

2012 IEEE 9th International Conference on Advanced Video and Signal-Based Surveillance  Beijing, China


Authoring Relational Queries on the Mobile Devices

9th International Conference on Mobile Web Information Systems  Niagara Falls, Ontario


Selection of Features for Surname Classification

2011 IEEE International Conference on Information Reuse and Integration  Las Vegas, Nevada


Research Grants (1)

Humanized Databases

NSERC Discovery Grant $17,000


Led by Dr. Pu, this five-year research project investigates the impact of human-computer interaction hardware on database interaction. Specifically, he aims to address the issues of data visualization and interactive query answering.

Courses (11)

Principles of Computer Science

CSCI 2010U, 2nd Year Undergraduate Course


Software System Development and Integration

CSCI 2020U, 2nd Year Undergraduate Course


Computer Architecture I

CSCI 2050U, 2nd Year Undergraduate Course


Database Systems and Concepts

CSCI 3030U, 3rd Year Undergraduate Course


Analysis and Design of Algorithms

CSCI 3070U, 3rd Year Undergraduate Course


Programming Languages

CSCI 3055U, 3rd Year Undergraduate Course



CSCI 4020U, 4th Year Undergraduate Course


Survey of Computer Science

CSCI 5010G, Graduate Course


Topics in Information Systems

CSCI 5730G, Graduate Course


Intelligent Systems

CSCI 5740G, Graduate Course


Advanced Topics in Information Systems

CSCI 6720G, Graduate Course


Articles (5)

Automatic Parsing of Lane and Road Boundaries in Challenging Traffic Scenes

Journal of Electronic Imaging


Automatic detection of road boundaries in traffic surveillance imagery can greatly aid subsequent traffic analysis tasks, such as vehicle flow, erratic driving, and stranded vehicles. This paper develops an online technique for identifying the dominant road boundary in video sequences captured by traffic cameras under challenging environmental and lighting conditions, e.g., unlit highways captured at night.


Towards Efficient Feedback Control in Streaming Computer Vision Pipelines

Computer Vision - ACCV 2014 Workshops


Stream processing is currently an active research direction in computer vision. This is due to the existence of many computer vision algorithms that can be expressed as a pipeline of operations, and the increasing demand for online systems that process image and video streams. Recently, a formal stream algebra has been proposed as an abstract framework that mathematically describes computer vision pipelines. The algebra defines a set of concurrent operators that can describe a pipeline of vision tasks, with image and video streams as operands. In this paper, we extend this algebra framework by developing a formal and abstract description of feedback control in computer vision pipelines.


Scalable Distributed Processing of K Nearest Neighbor Queries Over Moving Objects

IEEE Transactions on Knowledge and Data Engineering


Central to many applications involving moving objects is the task of processing k-nearest neighbor (k-NN) queries. Most of the existing approaches to this problem are designed for the centralized setting where query processing takes place on a single server; it is difficult, if not impossible, for them to scale to a distributed setting to handle the vast volume of data and concurrent queries that are increasingly common in those applications. To address this problem, we propose a suite of solutions that can support scalable distributed processing of k-NN queries.


Using Document Space for Relational Search

IEEE 15th International Conference on Information Reuse and Integration


In this paper, we present a family of methods and algorithms to efficiently integrate text indexing and keyword search from information retrieval to support search in relational databases. We propose a bi-directional transformation that maps relational database instances to document collections. The transformation is shown to be a homomorphism of keyword search. Thus, any search of tuple networks by a keyword query can be efficiently executed as a search for documents, and vice versa. By this construction, we demonstrate that indexing and search technologies developed for documents can naturally be reduced and integrated into relational database systems.
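
The idea of answering keyword queries over tuples by searching documents can be illustrated with a toy sketch. This is not the paper's transformation: it covers only one direction (tuples to documents), ignores tuple networks formed by joins, and every name in it (`DATABASE`, `tuple_to_document`, `build_index`, `keyword_search`) is a hypothetical chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy database: table name -> list of rows (dicts with an "id").
DATABASE = {
    "author": [
        {"id": 1, "name": "Ada Lovelace"},
        {"id": 2, "name": "Alan Turing"},
    ],
    "paper": [
        {"id": 10, "author_id": 2,
         "title": "Computing Machinery and Intelligence"},
    ],
}


def tuple_to_document(table, row):
    """Flatten one tuple into a (doc_id, tokens) pair.

    Assumption: a document is simply the lowercased, whitespace-split
    attribute values of the tuple.
    """
    doc_id = (table, row["id"])
    tokens = []
    for value in row.values():
        tokens.extend(str(value).lower().split())
    return doc_id, tokens


def build_index(database):
    """Build an inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for table, rows in database.items():
        for row in rows:
            doc_id, tokens = tuple_to_document(table, row)
            for token in tokens:
                index[token].add(doc_id)
    return index


def keyword_search(index, query):
    """Return the ids of documents that contain every keyword in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = build_index(DATABASE)
matches = keyword_search(index, "computing intelligence")
```

Each document id is a `(table, primary key)` pair, so a match found in document space can be traced back to the tuple that produced it.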


Discovering Linkage Points Over Web Data

Proceedings of the VLDB Endowment


A basic step in integration is the identification of linkage points, i.e., finding attributes that are shared (or related) between data sources, and that can be used to match records or entities across sources. This is usually performed using a match operator, that associates attributes of one database to another. However, the massive growth in the amount and variety of unstructured and semi-structured data on the Web has created new challenges for this task.
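
As a rough illustration of what a linkage point is, the sketch below scores every attribute pair across two small sources by the Jaccard overlap of their normalized value sets. It is a naive baseline under stated assumptions, not the method proposed in the paper; the sample records, the `threshold` parameter, and all function names are hypothetical.

```python
from itertools import product


def value_set(records, attribute):
    """Collect the normalized (stripped, lowercased) values of one attribute."""
    return {
        str(r.get(attribute)).strip().lower()
        for r in records
        if r.get(attribute) is not None
    }


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_linkage_points(source_a, source_b, threshold=0.5):
    """Return (attr_a, attr_b, score) triples whose value overlap meets the
    threshold, highest score first."""
    attrs_a = sorted({key for r in source_a for key in r})
    attrs_b = sorted({key for r in source_b for key in r})
    scored = []
    for x, y in product(attrs_a, attrs_b):
        score = jaccard(value_set(source_a, x), value_set(source_b, y))
        if score >= threshold:
            scored.append((x, y, score))
    return sorted(scored, key=lambda t: t[2], reverse=True)


# Hypothetical sample sources with differently named attributes.
COMPANIES = [
    {"name": "Acme Corp", "ticker": "ACM"},
    {"name": "Globex", "ticker": "GBX"},
]
FILINGS = [
    {"symbol": "acm", "year": 2013},
    {"symbol": "GBX", "year": 2014},
    {"symbol": "INI", "year": 2014},
]

links = candidate_linkage_points(COMPANIES, FILINGS)
```

Here `ticker` and `symbol` surface as a candidate linkage point despite their different names and letter case. Comparing every attribute pair in this way is quadratic in the number of attributes, which hints at why the scale and variety of Web data make the task harder.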
