Robbie T. Nakatsu is a Professor of Information Systems and Business Analytics with expertise in the management of IT, AI / Machine Learning, data management, IT outsourcing, and crowdsourcing. He has taught both undergraduate and graduate courses in management information systems, database management systems, and other topics in IT including programming languages and applications in AI. Prior to joining the faculty, Nakatsu served as a consultant in the MIS Department for Pepsi Co., a senior research analyst for CBS Records (Columbia House Division), and as an analyst in the Information Systems Department at Morgan Stanley. He has published in MIS and computer science journals such as Communications of the ACM, Information & Management, IEEE Transactions on Systems, Man and Cybernetics, Journal of Information Science, and International Journal of Intelligent Systems. In 2010, Nakatsu published a book titled "Diagrammatic Reasoning in AI." Nakatsu is a member of the Association for Computing Machinery and the Association for Information Systems.
University of British Columbia: Ph.D., Management Information Systems 2001
Yale University: B.A., Applied Mathematics 1986
Areas of Expertise
Information Technology Management
Industry Expertise
Training and Development
This article investigates resampling methods used to evaluate the performance of machine learning classification algorithms. It compares four key resampling methods: (1) Monte Carlo resampling, (2) the Bootstrap Method, (3) k-Fold Cross Validation, and (4) Repeated k-Fold Cross Validation. Two classification algorithms, Support Vector Machines (SVM) and Random Forests, applied to three datasets, are used in this study. Nine variations of the four resampling methods are used to tune parameters on the two classification algorithms on each of the three datasets. Performance is defined by how well the resampling method chooses a parameter value that fits the data well. A main finding is that Repeated k-Fold Cross Validation, overall, outperforms the other resampling methods in selecting the best-fit parameter value across the three different datasets.
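The four resampling schemes named in this abstract can be sketched as index generators. This is a minimal illustration using only the Python standard library, not the article's experimental code: the function names, the default split counts, the 25% test fraction, and the `mean_score` helper are all assumptions made for the example.

```python
import random


def monte_carlo_splits(n, n_splits=10, test_frac=0.25, seed=0):
    """Monte Carlo resampling: repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n * test_frac)))
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def bootstrap_splits(n, n_splits=10, seed=0):
    """Bootstrap: train on n draws with replacement, test on out-of-bag rows."""
    rng = random.Random(seed)
    for _ in range(n_splits):
        train = [rng.randrange(n) for _ in range(n)]
        in_bag = set(train)
        test = [i for i in range(n) if i not in in_bag]
        yield train, test


def k_fold_splits(n, k=5, seed=0):
    """k-fold cross validation: each row is held out in exactly one fold."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def repeated_k_fold_splits(n, k=5, repeats=3, seed=0):
    """Repeated k-fold: rerun k-fold with a different shuffle each time."""
    for r in range(repeats):
        yield from k_fold_splits(n, k=k, seed=seed + r)


def mean_score(splits, score_fn):
    """Average a user-supplied score_fn(train_idx, test_idx) over all splits."""
    scores = [score_fn(train, test) for train, test in splits]
    return sum(scores) / len(scores)
```

To tune a parameter, one would call `mean_score` once per candidate value, with a `score_fn` that fits the classifier on the training rows and scores it on the test rows, and then keep the value with the best average.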
Although a great many different crowdsourcing approaches are available to those seeking to accomplish individual or organizational tasks, little research attention has yet been given to characterizing how those approaches might be based on task characteristics. To that end, we conducted an extensive review of the crowdsourcing landscape, including a look at what types of taxonomies are currently available. Our review found that no taxonomy explored the multidimensional nature of task complexity. This paper develops a taxonomy whose specific intent is the classification of approaches in terms of the types of tasks for which they are best suited. To develop this task-based taxonomy, we followed an iterative approach that considered over 100 well-known examples of crowdsourcing. The taxonomy considers three dimensions of task complexity: (1) task structure – is the task well-defined, or does it require a more open-ended solution; (2) task interdependence – can the task be solved by an individual, or does it require a community of problem solvers; and (3) task commitment – what level of commitment is expected from crowd members? Based on this taxonomy, we identify seven categories of crowdsourcing and discuss prototypical examples of each approach. Furnished with such an understanding, one should be able to determine which crowdsourcing approach is most suitable for a particular task situation.
I describe a Venn diagramming technique used to perform syllogistic reasoning on categorical statements. The notation uses overlapping circles to represent relationships among two or three sets, shadings to represent emptiness, and x sequences to represent nonemptiness. These notations allow one to easily visualize logic problems. I then discuss rules of manipulation that can be used to transform one Venn diagram into another valid Venn diagram. These rules provide us with a formal procedure for performing syllogistic reasoning; that is to say, they provide us with an algorithm for proving or disproving the validity of a syllogism. I extend the Venn diagramming algorithm for syllogistic reasoning to allow for more than three sets of information at a time. This extended technique makes use of tables, which are also very intuitive and highly visual. The tabular technique described is capable of processing a much larger variety of logic statements.
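The idea of testing a syllogism against the regions of a three-circle Venn diagram can be illustrated with a brute-force check. This sketch is not the manipulation-rule algorithm or the tabular technique described in the abstract. It simply enumerates every way of marking the eight regions as empty or nonempty and looks for a counterexample. The term names `S`, `M`, `P`, the statement encoding, and the function names are assumptions made for the example, and statements are read without existential import (an "all" statement does not assert that its subject set is nonempty).

```python
from itertools import product

TERMS = ("S", "M", "P")
# A region is a triple of booleans: inside S?, inside M?, inside P?
REGIONS = list(product([False, True], repeat=3))


def holds(statement, nonempty):
    """Evaluate a categorical statement against one marking of the regions."""
    form, a, b = statement
    ia, ib = TERMS.index(a), TERMS.index(b)
    a_and_b = any(nonempty[r] for r in REGIONS if r[ia] and r[ib])
    a_not_b = any(nonempty[r] for r in REGIONS if r[ia] and not r[ib])
    if form == "all":       # All A are B: the part of A outside B is shaded
        return not a_not_b
    if form == "no":        # No A are B: the overlap of A and B is shaded
        return not a_and_b
    if form == "some":      # Some A are B: an x lies in the overlap
        return a_and_b
    if form == "some_not":  # Some A are not B: an x lies in A outside B
        return a_not_b
    raise ValueError(f"unknown form: {form}")


def is_valid(premises, conclusion):
    """Valid if no marking satisfies every premise yet falsifies the conclusion."""
    for bits in product([False, True], repeat=len(REGIONS)):
        nonempty = dict(zip(REGIONS, bits))
        if all(holds(p, nonempty) for p in premises) and not holds(conclusion, nonempty):
            return False
    return True
```

For example, `is_valid([("all", "M", "P"), ("all", "S", "M")], ("all", "S", "P"))` checks the classic form "All M are P; all S are M; therefore all S are P".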