Taghi Khoshgoftaar, Ph.D.

Motorola Professor, Florida Atlantic University

  • Boca Raton, FL

Taghi Khoshgoftaar researches computer security and intrusion detection systems.

Contact

Florida Atlantic University

Areas of Expertise

Computer Security and Intrusion Detection Systems
Software Engineering
Machine Learning
Data Mining
Big Data Analytics
Biomedical and Health Informatics

Education

Virginia Polytechnic Institute and State University

Ph.D.

Selected Media Appearances

Artificial intelligence could be 'game changer' in detecting, managing Alzheimer's disease

Science Daily  

2019-06-25

"Machine learning has an inherent capacity to reveal meaningful patterns and insights from a large, complex inter-dependent array of clinical determinants and the ability to continue to 'learn' from ongoing utility of practical predictive models," said Taghi Khoshgoftaar, Ph.D., co-author and Motorola Professor in FAU's Department of Computer and Electrical Engineering and Computer Science. "Seamless use and real-time interpretation will enhance case management and patient care through innovative technology and practical and readily usable integrated clinical applications that could be developed into a hand-held device and app."...

Scientists teach machines to predict recovery time from sports-related concussions

Science Daily  

2019-03-07

"We have introduced a cutting-edge approach and new clinical tool to manage sports-related concussions, which will measurably improve with more and more inclusive data," said Taghi Khoshgoftaar, Ph.D., co-author and Motorola professor in FAU's Department of Computer and Electrical Engineering and Computer Science, who collaborated with lead author Michael F. Bergeron, Ph.D., senior vice president of development and applications at SIVOTEC Analytics, and Sara Landset, co-author and a Ph.D. student at FAU. "Our supervised machine learning method has demonstrated efficacy and warrants further exploration."...

Artificial Intelligence Holds Promise in Detecting Home Health Medicare Fraud

Home Health Care News  

2018-11-21

The team applied algorithms to detect patterns of fraud in the Centers for Medicare & Medicaid Services (CMS) data because “patterns in the data are hidden from us” as humans, said Taghi Khoshgoftaar, Florida Atlantic University director of Data Mining and Machine Learning Lab in the Department of Computer and Electrical Engineering and Computer Science...

Selected Articles

Examining characteristics of predictive models with imbalanced big data

Journal of Big Data

T Hasanin, TM Khoshgoftaar, JL Leevy, N Seliya

2019

High class imbalance between majority and minority classes in datasets can skew the performance of Machine Learning algorithms and bias predictions in favor of the majority (negative) class. This bias, for cases where the minority (positive) class is of greater interest and the occurrence of false negatives is costlier than false positives, may result in adverse consequences. Our paper presents two case studies, each utilizing a unique, combined approach of Random Undersampling and Feature Selection to investigate the effect of class imbalance on big data analytics. Random Undersampling is used to generate six class distributions ranging from balanced to moderately imbalanced, and Feature Importance is used as our Feature Selection method. Classification performance was reported for the Random Forest, Gradient-Boosted Trees, and Logistic Regression learners, as implemented within the Apache Spark framework. The first case study utilized a training dataset and a test dataset from the ECBDL’14 bioinformatics competition. The training and test datasets contain about 32 million instances and 2.9 million instances, respectively. For the first case study, Gradient-Boosted Trees obtained the best results, with either a features-set of 60 or the full set, and a negative-to-positive ratio of either 45:55 or 40:60. The second case study, unlike the first, included training data from one source (POST dataset) and test data from a separate source (Slowloris dataset), where POST and Slowloris are two types of Denial of Service attacks. The POST dataset contains about 1.7 million instances, while the Slowloris dataset contains about 0.2 million instances. For the second case study, Logistic Regression obtained the best results, with a features-set of 5 and any of the following negative-to-positive ratios: 40:60, 45:55, 50:50, 65:35, and 75:25. We conclude that combining Feature Selection with Random Undersampling improves the classification performance of learners with imbalanced big data from different application domains.
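As a rough illustration of the two ideas named in this abstract, the sketch below undersamples the negative class to reach a target negative-to-positive ratio and then keeps the top-ranked features by importance score. It is plain Python, not the Apache Spark implementation the paper reports; the function names, the 0/1 label encoding, and the importance scores are assumptions made for the example.

```python
import random


def undersample_majority(labels, neg_part, pos_part, seed=0):
    """Return sorted row indices after random undersampling.

    Every positive (label 1) is kept; negatives (label 0) are sampled
    without replacement so that negatives:positives is approximately
    neg_part:pos_part (for example 65:35).
    """
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    # Number of negatives needed for the requested ratio, capped at what exists.
    target_neg = min(round(len(pos_idx) * neg_part / pos_part), len(neg_idx))
    rng = random.Random(seed)
    kept_neg = rng.sample(neg_idx, target_neg)
    return sorted(pos_idx + kept_neg)


def top_k_features(importances, k):
    """Return the names of the k features with the highest importance."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical data: 35 positives among 1,000 instances.
    labels = [1] * 35 + [0] * 965
    kept = undersample_majority(labels, 65, 35)
    print(len(kept), "instances kept")
    # Hypothetical importance scores, e.g. as produced by a tree ensemble.
    scores = {"f1": 0.10, "f2": 0.50, "f3": 0.30, "f4": 0.10}
    print(top_k_features(scores, 2))
```

Keeping all positives and discarding only negatives is what distinguishes undersampling from oversampling; the trade-off is that information in the discarded majority instances is lost.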

Evaluation of maxout activations in deep learning across several big data domains

Journal of Big Data

G Castaneda, P Morris, TM Khoshgoftaar

2019

This study investigates the effectiveness of multiple maxout activation function variants on 18 datasets using Convolutional Neural Networks. A network with maxout activation has a higher number of trainable parameters compared to networks with traditional activation functions. However, it is not clear if the activation function itself or the increase in the number of trainable parameters is responsible for yielding the best performance for different entity recognition tasks. This paper investigates if an increase in the number of convolutional filters on traditional activation functions performs equal to or better than maxout networks. Our experiments compare the Rectified Linear Unit, Leaky Rectified Linear Unit, Scaled Exponential Linear Unit, and Hyperbolic Tangent activations to four maxout function variants. We observe that maxout networks train more slowly than networks with traditional activation functions, e.g. Rectified Linear Unit. In addition, we found that on average, across all datasets, the Rectified Linear Unit activation function performs better than any maxout activation when the number of convolutional filters is increased. Furthermore, adding more filters enhances the classification accuracy of the Rectified Linear Unit networks, without adversely affecting their advantage over maxout activations with respect to network-training speed.
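For readers unfamiliar with maxout, the minimal sketch below shows why a maxout unit carries more trainable parameters than a unit with a fixed activation such as the Rectified Linear Unit: it takes the maximum over several learned affine pieces rather than applying one fixed function to a single affine output. This is a single dense unit in plain Python, not the convolutional networks or the specific maxout variants evaluated in the paper; all names and values are illustrative assumptions.

```python
def affine(x, w, b):
    """Weighted sum of the inputs plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def relu_unit(x, w, b):
    """Standard unit: one affine piece followed by max(0, z)."""
    return max(0.0, affine(x, w, b))


def maxout_unit(x, pieces):
    """Maxout unit: the maximum over several learned affine pieces.

    pieces is a list of (weights, bias) pairs, one per piece.
    """
    return max(affine(x, w, b) for w, b in pieces)


def param_count(input_dim, pieces=1):
    """Trainable parameters in one unit: (weights + bias) per piece."""
    return pieces * (input_dim + 1)


if __name__ == "__main__":
    x = [1.0, -2.0]
    pieces = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.5)]
    print(maxout_unit(x, pieces))
    print(param_count(2), "parameters for ReLU vs", param_count(2, pieces=2), "for maxout")
```

A maxout unit whose second piece has all-zero weights and bias reduces to a Rectified Linear Unit, which is one way to see maxout as a learned generalization of it.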

Impact of class distribution on the detection of slow HTTP DoS attacks using Big Data

Journal of Big Data

CL Calvert, TM Khoshgoftaar

2019

The integrity of modern network communications is constantly being challenged by more sophisticated intrusion techniques. Attackers are consistently shifting to stealthier and more complex forms of attacks in an attempt to bypass known mitigation strategies. In recent years, attackers have begun to focus their attack efforts on the application layer, allowing them to produce attacks that can exploit known issues within specific application protocols. Slow HTTP Denial of Service attacks are one such attack variant, which targets the HTTP protocol and can imitate legitimate user traffic in order to deny resources from a service. Successful mitigation of this attack type requires network analysts to evaluate large quantities of network traffic to identify and block intrusive traffic. The issue is that legitimate traffic instances can far outnumber attack instances, making detection problematic. Machine learning techniques can be used to aid in detection, but the large level of imbalance between normal (majority) and attack (minority) instances can lead to inaccurate detection results. In this work, we evaluate the use of data sampling to produce varying class distributions in order to counteract the effects of severely imbalanced Slow HTTP DoS big datasets. We also detail our process for collecting real-world representative Slow HTTP DoS attack traffic from a live network environment to create our datasets. Five class distributions are generated to evaluate the Slow HTTP DoS detection performance of eight machine learning techniques. Our results show that the optimal learner and class distribution combination is that of Random Forest with a 65:35 distribution ratio, obtaining an AUC value of 0.99904. Further, we determine, through significance testing, that the use of sampling techniques can significantly increase learner performance when detecting Slow HTTP DoS attack traffic.
