Hi, I am Ken ...

• Seeking a Machine Learning and Data Science summer internship opportunity for 2019.

• NSF-funded Ph.D. student in Computer Science at North Carolina State University.

• Data Science and Machine Learning Researcher at the RAISE Lab (NCSU).


During the day, I'm a Data Scientist and Machine Learning Researcher in the making. I am passionate about solving real-world problems empirically, especially exploring the synergy between SE development and Computational Science practice, and understanding the social networks of researchers and how those ties mediate a student’s performance and future scholarly attainment. My primary areas of interest are Machine Learning, Data Science, and Algorithms, specifically Text Analytics and Graph Mining.

Besides daily learning and doing rocket science, I enjoy running, cooking, playing badminton, poetry, and playing board games.

Education

North Carolina State University

PhD in Computer Science August 2016 - Present

Graduate Merit Fellowship ($10,000+) for GPA 3.5+

Appalachian State University

Bachelor of Science in Computational Mathematics 2011 - 2015

Minor in Computer Science

Graduated Magna Cum Laude, GPA 3.80/4.0. Top 5% of my graduating class.

Hyperparameter Optimization for Effort Estimation
Huy Tu, Vivek Nair
Software Analytics Workshop @ FSE (SWAN), 2018 (Accepted)

Hyperparameter tuning is the black art of automatically finding a good combination of control parameters for a data miner. This is an extensive empirical case study of hyperparameter tuning in defect prediction that questions the versatility of tuning’s usefulness while proposing future research and expanding the definition of tuning.

Can You Explain That Text, Better?
Huy Tu, Amritanshu Agrawal
Automated Software Engineering (ASE), 2018 [Revising]

A novel method combining LDA topic modeling and a Fast Frugal Tree (depth of 4) to predict the severity of software bug reports. It offers comparable performance but is simpler (25%-250% smaller in scale) and faster (50 times faster) than the state-of-the-art text mining models (TFIDF+SVM and LDADE+SVM).