Publications

In Situ Identification of Student Struggles

This set of studies explored automated feedback on student programming assignments. This project is ongoing.

Early Identification of Student Struggles at the Topic Level Using Context-Agnostic Features
Kai Arakawa, Qiang Hao, Wesley Deneke, Indie Cowan, Steven Wolfman, and Abigayle Peterson
Proceedings of the 53rd ACM Technical Symposium on Computer Science Education (SIGCSE ’22). ACM, New York, NY, USA.

Abstract:
The identification of student struggles has drawn increasing interests from computing education and learning analytics communities in recent years, considering the high failure rate and fast enrollment growth of computer science courses. Prior studies on this topic employed a multitude of data sources and methodologies with varying degrees of success. Nearly all studies attempted to predict low overall course performance to identify struggling students, risking oversimplifying student learning and struggles. Additionally, many studies utilize data sources that are limited to their original contexts or local student demographics, making it difficult to replicate or put the findings into practice. To address these gaps, we studied the feasibility of identifying student struggles at the topic level using features that are agnostic to courses and contexts. Our results show that it is possible to identify student struggles at a more fine-grained level within days. Our findings contribute new insights into automatic identification of student struggles at the topic level on a large scale, which can be used to guide meaningful interventions on student learning.

In Situ Identification of Student Self-Regulated Learning Struggles in Programming Assignments
Kai Arakawa, Qiang Hao, Tyler Greer, Lu Ding, Christopher D. Hundhausen, and Abigayle Peterson
Proceedings of the 52nd ACM Technical Symposium on Computer Science Education (SIGCSE ’21). ACM, New York, NY, USA.

Abstract:
Effective self-regulated learning (SRL) is important to student academic success. Understanding what SRL struggles students face in programming assignments is critical to guide many efforts in computing education, such as designing scalable interventions and developing effective learning technologies. Prior studies on this topic contributed to understanding what SRL strategies CS students typically use in programming assignments, and the interventions for some SRL struggles such as procrastination. However, few studies have investigated student SRL struggles in programming systematically. To fill this gap, we investigate student SRL struggles in the context of CS2 through a case study. We used multiple approaches to collect real-time data and validate our findings, such as tracking student progress, identifying potential SRL struggles, and interviewing identified struggling students to confirm our identifications. This study contributes to a deeper understanding of what SRL struggles students face in programming at a fine-grained level, and provides guidance on interventions for SRL struggles.

Towards Modeling Student Engagement with Interactive Computing Textbooks: An Empirical Study
David H. Smith IV, Qiang Hao, Christopher D. Hundhausen, Filip Jagodzinski, Josh Myers-Dean, and Kira Jaeger
Proceedings of the 52nd ACM Technical Symposium on Computer Science Education (SIGCSE ’21). ACM, New York, NY, USA.

Abstract:
Interactive textbooks have great potential to increase student engagement with the course content which is critical to effective learning in computing education. Prior research on digital textbooks and interactive visualizations contributes to our understanding of student interactions with visualizations and modeling textbook knowledge concepts. However, research investigating student usage of interactive computing textbooks is still lacking. This study seeks to fill this gap by modeling student engagement with a Jupyter-notebook-based interactive textbook. Our findings suggest that students’ active interactions with the presented interactive textbook, including changing, adding, and executing code in addition to manipulating visualizations, are significantly stronger in predicting student performance than conventional reading metrics. Our findings contribute to a deeper understanding of student interactions with interactive textbooks and provide guidance on the effective usage of said textbooks in computing education.

Towards understanding the effective design of automated formative feedback for programming assignments
Qiang Hao, David H. Smith IV, Lu Ding, Amy Ko, Camille Ottaway, Jack Wilson, Kai H. Arakawa, Alistair Turcan, Timothy Poehlman, and Tyler Greer
Computer Science Education, 1–23.

Abstract:
Background and Context: automated feedback for programming assignments has great potential in promoting just-in-time learning, but there has been little work investigating the design of feedback in this context.
Objective: to investigate the impacts of different designs of automated feedback on student learning at a fine-grained level, and how students interacted with and perceived the feedback.
Method: a controlled quasi-experiment of 76 CS students, where students of each group received a different combination of three types of automated feedback for their programming assignments.
Findings: feedback addressing the gap between expected and actual outputs is critical to effective learning; feedback lacking enough details may lead to system gaming behaviors.
Implications: the design of feedback has substantial impacts on the efficacy of automated feedback for programming assignments; more research is needed to extend what is known about effective feedback design in this context.

Investigating the Essential of Meaningful Automated Formative Feedback for Programming Assignments.
Qiang Hao, Jack Wilson, Camille Ottaway, Naitra Iriumi, Kai Hicks and David H. Smith, IV
2019 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). Memphis, TN.

Abstract:
This study investigated the essential of meaningful automated feedback for programming assignments. Three different types of feedback were tested, including (a) What’s wrong - what test cases were testing and which failed, (b) Gap - comparisons between expected and actual outputs, and (c) Hint - hints on how to fix problems if test cases failed. 46 students taking a CS2 participated in this study. They were divided into three groups, and the feedback configurations for each group were different: (1) Group One - What’s wrong, (2) Group Two - What’s wrong + Gap, (3) Group Three - What’s wrong + Gap + Hint. This study found that simply knowing what failed did not help students sufficiently, and might stimulate system gaming behavior. Hints were not found to be impactful on student performance or their usage of automated feedback. Based on the findings, this study provides practical guidance on the design of automated feedback.

How Automated Feedback is Delivered Matters: Formative Feedback and Knowledge Transfer.
Qiang Hao, Michail Tsikerdekis
2019 IEEE Frontiers in Education (FIE ’19). Cincinnati, Ohio.

Abstract:
This Full Research Paper investigated the correlations of automated formative feedback and student knowledge transfer in an advanced-level network course. A microservice of automated formative feedback was designed and implemented. The automated feedback was set up to address correctness and coding style along a series of milestones in finishing each of the assigned programming projects. 36 students participated in this study. Different from the concerns of prior studies that automated formative feedback may encourage trial & error and harm learning, this study identified a link between the automated formative feedback and student knowledge transfer. The results of this study confirmed positive findings from other fields on formative feedback, and call for more research on formative feedback in computing education.

Understanding Computing Education

This set of studies explored computing education as a field from different perspectives. This project is ongoing. The angle we are currently pursuing is “replication”.

A Systematic Investigation of Replications in Computing Education Research.
Qiang Hao, David H. Smith, IV, Naitra Iriumi, Michail Tsikerdekis, and Andrew J. Ko
ACM Transactions on Computing Education, 19(4), 1-18.

Abstract:
As the societal demands for application and knowledge in computer science (CS) increase, CS student enrollment keeps growing rapidly around the world. By continuously improving the efficacy of computing education and providing guidelines for learning and teaching practice, computing education research plays a vital role in addressing both educational and societal challenges that emerge from the growth of CS students. Given the significant role of computing education research, it is important to ensure the reliability of studies in this field. The extent to which studies can be replicated in a field is one of the most important standards for reliability. Different fields have paid increasing attention to the replication rates of their studies, but the replication rate of computing education was never systematically studied. To fill this gap, this study investigated the replication rate of computing education between 2009 and 2018. We examined 2,269 published studies from three major conferences and two major journals in computing education, and found that the overall replication rate of computing education was 2.38%. This study demonstrated the need for more replication studies in computing education and discussed how to encourage replication studies through research initiatives and policy making.

Quantifying Student Prior Computer Science Knowledge

This set of studies explored the measurements of student prior CS knowledge and their effects on student performance. This project is ongoing.

Quantifying the Effects of Prior Knowledge in Entry-Level Programming Courses
David H. Smith, IV, Qiang Hao, Filip Jagodzinski, Yan Liu, and Vishal Gupta
ACM 2019 Conference on Global Computing Education. Chengdu, China.

Abstract:
Computer literacy and programming are being taught increasingly at the K-12 level with more students than ever matriculating in college with prior programming experience. Accurately assessing student programming skills acquired in high school can inform college faculty about the range of competencies in introductory programming courses. The tool predominantly-used for assessing past CS knowledge and skills is a survey, which lacks quantitative rigor. This study aims to (1) quantify the effects of prior knowledge in entry-level programming courses and (2) compare the different measurement approaches of student prior knowledge in programming, including surveys and aptitude tests. The results of this study reveal that a discrepancy exists between the results of surveys and aptitude tests. Consistent with prior survey studies, our survey results showed that the effects of student prior programming knowledge faded gradually during the course period. In contrast, the aptitude test results indicated that the effects of student prior knowledge did not weaken over time. The accuracy of both measurements and implications for instructors were further discussed.

Online Help Seeking

This set of studies explored online help seeking from different angles in the context of computing education. This project is inactive now, but I am happy to answer any questions about our findings.

Towards understanding online question & answer interactions and their effects on student performance in large-scale STEM classes
David H. Smith IV, Qiang Hao, Vanessa Dennen, Michail Tsikerdekis, Bradley Barnes, Lilu Martin, Nathan Tresham
International Journal of Educational Technology in Higher Education, 17:20, 1-15.

Abstract:
Online question & answer (Q & A) is a distinctive type of online interaction that is impactful on student learning. Prior studies on online interaction in large-scale classes mainly focused on online discussion and were conducted mainly in non-STEM fields. This research aims to quantify the effects of online Q & A interactions on student performance in the context of STEM education. 218 computer science students from a large university in the southeastern United States participated in this research. Data of four online Q & A activities was mined from the online Q & A forum for the course, including three student activities (asking questions, answering questions and viewing questions/answers) and one instructor activity (answering questions/providing clarifications). These activities were found to have different effects on student performance. Viewing questions/answers was found to have the greatest effect, while interaction with instructors showed minimum effects. This research fills the gap of lacking research in online Q & A, and the results of this research can inform the effective usage of online Q & A in large-scale STEM courses.

Automatic Identification of Ineffective Online Student Questions in Computing Education
Qiang Hao, April Galyardt, Bradley Barnes, Ewan Wright, Robert Maribe Branch
2018 IEEE Frontiers in Education (FIE ’18). San Jose, CA.

Abstract:
This Research Full Paper explores automatic identification of ineffective learning questions in the context of large-scale computer science classes. The immediate and accurate identification of ineffective learning questions opens the door to possible automated facilitation on a large scale, such as alerting learners to revise questions and providing adaptive question revision suggestions. To achieve this, 983 questions were collected from a question & answer platform implemented by an introductory programming course over three semesters in a large research university in the Southeastern United States. Questions were firstly manually classified into three hierarchical categories: 1) learning-irrelevant questions, 2) effective learning-relevant questions, 3) ineffective learning-relevant questions. The inter-rater reliability of the manual classification (Cohen’s Kappa) was .88. Four different machine learning algorithms were then used to automatically classify the questions, including Naive Bayes Multinomial, Logistic Regression, Support Vector Machines, and Boosted Decision Tree. Both flat and single path strategies were explored, and the most effective algorithms under both strategies were identified and discussed. This study contributes to the automatic determination of learning question quality in computer science, and provides evidence for the feasibility of automated facilitation of online question & answer in large scale computer science classes.

Online help seeking in computer science education (Doctoral dissertation)
Qiang Hao
University of Georgia, 2017.

Predicting Computer Science Students’ Online Help-Seeking Tendencies
Qiang Hao, Bradley Barnes, Robert Maribe Branch, Ewan Wright
Knowledge Management & E-Learning, 9(1), 19-32, 2017.

Abstract:
This study investigated how computer science students seek help online in their learning and what factors predict their online help-seeking behaviors. Online help-seeking behaviors include online searching, asking teachers online for help, and asking peers online for help. 207 students from a large university in the southeastern United States participated in the study. It was revealed that computer science students tended to search online more frequently than ask people online for help. Five factors, including epistemological belief, interest, learning proficiency level, prior knowledge of the learning subject, and problem difficulty, were explored as potential predictors in this study. It was found that learning proficiency level and problem difficulty were significant predictors of three types of online help-seeking behaviors, and other factors influenced online help seeking to different extents. The study provides evidence to support that online searching should be considered as an integrated part of online help seeking, and gives guidelines for practice of facilitating online help seeking and future studies.

The Influence of Achievement Goals on Online Help Seeking of Computer Science Students
Qiang Hao, Bradley Barnes, Ewan Wright, Robert Maribe Branch
British Journal of Educational Technology, 48(6), 1273-1283, 2017.

Abstract:
This study investigated the online help‐seeking behaviors of computer science students with a focus on the effect of achievement goals. The online help‐seeking behaviors investigated were online searching, asking teachers online for help, and asking peers or unknown people online for help. One hundred and sixty‐five students studying computer science from a large research university in the south‐eastern United States participated in the study. It was found that students searched online significantly more frequently than they asked people online for help. Contrary to prior findings on face‐to‐face help seeking, no achievement goals were found to be significant in predicting the tendencies of students to seek help online. These findings provide evidence to support the role of online searching as an integral part of online help seeking and demonstrate that research findings on face‐to‐face help seeking should not be assumed to be naturally extendable to online help seeking.

What Are the Most Important Predictors of Computer Science Students’ Online Help-Seeking Behaviors?
Qiang Hao, Ewan Wright, Bradley Barnes, Robert Maribe Branch
Computers in Human Behavior, 62, 467-474, 2016.

Active Learning Environments in Computing Education

This set of studies explored the effects of active learning environments in computing education. This project is inactive now, but I am happy to answer any questions about our findings.

On the Effects of Active Learning Environments in Computing Education
Tyler Greer, Qiang Hao, Mengguo Jing, Bradley Barnes
2019 ACM Technical Symposium on Computer Science Education (SIGCSE ’19). Minneapolis, MN.

Abstract:
This replication study aims at both quantifying the effects of active learning classrooms in introductory programming courses (CS1) and overcoming some design and methodological limits of prior studies on this topic. 156 students enrolled in three different sections of the same CS1 participated in this study. The three sections differed from each other either in terms of learning pedagogies (conventional lecture vs. peer instruction) or physical learning environments (lecture hall vs. active learning classroom). This study did not replicate the findings of prior studies on this topic. Instead, this study found that when learning pedagogies were controlled, learning environments did not have significant influences on student performance. On the other hand, learning pedagogies were found to have significant influences on student performance. When peer instruction is conducted other than conventional lecturing, students tended to have significantly better performance. Such findings highlight the importance of active learning in computing education, and the feasibility of conducting active learning in CS1 despite of physical environment constraints. Additionally, such findings emphasize the necessity of replication studies on the topic of active learning environments, and invite debates on the investment decisions in active learning classrooms.

Effects of Active Learning Environments and Instructional Methods in Computer Science Education
Qiang Hao, Bradley Barnes, Ewan Wright, Eunjung Kim
2018 ACM Technical Symposium on Computer Science Education (SIGCSE ’18). Baltimore, MD.

Abstract:
This research investigated the impacts of active learning environments and instructional methods adapted to such environments on the academic performance of computer science students. Two consecutive studies involving a total of 267 novice students in the same course were conducted across two different semesters. The course was taught by the same instructor and set up with two different sections. One section was taught in a conventional lecture hall, while the other was taught in an active-learning classroom with adapted instructional methods. Active learning environments and the adapted instructional methods were found to have significantly positive effects on students’ learning outcomes. Fine-grained results grouped by major were discussed. The findings of this study demonstrate positive effects of active learning environments in computer science education, thereby adding to the literature on both computer science education and learning environments.

Feature Selection on Predicting Post-graduation Income

This set of studies explored the most important factors in predicting post-graduation income. This project is inactive now, but I am happy to answer any questions about our findings.

Examine Educational Opportunity and Inequality Using Machine Learning Methods with US National Data
Yan Liu, Lok Heng Chau, Qiang Hao
2021 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2021). Washington, D.C.

Abstract:
Education opportunity and inequality have been serious concerns in the history of US education. Social economic, ethnic, and racial disparities in academic achievement has been frequently shown in the literature. However, in existing literature researchers investigated these issues in disjointed contexts when using large scale national assessment data because the conventional statistics they adopted only focuses on making inferences from a small number of predictors. Using a largescale national data, this study aims to predict students’ mathematics achievement across 50 states in the US with a total of 74 predictors and over 11,000 school districts. Three machine learning methods, i.e., Random Forests, Lasso Regression, and Genetic Algorithm, were adopted in this study. The results suggest that racial and ethnic proportions in the school district, school related factors (e.g., pupil-teacher ratio, free lunch provided in the school), family socioeconomic status and parent education (e.g., poverty, occupation) are important factors regarding students’ achievement in mathematics across the nation.

Feature Selection of Post-graduation Income of College Students in the United States
Ewan Wright, Qiang Hao, Khaled Rasheed, Yan Liu
2018 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2018). Washington, D.C.

Abstract:
This study investigated the most important attributes of the 6-year post-graduation income of college graduates who used financial aid during their time at college in the United States. The latest data released by the United States Department of Education was used. Specifically, 1,429 cohorts of graduates from three years (2001, 2003, and 2005) were included in the data analysis. Three attribute selection methods, including filter methods, forward selection, and Genetic Algorithm, were applied to the attribute selection from 30 relevant attributes. We discuss how higher numbers of students in a cohort who grew up in Zip code areas where over 25% of the population hold a Professional Degree was predictive of more college graduates being classified as High income.

Feature selection and classification of post-graduation income of college students in United States (Master’s thesis)
Qiang Hao
University of Georgia, 2017.