“The research project dealt with a database associated with two online textbooks, ‘How To Think Like a Computer Scientist’ and ‘Problem Solving with Algorithms and Data Structures,’ written by two of my computer science professors,” Doorenbos says.
As one of the authors, Miller had a goal in mind. “We wanted to understand how students were using the online textbook and get insight into how different kinds of students were learning,” he says. “With the large amount of data available, it quickly turned into a project to simply understand how to get the data set cleaned up and organized so we could do some analysis.”
The interactive textbooks track the activity of students using the books and have logged more than 30 million data entries in the two and a half years they have been operational. “My research dealt mainly with developing a cleansing workflow or basically a series of commands that could be used to clean up the database to make it suitable for analysis,” Doorenbos says. “My research also included some initial analysis of the data set.”
“One of the most interesting things we discovered was that a feature we added between our first and second years of using the textbooks made a huge difference in how quickly students were able to do their homework exercises,” Miller says. “This is one of the great qualities of an interactive textbook. You can learn from the data, try out new teaching strategies, and actually measure whether or not they work.”
Social media played a role in what Doorenbos found most fascinating. “While doing some analysis of short-term users (people who only used the website for less than a day), I found one day when the number of hits on the website spiked to over 5 times more than the normal amount,” Doorenbos says. “After some sleuthing, I found the website had been shared to a few Reddit pages that day. While this wasn’t particularly meaningful to our project, it was very cool to see the power of social media and how our data reflects that.”
“I really liked involving a student in the textbook project because he’s closer to the learning problem than I am, in that he just recently experienced the hard introductory learning curve in computer science,” Miller says. “He could see things that should be improved that I may miss.”
Doorenbos is presenting the results of the project at the National Conference on Undergraduate Research. “We also plan to make a copy of the data set we’ve collected and make it available to other researchers who can do their own analysis,” Miller says. “They can apply very advanced learning algorithms to the data to see if we can automatically optimize the learning environment for every student.”
My research required that I use what I learned from at least three different classes, ranging from an intro-level course to my higher-level courses.
—John Doorenbos '16
Dealing with the scale of the data.