Student Use of an LLM Incrementally Fine-tuned to behave like a Teaching Assistant in Business and Marketing
Institution: Dublin City University
Discipline: Data Analytics
Authors: Alan Smeaton
GenAI tool(s) used: Chatbot created for purpose
Situation / Context
One hundred and forty-three undergraduate students of business and marketing took a module on data analytics and visualisation in early 2024, which covered technical topics including linear regression, significance testing, correlation and p-values. A conversational virtual teaching assistant (TA), a chatbot, was created and made available to students anonymously: there was no tracking of student identity, though the questions they asked and the replies generated were logged. The virtual TA was hosted on a third-party platform, agenthiost.ai, which acted as an intermediary between students and an underlying large language model (LLM).
Task / Goal
The aim of this work was to provide an always-available virtual TA and to contribute to the constructivist rather than the instructivist approach to student learning. Students in this course are already able to form their own learning based on interacting with a knowledgeable, reliable, though not always available teaching assistant or lecturer. The virtual TA allows them to ask the questions they would not ask the TA or lecturer directly and to do so in confidence because of the anonymity of the interaction, and at a time that suits them.
Actions / Implementation
The virtual TA was created using an off-the-shelf LLM (GPT-3.5), which was incrementally fine-tuned each week of the semester on that week’s new course materials: slides, reading materials, transcripts of recommended YouTube videos, text from recommended web pages, assignments, and automatically generated transcripts of lecture recordings. The lecture transcripts covered all 150,000 words spoken during 20 hours of face-to-face lectures. As a result, each week the virtual TA improved its knowledge of the course but would not answer questions using materials which had not yet been covered in class.
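The week-by-week release of materials can be illustrated with a minimal Python sketch. It is not the platform's actual ingestion mechanism, which is not described here; the `Material` record, the `materials_available` helper and the sample items are all hypothetical and serve only to show how a cut-off by teaching week keeps not-yet-covered content out of the assistant's knowledge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """One course item (hypothetical record, for illustration only)."""
    week: int   # teaching week in which the item was covered in class
    title: str
    kind: str   # e.g. "slides", "transcript", "reading"


def materials_available(materials, current_week):
    """Return items covered up to and including current_week, in week order."""
    return sorted(
        (m for m in materials if m.week <= current_week),
        key=lambda m: m.week,
    )


# Invented sample items; the real course materials are not listed in this case study.
COURSE = [
    Material(3, "Linear regression reading", "reading"),
    Material(1, "Introductory slides", "slides"),
    Material(2, "Correlation lecture transcript", "transcript"),
]

if __name__ == "__main__":
    for week in (1, 2, 3):
        titles = [m.title for m in materials_available(COURSE, week)]
        print(f"Week {week}: {titles}")
```

Under this assumption, a question about linear regression asked in week 2 would find no supporting material, because the week 3 reading has not yet been added.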
The virtual TA was prompt-engineered to stay within the scope of the fine-tuned materials only and not to answer questions from outside the course. The LLM’s temperature, a parameter in the range 0 to 1 which controls the amount of randomness or creativity in the model’s response, was set to 0, thus practically eliminating hallucinations. The environment in which the virtual TA was presented included many warnings for students about what it knows and does not know.
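The effect of the temperature setting can be shown with a generic sketch of temperature-scaled sampling. This is a textbook illustration, not the code used by the platform or the underlying model; the function name `sample_token` and the example scores are assumptions. At temperature 0 the highest-scoring option is always chosen, so the output is repeatable; at higher values lower-scoring options are picked more often.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Pick an index from a list of scores using temperature scaling.

    A temperature of 0 is treated as greedy selection (the top score always wins).
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    scores = [1.0, 3.0, 2.0]  # invented scores for three candidate tokens
    print([sample_token(scores, 0) for _ in range(5)])    # always index 1
    print([sample_token(scores, 1.0) for _ in range(5)])  # varies between runs
```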
Students were advised to always refer to the lecture and other course materials, and not to depend on the virtual TA.
Outcomes
During the semester and up to the day of the final written exam, 2,787 questions were asked by students and an analysis of a sample of 700 of these found that all were answered correctly from the course materials. In the 24 hours before the final exam, 730 questions were asked and answered.
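To put these figures in proportion, the short calculation below works out the share of questions that were sampled for checking and the share asked in the final 24 hours. It assumes, which the text does not state explicitly, that the 730 questions from the last 24 hours are counted within the 2,787 total; the variable names are illustrative.

```python
total_questions = 2787   # questions asked up to the day of the final exam
sampled = 700            # questions checked against the course materials
last_24h = 730           # questions asked in the 24 hours before the exam

sample_share = sampled / total_questions
last_24h_share = last_24h / total_questions  # assumes these are part of the total

print(f"Sampled for analysis: {sample_share:.1%}")     # 25.1%
print(f"Asked in final 24 hours: {last_24h_share:.1%}")  # 26.2%
```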
An analysis of usage indicated that the virtual TA was used for understanding concepts, preparing for the exam and working on assignments, and that it helped students to ask questions they would not ask the lecturer. Apart from some issues with response time, students did not experience any technical issues when using the virtual TA.
Reflections
We know that the concept of using GenAI has many positives and negatives, but the challenge is using the right form, configured in an appropriate way, and presented as part of a constructivist way of learning.
Some of the questions asked included requests for worked answers to exam questions, which we did not anticipate. Students benefited from the initiative, and the methodology can be scaled and replicated in other modules. However, the context of a module and its relevance and appropriateness for such an approach should always be considered. For future iterations, and we will be doing this for more than half a dozen modules in the first semester of 2024/2025, we will include a basic introduction to and overview of prompt engineering so that all students can get the most benefit from using the virtual TA.
Author Biography
Dr Alan Smeaton is a Professor of Computing at Dublin City University and a founding co-director of the Insight SFI Centre for Data Analytics. In 2021, he was awarded PFHEA status for his work on the development of pioneering and innovative uses of online resources for student learning over an extended period of time. In early 2024, he was appointed to the Government’s Advisory Council on Artificial Intelligence.