GPTSense
Purpose
GPTSense is a text classification model trained on 32,000 GPT-3.5 responses and 10,000 human-written responses. My team created GPTSense during the Los Altos Hacks VII hackathon on the premise that, as generative AI tools such as ChatGPT and Bard continue to develop, there should be countermeasures in place to help detect their use. The goal of GPTSense is to empower teachers and institutions to detect the use of AI as a method of cheating, since even industry-leading solutions like GPTZero still often misclassify genuine work as AI-generated.
Process
Collecting training data
The first step towards creating the model was to collect as much data as we could in a short period of time. Luckily for us, OpenAI had a reasonably well-documented API, and we used it to generate ~32,000 responses to questions taken from a questions dataset (the NQG dataset, linked below). We then found a dataset of old blog posts and parsed it into ~10,000 individual human responses. All of this data was uploaded to a MongoDB database, which streamlined the collection process.
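A minimal sketch of what this collection loop might look like, assuming the pre-1.0 openai Python client and a local MongoDB instance; the prompt format, field names, and database/collection names here are illustrative placeholders rather than the exact hackathon code:

```python
# Hypothetical sketch of the collection loop; database names, field names,
# and the prompt format are assumptions, not the actual hackathon code.
import openai
from pymongo import MongoClient

openai.api_key = "YOUR_OPENAI_KEY"  # placeholder
responses = MongoClient("mongodb://localhost:27017")["gptsense"]["responses"]

def collect_gpt_responses(questions):
    """Generate a GPT-3.5 answer for each question and store it with a label."""
    for question in questions:
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": question}],
        )
        answer = completion.choices[0].message.content
        # Label every generated answer so it can later be exported into the
        # GPTResponses folder used for training.
        responses.insert_one({"label": "gpt", "question": question, "text": answer})
```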
Building the model
My main priority when creating the AI model was to keep it simple and to make sure we had something functional. For this reason, I used the Keras API in TensorFlow to create a text classification model. The general architecture is a Sequential model with an Embedding layer followed by pooling and dropout layers, which keeps the model lightweight and quick to train.
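A minimal sketch of this kind of architecture, loosely following the standard Keras text classification recipe; the vocabulary size, embedding width, and dropout rate shown here are assumptions, not the exact values used:

```python
# Illustrative architecture sketch; hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

max_features = 10000   # vocabulary size (assumed)
embedding_dim = 16     # embedding width (assumed)

model = tf.keras.Sequential([
    layers.Embedding(max_features, embedding_dim),   # token ids -> dense vectors
    layers.GlobalAveragePooling1D(),                 # average over the sequence
    layers.Dropout(0.2),                             # regularization
    layers.Dense(1),                                 # single logit: GPT vs. human
])

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    optimizer="adam",
    metrics=["accuracy"],
)
```

This model expects integer token ids as input; the training sketch in the next section shows where a text vectorization step would fit in.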
Training the model
Once I was satisfied with the amount of training data we had collected, I split it into training, testing, and validation sets with a 70/20/10 split. Within each set, I created two class directories, GPTResponses and HumanResponses, each containing one .txt file per response. This layout let me use TensorFlow's text_dataset_from_directory utility to create the datasets I would actually train the model on.
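Assuming a directory layout like data/train/GPTResponses and data/train/HumanResponses (the paths, batch size, sequence length, and epoch count below are placeholders), the loading and training steps might look like this:

```python
# Sketch of loading the split directories and training the model above;
# paths, batch size, sequence length, and epochs are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

batch_size = 32

# Each split directory contains GPTResponses/ and HumanResponses/ subfolders of .txt files.
raw_train_ds = tf.keras.utils.text_dataset_from_directory("data/train", batch_size=batch_size)
raw_val_ds = tf.keras.utils.text_dataset_from_directory("data/val", batch_size=batch_size)
raw_test_ds = tf.keras.utils.text_dataset_from_directory("data/test", batch_size=batch_size)
print(raw_train_ds.class_names)  # ['GPTResponses', 'HumanResponses']

# Map raw strings to integer token ids for the Embedding layer sketched earlier.
vectorize_layer = layers.TextVectorization(
    max_tokens=10000, output_mode="int", output_sequence_length=250)
vectorize_layer.adapt(raw_train_ds.map(lambda text, label: text))

def vectorize(text, label):
    return vectorize_layer(text), label

train_ds = raw_train_ds.map(vectorize)
val_ds = raw_val_ds.map(vectorize)
test_ds = raw_test_ds.map(vectorize)

model.fit(train_ds, validation_data=val_ds, epochs=10)
```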
Evaluation
GPTSense reached a training accuracy of 99% and a testing accuracy of 98%. Unfortunately, we did not have enough time to build a rigorous real-world benchmark, but based on the few hours of informal testing we could do, we estimate its real-world accuracy at around 75%. When we fed the model text generated entirely by ChatGPT, it was almost always confident that the text was AI-generated. However, it is possible to trick the model by writing in an extremely formal style. Overall, we were very satisfied with the solution we created in only 24 hours, and we believe we can expand on GPTSense in the future by introducing more training data and a wider range of human responses.
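Continuing the sketches above, this is roughly how the trained model could be evaluated and used to score a single passage; the sample text and label mapping are illustrative:

```python
# Sketch only: measure test accuracy and score one passage with the model
# and vectorize_layer defined in the earlier sketches.
import tensorflow as tf
from tensorflow.keras import layers

loss, accuracy = model.evaluate(test_ds)
print(f"Test accuracy: {accuracy:.2%}")

# Wrap vectorization and the trained model so raw strings can be scored directly.
export_model = tf.keras.Sequential([vectorize_layer, model, layers.Activation("sigmoid")])

sample = ["In conclusion, the aforementioned factors clearly demonstrate the trend."]
probability = float(export_model.predict(sample)[0][0])
# With the directory layout above, label 1 corresponds to HumanResponses,
# so a low probability suggests the passage is AI-generated.
print(f"P(human-written): {probability:.2f}")
```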
Video demonstration
A short clip demonstrating the functionality of GPTSense
GPTSense in use
This image shows GPTSense correctly identifying a blog post as most likely human-written
Links & Credits
GitHub repository (source code): https://github.com/FakeZhiyuanLi/GPTSense
Devpost submission: https://devpost.com/software/gptsense
Video demonstration link: https://www.youtube.com/watch?v=Lkkg5v1PHC8&t=1s
Los Altos Hacks VII: https://www.losaltoshacks.com/
NQG questions dataset: https://github.com/xinyadu/nqg