I don't want to talk about ChatGPT either, but here we are.
As a computational linguist, I had to try it. Here are my seven key takeaways.
I'm sick of hearing about ChatGPT. And you know what's sad about that? I'm a data scientist specialised in Natural Language Processing (well, computational linguistics, but that's the non-sexy name that nobody uses). ChatGPT is a feat of technology in my very own, beloved domain, one that should have me filled with joy and wonder: not because it's perfect, but because of what it promises.
And yet, I'm bored.
And yet, I buckled.
In March I'll deliver a conference speech about NLP to an audience of non-technical business leaders, and I knew there was no way I could get up on that stage and not address the 175 billion parameter elephant in the room. I also knew that if I let ChatGPT generate some of my presentation for me, only to reveal that at the end in a dramatic twist of technology, I'd knock my audience's little business socks off.
What can I say? I'm a sucker for putting on a good show.
So I made an account and struck up a conversation. This was the first time I'd really tried to work collaboratively with any kind of generative language model. And I've pretty much avoided the now-ubiquitous talk of how to write good prompts, because I was over hearing about ChatGPT by the time the prompt-hacking bandwagon rolled into my Twitter and LinkedIn feeds. So this conversation was always going to be a learning experience for me. Here are my key takeaways:
Anthropomorphising chatbots is alarming, and unavoidable
One of the problems with the public's perception of AI is our human tendency to anthropomorphise it.¹ We often can't help assigning personalities, thoughts, and feelings to inanimate objects, and I should know: I can't even overstuff a pencil case or leave a laptop open by an inch, because it looks uncomfortable and I feel guilty about it. You get that feeling too, right?
Right?
Anyway, the problems of anthropomorphising AI are many. It can lead to public misunderstanding of what it is and what it can and can't do. It can mask the agency of an algorithm's creators, thus failing to hold them to account for any biases and ethical blindspots they have (unknowingly or otherwise) built into it. I won't go on; I'd need another whole article for that.²
So I look at the below prompt and cringe. "What do you think of it?", I am forced to ask, knowing full well that my "interlocutor" doesn't think anything. And addressing it as "you" gives me the creeps, but what other choice do I have?
I'm distressed about this because I work in this domain and have these issues ever present in my mind. But what really worries me is that there will be plenty of other users who will talk to machines in this way and won't be bothered at all. And if (or should I say, when) tools like this become more and more integrated into our daily lives, then this problematic personification will only become more normalised.
I'm honestly not sure what the solution is here. Perhaps we need a special pronoun, like "you*", to address non-human communicators? I'd be keen to hear your thoughts.
I'm not convinced ChatGPT is fully keeping track of what it's said
Everything ChatGPT says in the following answer is spot on, and exactly what I had in mind. It seems like it has also been trained on giving constructive criticism, which is reassuring. But I'll admit I felt pretty patronised. I know I asked for potential restructuring, but this answer implies I do indeed need some reworking, and then proceeds to just give me my original plan back, with a few extra details.
This inconsistency betrays the fact that ChatGPT doesn't actually understand what it's talking about. If it did, it would know that what it was offering was not a restructuring but a fleshing out. That might sound like nitpicking, but it's important. For example, later I asked ChatGPT to write my script for the section outlining the history of NLP. I then asked for the summary bullet points to go with that. And the result did not match up well at all. ChatGPT apparently hadn't summarised its own output to produce the bullet points; rather, in both answers, it had started from scratch, generating the statistically most likely word or bullet-point sequence describing NLP's progress until now. This is disappointing, especially in the first case, in which the context it should have had to keep track of was not yet very large or complex.
Don't be fooled into thinking you have a shared "understanding"
Despite this issue, the above answer does include one idea I hadn't thought of, which is great. Specifically, I'm talking about the suggestion in point 1, of drawing the audience's attention to the way businesses rely on customer communication. So I decided to explore that. I asked for elaboration, and ChatGPT generated a decent introduction I could use to get my audience's attention. I have to give it a point there.
Buoyed by this success, I tried to continue our collaborative work:
Unfortunately, this answer is problematic again. I asked for a script for "point 2". What I meant was the second point it had previously generated in response to the talk plan I had fed it: "Define NLP in a way that is accessible to the audience. Avoid technical jargon and focus on explaining how NLP is used to understand, process, and generate human language," is what ChatGPT had said.
The above answer is good (although even without my truncating the screenshot, I can tell you there were many more examples that could have been listed), but it's not what I asked for. Rather, it's an answer to point 3: "Provide some real-world examples of how NLP is currently being used in business".
I think that, as with working with any real humans, it's important to be explicit to avoid misunderstandings. Clearly, me just asking for a script for "point 2" wasn't enough. If I had asked something like "Ok now please generate a script for me which defines NLP in an accessible, non-technical way", I would have gotten the answer I was looking for. And in fact, I did:
Weird as it is, you can productively spar with such a model
I had a problem with the above response, and I told ChatGPT so: "I don't like to say that NLP deals with how computers can understand human language," I told it. "Because of course, they don't understand it. So my preference is to talk about how computers can process language and extract information from it. Do you agree?"
Again, asking for agreement here makes me gag: I'm the human "expert" feeling like I need the model's approval, which just hurts my pride too much. I'm also uneasy with how the model immediately acquiesced to my opinion. I've seen other cases of people arguing with ChatGPT, and it putting up a real fight, but this was different. Maybe because I'm right, ha! Or, to put it more modestly and accurately, much has been written on exactly the point I was making, and I simply triggered ChatGPT to recite it for me. But still, I found it creepy.
Aside from that though, I did like the model's response. It didn't just address my concern, but also rephrased the generated script for me without me needing to ask. Points for taking initiative, little language model!:
ChatGPT reveals that we're all just living language models
I just mentioned that I raised an oft-cited point with ChatGPT, which is probably why it agreed with me so readily in that case. And in fact, by this time in the conversation I was starting to feel like everything ChatGPT said could have come from my own mouth. But then, maybe it was reciting me. After all, I've written a textbook on NLP, produced a LinkedIn Learning course on NLP tools and methods, written on Medium about it, and appeared on multiple podcasts and webinars. There's a very real chance that I'm in the training data. So I tried to find out:
Ok, so maybe I wasn't in the training data, but I'm still online. So I was still curious as to whether it had any information about me, or whether it would make up some fantastical bio. I persisted. "Do you know anything about Katherine Munro though?", I asked. I got the same canned response about being a language model and therefore not knowing about individual people, but instead having general knowledge on the subject.
I felt that this was missing the point, and I told it so. "I think you misunderstood the question, there is lots of information about me online. I'm not famous, but even a Google search will find information about me," I said. The answer was almost identical.
So the repeated line about being a language model (yeah, we get it!) frustrated me here, and I guess I'll never know (nor be able to brag to the hordes of ChatGPT-worshipping tech fanatics) whether my work helped train it. But then again, maybe the reason ChatGPT's teachings on NLP sound so much like my own work is that I'm nothing but a large language model myself. I, too, have been trained on hundreds of scientific papers, Kaggle tutorials and Medium blog posts on NLP, such that even my own "expert explanations" are nothing more than a rehashing of what's come before me. It's a humbling thought.
I gave up on our lovers' tiff, and got back to work.
Under expert supervision, ChatGPT is incredibly powerful
ChatGPT's answer to my above question about the current state and future of NLP was a definite pass. It phrased it as dot points (missing the fact that I had asked for a script that could actually be said out loud), but the information was valid.
In fact, everything it had said so far was valid. So I decided to apply one more test: I asked it to specifically name some NLP tasks and benchmarks. It was essentially a sanity check for me, and ChatGPT certainly delivered, describing multiple famous datasets, such as the WinoGrande benchmark for common-sense reasoning. I suppose I shouldn't be surprised: such benchmarks are usually presented very clearly in multiple research papers, which it would be only too easy for ChatGPT to summarise.
So by this point, I was feeling pretty positive. Not under- or over-whelmed, but whelmed, for sure. It felt like, under the watchful eye of a domain expert (i.e. me), ChatGPT could generate factually reliable content.
… But is it useful?
Before I bring this rant of an article to a close, let me preface my final words with a disclaimer: I know that there are many excellent uses for ChatGPT, in which it can be a real productivity boost for many people. This, I will conclude, was not one of them.
The question is whether this whole adventure saved me any time. I continued to work with ChatGPT, requesting more information, asking for clarification, correcting it when its output prose didn't match the titles I'd asked to go with it, and so on.³
It was an intriguing experience, but also a long one. I started to feel like I was supervising a junior on a task I could have done myself, much faster. After all, I started the entire conversation by giving ChatGPT my pre-prepared talk outline. The specific points for each planned section were also already in my head; I just thought I'd try the shortcut of getting ChatGPT to "read my thoughts" for me. And then when I got into the game and started asking for scripts, all I ended up producing for myself were lengthy texts, which weren't in my own voice and which I needed to proofread for correctness. What's even sillier about that is, given my pre-prepared outline, I could have improvised the whole talk if I'd had to (after all, I told you I've written about NLP a lot). It wouldn't be my preferred way of working, but for a 20-minute time slot it would have been possible.
And this leads me to highlight a bit of a dilemma. Given language models' well-documented tendency to hallucinate information, you shouldn't trust their output: you should fact-check everything they say, either by doing your own research or simply proofreading, if you're confident in your expertise on the subject. But in the first case, why not just do the research yourself and learn as you go? I suppose you could view ChatGPT's output as a research guide, so that you actually know what to look up, based on what it said. You run the risk of not looking into something because ChatGPT didn't mention it, but I suppose if you're only at the research stage of any topic then that's already a risk anyway. As for the second case, why not just produce the content yourself, if you already know it? If I again play the language model's advocate here, I suppose I could say that the feeling of co-working with someone (and a non-human, brand-spanking-new one at that) is intriguing, and a bit of fun. If nothing else, it helped me push past my procrastination in getting this presentation started. And perhaps that's its greatest gift of all.
1. For more on our tendency to anthropomorphise AI, see "The Prompt Box is a Minefield: AI Chatbots and Power of Language", and "Love in the Time of AI".
2. Anthropomorphising AI is bad, mmmkay? Don't believe me? See "The problem with anthropomorphizing artificial intelligence".
3. Coincidentally, just after doing this I found an interesting piece by Ethan Mollick, in which his students found that this was also the best way to work with the tool.