Pre-Project
Once upon a time there was a wee little boy who picked up a science fiction novel. In it, cars could fly and robots could talk. The internet, also, never cut out. The boy kept these ideals at heart until one day he could bring one of them to life, in three weeks, for a Hawken Project.
Radical, I know. Society has seen the news. Anyone can talk to a robot from anywhere at any time. All they have to do is visit https://openai.com/chatgpt, click Start now, and they're off to the races. When they do, a multi-million-dollar computer running a Large Language Model spools up, and they can have a near real-time conversation with a computer: just like the movies. Many consider it Silicon Valley's biggest debut of the past 10 years, and for good reason.
My experience of this technology has been a little more 'down to earth' than the rest of society's, and I'm not calling everyone crazy. Before they were large, language models were just, ya know, language models. You took a sentence, or a whole bunch of them, and fed it into the model, much like plugging x into y = mx + b, and out came basic stuff. They could perform sentiment analysis (does this sentence carry a cynical or a positive tone?), detect hate speech, filter spam emails, translate between languages, or fill in the blank.
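To make the "sentence in, answer out" idea concrete, here's a toy sketch of sentiment analysis in Python. Real language models learn their weights from mountains of data; the word list and scores below are completely made up for illustration, just to show the shape of the task.

```python
# Toy sentiment "model": a hand-written score for a few words.
# A real model learns scores like these from data instead.
WORD_SCORES = {"fun": 1.0, "great": 1.0, "love": 1.0,
               "boring": -1.0, "hate": -1.0, "broken": -1.0}

def sentiment(sentence: str) -> str:
    # Add up a score per word, much like adding terms in y = mx + b.
    score = sum(WORD_SCORES.get(w.strip(".,!?"), 0.0)
                for w in sentence.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this project, it is great"))  # positive
print(sentiment("This is boring and broken"))         # negative
```

It's crude (it can't handle "not fun", for one), but that gap between word-counting tricks and actually understanding a sentence is exactly what the models below try to close.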
Here is a basic demo from BERT, a language model created by Google in 2018. I've set it up to do a basic fill-in-the-blank.
Input sentence:
"Owen is doing a Hawken Project. BLANK is going to be really fun."
Output:
It         | 66%
This       | 28%
That       |  3%
Everything |  1%
He         |  1%
For each word, BERT gives the probability that it belongs in the blank. The strongest guess is "It" at 66%. Really cool, right? I'll get into more details later in the blog, but the net takeaway is that language models were once a feeble technology that only a researcher could appreciate.
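Those percentages aren't magic, by the way. Under the hood, a model like BERT assigns every word in its vocabulary a raw score (a "logit") for the blanked-out position, and a function called softmax turns those scores into probabilities that add up to 100%. Here's a toy sketch; the logit values are made up to roughly reproduce the table above, not pulled from the real model.

```python
import math

# Made-up raw scores (logits) for a handful of candidate words.
# A real model produces one logit per word in its whole vocabulary.
logits = {"It": 5.0, "This": 4.15, "That": 2.0,
          "Everything": 0.9, "He": 0.9}

# Softmax: exponentiate each score, then divide by the total,
# so the results are positive and sum to 1.
total = sum(math.exp(v) for v in logits.values())
probs = {word: math.exp(v) / total for word, v in logits.items()}

for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word:>10}  {p:.0%}")  # e.g. "        It  66%"
```

Bigger logits hog almost all of the probability, which is why "It" dominates even though "This" scored nearly as high.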