Mastering ChatGPT: From Zero to Hero (1/3)
This year has seen a big bang of products based on Generative AI. The explosive growth of these products has shaken the industry, throwing our usual estimates of effort and accuracy out the window. As machines push their way into the domain of creative work, much of the industry predicts that the days of the human workforce are numbered, and that prediction has led to fear and panic.
It is true that we are not there yet. The code written by ChatGPT and Copilot has occasional errors. The images generated by MidJourney look unnatural at times. The blogs generated by ChatGPT fail plagiarism tests. However, we must remember that this is only the beginning. These tools are developing so fast that the day is not far when they will produce accurate artifacts that can be deployed without review.
When that happens, software developers, bloggers, graphic designers, and their managers will all be unnecessary and jobless! And that day is only a couple of years down the line. Does that mean humans are not required anymore? Certainly not! However, the domain of human contribution will change. If you are sensitive enough to identify this new requirement, and prompt enough to master the new set of technologies, nothing can stop your growth.
This series of blogs will build your skills — not just to retain your job — but to help you rule the new era of Generative LLMs.
Topics covered
This blog starts with a detailed theoretical introduction, then jumps into practical implementation and code examples. We will cover the following topics in the blog series.
- Introduction to ChatGPT, LLM, and Prompt Engineering
- Using the OpenAI API in your apps; hosting your own chatbot on AWS
- Hosting your own LLM on AWS with Amazon Bedrock
I am sure you are excited to take up this journey.
What is Generative AI?
Traditional, inferential AI started as an aid to decision-making: it automated the process of extracting subtle inferences from the available data.
For example, when we train a face-recognition model, it “learns” how to differentiate between faces and identify individuals. It is difficult to hand-craft an algorithm that captures these differences between faces. However, when we train a model with a huge amount of data, it learns them by itself. This is inferential AI because we train the model to extract inferences from the available data.
Inferential AI had huge potential and applications. However, that is little compared to generative AI, where we train the model to generate content “similar” to the content used for training it. This similarity is hazy and subtle: a concept that is almost impossible to quantify in an algorithm, yet very easy to recognize. That makes it a perfect machine-learning problem.
As a theoretical concept, Generative AI dates back to the 1960s. However, it remained just a concept until it was backed by the extreme computing power we can access today. In 2014, it was formalized with Generative Adversarial Networks (GANs), and now we see a burst of Generative AI models. As the necessary computational power becomes available to more and more developers, we can expect several disruptive innovations in this domain.
Opportunities
AI is a disruptive innovation. Its disruptive potential is compared to that of electricity, or perhaps the wheel itself. When the social order is disrupted by such an innovation, we can see the world divided into three parts:
- The builders
- The users
- The losers
The builders are the clear winners. The brains behind AI models and solutions like ChatGPT will surely rule the industry. However, not everyone is trained to build them. The rest still have the choice of using the technology to improve their productivity. And those who fail at even that will surely lose the game.
We will guide you to stay in the first two groups, to make sure you thrive (not just survive) in this disruptive age.
Introduction to ChatGPT
ChatGPT needs no introduction. Unless you have just descended from Mars, you certainly know about it.
ChatGPT is a large language model developed by OpenAI. It is designed to generate human-like responses to natural language queries and is capable of performing a wide range of language tasks such as language translation, summarization, conversation, and more. ChatGPT is trained on a vast corpus of text data, which enables it to understand and generate responses to a broad range of topics and contexts. As an AI language model, ChatGPT can be integrated into various applications to provide human-like language processing capabilities.
Of course, the last paragraph was generated by ChatGPT :). Gone are the days when we scratched our memory for concise definitions. Are you thinking about things like growing dependence on machines and unnatural behavior? Well, remember the folks in the Stone Age who refused to use stones to reduce their dependence (I am sure there were some idiots who did that). They were ignored and forgotten, and so will be the ones who refuse to use ChatGPT.
Instead, leverage it for what it can do and add to it. (The above paragraph cannot be generated without a human mind.) That is the key to your success in this new era. Don’t waste yourself doing what a machine can do. Instead, apply your skill to make the machine do what it can, and then add value to it with your own creativity.
What is a Large Language Model?
Okay, so everyone has told us that ChatGPT is based on a large language model (LLM). But what exactly is this LLM? And what gives it this miraculous power? I asked ChatGPT again, and this was the answer.
A large language model is a type of artificial intelligence model designed to process, understand, and generate human-like text based on the input it receives. These models are “large” because they consist of a vast number of parameters, which are the elements of the model that are learned during the training process. The more parameters a model has, the better it becomes at understanding and generating complex language patterns, as well as handling a wider range of topics.
Large language models are typically trained on massive amounts of text data from diverse sources, such as websites, books, articles, and more. This extensive training enables the models to grasp linguistic structures, context, and various subject matters, making them capable of generating coherent, contextually relevant, and often creative text.
Examples of large language models include OpenAI’s GPT series (GPT-3, GPT-4, etc.), Google’s BERT, and Microsoft’s Turing-NLG. These models are used in various applications, such as chatbots, content generation, translation, question-answering systems, and more.
Let us understand what this means. At its core, a deep learning model is essentially a collection of weights on a huge neural network that transforms an input vector into an output. The weights are trained iteratively using the training data. And the effectiveness of the model depends upon the structure of the network, the data used to train the model, and also the way it was trained.
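To make this concrete, here is a toy sketch in Python. The weights below are made up purely for illustration (a real model learns them from data, across millions or billions of parameters), but the mechanics are the same: a matrix of learned weights transforms an input vector into an output vector.

```python
import numpy as np

# Made-up numbers for illustration; a real model learns these from data.
x = np.array([0.5, -1.2, 3.0])           # input vector (3 features)
W = np.array([[0.1, 0.4, -0.2],          # learned weights (2 x 3)
              [0.7, -0.3, 0.05]])
b = np.array([0.1, -0.1])                # learned biases

# One layer of a neural network: a weighted sum plus a nonlinearity.
output = np.tanh(W @ x + b)
print(output)                            # a 2-element output vector
```

Training adjusts W and b, iteration after iteration, until the outputs match the expected results on the training data.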
This is all fine when we talk about numbers. We can perform arithmetic operations on numbers. However, things get messy when we talk about natural languages. How do we perform arithmetic on words? And how do we train neural networks without such arithmetic? This problem is solved using word vectors, which large language models build upon. Using word vectors, we map the entire vocabulary of a language into numeric vectors, such that related words map to vectors that lie close to each other.
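Here is a minimal sketch of that idea with hand-picked toy vectors. Real models such as word2vec or GloVe learn vectors with hundreds of dimensions from huge corpora; the three dimensions below are invented purely for illustration.

```python
import numpy as np

# Toy 3-dimensional word vectors, hand-picked for illustration.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; close to 1 means related."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(vectors["king"], vectors["queen"]))  # high (~0.99)
print(cosine_similarity(vectors["king"], vectors["apple"]))  # low  (~0.30)
```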
Generating a word vector representation is a massive task in itself, requiring extensive iteration and training. However, that alone is not enough. Words do not stand alone: they carry different meanings in different contexts, and that context can stretch over a large span of text. The larger the span we account for, the more accurate the model. At the same time, the larger the model, the more it costs to train and use.
Until a few years ago, these were topics of theoretical research alone. Nobody had access to the computational power required for this. However, with the growth in available computing power, people have successfully trained such models and that has brought forth miracles like ChatGPT.
Other LLMs
Famous is not necessarily the best. ChatGPT was the first to hit the market, and it came with a bang. However, it is not the only one. Today, we have several LLMs that generate text content. Some specialize in specific fields, while others are generic.
Copilot and CodeWhisperer
These tools, from GitHub and AWS respectively, can generate huge amounts of code in no time. They specialize in code and documentation and are far more accurate than ChatGPT (which can also write code).
Bing AI
Perhaps the dumbest of all. Can’t expect more in a free product from Microsoft.
Chatsonic/Writesonic
A wonderful generative tool that uses GPT-4 under the hood. It has a comprehensive plan and offers a lot of functionality on top of GPT-4, like integration with several other systems.
Rix
This is a chatbot specialized for developers. It is not as powerful as ChatGPT, but it is good for developers with technical questions. Its knowledge is far more current than that of ChatGPT, which proudly boasts a knowledge cutoff of September 2021.
Google Bard
Although its public demo failed miserably, the product is quite good, and it is growing fast. It is not yet at the level of ChatGPT, but it is not far behind either.
Elicit
Elicit specializes as a research assistant. It has a simple chat interface, but a good amount of knowledge hidden beneath it.
Learnt
Learnt specializes in the world of education. It is a wonderful tool for generating educational content.
Prompt Engineering
ChatGPT is available to the whole world, and if all goes well, it will be available for years to come. It does not differentiate between users; the response is generated based on the state of the model at that time and on the prompt provided by the user. The prompt, then, is the one part we control.
It is very simple. When you interview a wizard, the benefit you can extract is restricted by the questions you ask. The same holds true when we work with large language models like ChatGPT. We can get a better and more useful response when we ask a question in the “right way”. This can get very intricate, and we can build really complex prompts to get amazing output.
Let us demonstrate this with a trivial example. I asked two questions.
What is Prompt Engineering?
Prompt engineering is the process of designing and refining input prompts for AI language models like ChatGPT in order to obtain more accurate, relevant, and useful responses. As these AI models generate text based on the input they receive, the quality and structure of the input prompt play a crucial role in determining the output’s usefulness.
In prompt engineering, you carefully craft and modify the input query to guide the AI model toward generating the desired output. This process often involves experimenting with different phrasings, adding context, or specifying constraints to obtain a response that better matches the intended purpose.
As a technical architect explain to developers, what is prompt engineering.
Prompt engineering is a crucial aspect of working with AI language models like ChatGPT. It involves the art of carefully designing and refining input prompts to obtain more accurate, relevant, and useful responses from the AI system. The quality and structure of the input prompts you provide play a significant role in determining the output’s usefulness.
In the context of developers, it’s essential to understand that when interacting with AI models, crafting the right input prompt can make a significant difference in the generated output. Think of prompt engineering as fine-tuning a query to get the most precise and relevant information from a database.
Some key techniques for effective prompt engineering include:
1. Making the prompt more explicit: Be clear and specific about the information or response you’re looking for to guide the AI model in generating a relevant output.
2. Providing context: Supply relevant background information or additional details that can help the AI model better understand the problem or scenario.
3. Restricting the output length: Set a limit on the number of words or sentences the AI should generate to maintain focus and conciseness in the response.
4. Using examples: Demonstrate the desired output format or content by providing examples that the AI can use as a reference when generating a response.
As developers, understanding and applying prompt engineering techniques can help you obtain more meaningful and accurate information from AI language models, ultimately improving the effectiveness of your AI-driven solutions.
As we can see, the second question sets some context, which allows a more detailed response. The more information we give in the question, the more accurate the response we get.
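We can run the same experiment in code. This is a minimal sketch using the `openai` Python package (the pre-1.0 interface); the API key is a placeholder and the model name is only an example, so adjust both to your account and the current API version.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; use your own key

def ask(prompt):
    """Send a single-prompt chat request and return the reply text."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # example model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# A bare question versus the same question with a role, an audience,
# and a length constraint.
plain = ask("What is prompt engineering?")
contextual = ask(
    "As a technical architect, explain to developers "
    "what prompt engineering is. Keep it under 150 words."
)

print(plain)
print(contextual)
```

Try varying the role, the audience, and the constraints in the second prompt, and watch the tone and depth of the answer change.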
Nobody has the time to learn everything by themselves. All over the net, you can find hundreds of ready-made prompts that make ChatGPT produce useful content.
If you want to understand the real power of prompts, try asking this question to ChatGPT:
Please give an example of how prompt engineering can have a significant impact on the response from ChatGPT. Show how to use complex prompts to demonstrate different concepts of prompt engineering. How do I use the power of complex prompts?
Don’t worry if that question does not give you all the detail you need. In a later section, we will walk through a detailed set of prompt engineering techniques to build a customized chatbot.