What is Prompt Engineering

Imagine you wanted to explain the concept of social media to your grandparents. A “tweet” is what a bird does. A “reel” holds motion picture film. “Shorts” and “Clips” probably have nothing to do with recording videos. “Going live” is something that’s shouted in the production room of a news channel. They don’t understand what you are talking about because they have no context for how those things are used. You could open up the app and show them examples of all these things, but it’s going to take a lot of examples for them to understand the purpose. (Admittedly, the “purpose” of social media can be debated.)

Ideally you want to give your grandparents some basic understanding of what a given social media concept is trying to accomplish. With that, they can look at other posts of the same kind and have some grounding in what’s being accomplished. For example, being that they are your grandparents, they probably know what you have been up to: hobbies, travel, school, whatever. If you show them your Twitter feed, they will probably quickly identify with the photos and quotes you’ve posted. Next you could show them the Weather Channel’s feed, and they will probably quickly identify with weather events that have happened in your area. Finally, (carefully!) show your grandparents your best friend’s Twitter feed. They might not know that person very well, but with the context from the other feeds, they have an expectation of what’s there. Of course, they see the photo with you in the background doing unspeakable acts and now the conversation takes a bad direction… but you get the idea. It’s about providing context along with the content so they can understand and respond in a meaningful way. If they understand the purpose, then they have the context to possibly use it, or at least understand why you take so many selfies...

Like teaching your grandparents about social media, prompt engineering is the art of crafting questions for an AI model and the science of finding just the right amount of context. The possibilities are endless, and the smallest change in wording can have a big impact on the response. If I prompt a model with “why did the chicken cross the road”, it might assume I am trying to tell a joke because there’s very little context to work with. Instead, if I prompt the model with “I'm already on the other side. Why did the chicken cross the road?”, the model can be a little more cheeky and suggest that maybe the chicken wanted to join me on the other side.

Every time you prompt an AI model for a completion, you are starting with no context (no history). Large language models are stateless, meaning they don’t remember any of your previous prompts. If you don’t include context, you’re going to get a slightly confusing, not-so-helpful response. Asking “what happened on January 1st” is going to give you a generic “New Year’s Day” response because I didn’t give it any context for what information I was looking for. Of course I know that January 1st is New Year’s Day, but I didn’t tell the model that. If I change my question to “what happened on New Year’s Day in 2000”, I am showing that I already know it’s New Year’s Day and I am also asking about a certain year. With this context the model can provide a list of big events that happened that day, which is a little more helpful.
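Because the model is stateless, any “memory” has to be carried by the caller and re-sent on every request. Here’s a minimal sketch of that idea; the `build_request` function and the role/content message format are illustrative (loosely following the chat-message convention many model APIs use), not a specific vendor’s API:

```python
# Sketch: a stateless model only sees what you send it this turn, so the
# caller must keep the conversation history and resend it every time.
def build_request(history, new_user_message):
    """Return the full message list the model would see for this turn."""
    return history + [{"role": "user", "content": new_user_message}]

history = []
request = build_request(history, "What happened on January 1st?")
# The model sees exactly one message -- no context at all.
assert len(request) == 1

# After the model replies, the caller appends both turns to history...
history = request + [{"role": "assistant", "content": "It's New Year's Day."}]
# ...so the next request carries the earlier context along with it.
request = build_request(history, "What happened on that day in 2000?")
assert len(request) == 3
```

Chat interfaces like ChatGPT do exactly this bookkeeping behind the scenes, which is why they *feel* stateful even though the underlying model is not.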

Understanding prompt engineering

If your first interaction with a model like GPT was through ChatGPT, then you might be inclined to think of prompt engineering as a question-and-answer relationship: ask the model a question and get an answer. In fact, AI models don’t technically answer questions; they complete thoughts. Prompting a model with “it’s a lovely day.” versus “is it a lovely day?” will get different responses, but not because one is a question and the other a statement; punctuation alone doesn't have much influence. To complete a thought, a model tries to find the (statistically) best-fitting next set of words, then the next set of words after that, and so on. The response is called a “completion” because the model is trying to complete the thought.

You want to give a model just the right amount of information to achieve the desired completion. Too much context and it will have a hard time finding possible answers. Too little context and the answer will be very broad. The art of prompt engineering is finding that balance so the model completes thoughts in a consistent way.

One option is to just start writing a story about what you’re looking for. Obviously that’s going to be wordy and probably not a good use of time. Most public LLMs either charge by prompt length or limit how long a prompt can be, so your prompt needs to get to the point quickly!

Applications & use cases of prompt engineering

There are techniques you can use to help craft a prompt. As you can imagine the list is always expanding. Here are a few ideas:

Chain of thought

Instead of allowing any kind of completion, say you wanted a model to provide every completion using a specific pattern. You can provide a few “thoughts” ahead of the actual question in the prompt to influence how the model will respond. For example, let’s use a simple sentiment completion.

Goal: Determine the sentiment of a statement about weather

Your prompt:

Complete the statement with a sentiment of sad, fearful, happy, joyous, or excited.

Statement: Rain storms make me cry
Sentiment: sad

Statement: Tornados scare me
Sentiment: fearful

Statement: Lightning is exciting


The model will complete with “excited”.

In the previous prompt, we informed the model that it could only respond with certain sentiment words. We also provided examples of the pattern that we want the completion to follow. This influenced the model to respond in a specific way.
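Prompts like this are usually assembled programmatically rather than typed by hand. A sketch of a helper that builds the sentiment prompt above from a list of labeled examples (the function name and structure are my own, not from any particular library):

```python
# Sketch: assemble a few-shot prompt from labeled examples so the model
# continues the pattern. The labels and examples mirror the text above.
ALLOWED = ["sad", "fearful", "happy", "joyous", "excited"]

def few_shot_prompt(examples, statement):
    lines = [f"Complete the statement with a sentiment of {', '.join(ALLOWED)}.", ""]
    for text, label in examples:
        lines += [f"Statement: {text}", f"Sentiment: {label}", ""]
    # End with the new statement and an open label for the model to complete.
    lines += [f"Statement: {statement}", "Sentiment:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    [("Rain storms make me cry", "sad"), ("Tornados scare me", "fearful")],
    "Lightning is exciting",
)
print(prompt)
```

Ending the prompt on the bare `Sentiment:` label is the key trick: the statistically best-fitting next word is one of the sentiment labels the examples demonstrated.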


In a more advanced example, say you are building a customer service chatbot. In this case you want to be very guarded about how, and with what information, the model is allowed to respond. You don’t want to send your customers to your competitors, for example, and you don’t want to include personally identifiable information. In this case, your prompts could introduce the idea of roles using an example customer service conversation.

Goal: Help a customer find an answer to their question about our products.

Your prompt:

  • You are a customer service representative at a company that makes widgets for teens.
  • Your answers are friendly and attempt to direct someone to the product’s documentation.
  • If you don’t know the answer, then you direct the customer to the company's support form where they can open a ticket.

Customer service rep: Hi there! How can I help you?

Customer: Where can I find more information about the features of your widgets?

Customer service rep: Please visit the features area of our documentation to learn more about the widgets.


Customer service rep: Hi there! How can I help you?

Customer: My widget is broken. Will you replace it?

Customer service rep: There’s a chance. Please review our replacement policy in our documentation.

Customer: I already did that, will you replace it?!

Customer service rep: We’ll need a little more information about what you purchased and when. Please answer the question on our support form and submit. Someone will follow up shortly.


Customer: Where can I find warranty information?


The model will complete with something like:

Thank you for asking about the warranty information. You can find detailed information about our product warranties in the warranty section of our documentation. If you have any specific questions or concerns regarding the warranty, feel free to ask, and I'll be glad to assist you further.

The model now has an idea of a typical customer service conversation and will respond in an appropriate manner. If it’s ever asked a question that it can’t find a good completion for, it will default to directing someone to open a ticket.
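Many chat-style model APIs formalize this exact idea with message roles: the guard-rail instructions go in a “system” message and the example dialogue becomes alternating user/assistant turns. A sketch of that shape (the helper function and wording are my own; the role names follow the common chat-message convention, not a specific vendor’s SDK):

```python
# Sketch: the role instructions become a "system" message and the example
# conversation becomes alternating turns, the shape most chat APIs expect.
SYSTEM = (
    "You are a customer service representative at a company that makes "
    "widgets for teens. Your answers are friendly and direct people to the "
    "product's documentation. If you don't know the answer, direct the "
    "customer to the company's support form to open a ticket."
)

def build_messages(example_turns, customer_question):
    messages = [{"role": "system", "content": SYSTEM}]
    for role, content in example_turns:
        messages.append({"role": role, "content": content})
    messages.append({"role": "user", "content": customer_question})
    return messages

msgs = build_messages(
    [("user", "Where can I find more information about your widgets?"),
     ("assistant", "Please visit the features area of our documentation.")],
    "Where can I find warranty information?",
)
assert msgs[0]["role"] == "system" and msgs[-1]["role"] == "user"
```

The system message is re-sent with every request (remember, the model is stateless), which is what keeps the bot “in character” across an entire conversation.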

Generative Knowledge

Continuing with the previous example of building a customer service chat bot, let’s complicate things. Let’s attempt to build a customer service bot for an airline. Imagine the possibilities! It would need context about the customer’s flight, all other flights, upgrade policies, the weather between the customer’s origin and destination, and many other things.

Goal: Help a customer while traveling with questions about their flight and upgrades.

Your prompt:

  • You are a customer service representative for an airline.
  • Your answers are friendly and specific to each customer’s travel itinerary.
  • The current customer is flying from Louisville, Kentucky, USA to Chicago, Illinois, USA.
The flight has been delayed by 30 minutes but should arrive at its destination on time.
  • There is a large rain storm that is affecting the flight path and it’s unknown how it could delay the flight.
  • There are 3 alternate flights that have multiple stops in other cities.
  • If the customer is interested in any of the alternate flights, direct them to a service desk at the airport.

Customer: Am I going to make it to Chicago on time?


The model will complete with something like:

Hello! I understand your concern about making it to Chicago on time. Due to a rainstorm along the flight path, our current flight has been delayed by 30 minutes. However, we anticipate arriving at our destination on time despite the delay. Rest assured, our team is closely monitoring the weather conditions to ensure a safe and efficient journey.

If you prefer to explore alternative options, we do have three flights available with multiple stops in other cities. For more details and assistance with rebooking, I recommend heading to our service desk at the airport. Our staff there will be more than happy to help you find the best option that suits your travel needs.

If you have any other questions or need further assistance, feel free to ask. Thank you for your understanding, and we'll do our best to ensure a smooth travel experience for you.

The prompt gave the model enough context to be helpful for that specific customer’s question. Obviously this example prompt could be expanded quite a bit, but it illustrates how a model can generate knowledge with the right context.

The role of semantic embeddings and vector databases in prompt engineering

Building a prompt is only part of the prompt engineering role. Consider the previous generative knowledge example. There was quite a bit of real-time information in that prompt that the model wouldn’t know anything about. Other data sources would have had to be queried before assembling the prompt to have this knowledge. These queries introduce complexity and latency in the time it takes to get a result. If you wanted to find the slowest way to query data from a database, you would include a bunch of “joins” and compare text to text; it’s painfully slow, if you’ve never tried it. Ideally you want queries to compare numbers, which computers are very efficient at.

For a Prompt Engineer, almost all work revolves around natural language. That is a fancy way of referring to clear text written in a language like English. Most of a Prompt Engineer's time is spent creating a prompt template and figuring out the queries to run that fill the template with context. The challenge is, the queries are going to be running on database tables of many different sizes, so a Prompt Engineer needs to find ways of ensuring a very speedy query so that the prompt can be built and sent on to the model.

Consider the text “mary had a little lamb”. This is unstructured data because it’s just a blob of bits. Its type (string) is the simplest, loosest way of storing information. Running a query like “select * from a-table where text like ‘%little lamb%’” is going to be very inefficient. If you’re using that to build a prompt, your users are going to be waiting… a while. To speed things up, you want to convert that unstructured text to a known structure, ideally as numbers; that makes for a significantly faster query. Semantic embedding is that conversion of unstructured text to structured values. To embed text is to apply a known algorithm to a string of text and output a collection of numbers that represents that text. The collection of numbers is known as a vector. Referring to text as an “embedding” means it has been converted into numerical values. The chosen algorithm that does the conversion makes sure all vectors follow the same semantics. As long as all the text is embedded using the same semantics, it is all searchable.
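To make this concrete, here is a toy sketch. The vectors below are made up for illustration (a real system would get them from an embedding model, and they’d have hundreds or thousands of dimensions); the point is that once text is a vector, “how similar are these?” becomes pure arithmetic:

```python
import math

# Sketch: an "embedding" is just a fixed-length list of numbers produced
# by some model. These three vectors are invented for illustration only.
emb = {
    "mary had a little lamb":  [0.91, 0.10, 0.33],
    "the lamb was small":      [0.85, 0.15, 0.40],
    "stock prices fell today": [0.02, 0.95, 0.11],
}

def cosine_similarity(a, b):
    """Standard cosine similarity: 1.0 means same direction (same meaning-ish)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Comparing numbers is fast, and semantically similar texts land close together.
lamb = cosine_similarity(emb["mary had a little lamb"], emb["the lamb was small"])
stocks = cosine_similarity(emb["mary had a little lamb"], emb["stock prices fell today"])
assert lamb > stocks
```

Note that the comparison only works because all three texts were embedded with the same (imaginary) algorithm; vectors from different embedding models are not comparable.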

After converting the text to numbers you need to store them, but not in the traditional way. Traditionally you store data in a table that follows a schema. That schema declares what types of data are allowed in each column, as well as other limitations. If you were to query that table looking for data similar to some criteria, you would “select something from a-table where this=that and another is like those”. This is a comparison query, and the result would be rows from the table that match that criteria. When you search a vector database, you are using a (potentially) large collection of numbers to search an even larger collection of numbers. Efficiency is key during a query like this. A vector database doesn’t do a comparison; instead it does a nearest-neighbor search. That means it tries to find stored vectors that very closely match the provided vector. The result is a fast query over a lot of data in a very efficient way. The returned vectors can then be converted back to text and included in the prompt template.
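A nearest-neighbor search can be sketched in a few lines. Real vector databases use approximate indexes to stay fast over millions of vectors, but this brute-force version (with invented two-dimensional vectors) shows the idea:

```python
import math

# Sketch: nearest-neighbor lookup over stored vectors. A real vector
# database replaces this brute-force scan with an approximate index.
def nearest(query, store, k=1):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(store.items(), key=lambda item: dist(query, item[1]))
    return [text for text, _ in ranked[:k]]

# Toy store: each document's (made-up) embedding keyed by its text.
store = {
    "replacement policy": [0.9, 0.1],
    "warranty details":   [0.8, 0.2],
    "company history":    [0.1, 0.9],
}

# A query vector embedded from e.g. "will you replace my widget?" lands
# near the policy documents, not the history page.
assert nearest([0.82, 0.18], store, k=1) == ["warranty details"]
```

The returned texts are exactly what gets pasted into the prompt template as context.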

When speaking about embedding text, there are two sides to the story: new data is constantly being vectorized and saved to the vector database, while at the same time queries against the vector database are being performed. The result is a very fast query on large amounts of data. This helps balance putting the right context in a prompt so the model can provide meaningful completions. In the generative knowledge example above, the original prompt would actually have placeholders in it to hold query results.

An example of templated prompt:

  • You are a customer service representative for an airline.
  • Your answers are friendly and specific to each customer’s travel itinerary.
  • The current customer is flying from {{ origin }} to {{ destination }}. {{ flightStatusChanges }}.
  • There is {{ latestWeatherEvent }} that is affecting the flight path and it’s unknown how it could delay the flight.
  • There are {{ numAlternateFlightsToOrigin }} alternate flights that have multiple stops in other cities.
  • If the customer is interested in any of the alternate flights, direct them to a service desk at the airport.
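Rendering such a template is just string substitution. A minimal sketch using Python’s `re` module (production systems typically reach for a templating library like Jinja instead; the placeholder names mirror the template above):

```python
import re

# Sketch: fill {{ placeholder }} slots with query results at request time.
TEMPLATE = (
    "The current customer is flying from {{ origin }} to {{ destination }}. "
    "{{ flightStatusChanges }}."
)

def render(template, values):
    # Replace each {{ name }} (whitespace-tolerant) with its looked-up value.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: values[m.group(1)], template)

# In a real system these values would come from the flight-status and
# weather queries described above; here they are hard-coded.
context = {
    "origin": "Louisville, Kentucky, USA",
    "destination": "Chicago, Illinois, USA",
    "flightStatusChanges": "The flight has been delayed by 30 minutes",
}
print(render(TEMPLATE, context))
```

Every customer request re-runs the queries, re-renders the template, and sends the fresh prompt to the model, which is how a stateless model appears to “know” real-time information.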

Best practices for effective prompt engineering

Prompt engineering is a mix of art and science. The art is creating a prompt that includes just the right amount of contextual information and instructs a model in just the right way to provide consistent completions. The science is creating an efficient system of queryable data to help build that context. As you can imagine, prompt engineering is not something that you will instantly learn by reading books: every prompt acts a little differently, and every model does completions in a different way. An effective prompt engineer has a background in data querying and a good feel for human-computer interaction.


Want to learn more about prompt engineering?

There’s no better way to learn about something than to get hands-on. While this may sound complex, DataStax takes care of most of it for you with a fully integrated solution that provides all of the pieces you need for contextual data, from the nervous system built on data pipelines to embeddings, all the way to core memory storage and retrieval, access, and processing, in an easy-to-use cloud platform.

DataStax, is a registered trademark of DataStax, Inc.. Apache, Apache Cassandra, Cassandra, Apache Pulsar, and Pulsar are either registered trademarks or trademarks of the Apache Software Foundation.
