Since finishing my master’s degree in April, I have spent my free time reading about Large Language Models (LLMs) and prompting. My computer science specialization is in computing systems, but I took a few ML classes and come from an economics background, which put me in a good position to self-study advanced ML topics and understand how large language models are built under the hood. As part of my research, I’d like to share my findings through a series of articles based on a prompt engineering workshop I gave while volunteering for Birthright Armenia. This series will also go deeper and cover additional topics.
What is Prompt Engineering?
Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with LLMs to steer their behavior toward desired outcomes without updating the model weights. It is a new empirical science, and the effect of prompt engineering methods can vary widely among models, thus requiring heavy experimentation and heuristics.(1)
By methods, I mean using natural language, programming, or a combination of the two to design “prompts” that interact with large language models.
One can think of it as a new type of programming where the skill lies in curating prompts.
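As a toy illustration (the task and wording here are my own, not drawn from any paper), curation can be as simple as rewriting the same request with a role, an output format, and constraints:

```python
# Two prompts for the same summarization task. The second, "curated" version
# specifies a role, format, and constraints, which typically yields more
# reliable output. Both strings are illustrative examples, not benchmarks.

naive_prompt = "Summarize this article."

curated_prompt = (
    "You are an editor for a technical blog. Summarize the article below "
    "in exactly three bullet points, each under 20 words, written for "
    "readers with no machine learning background.\n\n"
    "Article:\n{article_text}"  # template slot filled in before sending
)
```

The second prompt asks for the same thing but leaves far less to chance about audience, length, and structure.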
As Stephen Wolfram noted in a podcast with Lex Fridman, LLMs give people access to computation through a linguistic interface that uses natural language. The fascinating implication is that this applies to computation across all layers of abstraction, from higher-level programming languages down through operating systems and hardware.
“Computation” might sound broad, and it takes some imagination to grasp the breadth of what ChatGPT can do. Since the advent of ChatGPT and the productization of other LLMs, easy access for users worldwide has revealed these models’ capabilities across many domains: programming, creative writing, information retrieval, conversation, summarization, and more. Beyond the obvious uses, though, instructing an LLM to complete a complex task is not trivial, especially if you want the best-quality response. The art and science behind it are worth exploring; empirical results already show that prompt format affects outcomes. Here is an example.
A recent article by Anthropic (an AI company developing LLMs) shows that you can increase an LLM’s recall by more than 70 percentage points with one addition to your prompt: “Here is the most relevant sentence in the context:”
That single sentence was enough to raise Claude 2.1’s score from 27% to 98% on a retrieval evaluation over its 200K-token context window.
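To make this concrete, here is a minimal sketch of the technique, assuming Claude’s classic Human/Assistant prompt format; the context string and question below are hypothetical placeholders, not from Anthropic’s evaluation:

```python
# A minimal sketch of Anthropic's trick: pre-seeding the Assistant turn so
# the model begins its answer by quoting the most relevant sentence it found.
# `long_context` and `question` are hypothetical placeholders.

long_context = "..."  # e.g., a very long document pasted into the prompt
question = "What was the company's revenue in Q3?"

# Baseline prompt: the model answers directly.
baseline_prompt = f"\n\nHuman: {long_context}\n\n{question}\n\nAssistant:"

# Modified prompt: one added sentence opens the Assistant's response,
# steering the model to locate supporting text before answering.
modified_prompt = (
    f"\n\nHuman: {long_context}\n\n{question}"
    f"\n\nAssistant: Here is the most relevant sentence in the context:"
)
```

In newer chat-style APIs, the same idea corresponds to prefilling the assistant message: the model continues from the seeded sentence, first quoting supporting text and then answering, rather than starting its reply from scratch.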
How you write your prompt to the LLM matters; a carefully crafted prompt can achieve better results than one thrown together. But what exactly are these concepts, prompt and prompt engineering, and how do you improve what you send to the LLM? Questions like these are what this series of articles on Medium will help you answer.
If you are on Quora, please follow this space I created to keep up with LLMs and Prompt Engineering.
https://promptengineeringhub.quora.com/