Published: May 30, 2023

Chain of Thought Prompting

In Chain-of-Thought prompting we generate a series of concise, logical steps, known as reasoning chains, to guide the model along. If you are somewhat familiar with LangChain and its agents, Chain-of-Thought is pretty much how they work (together with other prompt engineering techniques).

What is Chain-of-Thought prompting?

Chain-of-Thought prompting is a technique that enhances complex problem-solving by breaking down the process into intermediate reasoning steps. It’s like creating a roadmap for the model to follow, guiding it toward the correct answer.

How does Chain-of-Thought work?

You provide an example prompt together with an answer that spells out the reasoning behind it. By outlining these reasoning steps, you help the model better understand and solve the task.

When to use Chain-of-Thought Prompting?

You’ll get the most use out of Chain-of-Thought prompting for complicated reasoning tasks with larger and more intricate models.

Elementary tasks may not see any improvement from Chain-of-Thought prompting, so it isn’t worth the extra effort and tokens you’ll have to put in to get your output. As with any machine learning problem, it’s always advisable to start simple and slowly add complexity to your solution to improve it.

Combining CoT with Few-Shot Prompting

CoT can be used alongside few-shot prompting to improve results on tasks that require reasoning before responding. Few-shot prompting involves providing a few examples of the task at hand, which helps the model understand the context better.
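
To make the combination concrete, here is a minimal sketch of how a few-shot CoT prompt could be assembled. The `Demo` class, the `build_cot_prompt` helper, and the "Question:" / "Answer:" layout are my own assumptions for illustration, not a fixed format, and the worked example is made up.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One worked example: a question, its reasoning steps, and the final answer."""
    question: str
    steps: list[str]
    answer: str


def build_cot_prompt(demos: list[Demo], question: str) -> str:
    """Format the worked examples, then the new question with its answer left open."""
    blocks = []
    for demo in demos:
        reasoning = "\n".join(demo.steps)  # one reasoning step per line
        blocks.append(f"Question: {demo.question}\n{reasoning}\nAnswer: {demo.answer}")
    # The model is expected to continue with its own reasoning and answer.
    blocks.append(f"Question: {question}")
    return "\n\n".join(blocks)


# A hypothetical demonstration, written by hand for this sketch.
demos = [
    Demo(
        question="A baker has 3 trays with 4 rolls each and sells 5 rolls. How many rolls are left?",
        steps=[
            "3 trays with 4 rolls each is 3 * 4 = 12 rolls.",
            "After selling 5 rolls, 12 - 5 = 7 rolls remain.",
        ],
        answer="7",
    )
]
prompt = build_cot_prompt(
    demos,
    "A shelf holds 2 rows of 6 books and 3 are borrowed. How many books are left?",
)
print(prompt)
```

The resulting string is what you would send to the model of your choice. Adding more `Demo` entries turns the one-shot prompt into a few-shot one.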

Variations of Chain-of-Thought Prompting

Zero-shot CoT: This is a variation where you add “Let’s think step by step” to the original prompt. This phrase acts as a cue for the model to reason through the problem. The reasoning is the same as what I’ve explained about guiding the model in sequential steps, or giving the model time to “think”, as the DeepLearning.AI course puts it.
Auto-CoT: This automated process uses large language models to generate reasoning chains for demonstrations. It reduces the need for manual effort in creating examples, although it might still have some errors. The process includes question clustering and demonstration sampling to create a diverse range of examples for the model to learn from.
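
The zero-shot variation above is small enough to sketch in a few lines. This only covers zero-shot CoT, not the clustering and sampling of Auto-CoT. The `zero_shot_cot` helper and the "Question:" / "Answer:" layout are assumptions of mine, and the sample question is invented.

```python
COT_CUE = "Let's think step by step."


def zero_shot_cot(question: str, cue: str = COT_CUE) -> str:
    """Append the reasoning cue so the model writes out its steps before answering."""
    return f"Question: {question.strip()}\nAnswer: {cue}"


print(zero_shot_cot("If a train travels 60 km in 45 minutes, what is its average speed in km/h?"))
```

No examples are needed here, which is the whole appeal: the cue alone nudges the model to produce its own reasoning chain.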

Extensions of Chain-of-Thought Prompting

Self-consistency Sampling (Wang et al. 2022a): This method improves the accuracy of reasoning by generating a variety of answers and then choosing the one that appears most often (majority vote).
Tree of Thoughts (Yao et al. 2023): This method extends Chain-of-Thought (CoT) by exploring various reasoning paths at each step. It starts by breaking down the problem into several thought steps, and for each step, it generates multiple thoughts. This process creates a tree-like structure. We can explore this tree using either Breadth-First Search (BFS) or Depth-First Search (DFS) methods (Ah, this is where I finally need my Algorithms and Data Structures knowledge. Thank you, Prof. Steurer!). Each state or step in this process is evaluated by a classifier (using a prompt) or by taking the most common outcome (majority vote).
Ensemble Learning (Wang et al. 2022b): This approach introduces randomness by changing the order of examples or using model-generated rationales instead of human-written ones. The final answer is determined by taking a majority vote from the model outputs.
STaR Method (Zelikman et al. 2022): If training examples only have true answers but no rationales, the STaR (Self-Taught Reasoner) method can be used. It involves asking the model to generate reasoning chains and keeping only those that lead to correct answers. The model is then fine-tuned with these rationales, and the process is repeated until convergence.
Complexity-Based Prompts (Fu et al. 2023): Prompts with more complex reasoning steps can perform better. The number of reasoning steps in the chains measures the complexity. Also, using a newline symbol (\n) to separate reasoning steps works better than using a period (.) or semicolon (;).
Complexity-Based Consistency (Fu et al. 2023): This method prefers complex chains among all the generations by taking a majority vote from only the top complex chains.
CoT Prompts with Complex Examples (Shum et al. 2023): CoT prompts with only complex examples can improve the accuracy of difficult questions, but they perform poorly on simple questions.
Prompt Formatting (Fu et al. 2023): Changing “Q:” to “Question:” in prompts can be helpful. This principle aligns with labeling and being clear and precise with your prompting instructions.
Including Explanations (Ye & Durrett 2022): Including explanations in the prompt has a small to moderate benefit for tasks that involve reasoning over text. However, nonfactual explanations are more likely to lead to incorrect predictions.
Self-Ask Method (Press et al. 2022): This method prompts the model to ask follow-up questions to construct the thought process iteratively. Think perplexity.ai and its co-pilot if you’ve ever used it. If you haven’t, it’s seriously awesome and should be an inspiration for what you can do with good Prompt Engineering. The follow-up questions can be answered using search engine results.
IRCoT and ReAct (Trivedi et al. 2022, Yao et al. 2023): These methods combine iterative CoT prompting with queries to Wikipedia APIs to search for relevant entities and content, which is then added back into the context.
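
Two of the voting ideas from the list above, Self-consistency Sampling and Complexity-Based Consistency, can be sketched without calling a model at all. In this minimal sketch the sampled outputs are hard-coded placeholders, not real model generations, and the function names, the `top_k` parameter, and the tie-breaking rule are my own assumptions rather than anything taken from the papers.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def count_steps(chain: str) -> int:
    """Measure complexity as the number of non-empty, newline-separated steps."""
    return sum(1 for line in chain.split("\n") if line.strip())


def complexity_based_vote(samples: list[tuple[str, str]], top_k: int) -> str:
    """Vote only among the top_k samples whose chains have the most steps."""
    ranked = sorted(samples, key=lambda s: count_steps(s[0]), reverse=True)
    return majority_vote([answer for _, answer in ranked[:top_k]])


# Placeholder (reasoning chain, final answer) pairs standing in for
# several sampled generations for the same question.
samples = [
    ("step a\nstep b\nstep c", "18"),
    ("step a\nstep b\nstep c\nstep d", "18"),
    ("step a", "20"),
    ("step a", "20"),
    ("step a\nstep b", "20"),
]

plain = majority_vote([answer for _, answer in samples])
weighted = complexity_based_vote(samples, top_k=3)
print(plain, weighted)
```

With these placeholders the plain vote and the complexity-based vote disagree, because the short chains outnumber the longer ones. Whether the longer chains are actually more trustworthy depends on the model and the task, so treat `top_k` as something to tune.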