Learning GPT: PAL

References

https://arxiv.org/pdf/2211.10435.pdf

PAL: Program-aided Language Models

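In short, PAL means a language model that is aided by programs. The example further down shows what that assistance looks like in practice.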

Summary

The most important part of using ChatGPT is the prompt.

There are three common prompting strategies:

  • Writing the prompt directly yourself
  • chain-of-thought (CoT)
  • PAL prompting

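Below is a direct comparison of CoT and PAL.

The left side of the comparison uses CoT, the approach we usually reach for:
the instruction "Please analyze step by step and give a detailed explanation." followed by the question. GPT works out on its own how to solve the problem and then states the answer.

The right side uses PAL: the natural-language logic is turned into an executable program, and the result is obtained by running that program.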
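To make the difference concrete, here is a minimal sketch that handles the same question both ways. Both "model outputs" are hand-written, hypothetical stand-ins (no model was called), and the helper names `answer_from_cot` and `answer_from_pal` are made up for illustration. The point is where the arithmetic happens: in the CoT case the model computes and we only parse its final number, while in the PAL case the interpreter computes.

```python
import re

# Hypothetical model outputs for the same question; written by hand for illustration.
cot_output = (
    "Five bagels at $3 each cost 5 * 3 = 15 dollars. "
    "Olivia started with 23 dollars, so 23 - 15 = 8 dollars remain. "
    "The answer is 8."
)

pal_output = '''
def solution():
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    return money_initial - money_spent
'''


def answer_from_cot(text: str) -> int:
    """CoT: the model did the arithmetic; we only parse the final number."""
    match = re.search(r"The answer is (-?\d+)", text)
    if match is None:
        raise ValueError("no final answer found")
    return int(match.group(1))


def answer_from_pal(code: str) -> int:
    """PAL: the model only wrote the program; the interpreter does the arithmetic."""
    namespace: dict = {}
    exec(code, namespace)  # only run model-written code in a sandbox
    return namespace["solution"]()


cot_answer = answer_from_cot(cot_output)
pal_answer = answer_from_pal(pal_output)
print(cot_answer, pal_answer)
```

If the model slips on a calculation in its CoT text, the parsed answer is simply wrong. With PAL, a correct program always yields the correct number.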

The paper emphasizes that PAL is more accurate than the first two strategies on math, algorithmic, and comparison-logic tasks.

Example

The following langchain example helps make PAL concrete.

from langchain.chains import PALChain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0, max_tokens=512) # type: ignore


pal_chain = PALChain.from_math_prompt(llm, verbose=True)

question = "Jan has three times the number of pets as Marcia. Marcia has two more pets than Cindy. If Cindy has four pets, how many total pets do the three have?"

answer = pal_chain.run(question)
print("answer: ", answer)
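For this question, the model would be expected to return a program shaped roughly like the one below. This is a hand-written guess at such an output, not a captured response, and the variable names are assumptions. Running it shows where the final answer comes from.

```python
# Hypothetical program of the kind the model might generate for the pets question.
def solution():
    """Jan has three times the number of pets as Marcia. Marcia has two more pets than Cindy. If Cindy has four pets, how many total pets do the three have?"""
    cindy_pets = 4
    marcia_pets = cindy_pets + 2
    jan_pets = marcia_pets * 3
    total_pets = cindy_pets + marcia_pets + jan_pets
    result = total_pets
    return result


print(solution())  # 28
```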

Under the hood, it ships with a built-in template that is used to obtain the executable program:

# flake8: noqa
from langchain.prompts.prompt import PromptTemplate

template = (
'''
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?

# solution in Python:


def solution():
    """Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"""
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    money_left = money_initial - money_spent
    result = money_left
    return result





Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?

# solution in Python:


def solution():
    """Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?"""
    golf_balls_initial = 58
    golf_balls_lost_tuesday = 23
    golf_balls_lost_wednesday = 2
    golf_balls_left = golf_balls_initial - golf_balls_lost_tuesday - golf_balls_lost_wednesday
    result = golf_balls_left
    return result





Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?

# solution in Python:


def solution():
    """There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?"""
    computers_initial = 9
    computers_per_day = 5
    num_days = 4  # 4 days between monday and thursday
    computers_added = computers_per_day * num_days
    computers_total = computers_initial + computers_added
    result = computers_total
    return result





Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?

# solution in Python:


def solution():
    """Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?"""
    toys_initial = 5
    mom_toys = 2
    dad_toys = 2
    total_received = mom_toys + dad_toys
    total_toys = toys_initial + total_received
    result = total_toys
    return result





Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?

# solution in Python:


def solution():
    """Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?"""
    jason_lollipops_initial = 20
    jason_lollipops_after = 12
    denny_lollipops = jason_lollipops_initial - jason_lollipops_after
    result = denny_lollipops
    return result





Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?

# solution in Python:


def solution():
    """Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?"""
    leah_chocolates = 32
    sister_chocolates = 42
    total_chocolates = leah_chocolates + sister_chocolates
    chocolates_eaten = 35
    chocolates_left = total_chocolates - chocolates_eaten
    result = chocolates_left
    return result





Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?

# solution in Python:


def solution():
    """If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?"""
    cars_initial = 3
    cars_arrived = 2
    total_cars = cars_initial + cars_arrived
    result = total_cars
    return result





Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?

# solution in Python:


def solution():
    """There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?"""
    trees_initial = 15
    trees_after = 21
    trees_added = trees_after - trees_initial
    result = trees_added
    return result





Q: {question}

# solution in Python:
'''.strip()
+ "\n\n\n"
)
MATH_PROMPT = PromptTemplate(input_variables=["question"], template=template)
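To see what the model actually receives, the sketch below fills a shortened, one-example stand-in for the template using plain `str.format`. The assumption here is that `PromptTemplate` substitutes `{question}` in an equivalent way for a simple placeholder like this one. `short_template` and the sample question are made up for illustration.

```python
# Shortened one-shot stand-in for the template above (the real one has eight examples).
short_template = (
    '''
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?

# solution in Python:


def solution():
    cars_initial = 3
    cars_arrived = 2
    result = cars_initial + cars_arrived
    return result


Q: {question}

# solution in Python:
'''.strip()
    + "\n\n\n"
)

# Hypothetical question, used only to show the substitution.
question = "Cindy has four pets and Marcia has two more. How many pets do they have together?"
prompt = short_template.format(question=question)
print(prompt)
```

The filled prompt ends right after `# solution in Python:` and three newlines, which is exactly where each worked example begins its `def solution():`. The few-shot pattern therefore nudges the model to continue with a function of the same shape.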

PALChain simply takes the Python code it gets back, runs it in a PythonREPL, and reads off the result:

def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
    # Ask the LLM for a Python program, stopping at the chain's stop sequence.
    llm_chain = LLMChain(llm=self.llm, prompt=self.prompt)
    code = llm_chain.predict(stop=[self.stop], **inputs)
    self.callback_manager.on_text(
        code, color="green", end="\n", verbose=self.verbose
    )
    # Run the generated code followed by the expression that retrieves the answer.
    repl = PythonREPL(_globals=self.python_globals, _locals=self.python_locals)
    res = repl.run(code + f"\n{self.get_answer_expr}")
    output = {self.output_key: res.strip()}
    if self.return_intermediate_steps:
        output["intermediate_steps"] = code
    return output
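The execution step can be approximated with the standard library alone. The sketch below is not langchain's `PythonREPL`. It is a hypothetical helper, `run_and_capture`, which assumes the REPL executes the code and returns whatever was printed, and that the answer expression is `print(solution())`. The excerpt above only shows that some `get_answer_expr` is appended, so both points are assumptions.

```python
import contextlib
import io


def run_and_capture(code: str) -> str:
    """Execute code in a fresh namespace and return everything it printed."""
    buffer = io.StringIO()
    namespace: dict = {}
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # only safe for trusted or sandboxed code
    return buffer.getvalue()


# Hypothetical model output, standing in for `code` in `_call`.
generated_code = '''
def solution():
    golf_balls_initial = 58
    golf_balls_lost_tuesday = 23
    golf_balls_lost_wednesday = 2
    result = golf_balls_initial - golf_balls_lost_tuesday - golf_balls_lost_wednesday
    return result
'''

# Assumed answer expression, appended the same way `_call` appends `get_answer_expr`.
answer_expr = "print(solution())"

res = run_and_capture(generated_code + f"\n{answer_expr}")
answer = res.strip()  # mirrors `res.strip()` in `_call`
print(answer)
```

Because the result comes back as captured text, the answer is a string (`"33"` here), which matches the `Dict[str, str]` return type of `_call`.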