Implementing persistent conversations with the Groq API

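The conclusion first:
AI is not as smart as I used to think.

Now the results:
Pros: it gives fairly seamless multi-turn conversation that keeps its context.
Cons: it eats tokens, and it still needs an internet connection and a proxy.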

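Next I plan to deploy a model locally. A quantized 70B model is only about 20-40 GB, so two 24 GB P40 or M40 cards would be enough.
I'm getting ready to buy a motherboard and hope to fit six cards, for 144 GB of VRAM.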
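
The sizing above can be checked with quick arithmetic. The sketch below makes several assumptions: it counts weights only (no KV cache or activation overhead), it uses 24 GB per card (the P40's capacity; the M40 also ships in a 12 GB variant), and the helper names and the bit widths shown are illustrative choices rather than anything the planned setup specifies.

```python
# Rough weight-only memory estimate; real usage is higher (KV cache, activations).
GB = 1e9  # decimal gigabytes, for simplicity


def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GB


def total_vram_gb(cards: int, gb_per_card: int = 24) -> int:
    """Combined VRAM, assuming 24 GB cards such as the P40."""
    return cards * gb_per_card


for bits in (16, 8, 4):
    print(f"70B at {bits}-bit: ~{weight_size_gb(70, bits):.0f} GB")
print("2 cards:", total_vram_gb(2), "GB")
print("6 cards:", total_vram_gb(6), "GB")
```

Under these assumptions a 4-bit 70B model needs roughly 35 GB for weights, which fits in two 24 GB cards, while the unquantized 16-bit weights would need most of the six-card budget.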

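A reminder: an off-brand X99 board is the best choice. Don't use the Gigabyte, EVGA, or ASUS boards; none of them support enabling Above 4G Decoding.
I also recommend an off-brand board that takes DDR3 server memory, because DDR4 is too expensive. If VRAM ever runs short, you can fall back to CPU inference; it will be slower, but at least it runs, right? A dual-socket setup with every thread in use, for example two 2699 V3 CPUs, should be acceptable. Why not the 2696 V3? Money again (I actually own a few, bought when prices were low).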

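Now to the main topic:

1. Groq overview

I won't go into detail here. If you're reading this article, you probably arrived by searching for this keyword. If you're not familiar with Groq, browse the official site: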
https://console.groq.com/playground?model=llama3-70b-8192

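2. Installation and applying for an API key

Installation also follows the official instructions and takes a single command (pip install groq). I use Anaconda and ran it from the command line inside my environment.

Then apply for an API key on the official website.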

3. Example


import os

from groq import Groq

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
    model="mixtral-8x7b-32768",
)

print(chat_completion.choices[0].message.content)

4. The problem with the example

Each call in the example above is a single-turn conversation with no context.
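
The reason is that each request carries only the messages passed in that call; nothing from an earlier call is kept unless it is sent again. A minimal sketch of the difference, using plain lists and no network call (the conversation text and the helper name visible_text are made up for illustration):

```python
# Two independent single-turn requests: the second has no memory of the first.
first_call = [{"role": "user", "content": "My name is Alice."}]
second_call = [{"role": "user", "content": "What is my name?"}]

# A multi-turn request: earlier turns (including the reply) are sent again.
history = [
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Nice to meet you, Alice."},  # made-up reply
    {"role": "user", "content": "What is my name?"},
]


def visible_text(messages):
    """Everything the model can see in one request."""
    return " ".join(m["content"] for m in messages)


print("Alice" in visible_text(second_call))  # False
print("Alice" in visible_text(history))      # True
```

Keeping context therefore comes down to maintaining that growing list and passing it as messages on every call.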

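5. Reference

[Screenshot: Groq playground]

Following the 'view code' button at the bottom right of the screenshot, the implementation is as follows:

Step 1: Import the module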


from groq import Groq

Step 2: Write the function

def ChatContext(prompt: str, history: list, groqClient, modelid='llama3-70b-8192',
                temperature=1, max_tokens=8192, top_p=1, stream=True, stop=None):
    # Record the user's turn so the model sees the whole conversation so far.
    history.append({'role': 'user', 'content': prompt})
    completion = groqClient.chat.completions.create(
        model=modelid,
        messages=history,
        temperature=temperature,
        max_tokens=max_tokens,
        top_p=top_p,
        stream=stream,
        stop=stop,
    )
    if stream:
        # Streaming: join the text deltas, skipping chunks whose content is None.
        resultstr = ''.join(chunk.choices[0].delta.content or ''
                            for chunk in completion)
    else:
        # Non-streaming: the full reply arrives in a single message.
        resultstr = completion.choices[0].message.content
    # Store the reply so the next call carries it as context.
    history.append({'role': 'assistant', 'content': resultstr})
    return resultstr, history
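
Because the function resends the whole history on every call, token usage grows with each turn. One possible mitigation is to cap the history before each call. The sketch below is an assumption and not part of the approach described here: trim_history and max_messages are hypothetical names, and it counts messages rather than tokens, so it only approximates a token budget.

```python
def trim_history(history, max_messages=20):
    """Keep a leading system message (if any) plus the most recent messages.

    Hypothetical helper: it counts messages, not tokens.
    """
    if max_messages <= 0:
        raise ValueError("max_messages must be positive")
    # Preserve a system prompt at position 0, if the conversation has one.
    system = [m for m in history[:1] if m.get("role") == "system"]
    rest = history[len(system):]
    return system + rest[-max_messages:]


# Example: ten alternating turns, trimmed to the last four.
demo = [{"role": "user" if i % 2 == 0 else "assistant", "content": f"turn {i}"}
        for i in range(10)]
trimmed = trim_history(demo, max_messages=4)
print([m["content"] for m in trimmed])
```

An even max_messages keeps user and assistant turns paired when the history alternates strictly. The trade-off is that anything trimmed away is no longer visible to the model.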

Step 3: Set up Groq

api_key='your key'
historyChat=[]
client = Groq(api_key=api_key)

Step 4: Call the function and print the result

promptstr='hello'
contentstr,historyChat=ChatContext(promptstr,historyChat,client)
print(contentstr)

Here is the result: