OpenAI Key

如何获取OpenAI Api Key?

访问https://platform.openai.com/api-keys，即可申请API-Key。

OpenAI提供了哪些资源？

在https://platform.openai.com/docs/models中，列出了可以调用的模型。

截止本文上传时间，OpenAI的模型迭代至了GPT-4，GPT-4o和OpenAI-o1。GPT-4o是目前最强的多模态对话模型，可以处理复杂多轮对话，根据文本、语音、图片、视频对人和周围环境作出回应。OpenAI-o1则是目前最强的推理模型。

Introducing GPT-4o:https://www.youtube.com/watch?v=DQacCB9tDaw

其余有意思的模型还有：

DALL·E：可以根据Prompt生成图片
TTS：可以根据Prompt生成语音
Whisper：将语音转换成文本
Embeddings：将文本转换成数值向量

在https://platform.openai.com/tokenizer中，可以了解ChatGPT的Tokenizer的实现效果。

关于Tokenizer需要注意的是，字符不是token，每一个模型有自己的分词规则。且收费是按照token收费。

此为GPT模型的测试界面。可以看到，Prompt包含SYSTEM和USER，两部分。前者可以为模型设定一些角色，后者则是需要模型回答的内容。

右侧可调节参数有：

Temperature：用于调整随机从生成模型中抽样的程度，因此每次“生成”时，相同的提示可能会产生不同的输出。温度为0则将始终产生相同的输出。温度越高随机性越大，主要用于控制创造力。
Maximum length：用于限制输入序列的最大长度。
Top P：在生成文本等任务中，选择可能性最高的前P%个词采样。例如，如果将Top P参数设置为 0.7，那么模型会选择可能性排名超过 70% 的词进行采样。这样可以保证生成的文本准确性较高，但可能会缺之多样性。相反，如果将Top P参数设置为 0.3，则会选择可能性超过 30% 的词进行采样，这可能会导致生成文本的准确性下降，但能够更好地增加多样性。
Frequency penalty：用于控制生成文本中重复使用某些词语的频率。鼓励使用新词汇，能使生成的文本更加多样化和平衡。
Presence penalty：控制对模型重用任何已经提到的词汇或短语。设置为1.0，适用于探索性会话或引入新想法的任务；设置为0.0，适用于需要强化关键词汇或概念的技术文档或教程。

https://towardsdatascience.com/a-visual-explanation-of-llm-hyperparameters-daf61d3b006e

OpenAI API如何使用？

from openai import OpenAI

api_key = ''

client = OpenAI(api_key=api_key)

# 打印所支持的模型
model_lst = client.models.list()

for model in model_lst:
    print (model.id)

gpt-4-turbo
gpt-4-turbo-2024-04-09
tts-1
tts-1-1106
chatgpt-4o-latest
dall-e-2
whisper-1
gpt-4-turbo-preview
gpt-3.5-turbo-instruct
gpt-4o-2024-08-06
gpt-4-0125-preview
gpt-3.5-turbo-0125
gpt-3.5-turbo
babbage-002
davinci-002
gpt-4o-realtime-preview-2024-10-01
o1-preview-2024-09-12
dall-e-3
o1-preview
gpt-4o-realtime-preview
gpt-4o-mini
gpt-4o-2024-05-13
gpt-4o-mini-2024-07-18
gpt-4o-audio-preview-2024-10-01
gpt-4o-audio-preview
tts-1-hd
tts-1-hd-1106
gpt-4-1106-preview
text-embedding-ada-002
gpt-4o
gpt-3.5-turbo-16k
text-embedding-3-small
text-embedding-3-large
gpt-3.5-turbo-1106
gpt-4-0613
o1-mini
gpt-4
o1-mini-2024-09-12
gpt-3.5-turbo-instruct-0914

简单对话-使用GPT-4-Turbo

# 调用API接口
completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": "你是一名专业的英语助教，给学生提供必要的支持如提供提示、纠正错误等，但对于学生的请求不能直接给出答案，而是要一步步引导学生完成任务。 你仅需要回复跟英语有关的问题，如与英语无关，不要回复。"},
        {"role": "user", "content": "如何学好英语？"},
    ],
    max_tokens = 500,
    temperature=0.7
)

print(completion.choices[0].message.content)

ChatCompletionMessage(content='学好英语可以通过多种方法来实现，关键是要持之以恒并结合多种学习手段。以下是一些有效的建议：\n\n1. **坚持每天学习**：每天花时间学习英语，哪怕只有10分钟，也比每周只学习一次几个小时要有效得多。\n\n2. **多听多说**：尽可能多地听英语材料，如英语歌曲、电影、电视剧或播客。尝试模仿所听到的对话，这不仅可以提高你的听力，还能改善发音和口语流利度。\n\n3. **阅读和写作**：通过阅读书籍、文章或其他任何英文材料来扩展你的词汇和理解力。同时，坚持写英语日记或文章可以加强语法结构和表达能力。\n\n4. **使用应用程序和在线资源**：利用各种在线资源和应用程序，如Duolingo、Babbel、Rosetta Stone等，这些都是学习语言的好工具。\n\n5. **参加语言课程**：如果可能的话，参加英语课程或小组可以提供交流和实践的机会，同时也能得到专业的指导。\n\n6. **和母语为英语的人交流**：与英语为母语的人交谈是提高英语水平的最佳途径之一。你可以通过语言交换找到愿意帮助你的人，同时你也可以教他们你的母语。\n\n7. **持续反馈和修正**：寻求老师或更有经验的人的反馈，并根据他们的建议进行改进。\n\n8. **设置具体目标**：设定清晰的学习目标和计划，比如每天学', role='assistant', function_call=None, tool_calls=None, refusal=None)

图片生成-使用DALL-E-3

response = client.images.generate(
    model = 'dall-e-3',
    prompt = '中国大学生寝室里，四个同学正在一起吃火锅，寝室里的床是木制的，上床下桌，寝室含有一个阳台。',
    size = '1024x1024',
    quality = 'standard',
    n = 1
)
# 返回一个图片连接
image_url = response.data[0].url

图片的风格可选，比如Photo,Cartoon,Illustration。并可以设计对应的Prompt。

def generate_image(prompt, image_type="Cartoon", style="vivid", quality="standard"):
    response = client.images.generate(
        model="dall-e-3",
        prompt=f"{image_type} of {prompt}",
        size="1024x1024",
        quality=quality,
        style=style,
        n=1,
    )
    image_url = response.data[0].url
    return image_url

prompt = "在一个教室里，学生在上数学课。"
image_types = ["Photo", "Cartoon", "Illustration"]

urls_styles = []

for image_type in image_types:
    image_url = generate_image(prompt, image_type)
    urls_styles.append(image_url)

图片推理-使用GPT-4o-mini

# 对于本地图片，首先要将图片转换成base64格式
import base64

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "vision.png"

# Getting the base64 string
base64_image = encode_image(image_path)

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url":  f"data:image/png;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)

print(response.choices[0])

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The image depicts a group of children participating in a tug-of-war event. They are positioned on a track surrounded by buildings. The children are dressed in matching outfits, predominantly in shades of pink. There are adults supervising or participating in the event, and the atmosphere appears energetic and engaged, likely during a school or community sports day.', role='assistant', function_call=None, tool_calls=None, refusal=None))

语音生成-使用TTS-1

speech_file_path = "speech.mp3"
response = client.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="衬衫的价格是九磅十五便士"
)

response.stream_to_file(speech_file_path)