千“垂”百炼: Using Language Models in Specific Domains (1)

WeChat Official Account version: https://mp.weixin.qq.com/s/G24skuUbyrSatxWczVxEAg

This series of articles sticks to a plain-language style, describing everything in terms that are as short, simple and easy to understand as possible. It focuses on work that applies language models to specific (vertical) domains.

Table of Contents:

1 Introduction (←)

  • 1.1 Power of Language Models
  • 1.2 Questions to Ask Before Going Domain-Specific

2 Essential: Domain-specific Training Data

  • 2.1 Attempts in Medical Domain (ChatDoctor)
  • 2.2 Stanford Alpaca: Idea for Obtaining Data
  • 2.3 Self-Instruct: Semi-automatic Data Generation

More (to be confirmed)

1 Introduction

1.1 Power of Language Models

Recently, language models have shown us that they have become much better at responding to human instructions.

Before this, people's chat interactions with AI agents were mostly limited to:

  • Genuine chit-chat (and the quality of the chat was not high)
  • Having the AI complete a specific task (ordering food, booking tickets, Q&A, etc.; this kind of interaction allowed almost nothing unrelated to the task)

Now we can give instructions freely. However varied the instructions are, the language model can usually give a good response, sometimes one that exceeds expectations.

1.2 Questions to Ask Before Going Domain-Specific

This part is a matter of opinion: there are a thousand Hamlets in a thousand people's eyes.

To work out whether it is possible, and necessary, to combine this kind of language model with your own domain-specific business, you may want to ask yourself a few questions first.

1. I'm not short of money. I just want to combine this kind of AI language model with my business by any means I can, and I don't care whether the combination is a natural fit or a forced one. Is that OK?

Yes. If money is no object, you can experiment to your heart's content (envious).

Jokes aside, there are roughly two reasons why it can work:

  • It probably already has domain-specific knowledge. This kind of AI model has learned from vast amounts of material, so whatever your domain is, the model has probably touched on it. Its performance in your domain will not necessarily be poor.

  • What you value is one of its skills. You may not even need the model to have studied material from your domain (in other words, it can help you even if it does not understand the domain). In that case, it depends on which language skills you are after. For example, AI language models are good at summarising text: hand one an arbitrary article from your industry and, although it may not fully understand the content, it can still produce a good-quality brief.

2. Can my domain accept the imperfections of the language model?

Powerful as language models are today, they still have imperfections that deserve attention.

  • It makes mistakes: its answers may contradict the facts. In other words, it may talk nonsense with a perfectly straight face.

  • It is non-deterministic: a language model can answer the same question differently each time. Liking one of its answers does not mean every answer will satisfy you.

  • It is hard to "teach": many vendors currently offer language models through an interface, but we can only use these models and cannot "teach" them directly. If the model performs poorly in our own domain, there is very little we can do about it in the short term.

  • It is not flexible: even if you have your own language model and can "teach" it as much as you like, modifying, correcting or adjusting its memory and skills is not easy. You may have to "teach" it many times and show it many examples before it remembers your instructions. Even if it says it remembers, only rigorous testing can tell you whether it really does, whether it has forgotten something else in the process, and whether its performance meets expectations every time after the training. In short, training a model is still very different from training a real human.

  • It brings additional costs: if you use the language model through a third-party interface, you will be billed (generally, the more data you send through the interface, the higher the bill); if you deploy the model yourself, you need to buy the hardware and software to run it. Owning a language model is not the end of the story either: you still need to invest manpower, money and time in working out how to integrate the model with your business.
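The point above that the same question can get different answers is easier to see with a toy example. The sketch below is not how any particular model works internally; it only assumes, as is common practice, that a model assigns scores to candidate next words and then samples from them. The scores in `TOY_LOGITS` and the helper names are made up for illustration.

```python
import math
import random

# Hypothetical next-word scores for one prompt; the numbers are invented.
TOY_LOGITS = {"yes": 2.0, "probably": 1.5, "no": 0.5}


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def greedy_pick(logits):
    """Always return the highest-scoring word, so the answer never changes."""
    return max(logits, key=logits.get)


def sampled_pick(logits, rng, temperature=1.0):
    """Draw one word at random, in proportion to its probability."""
    probs = softmax(logits, temperature)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so every run can differ
    print("greedy :", [greedy_pick(TOY_LOGITS) for _ in range(5)])
    print("sampled:", [sampled_pick(TOY_LOGITS, rng) for _ in range(5)])
```

Greedy picking repeats the same word every time, while sampling can return a different word on each call. If a domain needs repeatable answers, this is one of the behaviours worth testing before launch.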

3. I want to bring this kind of language model into my own domain. Am I unconsciously following a trend blindly, or will it really help my business?

Dream, but stay rational.

  • Having dreams is reasonable: wanting to follow the trend is understandable, because language models really do perform well in many respects and have potential.

  • Whether it helps depends on the results: whether it helps your business is decided by validated results, not by imagination. If you cannot find a precedent similar to your own business, you are the only one who can find the answer.

  • Stay rational:

    • Do not attempt more than you can afford (you must be able to bear the cost of failure)
    • Start by trying it on one or a few carefully selected business cases

4. I don't understand the technical principles. If I come up with ideas that are far-fetched, unrealistic, or beyond what the model can do, will the technical/R&D staff laugh at me or resent me?

No. Making models work in a specific domain needs exactly this collision between non-technical and technical ideas.

The two sides need to cooperate and correct each other. The process may not always be pleasant; it takes discussion and mutual understanding.

  • From the non-technical side, we need bold and creative business planning. We also need the technical staff to assess the features that can be built (for example, how many resources they need) and to flag promptly the features that cannot be built.

  • From the technical side, we can also contribute ideas to business planning. AI technology keeps evolving, and features that used to be difficult or out of reach may be easy to build today, but non-technical staff may not have noticed yet. It is up to us to tell them and to explain patiently what the technology can currently do.

Set reasonable expectations for the language model, neither too high nor too low. The language model is indeed powerful, but it is not perfect.

  • Expectations must not be too high: an idea may be good but impossible or very hard to implement (with enough financial strength it can become an R&D project, but that takes patience, and short-term results should not be expected)

  • Expectations need not be too low: a non-technical person may assume a feature cannot be built and cut something that could have gone live (this is when technical staff need to correct the assumption promptly)

  • Poor performance now does not mean poor performance forever: if a feature can be built but still falls short of expectations, give the model time to adapt. It can keep learning (especially from human feedback), and with persistent effort it can do better.

5. I've heard this is very expensive, and I don't have that much money. Do I still have a chance to try?

Yes. "Expensive" here mainly refers to creating a model from scratch, which requires a great deal of spending. Our aim is mainly to build on existing experience or use existing models, not to create one from scratch.

  • Creating a model that can do everything is expensive: look up how many resources the companies that build such models have invested and you will get a rough idea

  • Only using an already-built model has some cost, but far less: compared with building one from scratch, this cost is very, very small

  • Creating a model that specialises in your own domain is not necessarily expensive:

    • Existing work has already shown a feasible path, so we can avoid many detours

    • Studies have shown that even a very small model (small models learn less efficiently and store less knowledge than large ones) can, with appropriate training (especially training based on human feedback), have a chance of rivalling the performance of large models. How well it performs in your own domain is something you need to validate yourself

    • Existing training techniques allow us to continue training on top of a large model at low cost (and with fairly good results)
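The article does not name the low-cost training technique it has in mind. One widely used family, swapped in here purely as an illustration, is low-rank adaptation (LoRA): the original weight matrix is frozen and only two small matrices are trained alongside it. The back-of-the-envelope sketch below counts trainable parameters for a single weight matrix; the layer size and rank are assumed values, not figures from any specific model.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def low_rank_adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` adapter: one (rank x d_in)
    matrix plus one (d_out x rank) matrix; the original weights stay frozen."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 4096  # assumed layer width, for illustration only
    rank = 8             # assumed adapter rank, for illustration only
    full = full_finetune_params(d_in, d_out)
    adapter = low_rank_adapter_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"low-rank adapter: {adapter:,} trainable parameters")
    print(f"adapter share   : {adapter / full:.2%}")
```

Under these assumptions the adapter trains well under 1% of the parameters in that matrix, which is where much of the saving in memory and compute comes from. Whether the resulting quality is good enough for a given domain still has to be validated, as noted above.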

6. The language models available today are good, but they do not perform well enough in my domain, and I still want to develop a model for my own domain. What should I pay most attention to?

At least three points:

  • Whether it is a real business need or just a flashy feature
  • Whether you have learning data the language model can use (even the cleverest cook cannot make a meal without rice)
  • Whether continuing development on top of an existing model is permitted

A real business need or just a flashy feature. Before starting this work, decide in light of your own situation (for example, strategy and business planning) whether developing your own model is a real need. If the goal is only a flashy but impractical feature, or if the budget is tight, think it over carefully.

Whether you have learning data the language model can use. A language model's ability to understand instructions and respond to them is learned, and learning requires data. In the same way, whether you have suitable learning data in your domain matters a great deal. Whether the data your business has accumulated can be used directly, and how to convert it into learning data a language model can use, are touched on in later articles in this series.
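To make "learning data a language model can use" a little more concrete before the later articles, here is a minimal sketch that turns question/answer pairs into instruction-style records. It is not the author's method. The field names `instruction`, `input` and `output` follow the layout used by the Stanford Alpaca dataset, and the sample pair and helper names are hypothetical.

```python
import json


def to_instruction_record(question: str, answer: str, context: str = "") -> dict:
    """Wrap one question/answer pair in an instruction-style record."""
    return {
        "instruction": question.strip(),
        "input": context.strip(),  # optional extra context; empty if none
        "output": answer.strip(),
    }


def to_jsonl(pairs) -> str:
    """Serialise (question, answer) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps(to_instruction_record(q, a), ensure_ascii=False)
        for q, a in pairs
    )


if __name__ == "__main__":
    # Hypothetical business data, invented for this example.
    sample_pairs = [
        (" How do I reset my account password? ",
         "Open Settings, choose Security, then select Reset password."),
    ]
    print(to_jsonl(sample_pairs))
```

Real business data usually needs far more work than this (cleaning, removing personal information, checking answer quality), so treat the sketch only as a picture of the target format.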

Whether continuing development on top of an existing model is permitted. Read the licence of the existing model carefully. Some models are open source and can be used directly, but their licences state explicitly that the model and its derivatives (for example, a model produced by further training) may not be used commercially, or may not be used to give medical advice, interpret medical reports, and so on.

2.1 Attempts in Medical Domain (ChatDoctor)

(To be continued)

posted @ 2023-04-09 12:05  createMoMo