Falcon 180B foundation model from TII is now available via Amazon SageMaker JumpStart | Amazon Web Services

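Today, we are excited to announce that the Falcon 180B foundation model, developed by the Technology Innovation Institute (TII) and trained on Amazon SageMaker, is available for customers through Amazon SageMaker JumpStart to deploy with one click for running inference. With 180 billion parameters and trained on a massive 3.5-trillion-token dataset, Falcon 180B is one of the largest and most performant models with openly accessible weights. You can try out this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms, models, and ML solutions so you can quickly get started with ML. In this post, we walk through how to discover and deploy the Falcon 180B model via SageMaker JumpStart.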

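What is Falcon 180B

Falcon 180B is a model released by TII that follows previous releases in the Falcon family. It is a scaled-up version of Falcon 40B, and it uses multi-query attention for better scalability. It is an auto-regressive language model that uses an optimized transformer architecture. It was trained on 3.5 trillion tokens of data, primarily consisting of web data from RefinedWeb (approximately 85%). The model has two versions: 180B and 180B-Chat. 180B is a raw, pre-trained model, which should be further fine-tuned for most use cases. 180B-Chat is better suited to taking generic instructions. The Chat model has been fine-tuned on chat and instruction datasets together with several large-scale conversational datasets.

The model is made available under the Falcon-180B TII License and Acceptable Use Policy.

Falcon 180B was trained by TII on Amazon SageMaker, on a cluster of approximately 4K A100 GPUs. It used a custom distributed training codebase named Gigatron, which uses 3D parallelism with ZeRO, and custom, high-performance Triton kernels. The distributed training architecture used Amazon Simple Storage Service (Amazon S3) as the sole unified service for data loading and for checkpoint writing and reading, which particularly contributed to workload reliability and operational simplicity.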

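What is SageMaker JumpStart

With SageMaker JumpStart, ML practitioners can choose from a growing list of best-performing foundation models. ML practitioners can deploy foundation models to dedicated SageMaker instances within a network-isolated environment, and customize models using Amazon SageMaker for model training and deployment.

You can now discover and deploy Falcon 180B with a few clicks in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK, enabling you to derive model performance and MLOps controls with SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, or container logs. The model is deployed in an AWS secure environment and under your VPC controls, helping ensure data security. Falcon 180B is discoverable and can be deployed in Regions where the requisite instances are available. At present, ml.p4de instances are available in US East (N. Virginia) and US West (Oregon).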

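Discover models

You can access the foundation models through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK. In this section, we go over how to discover the models in SageMaker Studio.

SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all ML development steps, from preparing data to building, training, and deploying your ML models. For more details on how to get started and set up SageMaker Studio, refer to Amazon SageMaker Studio.

In SageMaker Studio, you can access SageMaker JumpStart, which contains pre-trained models, notebooks, and prebuilt solutions, under Prebuilt and automated solutions.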

On the SageMaker JumpStart landing page, you can browse for solutions, models, notebooks, and other resources. You can find Falcon 180B in the Foundation Models: Text Generation carousel.

You can also find other model variants by choosing Explore all Text Generation Models or by searching for Falcon.

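You can choose the model card to view details about the model, such as the license, the data used to train it, and how to use it. You will also find two buttons, Deploy and Open Notebook, which help you use the model (the following screenshot shows the Deploy option).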

[Screenshot: the Deploy option on the model card]

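Deploy a model

When you choose Deploy, model deployment will start. Alternatively, you can deploy through the example notebook that opens when you choose Open Notebook. The example notebook provides end-to-end guidance on how to deploy the model for inference and clean up resources.

To deploy using a notebook, we start by selecting an appropriate model, specified by the model_id. You can deploy any of the selected models on SageMaker with the following code: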

from sagemaker.jumpstart.model import JumpStartModel

my_model = JumpStartModel(model_id="huggingface-llm-falcon-180b-chat-bf16")
predictor = my_model.deploy()

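This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. To learn more, refer to the API documentation. After the model is deployed, you can run inference against the deployed endpoint through a SageMaker predictor. See the following code: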

payload = {
    "inputs": "User: Hello!\nFalcon: ",
    "parameters": {"max_new_tokens": 256, "top_p": 0.9, "temperature": 0.6},
}
response = predictor.predict(payload)
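
The shape of `response` depends on the serving container. As a minimal sketch, assuming the endpoint returns a list with one dictionary that holds a `generated_text` key (an assumption based on how Hugging Face text-generation containers commonly respond, not something stated in this post), a hypothetical helper such as `extract_generated_text` could pull out the text:

```python
def extract_generated_text(response):
    """Return the generated text from an endpoint response.

    Assumes the response looks like [{"generated_text": "..."}]; adjust this
    if your endpoint returns a different structure.
    """
    if isinstance(response, list) and response and "generated_text" in response[0]:
        return response[0]["generated_text"]
    raise ValueError(f"Unexpected response format: {response!r}")


# Stand-in for predictor.predict(payload) so the sketch runs without an endpoint.
sample_response = [{"generated_text": " Hello! How can I help you today?"}]
print(extract_generated_text(sample_response).strip())
```

Raising on an unexpected shape, rather than returning an empty string, makes a format mismatch visible early.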

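Inference parameters control the text generation process at the endpoint. The max new tokens parameter controls the size of the output generated by the model. Note that this is not the same as the number of words, because the vocabulary of the model is not the same as the English language vocabulary and each token may not be an English word. Temperature controls the randomness in the output. Higher temperature results in more creative and hallucinated outputs. All the inference parameters are optional.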
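
To make the effect of temperature concrete, the following sketch applies the standard temperature-scaled softmax to a set of made-up next-token scores. It is an illustration of the general technique under that assumption, not the endpoint's actual sampling code, and the scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.2, 0.6, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability mass sits on the top-scoring token; as the temperature rises, the distribution flattens and less likely tokens are sampled more often.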

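This 180B parameter model is 335GB and requires even more GPU memory to sufficiently perform inference at 16-bit precision. Currently, JumpStart only supports this model on ml.p4de.24xlarge instances. It is possible to deploy an 8-bit quantized model on an ml.p4d.24xlarge instance by providing the env={"HF_MODEL_QUANTIZE": "bitsandbytes"} keyword argument to the JumpStartModel constructor and specifying instance_type="ml.p4d.24xlarge" to the deploy method. However, note that per-token latency is approximately 5 times slower for this quantized configuration.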
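
The 335GB figure is consistent with a back-of-the-envelope estimate of weight storage: roughly 180 billion parameters at 2 bytes each. The sketch below does that arithmetic (using a nominal parameter count, which is an assumption) and then wraps the quantized deployment described above in a hypothetical helper, `deploy_quantized`. The helper is defined but not called, because calling it creates billable AWS resources.

```python
GIB = 1024 ** 3


def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


NUM_PARAMS = 180e9  # nominal parameter count; the exact figure may differ slightly

print(f"16-bit weights: {weight_memory_gib(NUM_PARAMS, 2):.0f} GiB")
print(f"8-bit weights:  {weight_memory_gib(NUM_PARAMS, 1):.0f} GiB")


def deploy_quantized(model_id="huggingface-llm-falcon-180b-chat-bf16"):
    """Deploy the 8-bit quantized model; this creates billable AWS resources."""
    # Imported lazily so the arithmetic above runs without the SageMaker SDK.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(
        model_id=model_id,
        env={"HF_MODEL_QUANTIZE": "bitsandbytes"},
    )
    return model.deploy(instance_type="ml.p4d.24xlarge")
```

The estimate covers weights only; activations and the attention cache need additional memory during inference.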

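The following table lists all the Falcon models available in SageMaker JumpStart along with the model IDs, default instance types, the maximum number of total tokens supported (the sum of the number of input tokens and the number of generated tokens), and the typical response latency per token for each of these models.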

| Model Name | Model ID | Default Instance Type | Max Total Tokens | Latency per Token* |
| --- | --- | --- | --- | --- |
| Falcon 7B | huggingface-llm-falcon-7b-bf16 | ml.g5.2xlarge | 2048 | 34 ms |
| Falcon 7B Instruct | huggingface-llm-falcon-7b-instruct-bf16 | ml.g5.2xlarge | 2048 | 34 ms |
| Falcon 40B | huggingface-llm-falcon-40b-bf16 | ml.g5.12xlarge | 2048 | 57 ms |
| Falcon 40B Instruct | huggingface-llm-falcon-40b-instruct-bf16 | ml.g5.12xlarge | 2048 | 57 ms |
| Falcon 180B | huggingface-llm-falcon-180b-bf16 | ml.p4de.24xlarge | 2048 | 45 ms |
| Falcon 180B Chat | huggingface-llm-falcon-180b-chat-bf16 | ml.p4de.24xlarge | 2048 | 45 ms |

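*Per-token latency is the median response time for the example prompts provided in this blog; this value will vary based on the length of the input and output sequences.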
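
The table figures can be turned into rough planning numbers. The following sketch, which uses hypothetical helper names and treats the median per-token latency as if it were constant (a simplification), estimates generation time and checks a request against the 2048-token limit:

```python
# Median per-token latencies (milliseconds) copied from the table above.
LATENCY_MS_PER_TOKEN = {
    "huggingface-llm-falcon-7b-bf16": 34,
    "huggingface-llm-falcon-7b-instruct-bf16": 34,
    "huggingface-llm-falcon-40b-bf16": 57,
    "huggingface-llm-falcon-40b-instruct-bf16": 57,
    "huggingface-llm-falcon-180b-bf16": 45,
    "huggingface-llm-falcon-180b-chat-bf16": 45,
}
MAX_TOTAL_TOKENS = 2048  # same limit for every model in the table


def estimate_generation_seconds(model_id, max_new_tokens):
    """Rough upper bound on generation time if every token is produced."""
    return LATENCY_MS_PER_TOKEN[model_id] * max_new_tokens / 1000


def fits_token_budget(input_tokens, max_new_tokens):
    """Check that input plus generated tokens stay within the model limit."""
    return input_tokens + max_new_tokens <= MAX_TOTAL_TOKENS


print(estimate_generation_seconds("huggingface-llm-falcon-180b-chat-bf16", 256))
print(fits_token_budget(input_tokens=1500, max_new_tokens=768))
```

For Falcon 180B Chat, 256 new tokens at 45 ms each comes to about 11.5 seconds, and a 1500-token prompt with 768 new tokens would exceed the limit.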

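Inference and example prompts for Falcon 180B

Falcon models can be used for text completion for any piece of text. Through text generation, you can perform a variety of tasks, such as answering questions, language translation, sentiment analysis, and many more. The endpoint accepts the following input payload schema: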

payload = {
    "inputs": str,
    (optional)"parameters": {
        "do_sample": bool,
        "max_new_tokens": int,
        "repetition_penalty": float,
        "return_full_text": bool,
        "stop": List[str],
        "seed": int,
        "temperature": float,
        "top_k": int,
        "top_p": float,
        "truncate": int,
        "typical_p": float,
        "best_of": int,
        "watermark": bool,
        "details": bool,
        "decoder_input_details": bool,
    }
}

You can explore the definitions of these client parameters and their default values within the text-generation-inference repository.
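
Because every parameter is optional and a misspelled name is easy to miss, it can help to check a request locally before sending it. The sketch below is one possible approach: `build_payload` and `ALLOWED_PARAMETERS` are hypothetical names, and the type table simply mirrors the schema above rather than anything enforced by the endpoint.

```python
# Parameter names and types taken from the payload schema above.
ALLOWED_PARAMETERS = {
    "do_sample": bool,
    "max_new_tokens": int,
    "repetition_penalty": float,
    "return_full_text": bool,
    "stop": list,
    "seed": int,
    "temperature": float,
    "top_k": int,
    "top_p": float,
    "truncate": int,
    "typical_p": float,
    "best_of": int,
    "watermark": bool,
    "details": bool,
    "decoder_input_details": bool,
}


def build_payload(inputs, **parameters):
    """Build a request payload, rejecting unknown or mistyped parameters."""
    for name, value in parameters.items():
        if name not in ALLOWED_PARAMETERS:
            raise KeyError(f"Unknown inference parameter: {name}")
        expected = ALLOWED_PARAMETERS[name]
        if expected is float:
            # Accept ints where floats are expected (for example temperature=1).
            ok = isinstance(value, (int, float))
        else:
            ok = isinstance(value, expected)
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and expected is not bool:
            ok = False
        if not ok:
            raise TypeError(f"{name} should be {expected.__name__}")
    payload = {"inputs": inputs}
    if parameters:
        payload["parameters"] = parameters
    return payload


print(build_payload("Hello", max_new_tokens=768, stop=["<|endoftext|>", "###"]))
```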

The following are some example prompts and the text generated by the model. All outputs here are generated with inference parameters {"max_new_tokens": 768, "stop": ["<|endoftext|>", "###"]}.

Input
Building a website can be done in 10 simple steps: Output
- Choose a domain name
- Register your domain name
- Choose a web host
- Choose a website builder
- Create your website
- Add content to your website
- Add pages to your website
- Add images to your website
- Add videos to your website
- Add social media to your website
1. Choose a domain name
A domain name is the address of your website. It is what people will type into their browser to find your website. For example, the domain name of this website is.
When choosing a domain name, you should choose something that is easy to remember and that represents your business or website. You should also make sure that the domain name is available. You can check if a domain name is available by using a domain name search tool.
2. Register your domain name
Once you have chosen a domain name, you need to register it. This can be done through a domain name registrar. A domain name registrar is a company that manages the registration of domain names.
When registering your domain name, you will need to provide your contact information and choose a payment method. You will also need to choose a domain name extension. The most common domain name extensions are .com, .net, and .org.
3. Choose a web host
A web host is a company that provides space on a server for your website. When choosing a web host, you should choose one that is reliable and that offers good customer support. You should also make sure that the web host offers the features that you need for your website.
4. Choose a website builder
A website builder is a tool that allows you to create a website without having to learn how to code. There are many different website builders available, so you should choose one that is easy to use and that offers the features that you need.
5. Create your website
Once you have chosen a website builder, you can start creating your website. You will need to choose a template for your website and then add content to your website.
6. Add content to your website
The content of your website is what will make your website unique. You should add content that is relevant to your business or website. You can add text, images, videos, and other types of content to your website.
7. Add pages to your website
A website is made up of pages. You can add as many pages as you want to your website. Each page should have a different purpose. For example, you might have a page for your home page, a page for your about page, and a page for your contact page.
8. Add images to your website
Images can make your website more visually appealing. You can add images to your website by uploading them to your website or by using a image hosting service.
9. Add videos to your website
Videos can also make your website more visually appealing. You can add videos to your website by uploading them to your website or by using a video hosting service.
10. Add social media to your website
Social media can help you promote your website. You can add social media buttons to your website so that people can share your website on social media. You can also add social media feeds to your website so that people can see your latest social media posts.

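You might notice that this pretrained model generates long text sequences that are not necessarily ideal for dialogue use cases. Before we show how the fine-tuned chat model performs for a larger set of dialogue-based prompts, the next two examples illustrate how to use Falcon models with few-shot in-context learning, where we provide training samples to the model in the prompt. Note that "few-shot learning" does not adjust model weights; we only perform inference on the deployed model during this process, while providing a few examples within the input context to help guide the model output.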

Input
Tweet: "I hate it when my phone battery dies."
Sentiment: Negative
###
Tweet: "My day has been :+1:"
Sentiment: Positive
###
Tweet: "This is the link to the article"
Sentiment: Neutral
###
Tweet: "This new music video was incredible"
Sentiment: Output Positive
###

Input
Translate English to French:
sea otter => loutre de mer
###
peppermint => menthe poivrée
###
plush girafe => girafe peluche
###
cheese => Output fromage
###
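
Few-shot prompts like the two above follow a regular pattern: solved examples separated by `###`, then one unsolved item for the model to complete. As a minimal sketch, a hypothetical helper named `build_few_shot_prompt` could assemble them from a template:

```python
def build_few_shot_prompt(examples, query, template, header=None, separator="###"):
    """Join solved examples and one unsolved query into a single prompt.

    `template` is a format string with {source} and {target} placeholders.
    """
    blocks = [template.format(source=s, target=t) for s, t in examples]
    # Leave the target of the final block empty so the model fills it in.
    blocks.append(template.format(source=query, target="").rstrip())
    body = f"\n{separator}\n".join(blocks)
    return f"{header}\n{body}" if header else body


translation_prompt = build_few_shot_prompt(
    examples=[
        ("sea otter", "loutre de mer"),
        ("peppermint", "menthe poivrée"),
        ("plush girafe", "girafe peluche"),
    ],
    query="cheese",
    template="{source} => {target}",
    header="Translate English to French:",
)
print(translation_prompt)
```

The sentiment example fits the same helper with a template such as `'Tweet: "{source}"\nSentiment: {target}'` and no header. Including `###` in the `stop` list keeps the model from inventing further examples after it answers.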

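Inference and example prompts for Falcon 180B-Chat

With Falcon 180B-Chat models, which are optimized for dialogue use cases, the input to the chat model endpoints may contain previous history between the chat assistant and the user. You can ask questions contextual to the conversation that has happened so far. You can also provide the system configuration, such as personas, which define the chat assistant's behavior. The input payload to the endpoint is the same as for the Falcon 180B model, except that the inputs string value should use the following format: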

System: <An optional system prompt>
User: <Optional user message history>
Falcon: <Optional chat history Falcon response>
User: <Current user message>
Falcon:
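
Assembling this layout by hand is error-prone once a conversation has several turns. The following sketch shows one way to render it; `format_chat_prompt` is a hypothetical helper, and the history entry is an abbreviated stand-in for a real earlier response.

```python
def format_chat_prompt(user_message, history=(), system_prompt=None):
    """Render a conversation in the System/User/Falcon layout shown above.

    `history` is a sequence of (user_message, falcon_response) pairs.
    """
    lines = []
    if system_prompt:
        lines.append(f"System: {system_prompt}")
    for past_user, past_falcon in history:
        lines.append(f"User: {past_user}")
        lines.append(f"Falcon: {past_falcon}")
    lines.append(f"User: {user_message}")
    lines.append("Falcon:")  # left open for the model to complete
    return "\n".join(lines)


prompt = format_chat_prompt(
    "What is so great about #1?",
    history=[("I am going to Paris, what should I see?", "1. Eiffel Tower ...")],
    system_prompt="You are a helpful trip planner.",
)
print(prompt)
```

The resulting string is what goes into the `inputs` field of the payload.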

The following are some example prompts and the text generated by the model. All outputs are generated with inference parameters {"max_new_tokens": 256, "stop": ["\nUser:", "<|endoftext|>", " User:", "###"]}.

Input
System: You are a helpful assistant
User: What is Amazon SageMaker JumpStart?
Falcon: Output Amazon SageMaker JumpStart is a capability of Amazon SageMaker that provides a set of solutions to common machine learning problems. These solutions include pre-built machine learning models, algorithms, and pre-configured model training and deployment settings. JumpStart also provides sample notebooks that demonstrate how to use these solutions for various use cases. The goal of JumpStart is to help users get started with machine learning quickly and easily, without the need for extensive machine learning expertise.
User:

Input
User: Simply put, the theory of relativity states that
Falcon: Output 1. The laws of physics are the same for all observers in uniform motion relative to one another.
2. The speed of light in a vacuum is the same for all observers, regardless of their motion or the motion of the light source.
3. Time and space are relative, meaning that they can vary depending on the observer's frame of reference.
4. Mass and energy are interchangeable, as expressed by the famous equation E=mc².
User:

Input
User: Building a website can be done in 10 simple steps:
Falcon: Output
1. Determine the purpose and goals of your website.
2. Choose a domain name and register it.
3. Select a web hosting provider and sign up for a plan.
4. Choose a website builder or content management system (CMS) to use.
5. Select a theme or template for your website.
6. Customize the design and layout of your website.
7. Add content to your website, including text, images, and videos.
8. Set up navigation menus and links to make your website easy to navigate.
9. Test your website to ensure it is functioning properly and is mobile-friendly.
10. Launch your website and promote it to attract visitors.
User:

Input
System: Always answer with a Haiku.
User: I am deploying a large language model on Amazon SageMaker JumpStart. What should I do?
Falcon: Output "JumpStart" your dreams,
Deploy with ease, scale with glee,
SageMaker awaits.

Input
System: Generate a table summarizing the options outlined in this email.
User: Team, we have to make a decision on the product plan for Goocrux v2. The first option will take us 3 weeks of development, and will achieve performance of 80ms, but means that the UI has to change. The second option will take us 4 weeks of development, and will achieve performance of 60ms, without any UI change. The third option will take the longest -- 6 weeks of development, will achieve performance of 30ms, without any UI change. A markdown table with four columns: (1) Option ID, (2) Development time, (3) Performance, (4) UI Change.
Falcon: Output
| Option ID | Development time | Performance | UI Change |
| --- | --- | --- | --- |
| 1 | 3 weeks | 80ms | Yes |
| 2 | 4 weeks | 60ms | No |
| 3 | 6 weeks | 30ms | No |

Input
System: You always translate English to French.
User: sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
Falcon: Output
=> fromage

Input
System: Classify the user prompt as Negative, Positive, or Neutral sentiment. Use a short response.
User: Tweet: "This new music video was awesome"
Falcon: Output "Positive"

Input
User: What is the recipe of mayonnaise?
Falcon: Output 1. In a mixing bowl, whisk together 1 egg yolk, 1 tablespoon of Dijon mustard, and 1 tablespoon of lemon juice until well combined.
2. While whisking continuously, slowly drizzle in 1 cup of vegetable oil until the mixture thickens and emulsifies.
3. Season with salt and pepper to taste.
4. Store in an airtight container in the refrigerator for up to one week.
User:

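In the following example, the user has had a conversation with the assistant about tourist sites in Paris. Next, the user asks about the first option recommended by the chat assistant.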

Input
System: You are a helpful trip planner.
User: I am going to Paris, what should I see?
Falcon: 1. Eiffel Tower: No trip to Paris is complete without a visit to the iconic Eiffel Tower.
2. Louvre Museum: Home to some of the world's most famous art, including the Mona Lisa.
3. Notre-Dame Cathedral: A stunning example of Gothic architecture and a must-see for history buffs.
4. Champs-Élysées: One of the most famous avenues in the world, lined with shops, cafes, and theaters.
5. Montmartre: A charming neighborhood known for its bohemian vibe, street art, and stunning views of the city.
6. Palace of Versailles: A short train ride from Paris, this opulent palace was once the seat of French royalty.
7. Musée d'Orsay: Housed in a former train station, this museum boasts an impressive collection of Impressionist and Post-Impressionist art.
8. Sainte-Chapelle: A hidden gem in the heart of Paris, this medieval chapel is famous for its stunning stained glass windows.
9. Luxembourg Gardens: A beautiful park in the heart of the city, perfect for a picnic or a leisurely stroll.
User: What is so great about #1?
Falcon: Output
#1 is the Eiffel Tower, which is one of the most famous landmarks in the world. It is a symbol of Paris and a must-see attraction for many visitors. The tower offers stunning views of the city from its observation decks, and it is especially beautiful when lit up at night. Additionally, the Eiffel Tower has a rich history and is an engineering marvel, making it a fascinating destination for anyone interested in architecture or history.
User:
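
Several outputs above end with a trailing `User:`, which suggests the stop sequence is returned along with the generated text. Before storing a response as chat history for the next turn, you may want to trim it. The sketch below is one way to do that; `strip_stop_sequences` is a hypothetical helper and `raw_output` is a shortened, made-up stand-in for a real response.

```python
STOP_SEQUENCES = ["\nUser:", "<|endoftext|>", " User:", "###"]


def strip_stop_sequences(text, stop_sequences=STOP_SEQUENCES):
    """Cut the text at the earliest stop sequence and trim whitespace."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].strip()


# Shortened stand-in for the text returned by the endpoint.
raw_output = "\n#1 is the Eiffel Tower, a symbol of Paris.\nUser:"

# History as (user_message, falcon_response) pairs, ready for the next turn.
history = [("I am going to Paris, what should I see?", "1. Eiffel Tower ...")]
history.append(("What is so great about #1?", strip_stop_sequences(raw_output)))
print(history[-1][1])
```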

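Clean up

After you're done running the notebook, make sure to delete all the resources that you created in the process so that billing stops. Use the following code: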

predictor.delete_model()
predictor.delete_endpoint()
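
If the first call raises an error, the second call never runs and the endpoint keeps accruing charges. A slightly more defensive variant, sketched here with a hypothetical `clean_up` helper, attempts both deletions and reports any failures:

```python
def clean_up(predictor):
    """Delete the model and endpoint, attempting both even if one call fails."""
    errors = []
    for action in (predictor.delete_model, predictor.delete_endpoint):
        try:
            action()
        except Exception as exc:  # keep going so nothing is left running
            errors.append((action.__name__, exc))
    return errors
```

Call `clean_up(predictor)` at the end of the notebook and inspect the returned list; an empty list means both deletions succeeded.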

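Conclusion

In this post, we showed you how to get started with Falcon 180B in SageMaker Studio and deploy the model for inference. Because foundation models are pre-trained, they can help lower training and infrastructure costs and enable customization for your use case. Visit SageMaker JumpStart in SageMaker Studio now to get started.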

About the Authors

Dr. Kyle Ulrich is an Applied Scientist with the Amazon SageMaker JumpStart team. His research interests include scalable machine learning algorithms, computer vision, time series, Bayesian non-parametrics, and Gaussian processes. His PhD is from Duke University, and he has published papers in NeurIPS, Cell, and Neuron.

Dr. Ashish Khetan is a Senior Applied Scientist with Amazon SageMaker JumpStart and helps develop machine learning algorithms. He received his PhD from the University of Illinois Urbana-Champaign. He is an active researcher in machine learning and statistical inference, and has published many papers in NeurIPS, ICML, ICLR, JMLR, ACL, and EMNLP conferences.

Olivier Cruchant is a Principal Machine Learning Specialist Solutions Architect at AWS, based in France. Olivier helps AWS customers, from small startups to large enterprises, develop and deploy production-grade machine learning applications. In his spare time, he enjoys reading research papers and exploring the wilderness with friends and family.

Karl Albertsen leads the foundation model hub, algorithms, and partnerships teams for Amazon SageMaker.
