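Project Overview:
Each day I, Xiao Li Ge, introduce one cutting-edge AI solution built on the Amazon Web Services (AWS) cloud platform, to help you quickly learn AWS AI best practices and apply them in your daily work.
This post shows how to deploy the open-source Llama large language model on AWS with the SageMaker machine learning service, and how to add Llama Guard compliance checks to the model's input and output so that it does not generate harmful, inappropriate, or false content. We also use the container service ECS to host an AI lifestyle assistant that calls the Llama model API to give users everyday advice, and we store the conversation history in DynamoDB so that users can review past conversations. The architecture relies on cloud-native, serverless services such as Fargate, Lambda, and DynamoDB, which makes the solution scalable and secure. The solution architecture diagram is shown below: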
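Background Knowledge
What is Amazon SageMaker?
Amazon SageMaker is the one-stop machine learning service from AWS that helps developers and data scientists build, train, and deploy machine learning models. SageMaker provides tooling for the whole workflow, from data preparation and model training to model deployment, so that users can run machine learning projects efficiently in the cloud.
What is Llama Guard?
Llama Guard is a safeguard model released by Meta, fine-tuned from Llama 2 7B, that adds safety and compliance protection to Llama models (or other large language models). It classifies a model's inputs and outputs against a list of unsafe-content categories, so that generated content can be checked against ethical standards and legal requirements. Llama Guard helps developers detect and block potentially harmful output such as inappropriate speech, bias, and misinformation, which improves the safety and reliability of the AI model.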
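Why build responsible AI?
Preventing bias and discrimination:
Large language models can unintentionally learn the biases present in their training data. Responsible AI aims to identify and remove these biases so that AI decisions are fair and do not discriminate on the basis of race, gender, or other characteristics.
Improving trust and transparency:
Users' trust in an AI system depends on its transparency and explainability. Responsible AI helps users understand the system, raises its credibility, and lets users rely on the advice and decisions it provides.
Complying with laws and regulations:
Many countries and regions have strict legal requirements for data privacy, security, and fairness. Responsible AI keeps the model operating within these laws and regulations and avoids legal risk.
Protecting user privacy:
Responsible AI respects and protects users' privacy and avoids leaking personal information when handling sensitive data. Appropriate encryption and anonymization keep user data secure.
Preventing misuse and abuse:
Responsible AI design includes mechanisms that guard against malicious use or misuse of the system, for example using an AI system to generate fake news, spread misinformation, or attack others.
Ethical responsibility:
As AI systems grow more influential, developers and companies are responsible for making sure these systems have a positive effect on society. Building responsible AI means taking ethical responsibility into account when designing and deploying AI systems, so that they do not harm society.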
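What this solution covers
1. Build the AI lifestyle assistant with the Streamlit framework and deploy it on AWS Fargate, fronted by a load balancer for high availability.
2. Use the Lambda serverless compute service to interact with the large language model through its API.
3. Deploy the Llama 2 model on Amazon SageMaker and add the Llama Guard safety model to it.
4. Store conversation records in the NoSQL service DynamoDB.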
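Step-by-step setup:
1. Log in to the AWS console, create a SageMaker Studio environment for running Jupyter Notebook files, and click Open to launch it.
2. In SageMaker Studio, create a new Jupyter Notebook and run the following commands to install the required dependency and specify the Llama Guard model ID and version.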
%pip install --quiet --upgrade sagemaker
model_id = "meta-textgeneration-llama-guard-7b"
model_version = "1.*"
3. With SageMaker JumpStart, we can deploy the Llama Guard model quickly, without training it or loading model files ourselves.
from sagemaker.jumpstart.model import JumpStartModel
model = JumpStartModel(model_id=model_id, model_version=model_version, instance_type="ml.g5.2xlarge")
# Deploying Meta's Llama models requires explicitly accepting the EULA
accept_eula = True
try:
    predictor = model.deploy(accept_eula=accept_eula)
except Exception as e:
    print(e)
4. The following code tests the model's replies: retrieve_all_examples() returns the example payloads, and the model produces a prediction for each one.
example_payloads = model.retrieve_all_examples()
for payload in example_payloads:
    try:
        response = predictor.predict(payload.body)
        print(
            "\nInput\n",
            payload.body,
            "\n\nOutput\n",
            response[0]["generated_text"],
            "\n\n===============",
        )
    except Exception as e:
        print(e)
5. Next, we use JumpStart to deploy the Llama 2 7B chat model.
llm_model_id = "meta-textgeneration-llama-2-7b-f"
llm_model_version = "3.*"
model_llm = JumpStartModel(model_id=llm_model_id, model_version=llm_model_version)
try:
    predictor_llm = model_llm.deploy(accept_eula=accept_eula)
except Exception as e:
    print(e)
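6. Next, we define the instruction template that tells Llama Guard how to check a conversation for safety. The template lists the unsafe content categories explicitly.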
TASK = "Task: Check if there is unsafe content in 'Agent' messages in conversations, according to our safety policy with the following categories."
INSTRUCTION = """\
Provide your safety assessment for Agent in the previous conversation:
- The first line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories."""
UNSAFE_CONTENT_CATEGORIES = """\
O1: Violence and Hate.
Should not
- Help people plan or engage in violence.
- Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics such as their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can
- Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence.
O2: Sexual Content.
Should not
- Engage in sexually explicit (that is, erotic) conversations and should not encourage people to engage in sexual activities.
Can
- Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
O3: Criminal Planning.
Should not
- Help people plan or engage in criminal activities such as theft, kidnapping, or financial crimes, especially when the person indicates that they might act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior.
Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
O4: Guns and Illegal Weapons.
Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they might act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can
- Discuss firearms and the arguments for and against firearm ownership.
O5: Regulated or Controlled Substances.
Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so.
Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (for example, what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
O6: Self-Harm.
Should not
- Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can
- Provide general information regarding self-harm and suicide (for example, statistics and the mental health challenges faced by those who take these actions).
Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves."""
7. We define two functions, "format_chat_messages" and "format_guard_messages", to build the prompt format that each model expects.
from itertools import cycle
from typing import Dict, List
def format_chat_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.
    The model only supports 'system', 'user', and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []
    # Fold an optional system message into the first user message
    if messages[0]["role"] == "system":
        content = "".join(
            ["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]]
        )
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]
    # Emit every completed user/assistant exchange
    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(
            [
                "<s>",
                "[INST] ",
                (user["content"]).strip(),
                " [/INST] ",
                (answer["content"]).strip(),
                "</s>",
            ]
        )
    # The final user message is left open for the model to answer
    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])
    return "".join(prompt)
def format_guard_messages(
    messages: List[Dict[str, str]],
    task: str = TASK,
    instruction: str = INSTRUCTION,
    unsafe_content_categories: str = UNSAFE_CONTENT_CATEGORIES,
) -> str:
    """Format messages for Llama Guard models.
    The model only supports 'user' and 'assistant' roles, and must alternate (u/a/u/a/u...).
    """
    turn = cycle(["User", "Agent"])
    return "".join(
        [
            "[INST] ",
            task,
            "\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\n",
            unsafe_content_categories,
            "\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>",
            *[f"\n\n{next(turn)}: {message['content']}" for message in messages],
            "\n\n<END CONVERSATION>\n\n",
            instruction,
            " [/INST]",
        ]
    )
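The docstring of format_chat_messages states the role ordering that the Llama 2 chat model expects, but the function itself does not check it. A minimal sketch of such a check is shown here; the helper name is_valid_chat_sequence is hypothetical and not part of the original notebook.

```python
from itertools import cycle
from typing import Dict, List


def is_valid_chat_sequence(messages: List[Dict[str, str]]) -> bool:
    """Return True if messages follow [system,] user, assistant, ..., user."""
    if not messages:
        return False
    # An optional leading system message is allowed; the rest must alternate.
    turns = messages[1:] if messages[0]["role"] == "system" else messages
    # An odd number of alternating turns starting with 'user' also ends with 'user'.
    if len(turns) % 2 == 0:
        return False
    expected = cycle(["user", "assistant"])
    return all(message["role"] == next(expected) for message in turns)
```

Calling this before format_chat_messages lets the caller reject a malformed history early. Conversations passed to format_guard_messages may end with an assistant turn, so this check applies to the chat model only.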
8. Next, we convert a conversation into the standard format with format_guard_messages, wrap it in "payload_input_guard", and send it to the Llama Guard model for evaluation.
messages_input = [
    {"role": "user", "content": "I forgot how to kill a process in Linux, can you help?"}
]
payload_input_guard = {"inputs": format_guard_messages(messages_input)}
try:
    response_input_guard = predictor.predict(payload_input_guard)
    print(response_input_guard)
except Exception as e:
    print(e)
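The INSTRUCTION template asks Llama Guard to answer 'safe' or 'unsafe' on the first line and to list the violated categories on a second line. A minimal sketch of turning that text into structured data could look like the following; parse_guard_verdict and CATEGORY_NAMES are hypothetical names, and the category titles are copied from UNSAFE_CONTENT_CATEGORIES.

```python
from typing import List, Tuple

# Category codes and titles as listed in UNSAFE_CONTENT_CATEGORIES
CATEGORY_NAMES = {
    "O1": "Violence and Hate",
    "O2": "Sexual Content",
    "O3": "Criminal Planning",
    "O4": "Guns and Illegal Weapons",
    "O5": "Regulated or Controlled Substances",
    "O6": "Self-Harm",
}


def parse_guard_verdict(generated_text: str) -> Tuple[bool, List[str]]:
    """Split Llama Guard's reply into (is_safe, violated_category_codes)."""
    lines = [line.strip() for line in generated_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty Llama Guard response")
    verdict = lines[0].lower()
    if verdict == "safe":
        return True, []
    if verdict == "unsafe":
        # The second line, when present, is a comma-separated list of codes
        codes = lines[1].split(",") if len(lines) > 1 else []
        return False, [code.strip() for code in codes if code.strip()]
    # Anything else is treated as an error so the caller can fail closed
    raise ValueError(f"unexpected Llama Guard verdict: {lines[0]!r}")
```

In the notebook, the text to parse would be the generated_text field of the guard endpoint's response, for example response_input_guard[0]["generated_text"].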
9. Llama Guard replies "safe", which means the conversation is safe, so we now send the conversation to the Llama model to generate a reply.
payload_input_llm = {
    "inputs": format_chat_messages(messages_input),
    "parameters": {"max_new_tokens": 128},
}
try:
    response_llm = predictor_llm.predict(payload_input_llm)
    print(response_llm)
except Exception as e:
    print(e)
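10. We then send the reply generated by the Llama model back to Llama Guard for a safety assessment, so that both the input and the output are checked for compliance and safety.
# Append the model's reply as the assistant turn, then ask Llama Guard to assess the whole conversation
messages_output = messages_input.copy()
messages_output.append({"role": "assistant", "content": response_llm[0]["generated_text"]})
payload_output_guard = {"inputs": format_guard_messages(messages_output)}
try:
    response_output_guard = predictor.predict(payload_output_guard)
    print(response_output_guard)
except Exception as e:
    print(e)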
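Steps 8 to 10 can be combined into a single function that only returns the model's answer when both checks pass. The following is a minimal sketch under stated assumptions: guarded_reply, REFUSAL, and the two injected callables are hypothetical, and in the notebook check would wrap predictor.predict with format_guard_messages while generate would wrap predictor_llm.predict with format_chat_messages.

```python
from typing import Callable, Dict, List

Messages = List[Dict[str, str]]

# Hypothetical fallback text returned when either check fails
REFUSAL = "Sorry, I can't help with that request."


def guarded_reply(
    messages: Messages,
    check: Callable[[Messages], str],
    generate: Callable[[Messages], str],
) -> str:
    """Check the input, generate a reply, then check the reply before returning it."""

    def is_safe(conversation: Messages) -> bool:
        text = check(conversation).strip()
        first_line = text.splitlines()[0].strip().lower() if text else ""
        return first_line == "safe"

    # 1. Input check: do not call the chat model for an unsafe prompt
    if not is_safe(messages):
        return REFUSAL
    # 2. Generate the reply
    answer = generate(messages)
    # 3. Output check: assess the conversation including the new assistant turn
    if not is_safe(messages + [{"role": "assistant", "content": answer}]):
        return REFUSAL
    return answer
```

Passing the two model calls in as functions keeps the gating logic testable without live endpoints.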
11. Next, open the Lambda console, create a new Lambda function named "guard_chat_function", and paste in the following code. It calls the Llama Guard and Llama endpoints through the API to run the content compliance check.
import os
import boto3
import json
from itertools import cycle
from typing import Dict, List
import uuid
TASK = "Task: Check if there is unsafe content in 'Agent' messages in conversations according our safety policy with the below categories."
INSTRUCTION = """Provide your safety assessment for Agent in the above conversation:
- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories."""
# Read the unsafe categories from unsafe_categories.txt, which must be packaged with this function
with open("unsafe_categories.txt", "r") as f:
    UNSAFE_CONTENT_CATEGORIES = f.read()
def format_chat_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.
    The model only supports 'system', 'user' and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []
    # Fold an optional system message into the first user message
    if messages[0]["role"] == "system":
        content = "".join(
            ["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n", messages[1]["content"]]
        )
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]
    # Emit every completed user/assistant exchange
    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(
            [
                "<s>",
                "[INST] ",
                (user["content"]).strip(),
                " [/INST] ",
                (answer["content"]).strip(),
                "</s>",
            ]
        )
    # The final user message is left open for the model to answer
    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])
    return "".join(prompt)
def format_guard_messages(
    messages: List[Dict[str, str]],
    task: str = TASK,
    instruction: str = INSTRUCTION,
    unsafe_content_categories: str = UNSAFE_CONTENT_CATEGORIES,
) -> str:
    """Format messages for Llama Guard models.
    The model only supports 'user' and 'assistant' roles, and must alternate (u/a/u/a/u...).
    """
    turn = cycle(["User", "Agent"])
    return "".join(
        [
            "[INST] ",
            task,
            "\n\n<BEGIN UNSAFE CONTENT CATEGORIES>",
            unsafe_content_categories,
            "\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>",
            *[f"\n\n{next(turn)}: {message['content']}" for message in messages],
            "\n\n<END CONVERSATION>\n\n",
            instruction,
            " [/INST]",
        ]
    )
def lambda_handler(event, context):
    random_id = str(uuid.uuid4())
    # Get the SageMaker endpoint names from environment variables
    endpoint1_name = os.environ['GUARD_END_POINT']
    endpoint2_name = os.environ['CHAT_END_POINT']
    # Create a SageMaker runtime client
    sagemaker = boto3.client('sagemaker-runtime')
    print(event)
    messages_input = [{
        "role": "user",
        "content": event['prompt']
    }]
    payload_input_guard = {"inputs": format_guard_messages(messages_input)}
    # Invoke the first SageMaker endpoint (Llama Guard)
    guard_resp = sagemaker.invoke_endpoint(
        EndpointName=endpoint1_name,
        ContentType='application/json',
        Body=json.dumps(payload_input_guard)
    )
    guard_body = guard_resp['Body'].read().decode('utf-8')
    # The endpoint returns a list of generations; keep the text of the last one
    for item in json.loads(guard_body):
        guard_result = item['generated_text']
    payload_input_llm = {
        "inputs": format_chat_messages(messages_input),
        "parameters": {"max_new_tokens": 128},
    }
    # Invoke the second SageMaker endpoint (Llama 2 chat).
    # Note: the chat model is called regardless of the guard verdict; both outputs are returned to the caller.
    chat_resp = sagemaker.invoke_endpoint(
        EndpointName=endpoint2_name,
        ContentType='application/json',
        Body=json.dumps(payload_input_llm)
    )
    chat_body = chat_resp['Body'].read().decode('utf-8')
    for item in json.loads(chat_body):
        chat_result = item['generated_text']
    # Store the chat history
    dynamodb = boto3.client("dynamodb")
    dynamodb.put_item(
        TableName='chat_history',
        Item={
            "prompt_id": {'S': random_id},
            "prompt_content": {'S': event['prompt']},
            "guard_resp": {'S': guard_result},
            "chat_resp": {'S': chat_result}
        })
    # DIY section - Add unsafe responses to the bad_prompts table
    # Return the results
    return {
        'Llama-Guard-Output': guard_result,
        'Llama-Chat-Output': chat_result
    }
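The handler leaves a "DIY section" for recording unsafe prompts in a bad_prompts table. One possible way to fill it in is sketched below. This is an assumption rather than the original lab's answer: the helper name build_bad_prompt_item and the violated_categories attribute are hypothetical, while the table name comes from the comment in the handler.

```python
from typing import Dict, Optional


def build_bad_prompt_item(
    prompt_id: str, prompt: str, guard_result: str
) -> Optional[Dict[str, Dict[str, str]]]:
    """Return a DynamoDB item for an unsafe prompt, or None if the prompt is safe."""
    lines = [line.strip() for line in guard_result.splitlines() if line.strip()]
    if not lines or lines[0].lower() != "unsafe":
        return None
    # Llama Guard lists the violated categories on the second line
    categories = lines[1] if len(lines) > 1 else "unknown"
    return {
        "prompt_id": {"S": prompt_id},
        "prompt_content": {"S": prompt},
        "violated_categories": {"S": categories},  # hypothetical attribute name
    }


# Inside lambda_handler, after guard_result is known, one might write:
#     item = build_bad_prompt_item(random_id, event["prompt"], guard_result)
#     if item is not None:
#         dynamodb.put_item(TableName="bad_prompts", Item=item)
```

Keeping the item construction separate from the put_item call makes the decision logic easy to test locally.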
12. Next, open the CodeBuild console, create a container build project, and start the build. The build script is as follows:
{
  "version": "0.2",
  "phases": {
    "pre_build": {
      "commands": [
        "echo 'Downloading container image from S3 bucket'",
        "aws s3 cp s3://lab-code-3a7cca20/Dockerfile .",
        "aws s3 cp s3://lab-code-3a7cca20/requirements.txt .",
        "aws s3 cp s3://lab-code-3a7cca20/app.py ."
      ]
    },
    "build": {
      "commands": [
        "echo 'Loading container image'",
        "aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 755119157746.dkr.ecr.us-east-1.amazonaws.com",
        "docker build -t streamlit-container-image .",
        "echo 'Tagging and pushing container image to ECR'",
        "docker tag streamlit-container-image:latest 755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest",
        "docker push 755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest"
      ]
    }
  },
  "artifacts": {
    "base-directory": ".",
    "files": [
      "Dockerfile"
    ]
  }
}
13. This CodeBuild project packages a Streamlit application into a container image and pushes it to the ECR repository.
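The contents of app.py are not shown in this post. As a rough sketch of how the Streamlit app might reach the model, the helper below invokes the guard_chat_function Lambda from step 11 and returns its two outputs. It assumes the app calls Lambda directly through a boto3 Lambda client (for example boto3.client("lambda")); the helper name ask_guarded_assistant is hypothetical.

```python
import json
from typing import Any, Dict


def ask_guarded_assistant(
    lambda_client: Any, prompt: str, function_name: str = "guard_chat_function"
) -> Dict[str, str]:
    """Invoke the Lambda function synchronously and return its JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        # The handler reads the user's question from event['prompt']
        Payload=json.dumps({"prompt": prompt}).encode("utf-8"),
    )
    if response.get("FunctionError"):
        raise RuntimeError(f"Lambda invocation failed: {response['FunctionError']}")
    # Payload is a stream; the handler returns the guard and chat outputs
    return json.loads(response["Payload"].read())
```

The returned dictionary has the keys 'Llama-Guard-Output' and 'Llama-Chat-Output', which the app can display side by side.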
14. Next, go to the ECS service and create a task definition (the launch template for the container service) based on the following script:
{
  "taskDefinitionArn": "arn:aws:ecs:us-east-1:755119157746:task-definition/streamlit-task-definition:3",
  "containerDefinitions": [
    {
      "name": "StreamlitContainer",
      "image": "755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest",
      "cpu": 0,
      "links": [],
      "portMappings": [
        {
          "containerPort": 8501,
          "hostPort": 8501,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "entryPoint": [],
      "command": [],
      "environment": [],
      "environmentFiles": [],
      "mountPoints": [],
      "volumesFrom": [],
      "secrets": [],
      "dnsServers": [],
      "dnsSearchDomains": [],
      "extraHosts": [],
      "dockerSecurityOptions": [],
      "dockerLabels": {},
      "ulimits": [],
      "systemControls": [],
      "credentialSpecs": []
    }
  ],
  "family": "streamlit-task-definition",
  "taskRoleArn": "arn:aws:iam::755119157746:role/ecs_cluster_role",
  "executionRoleArn": "arn:aws:iam::755119157746:role/ecs_cluster_role",
  "networkMode": "awsvpc",
  "revision": 3,
  "volumes": [],
  "status": "ACTIVE",
  "requiresAttributes": [
    {
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    },
    {
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
    },
    {
      "name": "com.amazonaws.ecs.capability.task-iam-role"
    },
    {
      "name": "ecs.capability.execution-role-ecr-pull"
    },
    {
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "name": "ecs.capability.task-eni"
    }
  ],
  "placementConstraints": [],
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "512",
  "memory": "2048",
  "runtimePlatform": {
    "cpuArchitecture": "X86_64",
    "operatingSystemFamily": "LINUX"
  },
  "registeredAt": "2024-08-16T02:21:48.902Z",
  "registeredBy": "arn:aws:sts::755119157746:assumed-role/AWSLabs-Provisioner-v2-CjDTNtCaQDT/LPS-States-CreateStack",
  "tags": []
}
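The JSON above has the shape of a described (already registered) task definition, so it contains fields that ECS generates itself, such as taskDefinitionArn, revision, status, and registeredAt. A minimal sketch of removing those output-only fields before reusing the JSON as registration input is shown below; the helper name to_register_input is hypothetical, and the account ID, role ARNs, and image URI still need to be replaced with your own values.

```python
from typing import Any, Dict

# Fields that ECS returns when describing a task definition
# but that are not accepted when registering a new one
READ_ONLY_FIELDS = (
    "taskDefinitionArn",
    "revision",
    "status",
    "requiresAttributes",
    "compatibilities",
    "registeredAt",
    "registeredBy",
    "deregisteredAt",
)


def to_register_input(described: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a described task definition without output-only fields."""
    return {key: value for key, value in described.items() if key not in READ_ONLY_FIELDS}
```

The remaining keys (family, containerDefinitions, networkMode, requiresCompatibilities, cpu, memory, and so on) describe the task itself and are the ones to review and adapt to your account.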
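15. Next, create a container cluster named "Streamlit-cluster" and create a Streamlit service in it.
16. Set the ECS service launch type to Fargate, name the service streamlitservice, select the task definition "streamlit-task-definition" created earlier, and set the number of tasks to run to 1.
17. Select the VPC and subnets where the service will be deployed, and configure the Security Group.
18. Add an Application Load Balancer named "streamlit-lb" to the ECS service to provide high availability for the backend.
19. Add an external listener on HTTP port 80, add a target group for the service's tasks, and click Create.
20. The URL exposed by the Application Load Balancer now opens the page of the ECS service.
21. Next, we run a test: enter the question "How do I kill a Linux process" and check whether the content is compliant and safe.
22. Finally, we can see the model's answer to the question together with the Llama Guard result, which shows that both the question and the output are safe and compliant.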
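Those are all the steps for using Llama Guard on AWS to build a safe, compliant, and responsible AI lifestyle assistant. Join me again for more cutting-edge generative AI development solutions.
Summary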