Automated image cropping with Amazon Rekognition

Republished by Plato

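Digital publishers are continuously looking for ways to streamline and automate their media workflows so they can generate and publish new content as quickly as possible.

Many publishers have a large library of stock images that they use for their articles. These images can be reused many times for different stories, especially when the publisher has images of celebrities. Quite often, a journalist needs to crop a particular celebrity out of an image to use in an upcoming story. This is a manual, repetitive task that should be automated. Sometimes an author wants to use an image of a celebrity, but it contains two people, and the primary celebrity needs to be cropped from the image. Other times, celebrity images need to be reformatted for publishing to a variety of platforms such as mobile, social media, or digital news. Additionally, an author may need to change the image aspect ratio or put the celebrity in crisp focus.

In this post, we demonstrate how to use Amazon Rekognition to perform image analysis. Amazon Rekognition makes it easy to add this capability to your applications without any machine learning (ML) expertise. It comes with various APIs for use cases such as object detection, content moderation, face detection and analysis, and text and celebrity recognition, the last of which we use in this example.

The celebrity recognition feature in Amazon Rekognition uses ML to automatically recognize tens of thousands of well-known personalities in images and videos. Celebrity recognition detects not only the presence of a given celebrity but also their location within the image.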

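Solution overview

In this post, we demonstrate how to pass in a photo, a celebrity name, and an aspect ratio for the output image in order to generate a cropped image of the given celebrity with their face at the center.

When you use the Amazon Rekognition celebrity detection API, many elements are returned in the response. The following are some key response elements: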

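MatchConfidence – A match confidence score that can be used to control API behavior. We recommend applying a suitable threshold to this score in your application to choose your preferred operating point. For example, by setting a threshold of 99%, you can eliminate more false positives but may miss some potential matches.
Name, Id, and Urls – The celebrity name, a unique Amazon Rekognition ID, and a list of URLs, such as the celebrity's IMDb or Wikipedia link, for further information.
BoundingBox – Coordinates of the rectangular bounding box location for each recognized celebrity face.
KnownGender – The known gender identity of each recognized celebrity.
Emotions – The emotion expressed on the celebrity's face, for example happy, sad, or angry.
Pose – The pose of the celebrity's face, described by the three axes roll, pitch, and yaw.
Smile – Whether the celebrity is smiling or not.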
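
The thresholding described for the match confidence score can be sketched as follows. This is a minimal illustration rather than part of the original walkthrough: the helper name filter_by_confidence and the sample entries are hypothetical, and the sketch assumes each item in CelebrityFaces carries a MatchConfidence value on a 0 to 100 scale.

```python
def filter_by_confidence(celebrity_faces, threshold=99.0):
    """Keep only matches whose MatchConfidence meets the threshold."""
    return [
        face for face in celebrity_faces
        if face.get("MatchConfidence", 0.0) >= threshold
    ]


# Hypothetical entries shaped like items of response["CelebrityFaces"].
sample_faces = [
    {"Name": "Person A", "MatchConfidence": 99.8},
    {"Name": "Person B", "MatchConfidence": 87.5},
]

kept = filter_by_confidence(sample_faces, threshold=99.0)
print([face["Name"] for face in kept])
```

Lowering the threshold keeps more candidates at the cost of more false positives, so the right value depends on how the cropped images are reviewed before publishing.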

A partial API response from Amazon Rekognition looks like the following:

{
  "CelebrityFaces": [
    {
      "Urls": [
        "www.wikidata.org/wiki/Q2536951"
      ],
      "Name": "Werner Vogels",
      "Id": "23iZ1oP",
      "Face": {
        "BoundingBox": {
          "Width": 0.10331031680107117,
          "Height": 0.20054641366004944,
          "Left": 0.5003396272659302,
          "Top": 0.07391933351755142
        },
        "Confidence": 99.99765014648438,
...

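In this exercise, we demonstrate how to use the bounding box element to identify the location of the face, as shown in the following example image. All of the dimensions are expressed as ratios of the overall image size, so the numbers in the response are between 0 and 1. For example, in the sample API response, the width of the bounding box is 0.1, which means the face width is 10% of the total width of the image.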
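
As a worked example of the ratio arithmetic, the sketch below converts a ratio-based bounding box into pixel coordinates. The helper name to_pixels and the 1000 x 800 image size are assumptions made for illustration; the box values are rounded from the sample response.

```python
def to_pixels(bounding_box, img_width, img_height):
    """Convert a ratio-based BoundingBox to pixel (left, top, right, bottom)."""
    left = bounding_box["Left"] * img_width
    top = bounding_box["Top"] * img_height
    right = left + bounding_box["Width"] * img_width
    bottom = top + bounding_box["Height"] * img_height
    return left, top, right, bottom


# Rounded values from the sample response; the image size is assumed.
box = {"Width": 0.10, "Height": 0.20, "Left": 0.50, "Top": 0.07}
print(to_pixels(box, img_width=1000, img_height=800))
```

With these assumed dimensions, the face is about 100 pixels wide (10% of 1000) and starts halfway across the image.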

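Werner Vogels bounding box

With this bounding box, we can now use logic to make sure that the face stays within the edges of the new image we create. We can apply some padding around this bounding box to keep the face in the center.

In the following sections, we show how to create the following cropped image output with Werner Vogels in crisp focus.

We launch an Amazon SageMaker notebook, which provides a Python environment where you can run the code to pass an image to Amazon Rekognition and then automatically modify the image with the celebrity in focus.

Werner Vogels cropped

The code performs the following high-level steps:

Make a request to the recognize_celebrities API with the given image.
Filter the response for the bounding box information of the named celebrity.
Add some padding to the bounding box so that we capture some of the background.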

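Prerequisites

For this walkthrough, you need an AWS account and an Amazon S3 bucket, both of which the following steps rely on.

Upload the sample image

Upload your sample celebrity image to your S3 bucket.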

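Run the code

To run the code, we use a SageMaker notebook; however, any IDE also works after you install Python, Pillow, and Boto3. We create a SageMaker notebook as well as an AWS Identity and Access Management (IAM) role with the required permissions. Complete the following steps:

Create the notebook and name it automatic-cropping-celebrity.

The default execution policy, which is created along with the SageMaker notebook, is a simple policy that gives the role permissions to interact with Amazon S3.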

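Update the Resource constraint with your S3 bucket name. Object-level actions such as s3:GetObject apply to the object ARN (the bucket name followed by /*), whereas s3:ListBucket applies to the bucket ARN, so both are listed:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<your-s3-bucket-name>",
        "arn:aws:s3:::<your-s3-bucket-name>/*"
      ]
    }
  ]
}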

Create another policy to add to the SageMaker notebook IAM role so that it can call the RecognizeCelebrities API:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "rekognition:RecognizeCelebrities",
      "Resource": "*"
    }
  ]
}

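IAM permissions

On the SageMaker console, choose Notebook instances in the navigation pane.
Locate the automatic-cropping-celebrity notebook and choose Open Jupyter.
Choose New and conda_python3 as the kernel for your notebook.

Jupyter notebook

For the following steps, copy the code blocks into your Jupyter notebook and run them by choosing Run.

First, we import helper functions and libraries: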

import boto3
from PIL import Image

Set the variables:

bucket = '<YOUR_BUCKET_NAME>'
file = '<YOUR_FILE_NAME>'
celeb = '<CELEBRITY_NAME>'
aspect_ratio = <ASPECT_RATIO_OF_OUTPUT_IMAGE, e.g. 1 for square>
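
The aspect ratio used here is width divided by height, so 1 is square and 1.5 is 6:4. If it is more convenient to write ratios in the W:H form, a small helper can convert them to that number. The sketch below is an optional convenience, not part of the original walkthrough, and the name parse_aspect_ratio is hypothetical.

```python
def parse_aspect_ratio(value):
    """Return width / height as a float from '6:4', '1.5', or a number."""
    if isinstance(value, (int, float)):
        ratio = float(value)
    elif ":" in value:
        width, height = value.split(":", 1)
        ratio = float(width) / float(height)
    else:
        ratio = float(value)
    if ratio <= 0:
        raise ValueError("aspect ratio must be positive")
    return ratio


print(parse_aspect_ratio("6:4"))
print(parse_aspect_ratio(1))
```

The result could then be assigned to aspect_ratio, for example aspect_ratio = parse_aspect_ratio("4:6") for a portrait crop.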

Create the service clients:

rek = boto3.client('rekognition')
s3 = boto3.client('s3')

Function to recognize the celebrities:

def recognize_celebrity(photo):
    # Send the raw image bytes to Amazon Rekognition
    with open(photo, 'rb') as image:
        response = rek.recognize_celebrities(Image={'Bytes': image.read()})

    # Reopen the file with Pillow to get its format and base path
    image = Image.open(photo)
    file_type = image.format.lower()
    path, ext = image.filename.rsplit(".", 1)
    celeb_faces = response['CelebrityFaces']

    print(f'Detected {len(celeb_faces)} faces for {photo}')

    return celeb_faces, image, path, file_type
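
As an alternative to sending the image bytes, the Image parameter of recognize_celebrities also accepts an S3Object reference, so Amazon Rekognition reads the file directly from the bucket. The sketch below only builds that parameter; the helper name build_image_param and the bucket and key values are hypothetical, and the commented call assumes the rek client created earlier and a role that is allowed to read the object.

```python
def build_image_param(bucket_name, object_key):
    """Build the Image argument that points Rekognition at an S3 object."""
    return {"S3Object": {"Bucket": bucket_name, "Name": object_key}}


# Placeholder bucket and key, for illustration only.
image_param = build_image_param("my-example-bucket", "photos/sample.jpg")
print(image_param)

# With the clients created earlier, the call would look like:
# response = rek.recognize_celebrities(Image=image_param)
```

The rest of this walkthrough still needs a local copy of the image, because Pillow performs the crop on the notebook instance.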

Function to get the bounding box of the given celebrity:

def get_bounding_box(celeb_faces, img_width, img_height, celeb):
    bbox = None
    for celebrity in celeb_faces:
        if celebrity['Name'] == celeb:
            box = celebrity['Face']['BoundingBox']
            # Convert the ratio-based values to pixels
            left = img_width * box['Left']
            top = img_height * box['Top']
            width = img_width * box['Width']
            height = img_height * box['Height']
            print('Left: ' + '{0:.0f}'.format(left))
            print('Top: ' + '{0:.0f}'.format(top))
            print('Face Width: ' + '{0:.0f}'.format(width))
            print('Face Height: ' + '{0:.0f}'.format(height))
            # Dimensions of the famous face inside the bounding box
            x1 = left
            y1 = top
            x2 = left + width
            y2 = top + height
            bbox = [x1, y1, x2, y2]
            print(f'Bbox coordinates: {bbox}')
    if bbox is None:
        raise ValueError(f"{celeb} not found in results")
    return bbox

Function to add some padding to the bounding box, so we capture some background around the face:

def pad_bbox(bbox, pad_width=0.5, pad_height=0.3):
    x1, y1, x2, y2 = bbox
    width = x2 - x1
    height = y2 - y1

    # Dimensions of new image with padding
    x1 = max(x1 - (pad_width * width), 0)
    y1 = max(y1 - (pad_height * height), 0)
    x2 = max(x2 + (pad_width * width), 0)
    y2 = max(y2 + (pad_height * height), 0)

    # Dimensions of new image with aspect ratio: 1 is square, 1.5 is 6:4, 0.66 is 4:6.
    # aspect_ratio is the global variable set earlier. Each line reuses the
    # coordinates updated by the line before it, and x2 and y2 are not limited
    # to the image width and height.
    x1 = max(x1 - (max((y2 - y1) * max(aspect_ratio, 1) - (x2 - x1), 0) / 2), 0)
    y1 = max(y1 - (max((x2 - x1) * 1 / (min((aspect_ratio), 1)) - (y2 - y1), 0) / 2), 0)
    x2 = max(x2 + (max((y2 - y1) * max((aspect_ratio), 1) - (x2 - x1), 0) / 2), 0)
    y2 = max(y2 + (max((x2 - x1) * 1 / (min((aspect_ratio), 1)) - (y2 - y1), 0) / 2), 0)

    print('x1-coordinate after padding: ' + '{0:.0f}'.format(x1))
    print('y1-coordinate after padding: ' + '{0:.0f}'.format(y1))
    print('x2-coordinate after padding: ' + '{0:.0f}'.format(x2))
    print('y2-coordinate after padding: ' + '{0:.0f}'.format(y2))

    return [x1, y1, x2, y2]
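
Because pad_bbox adjusts the four coordinates one after another, each line sees values already changed by the previous one, so the final box may only approximate the requested ratio. The sketch below shows one possible variant that decides the target size once, grows the box around its center, and then limits it to the image. It is an illustrative alternative, not the walkthrough's method; the function name fit_to_aspect_ratio and its parameters are hypothetical.

```python
def fit_to_aspect_ratio(bbox, ratio, img_width, img_height):
    """Grow bbox symmetrically to width / height == ratio, then clamp to the image."""
    x1, y1, x2, y2 = bbox
    width, height = x2 - x1, y2 - y1

    # Decide the target size once, before touching any coordinate.
    if width / height < ratio:
        target_w, target_h = height * ratio, height  # too narrow: widen
    else:
        target_w, target_h = width, width / ratio    # too wide: heighten

    # Expand around the current center so the face stays in the middle.
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    x1, x2 = cx - target_w / 2, cx + target_w / 2
    y1, y2 = cy - target_h / 2, cy + target_h / 2

    # Clamp to the image; near an edge this gives up the exact ratio
    # in exchange for staying inside the picture.
    return [max(x1, 0), max(y1, 0), min(x2, img_width), min(y2, img_height)]


print(fit_to_aspect_ratio([400, 100, 500, 300], ratio=1, img_width=1000, img_height=800))
```

A variant like this could be applied to the padded box in place of the aspect ratio lines in pad_bbox, passing the image width and height obtained from img.size.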

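Function to save the image to the notebook storage and to Amazon S3:

def save_image(roi, image, path, file_type):
    x1, y1, x2, y2 = roi
    # Crop to the region of interest and save it next to the original
    image = image.crop((x1, y1, x2, y2))
    image.save(f'{path}-cropped.{file_type}')
    # Upload the cropped file to the bucket under the same name
    s3.upload_file(f'{path}-cropped.{file_type}', bucket, f'{path}-cropped.{file_type}')
    return image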

Use the Python main() function to combine the preceding functions and complete the workflow of saving a new cropped image of our celebrity:

def main():
    # Download S3 image to local
    s3.download_file(bucket, file, './' + file)

    # Load photo and recognize celebrity
    celeb_faces, img, file_name, file_type = recognize_celebrity(file)
    width, height = img.size

    # Get bounding box
    bbox = get_bounding_box(celeb_faces, width, height, celeb)

    # Get padded bounding box
    padded_bbox = pad_bbox(bbox)

    # Save result and display (display is available in Jupyter notebooks)
    result = save_image(padded_bbox, img, file_name, file_type)
    display(result)

if __name__ == "__main__":
    main()

When you run this code block, you can see that we found Werner Vogels and created a new image with his face in the center.

Werner Vogels cropped