AWS uses machine learning to better protect data

Amazon used its AWS Summit in New York this week to make a series of product and customer announcements.

Amazon Macie is a new security service that utilises machine learning to identify and protect sensitive data stored in AWS from breaches, data leaks and unauthorised access.

It can also discover and classify a user's data that is stored in Amazon S3. It then assigns each data item a business value and monitors that item to detect any suspicious activity based on access patterns.

The new service uses data security automation and monitoring to proactively prevent data loss. Macie, which was given its moniker because the name means "weapon" and describes a person who is "bold, sporty and sweet", uses machine learning algorithms for natural language processing (NLP).

Macie can also detect common sources of personally identifiable information, which, Amazon said, will prove particularly useful for GDPR compliance because the service provides customers with dashboards and alerts. AWS said: "It will enable customers to comply with GDPR regulations around encryption and pseudonymisation of data."

Stephen Schmidt, CISO at AWS, added: "By using machine learning to understand the content and user behaviour of each organisation, Amazon Macie can cut through huge volumes of data with better visibility and more accurate alerts, allowing customers to focus on securing their sensitive information instead of wasting time trying to find it."

Amazon also announced AWS Glue, which it described as a "fully managed, serverless, and cloud-optimised extract, transform and load (ETL) service".

It is designed to make it easier for customers to prepare and load their data into the Amazon cloud. Glue discovers the metadata associated with a customer's data and classifies it, generates ETL scripts to transform the data, loads the resulting data into a destination data store, and provisions the infrastructure needed to complete the task.
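
To make those stages concrete, here is a minimal Python sketch of a discover, classify, transform and load flow. It is an illustration only and does not use or reflect the AWS Glue API: the sample data and the function names (discover_schema, transform, load) are hypothetical, and an in-memory list stands in for the destination data store.

```python
import csv
import io

# Hypothetical sample input; a real job would read from a data source.
RAW = "id,name,signup\n1,Ana,2017-08-01\n2,Bo,2017-08-14\n"

def infer_type(values):
    """Classify a column as 'int' if every value parses as an integer, else 'string'."""
    try:
        for value in values:
            int(value)
        return "int"
    except ValueError:
        return "string"

def discover_schema(text):
    """Read CSV text and return (rows, schema), where schema maps column -> type name."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    schema = {col: infer_type([row[col] for row in rows]) for col in reader.fieldnames}
    return rows, schema

def transform(rows, schema):
    """Cast every field to the type recorded in the schema."""
    casts = {"int": int, "string": str}
    return [{col: casts[schema[col]](val) for col, val in row.items()} for row in rows]

def load(records, store):
    """Append the transformed records to the destination (here, a plain list)."""
    store.extend(records)
    return store

rows, schema = discover_schema(RAW)
destination = load(transform(rows, schema), [])
print(schema)
print(destination)
```

The type inference here is deliberately crude (integers versus everything else); it only shows where a classification step sits between discovery and transformation.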

Analysis can be carried out in minutes. Because Glue is serverless, users don't have to manage any resources and pay only while it is running. Billing is in data processing unit (DPU) hours, with one DPU-hour currently costing $0.44 in us-east-1.
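
As a rough guide to what that rate means in practice, the Python sketch below estimates the cost of a job from the number of data processing units (DPUs) it uses and how long it runs. The function name is hypothetical and is not part of any AWS SDK, and the calculation assumes simple linear billing at the quoted $0.44 rate with no minimum charge or rounding rules, which may not match how AWS actually bills.

```python
# Rate quoted for us-east-1 in the article; treat it as an assumption that may change.
DPU_HOUR_RATE_USD = 0.44

def estimate_glue_cost(dpus, runtime_hours, rate=DPU_HOUR_RATE_USD):
    """Estimate cost as DPUs x hours x rate, rounded to the nearest cent.

    Hypothetical helper: assumes linear billing with no minimum duration.
    """
    if dpus < 0 or runtime_hours < 0:
        raise ValueError("dpus and runtime_hours must be non-negative")
    return round(dpus * runtime_hours * rate, 2)

# Example: a job using 10 DPUs for 30 minutes.
example_cost = estimate_glue_cost(10, 0.5)
print(f"${example_cost:.2f}")
```

Under these assumptions, 10 DPUs running for half an hour comes to 5 DPU-hours.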

Raju Gulabani, vice president of databases, analytics, and AI at AWS, said: "We developed AWS Glue to eliminate much of the undifferentiated heavy lifting involved with ETL. By cataloging all of a customer's data and automating the ETL process, AWS Glue not only takes a lot of the hassle out of analytics.

"It also makes it possible for customers to store their data in as many sources as they want, and very quickly start analysing all of it with whatever AWS service they choose."

In July, AWS introduced G3 instances for Amazon EC2. These offer twice the CPU power and eight times the host memory of G2 instances, and are aimed at workloads such as 3D rendering and visualisation.

Zach Marzouk

Zach Marzouk is a former ITPro, CloudPro, and ChannelPro staff writer, covering topics like security, privacy, worker rights, and startups, primarily in the Asia Pacific and the US regions. Zach joined ITPro in 2017 where he was introduced to the world of B2B technology as a junior staff writer, before he returned to Argentina in 2018, working in communications and as a copywriter. In 2021, he made his way back to ITPro as a staff writer during the pandemic, before joining the world of freelance in 2022.