AWS-Powered PPDM and APEX Protection Storage (DDVE) Part 1
What does APEX Protection Storage look like, and how does it all function within a hyperscaler? Well, I thought I'd build it and then do a write-up. What better way is there to learn about it! I've also posted two demos within this blog, so I hope you enjoy!
Amazon Web Services (AWS)
We'll use Amazon Web Services (AWS) as our target public cloud. If you need a refresher, I wrote about the products here in my blog Multi-Cloud, APEX and Data Protection.
I'm going to look at the complete setup of PowerProtect Data Manager (PPDM) and PowerProtect DD Virtual Edition (DDVE) within Amazon Web Services (AWS). I'll start off with the finished install and then work back through the build, covering the problems I had and what I've learnt along the way. We'll also explore the recommended VPC (Virtual Private Cloud) architecture for deploying PPDM and DDVE in a private subnet, and how it leverages the various VPC components. This is all based on the recommended 'best practice' architecture from DELL and Amazon.
First important point: DDVE will not do much on its own! DDVE acts as a backup target, so it needs something to send data to it. From the Installation & Administration Guide:
" DD Virtual Edition (DDVE) is a software-only protection storage appliance: a virtual deduplication appliance that provides data protection for entry, enterprise and service provider environments. Like any DD system, DDVE is always paired with backup software. DDVE runs the DD Operating System (DDOS), and includes the DD System"
PPDM & DDVE Demo and Build
I've drawn the setup of my demo in the image below. It shows the network and security configuration within AWS, as well as the EC2 instances I built: the PPDM and DDVE instances within the private subnet, and a Microsoft Windows Server 2022 jump host in the public subnet. The jump host is used to reach the instances within the private subnet, as there is no direct access to PPDM or DDVE from the internet.
Demo Setup Architecture Diagram
As mentioned, DDVE does not function on its own; however, once the data is on DDVE it can do all types of funky stuff, which we will come to later. So I am going to build PPDM within AWS to function alongside DDVE. Going back to one of my earlier blogs, "Data Protection: Where do they fit ?", we discussed how PPDM integrates multiple data protection products on premises. Architecturally that's pretty much the same setup within AWS or Azure. The difference is the additional network and security configuration required within AWS, Azure, GCP, or Alibaba Cloud.
How about we start off with a demo of the setup:
I'm going to add multiple demos to this series and I'll break them into parts, as it's easier to consume all of this in small chunks. In this first one I'll show the finished install and then work our way back through the setup. I've built Dell PowerProtect Data Manager v 19.14 and PowerProtect DD Virtual Edition v18.104.22.168
PPDM and DDVE within AWS tour
Recommended Architecture within the Public Cloud
When deploying PPDM & DDVE in the cloud, a critical part is the network connectivity and security setup. My colleague Martin wrote some great blogs on the security aspect of this setup and why it is so important (Martin's Blog). The basic architecture for an install should look similar to the diagram below.
PPDM & DDVE architecture diagram
One thing I have come to realise as I've been learning is that the manuals and guides available on both DELL's and AWS's sites are short on detail when it comes to the network and security setup. Remember that the above architecture must be in place before you can even begin deploying DDVE or PPDM from the AWS Marketplace. There tend to be diagrams, but the rest is bullet points. For example:
The Deployment Guide for both PPDM and DDVE has the following detail:
"Set up the network environment. For secure access to the PowerProtect Data Manager on AWS, it is recommended that you use the Virtual Private Cloud (VPC) architecture provided by AWS. Set up and configure the following components:
● The VPC
● A subnet
● Routing tables
● Security groups
● A network access control"
And... that's about it on detail!!
You do have to build the networking and security foundations before any deployment can take place, so you need to consider the following:
Choosing the Right Subnet Architecture
The foundation of a secure VPC architecture lies in selecting either a public or private subnet. For the deployment of both PPDM and DDVE, it is highly advisable to opt for the private subnet architecture so the VMs are isolated from the open internet. One critical aspect is to eliminate any direct exposure of PPDM and DDVE to the public internet through public IP addresses. By doing so, we reduce the risk of potential attacks from malicious entities. The Installation and Administration Guide for DDVE mentions: "Do not configure public IP for DDVE in AWS if possible".
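To make the public/private split concrete, here is a minimal sketch that sanity-checks an address plan before anything is built. The CIDR ranges and the `validate_plan` helper are illustrative assumptions, not the values used in the demo.

```python
import ipaddress

# Hypothetical address plan; these CIDR ranges are illustrative assumptions.
vpc = ipaddress.ip_network("10.0.0.0/16")
plan = {
    "public": ipaddress.ip_network("10.0.1.0/24"),   # jump host
    "private": ipaddress.ip_network("10.0.2.0/24"),  # PPDM and DDVE
}


def validate_plan(vpc, subnets):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} is outside the VPC range")
        if not net.is_private:
            problems.append(f"{name} is not private address space")
    names = list(subnets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if subnets[a].overlaps(subnets[b]):
                problems.append(f"{a} overlaps {b}")
    return problems


problems = validate_plan(vpc, plan)
print(problems or "address plan looks consistent")
```

Note that what makes a subnet "public" or "private" in AWS is its route table (whether it has a route to an internet gateway), not the address range itself; the check above only catches planning mistakes.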
Leveraging VPC Components for Security
In the private subnet, we fortify PPDM and DDVE with the use of VPC components, including route tables, access control lists (ACLs), and security groups. These components work in unison to create a security perimeter around our VMs, protecting them from unauthorized access and potential cyber threats.
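As a rough illustration of how a security group behaves, the sketch below models it as an allow-list: traffic is permitted only when a rule matches, and everything else is dropped. The `IngressRule` class, the jump host subnet, and the ports shown (443 and 22) are assumptions for the example; check the PPDM and DDVE guides for the real port requirements.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One allow rule, loosely modelled on a security group ingress entry."""
    port: int
    source_cidr: str


# Assumption: the jump host sits in 10.0.1.0/24 and reaches the appliances
# over HTTPS (443) and SSH (22). These are example values only.
APPLIANCE_RULES = [
    IngressRule(port=443, source_cidr="10.0.1.0/24"),
    IngressRule(port=22, source_cidr="10.0.1.0/24"),
]


def is_allowed(rules, source_ip, port):
    """Return True only if some rule matches both the port and the source."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.port == port and addr in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


print(is_allowed(APPLIANCE_RULES, "10.0.1.25", 443))     # from the jump host
print(is_allowed(APPLIANCE_RULES, "203.0.113.10", 443))  # from the internet
```

Network ACLs differ in that they are stateless and evaluated at the subnet boundary, so this model covers only the security group layer.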
Storage - Facilitating Object Store Connectivity
Backup data is stored in an Amazon Simple Storage Service (S3) bucket, while the backup metadata is stored on EBS volumes attached to DDVE. There is a control path between PPDM and the hosts being backed up, and a data path between DDVE and the hosts.
S3 vs EBS
In case you're new to AWS, Amazon Simple Storage Service, more commonly known as S3, is an object storage service designed to store and retrieve vast amounts of unstructured data, such as documents, images, videos, and backups. Amazon Elastic Block Store, or EBS, on the other hand, offers block-level storage volumes that are intended for use with Amazon Elastic Compute Cloud (EC2) instances. It provides persistent storage, acting as a virtual hard disk for EC2 instances. This makes EBS suitable for applications that demand real-time data updates and frequent read/write operations.
AWS Network and Security Configuration
We'll go back now to before the PPDM and DDVE setup was completed, to the infrastructure that needed to be built before any deployment could take place. I've written down the steps for creating the network and security foundation ahead of deploying PPDM or DDVE. I have also described and shown these steps within the demo below. Watch the demo first, then have a look at the flow charts further on.
AWS Network and Security Setup: pre PPDM and DDVE Deployment
As you saw in the demo above, I've broken the steps into a number of procedures and drawn them as flow charts. I separated the procedures into 3 phases as follows:
Phase 1: VPC Network and Security
We need to build the following components:
Internet Gateway (IGW)
AWS Network and Security Configuration
Phase 2: Build the S3 storage and bucket configuration.
DDVE requires object storage for backup and archiving, so it uses Amazon S3 as the target for backups and archives. An interesting point, which we will explore further in a future blog, is the deduplication and compression capabilities of DDVE, which help reduce the amount of data transferred to and stored in the S3 bucket, reducing overall storage costs and network usage within AWS. Another key use case is replicating backup data to Amazon S3, providing an off-site storage solution for disaster recovery scenarios. We'll create an S3 bucket as shown below and then configure access from DDVE.
AWS S3 Configuration
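If you would rather script the bucket than click through the console, a minimal sketch of the request parameters might look like this. It assumes the AWS SDK for Python (boto3), a hypothetical region, and that you want all public access blocked in keeping with the private design; none of these come from the demo itself. The code only builds the parameter dictionaries and does not call AWS.

```python
def bucket_request_params(bucket_name, region):
    """Build keyword arguments in the shape boto3's S3 client expects."""
    create = {"Bucket": bucket_name}
    # us-east-1 is the default location and takes no LocationConstraint.
    if region != "us-east-1":
        create["CreateBucketConfiguration"] = {"LocationConstraint": region}

    # Assumption: block every form of public access on the backup bucket.
    block_public = {
        "Bucket": bucket_name,
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    }
    return create, block_public


# "eu-west-1" is a hypothetical region; use the region your DDVE runs in.
create, block_public = bucket_request_params("s3bucketprotectdemo", "eu-west-1")
print(create)
print(block_public)
```

With boto3 installed and credentials configured, these would be passed as `s3.create_bucket(**create)` and `s3.put_public_access_block(**block_public)`.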
In AWS, Identity and Access Management (IAM) allows you to manage user identities and their permissions to securely control access to your AWS resources. IAM Policies and Roles are key components of AWS IAM and are used to define and control permissions. This part of the design is based on the principle of least privilege and secure access control.
So now we need to create the IAM policies and Roles within AWS to define the security and access before we can begin deploying PPDM and DDVE and any other EC2 instances. Without the IAM policies and Roles in place, installations and later functions/access will fail.
AWS IAM Policies and Roles + Endpoint creation.
IAM Policies are JSON documents that define permissions. These policies are attached to IAM users, groups, or roles. Each policy contains one or more statements, and each statement specifies a set of permissions along with the resources those permissions apply to.
A policy statement includes:
Effect: Specifies whether the statement allows or denies access ("Allow" or "Deny").
Action: Lists the actions or operations that are allowed or denied.
Resource: Specifies the AWS resources (e.g., EC2 instances, S3 buckets) to which the actions apply.
Policies can be attached directly to IAM users or groups, granting them specific permissions. They can also be attached to IAM roles, which can then be assumed by trusted entities like AWS services or EC2 instances. Below is the JSON policy I've used within my setup:
This is to set up S3 access for DDVE. To access the S3 bucket, create the Identity and Access Management (IAM) role and attach it to DDVE. I've copied the JSON below and attached the file if you want to have a look or use it. You'll need to change the S3 bucket name to your own bucket.
Let's have a look at the policy line by line and how it defines permissions for specific actions on the Amazon S3 bucket "s3bucketprotectdemo". I've put the JSON in the left-hand box, with a line-by-line explanation on the right-hand side:
AWS IAM Policy and Explanation
This IAM policy allows the following actions for the specified S3 bucket and its objects:
List the objects in the bucket.
Get (read), Put (write/upload), and Delete objects within the bucket.
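For readers who cannot see the image, here is a sketch of a policy that matches the description above, generated with Python so the bucket name is easy to swap. It is a reconstruction from the listed actions, not a copy of the exact file used in the demo, and the two-statement layout is my assumption.

```python
import json

BUCKET = "s3bucketprotectdemo"  # replace with your own bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing applies to the bucket itself.
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}"],
        },
        {
            # Object-level actions apply to the keys inside the bucket.
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Splitting bucket-level and object-level actions into separate statements keeps each action paired with the resource type it applies to, which makes the policy easier to audit.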
IAM Roles are used to delegate permissions to services or applications running on AWS resources.
Roles are useful in scenarios like:
Allowing an EC2 instance to access other AWS services securely without embedding credentials in the instance.
Granting cross-account access, where users or services in one AWS account need to access resources in another account.
Enabling AWS services to access specific resources, such as Lambda functions accessing S3 buckets.
To create a role, you first define the permissions using an IAM policy which we created above, and then you specify which AWS service or user can assume that role. When a service or user assumes a role, they temporarily inherit the permissions associated with that role. The Role will have policies attached to define permissions. For example, you might have a role with an S3 read-only policy attached, allowing an EC2 instance to read from an S3 bucket.
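The "who can assume the role" part is expressed as a trust policy on the role. A minimal sketch for the EC2 case follows; the `ec2_trust_policy` helper is a hypothetical name, and the demo's roles may have been created differently (for example by CloudFormation).

```python
import json


def ec2_trust_policy():
    """Trust policy letting the EC2 service assume the role for an instance."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust = ec2_trust_policy()
print(json.dumps(trust, indent=2))
```

The trust policy only controls who may assume the role. What the role can then do comes from the permissions policies attached to it, such as the S3 policy for DDVE.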
Below you can see the Roles I've built, with the appropriate policies I created earlier attached to them. Some of these Roles were also built by the system during deployment when using CloudFormation.
IAM Roles within my demo setup
The Amazon S3 endpoint in AWS enables secure communication from the private subnet in our demo use case to the S3 bucket assigned to DDVE, without needing to use the public internet. Essentially, an S3 endpoint provides a direct and private connection to S3 storage, which is important in terms of security. Without an endpoint, resources within your VPC reach S3 through its public endpoints, which requires a path out of the VPC such as an internet gateway or NAT device. An S3 bucket is not bound to a specific VPC; it exists at the regional level. This means the bucket's service endpoint is reachable from anywhere on the internet, so access to the data depends on the IAM and bucket policies you configure. While S3 buckets themselves aren't located within a VPC, a VPC endpoint allows your VPC resources to access S3 privately, so communication stays within the AWS network and avoids exposure to the public internet.
AWS Endpoint within our Demo setup
Apart from the security aspect, there are a number of additional benefits to using an endpoint to connect to the S3 bucket. As the data transfer to S3 does not leave the AWS network, it can result in lower latency and improved overall performance, and as you are using AWS's internal network it can also mean lower transfer costs compared to sending the data over the internet. It's also a very straightforward configuration within AWS. There is no requirement to build NAT or anything similar, which can be complex.
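To show how little configuration is involved, here is a sketch of the parameters for creating the endpoint with the AWS SDK for Python (boto3). It assumes a gateway-type endpoint, and the VPC ID, route table ID, and region are hypothetical placeholders. The code only builds the parameter dictionary and does not call AWS.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build keyword arguments in the shape of EC2's CreateVpcEndpoint call."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # A gateway endpoint works by adding a route for S3 to these tables.
        "RouteTableIds": list(route_table_ids),
    }


params = s3_gateway_endpoint_params(
    vpc_id="vpc-0123456789abcdef0",             # hypothetical ID
    region="eu-west-1",                         # hypothetical region
    route_table_ids=["rtb-0123456789abcdef0"],  # private subnet's route table
)
print(params)
```

With boto3 installed, this would be passed as `ec2.create_vpc_endpoint(**params)`. Associating the endpoint with the private subnet's route table is what lets DDVE reach the bucket without a NAT device.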
That's probably enough for now!! I hope you enjoyed this first part. In the next part of this series I'm going to discuss the deployment experience and demo the deployment, as well as any issues I experienced. I'll also discuss:
Finding the appliances within the AWS Marketplace
and then the deployment
I am Faisal Choudry, the author of this article, and I work for DELL.
My expertise lies in Data Protection and Recovery implementations. I specialize in fortifying systems against Disasters and Cyber attacks. I have actively participated in orchestrating numerous comprehensive Disaster Recovery (DR) implementations and failover testing, all of which have undergone audits.
My involvement extends beyond theory, as I've been intricately engaged in the engineering, development, and initial rollouts of various products. Notably, I've contributed to the evolution of products like VxRail and VCF on VxRail, alongside vBlock and SAP HANA. This background has enabled me to cultivate a profound understanding of the intricacies within the realm of Data Protection, Cybersecurity, and cutting-edge product development.
It's important to note that the views expressed in this article are my own and do not necessarily represent those of the company.