CoffeiNerd - Tech tips and articles

Monday, July 31, 2017

AWS RDS: Changing the Subnet Group of an RDS Instance Within the Same VPC

Today I ran into an interesting issue. I use Elastic Beanstalk intensively for one of my customers. At some point in the past a colleague created an RDS instance that was not created by Elastic Beanstalk itself, but he used the Subnet Group from an automatically managed EB stack.

The point is that the EB environment needs to be terminated, but it cannot be cleaned up because the Subnet Group is attached to another entity, in this case the manually created RDS instance.

You may ask: why not create a new Subnet Group and then modify the RDS instance?

I tried that, but then the situation got funny. First, the option to change the Subnet Group was not even available in the Web Console, as you can see in the figure below.

So why is it not there? Simply because the option is NOT available if you are running a Multi-AZ RDS instance. To proceed you need to disable Multi-AZ. This change does not incur downtime, as stated here:

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.DBInstance.Modifying.html

To change the Multi-AZ option, set Multi-AZ to No and check the box Apply changes immediately.

OK, now the option is available, as I can see below:

As the title says, the goal is to change the Subnet Group within the same VPC, which is NOT SUPPORTED. When you change the option and press Modify you get the nice error below.

If you are moving to a Subnet Group in another VPC you should succeed, but not within the same VPC. As usual, though, there are workarounds.

The one I will use is to create a Subnet Group in another VPC, move the instance there (DOWNTIME WILL HAPPEN) and then move it back to the right Subnet Group in the VPC where you need it.

It can take about 10-15 minutes to move to the new VPC. During this time your RDS instance is unavailable and the status “moving-to-vpc” appears.

When it is finished and the status is available again, you can modify the instance and select the Subnet Group you wanted from the beginning. Another 10-15 minutes of downtime and you are done.

Important: you also need to set the right Security Group when moving back, as it changes when you move to another VPC.
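I did all of this in the Web Console, but the same workaround could probably be scripted. Below is a minimal sketch, assuming boto3 and its RDS client (modify_db_instance and the db_instance_available waiter). The function names, the subnet group names and the security group IDs are all hypothetical, and I have not run this against a real instance.

```python
def plan_subnet_group_swap(instance_id, temp_subnet_group, temp_security_groups,
                           target_subnet_group, target_security_groups):
    """Build the keyword arguments for each modify_db_instance call, in order."""
    common = {"DBInstanceIdentifier": instance_id, "ApplyImmediately": True}
    return [
        # Step 1: disable Multi-AZ so the subnet group becomes editable.
        {**common, "MultiAZ": False},
        # Step 2: move to a temporary subnet group in another VPC (downtime).
        {**common,
         "DBSubnetGroupName": temp_subnet_group,
         "VpcSecurityGroupIds": list(temp_security_groups)},
        # Step 3: move back to the wanted subnet group and restore the
        # security groups, which changed when the instance left the VPC.
        {**common,
         "DBSubnetGroupName": target_subnet_group,
         "VpcSecurityGroupIds": list(target_security_groups)},
    ]


def run_plan(rds_client, plan):
    """Apply each step and wait until the instance is 'available' again."""
    waiter = rds_client.get_waiter("db_instance_available")
    for kwargs in plan:
        rds_client.modify_db_instance(**kwargs)
        # Assumption: the status has already left 'available' when the waiter
        # starts polling. A real script may need a short pause before waiting.
        waiter.wait(DBInstanceIdentifier=kwargs["DBInstanceIdentifier"])
```

With boto3 installed and credentials configured, the call would be something like run_plan(boto3.client("rds"), plan_subnet_group_swap("mydb", "tmp-group", ["sg-tmp"], "final-group", ["sg-final"])). Keeping the plan separate from the execution lets you review the three steps before any downtime starts.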

That's it. Hope this helps someone.

Tuesday, May 23, 2017

Amazon AI - Some Notes and Best Practices from Webinar

Webinar Amazon AI

Today I attended a live webinar covering the AWS AI offering, with a deeper focus on deep learning.

Here are some basic notes/screenshots I took from the webinar:

The Amazon AI platform takes a layered approach: from high-level, ready-to-consume services with powerful features but limited control over fine-tuning and algorithms, down to raw building blocks for complex, self-developed AI workloads, which is where Deep Learning currently resides.

Apache MXNet: deep learning engine recommended and heavily developed by AWS
Greengrass: hub-and-spoke model, seen as a high-potential IoT platform
AI solution categories: API-based or do-it-yourself

Walkthrough on some services

POLLY

Text to speech with good quality.

LEX

The Advent of Conversational Interactions: evolution of human-computer interactions

1. Machine-oriented interactions (punch cards: you have to understand the machine)
2. Control-oriented and translated (you command the interaction)
3. Intent-oriented: you expect the computer to understand human interactions

For the third kind there is Amazon Lex: voice or text bots

Example architecture / deployment: hotel / flight booking platform

Use API Gateway and Lambda to securely communicate with backend

Amazon Rekognition

Image recognition service for 4 use cases:

Amazon ML

Gives you a number/prediction based on historical data
Uses regression models to predict a specific number, or binary classification to predict one of two outcomes
Limited to 3 ML models.

Recommended to watch:

DEEP LEARNING

AI is an old subject, in fact one of the oldest topics in Computer Science. It has been discussed since Lady Lovelace's first computing work in the pre-computer era of the 19th century, and it was advanced by Alan Turing with his Turing test. It was a strong research topic in the early days of computing, back in the 50s and 60s, but was then neglected because its promises did not materialize.

This was the case until less than a decade ago, when a mix of factors contributed to an explosion in AI development, more specifically in Machine Learning and, more recently, Deep Learning.

The slide below shows the factors that contributed to this explosion and consequent realization of several tasks that were envisioned but not possible in a not so distant past:

Data availability: Deep Learning requires a huge amount of data for its learning/evaluation which just became available with the Internet explosion and the data growth in the last decades.
Programming Models: distributed computing, clustering and shared nothing programming models and subsequent frameworks (MapReduce, for example) allowed the reduction of complexity for ML/DL problems.
Algorithms: better and faster algorithms
Processing Power: GPUs and affordable pay-as-you-go hardware.

Examples/Notes:

Autonomous computing is a long-envisioned area of computing that is gaining strong momentum with Deep Neural Networks (Deep Learning), for example:

Autonomous cars
Autonomous drones/flight
Computational Vision

How models are trained

There is no high-level service for DL; it requires GPU-intensive instances with DL frameworks
P2 instances with thousands of cores

AWS provides a Deep Learning AMI:

CloudFormation template,
Containers
Or EC2 Image
Included frameworks: MXNet, TensorFlow, Theano, Caffe, Torch

Problems that DL can solve:

Traditional ML algorithms detect only Chihuahuas in the image below:

Some traditional image classification methods (which fail to classify the image above properly):

Short abstraction: applying linear algebra, an image is a matrix of color values as seen by the computer. The differences between a test image and each training image are the measure of similarity, and the test image gets the label of the closest training image. This is the nearest neighbor classifier.
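To make the nearest neighbor idea concrete, here is a minimal sketch in plain Python. It was not shown in the webinar. It assumes images are flattened into lists of pixel values and uses the sum of absolute differences as the distance, which is one common choice; the two tiny "images" are made up.

```python
def l1_distance(image_a, image_b):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b))


def nearest_neighbor(test_image, training_set):
    """Return the label of the training image closest to test_image."""
    best_label, _ = min(
        ((label, l1_distance(test_image, pixels)) for label, pixels in training_set),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical 2x2 grayscale images, flattened to four pixel values each.
training_set = [
    ("dark", [10, 20, 15, 5]),
    ("bright", [240, 250, 230, 245]),
]

print(nearest_neighbor([30, 25, 10, 0], training_set))  # prints "dark"
```

The classifier only compares raw pixel values, which is why it struggles with images that look similar pixel by pixel but show different things.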

Linear classification also uses a function, plus a threshold above which the classifier assigns the category. If you lower the threshold too much, you catch more images of the category but also get a high number of false positives. For example, the boat would be classified as a plane by the airplane classifier if the threshold is lowered.
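The threshold trade-off can be sketched with a toy linear classifier. This is only an illustration under made-up assumptions: the two features, the weights and the five "images" are hypothetical and do not come from the webinar.

```python
def linear_score(weights, bias, features):
    """Linear function: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def count_outcomes(scores, is_plane, threshold):
    """Count true and false positives when 'score >= threshold' means plane."""
    true_pos = sum(1 for s, t in zip(scores, is_plane) if s >= threshold and t)
    false_pos = sum(1 for s, t in zip(scores, is_plane) if s >= threshold and not t)
    return true_pos, false_pos


# Hypothetical (wing-likeness, hull-likeness) features for:
# plane, plane, boat, plane, car.
images = [(5, 1), (4, 2), (3, 2), (2, 1), (1, 3)]
is_plane = [True, True, False, True, False]
scores = [linear_score([2, -1], 0, features) for features in images]

print(scores)                               # [9, 6, 4, 3, -1]
print(count_outcomes(scores, is_plane, 5))  # (2, 0)
print(count_outcomes(scores, is_plane, 3))  # (3, 1): the boat is now a "plane"
```

Lowering the threshold from 5 to 3 finds the third plane, but the boat slips in as a false positive.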

The solution is a mix of multiple filtering algorithms and a deep neural network with multiple hidden layers (each one with a specific classifier) to identify the image.

ML and DL recommendations

Best Practices:

Build your DATA LAKE (S3 as data lake)
Clean/pre-process the data with Lambda (serverless functions)
For ML / DL models:

Create a test environment for model evaluation and testing
The resulting accepted test and training sets are saved in the Prod S3 bucket
Create a Prod environment and feed it with the evaluated training models from the Prod S3 bucket
It is like an A/B deployment for ML/DL

Thursday, July 28, 2016

AWS CSA: Professional - Core Services - Compute: ECS and ECR

In this post I will briefly cover ECS and ECR, the AWS services for automated deployment of Docker containers.

Both services are tightly related, as ECR stores the images and manages the deployment and permissions on the Docker repositories, and ECS is a scalable EC2-based cluster service to run and scale Docker containers.

ECR - Elastic Container Registry

Description

As the official AWS docs say:
Amazon EC2 Container Registry (Amazon ECR) is a managed AWS Docker registry service that is secure, scalable, and reliable. Amazon ECR supports private Docker repositories with resource-based permissions using AWS IAM so that specific users or Amazon EC2 instances can access repositories and images. Developers can use the Docker CLI to push, pull, and manage images.

Components

Amazon ECR contains the following components:

Registry: An Amazon ECR registry is provided to each AWS account; you can create image repositories in your registry and store images in them.
Authorization token: Your Docker client needs to authenticate to Amazon ECR registries as an AWS user before it can push and pull images. The AWS CLI get-login command provides you with authentication credentials to pass to Docker.
Repository: An Amazon ECR image repository contains your Docker images.
Repository policy: You can control access to your repositories and the images within them with repository policies.
Image: You can push and pull Docker images to your repositories. You can use these images locally on your development system, or you can use them in Amazon ECS task definitions.

Registry Concepts

You can use Amazon ECR registries to host your images in a highly available and scalable architecture, allowing you to deploy containers reliably for your applications. You can use your registry to manage image repositories and Docker images. Each AWS account is provided with a single (default) Amazon ECR registry.

• The URL for your default registry is https://aws_account_id.dkr.ecr.us-east-1.amazonaws.com.

• By default, you have read and write access to the repositories and images you create in your default registry.

• You can authenticate your Docker client to a registry so that you can use the docker push and docker pull command to push and pull images to and from the repositories in that registry.

• Repositories can be controlled with both IAM user access policies and repository policies.

You can manage your repositories through the CLI, the API or the Management Console, but for some image-related actions you will prefer the Docker CLI. The Docker CLI does not authenticate to AWS by default, so you need the get-login command from the AWS CLI to get a Docker-compatible auth string.
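What get-login gives you is a docker login command built from an authorization token. As a minimal sketch of that step in Python: to my understanding the token is the Base64 encoding of "AWS:<password>", and with boto3 the token and endpoint would come from the get_authorization_token call (fields authorizationToken and proxyEndpoint). The helper name and the sample values below are made up.

```python
import base64


def docker_login_command(authorization_token, proxy_endpoint):
    """Turn an ECR authorization token into a `docker login` argument list."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # The decoded token has the form "<username>:<password>".
    username, password = decoded.split(":", 1)
    return ["docker", "login", "-u", username, "-p", password, proxy_endpoint]


# Fake token for illustration only; a real one comes from AWS and expires.
fake_token = base64.b64encode(b"AWS:example-password").decode("ascii")
endpoint = "https://123456789012.dkr.ecr.us-east-1.amazonaws.com"

print(" ".join(docker_login_command(fake_token, endpoint)))
```

Returning a list instead of a string makes it easy to pass the command to subprocess.run without shell quoting problems.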

Repository Concepts

Amazon ECR provides API operations to create, monitor, and delete repositories and set repository permissions that control who can access them. You can perform the same actions in the Repositories section of the Amazon ECS console. Amazon ECR also integrates with the Docker CLI allowing you to push and pull images from your development environments to your repositories.

By default, you have read and write access to the repositories you create in your default registry (aws_account_id.dkr.ecr.us-east-1.amazonaws.com).
Repository names can support namespaces, which you can use to group similar repositories. For example if there are several teams using the same registry, Team A could use the team-a namespace while Team B uses the team-b namespace. Each team could have their own image called web-app, but because they are each prefaced with the team namespace, the two images can be used simultaneously without interference. Team A's image would be called team-a/web-app, while Team B's image would be called team-b/web-app.
Repositories can be controlled with both IAM user access policies and repository policies.
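The registry URL and the namespace convention combine into the full image name that you pass to docker push and docker pull. A small sketch, where the helper name and the account ID are placeholders:

```python
def image_uri(account_id, region, namespace, repository, tag="latest"):
    """Build the full name of an image in an account's default ECR registry."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{namespace}/{repository}:{tag}"


# Both teams can own a "web-app" image without clashing.
print(image_uri("123456789012", "us-east-1", "team-a", "web-app"))
print(image_uri("123456789012", "us-east-1", "team-b", "web-app"))
```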

Images

Amazon ECR stores Docker images in image repositories. You can use the Docker CLI to push and pull images from your repositories.

Important: Amazon ECR users require permissions to call ecr:GetAuthorizationToken before they can authenticate to a registry and push or pull any images from any Amazon ECR repository.

Using ECR images with ECS

You can use your Amazon ECR images with Amazon ECS, but you need to satisfy some prerequisites:

• Your container instances must be using at least version 1.7.0 of the Amazon ECS container agent. The latest version of the Amazon ECS-optimized AMI supports Amazon ECR images in task definitions.
• The Amazon ECS container instance role (ecsInstanceRole) that you use with your container instances must possess the following IAM policy permissions for Amazon ECR.
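The permission list did not make it into these notes. As a hedged sketch, the policy below uses the four read-only ECR actions that, as far as I recall, the AWS documentation listed for pulling images; please check the current docs before using it. The function name is made up.

```python
import json


def ecr_pull_policy():
    """IAM policy document that lets container instances pull ECR images."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action list, recalled from the AWS docs; verify it.
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:BatchGetImage",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:GetAuthorizationToken",
                ],
                "Resource": "*",
            }
        ],
    }


print(json.dumps(ecr_pull_policy(), indent=2))
```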

Pricing

You pay only for the storage used by your images.
Data transfer IN is free of charge.
Data transfer OUT is charged in tiers according to the amount of data transferred.

Service Limits

When to use ECR?

If you already have Docker images or use Docker for your applications, you benefit from image storage, solid security control, automated deployment and integration with ECS.

ECS - Elastic Container Service

<to be continued>

Thursday, July 21, 2016

AWS CSA: Professional - Core Services - Compute: EC2

Definition:

Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the Amazon Web Services (AWS) cloud.

Warning: the Pro exam does not focus on the deep details of each service, but on how you can use the "pieces" to build an architecture on the AWS Cloud.

Features of Amazon EC2

Amazon EC2 provides the following features:

Virtual computing environments, known as instances
Preconfigured templates for your instances, known as Amazon Machine Images (AMIs), that package the bits you need for your server (including the operating system and additional software)
Various configurations of CPU, memory, storage, and networking capacity for your instances, known as instance types
Secure login information for your instances using key pairs (AWS stores the public key, and you store the private key in a secure place)
Storage volumes for temporary data that's deleted when you stop or terminate your instance, known as instance store volumes
Persistent storage volumes for your data using Amazon Elastic Block Store (Amazon EBS), known as Amazon EBS volumes
Multiple physical locations for your resources, such as instances and Amazon EBS volumes, known as regions and Availability Zones
A firewall that enables you to specify the protocols, ports, and source IP ranges that can reach your instances using security groups
Static IP addresses for dynamic cloud computing, known as Elastic IP addresses
Metadata, known as tags, that you can create and assign to your Amazon EC2 resources
Virtual networks you can create that are logically isolated from the rest of the AWS cloud, and that you can optionally connect to your own network, known as virtual private clouds (VPCs)

Instance Types:

Type	Class	Characteristics	Use Cases
T2	General Purpose	High Frequency Intel Xeon Processors with Turbo up to 3.3GHz; Burstable CPU, governed by CPU Credits, and consistent baseline performance; Lowest-cost general purpose instance type, and Free Tier eligible (t2.micro only); Balance of compute, memory, and network resources	Development environments, build servers, code repositories, low-traffic websites and web applications, micro services, early product experiments, small databases.
M4	General Purpose	2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors; EBS-optimized by default at no additional cost; Support for Enhanced Networking; Balance of compute, memory, and network resources	Small and mid-size databases, data processing tasks that require additional memory, caching fleets, and for running backend servers for SAP, Microsoft SharePoint, cluster computing, and other enterprise applications.
M3	General Purpose	High Frequency Intel Xeon E5-2670 v2 (Ivy Bridge) Processors; SSD-based instance storage for fast I/O performance; Balance of compute, memory, and network resources	Small and mid-size databases, data processing tasks that require additional memory, caching fleets, and for running backend servers for SAP, Microsoft SharePoint, cluster computing, and other enterprise applications.
C4	Compute Optimized	High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2; EBS-optimized by default and at no additional cost; Ability to control processor C-state and P-state configuration on the c4.8xlarge instance type; Support for Enhanced Networking and Clustering	Same as C3
C3	Compute Optimized	High Frequency Intel Xeon E5-2680 v2 (Ivy Bridge) Processors; Support for Enhanced Networking; Support for clustering; SSD-backed instance storage	High performance front-end fleets, web-servers, batch processing, distributed analytics, high performance science and engineering applications, ad serving, MMO gaming, and video-encoding.
X1	Memory Optimized	High Frequency Intel Xeon E7-8880 v3 (Haswell) Processors; Lowest price per GiB of RAM; 1,952 GiB of DDR4-based instance memory; SSD Storage and EBS-optimized by default and at no additional cost; Ability to control processor C-state and P-state configuration	We recommend X1 instances for running in-memory databases like SAP HANA, big data processing engines like Apache Spark or Presto, and high performance computing (HPC) applications. X1 instances are certified by SAP to run Business Warehouse on HANA (BW), Data Mart Solutions on HANA, Business Suite on HANA (SoH), and the next-generation Business Suite S/4HANA in a production environment on the AWS cloud.
R3	Memory Optimized	High Frequency Intel Xeon E5-2670 v2 (Ivy Bridge) Processors; SSD Storage; Support for Enhanced Networking	We recommend R3 instances for high performance databases, distributed memory caches, in-memory analytics, genome assembly and analysis, Microsoft SharePoint, and other enterprise applications.
G2	GPU	High Frequency Intel Xeon E5-2670 (Sandy Bridge) Processors; High-performance NVIDIA GPUs, each with 1,536 CUDA cores and 4GB of video memory; Each GPU features an on-board hardware video encoder designed to support up to eight real-time HD video streams (720p@30fps) or up to four real-time full HD video streams (1080p@30fps); Support for low-latency frame capture and encoding for either the full operating system or select render targets, enabling high-quality interactive streaming experiences	3D application streaming, machine learning, video encoding, and other server-side graphics or GPU compute workloads.
I2	Storage Optimized	High Frequency Intel Xeon E5-2670 v2 (Ivy Bridge) Processors; SSD Storage; Support for TRIM; Support for Enhanced Networking; High Random I/O performance	NoSQL databases like Cassandra and MongoDB, scale out transactional databases, data warehousing, Hadoop, and cluster file systems.
D2	Storage Optimized	D2 instances feature up to 48 TB of HDD-based local storage, deliver high disk throughput, and offer the lowest price per disk throughput performance on Amazon EC2.	Massively Parallel Processing (MPP) data warehousing, MapReduce and Hadoop distributed computing, distributed file systems, network file systems, log or data-processing applications

Networking and storage features:

#	VPC only	EBS only	SSD volumes	Placement group	HVM only	Enhanced networking
C3			Yes	Yes		Intel 82599 VF
C4	Yes	Yes		Yes	Yes	Intel 82599 VF
D2				Yes	Yes	Intel 82599 VF
G2			Yes	Yes	Yes
I2			Yes	Yes	Yes	Intel 82599 VF
M3			Yes
M4	Yes	Yes		Yes	Yes	Intel 82599 VF
R3			Yes	Yes	Yes	Intel 82599 VF
T2	Yes	Yes			Yes
X1	Yes		Yes	Yes	Yes	ENA

Pricing

What is important to know are the 3 basic pricing types:

On-demand - pay as you go. Good for occasional usage, testing, etc.
Reserved - discounts with partial, all or no upfront payment, in exchange for a 1- or 3-year commitment. Good for long-running 24x7 systems.
Spot - like the stock market: you name the price you are willing to pay, and when the market price reaches it you get the instances. When the price goes above it, you lose them. Good for batch processing and workflow-based apps.
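The Spot behavior can be shown with a tiny simulation. It is a simplified model under my own assumptions: one price per hour, and you hold the instance whenever the market price is at or below your bid. The prices are invented, not real AWS prices.

```python
def spot_hours(bid, hourly_prices):
    """Return, per hour, whether the bid is high enough to hold the instance."""
    return [price <= bid for price in hourly_prices]


# Hypothetical hourly spot prices in USD.
prices = [0.03, 0.04, 0.08, 0.05, 0.02]
running = spot_hours(0.05, prices)

print(running)       # [True, True, False, True, True]
print(sum(running))  # 4 hours with an instance, 1 hour interrupted
```

The interruption in the third hour is the reason Spot fits batch jobs that can be resumed, and not systems that must stay up.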

AMIs

The following diagram summarizes the AMI lifecycle. After you create and register an AMI, you can use it to launch new instances. (You can also launch instances from an AMI if the AMI owner grants you launch permissions.) You can copy an AMI to the same region or to different regions. When you are finished launching instances from an AMI, you can deregister the AMI.

The AMI lifecycle (create, register, launch, copy, deregister).

Network and Security

This is a broad topic. I suggest that you review the features as stated in the documentation:

Amazon EC2 provides the following network and security features.

Features

Elastic Load Balancing

Elastic Load Balancing provides the following features:

You can use the operating systems and instance types supported by Amazon EC2. You can configure your EC2 instances to accept traffic only from your load balancer.
You can configure the load balancer to accept traffic using the following protocols: HTTP, HTTPS (secure HTTP), TCP, and SSL (secure TCP).
You can configure your load balancer to distribute requests to EC2 instances in multiple Availability Zones, minimizing the risk of overloading one single instance. If an entire Availability Zone goes offline, the load balancer routes traffic to instances in other Availability Zones.
There is no limit on the number of connections that your load balancer can attempt to make with your EC2 instances. The number of connections scales with the number of concurrent requests that the load balancer receives.
You can configure the health checks that Elastic Load Balancing uses to monitor the health of the EC2 instances registered with the load balancer so that it can send requests only to the healthy instances.
You can use end-to-end traffic encryption on those networks that use secure (HTTPS/SSL) connections.
[EC2-VPC] You can create an Internet-facing load balancer, which takes requests from clients over the Internet and routes them to your EC2 instances, or an internal-facing load balancer, which takes requests from clients in your VPC and routes them to EC2 instances in your private subnets. Load balancers in EC2-Classic are always Internet-facing.
[EC2-Classic] Load balancers for EC2-Classic support both IPv4 and IPv6 addresses. Load balancers for a VPC do not support IPv6 addresses.
You can monitor your load balancer using CloudWatch metrics, access logs, and AWS CloudTrail.
You can associate your Internet-facing load balancer with your domain name. Because the load balancer receives all requests from clients, you don't need to create and manage public domain names for the EC2 instances to which the load balancer routes traffic. You can point the instance's domain records at the load balancer instead and scale as needed (either adding or removing capacity) without having to update the records with each scaling activity.

Auto Scaling

The following table describes the key components of Auto Scaling.

Groups	Your EC2 instances are organized into groups so that they can be treated as a logical unit for the purposes of scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances. For more information, see Auto Scaling Groups.
Launch configurations	Your group uses a launch configuration as a template for its EC2 instances. When you create a launch configuration, you can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances. For more information, see Launch Configurations.
Scaling plans	A scaling plan tells Auto Scaling when and how to scale. For example, you can base a scaling plan on the occurrence of specified conditions (dynamic scaling) or on a schedule. For more information, see Scaling Plans.
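To see how the minimum, maximum and desired sizes of a group interact, here is a toy model in Python. It is not the Auto Scaling API: the class and method names are made up, and it only shows the idea that a scaling adjustment never takes the group outside its minimum and maximum.

```python
from dataclasses import dataclass


@dataclass
class AutoScalingGroup:
    """Toy model of a group's size settings (not the real AWS API)."""
    min_size: int
    max_size: int
    desired_capacity: int

    def scale_by(self, delta):
        """Apply a scaling adjustment, staying within [min_size, max_size]."""
        target = self.desired_capacity + delta
        self.desired_capacity = max(self.min_size, min(target, self.max_size))
        return self.desired_capacity


group = AutoScalingGroup(min_size=2, max_size=6, desired_capacity=3)
print(group.scale_by(+5))   # 6: capped at the maximum
print(group.scale_by(-10))  # 2: never below the minimum
```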

Conclusion

As I said at the beginning, this is a brief overview of the EC2 service and its main components. You should go deeper on some points, but for the Professional exam the goal is not to answer questions about the inner settings of each service, but to know how to combine them to build secure, cost-effective, elastic and scalable solutions.

AWS Certified Solutions Architect: Professional - An Introduction

So, in my last blog entry I started structuring the learning process around the AWS core services (Compute and Networking, Storage and CDN, Database, Application Services, Deployment and Management), beginning with the Compute services.

At the end I took a short break and announced that I would talk about AWS Security, more specifically the Shared Responsibility Model.

But wait. I realised that this approach was missing the big picture, so I restarted with the basics.

What do I need to study in order to pass? Which content? What weight does each topic have?

So, nothing better than going to the official AWS requirements for the cert:

AWS Knowledge:

AWS core services, including: Compute and Networking, Storage and CDN, Database, Application Services, Deployment and Management.
Security features that AWS provides and best practices
Able to design and implement for elasticity and scalability
Network technologies as they relate to AWS networking, including: DNS and load balancing, Amazon Virtual Private Cloud (VPC), and AWS Direct Connect
Storage and archival options
State management
Database and replication methodologies
Self-healing techniques and fault-tolerant services
Disaster Recovery and fail-over strategies
Application migration plans to AWS
Network connectivity options
Deployment and management

General IT Knowledge:

Large-scale distributed systems architecture
Eventual consistency
Relational and non-relational databases
Multi-tier architectures: load balancers, caching, web servers, application servers, networking and databases
Loose coupling and stateless systems
Content Delivery Networks
System performance tuning
Networking concepts including routing tables, access control lists, firewalls, NAT, HTTP, DNS, TCP/IP, OSI model
RESTful Web Services, XML, JSON
One or more software development models
Information and application security concepts including public key encryption, remote access, access credentials, and certificate-based authentication

As I said in the warning in the previous blog post, you need to have previous IT experience and knowledge of several IT topics as you can see in the General IT Knowledge section above.

Just as important is the weight of each topic in the exam:

As you can see, you will not be asked specific things about the services, such as the bucket naming rules, but you will need to know the services in order to build secure, elastic, reliable, highly available and cost-friendly solutions.

Pay attention to the Security, Scalability, Data Storage and High Availability topics, as they represent 65% of the total exam points.

Read the detailed domain areas and topics covered in the AWS cert content blueprint: http://d0.awsstatic.com/Train%20&%20Cert/docs/AWS_certified_solutions_architect_professional_blueprint.pdf

How will we structure the content? Following the "AWS Knowledge" section above, starting with Core Services, then focusing on Security and so on.

Tuesday, June 28, 2016

Brain Dump and Notes for AWS Architect Professional Certification - Introduction

Only what matters: first I will review the AWS services and what is really important to know for the exam.

Warning: it is not impossible, but it will be much harder, to achieve the Pro cert by only studying the website information or an online course. Hands-on experience is really key to mastering the exam. Previous IT experience also counts, as many decisions and questions are based on in-depth experience with IT environments and design decisions.

In this review I will cover the services and the AWS offering up to a certain level. At the bottom of each article the sources will be included for further review and reading. I strongly suggest you create an AWS account and experiment with the services.

We will start with the Compute Services, which are (as of July, 2016) the following:

All these services revolve around the core service, which is EC2. Some only work with EC2 directly, like ELB and Auto Scaling, while others cover network services (VPC), Docker-compatible containers (EC2 Container Registry and Service) and microservices/serverless applications (Lambda).

The next post will not go into the Compute details; instead I will briefly talk about Security on AWS to introduce important concepts like the Shared Responsibility Model and Security Best Practices on AWS.