Tuesday, May 23, 2017

Amazon AI - Some Notes and Best Practices from Webinar

Webinar Amazon AI

Today I attended a live webinar covering the AI offering of AWS, with a deeper focus on deep learning.

Here are some basic notes/screenshots I took from the webinar:

The Amazon AI platform takes a layered approach. At the top are high-level, ready-to-consume services with powerful features but limited control over fine-tuning and algorithms; at the bottom are raw building blocks for complex, self-developed AI workloads, which is where Deep Learning currently resides.
  • Apache MXNet: the deep learning engine recommended and actively developed by AWS
  • Greengrass: hub-and-spoke model, seen as a high-potential IoT platform
  • AI solution categories: API-based or do-it-yourself

Walkthrough on some services


Text-to-speech with good quality.


The Advent of Conversational Interactions: evolution of human-computer interactions

  • Machine-oriented interactions (punch cards: you have to understand the machine)
  • Control-oriented and translated (you command the interaction)
  • Intent-oriented: you expect the computer to understand human interactions
For the third, intent-oriented kind, there is Amazon Lex: voice or text bots

  • Example architecture / deployment: hotel / flight booking platform
  • Use API Gateway and Lambda to communicate securely with the backend
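
As a rough sketch of the Lambda side of such a booking bot, the handler below fulfils one intent and closes the conversation. It assumes the Lex (V1) Lambda event and response layout; the intent name `BookHotel`, the slot names `Location` and `Nights`, and the reply texts are hypothetical, and the call to the real booking backend is left out.

```python
def close(message, fulfilled=True):
    """Build a Lex 'Close' response that ends the conversation."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled" if fulfilled else "Failed",
            "message": {"contentType": "PlainText", "content": message},
        }
    }


def lambda_handler(event, context):
    intent = event["currentIntent"]
    if intent["name"] != "BookHotel":  # hypothetical intent name
        return close("Sorry, I cannot help with that request.", fulfilled=False)

    slots = intent["slots"]  # hypothetical slots: Location, Nights
    # A real deployment would call the booking backend here.
    return close(f"Booked {slots['Nights']} night(s) in {slots['Location']}.")
```

Keeping the response builder separate from the handler makes it easy to add further intents (flight booking, for instance) without repeating the response structure.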

Amazon Rekognition

Image recognition service for 4 use cases:

Amazon ML

  • Gives you a number/prediction based on historical data
  • Uses regression models to predict a specific number or a binary classification
  • Limited to 3 types of ML models.
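
To make "a number based on historical data" concrete, here is a minimal least-squares regression in plain Python. It only illustrates the idea and is not how Amazon ML is implemented; the function name `fit_line` and the sample figures are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Made-up history: month number -> units sold.
months = [1, 2, 3, 4, 5]
units = [12, 14, 16, 18, 20]

slope, intercept = fit_line(months, units)
forecast = slope * 6 + intercept  # predicted units for month 6
```
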
Recommended to watch:


AI is an old subject, in fact one of the oldest topics in Computer Science. It has been discussed since Lady Lovelace's first computing work in the pre-computer era of the 19th century, and was advanced by Alan Turing with his Turing test. It was a strong research topic in the early days of computing, back in the 1950s and 1960s, but was then neglected as its promises failed to materialize.

This was the case until less than a decade ago, when a mix of factors contributed to an explosion in AI development, more specifically in Machine Learning and, more recently, Deep Learning.

The slide below shows the factors that contributed to this explosion and to the consequent realization of several tasks that had been envisioned but were not possible in the not-so-distant past:

  • Data availability: Deep Learning requires a huge amount of data for learning/evaluation, which only became available with the Internet explosion and the data growth of the last decades.
  • Programming models: distributed computing, clustering and shared-nothing programming models, and the frameworks that followed (MapReduce, for example), reduced the complexity of ML/DL problems.
  • Algorithms: better and faster algorithms
  • Processing power: GPUs and accessible, pay-as-you-go hardware.
  • Autonomous computing is a long-envisioned area of computing that is gaining strong momentum with Deep Neural Networks (Deep Learning), for example:
    • Autonomous cars
    • Autonomous drones/flight
    • Computational Vision
  • How models are trained
    • There is no high-level service for DL; it requires GPU-intensive instances with DL frameworks
    • p2 instances with thousands of cores
  • AWS provides a Deep Learning AMI:
    • CloudFormation template
    • Containers
    • Or EC2 image
    • Included frameworks: MXNet, TensorFlow, Theano, Caffe, Torch
Problems that DL can solve:

Traditional ML algorithms detect only Chihuahuas in the image below:
Some traditional image classification methods (which fail to properly classify the image above):

Short abstraction: applying linear algebra, an image is a matrix of color values as detected by the computer, and the differences between a test image and a training image constitute the measure of similarity. This is the nearest neighbor classifier.
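
A minimal sketch of that idea, assuming the sum of absolute pixel differences (L1 distance) as the similarity measure; the webinar did not specify one. The 2x2 grayscale "images" and their labels are toy values.

```python
def l1_distance(image_a, image_b):
    """Sum of absolute pixel differences between two same-sized images."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )


def nearest_neighbor(test_image, training_set):
    """Return the label of the training image closest to test_image."""
    _, best_label = min(
        training_set, key=lambda pair: l1_distance(test_image, pair[0])
    )
    return best_label


# Toy 2x2 grayscale images (values 0-255); labels are illustrative.
training_set = [
    ([[10, 20], [30, 40]], "dark"),
    ([[200, 210], [220, 230]], "bright"),
]
label = nearest_neighbor([[190, 205], [215, 240]], training_set)
```

Because the comparison is purely pixel by pixel, two pictures with similar colors but different content end up close together, which is why this method struggles on images like the one above.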

Linear classification also uses a function with a threshold above which the classifier assigns the category. If you lower the threshold too much, more images are classified into the category, but you also include a high number of false positives; for example, a boat would be classified as a plane by the airplane classifier if the threshold is changed.
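
The threshold effect can be sketched with a tiny linear scorer. The two features, the weights, the bias and both thresholds below are invented for illustration; a real classifier would learn its weights from pixel data.

```python
def score(weights, bias, features):
    """Linear score: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def is_airplane(features, threshold, weights=(0.8, 0.6), bias=-0.5):
    """Classify as airplane when the linear score reaches the threshold."""
    return score(weights, bias, features) >= threshold


# Hypothetical features: (has_wings, blue_background)
airplane = (1.0, 0.9)
boat = (0.1, 0.9)

strict = [is_airplane(x, threshold=0.5) for x in (airplane, boat)]
loose = [is_airplane(x, threshold=0.0) for x in (airplane, boat)]
```

With the stricter threshold only the airplane passes; with the lower one the boat, which shares the blue background, is accepted as well and becomes a false positive.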

The solution is a mix of multiple filtering algorithms and a deep neural network with multiple hidden layers (each one with a specific classifier) to identify the image.
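
As a very small sketch of what "multiple hidden layers" means, the forward pass below chains fully connected layers with a ReLU between them. The layer sizes and weights are made up, and the filtering (convolution) stages and training are left out entirely; in practice a framework such as MXNet would do this.

```python
def relu(values):
    """Keep positive activations, zero out the rest."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(pixels, layers):
    """Apply each (weights, biases) layer; ReLU on all but the last."""
    activations = pixels
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Made-up weights: 3 inputs -> 2 hidden -> 2 hidden -> 2 class scores.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
scores = forward([1.0, 0.5, 0.2], layers)
predicted_class = scores.index(max(scores))
```
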

ML and DL recommendations

Best Practices:

  • Build your data lake (S3 as the data lake)
  • Clean/pre-process the data with Lambda (serverless functions)
  • For ML/DL models:
    • Create a test environment for model evaluation and testing
    • The resulting accepted test and training sets are saved in the Prod S3
    • Create a Prod environment and feed it with the evaluated trained models from the Prod S3
    • It is like an A/B deployment for ML/DL
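
The test-then-promote step can be sketched as a simple gate: a candidate model replaces the production one only if it scores better on the held-out test set. The function names, the `min_gain` parameter and the toy models are hypothetical, and the S3 storage and actual deployment are not shown.

```python
def accuracy(model, test_set):
    """Fraction of (features, label) pairs the model predicts correctly."""
    correct = sum(1 for features, label in test_set if model(features) == label)
    return correct / len(test_set)


def promote_if_better(candidate, production, test_set, min_gain=0.0):
    """Return the model that should serve Prod after evaluation in Test."""
    if accuracy(candidate, test_set) > accuracy(production, test_set) + min_gain:
        return candidate
    return production


# Toy models and data, for illustration only.
test_set = [(1, "low"), (4, "low"), (6, "high"), (9, "high")]
production = lambda x: "high" if x > 3 else "low"  # 3 of 4 correct
candidate = lambda x: "high" if x > 5 else "low"   # 4 of 4 correct

serving = promote_if_better(candidate, production, test_set)
```
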