Building a Web Application with PushAuth™

Welcome back to the Power of PushAuth™ blog series! This is the fifth post in the series. The first post was a comprehensive guide to push authentication. The subsequent three posts comprised an end-to-end sample implementation of PushAuth™ in a simple user login flow:

  1. Web Server tutorial
  2. iOS Mobile App tutorial
  3. Android Mobile App tutorial

In this post, we will create a sample website from scratch using Rails and integrate the PushAuth™ APIs into the user login flow. Along the way, we will provide technical details on how the website interacts with the PushAuth™ APIs, which can help readers incorporate PushAuth™ into any existing website. The end result of this tutorial will be similar to the sample web server we deployed in the Web Server tutorial.

Setup

To follow this tutorial, you will need:

This tutorial assumes a basic familiarity with the Rails framework.

Step 1: Make a basic Rails app with simple session-based authentication

First, we will create a website with simple username/password authentication, without incorporating UnifyID PushAuth™. Step 2 will integrate PushAuth™ into the website we create in Step 1.

Project Initialization

$ rails new push_auth_demo
$ cd push_auth_demo

Let’s add the bcrypt gem to our Gemfile by uncommenting this line in Gemfile:

gem 'bcrypt', '~> 3.1.7'

Now, we will run bundle install to update the Gemfile.lock.

Next, we can add the User model to our database; each User will have a username and a password hash.

$ rails generate model user username:uniq password:digest 
$ rails db:migrate

Now, we can generate the controller for handling sessions.

$ rails generate controller sessions new create destroy

Controller Logic

The basic idea of session-based authentication is pretty simple:

  • When a user logs in, session[:user_id] is set to be the unique index of the corresponding User in the database.
  • When no one is logged in, session[:user_id] should be unset.
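As a rough illustration of these two rules, here is a minimal plain-Ruby sketch that runs without Rails. The DemoUser struct, the USERS table, the sample user, and the bare Hash standing in for the Rails session are all assumptions made for this example, not part of the sample project.

```ruby
# Plain-Ruby sketch of session-based authentication (no Rails required).
# DemoUser, USERS, and the Hash-based session are hypothetical stand-ins
# for the User model, the users table, and the Rails session.
DemoUser = Struct.new(:id, :username)

USERS = {
  1 => DemoUser.new(1, "morgan")
}.freeze

# Logging in stores the user's unique id in the session.
def log_in(session, user)
  session[:user_id] = user.id
end

# Logging out removes the id again.
def log_out(session)
  session.delete(:user_id)
end

# Returns the matching user, or nil when nobody is logged in.
def current_user(session)
  USERS[session[:user_id]]
end

def logged_in?(session)
  !current_user(session).nil?
end

session = {}
puts logged_in?(session)             # false
log_in(session, USERS[1])
puts current_user(session).username  # morgan
log_out(session)
puts logged_in?(session)             # false
```

Everything else in the login flow builds on this one piece of state: whether session[:user_id] points at a real user.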

We’ll start with writing the Application controller, where we have a couple of simple page actions:

  • GET / (application#home) renders a page that shows links to other pages and actions
  • GET /restricted (application#restricted) renders a page that is only accessible when logged in.

A few helper functions will live here as well:

  • current_user should either return a User object, or nil if there’s no one logged in.
  • logged_in? should return whether someone is logged in
  • authorized is an action that is called before loading pages that require a user to be logged in.

# app/controllers/application_controller.rb

class ApplicationController < ActionController::Base
  before_action :authorized
  helper_method :current_user
  helper_method :logged_in?

  skip_before_action :authorized, only: [:home]

  def current_user
    User.find_by(id: session[:user_id])
  end

  def logged_in?
    !current_user.nil?
  end

  def authorized
    unless logged_in?
      redirect_to login_path, alert: "You must be logged in to perform that action."
    end
  end
end

Now, we’ll handle users logging in or out through the Sessions controller.

  • GET /login (sessions#new) displays a login page unless the user is already logged in.
  • POST /login (sessions#create) authenticates the user and sets the session.
  • DELETE /logout (sessions#destroy) clears the user’s session.

# app/controllers/sessions_controller.rb

class SessionsController < ApplicationController
  skip_before_action :authorized, except: [:destroy]

  def new
    redirect_to root_path if logged_in?
  end

  def create
    @user = User.find_by("lower(username) = ?", params[:username].to_s.downcase)
    if @user && @user.authenticate(params[:password])
      session[:user_id] = @user.id
      redirect_to root_path, notice: "Successfully logged in!"
    else
      redirect_to login_path, alert: "Sorry, that didn't work."
    end
  end

  def destroy
    session[:user_id] = nil
    redirect_to root_path, notice: "Successfully logged out."
  end
end

We also need to add routes for these actions.

# config/routes.rb

Rails.application.routes.draw do
  root "application#home"

  get "restricted", to: "application#restricted"

  get "login", to: "sessions#new"

  post "login", to: "sessions#create"

  delete "logout", to: "sessions#destroy"
end

Views

The home page shows a link to log in or out:

<!-- app/views/application/home.html.erb -->

<h1> Welcome! </h1>

<% if logged_in? %>
  Welcome, <%= current_user.username %>. <br />
  <%= link_to "Log out", logout_path, method: :delete %> <br />
  Click <%= link_to "here", restricted_path %> to see a super secret page!
<% else %>
  Please <%= link_to "log in", login_path %>.
<% end %>

The restricted page lets us test access control:

<!-- app/views/application/restricted.html.erb -->

Shhh, this page is a secret!

sessions#new renders a simple login form:

<!-- app/views/sessions/new.html.erb -->

<h1>Login</h1>

<%= form_tag "/login", {class: "form-signin"} do %>
  <%= label_tag :username, nil, class: "sr-only" %>
  <%= text_field_tag :username, nil, class: "form-control", placeholder: "Username", required: true, autofocus: true %>

  <%= label_tag :password, nil, class: "sr-only" %>
  <%= password_field_tag :password, nil, class: "form-control", placeholder: "Password", required: true %>

  <%= submit_tag "Log in", {class: ["btn", "btn-lg", "btn-primary", "btn-block"]} %>
<% end %>

Lastly, we’ll modify the default template to include a navigation bar at the top and flash messages for notice and alert from controllers:

<!-- Replace the contents of the <body> tag in app/views/layouts/application.html.erb with the following -->

    <nav class="navbar navbar-dark bg-dark">
      <a class="navbar-brand" href="/">
        <span class="logo d-inline-block align-top"></span>
        UnifyID PushAuth Sample
      </a>
    </nav>
    <% if flash[:notice] %>
      <div class="alert alert-primary alert-dismissible fade show" role="alert">
        <%= flash[:notice] %>
        <button type="button" class="close" data-dismiss="alert" aria-label="Close">
          <span aria-hidden="true">×</span>
        </button>
      </div>
    <% end %>
    <% if flash[:alert] %>
      <div class="alert alert-danger alert-dismissible fade show" role="alert">
        <%= flash[:alert] %>
        <button type="button" class="close" data-dismiss="alert" aria-label="Close">
          <span aria-hidden="true">×</span>
        </button>
      </div>
    <% end %>

    <main role="main" class="container">
      <div class="main-content">
        <%= yield %>
      </div>
    </main>

Styling

We can add styling to our website by simply adding Bootstrap. The views we created above already use class names that Bootstrap recognizes.

First, add Bootstrap and some of its dependencies.

$ yarn add bootstrap jquery popper.js

Then, add the following to the end of app/javascript/packs/application.js:

import 'bootstrap'
import 'stylesheets/application.scss'

Next, make a file called app/javascript/stylesheets/application.scss and add this:

@import "~bootstrap/scss/bootstrap";

Optionally, you may add your own custom CSS files as well. See an example in our sample code.

At this point, you should be able to run rails server and navigate to http://localhost:3000 to interact with the basic authentication server! You can create sample users as follows:

$ bundle exec rails console
> User.create(:username => "<your_username>", :password => "<your_password>").save
> exit

Step 2: Integrate with UnifyID PushAuth™ APIs

Now that we have a website with a simple username/password authentication, let’s incorporate UnifyID PushAuth™ APIs to further enhance security.

Interface to PushAuth™ APIs

First, let’s tell Rails about your UnifyID API key.

Run rails credentials:edit and add the following:

unifyid:
  server_api_key: <Your UnifyID API Key created from dashboard>

Next, add the following to config/application.rb, after the config.load_defaults line:

config.x.pushauth.base_uri = "https://api.unify.id"

Let’s also add the httparty gem to easily make HTTP/S requests. To do this, add the following to your Gemfile and run bundle install:

gem 'httparty', '~> 0.18.0'

Now, we will make a file called app/services/push_auth.rb which contains the interface for our Rails app to interact with the PushAuth™ APIs:

  • The create_session method calls POST /v1/push/sessions to initiate a PushAuth™ session (API doc).
  • The get_session_status method calls GET /v1/push/sessions/{id} to retrieve the status of a PushAuth™ session (API doc).

# app/services/push_auth.rb

class PushAuth
  include HTTParty
  base_uri Rails.configuration.x.pushauth.base_uri

  @@options = {
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": Rails.application.credentials.unifyid[:server_api_key]
    }
  }

  def self.create_session(user_id, notification_title, notification_body)
    body = {
      "user" => user_id,
      "notification" => {
        "title" => notification_title,
        "body" => notification_body
      }
    }
    post("/v1/push/sessions", @@options.merge({body: body.to_json}))
  end

  def self.get_session_status(api_id)
    get("/v1/push/sessions/#{api_id}", @@options)
  end
end
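To make the request shape concrete, the sketch below builds the same JSON body that create_session sends, using only the Ruby standard library. The helper name build_session_payload and the sample values are hypothetical; the field names are the ones used in the class above.

```ruby
require "json"

# Hypothetical helper that mirrors the body assembled in
# PushAuth.create_session, without making any HTTP request.
def build_session_payload(user_id, notification_title, notification_body)
  {
    "user" => user_id,
    "notification" => {
      "title" => notification_title,
      "body" => notification_body
    }
  }.to_json
end

# Sample values only; substitute your own user and message text.
payload = build_session_payload(
  "morgan",
  "Authenticate with PushAuthDemo?",
  "Login request from 127.0.0.1"
)
puts payload
```

Printing the payload this way can be handy when debugging a rejected request, since it shows exactly what would be placed in the POST body.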

Controller Logic Modification

Now, we will modify the login flow to incorporate PushAuth™ as a second authentication factor. The new login flow will consist of the following:

  1. The client submits the username and password via a POST request to /login (sessions#create)
  2. The controller validates that the user exists and the password matches. If not, it displays an error message.
  3. Upon successful username/password authentication, the controller creates a PushAuth™ session and redirects to GET /mfa (sessions#init_mfa)
  4. The JavaScript on the /mfa page repeatedly queries GET /mfa/check (sessions#check_mfa), which checks the PushAuth™ session status until the session status is no longer pending.
  5. Upon receiving a non-pending session status, the client submits a request to PATCH /mfa/finalize (sessions#finalize_mfa) that completes the login process.
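Before touching the controller, it can help to see the flow as plain state changes on the session. The sketch below is a simplified model, not the controller code: the method names are hypothetical, the Hash stands in for the Rails session, and it assumes "sent" is the pending status (the value the polling script later in this post compares against).

```ruby
# Plain-Ruby model of the two-step login flow (no Rails required).
# All method names here are hypothetical and exist only for illustration.
PENDING_STATUS = "sent"

# Steps 1-3: the password check passed, so remember who is mid-login
# and which PushAuth session belongs to them.
def begin_mfa(session, user_id, pushauth_id)
  session[:pre_mfa_user_id] = user_id
  session[:pushauth_id] = pushauth_id
end

# Step 4: the client keeps polling while the status is still pending.
def pending?(status)
  status == PENDING_STATUS
end

# Step 5: promote or discard the half-finished login.
def finalize(session, status)
  case status
  when "accepted"
    session[:user_id] = session.delete(:pre_mfa_user_id)
    session.delete(:pushauth_id)
    :logged_in
  when "rejected"
    session.delete(:pre_mfa_user_id)
    :denied
  else
    :unchanged
  end
end

session = {}
begin_mfa(session, 42, "example-session-id")
%w[sent sent accepted].each do |status|
  next if pending?(status)
  puts finalize(session, status)
end
```

The key property is that session[:user_id] is only ever set on an "accepted" status, so passing the password check alone never grants access.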

First, let’s replace the create action in app/controllers/sessions_controller.rb:

  def create
    @user = User.find_by("lower(username) = ?", params[:username].to_s.downcase)
    if @user && @user.authenticate(params[:password])
      session[:pre_mfa_user_id] = @user.id

      pushauth_title = "Authenticate with #{Rails.application.class.module_parent.to_s}?"
      pushauth_body = "Login request from #{request.remote_ip}"

      response = PushAuth.create_session(@user.username, pushauth_title, pushauth_body)

      session[:pushauth_id] = response["id"]

      redirect_to mfa_path
    else
      redirect_to login_path, alert: "Sorry, that didn't work."
    end
  end

Next, let’s also add check_mfa and finalize_mfa actions in this controller:

  def check_mfa
    status = PushAuth.get_session_status(session[:pushauth_id])["status"]

    render plain: status
  end

  def finalize_mfa
    case PushAuth.get_session_status(session[:pushauth_id])["status"]
    when "accepted"
      session[:user_id] = session[:pre_mfa_user_id]
      session[:pushauth_id] = nil
      session[:pre_mfa_user_id] = nil
      flash.notice = "Successfully logged in!"
    when "rejected"
      session[:pre_mfa_user_id] = nil
      flash.alert = "Your request was denied."
    end
  end

We also want to make sure that only users who completed the password authentication are able to access actions for the PushAuth™ authentication. Thus, within sessions_controller we will add:

# app/controllers/sessions_controller.rb

# Add this just under the skip_before_action line
  before_action :semi_authorized!, only: [:init_mfa, :check_mfa, :finalize_mfa]

# And add this after action methods
  private
  def semi_authorized
    session[:pre_mfa_user_id] && session[:pushauth_id]
  end

  def unauthorized
    redirect_to login_path, alert: "You are not authorized to view this page."
  end

  def semi_authorized!
    unauthorized unless semi_authorized
  end

Views

Now, we need a page that uses AJAX to determine whether the PushAuth™ request has been completed.

First, let’s add a line in our application template that allows us to add content inside the <head> tag.
Add this to app/views/layouts/application.html.erb, right before the </head> tag:

<%= yield :head %>

Next, let’s add the JavaScript code we want to run on the init_mfa page:

// app/javascript/packs/init_mfa.js

import Rails from "@rails/ujs";
let check_status = window.setInterval(function() {
  Rails.ajax({
    type: "GET",
    url: "/mfa/check",
    success: function(r) {
      if (r !== "sent") {
        Rails.ajax({
          type: "PATCH",
          url: "/mfa/finalize",
          success: function() {
            window.clearInterval(check_status);
            window.location.href = "/";
          },
          error: function() {
            console.log("Promoting PushAuth status failed.");
          }
        });
      }
    },
    error: function() {
      console.log("Checking for PushAuth status failed.");
    }
  });
}, 2000);

This will poll /mfa/check every 2 seconds, until the Rails app reports that the PushAuth™ request has been accepted, rejected, or expired. At this point, the browser will ask the Rails app to complete the login process by submitting a /mfa/finalize request.
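The same polling pattern can be sketched in Ruby, which may be useful if you ever need to wait on a PushAuth™ session from a script or test rather than from the browser. This is an illustrative sketch only: poll_until_resolved, the stubbed status source, the attempt cap, and the "timed_out" label are all assumptions, not part of the PushAuth™ API.

```ruby
# Hypothetical polling helper. fetch_status is any callable that returns
# the current status string; "sent" is treated as the pending status.
def poll_until_resolved(fetch_status, interval: 2, max_attempts: 30)
  max_attempts.times do
    status = fetch_status.call
    return status unless status == "sent"
    sleep(interval)
  end
  # "timed_out" is a label invented for this sketch, returned when the
  # attempt cap is reached while the status is still pending.
  "timed_out"
end

# Stubbed status source: pending twice, then accepted.
responses = %w[sent sent accepted]
fetcher = -> { responses.shift || "sent" }

puts poll_until_resolved(fetcher, interval: 0)  # accepted
```

The attempt cap is an extra safeguard that the browser script does not have; in the browser, polling ends once the status stops being pending.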

Now, let’s add a view file for init_mfa that includes the JavaScript above.

<!-- app/views/sessions/init_mfa.html.erb -->

<% content_for :head do %>
  <%= javascript_pack_tag 'init_mfa' %>
<% end %>

<div class="spinner-border" role="status" ></div>

Waiting for a response to the push notification...

Finally, we will add new mfa routes.

# Add these to config/routes.rb

  get "mfa", to: "sessions#init_mfa"

  get "mfa/check", to: "sessions#check_mfa"

  patch "mfa/finalize", to: "sessions#finalize_mfa"

Congratulations! We have now integrated UnifyID’s PushAuth™. The final result should function just like the pushauth-sample-server project, which we introduced in our How to Implement PushAuth™: Web Server post. Please reach out to us if you have any questions, comments or suggestions, and feel free to share this post.

How to Implement PushAuth™: Android Mobile App

This post is part of the Power of PushAuth™ blog series. The first post of the series was a comprehensive guide to push authentication. The next three posts of the series comprise an end-to-end sample implementation of PushAuth in a simple user login flow. The tutorial breakdown is as follows:

  1. Web Server tutorial
  2. iOS Mobile App tutorial
  3. Android Mobile App tutorial (this post)

The tutorial in this post builds on the web server from the first tutorial. With your web server set up and running, you now need a mobile app to receive and respond to notifications. This post will help you build the Android mobile app to do so; then you will be able to leverage the power of PushAuth for login requests!

Setup

To follow this tutorial, you will need:

  • An Android device running Android 7.0 or higher
  • A computer with Android Studio 4.0 installed
  • JDK 8 installed (we recommend using jabba)

Step 1: Cloning the Project

The sample Android mobile app code for this project is provided in the pushauth-sample-app-android GitHub repository. Clone this repo to your local machine:

$ git clone https://github.com/UnifyID/pushauth-sample-app-android.git

Open Android Studio, select “Open an existing Android Studio project”, and select the directory of the cloned repository. It should look something like this:

Step 2: Set up Firebase Cloud Messaging (FCM)

Firebase Cloud Messaging (FCM) is the platform used to send notifications to Android devices. You will first need to set up Firebase in your app by following the instructions at https://firebase.google.com/docs/android/setup. After doing this, ensure that a google-services.json file is present under the app/ directory of the project in Android Studio.

Next, navigate to the Firebase project settings of your app in your browser and select the “Cloud Messaging” tab. The token labeled “Server key” is what you will need to provide to UnifyID in the next section, so copy that value.

Step 3: Providing Push Credentials to UnifyID

Now that you have copied the Firebase Cloud Messaging (FCM) server key, you will provide it to UnifyID so that PushAuth can send push notifications to the sample app on your phone. You’ll provide this value in your project’s dashboard. Follow the instructions in the Developer Portal docs to do so. After you’ve done this, your project dashboard will indicate that the credentials were successfully uploaded:

Step 4: Building the Project

Now, back to Android Studio. Make sure that your Android device is plugged in to your computer, has developer tools and USB debugging enabled, and is available in the top center of Android Studio. Also make sure that the google-services.json file is located under the app/ directory of the project.

With everything set up, click the green triangle or press Control+R to run the app. The following screen should appear on your device:

Step 5: Mobile App Settings

You now have all the values necessary for configuration! Tap the gear icon in the top right corner of the sample app’s Configuration screen. For SDK key, enter your UnifyID project’s SDK key value from the Dashboard. The User string should be the same value that you used when creating a user in the web server tutorial, e.g. “Morgan”. If these values do not match, you will not be able to successfully respond to push notifications in the login flow.

After setting those values and clicking “Confirm”, the app is ready to receive your PushAuth login requests! The app will remain on the “Waiting for PushAuth requests” page until it receives a PushAuth authentication request.

Now you can go through the full login flow by entering your username and password on the login page, responding to the notification received by this app on your phone, and being logged in to the website.

That’s it! Now you have a simple login flow that integrates PushAuth. Stay tuned for the rest of the posts in the series, and make sure to share this post and reach out to us if you have any questions, comments or suggestions!

How to Implement PushAuth™: iOS Mobile App

This post is part of the Power of PushAuth™ blog series. The first post of the series was a comprehensive guide to push authentication. The next three posts of the series comprise an end-to-end sample implementation of PushAuth in a simple user login flow. The tutorial breakdown is as follows:

  1. Web Server tutorial
  2. iOS Mobile App tutorial (this post)
  3. Android Mobile App tutorial

The tutorial in this post builds on the web server from the first tutorial. With your web server set up and running, you now need a mobile app to receive and respond to push notifications. This post will help you build the iOS mobile app to do so; then you will be able to leverage the power of PushAuth for login requests!

Setup

To follow this tutorial, you will need:

Step 1: Cloning the Project

The pushauth-sample-app-ios GitHub repository contains the sample iOS mobile app code for this project. Clone the repository to your local machine and open the PushAuthSample.xcworkspace file in Xcode.

$ git clone https://github.com/UnifyID/pushauth-sample-app-ios.git

Step 2: Setting Up and Running the Project

  1. In the top left section of your Xcode window, set the active scheme to PushAuthSample.
  2. Plug your phone into your computer. Your phone’s name will appear as the chosen device next to the active scheme.
  3. Navigate to the “Signing & Capabilities” section of the Xcode project settings.
  4. Check the boxes next to “Automatically manage signing” in the “Signing (Debug)” and “Signing (Release)” sections. This will simplify setup and merge the two into a single “Signing” section.
  5. Choose the “Team” value to match your Apple Developer account.
  6. Set the “Bundle Identifier” to something unique; this value will be used in the next step of the tutorial when you create the Identifier through the Apple Developer site.

After following these six steps, your settings should closely resemble the screenshot above from Xcode. Once everything is set up properly and with your phone still connected to your computer, run the project (Product > Run or Command-R). This screen will show up on your phone:

Step 3: Create an Apple Bundle Identifier

This step requires you to have an Apple Developer Program role with adequate permissions. The role permissions are listed here.

Navigate to the Identifiers tab on the Certificates, Identifiers & Profiles page of the Apple Developer site. You’ll need to add a new identifier that matches the Bundle Identifier value you set in Xcode in step 6 above. Click the plus symbol next to the title at the top of the page; if you don’t see this symbol, you likely don’t have adequate permissions. Follow these instructions for the subsequent pages:

  1. Register a new identifier page: Keep the default selection (App IDs) and click “Continue”.
  2. Select a type page: Keep the default selection (App) and click “Continue”.
  3. Register an App ID page:
    • Description: enter an appropriate description for this project, e.g. “PushAuth Project”. This value will be displayed as the “Name” on the Identifiers page.
    • Bundle ID: Keep the selection on “Explicit” and enter the exact same value you put as the Bundle Identifier in the Xcode Signing & Capabilities page earlier.
    • Enable Push Notification capability by scrolling down on the page and selecting the checkbox next to “Push Notifications”.
    • Click “Continue”, verify everything was entered correctly, and click “Register”.

Now that you have created an identifier for this project, you can create a push notification certificate associated with this identifier.

Step 4: Create a Push Notification Certificate

UnifyID requires the APNs certificate in *.p12 format to send PushAuth requests to the app. You can create this certificate from the same Identifiers page of the Apple Developer site that you were on in Step 3.

  1. Click on the name of the identifier you just created, e.g. “PushAuth Project”.
  2. Scroll down to the “Push Notifications” row and click on the “Configure” box. Next to this box you should see “Certificates (0)” since you haven’t yet created a certificate associated with this identifier.
  3. In the Apple Push Notification service SSL Certificates pop-up window, click on the “Create Certificate” box under “Production SSL Certificate” then click “Done”.
  4. At this point, you need to create a Certificate Signing Request (CSR) file from your Mac. Click “Learn More” and follow those instructions for doing so. Then upload that file and continue.
  5. Now that you have created a certificate, you must download it locally to export it to *.p12. Click “Download”.
  6. This will prompt you to add the certificate to Keychain Access. Choose a Keychain, e.g. “login”, to add the certificate to and click “Add”.
  7. Then find that certificate in Keychain Access. It may be useful to select the “Certificates” category and utilize the search bar to find the certificate you just added.
  8. Once you have located your certificate, right-click on it and click the option to export the certificate:
  9. Specify a name for the *.p12 file and a location to save it. Make sure the file format is set to “Personal Information Exchange (.p12)” then click “Save”.
  10. You will be prompted to password-protect the exported *.p12 file. Choose to export it without a password; simply click “OK”.

Now you have successfully created an APNs certificate in *.p12 format! This will be used by UnifyID and needs to be uploaded to your project settings through the dashboard.

Step 5: Providing Push Credentials to UnifyID

Now you have an Apple Bundle Identifier and an APNs push certificate. It’s time to provide your push credentials to UnifyID so that PushAuth can send push notifications to the sample app on your phone. Check out the Developer Portal docs here, or follow along with the instructions below.

  1. Navigate to the “Push Credentials” section of your project on the Developer Dashboard.
  2. Click on “Choose File” and select the *.p12 file you generated in Step 4 of this tutorial.
  3. Choose the “Development/Sandbox APNs server” option for now since we are sending push notifications to an app that runs directly from Xcode. Later on, choose “Production APNs server” when you need to send PushAuth requests to apps distributed through the App Store or through ad-hoc means.
  4. Click “Add” to complete the upload.

Once the push credentials are successfully uploaded to your project settings, you will see the push credential information displayed:

If you find yourself needing to change the push credentials used for the project, simply click “Edit” and go through the same upload steps with the new credentials.

Step 6: Mobile App Settings

You now have all the values necessary for configuration! Open the sample app on your phone and tap the gear icon in the top right of the Configuration screen. For SDK key, enter your UnifyID project’s SDK key value from the Dashboard. The User string should be the same value that you used when creating a user in the web server tutorial, e.g. “Morgan”. If these values do not match, you will not be able to successfully respond to push notifications in the login flow.

Once you set those two values, allow push notifications for the app. The app is then ready to receive your PushAuth login requests!

Now you can go through the full login flow by entering your username and password on the login page, responding to the push notification received by this app on your phone, and being logged in to the website.

That’s it! You now have a simple login flow that integrates PushAuth. The next post provides a tutorial for building the Android sample PushAuth mobile app. Stay tuned for the rest of the posts in the series and, as always, please share this post and reach out to us with questions, comments or suggestions.

How to Implement PushAuth™: Web Server

Welcome back to the Power of PushAuth™ blog series! The first post provided a comprehensive guide to push authentication — check it out here if you missed it. The next three posts are tutorials offering an end-to-end implementation of PushAuth™ in a simple user login flow and will be broken down as follows:

  1. Web Server tutorial (this post)
  2. iOS Mobile App tutorial
  3. Android Mobile App tutorial

The first tutorial (this post) covers a Ruby on Rails backend that provides a basic user login authentication flow with PushAuth integrated. The second and third tutorials will be instructions on how to run sample iOS and Android apps, respectively. These will be the apps that receive and respond to push notifications initiated by the login process.

By the end of these three tutorials, you will have a website where a registered user can log in with their username and password, receive a push notification on their phone, accept the login request via the push notification, and subsequently be logged in on the website. This flow is shown in the video below.

This is a very simplified version of a real-world application and login flow. You might remember some of the security issues that can be present with push authentication from the previous post, such as trusted device registration, fallback resources, or access revocation. This tutorial does not include solutions for those or user sign-up. Future posts in the series will provide extensions of this simple flow to tackle some of those issues.

Alright, let’s get started!

Setup

To follow this tutorial, you will need:

Step 1: UnifyID Account, Project, and Keys

A UnifyID project will grant you access to UnifyID’s services. In order to create a project, you’ll need to first create a UnifyID account. Once you have an account and project, go to the Developer Dashboard to create an API key and SDK key. Make sure to copy the API key value somewhere safe – you won’t be able to access it later.

This tutorial uses PushAuth project as the UnifyID project name. You can see the configured API and SDK keys on the dashboard view above.

Step 2: Cloning the Project and Installing Dependencies

The pushauth-sample-server GitHub repository contains the code for this project. Clone the repository, navigate into it, install the project dependencies that are listed in the Gemfile, and ensure your Yarn packages are up-to-date:

$ git clone https://github.com/UnifyID/pushauth-sample-server.git
$ cd pushauth-sample-server
$ bundle install
$ yarn install --check-files

Feel free to poke around the code if you’d like to get a better understanding of what’s going on under the hood. This tutorial won’t go into those details, but a future post in this series will.

Step 3: DB Setup and Creating Users

Now, initialize the database:

$ bundle exec rails db:migrate

Once the database has been initialized you can create users. To create a user, do the following (replacing <your_username> and <your_password> with the values you intend to use for username and password):

$ rails console
> User.create(:username => "<your_username>", :password => "<your_password>").save
> exit

Step 4: Server API Key Storage

This step requires the server API key value you copied in Step 1.

$ EDITOR=vim rails credentials:edit

The above command will decrypt the credentials file and open it in vim for editing. You can replace vim with whichever editor you are most comfortable with (atom, sublime, etc.). Once the file is open, add the following entry:

unifyid:
  server_api_key: <your_key_goes_here>

After saving and closing, the credentials file will be re-encrypted and your server API key value will be stored.
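If you want to sanity-check the shape of that entry outside of Rails, the sketch below parses the same YAML structure with the standard library and fails fast when the key is missing. The helper name server_api_key and the placeholder key value are hypothetical. Inside the Rails app the value is read through Rails.application.credentials instead, so treat this purely as an illustration of the expected structure.

```ruby
require "yaml"

# Hypothetical helper: look up unifyid.server_api_key in a parsed
# credentials Hash and raise if it is absent or blank.
def server_api_key(credentials)
  key = credentials.dig("unifyid", "server_api_key")
  if key.nil? || key.to_s.strip.empty?
    raise KeyError, "unifyid.server_api_key is not set"
  end
  key
end

# Placeholder value for illustration; never commit a real key in plain text.
credentials = YAML.safe_load(<<~CREDS)
  unifyid:
    server_api_key: example-key-123
CREDS

puts server_api_key(credentials)
```

Failing fast at boot is usually easier to diagnose than a confusing authentication error on the first login attempt.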

Step 5: Running the Server

Now you are able to run the server:

$ bundle exec rails server

Finally, connect to the server by opening http://localhost:3000/ in your browser. This will bring you to the landing page, where you can then navigate to the login page and enter your username and password from above in Step 3:

This brings you to the end of the web server tutorial. After entering the username and password on the login page, you can see that the server is polling for the push notification response before allowing or denying access to the website. Without a way to receive or respond to push notifications, you cannot successfully log in. Stay tuned for the next two posts in this tutorial installment, the iOS and Android mobile app tutorials, to complete the flow.

Thanks for following along! Please reach out to us if you have any questions, comments or suggestions, and feel free to share this post.

Building an Object Recognition App and Protecting It From Bots

Introduction

I love building tech for other people to use.

Unfortunately, I learned early on that if your application is accessible to users, it’s also vulnerable to cyberattacks. This is a problem that developers everywhere face, from the person adding a form to their blog to the programmer building applications used by millions.

That’s why I was so excited to join UnifyID, a startup building passwordless authentication solutions, as an intern this summer. When I started, a particular product called HumanDetect really caught my eye. Quoting the documentation,

“UnifyID HumanDetect helps you determine whether a user of your app is a human or a bot. While the purpose is similar to a CAPTCHA, HumanDetect is completely passive, creating a frictionless user experience.”

I imagine most people have come across CAPTCHAs in one form or another, so I can’t be the only person who’s super annoyed each time I’m asked to click a bunch of small boxes.

reCAPTCHA

From a developer’s perspective, CAPTCHAs aren’t ideal either. They can be time-consuming, which hurts the user experience by interrupting the flow of an application. Additionally, they may be difficult to complete (especially on the smaller screens of mobile devices), making it harder for some users to access features. However, for a long time CAPTCHAs have been one of the only reliable ways to verify humans, reduce spam, and prevent bot attacks. Despite their downsides, they’ve become a fixture of software as we know it.

UnifyID HumanDetect is different: machine learning algorithms passively determine whether a user is human, replacing CAPTCHAs while not interrupting the user flow. Additionally, while CAPTCHAs work best for web apps, HumanDetect is designed for mobile applications, which don’t have many reliable human authentication methods. To me, this is exciting—HumanDetect completely eliminates the need for explicit user actions.

In this blog post, I’ll outline how I built a simple object recognition app for iOS. It allows users to take a picture using the phone’s camera, which is sent to the Flask backend. There, the image is run through a neural network and the inference results are returned to the app.

After finishing the basic functionality of the app, I added HumanDetect to protect my app from bot attacks, which should give you a good idea of how developers can take advantage of this tool. Finally, I’ve linked all my code so that you can run everything yourself and see how you can use HumanDetect to protect your own apps.

Building the Flask Server

The first part of this project involved setting up a Flask server to serve as the backend of this app. Functionally, it will accept a POST request that contains an image, use a machine learning model to generate predictions based on the picture, and return the five most likely results.

I chose to use Python for the server side of the project because it’s the language I’m most comfortable with, and it’s extremely easy to code and debug. Plus, it’s widely used for machine learning, so adding object classification should be a piece of cake. I decided to use Flask over another framework like Django for similar reasons. I’ve previously used Flask for a couple of projects and it’s also lightweight, meaning it’s super simple to get up and running.

To start off, I needed to set up my environment. Isolating the packages I was using for this project was crucial since I’d need to replicate everything when I deployed my app to a cloud platform. I chose to use Conda simply because it’s what I’m most comfortable with (there’s a theme here, in case you haven’t noticed), although virtualenv would have been fine, too.

Next, I installed Flask and created a simple app that was just a webpage with “HumanDetect Example” on it. After running it locally and verifying that everything was set up correctly, I created a project in Heroku and prepared to deploy my app.

HumanDetect Webpage

To do this, I had to set up a custom CI/CD pipeline for GitLab that would trigger a fresh deployment each time I made a commit, which ended up taking quite a bit of time. Things are a lot simpler if you’re using GitHub (which is where the example code for this project is hosted, fortunately).

With most of the setup out of the way, I could finally begin building the functionality. First, I needed a way to accept an image via a POST request. Although I tried encoding the file as a string, I ended up using a method that simulated uploading a file via a multipart form POST body before saving it to a ./uploads folder.

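import os
from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=['GET', 'POST'])
def home():
    if request.method == 'POST':
        # The image arrives as the "file" part of a multipart form upload
        file = request.files['file']
        filename = os.path.join('./uploads', 'image.png')
        file.save(filename)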

Arguably the most important part of this whole project is the machine learning object recognition code. Although I could have made it quite complex, I made a couple of decisions that simplified this part as much as possible. I decided to use Keras because it is incredibly easy to use, and includes several common pre-trained models that only take a few lines of code to implement. Plus, I’m not too concerned about performance, so there isn’t really a particular reason to use TensorFlow or PyTorch in this case.

Keras provides a number of Convolutional Neural Networks (CNNs) covering the most common and high performing architectures for image classification. Because free Heroku dynos have a memory constraint, I wanted to minimize the size of the model while ensuring that accuracy is still high. I ultimately decided to go with the MobileNet architecture, a highly efficient network that performs within a few percentage points of VGG, ResNet, and Inception models. Since Keras provides pre-trained weights for the ImageNet dataset, I decided to use them without training my own models.

Before feeding an image into the model, I needed to preprocess it so I would get the most accurate classification results. The CNN expects RGB images with a 224×224 resolution, but the images that I’ll be taking from the iOS app won’t have these dimensions. Therefore, I needed to resize each image using OpenCV. I decided to use this approach rather than cropping the image to a perfect square because important elements could be cut out, and the trained model should be robust enough to ignore minor changes to the aspect ratio.

Once the preprocessed image is fed into the model, the results need to be returned via a response to the POST request. I decided to get the five classes with the highest probabilities, reformat them into a single clean string, and return this string.

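img = cv2.imread(filename)                  # OpenCV loads images in BGR channel order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # the pre-trained model expects RGB
img = cv2.resize(img, (224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)               # add the batch dimension
x = preprocess_input(x)
preds = decode_predictions(model.predict(x), top=5)[0]

# Each prediction is a (class_id, class_description, score) tuple
preds_formatted = ", ".join([
    f"{class_description}: {score*100:.2f}%"
    for (_, class_description, score) in preds
])

print("predictions: ", preds_formatted, "\n")
return preds_formatted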
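
To make the "top five, formatted as one string" step concrete without loading a model, here is a small standalone sketch. The function name format_top_predictions and the example scores are illustrative assumptions, not part of the sample app; it simply picks the k highest-scoring labels from a dictionary and formats them the same way as the code above.

```python
import heapq


def format_top_predictions(probabilities, k=5):
    """Return the k most likely labels as 'label: 12.34%' joined by commas.

    `probabilities` maps a class label to a score between 0 and 1.
    """
    # nlargest returns the items sorted from highest to lowest score
    top = heapq.nlargest(k, probabilities.items(), key=lambda item: item[1])
    return ", ".join(f"{label}: {score * 100:.2f}%" for label, score in top)


# Illustrative scores only; a real run would take them from the model output.
example_scores = {
    "cab": 0.8769,
    "police_van": 0.0523,
    "racer": 0.0145,
    "sports_car": 0.0133,
    "car_wheel": 0.0123,
    "minivan": 0.0040,  # sixth entry, dropped by the top-5 cut
}

print(format_top_predictions(example_scores))
```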

To test that everything was working, I wrote a simple Python script that submits this image of a taxi via a POST request.

Taxi

Here’s the returned response:

cab: 87.69%, police_van: 5.23%, racer: 1.45%, sports_car: 1.33%, car_wheel: 1.23%

Success! With the Flask app complete, I moved on to the next part of the project.
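
For reference, a test script along these lines could be sketched with nothing but the Python standard library. This is not the script I actually used; the helper names, the taxi.png file name, and the localhost URL are placeholders. The sketch builds the same kind of multipart body the server expects, with the image in a part named "file".

```python
import urllib.request
import uuid


def build_multipart_body(file_bytes, filename="image.png", field="file", boundary=None):
    """Return (boundary, body) for a multipart/form-data upload of one PNG."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: image/png\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return boundary, head + file_bytes + tail


def build_upload_request(url, file_bytes):
    """Wrap the multipart body in a POST request object (nothing is sent yet)."""
    boundary, body = build_multipart_body(file_bytes)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Usage (not executed here; the file name and URL are placeholders):
#   with open("taxi.png", "rb") as f:
#       req = build_upload_request("http://localhost:5000/", f.read())
#   print(urllib.request.urlopen(req).read().decode())
```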

Building the iOS App

Let me make something clear: I’m not an iOS developer. I’ve built several apps for Android and the web, but I’ve never really tried Swift or Xcode—in fact, I haven’t even owned an Apple device in the last 7 years. Therefore, everything about this iOS thing was new for me, and I had to lean pretty heavily on Google and Stack Overflow.

Luckily, the Apple developer environment seemed relatively intuitive, and was in many ways simpler than its Android counterpart. It took me some time to go through a few basic iOS development guides online, but before long I was up and running with my first app in Xcode.

The most important function of the app is that it allows a user to take a picture using the phone’s camera. To accomplish this, I used a view controller called UIImagePickerController which adds the ability to capture images in just a few lines of code. I just followed the instructions from this article that I found on Google, and got this part working pretty quickly.

iOS Screenshot 1

Now that the user can take a picture, it needs to be sent via a POST request to the Flask server. Because of the way the backend expects the request to be made, I ended up having to manually add some metadata and body content. Although it looks a bit messy (and there might be a cleaner way to do it), I eventually did get it working, which is what counts.

let filename = "image.png"
let boundary = UUID().uuidString
let config = URLSessionConfiguration.default
let session = URLSession(configuration: config)
var urlRequest = URLRequest(url: URL(string: flaskURL)!)
urlRequest.httpMethod = "POST"
urlRequest.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
var data = Data()
                
data.append("\r\n--\(boundary)\r\n".data(using: .utf8)!)
data.append("Content-Disposition: form-data; name=\"file\"; filename=\"\(filename)\"\r\n".data(using: .utf8)!)
data.append("Content-Type: image/png\r\n\r\n".data(using: .utf8)!)
data.append(image.pngData()!)
data.append("\r\n--\(boundary)--\r\n".data(using: .utf8)!)

Finally, I added a few UI elements to finish up the iOS app. I set up a loading screen spinner that is activated just after the picture is taken and deactivated once the response to the POST request is received. I also added a pop up alert that displays the object recognition results to the user.

iOS Results Screenshots

And that’s it! The main functionality of the object recognition app is now complete.

Protecting From Bots

This project is a great example of a possible use case for HumanDetect. Since the object recognition functionality involves quite a bit of machine learning and heavy processing, it’s important to ensure that each request to the backend is made by legitimate users of the app. An attack involving many unauthorized requests could become very costly (both computationally and financially) or even cause the app to become overwhelmed and crash. Implementing a verification step with HumanDetect before each POST request is processed can protect apps like this from attacks.

Adding HumanDetect to the app was surprisingly easy, as the documentation provides step-by-step instructions for adding it to the frontend and backend. Before I wrote any additional code, I created a new developer account at developer.unify.id. After setting up a new project in the dashboard, I came across a page with a bunch of technical jargon.

UnifyID Dashboard

For HumanDetect, the only things that matter are API Keys and SDK Keys. An API key gives access to the Server APIs that are used to verify whether a request to the backend is from a human or bot, while an SDK Key is used to initialize the iOS SDK and allows the app to generate a unique token that encodes information about the human/bot user. For this project, I went ahead and created one of each.

There were a few things that needed to happen on the iOS side. Once the HumanDetect pod was added, I initialized the SDK in AppDelegate.swift using the SDK key generated from the dashboard.

import UnifyID

let unify : UnifyID = { try! UnifyID(
    sdkKey: "<YOUR SDK KEY>"
)}()

Next, I set up an instance of HumanDetect to utilize its functionality.

import HumanDetect
let humanDetect = unify.humanDetect

Data capture needs to be manually started right when the app first loads. This allows the app to begin recording data that will later be used to determine if the user is a human or bot. Maximizing the time when data capture is active will generally result in higher accuracy.

override func viewDidLoad() {
    super.viewDidLoad()
    humanDetect.startPassiveCapture()
}

Data capture continues until a token is generated right after the picture is taken, and the token is added to the same POST request as the picture to be sent to the backend.

switch humanDetect.getPassive() {
    case .success(let humanDetectToken):
                
        // Creating POST request
        let fieldName = "token"
        let fieldValue = humanDetectToken.token
        
        …

        data.append("\r\n--\(boundary)\r\n".data(using: .utf8)!)
        data.append("Content-Disposition: form-data; name=\"\(fieldName)\"\r\n\r\n".data(using: .utf8)!)
        data.append("\(fieldValue)".data(using: .utf8)!)
                
        …

        // POST request to Flask server
        session.uploadTask(with: urlRequest, from: data, completionHandler: { responseData, response, error in
                    
            …

        }).resume()

    …

}

The Flask server has also been modified to accept the token generated by the iOS app. Right after the POST request from the app, the server makes its own POST request to https://api.unify.id/v1/humandetect/verify containing the generated token and the API Key from the developer dashboard.

HEADERS = {
    'Content-Type': 'application/json',
    'X-API-Key': '<YOUR-API-KEY>',
}

@app.route("/", methods=['GET', 'POST'])
def home():
    if request.method == 'POST':
        # .get() returns None for a missing field, so the checks below can run
        file = request.files.get('file')
        token = request.form.get('token')

        if not file:
            return "Error: file not found in request"

        if not token:
            return "Error: token not found in request"

        print("token:", token)

        hd_response = requests.post('https://api.unify.id/v1/humandetect/verify', headers=HEADERS, data=json.dumps({"token": token}))

        if hd_response.status_code == 400:
            return "Error: invalid HumanDetect token"

        hd_json = hd_response.json()

        if "valid" not in hd_json or not hd_json["valid"]:
            return "Error: HumanDetect verification failed"

If the response indicates that the user is a valid human, the image is run through the Convolutional Neural Network normally. If it detects that the request is made by a bot, however, it will immediately return an error message without running the machine learning code. This ensures that bots won’t overwhelm server resources, and helps protect the integrity of the application’s infrastructure.
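
The verification step can also be separated from Flask, which makes the decision logic easier to test. The sketch below is only an approximation: the endpoint URL, the X-API-Key header, and the "valid" field come from the code above, while the helper names and the choice to accept only a 200 status are my own assumptions.

```python
import json
import urllib.request

VERIFY_URL = "https://api.unify.id/v1/humandetect/verify"


def build_verify_request(token, api_key):
    """Build (but do not send) the POST request that submits a token."""
    body = json.dumps({"token": token}).encode("utf-8")
    return urllib.request.Request(
        VERIFY_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def verification_passed(status_code, payload):
    """Decide whether to run the expensive ML code.

    Assumption: anything other than HTTP 200 with "valid": true is rejected.
    """
    return status_code == 200 and payload.get("valid") is True


# Usage (not executed here, needs a real token and API key):
#   with urllib.request.urlopen(build_verify_request(token, api_key)) as resp:
#       ok = verification_passed(resp.status, json.load(resp))
```

Rejecting by default means a malformed or unexpected response is treated the same way as a bot, which seems like the safer failure mode for a protection step.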

Next Steps

The code for this HumanDetect example is available at https://github.com/UnifyID/humandetect-sample-flask and https://github.com/UnifyID/humandetect-sample-ios. Instructions for setting everything up are included in the README files. If you run into any issues or have questions about HumanDetect, feel free to contact us.

If you want to learn more about how to counter bot attacks, I’d highly suggest reading this Medium article, which goes into more detail about various solutions including HumanDetect.

Thanks for reading! I hope that this has been helpful.

A load balancer that learns, WebTorch

In my previous blog post “How I stopped worrying and embraced docker microservices” I talked about why microservices are the bee’s knees for scaling Machine Learning in production. A fair amount of time has passed (almost a year, whoa), and it has proved that building Deep Learning pipelines in production is a more complex, multi-aspect problem. Yes, microservices are an amazing tool for software reuse, distributed systems design, quick failure and recovery, yada yada. But what seems very obvious now is that Machine Learning services are very stateful, and statefulness is a problem for horizontal scaling.

Context switching latency

An easy way to deal with this issue is to understand that ML models are large, and thus should not be context switched. If a model is started on instance A, you should try to keep it on instance A as long as possible. Nginx Plus comes with support for sticky sessions, which means that requests can always be load balanced to the same upstream, a super useful feature. That was 30% of the message of my Nginxconf 2017 talk.
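
The idea behind stickiness can be illustrated in a few lines. To be clear, this is not how Nginx Plus implements sticky sessions; it is a hash-based stand-in, and the function name and upstream addresses are made up for the example. The point is only that the same session always lands on the same instance, so the model loaded there stays warm.

```python
import hashlib


def pick_upstream(session_id, upstreams):
    """Map a session ID to one upstream, always the same one for that ID."""
    # A stable hash (unlike Python's built-in hash()) gives the same answer
    # across processes and restarts.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


# Hypothetical pool of ML instances
pool = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
first = pick_upstream("user-42", pool)
second = pick_upstream("user-42", pool)
print(first == second)  # the session sticks to one instance
```

One caveat with plain modulo hashing is that adding or removing an upstream reshuffles most sessions; consistent hashing is the usual answer when the pool changes often.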

The other 70% of my message was urging people to move AWAY from microservices for Machine Learning. In an extreme example, we announced WebTorch, a full-on Deep Learning stack on top of an HTTP server, running as a single program. For your reference, a Deep Learning stack looks like this.

Pipeline required for Deep Learning in production.
What is this data, why is it so dirty, alright now it’s clean but my Neural net still doesn’t get it, finally it gets it!

Now consider the two extremes in implementing this pipeline:

  1. Every stage is a microservice.
  2. The whole thing is one service.

Both seem equally terrible for different reasons and here I will explain why designing an ML pipeline is a zero-sum problem.

Communication latency

If every stage of the pipeline is a microservice, this introduces a huge communication overhead between microservices. This is because very large dataframes which need to be passed between services also need to be:

  1. Serialized
  2. Compressed (+ Encrypted)
  3. Queued
  4. Transferred
  5. Dequeued
  6. Decompressed (+ Decrypted)
  7. Deserialized

What a pain, what a terrible thing to spend cycles on. All of these actions need to be repeated every time a microservice boundary is crossed. The horror, the terrible end-to-end performance horror!
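
The steps above can be sketched with the standard library to get a feel for how much work a single hop involves. This is a toy model under stated assumptions: an in-process queue stands in for the network, pickle and zlib stand in for whatever serializer and compressor a real pipeline would use, and encryption is left out entirely.

```python
import pickle
import queue
import zlib


def send(frame, channel):
    """Serialize, compress and enqueue one frame; return the payload size."""
    payload = zlib.compress(pickle.dumps(frame))  # steps 1 and 2
    channel.put(payload)                          # steps 3 and 4 (no real network here)
    return len(payload)


def receive(channel):
    """Dequeue, decompress and deserialize one frame."""
    payload = channel.get()                        # step 5
    return pickle.loads(zlib.decompress(payload))  # steps 6 and 7


channel = queue.Queue()
frame = [[float(i)] * 10 for i in range(1000)]  # stand-in for a large dataframe
sent_bytes = send(frame, channel)
assert receive(channel) == frame  # the data survives the round trip unchanged
```

Every stage boundary pays this cost once in each direction, so a pipeline split into many services repeats it many times per request.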

In the opposite case, you’re writing a monolith, which is hard to maintain; you’re probably using uncomfortable semantics for either the HTTP server or the ML part, you can’t monitor the in-between stages, etc. Like I said, writing an ML pipeline for production is a zero-sum problem.

An extreme example: All-in-one deep learning

Venn diagram of torch, nginx
Torch and Nginx have one thing in common, the amazing LuaJIT

That’s right, you’ll need to look at your use case and decide where you draw the line. Where does the HTTP server stop and where does the ML back-end start? If only there were a tool that made this decision easy and allowed you to even go to the extreme case of writing a monolith, without sacrificing either HTTP performance (and pretty HTTP server semantics) or ML performance and relevance in the rapidly growing Deep Learning market. Now such a tool is here (in alpha) and it’s called WebTorch.

WebTorch is the freak child of the fastest, most stable HTTP server, nginx, and the fastest, most relevant Deep Learning framework, Torch.

Now of course that doesn’t mean WebTorch is either the best-performing HTTP server or the best-performing Deep Learning framework, but it’s at least worth a look, right? So I ran some benchmarks: I loaded the XOR neural network found at the torch training page and used another popular Lua tool, wrk, to benchmark my server. I’m sending serialized Torch 2D DoubleTensor tensors to my server using POST requests to train. Here are the results:

Huzzah! Over 1000 req/sec on my MacBook Air, with no CUDA support and 2 Intel cores!

So there, plug that into a CUDA machine and see how much performance you squeeze out of that bad baby. I hope I have convinced you that sometimes, mixing two great things CAN lead to something great and that WebTorch is an ambitious and interesting open source project!

And hopefully, in due time it will become a fast, production-level server which makes it easy for Data Scientists to deploy their models in the cloud (do people still say cloud?) and for DevOps people to deploy and scale.

Possible applications of such a tool include, but are not limited to:

  • Classification of streaming data
  • Adaptive load balancing
  • DDoS attack/intrusion detection
  • Detect and adapt to upstream failures
  • Train and serve NNs
  • Use cuDNN, cuNN and cuTorch inside NGINX
  • Write GPGPU code on NGINX
  • Machine learning NGINX plugins
  • Easily serve GPGPU code
  • Rapid prototyping Deep Learning solutions

Maybe your own?

Docker and Beanstalk: Welcome to the Gaps

At UnifyID we’re big fans of microservices à la Docker and Elastic Beanstalk. And for good reason. Containerization simplifies environment generation, and Beanstalk makes it easy to deploy and scale.

Both promise an easier life for developers, and both deliver, mostly. But as with all simple ideas, things get less, well, simple as the idea becomes more widely adopted, and then adapted into other tools and services with different goals.

Soon there are overlaps in functionality, and gaps in the knowledge base (the Internet) quickly follow. Let’s take an example.

When you first jump into Docker, it makes total sense. You have this utility docker and you write a Dockerfile that describes a system. You then tell docker to read this file and magically, a full blown programming environment is born. Bliss.

But what about running multiple containers? You’ll never be able to do it all with just a single service. Enter docker-compose, a great utility for handling just this. But suddenly, what was so clear before is now less clear:

  • Is the docker-compose.yml supposed to replace the Dockerfile? Complement it?
  • If they’re complementary, do options overlap? (Yes.)
  • If options overlap, which should go where?
  • How do the containers address each other given a specific service? Still localhost? (Not necessarily.)

Add in something like Elastic Beanstalk, its Dockerrun.aws.json file, doing eb local run, and things get even more fun to sort out.

In this post I want to highlight a few places where the answers weren’t so obvious when trying to implement a Flask service with MongoDB.

To start off, it’s a pretty straightforward setup. One container runs Flask and serves HTTP, and a second container serves MongoDB. Both are externally accessible. The MongoDB is password protected, naturally, and in no way am I going to write my passwords down in a config file. They must come from the environment.

Use the Dockerfile just for provisioning

The project began its life with a single Dockerfile containing an ENTRYPOINT to start the app. This was fine while I was still in the early stages of development — I was still mocking out parts of external functionality, or not even handling it yet.

But then I needed the same setup to provide a development environment with actual external services running, and the ENTRYPOINT in the Dockerfile became problematic. And then I realized — you don’t need it in the Dockerfile, so ditch it. Let the Dockerfile do all the provisioning, and specify your entrypoint in one of the other ways. From the command line:

docker run --entrypoint make myserver run-tests

Or, from your docker-compose.yml you can do it like

version: '2'
services:
  myserver:
    ...
    entrypoint: make dev-env

This handily solved the problem of having a single environment oriented to different needs, i.e. test runs and a live development environment.

Don’t be afraid of multiple Dockerfiles

The docker command looks locally for a file named Dockerfile. But this is just the default behavior, and it’s pretty common to have slightly different configs for an environment. E.g. our dev and production environments are very similar, but we have some extra stuff in dev that we want to weed out for production.

You can easily specify the Dockerfile you want by using docker build -f Dockerfile.dev ..., or by simply using a link: ln -s Dockerfile.dev Dockerfile && docker ...

If your docker-compose.yml specifies multiple containers you may find yourself in the situation where you not only have multiple Dockerfiles for a given service, but Dockerfile(s) for each service. To demonstrate, let’s say we have the following docker-compose.yml

version: '2'
services:
  flask:
    build: .
    image: myserver:prod
    volumes:
      - .:/app
    links:
      - mongodb
    environment:
      - MONGO_USER=${MONGO_USER}
      - MONGO_PASS=${MONGO_PASS}
    ports:
      - '80:5000'
    entrypoint: make run-server
  mongodb:
    build: ./docker/mongo
    image: myserver:mongo
    environment:
      - MONGO_USER=${MONGO_USER}
      - MONGO_PASS=${MONGO_PASS}
    ports:
      - '27017:27017'
    volumes:
      - ./mongo-data:/data/mongo
    entrypoint: bash /tmp/init.sh

In the source tree for the above, we have Dockerfiles in the following locations:

Dockerfile.dev
Dockerfile.prod
docker/mongo/Dockerfile

The docker-compose command uses the build option to tell it where to find the Dockerfile for a given service. The top two files are for the Flask service, and the appropriate Dockerfile is chosen using the linking strategy mentioned above. The mongodb service uses its own Dockerfile kept in a certain folder. The line

build: ./docker/mongo

tells docker where to look for it.

Dockerrun.aws.json, the same, but different

Enter Elastic Beanstalk and Dockerrun.aws.json. Now you have yet another file, and it pretty much duplicates docker-compose.yml — but of course with its own personality.

You use Dockerrun.aws.json v2 to deploy multiple containers to Elastic Beanstalk. Also, when you do eb local run, the file .elasticbeanstalk/docker-compose.yml is generated from it.

Here’s what the Dockerrun.aws.json equivalent of the above docker-compose.yml file looks like:

{
  "AWSEBDockerrunVersion": 2,
  "volumes": [
    {
      "name": "mongo-data",
      "host": {
        "sourcePath": "/var/app/mongo-data"
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "myserver",
      "image": "SOME-ECS-REPOSITORY.amazonaws.com/myserver:latest",
      "environment": [
          {
            "name": "MONGO_USER",
            "value": "changemeuser"
          },
          {
            "name": "MONGO_PASS",
            "value": "changemepass"
          },
          {
            "name": "MONGO_SERVER",
            "value": "mongo-server"
          }
      ],
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 5000
        }
      ],
      "links": [
        "mongo-server"
      ],
      "command": [
        "make", "run-server-prod"
      ]
    },
    {
      "name": "mongo-server",
      "image": "SOME-ECS-REPOSITORY.amazonaws.com/mongo-server:latest",
      "environment": [
          {
            "name": "MONGO_USER",
            "value": "changemeuser"
          },
          {
            "name": "MONGO_PASS",
            "value": "changemepass"
          }
      ],
      "mountPoints": [
        {
          "sourceVolume": "mongo-data",
          "containerPath": "/data/mongo"
        }
      ],
      "portMappings": [
        {
          "hostPort": 27017,
          "containerPort": 27017
        }
      ],
      "command": [
        "/bin/bash", "/tmp/init.sh"
      ]
    }
  ]
}

Let’s highlight a few things. First, you’ll see that the image option is different, i.e.

      "image": "SOME-ECS-REPOSITORY.amazonaws.com/myserver:latest",

This is because we build our docker images and push them to a private repository on Amazon ECS. On deploy, Beanstalk looks for the one tagged latest, pulls, and launches.

Next, you may have noticed that in docker-compose.yml we have the entrypoint option to start the servers. However, in Dockerrun.aws.json we’re using "command".

There are some subtle differences between ENTRYPOINT and CMD. But in this case, it’s even simpler. Even though Dockerrun.aws.json has an "entryPoint" option, the server commands wouldn’t run. I had to switch to "command" before I could get eb local run to work. Shrug.

Another thing to notice is that in docker-compose.yml we’re getting variables from the host environment and setting them into the container environment:

    environment:
      - MONGO_USER=${MONGO_USER}
      - MONGO_PASS=${MONGO_PASS}

Very convenient. However, you can’t do this with Dockerrun.aws.json. You’ll have to rewrite the file with the appropriate values, then reset it. The next bit will demonstrate this.

We’re setting a local volume for MongoDB with the following block:

  "volumes": [
    {
      "name": "mongo-data",
      "host": {
        "sourcePath": "/var/app/mongo-data"
      }
    }
  ]

The above path is production specific. This causes a problem with eb local run, mainly because of permissions on your host machine. If you set a relative path, i.e.

        "sourcePath": "mongo-data"

the volume is created under .elasticbeanstalk/mongo-data, and everything works fine. On a system with Bash, you can solve this pretty easily doing something along the following lines:

cp Dockerrun.aws.json Dockerrun.aws.json.BAK
sed -i '' "s/\/var\/app\///g" Dockerrun.aws.json
eb local run ; mv Dockerrun.aws.json.BAK Dockerrun.aws.json

We just delete the /var/app/ part, run the container locally, and return the file back to how it’s supposed to be for deploys. This is also how we set the password — changemepass — from the environment on deploy.
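
If sed feels too blunt, the same two rewrites can be done on the parsed JSON. The sketch below is an alternative to the commands above, not what we actually run; both function names are made up for illustration, and unlike the sed one-liner it only touches volume paths and environment entries rather than every occurrence of the string.

```python
import copy


def strip_volume_prefix(config, prefix="/var/app/"):
    """Return a copy of a parsed Dockerrun.aws.json with relative volume paths."""
    result = copy.deepcopy(config)
    for volume in result.get("volumes", []):
        host = volume.get("host", {})
        path = host.get("sourcePath", "")
        if path.startswith(prefix):
            host["sourcePath"] = path[len(prefix):]
    return result


def inject_environment(config, env):
    """Return a copy with placeholder values replaced by values from `env`."""
    result = copy.deepcopy(config)
    for container in result.get("containerDefinitions", []):
        for variable in container.get("environment", []):
            if variable["name"] in env:
                variable["value"] = env[variable["name"]]
    return result


# Usage (not executed here): load with json.load(), apply one of the
# functions (pass os.environ as `env` for deploys), then json.dump() the
# result back to Dockerrun.aws.json before running eb.
```

Both functions leave the input untouched, so the original file contents can be written back afterwards just like the .BAK copy in the shell version.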

Last, you’d think running eb local run, which is designed to simulate an Elastic Beanstalk environment locally via Docker, would execute pretty much the same as when you invoke with docker-compose up.

However, I discovered one frustrating gotcha. In our Flask configuration, we are addressing the MongoDB server with mongodb://mongodb (instead of mongodb://localhost) in order to make the connection work between containers.

This simply did not work in eb local run. Neither did using localhost. It turns out the solution is to use another environment variable, MONGO_SERVER. In our Flask config, we do the following, which defaults to mongodb://mongodb:

    'MONGO_SERVER': os.environ.get('MONGO_SERVER', 'mongodb'),

In Dockerrun.aws.json, we specify this value as

          {
            "name": "MONGO_SERVER",
            "value": "mongo-server"
          }

Why? Because the "name" of our container is mongo-server and eb generates an entry in /etc/hosts based on that.

So now everything works between docker-compose up, which uses mongodb://mongodb, and eb local run, which uses mongodb://mongo-server.
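
Put together, the connection string can be derived from the environment in one place. This is a minimal sketch rather than our actual config code: the mongo_uri helper is hypothetical, and the user:password@host form is the standard MongoDB connection string layout, with credentials percent-encoded in case they contain special characters.

```python
import os
from urllib.parse import quote_plus


def mongo_uri(env=os.environ):
    """Build a MongoDB URI from MONGO_SERVER, MONGO_USER and MONGO_PASS."""
    # Defaults to the docker-compose service name; eb local run overrides it
    host = env.get("MONGO_SERVER", "mongodb")
    user = env.get("MONGO_USER")
    password = env.get("MONGO_PASS")
    if user and password:
        return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}"
    return f"mongodb://{host}"


print(mongo_uri({}))                                # docker-compose up
print(mongo_uri({"MONGO_SERVER": "mongo-server"}))  # eb local run
```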

These are just a few of the things that might confound you when trying to do more than just the basics with Docker and Elastic Beanstalk. Both have a lot to offer, and you should definitely jump in if you haven’t already. Just watch out for the gaps!

How I stopped worrying and embraced Docker Microservices

Hello world,

If you are like us here at UnifyID then you’re really passionate about programming, programming languages and their runtimes. You will argue passionately about how Erlang has the best Distributed Systems model (2M TCP connections in one box), Haskell has the best type system, and how all our ML back-end should be written in Lua (Torch). If you are like me and you start a company with other people, you will argue for hours, and nobody’s feelings are gonna be left intact.

That was the first problem we had in the design phase of our Machine Learning back-end. The second problem will become obvious when you get a short introduction to what we do at UnifyID:

We data-mine a lot of sensors on your phone, do some signal processing and encryption on the phone, then opportunistically send the data from everybody’s phone into our Deep-Learning backend where the rest of the processing and actual authentication take place.

This way, the processing load is shared between the mobile device and our Deep Learning backend. Multiple GPU machines power our Deep Learning, running our proprietary Machine Learning algorithms across all of our users’ data.

These are expensive machines and we’re a startup with finite money, so here’s the second problem: scalability. We don’t want these machines sitting around when no jobs are scheduled and we also don’t want them struggling when a traffic spike hits. This is a classic auto-scaling problem.

This post describes how we killed two birds:

  1. Many programming runtimes for DL.
  2. Many machines.

With one stone, by utilizing the sweeping force of Docker microservices! This has been the next big thing in distributed systems for a while; Twitter and Netflix use it heavily, and this talk is a great place to start. Since we verify against a lot of factors, like Facial Recognition, Gait Analysis and Keystroke Analysis, it made sense to make them modular. We packaged each one in its own container, wrote a small HTTP server that satisfies the following REST API, and done!

POST /train
Body: { Files: [ <s3 file1>, <s3 file2>,...] }
Response: { jobId: <jobId> }

POST /input
Body: { Files: [ <s3 file1>, <s3 file2>,...] }
Response: { jobId: <jobId> }

POST /output
Body: { Files: [ <s3 file1>, <s3 file2>,...] }
Response: { outputVector: <output vector> }

GET /status?jobId=<jobId>
Response: { status: [running|done|error] }

This API can be useful because every Machine Learning algorithm has pretty much the same API: training inputs, normal inputs and outputs. It’s so useful we decided to open-source our microservice wrapper for Torch v7/Lua and for Python. Hopefully more people can use it and we can all start forking and pushing entire machine learning services to Docker Hub.
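
The job-tracking half of that API is small enough to sketch in plain Python. This is not the code from our open-source wrapper; the JobRegistry class and its method names are hypothetical, and a thread pool stands in for the real training or inference work. It only shows how a submit call can hand back a jobId while a status call reports running, done or error.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobRegistry:
    """In-memory bookkeeping behind POST /train, POST /input and GET /status."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, work, *args):
        """Start `work(*args)` in the background and return its job ID."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(work, *args)
        return {"jobId": job_id}

    def status(self, job_id):
        """Report running, done or error for a previously submitted job."""
        future = self._jobs[job_id]
        if not future.done():
            return {"status": "running"}
        if future.exception() is not None:
            return {"status": "error"}
        return {"status": "done"}

    def close(self):
        """Wait for all submitted jobs to finish and release the workers."""
        self._pool.shutdown(wait=True)
```

An HTTP layer in any language only needs to translate requests into submit and status calls, which is part of why the same interface fits runtimes as different as Lua and Python.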

But wait, there’s more! Now that we containerized our ML code, the scalability problem has moved from a development problem to an infrastructure problem. To handle scaling each microservice according to its GPU and network usage, we rely on Amazon ECS. We looked into Kubernetes as a way to load-balance containers; however, its support for NVIDIA GPU-based load balancing is not there yet (there’s an MR and some people who claim they made it work). Mesos was the other alternative, with NVIDIA support, but we just didn’t like all the Java.

In the end, this is what our ML infrastructure looks like.

Top-down approach to scalable ML microservices

Those EB trapezoids represent Amazon EB (Elastic Beanstalk), another Amazon service which can replicate machines (even GPU heavy machines!) using custom-set rules. The inspiration for load-balancing our GPU cluster with ECS and EB came from this article from Amazon’s Personalization team.

For our Database we use a mix of Amazon S3 and a traditional PostgreSQL database linked and used as a local cache for each container. This way, shared data becomes as easy as sharing S3 paths, while each container can modularly keep its own state in PostgreSQL.

So there you have it, both birds killed. Our ML people are happy since they can write in whatever runtime they want as long as there is an HTTP server library for it. We don’t really worry about scalability as all our services are small and nicely containerized. We’re ready to scale to as many as 100,000 users and I doubt our microservices fleet would even flinch. We’ll be presenting our setup at the upcoming DockerCon 2017 (hopefully, waiting for the CFP to open) and we’re looking to hire new ML and full-stack engineers. Come help us bring the vision of passwordless implicit authentication to everyone!