Integrating GaitAuth™ into Your Android App

GaitAuth™ is an SDK offered by UnifyID that allows you to authenticate your users based on how they move. This post demonstrates the process of adding GaitAuth to a simple Android app.

The UnifyID Developer Portal has high-level documentation that covers the basics of the GaitAuth SDK integration process. There are also auto-generated JavaDocs available for the specific details. The goal of this post is to meet the two in the middle and give some more color to the integration process. By the end of the post we’ll have built a sample Android app that uses GaitAuth to train and test a GaitModel. Along the way, we’ll also explore relevant best practices and important implementation details.

The Sample App

All of the sample app code is available on GitHub. To get a feel for the training and testing process you can build the app on your phone and try it for yourself. If you want to get right to building your own app, you can treat the repository as a collection of helpful code snippets. You can also use the code to follow along with this post in depth. Not all of the code is shown in this post, and the code that is shown has been abbreviated for simplicity’s sake. Links to the original code on GitHub are provided at the top of the snippets.

The sample app closely mirrors the functionality of the GaitAuth SDK. On the starting screen you are presented with the option to create or load a model. You’ll want to create a new model when you first use the app. In future uses, you can use the load option to skip training. After creating a model you’re ready to collect data and train the model. First, turn on collection (it will continue to collect even if the phone is locked or the app is backgrounded or closed). Once you’ve collected some features you can add them to the model. After you’ve collected enough data you can train the model. While you wait for the model to train (which may take a few hours), you can manually refresh its status.

If the model fails to train you’ll need to start over by pressing the trashcan icon in the top left corner. When the model successfully trains you’re ready to test it. Turn on feature collection and walk around for a while. When you are ready, stop collecting features and score them. This will graph the scores and show you an authentication decision.

Now that we know what the sample app does, let’s go build it.

Configuration & Initialization

Adding GaitAuth to our Gradle configuration is a good starting point. We’ll also need to request some permissions in the Android Manifest. We can follow the steps outlined in the Developer Portal documentation to take care of that.

Before the GaitAuth SDK can be used anywhere in the app it must be initialized. The initialization itself is trivial; guaranteeing that it always happens before the SDK is used is the harder part. In something as straightforward as this sample app we can simply initialize it in the onCreate method of the MainActivity.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L144
if (GaitAuth.getInstance() == null) {
    final Context context = getApplicationContext();
    // First initialize the UnifyID Core SDK
    UnifyID.initialize(context, SDK_KEY, USER, new CompletionHandler() {
        @Override
        public void onCompletion(UnifyIDConfig config) {
            // Then initialize the GaitAuth SDK with the resulting config
            GaitAuth.initialize(context, config);
            // Save these values for debugging purposes
            Preferences.put(Preferences.CLIENT_ID, config.getClientId());
            Preferences.put(Preferences.INSTALL_ID, config.getInstallId());
            Preferences.put(Preferences.CUSTOMER_ID, config.getCustomerId());
            startGaitAuthService(context);
            route();
        }
    });
}

When initialization succeeds we squirrel away a couple of relevant details in the Shared Preferences (the Preferences class hides the idiosyncrasies of the Shared Preferences API). These will come in handy for debugging. You can view them by tapping on the info icon in the top right corner of the sample app. We also start a foreground service — more on this later. Finally, we call route which determines what the app does next.

In an app with a more complex workflow, initialization of the GaitAuth SDK may not be so simple. In these scenarios we recommend you wrap the SDK in a class that protects its usage. With some simple synchronization this wrapper can ensure that the SDK is never used uninitialized.
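As a rough illustration (not code from the sample app, and with hypothetical class and method names), such a wrapper can gate every caller on initialization having completed:

import java.util.concurrent.CountDownLatch;

public final class GaitAuthHolder {
    private static final CountDownLatch READY = new CountDownLatch(1);

    private GaitAuthHolder() {}

    // Call once, after GaitAuth.initialize(...) has succeeded.
    public static void markInitialized() {
        READY.countDown();
    }

    // Blocks until initialization has completed, so callers never see an uninitialized SDK.
    public static GaitAuth awaitInstance() throws InterruptedException {
        READY.await();
        return GaitAuth.getInstance();
    }
}

With this in place, any code path that needs the SDK calls GaitAuthHolder.awaitInstance() instead of GaitAuth.getInstance() directly.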

Two important points remain. First, remember that the SDK key is a sensitive value and should be protected. In the sample app we ask the user to enter the SDK key the first time they use the app. For an app in production, something like Android Keystore can be used. Second, the user value has a few important stipulations on it. It must be unique, immutable, and should be unrelated to any personally identifiable information. We don’t recommend using things like email addresses or phone numbers for the user value.
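For example, one way to satisfy those stipulations is to generate a random identifier on first launch and persist it. The sketch below is an assumption rather than code from the sample app, and Preferences.USER_ID is a hypothetical key:

String user = Preferences.getString(Preferences.USER_ID);
if (Strings.isNullOrEmpty(user)) {
    // A random UUID is unique, never changes once stored, and carries no PII.
    user = java.util.UUID.randomUUID().toString();
    Preferences.put(Preferences.USER_ID, user);
}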

The GaitModel

Now that we’ve gotten through the drudgery of initialization it’s time to talk about the star of the show — the GaitModel. In a sense the sample app can be viewed as a tool for managing the lifecycle of a GaitModel. It creates a new model, loads training data into it, initiates training, and then tests the model. For this reason, the sample app’s routing is based on the status of the current GaitModel.

The sample app uses a single-activity/multiple-fragment architecture where each fragment is a different screen. In the MainActivity, after the GaitAuth SDK has been initialized, the following routing code runs.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L185
private void route() {
    // Load the id of the current GaitModel
    String modelId = Preferences.getString(Preferences.MODEL_ID);
    if (Strings.isNullOrEmpty(modelId)) {
        showFragment(new SelectModelFragment());
    } else {
        loadModel(modelId);
    }
}

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L287
private void loadModel(String modelId) {
    AsyncTask.execute(() -> {
        model = GaitAuth.getInstance().loadModel(modelId);
        Preferences.put(Preferences.MODEL_ID, modelId);
        renderFragmentBasedOnModelStatus(model.getStatus());
    });
}

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L255
private void renderFragmentBasedOnModelStatus(GaitModel.Status status) {
    switch (status) {
        case CREATED:
            showFragment(FeatureCollectionFragment.build(gaitAuthService.isTraining()));
            break;
        case TRAINING:
            showFragment(new ModelPendingFragment());
            break;
        case FAILED:
            showFragment(ModelErrorFragment.newInstance(model.getReason()));
            break;
        case READY:
            showFragment(TestingFragment.build(gaitAuthService.isTesting()));
            break;
        default:
            // treat it as a failure
            showFragment(ModelErrorFragment.newInstance("unknown model status"));
            break;
    }
}

First, route looks for the id of the current GaitModel and, if one is found, asynchronously loads it with the help of loadModel. After the model is loaded, renderFragmentBasedOnModelStatus is called, and a simple switch statement sends the user to the screen matching the current state of the model.

The method route will not find a model id on a user’s first time through the app or if the reset button was pressed. In these cases the user is immediately sent to the SelectModelFragment. From here, when the user clicks on the “Create Model” button, onCreateModelPressed is executed. It builds a new GaitModel, saves the model id in the shared preferences, and sends the user to the FeatureCollectionFragment. Alternatively, the user can opt to load a pre-existing model which leverages the same loadModel method.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L315
public void onCreateModelPressed() {
    AsyncTask.execute(() -> {
        model = GaitAuth.getInstance().createModel();
        Preferences.put(Preferences.MODEL_ID, model.getId());
        showFragment(new FeatureCollectionFragment());
    });
}

Feature Collection

So we have a GaitModel locked and loaded, but now what? The model is useless without training data so let’s start there. A GaitModel is trained on a collection of GaitFeatures which are data points collected from the user’s movement. These GaitFeatures are given to the model via the add method which has the signature void GaitModel.add(Collection<GaitFeature> features). Later during training, only the features explicitly added to the model will be used.

Now that we know how to use the features, how do we actually collect them? This is achieved by registering a FeatureEventListener that will fire the onNewFeature callback for every feature collected. Correctly managing feature collection requires tackling two key issues: feature storage and backgrounding.
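In outline, registration amounts to handing the SDK a listener. The registration call shown here is an assumption on my part (the sample app wires this up inside GaitAuthService, whose onNewFeature callback appears below), so check the JavaDocs for the exact method:

// Assumed registration API; see GaitAuthService in the sample repo for the real call.
GaitAuth.getInstance().registerListener(new FeatureEventListener() {
    @Override
    public void onNewFeature(GaitFeature feature) {
        // Persist the feature for later instead of adding it to the model immediately.
        featureStore.add(feature);
    }
});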

Storing Features

In a naive implementation of feature collection, every new feature would be added to the GaitModel the moment it is received. There are two problems with this. First, adding features to the GaitModel may use network resources or trigger other expensive operations, so calling it frequently is inefficient. Second, if the model fails during training, you will need to collect all of the data again for a new model since you never persisted it anywhere.

The sample app solves the feature storage problem with a thread-safe FeatureStore class. This class exposes a method with the signature void add(GaitFeature feature), which serializes the feature and appends it to a file on disk. Later (when the user clicks the “Add Feature” button) all of the features can be loaded from disk, deserialized, and returned via getAll, which has the signature List<List<GaitFeature>> getAll(). Note that it returns the features partitioned into multiple lists; this is because adding thousands of features to a model in a single call should be avoided. Finally, there is an empty() method that clears the file after the features have been added to the model, which prevents the same features from being added twice. Deleting the features after using them does reintroduce the persistence problem, but the sample app accepts that trade-off as the simplest way to avoid duplicate additions.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/GaitAuthService.java#L163
public void onNewFeature(GaitFeature feature) {
    featureStore.add(feature);
    int count = Preferences.getInt(Preferences.FEATURE_COLLECTED_COUNT) + 1;
    Preferences.put(Preferences.FEATURE_COLLECTED_COUNT, count);
}

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L363
public int onAddFeaturesPressed() {
    FeatureStore featureStore = FeatureStore.getInstance(this);
    List<List<GaitFeature>> chunks = featureStore.getAll();
    int uploadedCounter = 0;

    for (List<GaitFeature> chunk: chunks) {
        model.add(chunk);
        uploadedCounter += chunk.size();
    }
    if (uploadedCounter > 0) { // only truncate file after we uploaded something
        featureStore.empty();
    }
    return uploadedCounter;
}

In a production application there are many improvements you would want to make to FeatureStore. First and foremost, it should implement some form of in-memory buffering. Writing to disk for every feature you collect is slow and can lead to poor battery-life. Second, it should rotate files to avoid size limitations and corruption. Third, it should not clear the file after adding the features to the model. Rather, it should keep track of what features have been added and only add the new ones.
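As a rough sketch of the first improvement (in-memory buffering), assuming FeatureStore keeps its existing serialization logic behind a helper like flushToDisk (a hypothetical name), the add method might look like this:

private final List<GaitFeature> buffer = new ArrayList<>();
private static final int FLUSH_THRESHOLD = 50; // illustrative batch size

public synchronized void add(GaitFeature feature) {
    buffer.add(feature);
    if (buffer.size() >= FLUSH_THRESHOLD) {
        // One disk write per batch instead of one per feature.
        flushToDisk(buffer);
        buffer.clear();
    }
}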

Backgrounding

The second key issue to solve for feature collection is backgrounding. It would not be wise to tie the FeatureEventListener to the app’s UI: as soon as the user closed the app, GaitFeatures would no longer be collected. Android services help us solve this problem by providing a way to run a long-lived operation in the background. There are two relevant types of Android service: background and foreground. A background service needn’t give the user any indication that it is running, but it is more likely to be shut down by the OS when resources are scarce. A foreground service must present a notification for the entire time it is alive, but it is unlikely to be killed by the OS.

The sample app takes the most straightforward approach. In the onCreate method of the MainActivity a foreground service is created regardless of whether or not it will be used, which avoids the state management and synchronization issues that come with starting and stopping the service dynamically. The activity binds to the service in onCreate and unbinds from it in onStop, giving it direct access to the feature collection service (a rough sketch of the onCreate side follows the onStop snippet below).

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L104
protected void onStop() {
    super.onStop();
    if (gaitAuthServiceBound.get()) {
        unbindService(gaitAuthServiceConnection);
        gaitAuthServiceBound.set(false);
    }
}
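For the other half of that lifecycle, a simplified sketch of starting and binding to the service in onCreate looks roughly like the following. In the actual repo the service start is wrapped in startGaitAuthService, and gaitAuthServiceConnection is the same ServiceConnection that onStop unbinds:

Intent intent = new Intent(this, GaitAuthService.class);
// Promote the service to the foreground so collection survives the app being backgrounded.
ContextCompat.startForegroundService(this, intent);
bindService(intent, gaitAuthServiceConnection, Context.BIND_AUTO_CREATE);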

You may want to handle services differently in a production app. For example, you may only want to start the service when you are actually collecting features. If you go this route make sure to take special care in synchronizing the usage of the GaitAuth SDK.

Final Considerations

That was a lot of detail. Let’s step back for a moment and consider the training process as a whole. Feature collection for training is the most critical stage of the GaitAuth integration: to get good authentication results you need a well-trained model. The Developer Portal documentation has a number of suggestions on how best to do this.

Training the Model

The hard work of feature collection was well worth the effort. We now have a plethora of GaitFeatures to train with. We’re nearly ready to use our GaitModel, but first we need to train it. Thankfully, training is easy. When a user clicks on the “Train Model” button the method below is called. In a background thread it kicks off the training process for the model and sends the user to the ModelPendingFragment.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L397
public void onTrainModelPressed() {
    AsyncTask.execute(() -> {
        try {
            model.train();
            showFragment(new ModelPendingFragment());
        } catch (GaitModelException e) {
            showToast("Failed to start training model.");
        }
    });
}

At the pending model screen a user can manually refresh the status of a model while they wait for it to train.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L436
public void onRefreshPressed() {
    AsyncTask.execute(() -> {
        try {
            model.refresh();
            renderFragmentBasedOnModelStatus(model.getStatus());
        } catch (GaitModelException e) {
            showToast("Failed to refresh model status.");
        }
    });
}

After refreshing the status of the model, our old friend renderFragmentBasedOnModelStatus is called to bring the user to the right screen given the model’s status. If the training failed for some reason the ModelErrorFragment will load. From there the user has no choice other than to reset and start over. After a successful training, the user will be presented with the TestingFragment. And of course, if the model status is still TRAINING after a refresh then no navigation will occur.

Testing the Model

The hard work is over: we’ve trained a GaitModel and can now reap the benefits. All of the testing functionality is managed by the Authenticator interface. When you instantiate an Authenticator you pass it a GaitModel and a scoring policy. Once created, it automatically starts collecting features. You can ask it for an authentication decision at any time, and it will report whether the user is authenticated or unauthenticated; if there is not enough data to make a decision it will return inconclusive. When the user presses the “Start Collection” button, onStartCollectionBtnPressed is called, which in turn tells the foreground service to create a new Authenticator object.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L338
public void onStartCollectionBtnPressed() {
    try {
        gaitAuthService.startFeatureCollectionForTesting(model);
    } catch (GaitAuthException e) {
        showToast("Failed to start feature collection for testing.");
    }
}

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/GaitAuthService.java#L213
public void startFeatureCollectionForTesting(GaitModel model) throws GaitAuthException {
    GaitQuantileConfig config = new GaitQuantileConfig(QUANTILE_THRESHOLD);
    authenticator = GaitAuth.getInstance().createAuthenticator(config, model);
}

A quick aside on the GaitQuantileConfig policy and how it works. In short, it will authenticate the user if at least X percent of the scores in the session are Y or higher (learn more about scores here). X is known as the quantile and Y is the score threshold. GaitQuantileConfig sets the quantile to 50% by default, and in the sample app the score threshold is set to 0.8. Consider three example sessions with these numbers: the first session has 60% of its scores at or above the 0.8 score threshold and therefore produces an authenticated result, while the second and third sessions have only 40% and 20% of scores meeting the threshold and therefore produce unauthenticated results.
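To make the rule concrete, here is a small standalone illustration of the decision logic described above (this is just the arithmetic, not SDK code):

// Returns true when at least quantilePercent of the scores meet the threshold.
static boolean quantileDecision(double[] scores, double quantilePercent, double threshold) {
    if (scores.length == 0) {
        return false; // a real policy would report inconclusive here
    }
    int meeting = 0;
    for (double score : scores) {
        if (score >= threshold) {
            meeting++;
        }
    }
    return (100.0 * meeting / scores.length) >= quantilePercent;
}

// Example: 60% of these scores are at or above 0.8, so with a 50% quantile the result is authenticated.
// quantileDecision(new double[] {0.9, 0.85, 0.82, 0.7, 0.4}, 50, 0.8) == true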

You can customize more than just the quantile and score threshold of the GaitQuantileConfig. The method setMinNumScores lets you configure how many scores are required before an authentication decision can be made; any attempt to authenticate with fewer scores than this minimum will return inconclusive. Similarly, setMaxNumScores configures the maximum number of scores that will be considered; if there are more scores than the maximum, the most recent ones are used. Finally, setMaxScoreAge determines how old a score can be and still count toward a decision; scores older than the given age are ignored.
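Putting those setters together, a configuration might look roughly like the sketch below. The constructor argument mirrors the earlier snippet, but the argument types and units (particularly for setMaxScoreAge) are assumptions on my part, so check the JavaDocs for the exact signatures:

GaitQuantileConfig config = new GaitQuantileConfig(0.8); // score threshold, as in the sample app
config.setMinNumScores(10);   // fewer than 10 scores -> inconclusive
config.setMaxNumScores(100);  // only the 100 most recent scores are considered
config.setMaxScoreAge(600);   // assumed to be seconds; ignore scores older than 10 minutes
Authenticator authenticator = GaitAuth.getInstance().createAuthenticator(config, model);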

Back to the action. The sample app requires the user to stop collecting features before they can score them. Note that this is not a requirement of the GaitAuth SDK and is only done to simplify state management. The Authenticator has a stop method which helps us do this. Clicking the “Stop Collection” button kicks off this sequence of events.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L349
public void onStopCollectionBtnPressed() {
    gaitAuthService.stopFeatureCollectionForTesting();
}

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/GaitAuthService.java#L223
public void stopFeatureCollectionForTesting() {
    if (authenticator != null) {
        authenticator.stop();
    }
}

Now the user can press the “Score Features” button. This fetches the authenticator from the foreground service, asks it for an authentication status, and then sends the user to the ScoreFragment. The authenticator also returns the individual scores that led to the authentication decision, which the ScoreFragment plots in a graph to help visualize the result. How you use the authenticator will depend heavily on the specifics of your use case.

// https://github.com/UnifyID/gaitauth-sample-app-android/blob/470932205438dd2ce43dc6bad586df90c72cc800/app/src/main/java/id/unify/gaitauth_sample_app/MainActivity.java#L469
public void onScoreFeaturesPressed() {
    Authenticator authenticator = gaitAuthService.getAuthenticator();

    authenticator.getStatus(new AuthenticationListener() {
        @Override
        public void onComplete(AuthenticationResult authenticationResult) {
            // scoreFragment is built from the scores in authenticationResult (abbreviated here)
            showFragment(scoreFragment);
        }

        @Override
        public void onFailure(GaitAuthException e) {
            showToast("Failed to get scores.");
        }
    });
}

Conclusion

And just like that we’ve integrated GaitAuth into a simple Android application. For more implementation details you can explore the code in depth on GitHub. If you build something with GaitAuth, let us know on social media. We’d love to hear about it. Finally, reach out to us if you have any trouble integrating.

Interview With John Whaley – UnifyID by Safety Detectives

John Whaley: Founder and CEO of UnifyID

Aviva Zacks of Safety Detectives sat down with John Whaley, Founder and CEO of UnifyID. She asked him about his company’s challenges and solutions.

Safety Detectives: What was your journey to cybersecurity and what do you love about it?

John Whaley: I went to MIT for undergrad where I majored in computer science and learned about how security is implemented in the real world. During my Ph.D. at Stanford, my thesis was on the static analysis of source code to automatically find bugs, security flaws, and security holes within the software.

I founded my first company out of Stanford which was in the security space, and now I’ve started a second company in the space as well.

SD: What motivated you to start UnifyID?

JW: What I found was that every time you type a key on the keyboard, it sends a network packet, the content of which was encrypted, but you could look at the timing between the packets and then, based on that, you could determine the timing of a user’s keystroke as they typed. So we built a demo of this solution for a security conference.

It turns out that if you know the timing of somebody’s keystrokes, then you can figure out with fair reliability what it is that they are typing because, as you move your fingers around a keyboard, the spacing between keys and the duration of the time between keystrokes can leak information about what you are typing.

We used Wireshark in the demo to capture a packet trace between the client and the server for some of these major products. Then we dumped that packet trace into a tool that would look at the timing between each of the packets, and then based on that, try to make a prediction about what the user was typing.

SD: What have been some challenges?

JW: One of the challenges we had in building the demo was the fact that everyone has their own unique way of typing. And so, you could train a model that would work well for one person, but it wouldn’t necessarily work well for other people. That’s where we first got interested in noticing habits and idiosyncrasies that we could use for identity authentication. I noticed that passwords were a real challenge. Moving forward, we knew that the password alone was not going to be the way that people would be authenticated. While the password is not completely going away yet, we are starting to see its limitations and the need for additional authentication factors to provide secure digital experiences.

SD: Which industries use UnifyID and why?

JW: We have a lot of interest from the financial services industry because fraud is very costly in that area; they have a need for high security but there’s also a need for seamless user experience. The other areas are cryptocurrencies and crypto exchanges. Any type of case where there is a sharing economy where you need to authenticate not only the user, but also the worker, because the worker may not be a full-time employee of the company, and they want to make sure that the correct person is the one making a delivery or walking your dogs.

In many cases, people use our technology for streamlining physical access: for unlocking doors and cars for example, where you want security and you also want a seamless user experience.

SD: What do you feel is the worst cyberthreat today? 

JW: The biggest cyber threat continues to be the attacks that go after the end-user. We’ve reached a point now where firewalls are no longer easy targets. It is now much easier and much more lucrative to go after individuals and try to steal their identity during the authentication process by tricking them into authenticating. This way the attacker hijacks the individual’s session to take over their account and then either transfer money out or use the hijacked account as a launch point for new attacks.

When I was young, hackers were hobbyists who were hacking for fun to prove something. There was not a lot of money in it, and it was not particularly malicious. Fraud is now a cybercrime and cyberattacking is now a large industry. There is a lot of money in it. The attacks have gotten very sophisticated. Attackers will steal someone’s identity, wreck their credit, and use that to launch different types of attacks to try to extract money out of even more people.

Until now, humans have always been the weak link in security—getting tricked into either clicking through a phishing site, entering their password in the wrong place, or getting socially engineered over a phone call. With UnifyID’s behavioral biometrics technology based on motion and the way each one of us behaves, humans become a strong link in security just by behaving the way they usually do.

“Suddenly there is a much greater need to remotely authenticate people…”

John Whaley

SD: How important is multifactor authentication in the light of COVID-19 and the increase of employees working from home?

JW: The number of attacks has increased by almost 800% since the start of COVID-19. In the recent past, you were implicitly authenticated by the fact that you were physically at the office, which has its own security measures just to let you into the building. Now, with so many of us working remotely, there is suddenly a much greater need to authenticate people remotely.

One of the additional drivers for hacking is the current economic situation. In the current world environment, more and more people are out of work and lacking positive economic prospects. These conditions could drive more people to engage in hacking.

Interview originally published on Safety Detectives.

UnifyID™ Raises $20M Series A Funding from NEA to Fuel Next Gen Authentication

Company Uses Behavioral and Environmental Factors, Not Passwords, to Identify Users

SAN FRANCISCO, CA – August 1, 2017 – UnifyID is leading the development of an implicit authentication platform that requires zero conscious user actions. The Company announced today that it has closed $20 million in Series A financing led by NEA. Its General Partners Scott Sandell and Forest Baskett will be joining UnifyID’s Board. Investors Andreessen Horowitz, Stanford StartX, and Accomplice Ventures previously invested in the company’s Seed round, bringing the total invested to $23.4 million. This latest round of funding will be used to grow the team to expand enterprise trials, accelerate research and maintain the company’s position as the leader in implicit authentication and behavioral biometrics.

“Our goal is seamless security: you can be yourself and the devices and services you interact with will naturally recognize you based on what makes you unique,” said UnifyID founder John Whaley. Since 2015, UnifyID has been using a combination of signal processing, optimization theory, deep learning, statistical machine learning, and computer science to solve one of the oldest and most fundamental problems in organized society: How do I know you are who you say you are?

To date, the company has developed the first implicit authentication platform designed for online and physical world use. Named RSA’s Unanimous Winner for 2017, UnifyID utilizes sensor data from everyday devices and machine learning to authenticate you based on unique factors like the way you walk, type, and sit. The company has also partnered with global corporations to assess the generalizability of their software across industries.

The UnifyID solution combines over 100 different attributes to achieve 99.999% accuracy without users changing their behavior or needing specific training. The key is the proliferation of sensors combined with innovations in machine learning. UnifyID is the first product to develop neural networks to run locally on the phone to process sensor data in real-time.

“A large percentage of data breaches involve weak, default or stolen passwords, and we think passwords – as we know them – need an overhaul,” said Forest Baskett, NEA General Partner. “We are excited about the world-changing potential of UnifyID’s frictionless, universal authentication solution.”

In the past six months, UnifyID received national attention by winning security innovation competitions at TechCrunch Disrupt, RSA, and SXSW and continued to grow its engineering, machine learning, and enterprise deployment talent. For career and partnership inquiries, learn more at https://unify.id.

 

ABOUT UNIFYID
Headquartered in San Francisco, UnifyID is the first implicit authentication platform. Its proprietary approach uses behavioral and environmental factors to identify users. In February of 2017, the Company was recognized as the most innovative start-up at RSA. For career and partnership inquiries, learn more at https://unify.id.

ABOUT NEA
New Enterprise Associates, Inc. (NEA) is a global venture capital firm focused on helping entrepreneurs build transformational businesses across multiple stages, sectors and geographies. With over $19 billion in cumulative committed capital since the firm’s founding in 1977, NEA invests in technology and healthcare companies at all stages in a company’s lifecycle, from seed stage through IPO. The firm’s long track record of successful investing includes more than 210 portfolio company IPOs and more than 360 acquisitions. For additional information, visit www.nea.com.

 

Contacts
Grace Chang
grace [at] unify.id

Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations

PDF of full paper: Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations
Full-size poster image: Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations

[This paper was presented on July 21, 2017 at The First International Workshop on The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security (CV-COPS 2017), in conjunction with the 2017 IEEE Conference on Computer Vision and Pattern Recognition.]

Vinay Uday Prabhu and John Whaley, UnifyID, San Francisco, CA 94107

Abstract

In this paper, we would like to draw attention towards the vulnerability of the motion sensor-based gait biometric in deep learning-based implicit authentication solutions, when attacked with adversarial perturbations, obtained via the simple fast-gradient sign method. We also showcase the improvement expected by incorporating these synthetically-generated adversarial samples into the training data.

Introduction

In recent times, password entry-based user-authentication methods have increasingly drawn the ire of the security community [1], especially when it comes to its prevalence in the world of mobile telephony. Researchers [1] recently showcased that creating passwords on mobile devices not only takes significantly more time, but it is also more error prone, frustrating, and, worst of all, the created passwords were inherently weaker. One of the promising solutions that has emerged entails implicit authentication [2] of users based on behavioral patterns that are sensed without the active participation of the user. In this domain of implicit authentication, measurement of gait-cycle [3] signatures, mined using the on-phone Inertial Measurement Unit – MicroElectroMechanical Systems (IMU-MEMS) sensors, such as accelerometers and gyroscopes, has emerged as an extremely promising passive biometric [4, 5, 6]. As stated in [7, 5], gait patterns can not only be collected passively, at a distance, and unobtrusively (unlike iris, face, fingerprint, or palm veins), they are also extremely difficult to replicate due to their dynamic nature.

Inspired by the immense success that Deep Learning (DL) has enjoyed in recent times across disparate domains, such as speech recognition, visual object recognition, and object detection [8], researchers in the field of gait-based implicit authentication are increasingly embracing DL-based machine-learning solutions [4, 5, 6, 9], thus replacing the more traditional hand-crafted-feature-engineering-driven shallow machine-learning approaches [10]. Besides circumventing the oft-contentious process of hand-engineering the features, these DL-based approaches are also more robust to noise [8], which bodes well for the implicit-authentication solutions that will be deployed on mainstream commercial hardware. As evinced in [4, 5], these classifiers have already attained extremely high accuracy (∼96%) when trained under the k-class supervised classification framework (where k pertains to the number of individuals). While these impressive numbers give the impression that gait-based deep implicit authentication is ripe for immediate commercial implementation, we would like to draw the attention of the community towards a crucial shortcoming.

In 2014, Szegedy et al. [11] discovered that, quite like shallow machine-learning models, state-of-the-art deep neural networks are vulnerable to adversarial examples: inputs synthetically generated by strategically introducing small perturbations that leave the adversarial example only slightly different from correctly classified examples drawn from the data distribution, yet result in a potentially controlled misclassification. To make things worse, a plethora of models with disparate architectures, trained on different subsets of the training data, have been found to misclassify the same adversarial example, uncovering the presence of fundamental blind spots in our DL frameworks. Since this discovery, several works have emerged ([12, 13]) addressing both means of defence against adversarial examples and novel attacks. Recently, the cleverhans software library [13] was released. It provides standardized reference implementations of adversarial example-construction techniques and adversarial training, thereby facilitating rapid development of machine-learning models robust to adversarial attacks, as well as providing standardized benchmarks of model performance in the adversarial setting described above. In this paper, we focus on harnessing the simplest of all adversarial attack methods, the fast gradient sign method (FGSM), to attack the IDNet deep convolutional neural network (DCNN)-based gait classifier introduced in [4].

Our main contributions are as follows:

1. This is, to the best of our knowledge, the first paper that introduces deep adversarial attacks into this non-computer-vision setting, specifically the gait-driven implicit-authentication domain. In doing so, we hope to draw the attention of the community to this crucial issue, in the hope that further publications will incorporate adversarial training as a default part of their training pipelines.

2. One of the enduring images widely circulated in the adversarial-training literature is the panda + nematode = gibbon adversarial-attack example on GoogleNet in [14], which was instrumental in vividly showcasing the potency of the blind spot. In this paper, we do the same with accelerometric data to illustrate how a small and seemingly imperceptible perturbation to the original signal can cause the DCNN to make a completely wrong inference with high probability.

3. We empirically characterize the degradation of classification accuracy when subjected to an FGSM attack, and highlight the improvement obtained by introducing adversarial training.

4. Lastly, we have open-sourced the code here.

Figure 1. Variation in the probability of correct classification (37 classes) with and without adversarial training for varying ε.

Figure 2. The true accelerometer amplitude signal and its adversarial counterpart for ε = 0.4.

2. Methodology and Results

In this paper, we focus on the DCNN-based IDNet [4] framework, which entails harnessing low-pass-filtered tri-axial accelerometer and gyroscope readings (plus the sensor-specific magnitude signals) to first extract the gait template, of dimension 8 × 200, which is then used to train a DCNN in a supervised-classification setting. In the original paper, the model identified users in real time by using the DCNN as a deep-feature extractor and further training an outlier detector (a one-class support vector machine, SVM), whose individual gait-wise outputs were finally combined in a Wald’s probability-ratio-test-based framework. Here, we focus on the trained IDNet-DCNN and characterize its performance in the adversarial-training regime. To this end, we harness the FGSM introduced in [14], where the adversarial example x̃ for a given input sample x is generated by x̃ = x + ε · sign(∇_x J(θ, x)), where θ represents the parameter vector of the DCNN, J(θ, x) is the cost function used to train the DCNN, and ∇_x(·) denotes the gradient with respect to the input.

As seen, this method is parametrized by ε, which controls the magnitude of the inflicted perturbation. Fig. 2 shows the true and adversarial gait-cycle signals for the accelerometer magnitude signal, given by a_mag(t) = √(a_x²(t) + a_y²(t) + a_z²(t)), for ε = 0.4. Fig. 1 captures the drop in the probability of correct classification (37 classes) with increasing ε. First, we see that in the absence of any adversarial examples we obtain about 96% accuracy on the 37-class classification problem, which is very close to what is claimed in [4]. However, with even mild perturbations (ε = 0.4), we see a sharp decrease of nearly 40% in accuracy. Fig. 1 also captures the effect of including the synthetically generated adversarial examples in the training data: for ε = 0.4, we achieve about 82% accuracy, an improvement of ∼25%.

3. Future Work

This brief paper is part of an ongoing research endeavor. We are currently extending this work to other adversarial-attack approaches, such as the Jacobian-based Saliency Map Approach (JSMA) and the Black-Box Attack (BBA) approach [15]. We are also investigating the effect of these attacks within the deep-feature-extraction + SVM approach of [4], and we are comparing other architectures, such as [6] and [5].

References
[1] W. Melicher, D. Kurilova, S. M. Segreti, P. Kalvani, R. Shay, B. Ur, L. Bauer, N. Christin, L. F. Cranor, and M. L. Mazurek, “Usability and security of text passwords on mobile devices,” in Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 527–539, ACM, 2016.
[2] E. Shi, Y. Niu, M. Jakobsson, and R. Chow, “Implicit authentication through learning user behavior,” in International Conference on Information Security, pp. 99–113, Springer, 2010.
[3] J. Perry, J. R. Davids, et al., “Gait analysis: normal and pathological function,” Journal of Pediatric Orthopaedics, vol. 12, no. 6, p. 815, 1992.
[4] M. Gadaleta and M. Rossi, “IDNet: Smartphone-based gait recognition with convolutional neural networks,” arXiv preprint arXiv:1606.03238, 2016.
[5] Y. Zhao and S. Zhou, “Wearable device-based gait recognition using angle embedded gait dynamic images and a convolutional neural network,” Sensors, vol. 17, no. 3, p. 478, 2017.
[6] S. Yao, S. Hu, Y. Zhao, A. Zhang, and T. Abdelzaher, “DeepSense: A unified deep learning framework for time-series mobile sensing data processing,” arXiv preprint arXiv:1611.01942, 2016.
[7] S. Wang and J. Liu, Biometrics on Mobile Phone. INTECH Open Access Publisher, 2011.
[8] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[9] N. Neverova, C. Wolf, G. Lacey, L. Fridman, D. Chandra, B. Barbello, and G. Taylor, “Learning human identity from motion patterns,” IEEE Access, vol. 4, pp. 1810–1820, 2016.
[10] C. Nickel, C. Busch, S. Rangarajan, and M. Möbius, “Using hidden Markov models for accelerometer-based biometric gait recognition,” in Signal Processing and its Applications (CSPA), 2011 IEEE 7th International Colloquium on, pp. 58–63, IEEE, 2011.
[11] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
[12] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
[13] N. Papernot, I. Goodfellow, R. Sheatsley, R. Feinman, and P. McDaniel, “cleverhans v1.0.0: an adversarial machine learning library,” arXiv preprint arXiv:1610.00768, 2016.
[14] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572, 2014.
[15] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against deep learning systems using adversarial examples,” arXiv preprint arXiv:1602.02697, 2016.

Our Pledge to Inclusion and Diversity: 1 Year Later

Lack of diversity in tech has been a long-standing problem, but in recent months it’s become increasingly apparent that inclusion is more than an aspirational need. At UnifyID, diversity is the DNA that creates robust, flourishing environments primed for tough conversations and progressive thinking.

Last June, UnifyID was one of 33 companies that signed the White House Tech Inclusion Pledge on the eve of President Obama’s Global Entrepreneurship Innovation Summit 2016 to ensure that our employees reflect the diverse nature of the American workforce.

Although UnifyID is a small startup, we still want to lead in all areas of our business—and diversity is no exception. As an inaugural signatory of this agreement, the first of its kind, we proudly reaffirm our commitment to being an industry leader in promoting inclusion for all.

Our team on a normal day in the office.

The pledge was three-part, with the central aim of increasing representation of underrepresented groups:

“Implement and publish company-specific goals to recruit, retain, and advance diverse technology talent, and operationalize concrete measures to create and sustain an inclusive culture.”

This was a task we have invested significant time and effort into accomplishing, particularly in our recruitment operations. Many job seekers and experts alike have criticized the inconsistent process around the technical interview, noting its irrelevance to the workplace and its unnecessary biases against women. Taking into account these guidelines from Code2040, a collaborating organization of the Tech Inclusion Pledge, we’ve created a low-stress, context-relevant, and fun language-agnostic technical challenge to reduce bias in the screening stage of our recruiting process.

“Annually publish data and progress metrics on the diversity of our technology workforce across functional areas and seniority levels.”

It is important to us that we are transparent about our gender, racial, and ethnic data because diversity and inclusion are a core part of our company mission to be authentic, be yourself. This report is our first attempt at that transparency, and we hope to publish future updates more frequently.

On our team, 70 percent are people of color and 24 percent are women. Immigrants make up a significant part of the American workforce, and we are also proud to call UnifyID the workplace of immigrants who collectively represent 17 nationalities (including our interns). Paulo, one of our machine learning engineers, has quipped, “the office sometimes feels like a Model UN conference!” While our size makes us unable to release more detailed breakouts (we respect employee privacy), we will continue to release diversity data in a timely and transparent fashion.

“Invest in partnerships to build a diverse pipeline of technology talent to increase our ability to recognize, develop and support talent from all backgrounds.”

Here in the Bay Area, we are surrounded by terrific organizations that support underrepresented groups in tech, and we’ve been fortunate to be involved in their events. These include the annual Out for Undergrad (O4U) Tech Conference, which allowed us to connect with many high-achieving LGBTQ+ undergraduates from across the country, the Y Combinator-hosted Female Founders Conference, and even SF Pride last month!

Our head of Product, Grace Chang, at last year’s Out for Undergrad (O4U) Tech Conference!

Diversity strengthens us as a company and as a country, so this remains one of our foremost priorities as we continue to grow (we’re hiring) and we hope to see improvement in our workplace and in the industry as a whole. We are thrilled that today, the number of companies that have signed the pledge has risen to 80.

We encourage more companies to sign this Tech Inclusion Pledge here.