Fooling Fintech AIs Part 2: How to fool the machines

Grow · Published in The Grow Blog · 5 min read · May 17, 2017



In the previous post of this series, we learned that users may try to fool the AI systems in the financial products they use. In this part, we will explain how a user can exploit a supervised machine learning system for their own benefit.

This is Part 2 of our series on Fooling Fintech AIs. You can read part 1 here.

Traditional software follows well-defined logic designed by product managers and software engineers. In contrast, machine learning systems rely on data sets to automatically determine what the optimal logic should be for a given decision. Generally, a machine learning system looks at millions of previous examples, called the ‘training data’, and calibrates itself to make the best decisions possible on new examples that it hasn’t seen before, called the ‘inference data’. Sometimes, some of the training and/or inference data is under the user’s control, and this allows a user to influence the machine learning system’s decisions. Researchers call this area ‘adversarial machine learning’.
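
To make the distinction between training data and inference data concrete, here is a minimal sketch using scikit-learn. The features, labels, and numbers are purely illustrative assumptions, not anything resembling a production model.

```python
# A minimal supervised-learning sketch: fit on historical examples ('training data'),
# then decide on a new, unseen example ('inference data'). All values are made up.
from sklearn.linear_model import LogisticRegression

# 'Training data': past transactions with known outcomes (features are illustrative).
X_train = [[25.0, 1], [900.0, 0], [12.5, 1], [1500.0, 0]]  # [amount, card_present]
y_train = [0, 1, 0, 1]                                     # 0 = legitimate, 1 = fraud

model = LogisticRegression().fit(X_train, y_train)

# 'Inference data': a new transaction the model has never seen before.
X_new = [[40.0, 1]]
print(model.predict(X_new))        # the model's decision
print(model.predict_proba(X_new))  # and its confidence
```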

The most familiar example of adversarial machine learning is a spammer attempting to trick a spam filter. If the spammer can send a huge number of legitimate emails that use words and phrases resembling their spam messages, they may be able to inject a large amount of poisoned data into the filter’s training set. Or, more commonly, they may rewrite their spam messages using words and phrases like those found in the spam filter’s ‘legitimate email’ training examples. The goal in either case is the same: to cause the spam filter to misclassify their spam messages as legitimate emails.
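
This evasion strategy is easy to sketch with a toy naive Bayes spam filter. The five-message corpus below is an invented example; the point is only that rewording a spam message with ‘legitimate’ vocabulary can flip the classifier’s decision.

```python
# A toy spam filter (bag-of-words + naive Bayes) and an evasion attempt against it.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "meeting agenda attached please review",  # legitimate
    "invoice for last month attached",        # legitimate
    "lunch tomorrow at noon",                 # legitimate
    "win free money now click here",          # spam
    "free prize claim your money now",        # spam
]
labels = ["ham", "ham", "ham", "spam", "spam"]

spam_filter = make_pipeline(CountVectorizer(), MultinomialNB()).fit(emails, labels)

original = "click here to win free money"
reworded = "please review the attached invoice for your payment"

# On this toy corpus, the reworded message is misclassified as legitimate.
print(spam_filter.predict([original, reworded]))  # expected: ['spam' 'ham']
```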

Machine learning systems rely on a key assumption that makes them weak against intentional deception: they assume that the training data (the data used to build the model) and the inference data (the data used to make a decision in production) share the same distribution. A user with some control over the training data or the inference data can break this assumption. This can result in the machine learning system making incorrect decisions more often. Note that there are other types of attacks that an adversarial user can make on a machine learning system, but we will limit our focus for these articles to attacks fitting this pattern.
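
The effect of breaking this assumption can be illustrated with synthetic data: the sketch below trains a simple classifier on one distribution, then scores it on inference data that an adversary has deliberately shifted toward the other class. The data and the amount of shift are arbitrary choices made for illustration.

```python
# Inference data from the same distribution vs. an adversarially shifted one.
# Everything here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(shift=0.0, n=2000):
    """Two classes of 2-D points; `shift` moves class 1 toward class 0."""
    X0 = rng.normal([0.0, 0.0], 1.0, size=(n // 2, 2))
    X1 = rng.normal([3.0 - shift, 3.0 - shift], 1.0, size=(n // 2, 2))
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return np.vstack([X0, X1]), y

X_train, y_train = sample()
model = LogisticRegression().fit(X_train, y_train)

X_iid, y_iid = sample()           # inference data that matches the training distribution
X_adv, y_adv = sample(shift=2.5)  # an adversary pushes class 1 toward class 0

print("accuracy, same distribution:   ", model.score(X_iid, y_iid))
print("accuracy, shifted distribution:", model.score(X_adv, y_adv))
```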

Modifying the training data is what happened with Microsoft’s Twitter chatbot, “Tay”, in March 2016. Initially, the chatbot was trained to converse like an American teenage girl, but it was also designed to learn from its interactions with Twitter users in order to improve its conversational skills. A large group of internet trolls informally coordinated to poison the bot’s training data by sending it profane and hateful tweets. The attack was successful, and the bot began to tweet misogynistic, racist, and profane messages. Microsoft took Tay offline, but some damage to its brand was already done.

This type of attack is possible when users are able to manipulate enough of the training data. The attackers poison the training data by submitting examples that misrepresent the intended training distribution, corrupting the learning process. The result may be that the machine learning system can no longer perform its task at the expected level. In extreme cases, it could be retrained to perform an entirely different task than the one it was intended for.
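
As a rough illustration of poisoning, the sketch below injects mislabeled points into an otherwise clean synthetic training set and measures the drop in test accuracy. Real poisoning attacks are usually more targeted than this, but the mechanism is the same.

```python
# Training-data poisoning: mislabeled examples injected by an attacker degrade the
# trained model's accuracy on clean test data. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_clean(n_per_class=500):
    X = np.vstack([rng.normal(0, 1, (n_per_class, 2)),   # class 0
                   rng.normal(4, 1, (n_per_class, 2))])  # class 1
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_train, y_train = make_clean()
X_test, y_test = make_clean()

clean_model = LogisticRegression().fit(X_train, y_train)
print("clean training data:   ", clean_model.score(X_test, y_test))

# The attacker submits points that look like class 1 but are labeled as class 0,
# misrepresenting the intended training distribution.
X_poison = rng.normal(4, 1, (600, 2))
y_poison = np.zeros(600, dtype=int)

poisoned_model = LogisticRegression().fit(np.vstack([X_train, X_poison]),
                                          np.concatenate([y_train, y_poison]))
print("poisoned training data:", poisoned_model.score(X_test, y_test))
```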

A more common type of attack occurs when a user manipulates the inference data so that it differs from the training data. A familiar example is the fraudulent “card testing” transaction that fraudsters use to determine whether stolen credit card information is still active. The fraudster chooses the merchant, the amount, and the timing of the transaction so that they are as close as possible to a legitimate purchase. In other words, the fraudster is attempting to recreate the kind of transaction that the fraud prevention system has been trained to recognize as legitimate. Because these transactions resemble ordinary, low-value purchases, they are often successful. Fortunately, high-value fraudulent transactions are much easier for the fraud detection system to identify, so the “card testing” transaction is mostly useful for confirming that a stolen card works before attempting a higher-value fraudulent transaction.
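
The sketch below makes the idea concrete with a toy fraud model trained only on synthetic transaction amounts. The single feature, the dollar amounts, and the scores are illustrative assumptions; real fraud systems use far richer features, which is exactly why fraudsters also mimic the merchant and the timing.

```python
# A toy fraud model and a 'card testing' transaction crafted to look like an
# everyday purchase. Amounts and model are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Synthetic history: legitimate purchases cluster near $40, fraud near $800.
legit = rng.normal(40, 15, 5000).clip(min=1)
fraud = rng.normal(800, 200, 500).clip(min=1)
X = np.concatenate([legit, fraud]).reshape(-1, 1)
y = np.array([0] * 5000 + [1] * 500)

fraud_model = LogisticRegression(max_iter=1000).fit(X, y)

card_test = [[2.50]]     # a small charge that resembles a routine purchase
big_fraud = [[1200.00]]  # the follow-up high-value transaction

print("card-test fraud score: ", fraud_model.predict_proba(card_test)[0, 1])
print("high-value fraud score:", fraud_model.predict_proba(big_fraud)[0, 1])
```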

Some machine learning systems can be alarmingly easy to fool when an attacker is able to manipulate the inference data. Computer vision researchers have demonstrated that adding imperceptible noise to an image can influence a trained neural network to classify the image however the attacker wants. While the machine learning system may be excellent at classifying ‘natural’ images, imperceptible deviations from ‘natural’ can exploit weaknesses in a neural network’s training process.

Adding subtle noise to an image can cause computer vision systems to confidently misclassify images [source: OpenAI]
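
One published technique for crafting perturbations like the one pictured above is the ‘fast gradient sign method’ (FGSM), which nudges every pixel slightly in the direction that increases the model’s loss. The sketch below shows the core idea in PyTorch; the tiny stand-in model at the end is an untrained placeholder included only so the snippet runs end to end.

```python
# Fast gradient sign method (FGSM): perturb an image by epsilon * sign(gradient of
# the loss with respect to the image). Works against any differentiable classifier.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Return a copy of `image` nudged to increase the classification loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Each pixel moves by at most +/- epsilon, typically too small to notice.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Demo with an untrained stand-in classifier (illustration only).
dummy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
image = torch.rand(1, 3, 32, 32)  # one 32x32 RGB 'image'
label = torch.tensor([3])
adversarial = fgsm_perturb(dummy_model, image, label)
print((adversarial - image).abs().max())  # perturbation is bounded by epsilon
```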

A fraudster could use this technique to bypass a facial recognition system used to verify someone’s identity during an online banking transaction. More sophisticated techniques exist as well, and in some cases a facial recognition system can even be bypassed with a mask made from a high-quality photograph.

These security issues can affect any machine learning system that makes imperfect decisions. They are not limited to specific use cases or statistical modeling techniques. In the financial services industry, machine learning models have been used for decades (for credit scoring and fraud detection, for example), and machine learning will continue to take on an ever greater role in the industry. For many applications, users have an opportunity to affect the training data or the inference data through their actions. Users who try to game the system can degrade the performance of the machine learning systems used for fraud detection, credit underwriting, cross-sales, customer service, or any other aspect of a business that relies on machine learning, and these attacks can lead to significant financial losses for a financial institution. Despite the risks, machine learning has been used with great success for decades and can continue to be used with confidence. The key is to design critical systems holistically, with a full appreciation of the strengths and weaknesses of their machine learning components.

In the next article in this series, we will discuss practical steps financial institutions can take when designing their machine learning systems to account for users who attempt to game them, and how to prevent attacks that could otherwise result in financial losses. We will also see how attempted attacks can be used to make these systems more robust against future attacks.

Written by Dan Mazur, PhD, Data Scientist at Grow.

Grow is an Enterprise Fintech company, using technology and data to create modern banking products for leading global financial institutions.