There are a number of fraud detection tools available on the market, but for some companies, it’s important to have greater control over how the fraud scoring algorithms work and to have them keep learning and improving using machine learning. Our team has put together a demo and some sample code to show how you can build this with Fuzzy.ai.

For this example, let’s assume that Fuzzy.ai is being used by a company with an online store, where customers order products online. After running their business for a year, they’ve noticed that all of the following factors often indicate that a transaction is fraudulent:

Transactions from new users

Transactions much larger than the online store’s average size

Mismatches and very large distances between the billing address, shipping address, phone number location and IP address location

They also noticed that in addition to the points above, some lesser indicators of fraud are:

Use of the most expensive shipping option

Multiple credit cards used from the same IP address

Multiple credit cards shipping to the same address
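To make the indicators above concrete, here is a minimal Python sketch that turns them into a single 0 to 100 score. It is not Fuzzy.ai's implementation and does not call its API. The field names, the thresholds (30 days, 5x the average order, 1,000 km) and the half weight given to the lesser indicators are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    account_age_days: int        # days since the user signed up
    amount: float                # order total
    address_distance_km: float   # largest distance between billing, shipping, phone and IP locations
    expensive_shipping: bool     # most expensive shipping option chosen
    cards_from_same_ip: int      # distinct cards seen from this IP address
    cards_to_same_address: int   # distinct cards shipping to this address


def clamp01(x: float) -> float:
    """Limit a value to the range 0..1."""
    return max(0.0, min(1.0, x))


def fraud_score(tx: Transaction, average_amount: float) -> float:
    """Return a score from 0 (no indicators) to 100 (every indicator at its maximum)."""
    # Main indicators. All thresholds are illustrative assumptions.
    new_user = clamp01(1.0 - tx.account_age_days / 30.0)
    large_order = clamp01((tx.amount / average_amount - 1.0) / 4.0)
    far_apart = clamp01(tx.address_distance_km / 1000.0)

    # Lesser indicators.
    shipping = 1.0 if tx.expensive_shipping else 0.0
    many_cards_ip = clamp01((tx.cards_from_same_ip - 1) / 3.0)
    many_cards_address = clamp01((tx.cards_to_same_address - 1) / 3.0)

    strong = [new_user, large_order, far_apart]
    weak = [shipping, many_cards_ip, many_cards_address]

    # Assumption: lesser indicators count half as much as the main ones.
    total = sum(strong) + 0.5 * sum(weak)
    maximum = len(strong) + 0.5 * len(weak)
    return 100.0 * total / maximum
```

A hand-weighted score like this is only a starting point; the thresholds and weights are exactly the kind of thing you would want to tune from feedback.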

With Fuzzy.ai, you could quickly build an agent based on these rules and get a fraud score in real time for each transaction. In fact, we built one and created a demo interface you can use to try it out.

Chatbots have become a really popular way for companies to interact with customers. Most chatbot experiences, though, are pretty static. As a developer, you need to set up a messaging flow with content that doesn’t change much based on user interaction.

Some of our users have discovered that you can use Fuzzy.ai along with most chatbot frameworks to create a much more personalized experience for your customers. Some examples include offering dynamic promotions based on user behavior, and recommending the most appropriate products.

Behind the scenes, Fuzzy.ai is taking the data from the chat responses to predict the discount most likely to result in a purchase. To get started, the Fuzzy.ai agent powering the discount decision is a pretty simple one. It’s based on these 5 rules:

IF hasAccount IS no THEN discount IS high
IF hasAccount IS yes THEN discount IS low
IF tutorial IS no THEN discount IS high
IF tutorial IS yes THEN discount IS low
lastAPICall INCREASES discount

And the discount that Promobot offers in our example ranges from 0% to 20% based on the user’s responses.
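As a rough illustration of how five rules like these can combine into one number, here is a small Python sketch. It is not how Fuzzy.ai evaluates rules internally. It assumes that "low" means 0% and "high" means 20%, that lastAPICall is measured in days since the user's last API call and saturates at 30 days, and that every rule that fires gets an equal vote.

```python
LOW_DISCOUNT = 0.0    # assumed value for "discount IS low"
HIGH_DISCOUNT = 20.0  # assumed value for "discount IS high"


def discount(has_account: bool, did_tutorial: bool, days_since_last_api_call: float) -> float:
    """Combine the rules into a discount between 0 and 20 percent."""
    votes = [
        # IF hasAccount IS no THEN discount IS high / IS yes THEN discount IS low
        LOW_DISCOUNT if has_account else HIGH_DISCOUNT,
        # IF tutorial IS no THEN discount IS high / IS yes THEN discount IS low
        LOW_DISCOUNT if did_tutorial else HIGH_DISCOUNT,
    ]

    # lastAPICall INCREASES discount: the longer since the last call, the bigger
    # the discount. The 30-day ceiling is an assumption.
    recency = max(0.0, min(1.0, days_since_last_api_call / 30.0))
    votes.append(LOW_DISCOUNT + recency * (HIGH_DISCOUNT - LOW_DISCOUNT))

    # Every rule that fired gets an equal say in the final discount.
    return sum(votes) / len(votes)
```

For example, a user with an account who skipped the tutorial and last called the API 15 days ago gets `discount(True, False, 15)`, which works out to 10.0.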

Once Promobot offers the user a discount code, it then uses Fuzzy.ai’s machine learning to automatically optimize its rules to offer the lowest effective discount possible.

In our example, Promobot does this by asking the user if they plan on using the discount code and then providing that feedback to Fuzzy.ai. In real implementations, you might send this feedback to Fuzzy.ai once the user actually makes a purchase.

The feedback Promobot provides is the % of full price paid by the customer. So, if Promobot offers a 15% discount and the customer says they plan to use it, Promobot sends “85” (100% minus 15%) to Fuzzy.ai. If the customer says they don’t plan to use it, then we consider it a lost sale and Promobot sends “0” (zero) to Fuzzy.ai.
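The feedback arithmetic described above is simple enough to sketch in a few lines of Python. The function name is hypothetical, and how the value is actually sent to Fuzzy.ai is left out.

```python
def feedback_value(discount_percent: float, will_use_code: bool) -> float:
    """Percent of full price paid: 100 minus the discount, or 0 for a lost sale."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    if not will_use_code:
        return 0.0  # customer will not use the code: treated as a lost sale
    return 100.0 - discount_percent
```

So `feedback_value(15, True)` gives 85.0 and `feedback_value(15, False)` gives 0.0, matching the example above.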

Fuzzy.ai’s machine learning algorithms seek to maximize this value, and will automatically optimize the rules to get the highest possible result.

This is a simple example, but a more complex one could include dozens of questions and external data to create a truly personalized experience.

Some of our Google Sheets Add-on beta users have been using it to do some really interesting stuff. One of my favorites is the company that is using it to identify their best customers by creating custom user scores using Fuzzy.ai and user data from Intercom.

Here’s how you can build something similar:

Gather Data

First, export user data from Intercom (Instructions here) and upload it to a new Google Sheet. It should look something like this:

Given all of the data available in Intercom, you can do a lot here. To get started, let’s narrow it down to a few columns:

With these columns, we can easily get a few useful facts about each customer:

How long has it been since they signed up

How long since their last login

How many times they’ve logged in

Whether they’re on a free or paid plan

You can add columns and use an easy formula in Google Sheets to calculate the number of days between today and the Signed up and Last seen dates:

=datedif(B2,today(),"D")
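If you would rather prepare the data outside of Sheets, the same day count can be sketched in Python with the standard `datetime` module. The function name is hypothetical; the dates would come from the Signed up or Last seen column of your export.

```python
from datetime import date
from typing import Optional


def days_since(start: date, today: Optional[date] = None) -> int:
    """Whole days between a start date and today."""
    if today is None:
        today = date.today()
    return (today - start).days
```

Passing `today` explicitly, as in `days_since(date(2016, 1, 1), date(2016, 1, 31))`, makes the result reproducible; that call returns 30.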

And an if/then statement to show whether or not the user is on a paid plan:

A common application of Fuzzy.ai is in powering custom recommendation engines. For many companies, generic solutions don’t offer enough flexibility (or require too much work manually setting up links between all of the different products in the catalog), and building a custom recommendation engine from scratch requires way too much time and effort.

How Do The Recommendations Work?

When a user is looking at a product page on an online store, the goal of this recommendation agent is to identify the other products in the catalog that might be relevant.

The way Fuzzy.ai differs from other machine learning platforms is that we let developers encode their own knowledge about how a system should work as a set of rules. The Fuzzy.ai API uses those rules to provide recommendations. Over time, as feedback is sent to Fuzzy.ai on how well the rules performed, the API learns and improves automatically.

In our sample recommendation agent, we identified a few rules that might show affinity between the current product and each of the other products in the catalog:

If the current product and another product are in the same category, it’s likely to be a good recommendation

If the current product and another product are not in the same category, it’s less likely to be a good recommendation

If the current product and another product have very different prices, it’s less likely to be a good recommendation (i.e. it decreases affinity)

If many customers who bought the current product also bought another product, that’s likely to be a very good recommendation (i.e. it increases affinity)

If the current product and another product have many of the same words in their titles, it’s likely to be a good recommendation (i.e. it increases affinity)

If the current product and another product have many of the same words in their descriptions, it’s likely to be a good recommendation (i.e. it increases affinity)
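Before rules like these can be evaluated, each pair of products has to be turned into numbers. Here is a minimal Python sketch of that step. The `Product` fields, the input names and the way word overlap and price difference are measured are assumptions for illustration, not the inputs used by the sample agent.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    title: str
    description: str
    category: str
    price: float


def words(text: str) -> set:
    """Lowercased set of the words in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(a: str, b: str) -> float:
    """Shared words divided by all distinct words (0 = nothing in common, 1 = identical)."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def price_difference(a: float, b: float) -> float:
    """Relative price gap: 0 for equal prices, approaching 1 for very different ones."""
    higher = max(a, b)
    return 0.0 if higher == 0 else abs(a - b) / higher


def affinity_inputs(current: Product, other: Product, co_purchases: int) -> dict:
    """Inputs for one (current product, candidate product) pair."""
    return {
        "sameCategory": 1 if current.category == other.category else 0,
        "priceDifference": price_difference(current.price, other.price),
        "coPurchases": co_purchases,  # customers who bought both products
        "titleOverlap": word_overlap(current.title, other.title),
        "descriptionOverlap": word_overlap(current.description, other.description),
    }
```

You would compute these inputs once for every other product in the catalog, score each pair, and show the highest-scoring products as recommendations.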

Here’s what those rules look like when created within Fuzzy.ai:

Keep in mind that these were purposely chosen because they can work generically for many online stores. Specific stores may have other rules that make sense. For example, a clothing store may want to show products for the same season, and so might add rules like:

One of our favorite quick demos of Fuzzy.ai’s capabilities is to show how easy it is to take your own Twitter feed, score each of the Tweets, and surface the most relevant ones. It’s one of the first new agent templates we built for the platform and it’s a lot of fun to try out.

Getting Started

Getting your Tweet Relevance agent set up will take just a couple of minutes. Once you’ve signed up for a free Fuzzy.ai account, go to your Dashboard and take note of the API key shown in the top left-hand corner of the page. You’ll need it later.

Results

Once you’ve got things set up, your app will show you the tweets that are most relevant based on the rules we defined earlier. Each tweet will be scored like this:

What’s Next?

Now that you’ve got this simple app working, what else can you do? If you want to play around with the app and Fuzzy.ai, here are some other things you could try:

Add new rules that take into account your friends’ behavior: how many people you follow liked a tweet, how many people you follow shared a tweet.

Try combining different rules. For example, if you want to identify tweets that are liked by your friends but not by a lot of other people, that rule could be: IF number of likes by friends IS very high AND number of likes IS low THEN relevance IS very high.

Set up rules to increase relevance of tweets that include keywords you’re interested in and decrease relevance of tweets that include keywords you’re not interested in.
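To see how a combined rule like the friends-but-not-everyone example behaves, here is a small Python sketch. It uses the common fuzzy logic convention of treating AND as the minimum of the two memberships. The thresholds for "very high" and "low" are made-up assumptions, and this is not Fuzzy.ai's own evaluation code.

```python
def ramp_up(x: float, start: float, end: float) -> float:
    """0 at or below start, 1 at or above end, linear in between."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def ramp_down(x: float, start: float, end: float) -> float:
    """1 at or below start, 0 at or above end, linear in between."""
    return 1.0 - ramp_up(x, start, end)


def friends_only_relevance(likes_by_friends: int, total_likes: int) -> float:
    """IF likes by friends IS very high AND likes IS low THEN relevance IS very high."""
    # Assumed thresholds: 10 or more friend likes is fully "very high",
    # 20 or fewer total likes is fully "low".
    very_high_friend_likes = ramp_up(likes_by_friends, 2, 10)
    low_total_likes = ramp_down(total_likes, 20, 200)
    # Fuzzy AND: the rule fires only as strongly as its weakest condition.
    return min(very_high_friend_likes, low_total_likes)
```

A tweet with 10 friend likes and 20 likes overall fires the rule fully, while the same tweet with 200 likes overall does not fire it at all.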

On Saturday the Montreal tech community came together at Notman House for Montreal Baseball Hackday. The event was organized by Plank Design to celebrate Montreal’s baseball heritage and the love of the game. The fact that baseball junkies love data and stats makes baseball and hacking a natural pair.

I showed up at the event with a few ideas for using fuzzy logic in a project. My friend Aran Rasmussen suggested we form a team to take on one of the challenge projects: “Prove that the 1994 Expos were the best team in baseball.” We were joined by Reda Lofti who helped us out with HTML and CSS for the project. It was Reda’s first hackday ever, and he’d just learned HTML last month.

For Montreal baseball fans, 1994 is the championship year that never was. The season ended in August due to the crippling baseball strike that led to the first cancellation of the World Series in almost a century. The Montreal Expos, who’d shown strong results for the previous few years, led the league at the time of the shut-down. And many people think that they would have been world champions if the season hadn’t ended prematurely.

But a lot happens in baseball between August and October. Wins and losses mid-season don’t really count. If you’re trying to argue that the Expos were the best team in baseball in 1994, how do you do it?

Fuzzy scoring

We took on the challenge by reframing it this way: what combination of statistics and weights, when presented to a fuzzy logic agent, ranks the Expos as the number 1 team? While Aran scoured the Web for stats from 1994, I started putting together a fuzzy agent strategy that would work.

If you’re unfamiliar with fuzzy logic, here’s the short description: a fuzzy agent accepts a number of input variables and maps them onto fuzzy sets — intuitive terms from the problem domain. It then uses a set of fuzzy rules to reason about the input variables and produce output fuzzy memberships. The output fuzzy values are then defuzzified into a single crisp score.
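Here is a minimal Python sketch of those three steps (fuzzify, apply rules, defuzzify) for a single input. The set boundaries, the three rules and the weighted-average defuzzification are illustrative assumptions, not the agent we built at the hackday.

```python
def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Degree (0..1) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets for an input already scaled to the range 0..1.
INPUT_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Rules of the form "IF input IS <set> THEN score IS <value>", with each
# output set reduced to one representative value on the 0..10 scale.
RULES = {"low": 0.0, "medium": 5.0, "high": 10.0}


def crisp_score(x: float) -> float:
    # 1. Fuzzify: how strongly does x belong to each input set?
    memberships = {name: triangle(x, *abc) for name, abc in INPUT_SETS.items()}
    total = sum(memberships.values())
    if total == 0.0:
        return 0.0  # x is outside every set
    # 2. Apply the rules: each rule fires as strongly as its input set matches.
    # 3. Defuzzify: average the rule outputs, weighted by firing strength.
    return sum(memberships[name] * RULES[name] for name in RULES) / total
```

An input of 0.25 is half "low" and half "medium", so `crisp_score(0.25)` lands halfway between those two outputs, at 2.5.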

For our project, I decided to use as an output a score between 0 and 10, showing how “good” a team in 1994 was. We’ve found a few problem domains where this kind of unitless output is helpful.

For input values, Aran managed to come up with seven important stats that baseball aficionados use to compare teams and predict future performance:

Run differential

ERA

OPS

Speed score

Strikeout-to-walk percentage

WHIP

RA9-WAR

Some of these are familiar to any baseball fan; others are only relevant to the most hardcore SABR fanatic. But we wanted to pick numbers that were commonly used to say who’s the best team.

Aran boiled down the stats to a single table that we used for input to the fuzzy agent. I then broke down each input variable into 5 fuzzy sets: “veryLow”, “low”, “medium”, “high”, and “veryHigh”. Casual review showed each statistic varied linearly, so I just broke down the stats into 5 sets of equal size.
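One way to build five equal-sized sets from a stat's observed range is sketched below in Python. It places five evenly spaced peaks between the minimum and maximum and gives each set a triangular shape. That is one reading of "equal size" and an assumption on my part, not a record of the exact breakpoints we used.

```python
SET_NAMES = ["veryLow", "low", "medium", "high", "veryHigh"]


def equal_sets(minimum: float, maximum: float) -> dict:
    """Map each set name to (left, peak, right), with evenly spaced peaks."""
    step = (maximum - minimum) / (len(SET_NAMES) - 1)
    return {
        name: (minimum + (i - 1) * step, minimum + i * step, minimum + (i + 1) * step)
        for i, name in enumerate(SET_NAMES)
    }


def memberships(value: float, sets: dict) -> dict:
    """Triangular membership of value in every set (0 = not at all, 1 = fully)."""
    result = {}
    for name, (left, peak, right) in sets.items():
        half_width = peak - left
        result[name] = max(0.0, 1.0 - abs(value - peak) / half_width)
    return result
```

With `equal_sets(0, 4)`, a value of 2.5 is half "medium" and half "high", and belongs to none of the other sets.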

Some of the inputs, like ERA, varied inversely with our output score: a low ERA shows a better team, and a high ERA shows a worse team. But most of them varied proportionally; for example, a higher run differential shows a better team. So I mapped each of the inputs to an output using simple rules.
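The mapping from input sets to output sets can be sketched in a few lines of Python. The helper name is hypothetical; the set names follow the five sets described above.

```python
SET_NAMES = ["veryLow", "low", "medium", "high", "veryHigh"]


def rule_output(input_set: str, inverse: bool) -> str:
    """Output set for the rule 'IF <stat> IS input_set THEN score IS ...'."""
    position = SET_NAMES.index(input_set)
    if inverse:
        # Inverse stats such as ERA: a veryLow input means a veryHigh score.
        return SET_NAMES[len(SET_NAMES) - 1 - position]
    # Proportional stats such as run differential: same set in, same set out.
    return input_set
```

So a "veryLow" ERA maps to a "veryHigh" score, while a "high" run differential maps to a "high" score.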

Varying the weights

The point of our project, however, was to help a user pick their argument points to show that the ’94 Expos were the best. There are techniques to optimize the weights of your fuzzy rules to come out with expected scores based on training data. However, we wanted to give the user an interface to vary their own weights.

So we put together some radio buttons labelled: “Ignore”, “Low”, “Medium”, and “High”. (We changed them later, but you get the idea.) Each button represented a relative weight for rules based on that input: 0%, 25%, 50%, and 100%. When the user changes an input, we post the new weights to the back end; it then makes a new fuzzy agent with those rule weights, and scores each of the teams from the 1994 season — with corresponding data. It returns the values to the front-end, which then shows them in a sorted table.
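The re-scoring step can be sketched roughly like this in Python. It is a simplification, not the hackday code: it assumes each stat has already been reduced to a 0 to 1 "goodness" value (with inverse stats like ERA already flipped), and it uses a plain weighted average in place of a full fuzzy agent. The team names and numbers in the usage note are placeholders, not real 1994 data.

```python
# Relative rule weights for each radio button, as described above.
WEIGHTS = {"Ignore": 0.0, "Low": 0.25, "Medium": 0.5, "High": 1.0}


def team_score(goodness: dict, choices: dict) -> float:
    """Score a team from 0 to 10 using the user's chosen weight per stat."""
    total_weight = sum(WEIGHTS[choices[stat]] for stat in goodness)
    if total_weight == 0.0:
        return 0.0  # every stat is set to "Ignore"
    weighted = sum(WEIGHTS[choices[stat]] * value for stat, value in goodness.items())
    return 10.0 * weighted / total_weight


def rank_teams(teams: dict, choices: dict) -> list:
    """Return (team, score) pairs sorted from best to worst."""
    scored = [(name, team_score(stats, choices)) for name, stats in teams.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With placeholder teams `{"Team A": {"runDifferential": 0.75, "era": 0.25}, "Team B": {"runDifferential": 0.5, "era": 1.0}}`, setting run differential to "High" and ERA to "Ignore" puts Team A on top; swapping the two weights puts Team B on top. That flip is what lets a user build their own argument.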

I had a lot of fun doing this project. Aran and Reda were fun to work with, and the baseball hackday was a blast. (A bag of Cracker Jack in the 9th inning was what I needed to get through the day!) Baseball is a good example of a mix between hard data and user wisdom, which is an area where fuzzy logic shines. I’m looking forward to seeing what other ways we can apply fuzzy logic to baseball stats.