Thursday October 28 10:00 AM – Thursday October 28 10:30 AM in Talks II

Keeping sensitive data safe using recommendation systems

Liron Faybish (Ben-Kimon)

Prior knowledge:
No previous knowledge expected


What if I told you everyday recommendation systems can be utilized to detect unwanted behaviors? Now, what if I told you that they can also be harnessed to prevent internal security violations in organizations? Well, it’s happening. Kind of neat, right?

In this talk I’ll present the process and the results of using recommendation systems to detect unauthorized access to sensitive data.


In our everyday lives, whenever we hear about recommendation systems, it’s usually in the context of consumerism: whether via e-commerce, social media or other advertising media, recommendation systems enable us to consume different types of personalized content. In my session, I’ll invite you to explore that exact context through a different lens, by sharing how recommendation-system methods can be utilized to detect unwanted behaviors, such as security anomalies within organizations.

In many cases, organizations store their data in a relational database. Naturally, teams and groups that share an area of responsibility in the organization access the same groups of tables, since they’re usually working on the same types of projects.

As some of this data may be sensitive, there’s a clear need to ensure that only authorized users have access to it, in order to prevent potential security incidents and any possible misuse of the data. Therefore, it’s crucial for organizations to be able to detect any fundamentally different behavior in their users’ routine data-access patterns.

My proposed solution is based on using recommendation-system methods to detect anomalies in users’ access patterns to tables. The recommended products are the data tables, and the users are 'consuming' the tables. However, in this case we take a counter-intuitive approach and look for products (= tables) that the user dislikes (= shouldn’t access).
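
As a rough illustration of this framing, an access log can be arranged as a user-by-table matrix of access counts, much as purchases would be arranged in a user-by-product matrix. The log format, the user names and the table names below are hypothetical; the abstract does not specify how the access data is actually collected or stored.

```python
from collections import Counter

# Hypothetical access log: one (user, table) pair per query. All names are made up.
access_log = [
    ("alice", "orders"), ("alice", "orders"), ("alice", "customers"),
    ("bob", "orders"), ("bob", "customers"),
    ("carol", "salaries"), ("carol", "salaries"), ("carol", "payroll"),
]

def build_count_matrix(log):
    """Return (users, tables, matrix); matrix[i][j] counts accesses of users[i] to tables[j]."""
    counts = Counter(log)
    users = sorted({user for user, _ in log})
    tables = sorted({table for _, table in log})
    # Counter returns 0 for pairs that never occurred, which fills the empty cells.
    matrix = [[counts[(user, table)] for table in tables] for user in users]
    return users, tables, matrix

users, tables, matrix = build_count_matrix(access_log)
```

Each row then plays the role of one user's 'consumption' history, and the zero cells are the candidates for tables that the user is not expected to touch.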

Since we’re counting the number of accesses each user performs to each table, the access patterns are a form of implicit feedback. To learn the usage patterns, we use the Alternating Least Squares (ALS) algorithm for implicit collaborative filtering. This way, the model can estimate the likelihood of each access by a user to a specific data table and detect whether it’s an anomaly.
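
To make the idea concrete, here is a minimal sketch of ALS for implicit feedback, assuming the common confidence-weighted formulation (preference is 1 if a table was ever accessed, and confidence grows with the access count). The function name `implicit_als`, the hyperparameters and the toy history are assumptions for illustration only; the abstract does not say which implementation or settings the talk actually uses.

```python
import numpy as np

def implicit_als(counts, factors=2, alpha=1.0, reg=0.1, iterations=20, seed=0):
    """Fit user and table factors with confidence-weighted ALS on implicit counts."""
    counts = np.asarray(counts, dtype=float)
    n_users, n_tables = counts.shape
    preference = (counts > 0).astype(float)  # 1 if the user ever accessed the table
    confidence = 1.0 + alpha * counts        # more accesses -> more trust in that 1
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=0.1, size=(n_users, factors))
    Y = rng.normal(scale=0.1, size=(n_tables, factors))
    ridge = reg * np.eye(factors)

    def solve_side(fixed, conf, pref):
        # One regularized least-squares solve per row:
        # (F^T C F + reg * I) x = F^T C p, with C the diagonal confidence matrix.
        out = np.empty((conf.shape[0], factors))
        for i in range(conf.shape[0]):
            weighted = fixed * conf[i][:, None]  # C F, each row scaled by its confidence
            out[i] = np.linalg.solve(fixed.T @ weighted + ridge, weighted.T @ pref[i])
        return out

    for _ in range(iterations):
        X = solve_side(Y, confidence, preference)      # update users, tables fixed
        Y = solve_side(X, confidence.T, preference.T)  # update tables, users fixed
    return X, Y

# Hypothetical history: rows are users, columns are tables; two teams with disjoint tables.
history = [
    [5, 3, 0, 0],
    [4, 6, 0, 0],
    [2, 5, 0, 0],
    [0, 0, 7, 2],
    [0, 0, 3, 4],
    [0, 0, 6, 5],
]
X, Y = implicit_als(history)
scores = X @ Y.T  # predicted preference for every (user, table) pair

usual = scores[0, 1]    # user 0 reading a table their own team uses
unusual = scores[0, 3]  # user 0 reading a table only the other team uses
```

A new access whose predicted score is low, such as `unusual` here, is the kind of event that would be flagged for review; where to set the cut-off is a separate tuning decision that this sketch does not address.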

Now, you’re probably asking yourself, ‘How can such a concept be implemented as a long-lasting solution to prevent data abuse within MY organization?’ I guess you’ll have to join my talk to find out for yourself.

Notes: 1. The idea is to use recommendation systems for anomaly detection. 2. The AUC we were able to achieve was higher than 90%.