As Data Scientists, we often assume we have data! It’s crazy not to. What should you recommend to a new user when you know nothing about them? In this talk we will discuss the challenges we faced, the assumptions we made, and the solutions we came up with while building a recommendation system for an interest-based social network with limited data.
Introduction
The problem we were trying to solve:
- 6tribes: interest-based social networking
- Matching users to other users
- Matching users to user groupings
Making recommendations is easy.
- People who did X also did Y
- We didn’t have any people!
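To make the "people who did X also did Y" idea concrete, here is a minimal co-occurrence sketch. It is an illustration only, not the 6tribes implementation: the `histories` data, the tribe names, and the function names `build_cooccurrence` and `also_did` are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(histories):
    """Count, for each item X, how often every other item Y appears alongside it."""
    counts = defaultdict(Counter)
    for items in histories.values():
        # Each ordered pair (x, y) of distinct items in one user's history
        # counts as one co-occurrence of y with x.
        for x, y in permutations(set(items), 2):
            counts[x][y] += 1
    return counts


def also_did(counts, item, k=3):
    """Return up to k items most often seen together with `item`."""
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts[item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical interaction data: user -> things they did (e.g. tribes joined).
histories = {
    "alice": ["climbing", "jazz", "coffee"],
    "bob": ["climbing", "jazz"],
    "carol": ["jazz", "coffee"],
}
counts = build_cooccurrence(histories)
print(also_did(counts, "climbing"))

# With no interaction history there is nothing to count, which is the
# cold-start problem in its simplest form.
print(also_did(build_cooccurrence({}), "climbing"))
```

The approach is easy precisely because it only counts pairs; it returns an empty list the moment there are no users to count.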
Making good recommendations is hard.
- User expectations are high!
- Getting lost in data exploration
- Ensuring fresh recommendations
- Scaling the recommendation engine
Making recommendations without data is impossible. Or is it?

Engineering
- Integrating and delivering continuously with CircleCI and OpBeat
- Using data from Facebook and the iPhone music and photo libraries
- Enriching our data using external APIs such as HereAPI, FourSquare, Prismatic, Alchemy and iTunes
- Using Flask to design a REST API: integrating the data science code with the Scala backend and speeding up development turnaround
- Using Elasticsearch to deliver recommendations at scale
- Using AWS for our data pipeline
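One common way to recommend without interaction history is to match on content: compare tags derived from a new user's enriched profile with tags describing each group. The sketch below uses Jaccard similarity over tag sets. This is an assumption for illustration; the abstract does not say which matching technique was used, and the tags, tribe names, and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_tribes(user_tags, tribe_tags, k=2):
    """Rank tribes by tag overlap with the user, dropping tribes with no overlap."""
    scored = [(jaccard(user_tags, tags), name) for name, tags in tribe_tags.items()]
    # Highest score first; break ties by name so the order is deterministic.
    ranked = sorted((pair for pair in scored if pair[0] > 0),
                    key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked[:k]]


# Hypothetical tags, e.g. derived from a music library or Facebook likes.
user_tags = {"jazz", "vinyl", "espresso"}

# Hypothetical tribes and the tags that describe them.
tribe_tags = {
    "Jazz Heads": {"jazz", "vinyl", "blues"},
    "Coffee Geeks": {"espresso", "coffee"},
    "Trail Runners": {"running", "outdoors"},
}

print(recommend_tribes(user_tags, tribe_tags))
```

Because the score depends only on the new user's own tags, it works for a first-time user, provided the enrichment step yields at least a few tags.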
User feedback
- Internal testing of recommendations: being careful not to overfit (Anthony’s complaints and use of music)
- Ensuring we act on feedback from our test users
Issues we had
- Tech debt and lifecycle
- Facebook sign-in: aged and outdated data that is not really representative of the user (show example and restrictions of data)
- No users!
- Cost of getting data: API usage and the specific data sets needed, with limited funding
Key Takeaways
- How to enrich data
- Building fast
- Getting feedback
- Ensuring scalability
- Understanding the complexities of recommendation systems