For data scientists, building on published research is a way to stay on the cutting edge without reinventing the wheel. In most cases, however, transferring research into practice is not trivial. My talk will provide you with a five-step workflow that has made this transfer easier for me.
As data scientists working in small and mid-sized companies, we frequently want to make use of research published by our colleagues in academia and at large companies. Avoiding reinventing the wheel and staying on the cutting edge sounds fantastic in general, yet the conditions under which research publications are written are usually quite different from our industry settings. Accordingly, as data scientists, we need to evaluate the relevance, quality, and reproducibility of each research paper for our specific work situation. So how do we access the gold mine that is published research in ML, stats, and related fields, and turn it into nuggets, and even refined gold jewelry, useful for our industry work and our customers? In my talk I will go over a five-step workflow I use for evaluating academic research papers and prototyping solutions based on them: (1) choosing your baseline and metrics, (2) identifying the right research findings for your problem, (3) evaluating their quality, relevance, and reproducibility, (4) deciding which findings to test first, and (5) keeping the right things in mind when prototyping them in your own work. Along the way I will share a checklist for identifying papers worth prototyping, a decision matrix for prioritizing which findings to prototype, and practical advice for finding the right papers and prototyping the ones you choose.
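To make the decision-matrix idea concrete, here is a minimal sketch of how such a prioritization could be scored in code. The criteria names, weights, and example papers are illustrative assumptions for this sketch only, not a fixed part of the workflow presented in the talk.

```python
# A minimal sketch of a weighted decision matrix for prioritizing which
# research findings to prototype first. Criteria, weights, and scores
# below are illustrative assumptions; adapt them to your own situation.

from dataclasses import dataclass, field


@dataclass
class Finding:
    """A candidate research finding, scored 1-5 on each criterion."""
    name: str
    scores: dict[str, int] = field(default_factory=dict)


# Hypothetical criteria and weights (must sum to 1.0 for easy comparison).
WEIGHTS = {
    "relevance_to_problem": 0.35,
    "expected_metric_gain": 0.25,
    "reproducibility": 0.25,
    "implementation_effort": 0.15,  # higher score = less effort required
}


def priority(finding: Finding) -> float:
    """Weighted sum of criterion scores; higher means prototype sooner."""
    return sum(WEIGHTS[c] * finding.scores.get(c, 0) for c in WEIGHTS)


candidates = [
    Finding("Paper A: new loss function", {
        "relevance_to_problem": 5, "expected_metric_gain": 3,
        "reproducibility": 4, "implementation_effort": 4,
    }),
    Finding("Paper B: novel architecture", {
        "relevance_to_problem": 4, "expected_metric_gain": 5,
        "reproducibility": 2, "implementation_effort": 2,
    }),
]

# Rank candidates from highest to lowest priority score.
for f in sorted(candidates, key=priority, reverse=True):
    print(f"{f.name}: priority score {priority(f):.2f}")
```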