In data-driven jobs we tend to accumulate a lot of technical debt, especially as beginners, because we focus on the data rather than on our methods. We lose an enormous amount of time re-iterating through code, stuck in an endless clean-and-try loop. The question is: can we apply software-inspired methods to data analysis to establish better habits? Happily, the answer is yes.
People working in data-driven jobs are naturally focused on getting their data as clean as possible. We can't compromise on that goal, since clean data is the bottleneck that stands between most of us and the more advanced tasks on the roadmap, but we can change our methodology to reach it in record time. An established workflow can save us time. But how?
Traditionally, we do the data analysis work in Jupyter notebooks and end up with several versions of the cleaned data and untested code, which becomes a headache the moment we hit a never-seen-before case.
The answer to this problem is the main focus of this talk: design patterns and best practices learned from software engineering and applied to data analysis. We will go through a quick demo showing the before and after, and how to consolidate what we've learned into a maintainable workflow.
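As one illustration of the kind of practice the talk covers (the function names and cleaning rules here are made up for the example, not taken from the demo): instead of one-off notebook cells, cleaning steps can live in small, composable functions with a test pinning down their behaviour.

```python
# Illustrative sketch: cleaning logic as small, testable functions
# rather than ad-hoc notebook cells.

def strip_whitespace(record: dict) -> dict:
    """Trim surrounding whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def drop_incomplete(records: list, required: tuple) -> list:
    """Keep only records where every required field is present and non-empty."""
    return [r for r in records if all(r.get(f) not in (None, "") for f in required)]

def clean(records: list) -> list:
    """The whole pipeline is one re-runnable function, not scattered cells."""
    return drop_incomplete([strip_whitespace(r) for r in records],
                           required=("id", "name"))

# A small test catches the never-seen-before case early:
raw = [{"id": "1", "name": " Ada "}, {"id": "2", "name": ""}]
assert clean(raw) == [{"id": "1", "name": "Ada"}]
```

Because the pipeline is a plain function, it can be re-run on every new data version instead of being re-typed, and its tests fail loudly when an unexpected case appears.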
The talk is aimed mainly at fresh graduates and beginners: it is simply a sharing of experience, and an open discussion of new ideas.