Notebooks are great: they let you explore your data and prototype models quickly. But they make it hard to follow good software practices such as versioning, testing, and writing clean, modular, reusable code. In this tutorial, we will go through a case study of a full model developed in a notebook. We will see how to refactor that code into a testable, maintainable Python package with entry points to tune, train and test our model, so it can easily be integrated into a CI/CD flow.
To do so, we will leverage tools available in sklearn, such as ColumnTransformer, custom transformers and pipelines.
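To give a feel for how these three pieces fit together, here is a minimal sketch rather than the model from the case study. The column names (`income`, `debt`, `city`), the `RatioFeature` transformer, the `build_pipeline` helper and the toy data are all hypothetical and made up for illustration; only the sklearn classes themselves are real.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class RatioFeature(BaseEstimator, TransformerMixin):
    """Hypothetical custom transformer: adds numerator / denominator as a 'ratio' column."""

    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def fit(self, X, y=None):
        # Nothing to learn from the data; fit only exists to satisfy the sklearn API.
        return self

    def transform(self, X):
        X = X.copy()  # leave the caller's DataFrame untouched
        X["ratio"] = X[self.numerator] / X[self.denominator]
        return X


def build_pipeline():
    """Assemble feature engineering, preprocessing and model into one estimator."""
    preprocess = ColumnTransformer(
        transformers=[
            # Scale the numeric columns, including the engineered one.
            ("num", StandardScaler(), ["income", "debt", "ratio"]),
            # One-hot encode the categorical column; ignore categories unseen at fit time.
            ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
        ]
    )
    return Pipeline(
        steps=[
            ("ratio", RatioFeature("debt", "income")),
            ("preprocess", preprocess),
            ("model", LogisticRegression()),
        ]
    )


# Made-up toy data, only here so the sketch runs end to end.
toy = pd.DataFrame(
    {
        "income": [30.0, 52.0, 41.0, 75.0, 28.0, 64.0],
        "debt": [12.0, 5.0, 20.0, 7.0, 15.0, 3.0],
        "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon", "Nice"],
    }
)
labels = [1, 0, 1, 0, 1, 0]

pipeline = build_pipeline()
pipeline.fit(toy, labels)
predictions = pipeline.predict(toy)
```

Because every step lives inside a single `Pipeline` object returned by a plain function, the same object can be imported by separate tune, train and test entry points, and each transformer can be unit-tested on its own.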