This talk has two main goals. The first is to present the CV, and a job seeker's career trajectory more generally, as a highly interesting and fertile application domain for data science. The second is to highlight a number of practical lessons and tips gleaned through years of hands-on experience in this area.
Despite the proliferation of professional networks such as LinkedIn, and the widespread use of online forms in which job seekers can fill in their details when applying for a job, the CV remains relevant today. From online job boards to recruitment firms, and from personal websites to companies that remain old-fashioned in their hiring, the CV finds its way into every facet of job hunting. For the HR professional, however, the CV is cumbersome to work with, and it has consumed countless hours of work on mundane tasks. For example, candidate details from CVs are often manually copied into forms to create a structured profile in a database.
From a data scientist's perspective, the CV offers a wealth of interesting applications and opportunities for investigation. In this talk, the author will present lessons learned from more than three years of working in this domain, as well as a range of applications that arise from it.
Concretely, the talk will cover:
Though each of these topics could be an entire talk on its own, the objective here is to present the main idea of each and the role it plays within the broader context of CV parsing and interpreting job seeker data. The talk will also feature code snippets in Python and Apache Spark to give a practical foundation for some of the concepts discussed.