PyData London 2015 | Presentation: Getting Meaning from Scientific Articles

Saturday 2:20 p.m.–3 p.m.

Getting Meaning from Scientific Articles

Éléonore Mayola

Audience level: Novice

Description

With a background in biomedical research, I tend to think about which tasks would have been useful to automate back then. The bibliography process is necessary, but researchers also tend to find it boring! Automating part of the process would make it less time-consuming. I am using Python machine learning libraries to determine whether a research article is worth reading!

Abstract

The bibliography process means every scientist regularly has to go through a lot of published articles alongside her/his own research. The aim is to:

- know what other researchers are doing: they might be ahead of you, or they might have proven that your project is a dead end.
- get some context to interpret your research results.

Using specialised search engines can be inefficient if you don't use the "right" keywords. Researchers also tend to find bibliography work boring, so it would be interesting to automate part of the process!

In my talk I'll answer the following question: can Python machine learning libraries (nltk, scikit-learn) be used to determine whether a research article is worth reading? I'll use the TF-IDF measure to identify frequent topics appearing in specific scientific articles, and train a classifier to distinguish between relevant and non-relevant articles depending on someone's interests.

