It has never been easier for developers to create simple-yet-powerful data-driven or data-informed tools. Through case studies, we'll explore a few projects that use a number of open source libraries or modules in concert. Next, we'll cover strategies for learning these new tools. Finally, we wrap up with pitfalls to keep in mind when gluing powerful things together quickly.
We'll open with an introduction and a quick overview of each section of the talk.
Part 1 - Case Studies: two brief case studies covering tools or projects created by composing several open source Python libraries
Part 2 - Strategies for Becoming Familiar with New Tools: covering the use of pip, the IPython shell, and a few popular debuggers
Part 3 - Potential Pitfalls: Common conceptual issues that arise when creating powerful tools quickly with "glue code"
Part 4 - Closing, Q&A: Giving a list of resources and links, and hopefully learning about some new tools from the audience as well
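To give a flavor of the exploration strategies in Part 2, here is a minimal, stand-alone sketch of the kind of interactive poking-around one might do in an IPython session, using only the standard library's `inspect` module and `json` as a stand-in for whatever unfamiliar package was just installed with pip:

```python
import inspect
import json  # stand-in for any newly installed, unfamiliar module

# First pass at a new module: list its public callables before
# reaching for the full documentation.
public = [name for name, obj in inspect.getmembers(json, callable)
          if not name.startswith("_")]
print(public)

# Check a function's signature without leaving the shell
# (IPython's `json.dumps?` gives the same information and more).
print(inspect.signature(json.dumps))
```

In IPython the same ground is usually covered with tab completion and the `?`/`??` operators; the `inspect` calls above are just the portable equivalent.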
For the sake of time we'll move fairly quickly through each case, focusing on how libraries used in each project come together at a high level with a few flow charts, diagrams and key "glue code" listings.
The first case study will cover a tool the author built that takes a natural language search term and suggests unit tests to run from a large suite of test cases.
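As a rough illustration of the idea behind that first case study (not the author's actual implementation, which the talk covers), a deliberately naive version can be built on simple token overlap between the query and snake_case test names:

```python
def suggest_tests(query, test_names, top_n=3):
    """Rank test names by token overlap with a natural language query.

    A toy scoring scheme for illustration only; a real tool would
    likely use proper NLP (stemming, embeddings, etc.).
    """
    query_tokens = set(query.lower().split())

    def score(name):
        # Split snake_case test names into word tokens.
        tokens = set(name.lower().replace("test_", "").split("_"))
        return len(query_tokens & tokens)

    ranked = sorted(test_names, key=score, reverse=True)
    return [name for name in ranked[:top_n] if score(name) > 0]

suite = ["test_login_timeout", "test_password_reset",
         "test_cart_checkout", "test_login_redirect"]
print(suggest_tests("login fails with timeout", suite))
# → ['test_login_timeout', 'test_login_redirect']
```

The point of the case study is that even this much, wrapped around an existing test runner and a real NLP library, becomes a useful tool with very little glue code.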
Next, we'll review the tools used (and how they came together) in a popular 7-part tutorial on mining, processing and visualizing Twitter data.
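To hint at the processing stage of such a pipeline (collection from the Twitter API, typically done with a library like tweepy, is skipped here), a self-contained sketch might count hashtags across a handful of hard-coded sample tweets:

```python
import re
from collections import Counter

# Hard-coded sample texts standing in for tweets fetched from the API.
tweets = [
    "Loving #python at #pycon this year!",
    "Glue code talk was great #python #opensource",
    "Data mining with #python is fun",
]

# Extract hashtags with a simple regex and tally them.
hashtags = Counter(
    tag.lower() for text in tweets for tag in re.findall(r"#(\w+)", text)
)
print(hashtags.most_common(1))  # → [('python', 3)]
```

Counts like these feed directly into the visualization steps of the tutorial (e.g. bar charts of top terms).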
Following the case studies, we'll segue into a brief overview of strategies the author uses to explore new tools and how they might fit together.
We'll round out the talk with some points on the potential pitfalls of quickly gluing things together.
Beyond the technical concerns, with such powerful tools close at hand it can be easy to imagine we understand a domain or problem better than we actually do.
What is a good strategy for discovering unknown unknowns and turning them into known unknowns when tackling a new problem? How can we close those remaining gaps and ensure we're using a tool correctly?
Lastly, how do we decide when to stop tumbling down the rabbit hole? Which are the gaps that don't necessarily need to be closed?
A comprehensive list of categorized links will be given. The talk will close with a Q&A where (hopefully) audience members will contribute a few of their favorite tools not covered by the talk.
Both author-provided and audience-provided links will be compiled into a resource shared via a blog post (similar to this post on resources for Machine Learning self-study, compiled from a talk I gave at a BarCamp in Chiang Mai, Thailand).