
Speaker Bios

Alex F. Bokov

University of Texas Health Science Center at San Antonio
Using Python and Paver to Control a Large Medical Informatics ETL Process
Ph.D. in Physiology from UT Health Science Center at San Antonio. Currently scientific lead at the Clinical Research Informatics Division at the Department of Epidemiology and Biostatistics at the UT Health Science Center at San Antonio, working on adapting and extending the Greater Plains Collaborative's ETL code as part of a PCORI grant. Over ten years of experience in data management and bioinformatics.

Andrew Montalenti

CTO, Parse.ly
Rapid Data Visualization, from Python to Browser, Real-time streams and logs with Storm and Kafka
Andrew is the co-founder and CTO of Parse.ly, a Python-built tech startup that helps top online publishers understand what content their audience is interested in -- and why. Prior to starting Parse.ly, Andrew was a technologist with nearly a decade of experience in finance, high tech, and online media. He earned a degree in Computer Science from NYU. A dedicated Pythonista, JavaScript hacker, and open source advocate, Andrew is also a technical author and speaker. He has presented at PyData, PyCon, and several other technology conferences.

Bartek Wilczynski

University of Warsaw
Using Python to Find a Bayesian Network Describing Your Data
He started using Python at the time of the transition from 1.6 to 2.0. In 2003 he became involved with Biopython during a scholarship at Lawrence Livermore National Lab working on DNA sequence analysis. He has been working on high-throughput data analysis ever since, primarily with microarray and DNA sequencing data. Since 2008 he has been involved in the development of BNfinder, a Python package for finding the structure of Bayesian networks for machine learning purposes. He holds a PhD in mathematics and is an assistant professor of computer science at the University of Warsaw.

Brian Granger

Cal Poly State University, IPython
Functional Performance with Core Data Structures, IPython Interactive Widgets
Brian Granger is an Assistant Professor of Physics at Cal Poly State University in San Luis Obispo, CA. He has a background in theoretical atomic, molecular and optical physics, with a Ph.D. from the University of Colorado. His current research interests include quantum computing, parallel and distributed computing, and interactive computing environments for scientific and technical computing. He is a core developer of the IPython project, the creator of PyZMQ and a contributor to SymPy. Contact him at ellisonbg@gmail.com or @ellisonbg (Twitter, GitHub).

Bryan Van De Ven

Continuum Analytics
Beautiful Interactive Visualizations in the Browser with Bokeh, Bokeh, Bokeh, Bokeh - Interactive Visualization for Large Datasets, Bokeh Tutorial, Interactive Plots Using Bokeh, The IPython protocol, frontends and kernels
Mr. Van de Ven received undergraduate degrees in Computer Science and Mathematics from UT Austin, and a Master's degree in physics from UCLA. He has worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms. He also spent time at Enthought, where he worked on problems in financial risk modeling and fluid mixing simulation, and also contributed to the Chaco visualization library. He has also worked on an assortment of iOS projects as an independent consultant.

Bugra Akyildiz

Axial
A Machine Learning Pipeline with Scikit-Learn, A Thorough Machine Learning Pipeline via Scikit-Learn, Outlier Detection in Time Series Signals
I am a Data Scientist at Axial, where I work on information retrieval and recommender systems. I received a B.S. from Bilkent University and an M.Sc. from New York University, focusing on signal processing and machine learning. I also consult on NLP and machine learning on a project basis: document classification, topic modeling, and time-series signal analysis.

Burc Arpat

Facebook
Why Python is Awesome When Working With Data at any Scale
Burc is a Sr. Quantitative Engineering Manager at Facebook. Prior to Facebook, Burc led a quantitative engineering team at Google for 3 years, working on time series forecasting and capacity planning. He holds an MBA from UC Berkeley Haas (but don't hold that against him) and a PhD from Stanford University. Burc flirted with C++ for years but eventually realized Python is his one true love. Together, they are happily implementing machine learning and natural language processing algorithms at Facebook. He and C++ remain good friends.

Chris Beaumont

Harvard Center for Astrophysics
Data Analysis with SciDB-Py
Chris Beaumont is a software engineer at Harvard’s Center for Astrophysics. He focuses much of his effort on building interactive tools to visualize and explore multidimensional datasets. He is the lead developer of Glue (glueviz.org), and contributes to SciDB-Py. He holds a PhD in astrophysics from the University of Hawaii's Institute for Astronomy.

Chris Laffra

Google
PyAlgoViz: Python Algorithm Visualization in the Browser
I worked on numerous programming languages, development tools, systems, and projects, such as Java, Eclipse, and Python. I like to take things apart and fix them, with disregard for whether they are washing machines or software frameworks. For programming languages this means I ended up writing numerous profiling and visualization tools for Procol, Smalltalk, JavaScript, C++, Java, EGL, and Python. The latest in the series is PyAlgoViz and I like to discuss why I built the site, how it works, and what I learned doing it. I am a software engineer at Google.

Christopher Roach

LinkedIn
Map Reduce: 0-60 in 80 Minutes
Christopher Roach has been everything from an embedded software engineer working on missile defense, to an iOS developer at Apple, to the first engineer at an early-stage Y Combinator startup, and he's currently putting his Python skills to work at LinkedIn. Along the way, he's continued to nurture his academic interests with published papers in the areas of swarm intelligence and social network analysis. He holds degrees in Finance and Economics and a Master's degree in Computer Science, and has had several Python-related articles published in sources such as MacTech magazine and the O'Reilly network.

Dan Connolly

University of Kansas Medical Center
Using Python and Paver to Control a Large Medical Informatics ETL Process
Dan Connolly leads software development for Medical Informatics at the University of Kansas Medical Center. Development of HERON, a clinical data repository with over 1 billion facts, played a key role in a 2010 Clinical and Translational Science Award (CTSA) and an award from the Patient Centered Outcomes Research Institute (PCORI) in 2014, as well as supporting National Cancer Designation in 2012. Previously, he led work on HTML, Semantic Web, and Web Architecture at the World Wide Web Consortium and participated in Semantic Web policy research at MIT. With Tim Berners-Lee, he developed a reasoning engine in Python as well as a number of applications with an emphasis on microformats and integrating calendars, maps, and social structures. He holds a Computer Science degree from the University of Texas at Austin, made his first open source release in 1992, and has been active in the Python community since the second Python workshop in 1995. His research interests include formal methods for software quality and capability security for the Web.

Daniel Moisset

Machinalis
Querying your Database in Natural Language
I've been programming Python for 14 years and started playing with machine learning problems 1 year ago. I'm a cofounder of Machinalis, a company with a strong focus on machine learning and data processing. I'm also a teacher at the CS department at the National University of Córdoba, Argentina.

Daniel Yao

Yelp
Ad Targeting at Yelp
Coming Soon

Eytan Bakshy

Facebook
Designing and Deploying Online Experiments with PlanOut
Eytan Bakshy is a researcher and senior member of the Facebook Data Science Team. He has been conducting field experiments at Facebook for over three years, focusing on diffusion of information and behaviors in networks. He also develops tools and methods for online experimentation, including PlanOut, an open-source toolkit for designing field experiments. Eytan holds a Ph.D. in information from the University of Michigan and a B.S. in mathematics from the University of Illinois.

Freedom Dumlao

ZEFR, Inc.
Building an Army of Data Collecting Robots in Python
Freedom Dumlao is a Lead Technologist for ZEFR based out of Boston, MA. He gets a kick out of building cool things that people like using. He is the creator of Flask-Classy, a popular extension to the super popular Flask web framework.

Greg Lamp

Yhat
ggplot for python
Greg is a co-founder of Yhat, a platform for efficiently running predictive models in production software. He is a former Product Manager at On Deck Capital, where he built systems for analyzing and making loans to small businesses in real-time. He primarily works with Python and R and has written open source packages for both languages. Greg is a graduate of the University of Virginia.

Guy Bayes

Facebook
Welcome to PyData at Facebook!
Guy Bayes leads Facebook’s Enterprise Business Intelligence team. He is a 20-year veteran in analytics and data warehousing with a specialty in integrating the emerging Big Data tools and concepts, such as Hadoop, with traditional BI and statistical and advanced analytics. And yes... he is a remote descendant of THAT Bayes.

Jacob Barhak

The Reference Model for Disease Progression uses MIST to find data fitness
Jacob Barhak has a diverse international background in engineering, computing science, and disease modeling. He was educated at the Technion Israel Institute of Technology. He acquired experience in manufacturing and healthcare modeling at the University of Michigan. The Reference Model for disease progression was self-developed in 2012, and he currently pursues this development effort as a freelancer. URL: http://sites.google.com/site/jacobbarhak/

James Horey

OpenCore.io
Ferry - Share and Deploy Big Data Applications with Docker
I'm the founder of OpenCore.io and the author of Ferry, an open-source Python-based tool to help developers share and run big data applications on their local machines. My mission is to help developers create and deploy big data applications.

James Powell

NYC Python
Generators Will Free Your Mind, Integration With the Vernacular (the NumPy Approach), Panel Discussion: "Shouldn't more companies be using data science?", Title Coming Soon
James Powell is an NYC-based Python programmer with experience in quantitative finance and data science. He's also very active in the Python community, where he organizes NYC Python, the world's largest and most active Python meetup group. He also works with the numeric and scientific computing non-profit NumFOCUS to help organize the PyData conference series. In addition, he's a frequent speaker at Python conferences, and has been invited to speak at events such as PyData New York, PyData London, PyGotham, the conference ‘For Python Quants,’ and PyCon Spain.

Jason Sundram

Facebook
A Full Stack Approach to Data Visualization: Terabytes (and Beyond) at Facebook
Jason is a Quantitative Engineer at Facebook, where he creates visualization applications to yield insights from petabytes of data. Before that, he was a Senior Data Scientist at PayPal, where he analyzed and visualized geo data. He is an avid violinist and chamber musician and cofounder of The Haydn Enthusiasts, a Bay Area collective that is performing the complete String Quartets of Joseph Haydn in monthly installments over a 3-year period. His current extracurricular interests include exploring integrations of his passion for classical music with data visualization.

Jiwon Seo

Stanford
SociaLite: Python-integrated query language for big data analysis
Jiwon Seo is a PhD student at Stanford University, studying parallel query languages for data analysis. He has contributed to Python in various ways; for example, he implemented PEP 289 (Generator Expressions) and PEP 3102 (Keyword-Only Arguments). He designed and implemented a parallel and distributed query language called SociaLite. The SociaLite compiler generates efficient parallel code from a user query to run on distributed machines. SociaLite queries can be embedded in Python programs, allowing users to enjoy the flexibility of Python and the efficiency of SociaLite. More information is available at http://socialite.stanford.edu

Jonathan Dinu

Zipfian Academy
Data Engineering 101: Building your First Data Product, On Building a Data Science Curriculum
Jonathan is redefining data science education as the co-founder and CTO of Zipfian Academy. He first discovered his love of all things data while studying Computer Science and Physics at UC Berkeley. In a former life, he worked for Alpine Data Labs developing distributed machine learning algorithms for predictive analytics on Hadoop. Jonathan has always had a passion for sharing the things he has learned in the most creative ways he can. He has been a mentor at Dev Bootcamp, taught classes at General Assembly, and was an instructor at Hack Reactor. At Zipfian Academy, he gets to combine his two favorite things: humans and code.

Jonathan Frederic

IPython
IPython Interactive Widgets
Jonathan Frederic is currently working as a full time IPython Developer. He works on the IPython Notebook and related tools. Jonathan enjoys computer graphics and networking and he spends his spare time writing a video game in Python. He has a B.S. in Physics from the California Polytechnic State University, San Luis Obispo. He can be contacted either by email at jonathan@bitplexity.com or on GitHub at github.com/jdfreder.

Joshua Bloom

UC Berkeley
Data Science at Berkeley
Dr. Joshua Bloom is an astronomy professor at the University of California, Berkeley, where he teaches high-energy astrophysics and Python for data scientists. He has published over 250 refereed articles, largely on time-domain transient events and telescope/insight automation. His book on gamma-ray bursts, a technical introduction for physical scientists, was published recently by Princeton University Press. He is also co-founder and CTO of wise.io, a startup based in Berkeley. Josh has been awarded the Pierce Prize from the American Astronomical Society; he is also a former Sloan Fellow, Junior Fellow at the Harvard Society, and Hertz Foundation Fellow. He holds a PhD from Caltech and degrees from Harvard and Cambridge University.

Keith Bourgoin

Backend Lead, Parse.ly
Real-time streams and logs with Storm and Kafka
Keith is the backend lead at Parse.ly, a Python-built tech startup that helps top online publishers understand what content their audience is interested in -- and why. Keith led the company's backend architecture which now processes billions of page views per month from over 200 million monthly unique visitors. He is the current maintainer of a Python client library for Apache Kafka, which is called samsa. Through his work at Parse.ly, he serves as the in-house expert for several important data aggregation and messaging technologies, such as Kafka, Storm, Redis, & MongoDB. Keith studied Computer Science at Case Western Reserve University, where he got his Masters with a focus in machine learning and information retrieval.

Lynn Root

Spotify, PyLadies
How to Spy with Python
Lynn Root is a software engineer at Spotify working on backend for partnership integrations. She is the founder and leader of PyLadies San Francisco, an international mentorship group for women and friends in the Python community. With PyLadies, Lynn regularly hosts local workshops, sprints, and events to promote Python to all experience levels of programmers. Lynn is also a board member of the Python Software Foundation and a member of the Django Software Foundation.

Matthew Rocklin

Continuum Analytics
Functional Performance with Core Data Structures, Old School - Functional Data Analysis, Python in Business Intelligence: What's Missing?
Matthew Rocklin likes numerics, mathematics, and programming paradigms. He contributes to a variety of open source projects and endeavors to demonstrate the value of abstract solutions to concrete problems. He is a graduate of UC Berkeley (Physics, Math) and of the University of Chicago (PhD in CS). http://matthewrocklin.com/

Mehdi Amini

Pythran: Static Compiler for High Performance
Mehdi is a software engineer, passionate about high performance and mobile environments. He is dedicated to designing and producing neat C++ software and to working with agile teams to build high-quality products. While Python is not usually his first choice, he is involved in his free time with the open source project Pythran: a static compiler from Python to C++11. Currently offering to work as a consultant in Silicon Valley, he was previously a project manager at Silkan. He holds a PhD from MINES ParisTech (on compiler transformations for automatic parallelization for GPUs), and two Master's degrees from Université de Strasbourg.

Michael Manapat

Stripe
Python as Part of a Production Machine Learning Stack
Michael Manapat leads Stripe's machine learning team, which is responsible for both the data science and the production infrastructure behind the company's machine learning products. He was previously a Software Engineer at Google, a Postdoctoral Fellow in Applied Mathematics at Harvard, and a Ph.D. student in Mathematics at MIT.

Mike Starr

Facebook
Dataswarm
Mike Starr is a software engineer in Facebook’s Data Infrastructure organization. During the previous 1.5 years, Mike has worked on Facebook's distributed scheduler, “Chronos”, and ETL solution, “Dataswarm.” Prior to Facebook, Mike came from Wisconsin (land of beer, brats, and cheese), where he earned his B.S. in Computer Science and Computer Engineering at UW-Madison.

Min Ragan-Kelley

IPython, UC Berkeley
IPython: what's new, what's cool, and what's coming
Min finished his PhD at UC Berkeley in computational plasma physics in May, 2013. He has been a contributor to IPython since 2006, when the first implementation of IPython's parallel computing capabilities was his undergraduate thesis at Santa Clara University. He now works full time on IPython at UC Berkeley, funded by the Alfred P. Sloan Foundation. He is also the maintainer of pyzmq, the Python bindings of the ZeroMQ messaging library.

Paul Ivanov

The IPython protocol, frontends and kernels
Paul is a core developer of IPython and matplotlib currently working at UC Berkeley's Brain Imaging Center. He is also an instructor for Software Carpentry as well as UC Berkeley Python bootcamps. Previously, he was a graduate student (PhD ABD) in the Vision Science program at UC Berkeley's Redwood Center for Theoretical Neuroscience, and earned a degree in Computer Science from UC Davis. Paul is passionate about teaching and cycling. Paul has a blog and is @ivanov on github and twitter.

Peter Prettenhofer

DataRobot, scikit-learn
Gradient Boosted Regression Trees in scikit-learn
Peter is a data scientist / software engineer at DataRobot. He studied computer science at Graz University of Technology, Austria and Bauhaus University Weimar, Germany focusing on machine learning and natural language processing. He is a contributor to scikit-learn where he co-authored a number of modules such as Gradient Boosted Regression Trees, Stochastic Gradient Descent, and Decision Trees.

Peter Wang

Continuum Analytics
Python's Role in the Future of Data Analysis, State of the Py, 2015
Peter holds a B.A. in Physics from Cornell University and has been developing applications professionally using Python since 2001. Before co-founding Continuum Analytics in 2011, Peter spent seven years at Enthought designing and developing applications for a variety of companies, including investment bankers, high-frequency trading firms, oil companies, and others. In 2007, Peter was named Director of Technical Architecture and served as client liaison on high-profile projects. Peter also developed Chaco, an open-source, Python-based toolkit for interactive data visualization.

Portia Burton

PLB Analytics
Know Thy Neighbor: An Introduction to Scikit-learn and K-NN
Portia Burton is the founder of PLB Analytics, a company which uses data to solve practical business problems. She is also the organizer of the Portland Data Science Group, a ragtag club of data visualization and data mining nerds. Portia loves poking around with pandas, scikit-learn and building nifty one-off apps with Flask.

Rob Story

DataPad
Up and Down the Python Data and Web Visualization Stack
Rob is a full stack engineer at DataPad and the author of the Vincent, Folium, and Bearcart Python data visualization packages. Coming from an Ocean and Aerospace Engineering background, he spent time both as a Naval Architect performing work for the Office of Naval Research and a Siting and Loads Engineer for Vestas Wind Power before joining DataPad. He currently lives in Portland, OR and tries to escape to the mountains as often as possible.

Rob Witoff

NASA, Y Combinator
Dark Data: A Data Scientist's Exploration of the Unknown
Rob Witoff is the Directorate Data Scientist for IT at the NASA Jet Propulsion Laboratory and a Y Combinator startup founder. He founded NASA’s inaugural data science incubation lab where his team applies new technologies to strategic problems across the agency that enable scientists and engineers to better interact with information. Prior to joining JPL, Witoff founded several successful technology startups. At JPL he has led development of interplanetary data visualization technology and led Space Station Systems Engineering for the experimental OPALS laser communication satellite. Witoff is NASA’s first IT data scientist and his data science team is incubating solutions to Big Data problems for spacebound explorers, earthborn assets and the people that make them possible.

Robert Brewer

Crunch.io
Crushing the head of the snake
Robert is the Chief Architect of Crunch.io, where he works with brilliant statisticians and developers on analytics as a service. He is also the lead developer of CherryPy, a leading HTTP server and framework for Python.

Ryan Rosario

Facebook
Sentiment Classification Using scikit-learn
Ryan is a Quantitative Engineer at Facebook working on Machine Learning and Natural Language Processing applications. Before Facebook he worked at web advertising startups in the LA area doing machine learning, natural language processing and algorithms involving real-time bidding. He holds a Ph.D. in Statistics and an M.S. in Computer Science from UCLA. Ryan has been coding in Python since 2005, when he started using it for web crawling and web graph mining. He is a recovering R user, though is still a fan most of the time.

Sarah Guido

Reonomy
K-means Clustering with Scikit-Learn
Sarah has just graduated from the University of Michigan's School of Information and will be joining Reonomy, a startup in NYC, as a data scientist. Three of her favorite things are Python, data, and machine learning.

Saul Diez-Guerra

Ampush
My First Numba, Speed Without Drag
Saul Diez-Guerra is a senior software engineer at Ampush in New York City, where he uses Python to build ad management and bidding systems, after a stint in social network R&D at Telefónica. He hails from Spain, where he received a Bachelor's in Computer Science and another in Telecommunications.

Stephan Hoyer

The Climate Corporation
Introducing xray: extended arrays for scientific datasets
Stephan is a quantitative researcher at The Climate Corporation (http://climate.com), where he builds statistical models of weather and climate for agronomic applications. He recently completed his PhD in theoretical physics at UC Berkeley.

Thomas Kluyver

UC Berkeley
The IPython protocol, frontends and kernels
I am a core IPython developer, with a background in plant sciences.

Tim Spurway

Chango
Hustle: a column oriented, distributed event database
Tim Spurway is a software engineer who heads up the Large Scale Data team at Chango. He is the creator of Hustle - the "column oriented, embarrassingly distributed relational event database", which was open-sourced in March '04.

Travis Oliphant

Co-Founder & CEO, Continuum Analytics
Blaze, Building the PyData Community, Conda, Packaging and Deployment, Pythran: Static Compiler for High Performance, Scalable Analytics and Visualization: Connecting Expertise to Data With Python, Welcome

CEO and Co-Founder, Continuum Analytics Introduction to NumPy; Introduction to SciPy

Dr. Oliphant has a Ph.D. in Biomedical Engineering from the Mayo Clinic, and M.S. and B.S. degrees in Electrical Engineering (and Math) from Brigham Young University. Travis has worked extensively with Python for numerical and scientific programming since 1997, and was the primary developer of the NumPy package and the author of the definitive Guide to NumPy. He is also the primary founding author of the SciPy package. During his academic career, he has worked in the fields of satellite remote sensing, Magnetic Resonance Imaging (MRI), Ultrasound, elastography, and general inverse problems. He was an Assistant Professor of Electrical and Computer Engineering at Brigham Young University from 2001 to 2007 where he taught courses in probability theory, electromagnetics, inverse problems, and signal processing. In addition, he directed the BYU Biomedical Imaging Lab, and performed research on scanning impedance imaging. He has done consulting work since 1997 in laser scattering off of semiconductors, sparse matrix calculations for search engines, and mesh transformations for fluid dynamics. Dr. Oliphant co-founded Continuum Analytics, Inc. in 2012 and currently serves as its CEO.

Ville Tuulos

AdRoll
How to build a SQL-based data warehouse for 100+ billion rows in Python
Ville

Wes McKinney

DataPad
DataPad: Python-powered Business Intelligence, Practical Medium Data Analytics with Python
Coming Soon

Questions? Comments? Contact admin@pydata.org