Speaker Bios



----------

TBD,

Adam Hajari

Next Big Sound
From DataFrame to Web Application in 10 minutes,
Adam Hajari is a data scientist at Next Big Sound where he builds tools to detect breaking artists, explore demographic and geographic trends, generate data-driven reports, and import and clean data from dozens of data providers spanning the music and book publishing industries. Inspired by R's Shiny library, Adam built Spyre, a web application framework library for Python, which allows Python programmers to quickly and easily deploy interactive web applications without having to muck around with HTML, CSS, or JavaScript.

Akira Shibata

Shiroyagi Corporation, PyData Tokyo
Putting Together World's Best Data Processing Research with Python ,
CEO of Shiroyagi Corporation; Ph.D. in Experimental Particle Physics; PyData Tokyo organizer. Shiroyagi Corporation develops and provides a news curation app for the Japanese market called Kamelio (http://kamel.io). Our second round of funding was led by the major Japanese VC Global Brain this February. Kamelio aims to recommend content matched to users' specific interests based on their usage behavior. Previously, as a postdoctoral scientist at NYU working as a data scientist for the LHC project, which found the Higgs "god" particle 2 years ago, I specialised in statistical modelling of the massive data produced by the particle accelerator. Later, as a consultant at the Strategy Institute of Boston Consulting Group in New York, I worked on a project that was eventually published as "Your Strategy Needs a Strategy" in Harvard Business Review.

Alain Ledon

Baruch College
Logistic Regression & NFL,
Coming soon.

Amit Bhattacharyya

Sr Data Scientist, Teachers Pay Teachers
Logistic Regression & NFL,
Coming Soon

Andreas Mueller

NYU CDS, scikit-learn
Advanced scikit-learn ,
Andreas Mueller received his PhD in machine learning from the University of Bonn. After working as a machine learning researcher on computer vision applications at Amazon for a year, he recently joined the Center for Data Science at New York University. In the last four years, he has been a maintainer and one of the core contributors of scikit-learn, a machine learning toolkit widely used in industry and academia, and an author of and contributor to several other widely used machine learning packages. His mission is to create open tools to lower the barrier of entry for machine learning applications, promote reproducible science and democratize the access to high-quality machine learning algorithms.

Anna Herlihy

MongoDB
Monary: Really fast analysis with MongoDB and NumPy,
Anna is a software engineer with a passion for large data processing and Python. She currently works for MongoDB in NYC.

BOF

Biological Data Science , Data Community/Meetup Organizers, Healthcare Analytics, Submit topic to admin@pydata.org,
------

Bryan Van De Ven

Continuum Analytics
Beautiful Interactive Visualizations in the Browser with Bokeh,
Mr. Van de Ven received undergraduate degrees in Computer Science and Mathematics from UT Austin, and a Master's degree in physics from UCLA. He has worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms. He also spent time at Enthought, where he worked on problems in financial risk modeling and fluid mixing simulation, and also contributed to the Chaco visualization library. He has also worked on an assortment of iOS projects as an independent consultant.

Bugra Akyildiz

Axial
A Machine Learning Pipeline with Scikit-Learn,
I am a Data Scientist at Axial where I work on information retrieval and recommender systems. I received a B.S. from Bilkent University and an M.Sc. from New York University, focusing on signal processing and machine learning. I also do consulting in NLP and machine learning on a project basis: document classification, topic modeling and time-series signal analysis.

Dan Foreman-Mackey

NYU Dept. of Physics
Time series analysis using Gaussian Processes in Python and the search for Earth 2.0,
Dan Foreman-Mackey is a graduate student in the Center for Cosmology and Particle Physics at NYU where he studies exoplanets using probabilistic data analysis and Python. He is the lead developer of the emcee Python package for Markov Chain Monte Carlo (http://dfm.io/emcee) and a high performance Gaussian Process implementation called George (http://dfm.io/george). In his spare time he makes stupid things on the internet (see, for example, the Open Source Report Card http://osrc.dfm.io).

Daniel Blanchard

Educational Testing Service
Simple Machine Learning with SKLL 1.0,
Daniel Blanchard is a Research Engineer in the NLP & Speech group at Educational Testing Service (ETS). He has been coding for over 20 years and has been using Python exclusively for the past 4 years. He is the primary maintainer of the DRMAA Python, GridMap, and SKLL libraries and is co-maintainer of chardet.

Daniel Krasner

KFit Solutions/Columbia University
(Easy), High Performance Text Processing with Rosetta,
Daniel Krasner is the co-founder of KFit Solutions, a data science consulting firm, and a research scholar with the “Declassification Project” at Columbia University. His current interests and work focus on high performance statistical solutions in text and natural language processing. He is the co-creator of “Rosetta,” an open source Python text processing library. In addition, Daniel continually works with a number of hedge funds in the city, building financial modeling and decision support systems. Previously, Daniel was the chief data scientist at Sailthru, an email and behavioral analytics platform, a senior researcher at Johnson Research Labs, and a professor teaching Applied Data Science in the Columbia University statistics department. Prior to entering the world of data science, Daniel Krasner was a researcher at the Mathematical Sciences Research Institute in Berkeley and an assistant professor of mathematics at UCLA. He holds a PhD in mathematics from Columbia University.

Greg Reda

Datascope Analytics
Translating SQL to pandas. And back.,
Greg is a data science consultant in Chicago. Prior to consulting, he spent three years leading the data team at GrubHub, where he was the first data hire. As a big believer that the best way to truly learn is to teach others, Greg is the author of many widely read tutorials on web scraping, pandas, and unix commands for data science.

Hannah Aizenman

CUNY Graduate Center
Get To Know Your Data,
I am working on a PhD in Computer Science at The Graduate Center, CUNY. My research is in using machine learning to make sense of and visualize large, mostly climate, datasets, so I've spent a lot of time learning (and teaching) the scientific Python stack. I mostly use Python for my work, with the occasional other language thrown in.

Hugo Shi

PyData Conference
KitchenSink - framework for working with remote datasets interactively,
I've been at Continuum Analytics for the past 3 years, working on Bokeh, a web-based interactive visualization toolkit for Python. Before that I worked in quantitative finance for a variety of financial institutions.

Ian Huston

Pivotal
Using Cloud Foundry for Data Driven Python apps,
Ian is a data scientist for Pivotal and uses the Python data stack on a wide range of customer projects from fraud detection to transport and logistics. Ian has a background in numerical analysis and simulation and his expertise includes high performance computing for scientific applications, perturbative analysis of large systems of differential equations and the differential geometry underlying relativistic physics. He completed a PhD in theoretical cosmology at Queen Mary, University of London and received an MSc in theoretical physics from Imperial College London. Ian’s work has been published in leading international physics journals, and he released Pyflation, the Python numerical package used in his research, to the community.

James Powell

NYC Python
Title Coming Soon,
James Powell is an NYC-based Python programmer with experience in quantitative finance and data science. He's also very active in the Python community, where he organizes NYC Python, the world's largest and most active Python meetup group. He also works with the numeric & scientific computing non-profit NumFOCUS to help organize the PyData conference series. In addition, he's a frequent speaker at Python conferences, and has been invited to speak at events such as PyData New York, PyData London, PyGotham, the conference ‘For Python Quants,’ and PyCon Spain.

Jared Lander

Lander Analytics
Decreasing Uncertainty with Weakly Informative Priors and Penalized Regression,
Coming Soon

Jason Grout

IPython, Bloomberg
Advanced IPython Notebook Widgets,
Jason Grout received a PhD in mathematics from Brigham Young University, was a postdoc at Iowa State University, an assistant professor of mathematics at Drake University in Des Moines, IA, and recently joined Bloomberg L.P. He has been contributing to the open-source Sage mathematical software system since 2007, and until recently was leading the development effort and running the Sage online notebook and the Sage cell server for several years. For the last several years, Jason has been contributing to IPython, including many contributions to the widget infrastructure.

Jonathan Dinu

Zipfian Academy
On Building a Data Science Curriculum,
Jonathan is redefining data science education as the co-founder and CTO of Zipfian Academy. He first discovered his love of all things data while studying Computer Science and Physics at UC Berkeley. In a former life, he worked for Alpine Data Labs developing distributed machine learning algorithms for predictive analytics on Hadoop. Jonathan has always had a passion for sharing the things he has learned in the most creative ways he can. He has been a mentor at Dev Bootcamp, taught classes at General Assembly, and was an instructor at Hack Reactor. At Zipfian Academy, he gets to combine his two favorite things: humans and code.

Julia Evans

Stripe
Recalling with precision,
Julia lives in Montréal and works on Stripe's machine learning team. She is a Hacker School alumna, keeps a blog (http://jvns.ca/), and likes learning about operating system internals. When not programming, Julia spends a lot of time thinking about how to organize amazing conferences, making scary concepts more accessible, and writing about surprising programming things she's learned recently.

Karan Dodia

Continuum Analytics
Rapid Exploration and Visualization of Large Datasets ,
Junior Software Developer at Continuum Analytics.

Kevin Wilson

Knewton
Evaluating skills in educational and other settings: An overview,
Kevin started programming in BASIC long ago on an old Apple IIe, making little calculators and applications to do his homework for him. Throughout middle and high school, he ran a small business doing graphic design and custom programming, which began with an opportunity at the Sentinel-News, the local newspaper. (When he started high school, they terminated his contract, and so he can take absolutely no credit for their current monstrosity of a website.) But programming was a gateway drug to mathematics, which he studied in earnest at the University of Michigan and then as a graduate student at Princeton University. After completing his Ph.D., he began working at Knewton as a data scientist, where he has been for the past 2 years. Knewton works with publishers and content creators to build adaptive content, helping students navigate coursework based on their current and past performance in the course, and helping teachers understand where students are struggling and excelling. He has a special interest in proficiency estimation and arithmetic statistics, but is also on a quest to get more acknowledgments, and as such, loves hearing about other people's projects.

Kyle Beauchamp

Memorial Sloan Kettering Cancer Center
Omnia.md: Engineering a Full Python Stack for Biophysical Computation,
PhD 2013, Stanford University. Research Fellow at Memorial Sloan Kettering Cancer Center, with John D. Chodera. Developer, Folding@Home Distributed Computing Project. Principal, Co-Principal, or Contributing Developer for numerous Python scientific projects, with emphasis on Biophysics: MDTraj, OpenMM, MSMBuilder, Mixtape, PyMBAR, and more.

Luis Miguel Sanchez

ttwick
Using Python to design a parametric catastrophic (CAT) earthquake bond and predict its credit spread.,
Luis has over 20 years of experience in capital markets, insurance, consulting, and engineering, with emphasis on quantitative analysis. Luis has held multiple senior executive positions and quantitative analyst roles for Barclays Capital, Lehman Brothers, Deutsche Bank, NetRisk, AIG, and a couple of hedge funds, where he structured and launched over 10bn USD worth of deals, many with exposure to exotic assets. Luis obtained his MBA on a Fulbright LASPAU scholarship, and a BSc in Civil Engineering with double concentration in Hydraulics and Structural Analysis from the University of the Armed Forces of Venezuela. He is the founder and CEO of ttwick, Inc.

Manuel Rivas

University of Oxford
Python for Personal and Population Genome Interpretation ,
Born in Managua, Nicaragua in the 80's; raised in Miami; educated by the school of hard knocks: MIT; now living in the city of spires: Oxford. I have spent my academic career bridging the worlds of statistics, computing, genetics, biology, and medicine to improve our understanding of the genetic contribution to disease susceptibility. Looking forward to sharing tools and Python libraries with the community for the next generation of big data and health.

Matt Greenwood

Two Sigma Investments
The Polyglot Beaker Notebook,
Matt Greenwood is the Chief Innovation Officer at Two Sigma Investments in NYC where he has happily solved hard problems for the last decade. Prior to Two Sigma, he worked at a Telecommunications startup in the Bay Area. Matt began his career at Bell Labs, working on the Inferno operating system under Dennis Ritchie. Matt earned a B.A. and M.A. in Maths from Oxford University. He also holds a Ph.D. in Mathematics from Columbia University where he taught for a number of years.

Matthew Rocklin

Continuum Analytics
Blaze Foundations: Part 1, Python in Business Intelligence: What's Missing?,
Matthew Rocklin likes numerics, mathematics, and programming paradigms. He contributes to a variety of open source projects and endeavors to demonstrate the value of abstract solutions to concrete problems. He is a graduate of UC Berkeley (Physics, Math) and of the University of Chicago (PhD in CS). http://matthewrocklin.com/

Michael Becker

DataPhilly
Data Science: It's Easy as Py!,
Michael Becker is a Data Engineer at AWeber and founder of the DataPhilly Meetup group. On a day to day basis, he spends a majority of his time acquiring, scrubbing, exploring, and visualizing data. He loves machine learning and gets his kicks out of clustering, regression and classification algorithms.

Michelle Fullwood

Grids, Streets & Pipelines: Making a linguistic streetmap with scikit-learn ,
I'm a grad student in linguistics who loves Python, natural languages, and maps.

Mike Pittaro

Dell
High Performance Hardware for Data Analysis ,
Mike Pittaro has over 25 years of experience in the high technology industry, specializing in high performance computing, data warehousing, and distributed systems. He has held engineering and support positions at Alliant Computer, Kendall Square Research, Informatica, and SnapLogic. Mike is currently the principal architect for big data on Dell's Cloud Software Solutions team, where he focuses on delivering big data solutions. He is a member of the ACM, The Free Software Foundation, and the OpenStack Foundation.

Milos Miljkovic

Hyperion Analytics
Analyzing Satellite Images With Python Scientific Stack,
Milos received his PhD from CUNY, where he developed methods for cancer detection using neural networks and infrared hyperspectral images. The intersection of chemistry, spectroscopy, biology, statistics, pathology, computer science and instrument design was the focus of his interest and activities while he was a research assistant professor at Northeastern University. He currently works as a consultant in Boston.

Neal Snow

University of South Florida
Straight from the horse's mouth: Working with XBRL tagged 10-Ks,
PhD candidate in accounting at the University of South Florida. Lover of Python, running, cycling and truth.

Olga Botvinnik

University of California, San Diego
Data-driven conversations about biology,
Hello, I'm Olga. I'm a Bioinformatics/Computational Biology PhD student at the University of California, San Diego. I study RNA biology, meaning the arrows in DNA --> RNA --> Protein, because RNA has its own crazy lifecycle and lots of things can go wrong and cause sad times (neurodegeneration, cancer, etc.). Besides RNA, I'm passionate about data visualization (wrote prettyplotlib, contribute to seaborn), tea, yoga and Beyonce.

Patrick Grinaway

Memorial Sloan Kettering Cancer Center
Omnia.md: Engineering a Full Python Stack for Biophysical Computation,
Coming soon.

Phillip Cloud

pandas, Continuum Analytics
Driving Blaze in the Real World of Data Land Mines,
Phillip enjoys building Python-based tools that make others’ experience with data analysis a pleasure. He graduated with a BS and MA in psychology from CUNY City College, where he worked on quantifying the activity of spontaneously firing neurons in vivo. In addition to being a Python evangelist, he is also a core developer on the pandas data analysis library.

Ritesh Bansal

Rational Insights
Python for Fun, Profit and Retirement Planning,
Ritesh Bansal has been working at the intersection of code, data and math since 1993 when he built the first commercial real time visualization for live stock market data. After graduating with a BS in Mathematics and Computer Science from Carnegie Mellon he spent 13 years on Wall Street as a quant building high performance and distributed trading systems and also ran a high frequency trading group. He now wrangles data and code building machine learning applications at data science consultancy Rational Insights.

Sasha Laundy

Polynumeral
How to Make Your Future Data Scientists Love You,
Sasha is the founding data scientist and engineer at Polynumeral, a data science consultancy in New York City. She helps clients, including the World Bank, New York Public Radio, DonorsChoose.org, and Warby Parker, solve hard data problems and design their data strategy. Previously she worked at Twilio and was an early employee at Codecademy. She founded Women Who Code, a global non-profit which connects 16,000 technical women in 14 countries.

Saul Diez-Guerra

Thinkful
Performance Python,
Saul Diez-Guerra works as Engineering Lead at Thinkful in New York City, where he uses Python to create awesome online learning experiences. Prior to that he worked at Ampush, where he built ad management and bidding systems, after a stint in social network R&D at Telefónica I+D. He hails from Spain, where he received Bachelor's degrees in both Computer Science and Telecommunications.

Scott Draves

Two Sigma, Beaker Notebook
The Polyglot Beaker Notebook,
Scott Draves is the lead engineer for the Beaker Notebook at Two Sigma Investments in NYC. In the past he worked for Google Maps and a variety of startups in the Bay Area. He is best known as the pioneering software artist who created the Electric Sheep, a collective intelligence consisting of 450,000 computers and people that uses mathematics and genetic algorithms to realize an infinite abstract animation. Scott Draves has a BS in Mathematics from Brown University and a PhD in Computer Science from Carnegie Mellon University.

Stefan Karpinski

Julia
Julia + Python = ♡ ,
Stefan Karpinski is one of the co-creators and core developers of Julia, a high-level, high-performance dynamic programming language for technical computing. He is an applied mathematician and data scientist by trade, having worked at Akamai, Citrix Online, and Etsy, but currently is employed as a researcher at MIT, focused on advancing Julia’s design, implementation, documentation, and community.

Stefan Urbanek

Squarespace
Data warehouse and conceptual modelling with Cubes 1.0, Python in Business Intelligence: What's Missing?,
Business intelligence professional. Author of Cubes, an open-source OLAP framework in Python.

Stephen Pimentel

FoundationDB
Python as a Query Language for Distributed Key-Value Stores ,
Stephen Pimentel is an engineer at FoundationDB in Vienna, VA, where he uses Python to leverage the power of distributed database technology. He writes code, tutorials, and online courses on data modeling in highly concurrent environments. Along with developer evangelism, his passion is for data analytics applied to large data sets. He holds an M.S. in Electrical Engineering and Computer Science from Johns Hopkins. You can find him on Twitter @stephenpiment.

Sudeep Das

OpenTable Inc.
Using Data Science to Transform OpenTable Into Your Local Dining Expert,
I am a data scientist with a passion for turning data into meaningful insights and stories. I believe that powerful visualizations provide key entry points into understanding data. Very often, I find myself hand-rolling innovative visualizations driven by unique characteristics of the data. Currently, at OpenTable, I am using an arsenal of data science tools to derive insights from our extensive data trove. My current projects include Natural Language Processing and Topic Analysis on an extensive review corpus, to reveal salient features of restaurants and reviewers. I am also driving the understanding of restaurant capacity utilization and diner spending trends, and building a recommendation stack using Matrix Factorization, Content-Based and hybrid Factorization Machine based approaches. I love using Python-based tools like Pandas, Scikit-learn, Gensim, PySpark, Matplotlib and mpld3. For most of my professional life, I have been an Astrophysicist. I have strong experience in solving complex problems using innovative methods. I have extensive leadership abilities with a knack for bringing smart people together to solve problems. I have excellent communication skills, and am frequently invited to deliver talks, lectures and colloquia at domestic and international conferences. One of my most exciting achievements has been leading a team that isolated a subtle cosmological effect for the first time from noisy data. I have authored more than 50 peer-reviewed articles in leading journals.

Tim Spurway

Mozilla
Disco: Distributed Multi-Stage Data Pipelines,
Tim Spurway is a software engineer at Mozilla who wrangles large data sets using the Disco ecosystem. He is the creator of two Disco-based projects: Hustle, the "column oriented, embarrassingly distributed relational event database", and Inferno, a rule-based stream processor and job controller.

Tobi Bosede

NYC Pyladies
PyCassa: Setting up and using Apache Cassandra with Python (in Windows),
Tobi Bosede is the lead organizer for NYC Pyladies. She is passionate about making the tech community more accessible to underrepresented groups and has worked with various companies to make that happen. Tobi has also taught Python programming for General Assembly in New York and written technical articles for Pearson. By day she is a data analyst and programmer at an investment bank. Tobi's professional work spans academia to industry, from Cornell University to Sprint. She holds a bachelor's degree in mathematics from the University of Pennsylvania.

Tony Tran

SF Bay Area Machine Learning
Making BIG DATA smaller ,
Tony Tran is the founder and co-organizer of the San Francisco Bay Area Machine Learning Meetup Group. He is currently a Machine Learning consultant helping companies tackle big data problems. Previously, Tony worked as a Data Engineer at Bizo where he built large scale search services and analyzed advertising data to help guide strategic business decisions. He holds an MS in Computer Science from UC Irvine specializing in Machine Learning and Computer Vision and a BS in Computer Science from UC San Diego.

Travis Oliphant

Co-Founder & CEO, Continuum Analytics
Welcome,

CEO and Co-Founder, Continuum Analytics Introduction to NumPy; Introduction to SciPy

Dr. Oliphant has a Ph.D. in Biomedical Engineering from the Mayo Clinic, and M.S. and B.S. degrees in Electrical Engineering (and Math) from Brigham Young University. Travis has worked extensively with Python for numerical and scientific programming since 1997, and was the primary developer of the NumPy package and the author of the definitive Guide to NumPy. He is also the primary founding author of the SciPy package. During his academic career, he has worked in the fields of satellite remote sensing, Magnetic Resonance Imaging (MRI), Ultrasound, elastography, and general inverse problems. He was an Assistant Professor of Electrical and Computer Engineering at Brigham Young University from 2001 to 2007 where he taught courses in probability theory, electromagnetics, inverse problems, and signal processing. In addition, he directed the BYU Biomedical Imaging Lab, and performed research on scanning impedance imaging. He has done consulting work since 1997 in laser scattering off of semiconductors, sparse matrix calculations for search engines, and mesh transformations for fluid dynamics. Dr. Oliphant co-founded Continuum Analytics, Inc. in 2012 and currently serves as its CEO.


Questions? Comments? Contact admin@pydata.org