Member of Project Jupyter’s Steering Council, core developer of JupyterHub and mybinder.org, co-editor of The Journal of Open Source Education (JOSE), and co-author of an open source book, Teaching and Learning with Jupyter.
Jupyter notebooks have become the de facto standard in science and data science for producing computational narratives. Over five million Jupyter notebooks exist on GitHub today. Beyond the classic Jupyter notebook, Project Jupyter's tools have evolved to provide end-to-end research workflows that enable scientists to prototype, collaborate, and scale with ease. JupyterLab, a web-based, extensible, next-generation interactive development environment, enables researchers to combine Jupyter notebooks, code, and data into computational narratives. JupyterHub brings the power of notebooks to groups of users, giving them access to computational environments and resources without burdening them with installation and maintenance tasks. Binder builds upon JupyterHub to provide free, shareable, interactive computing environments to people all around the world.
Kyle is the host of the Data Skeptic podcast, a weekly interview program covering topics related to data science, artificial intelligence, machine learning, statistics, and cloud computing. Data Skeptic celebrated its 5th birthday this year.
As principal architect at Data Skeptic Labs, he leads a team that builds bespoke machine learning and data solutions at scale in industries including aerospace, fraud prevention, retail, insurance, consumer packaged goods, and ad-tech. Kyle also serves as an advisor to several small and medium-sized companies. Data Skeptic Labs released its first official product (a chatbot platform) in 2019.
Serverless computing, edge computing, and cloud computing are distinct paradigms in which Python has been an almost uniquely successful language. This session will explore use cases and a few opinionated design philosophies that work especially well in Python, recorded as a live episode of Data Skeptic. It will include an interview guest doing a technical deep dive in the style the Data Skeptic podcast is known for, as well as an exclusive look at a Python-based project being secretly developed at Data Skeptic Labs.
Milana (Rabkin) Lewis is the co-founder and CEO of Stem, a financial platform that simplifies payments for musicians and content creators.
Prior to founding Stem, Milana spent five years as a Digital Media Agent at the premier global talent and literary agency, United Talent Agency (UTA). She helped build UTA’s digital offerings by advising the agency’s individual and corporate clients on emerging distribution platforms and on digitally driven fundraising and monetization opportunities. Milana represented a roster of digital creators, ranging from top YouTube and Vine stars to prominent bloggers and social media personalities, and helped them grow their social channels into sustainable and profitable careers. In addition to this work, Milana sourced investment opportunities for UTA’s then newly formed venture capital division.
Despite the abundance of data in today's digital age, not all data is clear or actionable. Stem's mission addresses both shortcomings: it advocates clarity over mere transparency, providing actionable insights from data that empower artist-driven businesses to make better-informed decisions. In this talk, Milana will discuss how Stem uses data, both internally and externally, in ways that help drive growth for Stem's business and its clients. Milana will be in conversation with Sylvia Tran, organizer of PyLadies Los Angeles.
Dr. Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine (UCI). He works on the robustness and interpretability of machine learning algorithms, along with models that reason with text and structure for natural language processing. Sameer was a postdoctoral researcher at the University of Washington and received his PhD from the University of Massachusetts, Amherst, during which he also worked at Microsoft Research, Google Research, and Yahoo! Labs. His group has received funding from the Allen Institute for AI, the National Science Foundation (NSF), the Defense Advanced Research Projects Agency (DARPA), Adobe Research, and FICO.
Machine learning is at the forefront of many recent advances in science and technology, enabled in part by the sophisticated models and algorithms that have recently been introduced. As a consequence of this complexity, however, machine learning essentially acts as a black box as far as users are concerned, making it incredibly difficult to understand, predict, or detect bugs in its behavior. For example, determining when a machine learning model is “good enough” is challenging because held-out accuracy metrics significantly overestimate real-world performance. In this talk, I will describe our research on approaches that explain the predictions of any classifier in an interpretable and faithful manner, and on automated techniques to detect bugs that can occur naturally when a model is deployed. In particular, these methods describe the relationship between the components of the input instance and the classifier’s prediction. I will cover several ways of summarizing this relationship — as linear weights, as precise rules, and as counter-examples — and present experiments that contrast them and evaluate their utility for understanding and debugging black-box machine learning algorithms on tabular, image, and text applications.
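The first flavor of summary mentioned above, linear weights, can be illustrated with a minimal perturbation-based sketch: randomly switch components of an input on and off, query the black box, and fit a linear model to the resulting predictions so that each weight estimates how much a component pushes the prediction up or down. This is an illustrative toy, not the speaker's actual method; `black_box`, `explain_locally`, and the feature setup are all hypothetical.

```python
import numpy as np

def black_box(X):
    # Hypothetical opaque classifier over 4 binary features: its score
    # depends mostly on feature 0 (positively) and feature 2 (negatively).
    return 1 / (1 + np.exp(-(2.0 * X[:, 0] - 3.0 * X[:, 2] + 0.5)))

def explain_locally(instance, model, n_samples=5000, seed=0):
    """Approximate `model` near `instance` with per-component linear weights."""
    rng = np.random.default_rng(seed)
    # Binary masks: 1 keeps a component of the instance, 0 removes it.
    masks = rng.integers(0, 2, size=(n_samples, instance.size))
    perturbed = masks * instance           # components switched on/off
    preds = model(perturbed)
    # Least-squares linear fit (with intercept) to the black-box outputs.
    design = np.column_stack([masks, np.ones(n_samples)])
    weights, *_ = np.linalg.lstsq(design, preds, rcond=None)
    return weights[:-1]                    # drop the intercept

instance = np.ones(4)                      # all four components present
w = explain_locally(instance, black_box)
# w[0] comes out positive and w[2] negative, mirroring the hidden model;
# w[1] and w[3] stay near zero since those features never matter.
```

The fitted weights are exactly the "linear weights" style of summary: faithful in a small neighborhood of the instance, even though the underlying model is nonlinear.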