UPDATE 01/08/2024: Event moved to fall 2024.

The inaugural Workshop on Computational Advances for Scientific Discovery and Interpretability at MIT 2024 brings together researchers in computer science, data science, artificial intelligence, and various scientific disciplines. Its primary objective is to explore the integration of cutting-edge computational techniques into scientific research, with special emphasis on the critical role of interpretability within these methodologies.

The workshop is dedicated to two fundamental themes: firstly, advancing scientific discovery through innovative computational strategies, and secondly, enhancing the interpretability of complex machine learning models.

Participants will delve into the application of machine learning and high-performance computing across a spectrum of scientific areas, highlighting their significance in fostering new discoveries and ensuring interpretability in scientific endeavors.

This workshop is ideally suited for students and researchers in computational methods and the sciences who seek to understand and shape the future of computational methods in scientific discovery.

Day 1: Thursday,  February 1, 2024

9:00 - 12:00 - Workshop 1: Ameya Daigavane, Mit Kotak (MIT) (Atomic Architects)

Symmetry. The workshop commences with a session led by the Atomic Architects group at MIT, focusing on the role of symmetry in computational models. This session will introduce participants to e3nn (https://e3nn.org/), a PyTorch- and JAX-based framework for building E(3)-equivariant neural networks. E(3), the Euclidean group in three dimensions, comprises rotations, translations, and mirror operations. The session will cover the basics of e3nn, including practical applications in various scientific fields.
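e3nn builds E(3)-equivariant layers out of irreducible representations; as a library-free illustration of what equivariance itself means, the sketch below (plain Python, not e3nn code) checks numerically that the cross product commutes with rotations, i.e. f(Ra, Rb) = R f(a, b) for a rotation R:

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Cross product: an equivariant map under rotations (SO(3))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

R = rot_z(0.7)
a, b = [1.0, 2.0, 3.0], [-0.5, 0.4, 1.1]

# Equivariance: rotate inputs then apply f  ==  apply f then rotate output.
lhs = cross(matvec(R, a), matvec(R, b))
rhs = matvec(R, cross(a, b))
assert all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))
```

e3nn generalizes this property to learned tensor-product layers. Note that the cross product is equivariant under rotations but picks up a sign under mirrors (it is a pseudovector); distinctions of exactly this kind are what e3nn's irrep labels (e.g. 0e, 1o) keep track of.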

12:00 - 13:00 - Lunch Break

13:00 - 16:00 - Workshop 2: Omar Costilla-Reyes and Morgan Talbot (http://omarcostilla.mit.edu/)

Advances in Digital Behavioral Health

This workshop introduces researchers, clinicians, and other professionals to the field of digital behavioral health. It aims to provide a comprehensive understanding of the latest methodologies and their applications in this rapidly evolving domain. The focus will be on counterfactual analysis, network psychometrics, and a balanced exploration of the opportunities and challenges inherent in these approaches.
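One of the session's themes, network psychometrics, models symptoms as nodes and their pairwise statistical associations as edges. A minimal sketch of that idea (the symptom names, scores, and threshold below are illustrative, not from the workshop materials):

```python
# Hypothetical symptom ratings for six participants (illustrative only).
data = {
    "insomnia": [3, 4, 2, 5, 4, 1],
    "fatigue":  [3, 5, 2, 4, 4, 2],
    "low_mood": [1, 4, 2, 5, 3, 1],
    "appetite": [5, 1, 4, 2, 2, 5],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def symptom_network(data, threshold=0.5):
    """Edges connect symptom pairs whose |r| exceeds the threshold."""
    names = list(data)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) >= threshold:
                edges.append((a, b, round(r, 2)))
    return edges

for a, b, r in symptom_network(data):
    print(f"{a} -- {b}: r={r}")
```

In practice the field estimates regularized partial-correlation networks rather than raw correlations, but the node/edge picture is the same.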

16:00 - 17:00 - Lightning Talks

1. ReLax - Amulya Yadav and HangZhi Guo, Penn State University (https://github.com/BirkhoffG/jax-relax)

ReLax (Recourse Explanation Library in JAX) is an efficient and scalable benchmarking library for recourse and counterfactual explanations, built on top of JAX. By leveraging JAX primitives such as vectorization, parallelization, and just-in-time compilation, ReLax offers massive speed improvements in generating individual (or local) explanations for predictions made by machine learning algorithms.
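ReLax's own API is JAX-based; as a dependency-free sketch of what a recourse (counterfactual) explanation computes, the snippet below takes gradient steps on an input to a fixed logistic-regression model until its predicted class flips. The model weights and step sizes are illustrative, not ReLax code:

```python
import math

# Toy logistic-regression classifier (weights are illustrative).
w, b = [1.5, -2.0], -0.25

def predict_proba(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(x, target=0.5, lr=0.1, steps=200):
    """Gradient ascent on the input until the predicted class flips.

    This is the per-instance loop that libraries like ReLax vectorize
    (jax.vmap) and compile (jax.jit) to explain many instances at once.
    """
    x = list(x)
    for _ in range(steps):
        p = predict_proba(x)
        if p > target:
            break
        g = p * (1.0 - p)  # d p / d x_i = p * (1 - p) * w_i
        x = [xi + lr * g * wi for xi, wi in zip(x, w)]
    return x

x0 = [-1.0, 0.5]         # original instance, classified negative
cf = counterfactual(x0)  # perturbed instance, classified positive
assert predict_proba(x0) < 0.5 < predict_proba(cf)
```

Real recourse methods additionally penalize the distance from the original instance and constrain which features may change; this sketch keeps only the core class-flipping search.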

2. Top-Down Synthesis for Library Learning (Maddy Bowers, MIT)

This workshop introduces corpus-guided top-down synthesis as a mechanism for synthesizing library functions that capture common functionality from a corpus of programs in a domain specific language (DSL). The algorithm builds abstractions directly from initial DSL primitives, using syntactic pattern matching of intermediate abstractions to intelligently prune the search space and guide the algorithm towards abstractions that maximally capture shared structures in the corpus.
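A core operation in such corpus-guided synthesis is finding the least general pattern shared by two program trees (anti-unification): wherever the trees disagree, an abstraction parameter is introduced. A minimal sketch over s-expression-style trees (a simplification for illustration, not the workshop's actual algorithm):

```python
def antiunify(t1, t2, holes=None):
    """Least general generalization of two program trees.

    Trees are nested tuples like ("+", "x", ("*", "y", "2")). Wherever
    the trees disagree, a numbered hole "?0", "?1", ... (an abstraction
    parameter) is introduced; a repeated mismatch reuses its hole.
    """
    if holes is None:
        holes = []
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        return (t1[0],) + tuple(antiunify(a, b, holes)
                                for a, b in zip(t1[1:], t2[1:]))
    if (t1, t2) in holes:
        return f"?{holes.index((t1, t2))}"
    holes.append((t1, t2))
    return f"?{len(holes) - 1}"

# Two square-and-increment programs differing only in their variable:
p1 = ("+", ("*", "x", "x"), "1")
p2 = ("+", ("*", "y", "y"), "1")
print(antiunify(p1, p2))  # ("+", ("*", "?0", "?0"), "1")
```

The extracted pattern ("+", ("*", "?0", "?0"), "1") is a candidate library function "square-plus-one"; the top-down algorithm described above scores such abstractions by how much shared structure they capture across the whole corpus.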

17:00 - 18:00 - Networking Session

Day 2: Friday, February 2, 2024

9:00 - 12:00 - Workshop 3: Computational Data Science in Physics, Alex Shvonski (MIT) (NSF Institute for Artificial Intelligence and Fundamental Interactions, https://iaifi.org)

Explore realistic, contemporary examples of how computational and statistical methods apply to physics research. Using open data from physics experiments, you’ll be guided in the practice and logic behind end-to-end analyses using computation as an essential scientific tool.

12:00 - 13:00 - Lunch Break

13:00 - 16:00 - Workshop 4: Neurosymbolic Programming, Atharva Sehgal and Arya Grayeli (UT Austin) (https://www.neurosymbolic.org/)

This tutorial provides an overview of recent advances in neurosymbolic programming. The objective in this area is to learn neurosymbolic programs, which combine elements of neural networks and classical symbolic programs with the aim of inheriting the benefits of both. A key advantage of neurosymbolic programming is that the learned models are interpretable and look more like the models that domain experts already write by hand in code. Neurosymbolic programs can also incorporate prior knowledge more easily and are more amenable to analysis and verification. At the same time, neurosymbolic models are more expressive than classical interpretable models in machine learning, such as linear models or shallow decision trees. In terms of techniques, neurosymbolic programming combines ideas from machine learning and program synthesis, representing an exciting new point of contact between the two communities.
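As a toy illustration of the idea (not from the tutorial materials): a neurosymbolic program fixes an interpretable symbolic skeleton, here a conditional with a threshold, and learns only its numeric parameters from data. A coarse grid search stands in for the gradient/search hybrids used in practice:

```python
# Symbolic skeleton: "if x > theta then slope_hi * x else slope_lo * x".
# The structure stays readable by a domain expert; only the parameters
# theta, slope_lo, slope_hi are learned from data.
def program(theta, slope_lo, slope_hi):
    return lambda x: slope_hi * x if x > theta else slope_lo * x

# Illustrative data from a piecewise process (y = x below 2, y = 3x above).
data = [(0.5, 0.5), (1.0, 1.0), (1.5, 1.5),
        (2.5, 7.5), (3.0, 9.0), (4.0, 12.0)]

def loss(f):
    return sum((f(x) - y) ** 2 for x, y in data)

# Parameter learning by coarse grid search (illustrative only).
best = min(
    ((t, lo, hi) for t in [1.0, 2.0, 3.0]
                 for lo in [0.5, 1.0, 1.5]
                 for hi in [2.0, 3.0, 4.0]),
    key=lambda p: loss(program(*p)),
)
print(best)  # (2.0, 1.0, 3.0) fits the data exactly
```

The fitted program, "if x > 2.0 then 3.0*x else 1.0*x", is exactly the kind of model a domain expert could read, audit, and verify, while the parameters were learned rather than hand-set.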

16:00 - 17:00 - Lightning Talks

1. Mathematical discoveries from program search with large language models, Bernardino Romera Paredes (Google DeepMind) (https://github.com/google-deepmind/funsearch).

This talk introduces FunSearch (short for searching in the function space), an evolutionary procedure that pairs a pre-trained LLM with a systematic evaluator. The approach has surpassed the best known results on important problems, pushing the boundary of existing LLM-based approaches.
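The FunSearch loop pairs a program generator with an automatic evaluator and keeps the highest-scoring candidates. The sketch below keeps that skeleton but replaces the pre-trained LLM proposal step with random parameter mutations of a seed program (an illustrative stand-in only; the task and scoring are invented for this example):

```python
import random

random.seed(0)

def evaluate(f):
    """Automatic evaluator: how well f(n) > 0 separates evens from odds
    on a toy range. Broken programs score worst, as in FunSearch."""
    try:
        return sum(1 for n in range(20) if (f(n) > 0) == (n % 2 == 0))
    except Exception:
        return -1

# Program space: (a, b) encodes the candidate f(n) = a * (n % 2) + b.
def make_program(a, b):
    return lambda n: a * (n % 2) + b

def mutate(params):
    """Stand-in for the LLM proposal step: perturb one parameter."""
    a, b = params
    if random.random() < 0.5:
        return (a + random.choice([-1, 1]), b)
    return (a, b + random.choice([-1, 1]))

# Evolutionary loop: propose, evaluate, keep the best so far.
best = (0, 0)
best_score = evaluate(make_program(*best))
for _ in range(200):
    cand = mutate(best)
    score = evaluate(make_program(*cand))
    if score >= best_score:
        best, best_score = cand, score

print(best, best_score)
```

The real system maintains a database of diverse programs and prompts the LLM with high-scoring examples to propose improved code, but the propose-evaluate-select skeleton is the same.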

2. TBA


17:00 - 18:00 - Networking Session



Contact: Omar Costilla-Reyes, costilla@mit.edu