
Previous Talks


(10/28/2024) Speaker: Qianqian Xie

Yale University

Title
Me-LLaMA: Medical Foundation Large Language Models for Comprehensive Text Analysis and Beyond
Abstract
Recent advancements in large language models (LLMs) like ChatGPT and LLaMA have shown promise in medical applications, though their performance in medical language understanding still requires enhancement. In this talk, I will present our work Me-LLaMA, a suite of foundational medical LLMs built by training open-source LLaMA models with large-scale, domain-specific datasets to enhance their efficacy across a variety of medical text analysis tasks. Me-LLaMA utilized the largest and most comprehensive medical data, including 129B pre-training tokens and 214K instruction tuning samples from diverse biomedical and clinical data sources. Training the 70B models required substantial computational resources, exceeding 100,000 A100 GPU hours. We applied Me-LLaMA to six medical text analysis tasks and evaluated its performance on 12 benchmark datasets. Me-LLaMA models outperform LLaMA and other existing open-source medical LLMs in both zero-shot and supervised learning settings for most text analysis tasks. With task-specific instruction tuning, Me-LLaMA models also surpass leading commercial LLMs, outperforming ChatGPT on 7 out of 8 datasets and GPT-4 on 5 out of 8 datasets. Me-LLaMA models are now publicly available through appropriate user agreements, making them a valuable resource for advancing medical AI applications.
Bio
Dr. Qianqian Xie is an Associate Research Scientist at the Department of Biomedical Informatics and Data Science, School of Medicine, Yale University. Her research interests are natural language processing and its application in medicine. She recently received the National Institutes of Health (NIH), National Library of Medicine (NLM) Pathway to Independence Award (K99/R00). She has co-authored more than 50 peer-reviewed publications. Her research has been published in leading conferences and journals, such as NeurIPS, ACL, KDD, SIGIR, EMNLP, NAACL, COLING, BioNLP, WWW, ICDM, TOIS, TKDE, IEEE JBHI, Bioinformatics, JBI, and Nature Communications, among others.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(10/21/2024) Speaker: Waqas Sultani

Information Technology University

Title
Large Scale Multi-Microscope Datasets and their Challenges
Abstract
Each year, approximately 226 million malaria cases are reported across 87 countries, with 425,600 resulting in fatalities. In 2019, 67% of these deaths were children under five. Similarly, according to GLOBOCAN 2020, leukemia is a leading cause of cancer-related deaths among individuals under 39, particularly children, accounting for 2.5% of all cancer cases with an estimated 474,519 annual incidences. Early detection through microscopic analysis of peripheral blood smears can save lives in both diseases, but this process is resource-intensive, requiring costly microscopes and skilled professionals. Additionally, many countries face a significant shortage of doctors, making it even more challenging to provide timely diagnosis. To address the subjectivity of diagnoses and the shortage of medical experts, we have developed large-scale, multi-microscope, multi-resolution Malaria and Leukemia datasets. These datasets include paired images across different microscopes and resolutions, enabling more robust model training. We have also evaluated several state-of-the-art object detectors, introduced few-shot domain adaptation techniques, and proposed partially supervised domain adaptation and detailed attribute detection methods to enhance explainability. We believe that our publicly available datasets and proposed methods will support further research and innovation in this critical area.
Bio
Waqas Sultani is an Assistant Professor at Information Technology University (ITU) and a member of the Intelligent Machines Lab. His main areas of research are Computer Vision and Deep Learning. His work has been published in respected computer vision, machine learning, robotics, and remote sensing venues, such as CVPR, AAAI, ICRA, IJCV, MICCAI, ISPRS-JPRS, IEEE Trans. ITS, and PAMI. He was awarded the Google Research Scholar Award (2023), as a PI, for collecting a large-scale dataset for low-cost cancer detection. In 2019, he was awarded the Facebook Computer Vision for Global Challenge (CV4GC) research award for designing low-cost solutions for efficient malaria detection. He is a strong advocate of academia-industry collaboration and recently spent one year at the computer vision company HazenAI as a Principal Machine Learning Engineer. Before joining ITU in 2017, Waqas Sultani obtained a doctorate from the University of Central Florida under Prof. Mubarak Shah, an MS from Seoul National University under Prof. Jin Young Choi, and a BSc from U.E.T. Taxila. Currently, he is the head of the medical AI research group at ITU.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(10/14/2024) Speaker: Yanwu Xu

Snapchat

Title
MedSyn: A Prompt-Driven Anatomy Aware Generative Model for 3D Medical Imaging
Abstract
MedSyn introduces an innovative methodology for producing high-quality 3D lung CT images guided by textual information. While diffusion-based generative models are increasingly used in medical imaging, current state-of-the-art approaches are limited to low-resolution outputs and underutilize the abundant information in radiology reports. Radiology reports can enhance the generation process by providing additional guidance and offering fine-grained control over the synthesis of images. Nevertheless, expanding text-guided generation to high-resolution 3D images poses significant challenges in memory usage and in preserving anatomical detail. To address the memory issue, we introduce a hierarchical scheme that uses a modified UNet architecture. We start by synthesizing low-resolution images conditioned on the text, which serve as a foundation for subsequent generators that produce the complete volumetric data. To ensure the anatomical plausibility of the generated samples, we provide further guidance by generating vascular, airway, and lobular segmentation masks in conjunction with the CT images. The model can generate synthesized images guided by both textual input and segmentation tasks. Algorithmic comparative assessments and blind evaluations conducted by 10 board-certified radiologists indicate that our approach exhibits superior performance compared to the most advanced models based on GAN and diffusion techniques, especially in accurately retaining crucial anatomical features such as fissure lines and airways. This innovation opens up novel possibilities. The study focuses on two main objectives: (1) the development of a method for creating images based on textual prompts and anatomical components, and (2) the capability to generate new images conditioned on anatomical elements. These advancements in image generation can be applied to enhance numerous downstream tasks.
Bio
Yanwu Xu is a research scientist at Snapchat, where he works on cutting-edge technologies in the field of artificial intelligence and machine learning. He earned his PhD from Boston University under the guidance of Prof. Kayhan Batmanghelich. During his doctoral studies, he developed a high-quality text-to-image generation model specifically designed for 3D lung CT scans. Additionally, in collaboration with Google, he contributed to the creation of ultra-fast, one-step diffusion models. His work bridges the fields of medical imaging and generative AI, pushing the boundaries of what's possible in both healthcare and creative AI applications.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(10/7/2024) Speaker: Yan Hu

University of Texas Health Science Center at Houston

Title
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering
Abstract
Objective: This study quantifies the capabilities of GPT-3.5 and GPT-4 for clinical named entity recognition (NER) tasks and proposes task-specific prompts to improve their performance. Materials and Methods: We evaluated these models on two clinical NER tasks: (1) extracting medical problems, treatments, and tests from clinical notes in the MTSamples corpus, following the 2010 i2b2 concept extraction shared task, and (2) identifying nervous system disorder-related adverse events from safety reports in the Vaccine Adverse Event Reporting System (VAERS). To improve the GPT models' performance, we developed a clinical task-specific prompt framework that includes (1) baseline prompts with task description and format specification, (2) annotation guideline-based prompts, (3) error analysis-based instructions, and (4) annotated samples for few-shot learning. We assessed each prompt's effectiveness and compared the models to BioClinicalBERT. Results: Using baseline prompts, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.634 and 0.804 on MTSamples, and 0.301 and 0.593 on VAERS. Additional prompt components consistently improved model performance. When all four components were used, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.794 and 0.861 on MTSamples and 0.676 and 0.736 on VAERS, demonstrating the effectiveness of our prompt framework. Although these results trail BioClinicalBERT (F1 of 0.901 on MTSamples and 0.802 on VAERS), they are very promising given that few training samples are needed. Conclusion: While direct application of GPT models to clinical NER tasks falls short of optimal performance, our task-specific prompt framework, incorporating medical knowledge and training samples, significantly enhances the GPT models' feasibility for potential clinical applications.
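As a rough illustration of how such a layered prompt might be assembled, the sketch below composes the four components described above into a single prompt string. It is a minimal, hypothetical example: the `build_ner_prompt` helper, the output format, and the placeholder texts are assumptions for illustration, not the authors' released code.

```python
# Minimal sketch of a task-specific clinical NER prompt with four optional layers:
# baseline task description + format spec, annotation guidelines, error-analysis
# notes, and few-shot examples. All names and texts here are illustrative only.
from typing import List, Optional, Tuple


def build_ner_prompt(note: str,
                     guidelines: str = "",
                     error_notes: str = "",
                     examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Compose a clinical NER prompt from the four components."""
    parts = [
        "Extract all medical problems, treatments, and tests from the clinical note.",
        "Return one entity per line in the format: <entity text> | <entity type>.",
    ]
    if guidelines:
        parts.append("Annotation guidelines:\n" + guidelines)
    if error_notes:
        parts.append("Common errors to avoid:\n" + error_notes)
    for text, annotation in (examples or []):
        parts.append(f"Example note:\n{text}\nExample output:\n{annotation}")
    parts.append("Clinical note:\n" + note)
    return "\n\n".join(parts)


# Toy usage; the resulting string would be sent to a GPT-style model.
print(build_ner_prompt("Patient started on metformin for type 2 diabetes."))
```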
Bio
Yan Hu is a 4th year PhD student at the University of Texas Health Science Center at Houston, specializing in Natural Language Processing (NLP) in the biomedical domain. Yan’s current research focuses on leveraging and optimizing large language models (LLMs) for clinical applications, with a particular emphasis on clinical information extraction.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(9/23/2024 - 9/30/2024) Speaker: No session for two weeks -- Fall break!


(9/16/2024) Speaker: Shantanu Ghosh

Boston University

Title
Divide and Conquer: Carving Out Concept-based Models out of BlackBox for More Efficient Transfer Learning
Abstract
Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Networks (NN), often treated as blackboxes, suffer even under a slight shift in input distribution (e.g., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this work, we develop a concept-based interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of the NN to be approximately domain-invariant. Concept-based model design either starts from an interpretable-by-design architecture or takes a post-hoc approach starting from a Blackbox. Blackbox models are flexible but difficult to explain, while interpretable-by-design models are inherently explainable. Yet, interpretable models require extensive machine learning knowledge and tend to be less flexible, potentially underperforming their Blackbox equivalents. My research aims to blur the distinction between post-hoc explanation of a Blackbox and the construction of interpretable models. In the first part of the talk, beginning with a Blackbox, we iteratively carve out a mixture of concept-based interpretable models and a residual network. The interpretable models identify a subset of samples and explain them using First Order Logic (FOL), providing basic reasoning on concepts from the Blackbox. We route the remaining samples through a flexible residual network. We repeat the method on the residual network until the interpretable models together explain the desired proportion of data. In the second part of my talk, I will discuss an algorithm to transfer the interpretable models from a source domain to an unseen target domain with minimal training data and computational cost.
Bio
Shantanu Ghosh is a PhD candidate in Electrical Engineering at Boston University, advised by Prof. Kayhan Batmanghelich. He completed his master's degree in Computer Science from the University of Florida under the supervision of Dr. Mattia Prosperi. His research interest lies in representation learning across computer vision and medical imaging, focusing on interpretability and explainable AI. Specifically, he investigates the representations learned across different modalities, architectures, and training strategies to enhance their generalizability, robustness, and trust.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(9/9/2024) Speaker: Mihir Parmar

Arizona State University

Title
Role of Instruction-Tuning and Prompt Engineering in Clinical Domain
Abstract
In this talk, I will discuss the pivotal role of instruction-tuning and prompt engineering in advancing Clinical NLP. I will cover how our In-BoXBART leverages instruction-tuning to improve performance across multiple biomedical tasks, and how a collaborative LLM framework enhances the efficiency and accuracy of systematic reviews in oncology. These studies collectively demonstrate how these NLP techniques can optimize clinical processes and evidence-based practices.
Bio
Mihir is a Ph.D. student at Arizona State University and a Research Associate at Mayo Clinic. His research has been published in top-tier NLP conferences such as ACL, EMNLP, NAACL, and EACL, where he received the 'Outstanding Paper Award' at EACL 2023. His work focuses on pioneering instruction-tuning in the biomedical domain, analyzing the impact of various instructions on model performance, and exploring LLMs' capabilities in question decomposition, program synthesis, and reasoning. Additionally, he has industry experience as a research scientist intern at Adobe Research (Summer 2023) and the AI Innovation Lab at Novartis (Summer 2022).
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(8/26/2024 - 9/3/2024) Speaker: No session for two weeks -- Fall break!


(8/19/2024) Speaker: Rahul Thapa

Stanford University

Title
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Abstract
Sleep is a complex physiological process evaluated through various modalities recording electrical brain, cardiac, and respiratory activities. We curate a large polysomnography dataset from over 14,000 participants comprising over 100,000 hours of multi-modal sleep recordings. Leveraging this extensive dataset, we developed SleepFM, the first multi-modal foundation model for sleep analysis. We show that a novel leave-one-out approach for contrastive learning significantly improves downstream task performance compared to representations from standard pairwise contrastive learning. A logistic regression model trained on SleepFM's learned embeddings outperforms an end-to-end trained convolutional neural network (CNN) on sleep stage classification (macro AUROC 0.88 vs 0.72 and macro AUPRC 0.72 vs 0.48) and sleep disordered breathing detection (AUROC 0.85 vs 0.69 and AUPRC 0.77 vs 0.61). Notably, the learned embeddings achieve 48% top-1 average accuracy in retrieving the corresponding recording clips of other modalities from 90,000 candidates. This work demonstrates the value of holistic multi-modal sleep modeling to fully capture the richness of sleep recordings.
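To make the leave-one-out idea concrete, here is a minimal PyTorch sketch of one plausible form of such an objective, in which each modality's clip embedding is contrasted against the mean of the other modalities' embeddings for the same clip. This is an illustrative simplification of the description above, not SleepFM's released training code; the function name and temperature are assumptions.

```python
# Sketch of a leave-one-out contrastive loss across several physiological modalities.
from typing import List

import torch
import torch.nn.functional as F


def leave_one_out_contrastive(embs: List[torch.Tensor], temperature: float = 0.1) -> torch.Tensor:
    """embs: one [batch, dim] embedding tensor per modality (e.g., EEG, ECG, respiration)."""
    loss = 0.0
    for i, anchor in enumerate(embs):
        # The mean of the remaining modalities serves as the leave-one-out target.
        others = torch.stack([e for j, e in enumerate(embs) if j != i]).mean(dim=0)
        a = F.normalize(anchor, dim=-1)
        b = F.normalize(others, dim=-1)
        logits = a @ b.t() / temperature        # [batch, batch] similarity matrix
        labels = torch.arange(a.size(0))        # matching clips lie on the diagonal
        loss = loss + F.cross_entropy(logits, labels)
    return loss / len(embs)


# Toy usage: three modalities, a batch of 4 clips, 128-dim embeddings.
loss = leave_one_out_contrastive([torch.randn(4, 128) for _ in range(3)])
```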
Bio
Rahul is a second-year PhD student at Stanford University, working under the guidance of Dr. James Zou, with a primary focus on AI in biomedicine. Currently, he is also a research intern at Together AI, where he concentrates on multimodal models. Rahul's research interests lie in the development and application of multimodal models. He has experience working with sleep signal data and biomedical image, text, and video data.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(8/12/2024) Speaker: Hassan Mohy-ud-Din

School of Science and Engineering, LUMS

Title
Towards Robust Radiomics and Radiogenomics Predictive Models for Brain Tumor Characterization
Abstract
In the context of brain tumor characterization, we focused on two key questions which, to the best of our knowledge, have not been explored so far: (a) stability of radiomics features to variability in multiregional segmentation masks obtained with fully-automatic deep segmentation methods and (b) subsequent impact on predictive performance on downstream prediction tasks. The hypothesis is that highly stable and discriminatory radiomics features lead to generalizable radio(geno)mics models for brain tumor characterization.
Bio
Dr. Hassan Mohy-ud-Din is the Director of the Algorithms in Theory and Practice Lab and an Assistant Professor of Electrical Engineering at the Syed Babar Ali School of Science and Engineering, LUMS. He completed his PhD and MSE in Electrical and Computer Engineering and MA in Applied Mathematics and Statistics at Johns Hopkins University (2009 – 2015). From 2015 – 2017 he was a postdoctoral associate in the Department of Radiology and Biomedical Imaging at the Yale School of Medicine. From 2017 – 2018 he was a Clinical Research Scientist at Shaukat Khanum Memorial Cancer Hospital and Research Centre, Lahore, Pakistan. His research is at the intersection of applied mathematics and clinical imaging, exploiting tools in machine learning, optimization, statistics, and information theory to develop novel algorithms for clinical and translational imaging. He has done extensive research in multimodality imaging, including PET/CT, SPECT/CT, PET/MR, low-dose CT, and multiparametric MRI, and has developed computational pipelines for brain imaging, cardiac imaging, and abdominal imaging. His work on dynamic cardiac PET imaging won the 2014 SNMMI Bradley-Alavi fellowship and the 2014 SIAM student award. He is also a recipient of the 2019 Charles Wallace Fellowship from the British Council, Pakistan. He has over fifteen years of university teaching experience (including five years at Johns Hopkins University).
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(8/5/2024) Speaker: Xiaohan Xing

Stanford University

Title
Integrating Pathology Images and Genomics Data for Cancer Grading
Abstract
In recent years, Artificial Intelligence (AI) technology has been widely applied to the analysis of multi-modal biomedical data, revolutionizing healthcare. Integrating morphological information from pathology slides with molecular information from genomics data enhances cancer grading accuracy and reliability. However, this integration faces three challenges: the modality gap, modality imbalance, and missing modalities. To address these, we developed a low-rank constraint-based method to bridge the gap between different modalities and capture cross-modal complementary information, a saliency-aware masking strategy to balance the contributions of different modalities, and a knowledge distillation framework to improve model performance when genomics data is missing. We validated the effectiveness of these methods on the TCGA GBMLGG dataset.
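As a rough illustration of the knowledge distillation component mentioned above, the sketch below shows one common formulation in which an image-only student is trained to match a frozen multimodal (pathology + genomics) teacher. The loss form, temperature, and weighting are generic assumptions for illustration, not the authors' implementation.

```python
# Sketch of distilling a multimodal teacher into an image-only student so grading
# still works when genomics data is missing at inference time.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T: float = 2.0, alpha: float = 0.5):
    """Soft KL term against the frozen teacher plus cross-entropy on grade labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


# Toy usage: 8 samples, 3 tumor grades.
loss = distillation_loss(torch.randn(8, 3), torch.randn(8, 3), torch.randint(0, 3, (8,)))
```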
Bio
Dr. Xiaohan Xing is a postdoctoral researcher at Stanford University, having obtained her Ph.D. from The Chinese University of Hong Kong in 2021. She has been extensively involved in research related to medical AI, focusing on disease diagnosis and survival prediction based on medical images, omics data, and the integration of multimodal data. To date, she has published more than 20 papers in top-tier journals and conferences in the field. Among these, she is the first author of 12 papers in leading journals such as Proceedings of the IEEE, TMI, Medical Image Analysis, and Bioinformatics, and prominent conferences like MICCAI and BIBM. Her research contributions have gained worldwide recognition, and she was honored with the MICCAI Young Scientist Award in 2022.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(7/29/2024) Speaker: Sarah Volinsky

Leiden University Medical Center

Title
HECTOR, a multimodal deep learning model predicting recurrence risk in endometrial cancer
Abstract
Predicting distant recurrence of endometrial cancer (EC) is crucial for personalized adjuvant treatment. The current gold standard of combined pathological and molecular profiling is costly, hampering implementation. Here we developed HECTOR (histopathology-based endometrial cancer tailored outcome risk), a multimodal deep learning prognostic model using hematoxylin and eosin-stained, whole-slide images and tumor stage as input, on 2,072 patients from eight EC cohorts including the PORTEC-1/-2/-3 randomized trials. HECTOR demonstrated C-indices in internal (n = 353) and two external (n = 160 and n = 151) test sets of 0.789, 0.828 and 0.815, respectively, outperforming the current gold standard, and identified patients with markedly different outcomes (10-year distant recurrence-free probabilities of 97.0%, 77.7% and 58.1% for HECTOR low-, intermediate- and high-risk groups, respectively, by Kaplan–Meier analysis). HECTOR can thus help deliver personalized treatment in EC.
Bio
Sarah Volinsky is a deep learning engineer with specific expertise in computational pathology. She is completing her 4-year PhD research at the Leiden University Medical Center in the Netherlands, where she is the principal junior researcher of the AIRMEC team, a joint collaboration with Dr. Tjalling Bosse and Dr. Nanda Horeweg at the Leiden University Medical Center and Prof. Viktor Koelzer at the University Hospital of Basel. As part of her PhD research, she has been developing deep learning models using histology images, outcomes, and genomic data, specifically in the domain of endometrial cancer, for predicting molecular alterations and outcomes. This work has led to two first-author publications, in The Lancet Digital Health (2023) and Nature Medicine (2024), and platform presentations at several international conferences, including USCAP and AACR. Prior to this, she worked in the UK as a machine learning engineer after completing her second MSc, in Data Science, in London. She also holds an MSc in financial engineering and a bachelor's degree in mathematics from Paris.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(7/22/2024) Speaker: Ege Ozsoy

Technical University of Munich

Title
Holistic OR Domain Modeling with Large Vision Language Models
Abstract
The operating room (OR) is an intricate environment involving diverse medical staff, patients, devices, and their interactions. Traditionally, only skilled medical professionals can navigate and comprehend these complex dynamics. This talk introduces an innovative approach towards automated, comprehensive modeling of the OR domain using Large Vision Language Models (LVLMs). By leveraging semantic scene graphs (SSG) and state-of-the-art vision-language integration, we aim to achieve a holistic understanding and representation of surgical environments. Our approach involves creating and utilizing the first open-source 4D-OR dataset, capturing simulated surgeries with RGB-D sensors. This dataset, enriched with annotations for SSGs, human and object poses, clinical roles, and surgical phases, enables advanced semantic reasoning. We will discuss our neural network-based SSG generation pipeline and its successful application to clinical role prediction and surgical phase recognition tasks, showcasing its potential to enhance decision-making and patient safety during surgical procedures. Furthermore, I will highlight the significant strides made in Knowledge Guidance using LVLMs in OR modeling and the transformative impact it promises for the future of surgical data analysis.
Bio
I am currently a third-year PhD student at the Technical University of Munich (TUM) in the Chair of Computer Aided Medical Procedures (CAMP). I completed both my bachelor’s and master’s degrees in computer science at TUM. Towards the end of my master’s, I began focusing on Holistic OR Domain Modeling at CAMP. My research centers on achieving a comprehensive understanding and modeling of the entire operating room using semantic scene graphs and large vision language models.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(7/15/2024) Speaker: Aisha Urooj

Mayo Clinic

Title
AI-Driven Advancements in Mammogram Analysis
Abstract
Adapting AI models for healthcare applications presents significant challenges, such as domain misalignment, limited access to extensive datasets, and highly imbalanced classes. Hence, there is a pressing need to develop a corresponding proficiency in adapting the advancements in AI to the medical domain. Such adaptations would prove immensely valuable in healthcare applications to extract clinical insights and actionable information. In this talk, I will discuss my research efforts at the Mayo Clinic to design and validate AI models for use cases in mammogram analysis, particularly in the context of cancer screening and risk assessment of adverse cardiovascular events among women.
Bio
Aisha Urooj is a Research Fellow at the Arizona Advanced AI & Innovation (AI3) Hub, Mayo Clinic, AZ. She received her Ph.D. from the Center for Research in Computer Vision (CRCV) at the University of Central Florida, where she was advised by Dr. Mubarak Shah and Dr. Niels Da Vitoria Lobo. She also worked at the MIT-IBM Watson AI Lab as a Research Scientist Intern during the summers of 2020 and 2021. Her research interests include multimodal understanding tasks involving language and vision, with a focus on representation learning, visual question answering, and visual grounding. Her recent work is focused on learning joint representations from imaging and clinical reports and developing efficient AI systems to assist radiologists in their diagnosis.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(7/8/2024) Speaker: Avisha Das

Mayo Clinic

Title
Framework for Exposing Vulnerabilities of Clinical Large Language Model: A Case Study in Breast Cancer
Abstract
Large language models (LLMs) with billions of parameters, trained on massive amounts of crowdsourced public data, have made a dramatic impact on natural language processing (NLP) tasks. Domain-specific fine-tuning of LLMs has further improved model behavior through task-specific alignment and refinement. However, with the widespread development and deployment of LLMs, existing vulnerabilities in these models have opened the way for perpetrators to manipulate them for malicious purposes. The widespread integration of LLMs in clinical NLP underscores the looming threat of privacy leakage and targeted attacks like prompt injection or data poisoning. In this work, we designed a systematic framework to expose vulnerabilities of clinical generative language models, with a specific emphasis on their application to clinical notes. We design three attack pipelines to highlight the models' susceptibility to core types of targeted attacks: (i) instruction-based data poisoning, (ii) trigger-based model editing, and (iii) membership inference on de-identified breast cancer clinical notes. Our proposed framework is the first work to investigate the extent of LLM-based attacks in the clinical domain. Our findings reveal successful manipulation of LLM behavior, prompting concerns about the stealthiness and effectiveness of such attacks. Through this work, we hope to emphasize the urgency of understanding these vulnerabilities in LLMs and encourage the mindful and responsible usage of LLMs in the clinical domain.
Bio
I am currently a Research Fellow with the Arizona Advanced AI & Innovation (A3I) Hub, Mayo Clinic, AZ. I have previously worked as a Postdoc at the University of Texas Health Science Center, Houston. My research interests lie in Large Language Modeling and Language Understanding. I am currently working on exploring LLM vulnerabilities in the Biomedical and Clinical Domain. I have previously worked on Biomedical Knowledge Mining and Retrieval as well as Empathetic Conversational Agents for Mental Health Support.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(6/17/2024 - 7/01/2024) Speaker: Summer break - No MedAI session


(6/10/2024) Speaker: Mingjie Li

Stanford University

Title
Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation
Abstract
In the realm of medical imaging, automatic radiology reporting has emerged as a crucial tool to alleviate the heavy workloads faced by radiologists and enhance the interpretation of diagnoses. Traditional approaches have augmented data-driven neural networks with static medical knowledge graphs to tackle the inherent visual and textual biases in this task. However, these fixed graphs, constructed from general knowledge, often fail to capture the most relevant and specific clinical knowledge, limiting their effectiveness. In this talk, Dr. Li will introduce his innovative approach to dynamic graph enhanced contrastive learning for chest X-ray report generation, termed Dynamic Contrastive Learning (DCL). This method constructs an initial knowledge graph from general medical knowledge and dynamically updates it by incorporating specific knowledge extracted from retrieved reports during the training process. This adaptive graph structure allows each image feature to be integrated with an updated and contextually relevant graph, enhancing the quality of the generated reports. Key components of this approach include the introduction of Image-Report Contrastive and Image-Report Matching losses, which improve the representation of visual features and textual information. He will present the results of the proposed method evaluated on the IU-Xray and MIMIC-CXR datasets, demonstrating its superior performance compared to existing state-of-the-art models. This advancement holds significant promise for improving the accuracy and efficiency of automatic radiology reporting, ultimately contributing to better clinical outcomes.
Bio
Dr. Mingjie Li is a postdoctoral researcher in the Department of Radiation Oncology at Stanford University, working under the mentorship of Professor Lei Xing. He earned his Ph.D. in Computer Science from the University of Technology Sydney. His research focuses on medical multi-modal tasks, with a particular interest in medical report generation and medical multi-modal representation learning. Dr. Li has publications in top-tier venues such as T-PAMI, CVPR, NeurIPS, and TIP. In addition to his research, he serves as a reviewer for several prestigious conferences and journals, including CVPR, NeurIPS, ACM MM, ACL, TMI, and TCSVT. His work aims to advance the integration of artificial intelligence in medical imaging and improve clinical outcomes through innovative methodologies.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(6/3/2024) Speaker: Arko Barman

Rice University

Title
Deep Symmetry-sensitive networks for detecting brain diseases
Abstract
Spatial symmetry is commonly used by clinicians in the diagnosis and prognosis of diseases involving multiple organs such as the brain, prostate, breasts, and lungs. Anomalies in symmetry can be indicative of patient-specific disease-related features that are less sensitive to inter-patient variability. However, quantifying these symmetric anomalies is challenging as the symmetries in the human body are not exact mirrored copies. We will explore the design and development of a novel deep learning architectural paradigm, named Deep Symmetry-sensitive Network (DeepSymNet), capable of learning anomalies in symmetry from minimally processed 2D or 3D radiological images. DeepSymNet was evaluated for detection of Large Vessel Occlusion (LVO) in the brain for ischemic stroke patients (using 3D CT angiography images). DeepSymNet is less sensitive to noise, rotation and translation compared to a ‘symmetry-naive’ deep convolutional neural network (CNN) with a similar number of parameters. Finally, an interpretation of the decision-making process of the DeepSymNet model using activation maps is presented. An expanded DeepSymNet (e-DeepSymNet) architecture also explores the combination of symmetry-sensitive and symmetry-naive feature representations for the detection of brain hemorrhage. Other applications of DeepSymNet in radiological image analysis, such as quantifying the neurodegeneration in Alzheimer's disease using MRI, will be briefly discussed.
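For intuition only, the sketch below shows one generic way a network can be made sensitive to left-right symmetry: a shared encoder processes an image and its mirrored copy, and the prediction is driven by the difference between the two feature vectors. This toy model is an assumption for illustration and is not the published DeepSymNet architecture.

```python
# Toy symmetry-sensitive classifier: shared encoder on the image and its mirror,
# prediction from the absolute feature difference (i.e., learned asymmetry cues).
import torch
import torch.nn as nn


class SymmetryNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(              # shared weights for both orientations
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mirrored = torch.flip(x, dims=[3])         # flip along the width axis (left-right)
        diff = torch.abs(self.encoder(x) - self.encoder(mirrored))
        return self.head(diff)                     # asymmetry features drive the prediction


logits = SymmetryNet()(torch.randn(2, 1, 128, 128))   # toy 2D slices
```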
Bio
Dr. Arko Barman is an Assistant Teaching Professor in the Data to Knowledge Lab, the ECE department, and the Statistics department at Rice University. He received his Ph.D. in Computer Science at the University of Houston, his Master's in Signal Processing at the Indian Institute of Science, and his Bachelor's in Electrical Engineering at Jadavpur University. Dr. Barman previously worked at the Palo Alto Research Center, UTHealth, and Broadcom Inc. before joining Rice. His research encompasses AI, machine learning, and computer vision and their applications in biomedicine, video surveillance, ecological conservation, and the social sciences. He is also actively involved in data science education research and has published, presented, and served on the organizing committees of engineering and computer science education conferences such as ACM SIGCSE and IEEE Frontiers in Engineering. Additionally, he serves as the director of the data science capstone program at Rice, which brings together industry, healthcare, and non-profit organizations with Rice students to solve real-world problems using machine learning and AI.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(5/27/2024) Speaker: No MedAI session (Memorial Day)


(5/20/2024) Speaker: Kaishuai Xu

Hong Kong Polytechnic University

Title
Behave like a Doctor: Clinical Process-Aware Medical Dialogue System
Abstract
Medical dialogue systems (MDS) aim to provide patients with medical services, such as diagnosis and prescription, and have attracted significant attention for their potential to act as medical assistants. Many past and present studies in medical dialogue have focused on improving the accuracy and relevance of single-turn responses, often overlooking the learning of the entire medical dialogue or interview process. While the application of large medical language models has made generating high-quality, contextually relevant responses less challenging, most of these models still rely on QA-style single-turn responses and lack the ability to construct a genuine medical dialogue process. Our work aims to build a medical dialogue system that better aligns with real medical dialogue processes, real diagnostic processes, and real doctors’ thought processes.
Bio
Kaishuai Xu is a third-year PhD student in the Department of Computing at the Hong Kong Polytechnic University, supervised by Prof. Wenjie Li. His research focuses on developing reliable and trustworthy medical applications. His interests include Medical Dialogue Systems, Radiology Report Generation, and other applications powered by Large Language Models (LLMs). He has published several papers in leading computational linguistics conferences, including ACL and EMNLP. More information can be found on his personal website: kaishxu.github.io.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(5/13/2024) Speaker: Aimon Rahman

Johns Hopkins University

Title
Ambiguous medical image segmentation using diffusion models
Abstract
Collective insights from a group of experts have always proven to outperform an individual's best diagnostic for clinical tasks. For the task of medical image segmentation, existing research on AI-based alternatives focuses more on developing models that can imitate the best individual rather than harnessing the power of expert groups. In this paper, we introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights. Our proposed model generates a distribution of segmentation masks by leveraging the inherent stochastic sampling process of diffusion using only minimal additional learning. We demonstrate on three different medical image modalities (CT, ultrasound, and MRI) that our model is capable of producing several possible variants while capturing the frequencies of their occurrence. Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks in terms of accuracy while preserving naturally occurring variation. We also propose a new metric that evaluates the diversity as well as the accuracy of segmentation predictions, aligning with the clinical interest in collective insights.
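The core sampling idea described above can be illustrated with a short sketch: repeatedly sample a trained segmentation diffusion model for the same image and summarize the resulting masks as a per-pixel frequency map. The `diffusion_model.sample` interface below is hypothetical and stands in for whichever sampler is actually used.

```python
# Sketch: draw several stochastic mask samples for one image and measure how often
# each pixel is marked, giving a simple picture of the learned mask distribution.
import torch


def sample_mask_distribution(diffusion_model, image: torch.Tensor, n_samples: int = 8):
    """Return n binary masks and their per-pixel agreement frequency for one image."""
    masks = []
    for _ in range(n_samples):
        # Each call starts from fresh noise, so repeated sampling yields plausible variants.
        mask = diffusion_model.sample(condition=image)   # hypothetical API
        masks.append((mask > 0.5).float())
    masks = torch.stack(masks)                  # [n_samples, H, W]
    frequency = masks.mean(dim=0)               # fraction of samples marking each pixel
    return masks, frequency
```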
Bio
Aimon Rahman is a third-year PhD student in the Department of Electrical and Computer Engineering at Johns Hopkins University under the supervision of Dr. Vishal M. Patel. Her research lies at the intersection of computer vision and medical image analysis, with a focus on developing deep learning techniques to make healthcare more affordable and accessible globally. Her specific research interests include 2D/3D segmentations, generative networks, representation learning, and addressing bias/ambiguity in medical image problems. She is also open to exploring general vision problems, particularly in the areas of generative networks and representation learning. She has a track record of first-author publications in conferences, such as MICCAI, MIDL and CVPR. (Personal Website: https://aimansnigdha.github.io/).
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(5/06/2024) Speaker: Kyle Swanson

Stanford University

Title
Generative AI for designing and validating easily synthesizable and structurally novel antibiotics
Abstract
Generative AI methods are a promising approach to design new drug candidates, but they often create molecules that are difficult to synthesize, limiting their usefulness in real-world drug discovery. To overcome this limitation, we recently introduced SyntheMol, a generative AI method that exclusively designs easy-to-synthesize molecules from a chemical space of nearly 30 billion molecules. In this talk, I will describe the SyntheMol algorithm as well as how we applied SyntheMol to design, synthesize, and validate molecules that successfully target the bacterium Acinetobacter baumannii.
Bio
Kyle Swanson is a 3rd year PhD student in computer science at Stanford University advised by Prof. James Zou. His research focuses on developing AI methods for drug discovery and biomedicine, with a particular emphasis on bridging the gap between computational methods and wet lab validation. Previously at MIT (BS, MEng), he worked with Prof. Regina Barzilay to develop Chemprop, a graph neural network for molecular property prediction that enabled the discovery of Halicin, one of the first antibiotic candidates identified by AI. Kyle also studied at the University of Cambridge (MASt) and Imperial College London (MSc) as a Marshall Scholar and currently studies at Stanford as a Knight-Hennessy scholar.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(4/29/2024) Speaker: No MedAI session


(4/22/2024) Speaker: Bin Wang

Northwestern University

Title
Radiologist-centered AI with Eye Tracking Techniques
Abstract
Although artificial intelligence (AI) based computer-aided diagnosis systems have been shown to be useful in medical image analysis, current deep learning methods still suffer from (1) challenging lesion localization, (2) inefficient clinical workflows, and (3) a lack of expert knowledge. Eye tracking research is important in computer vision because it can help us understand how humans interact with the visual world. Specifically for high-risk applications, such as medical imaging, eye tracking can help us comprehend how radiologists and other medical professionals search, analyze, and interpret images for diagnostic and clinical purposes. In this study, we investigate how to apply eye tracking techniques in real clinical practice to build a time-efficient, robust, and radiologist-centered computer-aided diagnosis system.
Bio
Bin Wang is a second-year PhD student in the Department of Electrical and Computer Engineering at Northwestern University under the supervision of Prof. Ulas Bagci. He is mainly working on developing Human-centered AI for medical image analysis. His research interests include Eye Tracking, Multi-Modal Foundational Models, and Human-Computer Interaction. He has over ten papers published in leading machine learning conferences and journals, including MICCAI, WACV, ICASSP, CVPR and NeurIPS workshops, and Medical Image Analysis. (Personal Website: ukaukaaaa.github.io)
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(4/15/2024) Speaker: Syed Muhammad Anwar

George Washington University, School of Medicine

Title
Self-supervised learning for chest x-ray analysis
Abstract
Chest X-Ray (CXR) is a widely used clinical imaging modality and has a pivotal role in the diagnosis and prognosis of various lung and heart related conditions. Conventional automated clinical diagnostic tool design strategies, relying on radiology reads and supervised learning, entail the cumbersome requirement of high-quality annotated training data. To address this challenge, self-supervised pre-training has proven to outperform supervised pre-training in numerous downstream vision tasks, representing a significant breakthrough in the field. However, medical imaging pre-training significantly differs from pre-training with natural images (e.g., ImageNet) due to unique attributes of clinical images. In this talk, I will present a self-supervised training paradigm that leverages a student-teacher framework for learning diverse concepts and hence an effective representation of the CXR data. It expands beyond merely modeling a single primary label within an image and instead effectively harnesses the information from all the concepts inherent in the CXR. The pre-trained model is subsequently fine-tuned to address diverse domain-specific tasks. Our proposed paradigm consistently demonstrates robust performance across multiple downstream tasks on multiple datasets, highlighting the success and generalizability of the pre-training strategy. The training strategy has been extended to federated learning (FL), which could alleviate the burden of data sharing and enable patient privacy. I will briefly talk about the privacy landscape of FL and potential data leakage within the FL paradigm.
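For readers unfamiliar with student-teacher self-supervision, the sketch below shows one generic training step from this family of methods: the student matches the teacher's prediction on a differently augmented view, and the teacher's weights track the student via an exponential moving average (EMA). This is an assumption about the general paradigm, not the speaker's exact method or code; the momentum value and tiny toy networks are placeholders.

```python
# Generic student-teacher self-supervised step with an EMA teacher.
import copy

import torch
import torch.nn.functional as F


def student_teacher_step(student, teacher, view1, view2, optimizer, momentum: float = 0.996):
    with torch.no_grad():
        target = F.softmax(teacher(view1), dim=-1)        # teacher sees one augmented view
    loss = F.cross_entropy(student(view2), target)        # student predicts it from another view
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                                 # EMA update of the teacher weights
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1 - momentum)
    return loss.item()


# Toy usage with a tiny shared architecture and random "augmented views".
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 16))
teacher = copy.deepcopy(student)
opt = torch.optim.SGD(student.parameters(), lr=0.1)
student_teacher_step(student, teacher, torch.randn(4, 8, 8), torch.randn(4, 8, 8), opt)
```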
Bio
Dr. Anwar is a principal investigator at Children's National Hospital and an associate professor of Radiology and Pediatrics at the George Washington University School of Medicine and Health Sciences. Within the hospital, he is associated with the Sheikh Zayed Institute (SZI) for Pediatric Surgical Innovation, doing cutting-edge research in surgical planning, treatment, and device innovation. Prior to this, Dr. Anwar was associated with the University of Engineering and Technology, Taxila as a tenured associate professor in the Department of Software Engineering and was a Fulbright Research Fellow at the Center for Research in Computer Vision (CRCV) at the University of Central Florida. CRCV is one of the top-ranked computer vision centers in the world. Dr. Anwar's research interests include developing computational and engineering solutions for healthcare systems that benefit from computer vision, signal processing, and artificial intelligence. He has expertise in a wide range of application areas related to machine learning, image and signal processing, and biomedical engineering.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(4/8/2024) Speaker: Muhammad Shaban

Mass General Brigham and Harvard Medical School

Title
AI-driven fast and accurate cell phenotyping in highly multiplex images
Abstract
Highly multiplexed protein imaging is emerging as a potent technique for analyzing protein distribution within cells and tissues in their native context. However, existing cell annotation methods utilizing high-plex spatial proteomics data are resource intensive and necessitate iterative expert input, thereby constraining their scalability and practicality for extensive datasets. We introduce MAPS (Machine learning for Analysis of Proteomics in Spatial biology), a machine learning approach facilitating rapid and precise cell type identification with human-level accuracy from spatial proteomics data. Validated on multiple in-house and publicly available MIBI and CODEX datasets, MAPS outperforms current annotation techniques in terms of speed and accuracy, achieving pathologist-level precision even for typically challenging cell types, including tumor cells of immune origin. By democratizing rapidly deployable and scalable machine learning annotation, MAPS holds significant potential to expedite advances in tissue biology and disease comprehension.
Bio
Dr. Shaban is a Machine Learning Scientist at the AI for Pathology Image Analysis Lab, associated with Mass General Brigham and Harvard Medical School. This role follows his completion of over two years of post-doctoral research at the same lab. He earned his PhD in Computer Science from the University of Warwick, focusing on medical image analysis. He has over eight years of research experience in computer vision, deep learning, and AI-driven medical image analysis. His primary focus is developing advanced algorithms for critical clinical applications in computational pathology. For further information, please visit his website at www.mshaban.org.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(3/25/2024 to 4/1/2024) Speaker: Spring Break -- No MedAI session


(3/18/2024) Speaker: Che Liu & Zhongwei Wan

Imperial College London and Ohio State University

Title
Towards Fundamental Biomedical AI: Integrating Vision, Language, and Signals
Abstract
This presentation showcases cutting-edge frameworks designed to address key challenges in biomedical AI, focusing on integrating vision-language and signal-language modalities for medical diagnostics. We introduce Med-UniC, an innovative solution for cross-lingual medical vision-language pre-training, effectively reducing community bias across languages to improve performance in medical image analysis. Further, we highlight the Multimodal ECG Representation Learning (MERL) framework, which advances zero-shot ECG classification by leveraging large language models and multimodal learning, and the Multimodal ECG Instruction Tuning (MEIT) framework, designed to automate ECG report generation. Both MEIT and MERL exhibit unparalleled performance in their fields, underlining the significant potential of multimodal and cross-lingual learning in enhancing biomedical AI through the seamless integration of clinical insights and artificial intelligence technologies.
Bio
Che Liu, a 3rd-year PhD student at Imperial College London, is working on multimodal large language models (LLMs) for analyzing both daily world and healthcare data. Zhongwei Wan, a 1st-year student at Ohio State University, focuses on multimodal LLMs and enhancing the efficiency of LLMs.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(3/11/2024) Speaker: Yi Lin

Hong Kong University of Science and Technology

Title
Data Efficient Learning in medical image segmentation
Abstract
Medical image segmentation is a foundational yet arduous undertaking within the realm of medical image analysis. However, manually annotated medical data remain limited and costly to obtain, posing a significant impediment in this field. In this seminar, I will present our recent research on data-efficient learning in medical image segmentation. Our work delves into strategies for maximizing the utility of available data, optimizing annotation efficiency, and enhancing the scalability of deep-learning-based models.
Bio
Yi Lin is a Ph.D. student in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology, under the supervision of Prof. Kwang-Ting (Tim) Cheng and Prof. Hao Chen. He also worked as a research engineer at the Tencent Jarvis Lab. His research interests include label-efficient learning algorithms with human-in-the-loop, and computer-aided-diagnosis systems considering efficiency, scalability, and accessibility. He has published several papers in top-tier conferences and journals, including IEEE TMI, Medical Image Analysis, IEEE JBHI, MICCAI, and IPMI. He also serves as a reviewer for top-tier conferences and journals, including CVPR, ICCV, ECCV, MICCAI, IEEE TPAMI, IEEE TNNLS, IEEE TMI.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(3/4/2024) Speaker: Zeljko Kraljevic

King's College London

Title
Large Language Models as Universal Medical Forecasters
Abstract
Electronic Health Records hold detailed longitudinal information about each patient's health status and general clinical history, a large portion of which is stored within the unstructured text. Existing approaches focus mostly on structured data or a subset of single-domain outcomes. We will explore how temporal modeling of patients from free text and structured data, using large language models fine-tuned on over 1 million patient timelines, can be used to forecast a wide range of future disorders, medications, procedures, and symptoms.
Bio
Zeljko Kraljevic is a researcher in AI for Healthcare at King’s College London working on temporal modeling of patients and disorders using generative transformers.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(2/26/2024) Speaker: Emma Chen

Harvard University

Title
Multimodal Clinical Benchmark for Emergency Care (MC-BEC): A Comprehensive Benchmark for Evaluating Foundation Models in Emergency Medicine
Abstract
We propose the Multimodal Clinical Benchmark for Emergency Care (MC-BEC), a comprehensive benchmark for evaluating foundation models in Emergency Medicine using a dataset of 100K+ continuously monitored Emergency Department visits from 2020-2022. MC-BEC focuses on clinically relevant prediction tasks at timescales from minutes to days, including predicting patient decompensation, disposition, and emergency department (ED) revisit, and includes a standardized evaluation framework with train-test splits and evaluation metrics. The multimodal dataset includes a wide range of detailed clinical data, including triage information, prior diagnoses and medications, continuously measured vital signs, electrocardiogram and photoplethysmograph waveforms, orders placed and medications administered throughout the visit, free-text reports of imaging studies, and information on ED diagnosis, disposition, and subsequent revisits. We provide performance baselines for each prediction task to enable the evaluation of multimodal, multitask models. We believe that MC-BEC will encourage researchers to develop more effective, generalizable, and accessible foundation models for multimodal clinical data.
Bio
Emma Chen is a second-year Ph.D. student in Computer Science at Harvard University, co-advised by Professor Vijay Janapa Reddi and Professor Pranav Rajpurkar. Her research focuses on multimodal machine learning for healthcare. In her free time, she writes a weekly newsletter, Doctor Penguin Weekly (https://doctorpenguin.substack.com), with Dr. Eric Topol to share the latest important medical AI research with the community.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(2/19/2024) Speaker: Seongsu Bae

KAIST

Title
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
Abstract
Electronic Health Records (EHRs) are rich in clinical information, presented in a multi-modal format. Developing a question answering (QA) system to interact with EHRs can significantly aid clinicians in making informed decisions, assist researchers in conducting precise studies, and enhance hospital operational efficiency. Current research mainly focuses on uni-modal QA, such as table-based or image-based QA. This leaves a gap in multi-modal QA, a largely underexplored area. In this talk, I will introduce EHRXQA, the first multi-modal clinical QA dataset requiring reasoning over both tables and images. I will detail the creation of two datasets: a new X-ray-based QA dataset utilizing MIMIC-CXR and Chest ImaGenome, and a table-based QA dataset from MIMIC-IV using EHRSQL, where the two are combined to create EHRXQA. Additionally, I will discuss the potential of NeuralSQL, which integrates SQL with external function calls, in addressing multi-modal questions.
Bio
Seongsu Bae is a second-year Ph.D. student at KAIST's Kim Jaechul Graduate School of AI, advised by Prof. Edward Choi. His research focuses on multi-modal learning, healthcare AI, and the evaluation and benchmarking of AI systems for deployment. Seongsu is dedicated to developing and assessing AI models that understand multi-modal Electronic Health Records (EHRs), including both structured EHRs and medical images/reports. Recently, he completed an internship at Microsoft Research Asia (MSRA) in 2023, focused on multi-modal question answering and generative models in the healthcare domain.
Video
Questions for the Speaker
Please add your questions for the speaker either to this Google form or directly under the YouTube video

(2/15/2024) Speaker: Maximilian Dreyer

Fraunhofer Heinrich Hertz Institute

Title
Reveal to Revise: How to Uncover and Correct Biases of Deep Models in Medical Applications
Abstract
Deep Neural Networks are prone to learning spurious correlations embedded in the training data, leading to potentially biased predictions. This poses risks when deploying these models for high-stakes decision-making, such as in medical applications. In this talk, we will explore the latest techniques to reveal and revise model biases. To reveal model misbehavior, we will study next-generation Explainable AI methods that communicate model behavior using human-understandable concepts (both locally and globally). To revise biases, techniques based on full retraining, fine-tuning, or no additional training (post-hoc) are discussed. Finally, possible ways to evaluate the success of bias unlearning are presented.
Bio
Maximilian Dreyer is a PhD student in the Explainable AI group led by Sebastian Lapuschkin and Wojciech Samek at the Fraunhofer Heinrich Hertz Institute in Berlin, Germany. His research focuses, on the one hand, on developing XAI methods that are human-understandable and insightful, yet require little human effort. On the other hand, Maximilian works on frameworks that make it possible to improve AI models based on XAI insights; specifically, his research here focuses on revealing and revising model (mis)behavior. Maximilian obtained his B.Sc. in Physics at Humboldt University of Berlin and his M.Sc. in Computational Science at the University of Potsdam.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/8/2024) Speaker: Margarita Bintsi

Imperial College London

Title
Multimodal Brain Age Estimation Using Interpretable Adaptive Population-Graph Learning
Abstract
Brain age estimation is clinically important as it can provide valuable information in the context of neurodegenerative diseases such as Alzheimer’s. Population graphs, which include multimodal imaging information of the subjects along with the relationships among the population, have been used in the literature together with Graph Convolutional Networks (GCNs) and have proved beneficial for a variety of medical imaging tasks. A population graph is usually static and constructed manually using non-imaging information. However, graph construction is not a trivial task and might significantly affect the performance of the GCN, which is inherently very sensitive to the graph structure. In this work, we propose a framework that learns a population graph structure optimized for the downstream task. An attention mechanism assigns weights to a set of imaging and non-imaging features (phenotypes), which are then used for edge extraction. The resulting graph is used to train the GCN. The entire pipeline can be trained end-to-end. Additionally, by visualizing the attention weights that were the most important for the graph construction, we increase the interpretability of the graph. We use the UK Biobank, which provides a large variety of neuroimaging and non-imaging phenotypes, to evaluate our method on brain age regression and classification. The proposed method outperforms competing static graph approaches and other state-of-the-art adaptive methods. We further show that the assigned attention scores indicate that there are both imaging and non-imaging phenotypes that are informative for brain age estimation and are in agreement with the relevant literature.
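A minimal sketch of the core idea, assuming subjects are described by a phenotype matrix: learnable attention weights score each imaging or non-imaging phenotype, and weighted pairwise similarities define the edges of the population graph fed to the GCN. This is a simplified illustration, not the paper's exact architecture.

```python
# Hedged sketch of attention-weighted, learnable population-graph construction.
import torch
import torch.nn as nn

class AdaptiveGraphBuilder(nn.Module):
    def __init__(self, num_phenotypes: int):
        super().__init__()
        # One learnable attention score per phenotype (imaging or non-imaging).
        self.attention_logits = nn.Parameter(torch.zeros(num_phenotypes))

    def forward(self, phenotypes: torch.Tensor):
        # phenotypes: [num_subjects, num_phenotypes]
        weights = torch.softmax(self.attention_logits, dim=0)   # interpretable phenotype weights
        weighted = phenotypes * weights
        dist = torch.cdist(weighted, weighted)                  # pairwise subject distances
        adjacency = torch.exp(-dist)                            # similar subjects -> strong edges
        return adjacency, weights

# Usage: the adjacency feeds a GCN for brain age regression; `weights` can be
# inspected to see which phenotypes drove the graph construction.
builder = AdaptiveGraphBuilder(num_phenotypes=8)
adjacency, attn = builder(torch.randn(32, 8))
```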
Bio
Margarita is a final-year PhD student at the BioMedIA lab at Imperial College London, supervised by Prof. Daniel Rueckert. Her PhD is on deep learning for medical imaging, and more specifically brain age estimation. Her interests include graph machine learning and multimodal learning, with a particular focus on increasing interpretability. During her PhD, she interned at Microsoft Research and the NASA Frontier Development Lab, where she applied machine learning methods to telecommunications and solar physics, respectively.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/23/2023 to 2/1/2024) Speaker: Thanksgiving Break + Winter Break -- We will see you next year :)


(11/16/2023) Speaker: Zhi Huang

Stanford University

Title
Visual-language foundation model for pathology research and education
Abstract
The lack of annotated publicly available medical images is a major barrier for computational research and education innovations. At the same time, many de-identified images and much knowledge are shared by clinicians on public forums such as medical Twitter. Here we harness these crowd platforms to curate OpenPath, a large dataset of 208,414 pathology images paired with natural language descriptions. We demonstrate the value of this resource by developing pathology language–image pretraining (PLIP), a multimodal artificial intelligence with both image and text understanding, which is trained on OpenPath. PLIP achieves state-of-the-art performance for classifying new pathology images across four external datasets: for zero-shot classification, PLIP achieves F1 scores of 0.565–0.832, compared to F1 scores of 0.030–0.481 for the previous contrastive language–image pretrained model. Training a simple supervised classifier on top of PLIP embeddings also achieves a 2.5% improvement in F1 scores compared to using other supervised model embeddings. Moreover, PLIP enables users to retrieve similar cases by either image or natural language search, greatly facilitating knowledge sharing. Our approach demonstrates that publicly shared medical information is a tremendous resource that can be harnessed to develop medical artificial intelligence for enhancing diagnosis, knowledge sharing and education.
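Because PLIP follows the CLIP recipe, zero-shot classification reduces to comparing image and prompt embeddings. The sketch below assumes a CLIP-compatible PLIP checkpoint is available on the Hugging Face Hub; the checkpoint identifier, file path, and prompts are illustrative assumptions rather than verified details.

```python
# Hedged sketch of zero-shot pathology image classification with a CLIP-style
# model such as PLIP; the checkpoint id and prompts below are assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("vinid/plip")          # assumed PLIP checkpoint id
processor = CLIPProcessor.from_pretrained("vinid/plip")

image = Image.open("patch.png")                          # a pathology image patch (placeholder path)
labels = ["an H&E image of breast carcinoma",
          "an H&E image of normal breast tissue"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)         # image-text similarity -> class probabilities
print(dict(zip(labels, probs[0].tolist())))
```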
Bio
Zhi Huang is a postdoctoral fellow at Stanford University. In August 2021, he received a Ph.D. degree from Purdue University, majoring in Electrical and Computer Engineering (ECE). His background is in the areas of Artificial Intelligence, Digital Pathology, and Computational Biology.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/9/2023) Speaker: Chunfeng Lian

Xi'an Jiaotong University

Title
A dual meta-learning framework with application to longitudinally generalized brain tissue segmentation (and other cross-domain segmentation tasks)
Abstract
Brain tissue segmentation is essential for neuroscience and clinical studies. However, segmentation on (unpaired) longitudinal data is technically challenging due to dynamic brain changes across the lifespan, especially in infancy. In this talk, we’ll introduce a dual meta-learning (DuMeta) paradigm to efficiently learn longitudinally consistent and generalizable representations that persist after fine-tuning. Results on heterogeneous T1w MRI datasets demonstrate the effectiveness of DuMeta in the challenging one-shot segmentation setting. Further experiments also suggest that such a meta-learning strategy can be applied to other cross-domain segmentation tasks.
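As a rough sketch of the meta-learning machinery involved (not the DuMeta algorithm itself), the snippet below performs a first-order inner adaptation step on a support scan and evaluates the adapted parameters on a query from another time point or domain. `model`, `loss_fn`, and the batch dictionaries are assumed placeholders.

```python
# Generic first-order meta-learning step; a sketch under stated assumptions.
import torch

def meta_step(model, loss_fn, support_batch, query_batch, inner_lr=1e-2):
    # support/query batches: dicts with "image" and "label" tensors drawn from
    # different time points or domains of the (unpaired) longitudinal data.
    params = dict(model.named_parameters())
    # Inner loop: adapt on the support data (e.g., the single annotated scan).
    support_pred = torch.func.functional_call(model, params, support_batch["image"])
    grads = torch.autograd.grad(loss_fn(support_pred, support_batch["label"]),
                                list(params.values()))
    adapted = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}
    # Outer objective: the adapted parameters should also work on the query domain.
    query_pred = torch.func.functional_call(model, adapted, query_batch["image"])
    return loss_fn(query_pred, query_batch["label"])   # backpropagate with the meta-optimizer
```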
Bio
Chunfeng Lian is currently an Associate Professor of Information Science in the School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, China. His research focuses on machine learning in medical image computing. He received his PhD in Computer Science from Université de Technologie de Compiègne, CNRS, Heudiasyc (UMR 7253), Compiègne, France, in 2017. From 2017 to 2020, he was a Postdoc with the Radiology Department & Biomedical Research Imaging Center (BRIC), UNC-Chapel Hill. Dr. Lian serves as an Associate Editor of three international journals (Medical Physics, IRBM, and Frontiers in Radiology) and was a Distinguished Reviewer of IEEE T-MI. He has served as an Area Chair for multiple conferences (e.g., MICCAI 2023 and ICPR 2022) and was a leading co-organizer of the MICCAI MLMI 2022 workshop.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/2/2023) Speaker: Ayis Pyrros

University of Illinois Chicago

Title
Harnessing the Power of AI in Radiology: Revolutionizing Population Health Management
Abstract
As healthcare progressively integrates technological advancements, the intersection of population health and Artificial Intelligence (AI) emerges as a groundbreaking frontier, especially in the realm of medical imaging. This talk delves into the transformative potential of AI in shaping population health strategies through deep learning with medical imaging techniques. We will explore the contemporary methodologies by which AI-driven algorithms process, analyze, and interpret vast volumes of imaging data, identifying patterns that might be imperceptible to the human eye. Such capabilities pave the way for early disease detection, prediction of epidemiological trends, and personalized care strategies at a population level.
Bio
Dr. Ayis Pyrros, a board-certified neuroradiologist, specializes in clinical informatics and machine learning applications within diagnostic radiology. After completing his education at Jefferson Medical College and further specialization at Northwestern Memorial Hospital, he earned a Clinical Informatics certification from the American Board of Preventive Medicine. At Duly Health and Care, Dr. Pyrros played a pivotal role in designing and implementing a machine learning-driven population health platform. His notable research at the Medical Imaging Data Resource Center (MIDRC) has been widely published, including a significant study on the AI-enabled detection of type 2 diabetes from chest radiographs. Currently, he serves as adjunct faculty at the University of Illinois Chicago and is an active contributor to multiple radiological societies (RSNA, ASNR, and ARRS), advocating for advancements in informatics-driven patient care.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/26/2023) Speaker: Aishik Konwer

Stony Brook University

Title
Embracing Imperfection in Medical Vision: Towards Data-Efficient Representation Learning
Abstract
Deep learning approaches have been widely used for supervised tasks like segmentation and predictive modeling in medical computer vision. However, in many cases, medical datasets are limited or imperfect due to the absence of annotations and modalities, thereby preventing traditional frameworks from being applied off the shelf. Such scenarios pose highly complex challenges regarding the efficient usage of accessible modalities to boost downstream performance. There is also a dearth of sophisticated methods that facilitate the incorporation of hidden spatial and temporal data patterns into machine learning frameworks, to aid downstream tasks. I will present data-efficient representation learning approaches for disease severity/outcome prediction modeling from longitudinal imaging data, and lesion segmentation from missing imaging data. This talk comprises discussions encapsulating frameworks that: 1) jointly exploit the spatial distribution within images and the temporal information across time points, 2) augment snapshot-image-based pipelines by the integration of information from multi-image sequences, and 3) employ novel meta-adversarial learning strategies to address the problem of missing MRI sequences in brain tumor segmentation.
Bio
Aishik Konwer is a final year PhD student in the Computer Science department at Stony Brook University, New York. He works in the Imaging Informatics for Precision Medicine (IMAGINE) lab, supervised by Dr. Prateek Prasanna. His research in medical vision focuses on the development of algorithms that leverage meta-learning, few-shot learning, self-supervised learning, and knowledge distillation techniques to tackle challenges associated with imperfect data in medical imaging. Driven by these core techniques, he has proposed several modality- and annotation-efficient learning algorithms for tumor segmentation, disease progression, and outcome prediction tasks, which have been published in top-tier conferences such as CVPR, ICCV, MICCAI, and MIDL. For more details please visit his homepage https://aishikkonwer95.github.io/.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/19/2023) Speaker: Zongwei Zhou

Johns Hopkins University

Title
Medical Image Analysis: Scaling Annotations, Datasets, and Algorithms
Abstract
Cancer, a leading cause of mortality, can be effectively treated if detected in its early stages. However, early detection is challenging for both humans and computers. While AI can identify details beyond human perception, delineate anatomical structures, and localize abnormalities in medical images, the training of these algorithms requires large-scale datasets, comprehensive annotations, and cutting-edge algorithms. Several disciplines, including natural language processing (e.g., GPTs), representation learning (e.g., MAE), and image segmentation (e.g., SAMs), have witnessed the transformative power of scaling data for AI advancement, but this concept remains relatively underexplored in medical imaging due to the inherent challenges in data and annotation curation. This talk seeks to bridge this gap by focusing on datasets, annotations, and algorithms that are integral to the analysis of medical images, particularly for early cancer detection.
Bio
Zongwei Zhou is a postdoctoral researcher at Johns Hopkins University. He received his Ph.D. in Biomedical Informatics at Arizona State University in 2021. His research focuses on developing novel methods to reduce the annotation effort for computer-aided detection and diagnosis. Zongwei received the AMIA Doctoral Dissertation Award in 2022, the Elsevier-MedIA Best Paper Award in 2020, and the MICCAI Young Scientist Award in 2019. In addition to five U.S. patents, Zongwei has published over 30 peer-reviewed journal/conference articles, two of which rank among the most popular articles in IEEE TMI and as the highest-cited article in EJNMMI Research. He was named among the top 2% of scientists in the lists released by Stanford University in 2022 and 2023. Zongwei has served as a Guest Editor of Sensors and J. Imaging; a reviewer for IEEE TPAMI, MedIA, Information Fusion, and IEEE TMI; and an Area Chair for CVPR 2024.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/4/2023) Speaker: Adam Yala

UC Berkeley and UCSF

Title
AI for Personalized Cancer Screening
Abstract
For multiple diseases, early detection significantly improves patient outcomes. This motivates considerable investments in population-wide screening programs, such as mammography for breast cancer and low-dose CT for lung cancer. To be effective and economically viable, these programs must find the right balance between early detection and overscreening. This capacity builds on two complementary technologies: (1) the ability to accurately assess patient risk at a given time point and (2) the ability to design screening regimens based on this risk. Moreover, these tools must achieve consistent performance across diverse populations and adapt to new clinical requirements while learning from limited datasets. In this talk, I’ll discuss approaches to address these challenges in image-based cancer risk assessment and personalized screening policy design. I’ve demonstrated that these clinical models offer significant improvements over the current standard of care across globally diverse patient populations, and our image-based tools now underlie prospective trials.
Bio
Adam Yala is an assistant professor of Computational Precision Health, Electrical Engineering and Computer Science at UC Berkeley and UCSF. His research focuses on developing machine learning methods for personalized medicine and translating them to clinical care. His previous research has focused on two areas: 1) predicting future cancer risk, and 2) designing personalized screening policies. Adam's tools underlie multiple prospective trials and his research has been featured in the Washington Post, New York Times, STAT, Boston Globe and Wired. Prof. Yala obtained his BS, MEng and PhD in Computer Science from MIT, where he was a member of the MIT Jameel Clinic and MIT CSAIL.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/28/2023) Speaker: Julia Wolleb

University of Basel

Title
Denoising Diffusion Models for Medical Image Analysis
Abstract
Over the past two years, denoising diffusion models for image generation have seen tremendous success. This new class of deep learning models outperforms previous approaches and has become widely popular with frameworks such as Stable Diffusion and Dall-E for text-to-image generation. We explore how this state-of-the-art technique can be applied to medical tasks. We will discuss medical applications such as segmentation of anatomical structures, contrast harmonization of MR images, automatic implant generation, and weakly supervised anomaly detection. Additionally, the presentation will provide insights into the current state of research, highlight limitations, and offer a glimpse of future directions in this field.
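For orientation, the training objective behind these denoising diffusion models is compact: sample a timestep, corrupt the image with the corresponding amount of Gaussian noise, and train the network to predict that noise. The sketch below assumes `model(noisy_image, t)` is any noise-prediction network; the schedule values are common defaults, not the speaker's exact settings.

```python
# Minimal sketch of the standard epsilon-prediction diffusion training loss.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(model, x0):
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward (noising) process
    return F.mse_loss(model(x_t, t), noise)                # train the network to predict the noise
```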
Bio
Julia Wolleb is a postdoctoral researcher at the Center for Medical Image Analysis & Navigation at the University of Basel. She holds a Master's degree in Mathematics from the University of Basel, with a focus on numerics and algebra. She completed her Master's thesis in 2018 in the Mathematical Epidemiology group at the Swiss Tropical and Public Health Institute. She then pursued a PhD at the Department of Biomedical Engineering at the University of Basel, where she successfully defended her thesis in 2022; it mainly focused on the automatic detection of pathological regions in medical images. Julia's research interests focus on the development of robust and reliable deep learning methods for medical image analysis in clinical applications. Throughout her academic career, she has explored various deep learning approaches, including tasks such as image segmentation, weakly supervised anomaly detection, image-to-image translation, and domain adaptation.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/21/2023) Speaker: Zeshan Hussain

Harvard-MIT Health Sciences and Technology

Title
Benchmarking Causal Effects from Observational Studies using Experimental Data
Abstract
Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g., from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. We do so using validation effects, causal effects that can be inferred from both RCT and observational data. After rejecting estimators that do not pass this test, we generate conservative confidence intervals on the extrapolated causal effects for subgroups not observed in the RCT. Under the assumption that at least one observational estimator is asymptotically normal and consistent for both the validation and extrapolated effects, we provide guarantees on the coverage probability of the intervals output by our algorithm. To facilitate hypothesis testing in settings where causal effect transportation across datasets is necessary, we give conditions under which a doubly-robust estimator of group average treatment effects is asymptotically normal, even when flexible machine learning methods are used for estimation of nuisance parameters. We illustrate the properties of our approach on semi-synthetic experiments based on the IHDP and Women's Health Initiative datasets, and show that it compares favorably to standard meta-analysis techniques.
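A stripped-down sketch of the falsification idea, under the simplifying assumption that each estimator comes with an asymptotically normal point estimate and standard error: observational estimators whose validation effect disagrees with the RCT are rejected, and a conservative interval is taken over the survivors. The paper's actual test statistic and guarantees are more involved.

```python
# Hedged sketch of rejecting biased observational estimators against an RCT benchmark.
import numpy as np
from scipy import stats

def accept(obs_est, obs_se, rct_est, rct_se, alpha=0.05):
    # Keep the observational estimator only if its validation effect is
    # statistically compatible with the RCT estimate.
    z = (obs_est - rct_est) / np.sqrt(obs_se**2 + rct_se**2)
    return abs(z) < stats.norm.ppf(1 - alpha / 2)

def conservative_interval(extrapolated, alpha=0.05):
    # extrapolated: list of (estimate, se) for estimators that passed the test;
    # report the union of their confidence intervals on the extrapolated effect.
    lo = min(e - stats.norm.ppf(1 - alpha / 2) * s for e, s in extrapolated)
    hi = max(e + stats.norm.ppf(1 - alpha / 2) * s for e, s in extrapolated)
    return lo, hi
```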
Bio
Zeshan Hussain is an MD-PhD student in the Harvard-MIT Health Sciences and Technology program. He completed his PhD in June 2023 at MIT EECS, advised by Prof. David Sontag, and is on track to finish his MD in 2025. His thesis was titled 'Towards Precision Oncology: A Predictive and Causal Lens.' His research interests span several areas, including deep generative models for healthcare, causal inference, and human-AI interaction. His work includes building more predictive generative models of clinical sequential data, designing statistical methods to quantify the uncertainty of causal effects and assess their reliability using experimental data, and studying how these elements might affect physician decision-making through user studies. Zeshan's research is generously supported by the NIH Ruth L. Kirschstein National Research Service F30 Award.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/14/2023) Speaker: No session this week -- Fall break!


(9/7/2023) Speaker: No session this week -- Fall break!


(8/31/2023) Speaker: Monica Agrawal

Duke University

Title
Scalable Natural Language Processing for Transforming Medicine
Abstract
The data in electronic health records (EHRs) have immense potential to transform medicine both at the point-of-care and through retrospective research. However, structured data alone can only tell a fraction of patients' clinical narratives, as many clinically important variables are trapped within clinical notes. In this talk, I will discuss scalable natural language processing (NLP) solutions to overcome these technical challenges in clinical information extraction. These include the development of label-efficient modeling methodology, novel techniques for leveraging large language models, and a new paradigm for EHR documentation that incentivizes the creation of high-quality data at the point-of-care.
Bio
Dr. Monica Agrawal is an incoming assistant professor at Duke University, joint between the Department of Biostatistics and Bioinformatics and the Department of Computer Science, as well as the co-founder of a new health technology startup. In her research, she tackles diverse challenges including scalable clinical information extraction, smarter electronic health records, and human-in-the-loop systems. Her work has been published at venues in machine learning, natural language processing, computational health, and human-computer interaction. She recently earned her PhD in Computer Science in the Clinical Machine Learning Group at MIT and previously obtained a BS/MS in Computer Science from Stanford University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/24/2023) Speaker: Yunhe Gao

Rutgers University

Title
Toward Universal Medical Image Segmentation: Challenges and Opportunities
Abstract
A major enduring focus of clinical workflows is disease analytics and diagnosis, leading to medical imaging datasets where the modalities and annotations are strongly tied to specific clinical objectives. To date, the prevailing training paradigm for medical image segmentation revolves around developing separate models for specific medical objects (e.g., organs or tumors) and image modalities (e.g., CT or MR). This traditional paradigm can hinder the robustness and generalizability of these AI models, inflate costs when further scaling data volumes, and fail to exploit potential synergies among various medical imaging tasks. By observing the training program of radiology residency, we recognize that radiologists’ expertise arises from routine exposure to a diverse range of medical images across body regions, diseases, and imaging modalities. This observation motivates us to explore a new training paradigm, 'universal medical image segmentation', whose key goal is to learn from diverse medical imaging sources. In this talk, I’ll delve into challenges in the new paradigm including issues with partial labeling, conflicting class definitions, and significant data heterogeneity. I’ll also present our pioneering solution, Hermes, aimed at tackling these challenges. We demonstrate that our proposed universal paradigm not only offers enhanced performance and scalability, but also excels in transfer learning, incremental learning and generalization. This innovative approach opens up new perspectives for the construction of foundational models in a broad range of medical image analysis.
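One common ingredient for learning from partially labelled datasets (not necessarily the exact Hermes mechanism) is to restrict the loss to the classes each source dataset actually annotates, as in this hedged sketch; tensor shapes and names are assumptions.

```python
# Hedged sketch of a partial-label loss for mixing datasets with different label sets.
import torch
import torch.nn.functional as F

def partial_label_bce(logits, target_onehot, annotated):
    # logits, target_onehot: [B, C, D, H, W] (target as float one-hot);
    # annotated: [B, C] boolean mask of the classes each sample's source dataset labels.
    per_class = F.binary_cross_entropy_with_logits(logits, target_onehot, reduction="none")
    per_class = per_class.flatten(2).mean(-1)          # average over voxels -> [B, C]
    mask = annotated.float()
    # Only annotated classes contribute: unlabelled organs/tumors are neither rewarded
    # nor penalised, so datasets with conflicting class definitions can be combined.
    return (per_class * mask).sum() / mask.sum().clamp(min=1)
```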
Bio
Yunhe Gao is a fourth-year Ph.D. candidate in the Department of Computer Science at Rutgers University, advised by Distinguished Professor Dimitris Metaxas. His research focuses on computer vision and medical image analysis. He is also broadly interested in AI model robustness, data efficiency and their applications in the machine learning and healthcare domains. His research has been published in top-tier venues such as IEEE TMI, MedIA, ICCV, ECCV, IPMI and MICCAI.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/17/2023) Speaker: Hyungjin Chung

KAIST

Title
Generative Diffusion Models for Medical Imaging
Abstract
Foundational generative models have been attracting growing interest recently. Among them, all modalities except language seem to be converging toward a single class of model: diffusion models. In this talk, we will focus on leveraging diffusion models as generative priors through Bayesian inference, and on using them to solve inverse problems, with a particular focus on medical imaging. The talk will emphasize fully leveraging the power of foundation models by using them as plug-and-play building blocks to solve challenging downstream tasks. At the end of the talk, I will present new work on adapting diffusion models to out-of-distribution measurements, showing that diffusion models can be used to reconstruct data that differ substantially from the datasets they were trained on.
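The general recipe for using a diffusion prior on an inverse problem y = A(x) + n can be sketched as follows: run the usual reverse-diffusion update, then nudge the iterate toward data consistency using the gradient of the measurement residual. `reverse_step`, `A`, and `score_model` are assumed, differentiable placeholders; individual papers differ in the exact guidance rule.

```python
# Schematic sketch of measurement-guided (posterior) sampling for an inverse problem.
import torch

def guided_reverse_step(x_t, t, score_model, reverse_step, A, y, step_size=1.0):
    # reverse_step: plain (unconditional) diffusion update returning (x_prev, x0_hat);
    # A: differentiable forward operator of the imaging system; y: measurements.
    x_t = x_t.detach().requires_grad_(True)
    x_prev, x0_hat = reverse_step(score_model, x_t, t)
    residual = torch.linalg.vector_norm(y - A(x0_hat))     # data-consistency term ||y - A(x0_hat)||
    grad = torch.autograd.grad(residual, x_t)[0]
    return (x_prev - step_size * grad).detach()            # guided sample for the next step
```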
Bio
Hyungjin Chung is a PhD student at KAIST and a student researcher at Google AI. His research interests lie at the intersection of deep generative models and computational imaging. In particular, he has pioneered many works on using diffusion models to solve inverse problems, many of them focusing on biomedical imaging. Hyungjin completed his Master's degree at KAIST and his Bachelor's degree at Korea University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/10/2023) Speaker: Jeya Maria Jose Valanarasu

Stanford University

Title
Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training
Abstract
Harnessing the power of pre-training on large-scale datasets like ImageNet forms a fundamental building block for the progress of representation learning-driven solutions in computer vision. Medical images are inherently different from natural images as they are acquired in the form of many modalities (CT, MR, PET, Ultrasound, etc.) and contain granular information such as tissues, lesions, and organs. These characteristics of medical images require special attention towards learning features representative of the local context. In this work, we focus on designing an effective pre-training framework for 3D radiology images. First, we propose a new masking strategy called local masking, where the masking is performed across channel embeddings instead of tokens to improve the learning of local feature representations. We combine this with classical low-level perturbations like adding noise and downsampling to further enable low-level representation learning. To this end, we introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations. Additionally, we devise a cross-modal contrastive loss (CMCL) to accommodate the pre-training of multiple modalities in a single framework. We curate a large-scale dataset to enable pre-training of 3D medical radiology images (MRI and CT). The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance. Notably, our proposed method tops the public test leaderboard of the BTCV multi-organ segmentation challenge.
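A hedged sketch of the kinds of "disruptions" described in the abstract: low-level perturbations applied to the input volume, plus masking applied across channel embeddings rather than across whole tokens. The function names and ratios are illustrative, not the authors' implementation.

```python
# Hedged sketch of low-level perturbations plus channel-wise (local) masking.
import torch
import torch.nn.functional as F

def low_level_perturb(volume, noise_std=0.1, scale=0.5):
    # volume: [B, 1, D, H, W]; add Gaussian noise, then a downsample/upsample pass.
    noisy = volume + noise_std * torch.randn_like(volume)
    small = F.interpolate(noisy, scale_factor=scale, mode="trilinear", align_corners=False)
    return F.interpolate(small, size=volume.shape[2:], mode="trilinear", align_corners=False)

def local_channel_mask(tokens, mask_ratio=0.6):
    # tokens: [B, N, C]; zero a random subset of channels per token (local masking),
    # instead of dropping whole tokens as in standard masked autoencoding.
    keep = (torch.rand_like(tokens) > mask_ratio).to(tokens.dtype)
    return tokens * keep
```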
Bio
Jeya Maria Jose Valanarasu is a postdoctoral researcher at Stanford University working with Andrew Ng and Curtis Langlotz. He obtained his Ph.D. and M.S. from Johns Hopkins University, advised by Vishal M. Patel. His interests broadly span Computer Vision, Machine Learning, and Healthcare. His research aims to overcome the challenges that arise when translating machine learning models to practical applications in the healthcare and engineering sectors. He has experience working on multiple problems, including developing new architectures, large-scale pre-training, vision-language models, domain adaptation, and data-efficient learning for various application areas. He has authored over 30 research articles and his work has been recognized through multiple awards, including the Amazon Research Fellowship, Young Scientist Impact Award Finalist (MICCAI), Best Student Paper Awards (ICRA, CVIP), and the NIH MICCAI Award.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/3/2023) Speaker: Keno Bressem

Radiologist, Charité – Universitätsmedizin Berlin

Title
Training and Application of Language Models in Medicine
Abstract
Language Models (LMs) can significantly impact the medical field by transforming textual information into usable data and bridging the shift from general language to medical language. The unique word distribution and the sensitive data handling required in medical texts present distinct challenges and opportunities. Pretrained, domain-specific models such as BERT can perform well in tasks such as extracting procedures and ICD codes and generating text embeddings for downstream applications, despite their lack of conversational ability and the need for fine-tuning. Conversational medical Large Language Models (LLMs) offer a range of benefits, including accessibility, task versatility, data privacy, and educational utility. Still, they also face issues such as hallucinations, insufficient or erroneous data, and narrow instructions, which limit their current applicability. LMs can also take on a notable role in structuring data, particularly in radiology. GPT-4 demonstrates potential for structured reporting by converting complex free-text reports into structured formats. However, limitations exist, such as the inability to use information not present in the text, the potential for incorrect data entry, and privacy concerns. Understanding these aspects can contribute to a more nuanced perspective on the potential of LMs in medicine and the considerations necessary for their successful implementation.
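The structured-reporting use case can be sketched as a prompt that constrains an LLM to emit JSON matching a fixed schema. `call_llm` is a hypothetical helper standing in for whichever chat API is used, and the schema is an example rather than a reporting standard.

```python
# Hedged sketch of converting a free-text radiology report into structured JSON.
import json

SCHEMA = {"findings": [{"organ": "str", "finding": "str", "severity": "str"}],
          "impression": "str"}

def build_prompt(report_text: str) -> str:
    return (
        "Extract the findings from the radiology report below and answer only with "
        f"JSON matching this schema: {json.dumps(SCHEMA)}. "
        "Use only information explicitly stated in the report.\n\n"
        f"Report:\n{report_text}"
    )

def structure_report(report_text: str, call_llm) -> dict:
    raw = call_llm(build_prompt(report_text))   # call_llm: hypothetical chat-API wrapper
    return json.loads(raw)                      # validate against the schema downstream
```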
Bio
Dr. Keno Bressem is a board-certified radiologist with six years of experience at Charité – Universitätsmedizin Berlin. His clinical expertise encompasses CT, MRI, Ultrasound, X-ray, and interventional radiology. In addition to his medical expertise, Dr. Bressem is also proficient in computer science, a skill set developed through research conducted at Charité and Harvard Medical School. His dual expertise in radiology and computer science has led to over 60 publications in the field of digital medicine. Currently, Dr. Bressem leads the international COMFORT project, an EU-funded initiative that focuses on the application of AI in the treatment of urogenital cancers, underscoring his commitment to improving patient care through advanced digital solutions.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/27/2023) Speaker: Zifeng Wang

University of Illinois Urbana-Champaign

Title
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
Abstract
Existing vision-text contrastive learning methods like CLIP aim to match paired image and caption embeddings while pushing others apart, which improves representation transferability and supports zero-shot prediction. However, medical image-text datasets are orders of magnitude smaller than the general-domain image and caption datasets collected from the internet. Moreover, previous methods encounter many false negatives, i.e., images and reports from separate patients that probably carry the same semantics but are wrongly treated as negatives. In this work, we decouple images and texts for multimodal contrastive learning, thus scaling the usable training data combinatorially at low cost. We also propose to replace the InfoNCE loss with a semantic matching loss based on medical knowledge to eliminate false negatives in contrastive learning. We show that MedCLIP is a simple yet effective framework: it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval. Surprisingly, we observe that with only 20K pre-training samples, MedCLIP wins over the state-of-the-art method trained on around 200K samples.
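A minimal sketch of a semantic-matching contrastive loss in the spirit of MedCLIP, assuming multi-hot disease/finding labels are available for both images and reports: instead of InfoNCE's hard diagonal targets, soft targets come from label similarity, so semantically matching pairs from different patients are no longer treated as negatives. Details differ from the paper.

```python
# Hedged sketch of a knowledge-driven soft-target contrastive loss.
import torch
import torch.nn.functional as F

def semantic_matching_loss(img_emb, txt_emb, img_labels, txt_labels, temperature=0.07):
    # img_labels, txt_labels: multi-hot disease/finding vectors, [B, K].
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature                         # [B, B] image-text similarities
    # Soft targets: label-vector similarity, normalised per image.
    sem = F.normalize(img_labels.float(), dim=-1) @ F.normalize(txt_labels.float(), dim=-1).t()
    targets = sem / sem.sum(dim=-1, keepdim=True).clamp(min=1e-8)
    log_probs = F.log_softmax(logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()                     # cross-entropy to soft targets
```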
Bio
Zifeng Wang is a PhD student in the computer science department at UIUC. His research interests are AI4Health, AI4Drug, and NLP. He published in NeurIPS, ICLR, KDD, AAAI, EMNLP, etc. He is the recipient of the Best Student Paper award of PAKDD'21. His research is sponsored by Yunni & Maxine Pao Memorial Fellowship and Yee Memorial Fellowship.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/20/2023) Speaker: Cheng-Yu Hsieh

University of Washington

Title
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Abstract
Deploying large language models (LLMs) is challenging because they are memory inefficient and compute-intensive for practical applications. In response, researchers train smaller task-specific models by either finetuning with human labels or distilling using LLM-generated labels. However, finetuning and distillation require large amounts of training data to achieve performance comparable to LLMs. We introduce Distilling step-by-step, a new mechanism that (a) trains smaller models that outperform LLMs, and (b) does so while requiring less training data than standard finetuning or distillation. Our method extracts LLM rationales as additional supervision for training small models within a multi-task framework. We present three findings across 4 NLP benchmarks: First, compared to both finetuning and distillation, our mechanism achieves better performance with far fewer labeled/unlabeled training examples. Second, compared to few-shot prompted LLMs, we achieve better performance using substantially smaller model sizes. Third, we reduce both the model size and the amount of data required to outperform LLMs; our finetuned 770M T5 model outperforms the few-shot prompted 540B PaLM model using only 80% of the available data on a benchmark, whereas standard finetuning of the same T5 model struggles to match it even when using 100% of the dataset.
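The multi-task training signal can be sketched with a small seq2seq model: the same model is trained both to predict the label and to generate the LLM-provided rationale, distinguished by task prefixes. The prefixes, weighting, and checkpoint below are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of a label-prediction + rationale-generation multi-task loss.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def multitask_loss(example, rationale_weight=1.0):
    # example: {"input": ..., "label": ..., "rationale": ...}, rationale produced by an LLM.
    label_batch = tokenizer("[label] " + example["input"], return_tensors="pt")
    label_ids = tokenizer(example["label"], return_tensors="pt").input_ids
    rat_batch = tokenizer("[rationale] " + example["input"], return_tensors="pt")
    rat_ids = tokenizer(example["rationale"], return_tensors="pt").input_ids
    loss_label = model(**label_batch, labels=label_ids).loss
    loss_rationale = model(**rat_batch, labels=rat_ids).loss
    return loss_label + rationale_weight * loss_rationale
```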
Bio
Cheng-Yu Hsieh is a PhD student at UW working with Ranjay Krishna and Alex Ratner. He focuses on data-centric machine learning, specifically around developing techniques and tools that help users efficiently curate large-scale datasets, enrich dataset information, and communicate model behavior via data. He is also interested in the interplay between data and modern large-scale pretrained models.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/13/2023) Speaker: Alexandros Rekkas

Erasmus University Medical Center

Title
Risk-based approaches to the assessment of treatment effect heterogeneity
Abstract
To provide optimal medical care, doctors are advised to align their clinical treatments with the results of well-conducted clinical trials, or the aggregated results of multiple such trials. However, the overall estimated treatment effect is often an average of heterogeneous treatment effects and, as such, may not be applicable to most patient subgroups, let alone individual patients. A patient's baseline risk (his or her probability of experiencing an outcome of interest) is an important determinant of treatment effect and can be used to guide medical decisions. In this talk we present methods for risk-based assessment of treatment effect heterogeneity in both the clinical trial and the observational setting. The performance of these methods is evaluated using simulations and real-world data.
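A minimal sketch of a risk-based analysis under simple assumptions (binary outcome, randomized treatment): fit a baseline-risk model, stratify patients by predicted risk, and estimate the treatment effect within each stratum. The column names and the choice of fitting the risk model on the control arm are illustrative.

```python
# Hedged sketch of risk-stratified treatment-effect estimation.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def risk_stratified_effects(df, covariates, n_strata=4):
    # df columns: covariates..., "treated" (0/1), "outcome" (0/1).
    control = df[df["treated"] == 0]
    risk_model = LogisticRegression(max_iter=1000).fit(control[covariates], control["outcome"])
    df = df.assign(risk=risk_model.predict_proba(df[covariates])[:, 1])
    df = df.assign(stratum=pd.qcut(df["risk"], n_strata, labels=False))
    rows = []
    for s, grp in df.groupby("stratum"):
        rate_t = grp.loc[grp["treated"] == 1, "outcome"].mean()
        rate_c = grp.loc[grp["treated"] == 0, "outcome"].mean()
        rows.append({"stratum": s, "mean_baseline_risk": grp["risk"].mean(),
                     "absolute_risk_reduction": rate_c - rate_t})
    return pd.DataFrame(rows)   # treatment effect per baseline-risk stratum
```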
Bio
Alexandros is a PhD student at the Department of Medical Informatics of Erasmus University Medical Center, Netherlands. His research focuses on risk prediction and its use for guiding medical decisions. He is an active member of the OHDSI collaborative which promotes large-scale observational research through the adoption of an open community data standard. He has a bachelor's degree in mathematics from Aristotle University of Thessaloniki (Greece) and a master’s degree in Statistics from KU Leuven (Belgium).
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/6/2023) Speaker: Yun Liu

Google

Title
Lessons on the path from Code to Clinic
Abstract
Inspired by the potential of artificial intelligence (AI) to improve access to expert-level medical image interpretation, several organizations began developing deep learning-based AI systems for detecting diabetic retinopathy (DR) from retinal fundus images around 2015. Today, these AI-based tools are finally being deployed at scale in certain parts of the world, often bringing DR screening to a population lacking easy access to timely diagnosis. The path to translating AI research into a useful clinical tool has gone through several unforeseen challenges along the way. In this talk, we share some lessons contrasting a priori expectations (*myths*) with synthesized learnings of what truly transpired (*reality*), to help others who wish to develop and deploy similar medical AI tools. In addition, I will cover recent work in explainability and learning new knowledge from deep learning models.
Bio
Yun Liu is a staff research scientist in Google Research. In this role he focuses on developing and validating machine learning for healthcare across multiple fields: pathology, ophthalmology, radiology, dermatology, and beyond. Yun completed his PhD at Harvard-MIT Health Sciences and Technology, where he worked on predictive risk modeling using biomedical signals, medical text, and billing codes. He has previously also worked on predictive modeling for nucleic acid sequences and protein structures. Yun completed a B.S. in Molecular and Cellular Biology and Computer Science at Johns Hopkins University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/29/2023) Speaker: Jiwoong Jason Jeong

Arizona State University

Title
The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.4 Million Screening and Diagnostic Mammographic Images
Abstract
Breast cancer detection remains one of the most frequent commercial and research applications for deep learning (DL) in radiology. Development of DL models to improve breast cancer screening requires robustly curated and demographically diverse datasets to ensure generalizability. However, most publicly released breast cancer datasets are ethnically and racially homogeneous, are relatively small, and lack image annotations and/or pathologic data. While many mammographic datasets are available, they either lack semantic imaging descriptors (OMI-DB) or do not contain full-field digital mammograms (CBIS-DDSM), and minority patients remain underrepresented. To address this, we created the EMory BrEast imaging Dataset (EMBED), which contains lesion-level annotations, pathologic outcomes, and demographic information for 116,000 patients from racially diverse backgrounds and will help bridge the existing gaps in granularity, diversity, and scale in breast imaging datasets.
Bio
Jiwoong Jason Jeong is a third-year PhD student at ASU’s School of Computing and Augmented Intelligence, advised by Dr. Imon Banerjee. His primary research interest lies in using generative models to handle data imbalance in classification tasks; he also focuses on the application and deployment of classification models in healthcare.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/22/2023) Speaker: Michael Moor

Stanford University

Title
Generalist Medical AI
Abstract
The exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. In this talk, we will discuss a new paradigm for medical AI: generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data. Built through self-supervision on large, diverse datasets, GMAI will flexibly interpret different combinations of medical modalities, including data from imaging, electronic health records, laboratory results, genomics, graph-structured data, or medical text. Models will in turn produce expressive outputs such as free-text explanations, spoken recommendations or image annotations that demonstrate advanced medical reasoning abilities. We identify a set of high-impact potential applications for GMAI and lay out specific technical capabilities and training datasets necessary to enable them. Finally, we discuss how GMAI-enabled applications will challenge current strategies for regulating and validating AI devices for medicine.
Bio
Dr. Moor is a postdoctoral scholar at Stanford University’s Department of Computer Science, mentored by Prof. Jure Leskovec. He completed his PhD at ETH Zurich in the Machine Learning and Computational Biology lab advised by Prof. Karsten Borgwardt. He obtained his medical degree at the University of Basel. Dr. Moor’s research revolves around developing novel machine learning approaches for analyzing multimodal medical data. He developed machine learning methods for clinical early warning systems, such as the detection of sepsis. Beyond applications, he also developed representation learning methods to better capture the structure of complex and high-dimensional input data. Most recently, Dr. Moor explores zero-shot learning, knowledge injection, and medical reasoning in large-scale medical AI. Dr. Moor is a recipient of the Fellowship of the Swiss Study Foundation, a program designed to promote academic excellence in Switzerland. His doctoral thesis was nominated for the 2022 ETH silver medal for best dissertations.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/15/2023) Speaker: Summer Break (No MedAI Session)


(6/8/2023) Speaker: Summer Break (No MedAI Session)


(6/1/2023) Speaker: Siyi Tang

Artera AI

Title
Modeling Multivariate Biosignals With Graph Neural Networks and Structured State Space Models
Abstract
Multivariate biosignals play an important role in many medical domains, such as electroencephalography, polysomnography, and electrocardiography. Modeling multivariate biosignals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between the electrodes. In this talk, I will present a general graph neural network (GNN) architecture, GraphS4mer, which aims to address the aforementioned challenges. Specifically, (1) we leverage the Structured State Space architecture, a state-of-the-art deep sequence model, to capture long-range temporal dependencies in biosignals and (2) we propose a graph structure learning layer to learn dynamically evolving graph structures in the data. We evaluate our proposed model on three distinct biosignal classification tasks and show that GraphS4mer consistently improves over existing models, including (1) seizure detection from electroencephalographic signals, outperforming a previous GNN with self-supervised pre-training by 3.1 points in AUROC; (2) sleep staging from polysomnographic signals, a 4.1 points improvement in macro-F1 score compared to existing sleep staging models; and (3) 12-lead electrocardiogram classification, outperforming previous state-of-the-art models by 2.7 points in macro-F1 score.
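A simplified sketch of a graph-structure-learning layer of the kind described (not the exact GraphS4mer layer): node embeddings produced by a temporal encoder yield a learned adjacency via scaled dot-product attention, sparsified by keeping each node's k strongest neighbours.

```python
# Hedged sketch of dynamic graph structure learning over electrode/lead embeddings.
import torch
import torch.nn as nn

class GraphLearner(nn.Module):
    def __init__(self, dim, k=4):
        super().__init__()
        self.query, self.key = nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.k = k

    def forward(self, node_emb):
        # node_emb: [N, D] embeddings of the N electrodes/leads for one time window,
        # e.g. produced by a structured state space (S4) temporal encoder.
        scores = self.query(node_emb) @ self.key(node_emb).t() / node_emb.shape[-1] ** 0.5
        adj = torch.softmax(scores, dim=-1)                     # dense learned adjacency
        topk = torch.topk(adj, min(self.k, adj.shape[-1]), dim=-1).indices
        mask = torch.zeros_like(adj).scatter_(-1, topk, 1.0)
        return adj * mask                                       # keep each node's k strongest neighbours
```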
Bio
Siyi is a Machine Learning Scientist at ArteraAI, where her research focuses on developing multimodal deep learning models for predicting cancer patient outcomes and personalizing treatment strategies. Siyi recently received her PhD Degree from Stanford University, where she was advised by Prof. Daniel Rubin. At Stanford, she worked on developing deep learning methods for modeling medical time series data, with a focus on graph-based modeling approaches. She also co-organized the Stanford MedAI Group Exchange Sessions with Nandita. Prior to Stanford, Siyi received her Bachelor's Degree in Electrical Engineering with Highest Distinction Honors from National University of Singapore.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/25/2023) Speaker: Meirui Jiang

The Chinese University of Hong Kong

Title
Robust and Reliable Federated Learning for Heterogeneous Medical Images
Abstract
Federated learning (FL) has become a promising paradigm for enabling multi-institutional collaboration without sharing raw data. However, data heterogeneity remains an outstanding challenge in FL, typically arising when local clients hold data with heterogeneous distributions. It is important to find ways to train models robustly and reliably while utilizing large amounts of distributed data. This presentation will discuss our ongoing progress in designing FL algorithms that embrace the data heterogeneity of medical images, including tackling data heterogeneity for the FL global model, personalizing client models, improving generalization in FL, and promoting fairness in FL training on heterogeneous data.
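For context, the FedAvg baseline that such methods build on is a weighted average of locally trained weights; the sketch below is that baseline only, not the speaker's heterogeneity-aware algorithms.

```python
# Minimal FedAvg sketch: sample-size-weighted averaging of client state dicts.
import copy

def fedavg(client_states, client_sizes):
    # client_states: list of state_dicts after local training; client_sizes: samples per client.
    # Note: integer buffers (e.g. BatchNorm counters) would need special handling in practice.
    total = sum(client_sizes)
    new_state = copy.deepcopy(client_states[0])
    for key in new_state:
        new_state[key] = sum(
            (n / total) * state[key].float() for state, n in zip(client_states, client_sizes)
        )
    return new_state   # load into the global model via global_model.load_state_dict(new_state)
```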
Bio
Meirui Jiang is a third-year Ph.D. student in the Department of Computer Science and Engineering at the Chinese University of Hong Kong, supervised by Prof. Qi Dou. His research interest lies in federated learning for medical image analysis, aiming to promote the applicability (efficiency, privacy, fairness, generalizability) of medical imaging research by utilizing large, distributed data. His research has been published at top-tier conferences and in journals such as ICLR, CVPR, AAAI, MICCAI, Nature Communications, and IEEE TMI, and he has also written a book chapter on Machine Learning, Medical AI and Robotics.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/18/2023) Speaker: Haoran Zhang

MIT

Title
Why did the Model Fail? Attributing Model Performance Changes to Distribution Shifts
Abstract
Clinical machine learning models frequently experience performance drops under distribution shifts. These shifts may occur when a model is deployed in a different domain, a different subpopulation, or gradually over time. The underlying cause of such shifts may be multiple simultaneous factors such as changes in data quality, differences in specific covariate distributions, or changes in the relationship between label and features. When a model does fail during deployment, attributing model failure to these factors is critical for the model developer to identify the root cause and take mitigating action. In this talk, I will motivate the performance attribution problem, and present our recent method for attributing model performance changes to distribution shifts in the underlying data generating mechanism. We formulate the problem as a cooperative game where the players are distributions. We define the value of a set of distributions to be the change in model performance when only that set of distributions has changed between environments, and derive an importance weighting method for computing the value of an arbitrary set of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts.
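The Shapley-value framing can be sketched directly: the "players" are candidate distribution shifts, and the value of a coalition is the model's performance change when only those distributions shift. `perf_change` is an assumed user-supplied function (the paper computes it via importance weighting); the brute-force enumeration below is exponential in the number of players and only meant for small sets.

```python
# Hedged sketch of attributing a performance change to distribution shifts via Shapley values.
from itertools import combinations
from math import factorial

def shapley_values(players, perf_change):
    # players: names of candidate shifts, e.g. ["covariates", "label_given_covariates"].
    # perf_change(S): performance change when only the distributions in set S shift.
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                weight = factorial(len(subset)) * factorial(n - len(subset) - 1) / factorial(n)
                marginal = perf_change(set(subset) | {p}) - perf_change(set(subset))
                values[p] += weight * marginal
    return values   # per-shift contributions; they sum to the total performance change
```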
Bio
Haoran Zhang is a 2nd year PhD student at the Computer Science and Artificial Intelligence Laboratory at MIT, advised by Prof. Marzyeh Ghassemi. His research focuses on methods to construct fair and robust machine learning models which maintain their performance across real-world distribution shifts. He is also interested in the application of such methods in the healthcare domain. His research has appeared in top venues such as Nature Medicine, NeurIPS, ICML, and ACM FAccT. Before joining MIT, Haoran received his M.Sc. from the University of Toronto and his B.Eng. from McMaster University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/11/2023) Speaker: Elliot Bolton

Stanford University

Title
Foundation Models For Biomedical QA
Abstract
During this talk we’ll discuss CRFM’s work on large-scale foundation models for the biomedical domain, culminating in a 2.7-billion-parameter GPT-style model trained on PubMed. We’ll discuss the training process and the results of our evaluations in the question answering setting, and survey the biomedical NLP language model landscape. We aim for a vibrant discussion with the biomedical community, and would love to hear about potential applications to help us shape future models.
Bio
Elliot is a research engineer working on language models for Stanford CRFM. Currently he is working on a large scale, open source foundation model for biomedical NLP. For nearly a decade Elliot has helped build and maintain various open source Stanford projects including CoreNLP, Stanza, and Mistral.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/4/2023) Speaker: Tom Hartvigsen

MIT

Title
Avoiding Model Aging by Continually Editing and Repairing Deployed Models
Abstract
Deployed machine learning models often fail in unexpected ways. For example, models can perpetuate unexpected biases and rely on spurious features in their training data. Such performance decay can also happen gradually over time. While there is a wealth of work training models to be robust to shifting label and covariate distributions, a new paradigm is arising: updatable machine learning, where we adapt, edit, or repair big pre-trained models. With the advent of foundation models, such methods pose a cost-effective alternative to expensive retraining, especially when pre-training data are private or proprietary. This is particularly relevant in healthcare where training data are often private and models decay quickly. In this talk, I will discuss model editing, a paradigm for spot-fixing mistakes made by pre-trained models. I will then showcase my recent work introducing GRACE, the first method for continually editing pre-trained models thousands of times during deployment. GRACE learns to cache discrete segments of a language model's latent space, rerouting its embeddings to selectively modify the model's behavior and enabling targeted fixes in its predictions. These fixes leave the model's original weights unaltered and can generalize to correct new errors. I will conclude my talk with a look towards the future of lifelong model maintenance.
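A highly simplified sketch of the codebook-style editing idea: cache the embeddings of inputs where the model erred together with learned replacement activations, and reroute future inputs that fall inside a small ball around a cached key, leaving the original weights untouched. The real GRACE method learns deferral radii and values; everything here is schematic.

```python
# Hedged sketch of a key-value editing adaptor wrapped around one layer.
import torch
import torch.nn as nn

class EditAdaptor(nn.Module):
    def __init__(self, layer, epsilon=1.0):
        super().__init__()
        self.layer, self.epsilon = layer, epsilon
        self.keys, self.values = [], []              # cached (embedding, replacement) pairs

    def add_edit(self, key_embedding, replacement):
        self.keys.append(key_embedding.detach())
        self.values.append(replacement.detach())

    def forward(self, hidden):
        # hidden: [B, D] inputs to the wrapped layer; original weights stay frozen.
        out = self.layer(hidden)
        for key, value in zip(self.keys, self.values):
            hit = (hidden - key).norm(dim=-1, keepdim=True) < self.epsilon
            out = torch.where(hit, value, out)       # reroute only inputs inside the edit ball
        return out
```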
Bio
Tom Hartvigsen is a postdoc at MIT and an incoming assistant professor at the University of Virginia. Tom focuses on core challenges in making machine learning and data mining systems responsibly deployable in healthcare settings, mostly for time series and text. His work has appeared at top venues such as KDD, ACL, NeurIPS, and AAAI. He also ran the 2022 NeurIPS workshop on Learning from Time Series for Health and is the general chair of the 2023 Machine Learning for Health Symposium. Tom received his Ph.D. in Data Science from Worcester Polytechnic Institute in 2021, where he was advised by Elke Rundensteiner and Xiangnan Kong.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/27/2023) Speaker: Vivek Natarajan

Google Research

Title
Foundation Models for Medical AI
Abstract
Despite recent progress in the field of medical artificial intelligence (AI), most existing models are narrow, single-task systems that require large quantities of labeled data to train. Moreover, these models cannot be easily reused in new clinical contexts and lack expressivity and interactive capabilities. This, in turn, has prevented their broad uptake in real-world healthcare settings. The advent of foundation models offers an opportunity to rethink medical AI and make it more performant, interactive, safer, and equitable. In this talk, I will share some of my recent work in the space of foundation medical models: REMEDIS, Co-Doc and Med-PaLM. In particular, I will highlight the effectiveness of these models in solving key medical AI translation problems such as data-efficient generalization, reliability and safety. Finally, I will also lay out how we might be able to build on these models towards generalist multimodal, multi-task medical AI.
Bio
Vivek Natarajan is a Staff Research Scientist at Google Health AI advancing biomedical AI to help scale world class healthcare to everyone. Vivek is particularly interested in building large language models and multimodal foundation models for biomedical applications and leads the Google Brain moonshot behind Med-PaLM, Google's flagship medical large language model. Vivek's research has been published in well-regarded conferences and journals including Nature Medicine, Nature Biomedical Engineering, NeurIPS, CVPR, ICCV and JMLR. Prior to Google, Vivek worked on multimodal assistant systems at Facebook AI Research and published award winning research, was granted multiple patents and deployed AI models to products at scale with hundreds of millions of users.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/20/2023) Speaker: Chirag Nagpal

Carnegie Mellon University

Title
Leveraging Heterogeneity in Time-to-Event (Survival) Predictions
Abstract
Time-to-Event Regression, often referred to as Survival Analysis or Censored Regression involves learning of statistical estimators of the survival distribution of an individual given their covariates. As opposed to standard regression, survival analysis is challenging as it involves accounting for outcomes censored due to loss of follow up. This circumstance is common in, e.g., bio-statistics, predictive maintenance, and econometrics. With the recent advances in machine learning methodology, especially deep learning, it is now possible to exploit expressive representations to help model survival outcomes. My thesis contributes to this new body of work by demonstrating that problems in survival analysis often manifest inherent heterogeneity which can be effectively discovered, characterized, and modeled to learn better estimators of survival.

Heterogeneity may arise in a multitude of settings in the context of survival analysis. Some examples include heterogeneity in the form of input features or covariates (for instance, static vs. streaming, time-varying data), or multiple outcomes of simultaneous interest (more commonly referred to as competing risks). Other sources of heterogeneity involve latent subgroups that manifest different base survival rates or differential responses to an intervention or treatment. In this talk, I aim to demonstrate that carefully modelling the inherent structure of heterogeneity can boost predictive power of survival analysis models while improving their specificity and precision of estimated survival at an individual level. An overarching methodological framework of this thesis is the application of graphical models to impose inherent structure in time-to-event problems that explicitly model heterogeneity, while employing advances in deep learning to learn powerful representations of data that help leverage various aspects of heterogeneity.
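At the core of such parametric survival models is the censored log-likelihood, sketched below for a Weibull distribution: observed events contribute the log density (log hazard plus log survival), while censored subjects contribute only the log survival. A latent-subgroup model as discussed in the talk would mix several such distributions; the tensor inputs are assumptions.

```python
# Hedged sketch of a censored Weibull negative log-likelihood.
import torch

def weibull_neg_log_likelihood(time, event, scale, shape):
    # time: observed or censoring time; event: 1 if the event was observed, 0 if censored.
    # scale, shape: positive values (scalar or per-subject), e.g. outputs of a network.
    time = torch.as_tensor(time, dtype=torch.float32)
    event = torch.as_tensor(event, dtype=torch.float32)
    scale = torch.as_tensor(scale, dtype=torch.float32)
    shape = torch.as_tensor(shape, dtype=torch.float32)
    log_hazard = torch.log(shape / scale) + (shape - 1) * torch.log(time / scale)
    log_survival = -((time / scale) ** shape)
    return -(event * (log_hazard + log_survival) + (1 - event) * log_survival).mean()
```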
Bio
Chirag Nagpal is a final year PhD candidate at the School of Computer Science at Carnegie Mellon University. Chirag is a member of the Auton Lab and advised by Artur Dubrawski. Chirag's research interests are in novel methodology and applications of machine learning in health, especially around censored time-to-event outcomes and counterfactuals. Chirag is also extremely interested in working with complex multimodal health data like EHR, imaging and natural language. During the course of his PhD, Chirag spent summers at Google's Brain and Responsible AI, IBM Research and JPMorgan AI Research. He is the lead contributor to auton-survival, a popular package for survival analysis in python.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/13/2023) Speaker: Yasha Ektefaie

Harvard Medical School

Title
Multimodal learning with graphs for biomedical applications
Abstract
Artificial intelligence for graphs has achieved remarkable success in modelling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, the increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases — assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets is challenging because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, graph artificial intelligence methods combine different modalities while leveraging cross-modal dependencies through geometric relationships. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, specified as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph learning, use it to study existing methods and provide guidelines to design new models. This talk will focus on biomedical applications of multimodal graph learning, emphasizing how this blueprint can enable future innovation in this space.
Bio
Yasha is a PhD candidate in the Bioinformatics and Integrative Genomics program at Harvard Medical School co-advised by Marinka Zitnik and Maha Farhat. With publications in Nature Machine Intelligence, Nature Breast Cancer, and Lancet Microbe, he is interested in designing the next generation of machine learning methods to predict phenotypes from genotypes. Specifically, he is interested in understanding and improving the ability of these models to generalize to new and unseen genotypes. Before Harvard, he received a B.S. in electrical engineering and computer science (EECS) and bioengineering at UC Berkeley where he did research on designing computational methods to understand bacterial communities.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/6/2023) Speaker: Spring Break (No MedAI session)


(3/30/2023) Speaker: Spring Break (No MedAI session)


(3/23/2023) Speaker: Yuzhe Yang

MIT

Title
Using AI to Diagnose and Assess Parkinson's Disease: Challenges, Algorithms, and Applications
Abstract
There are currently no effective biomarkers for diagnosing Parkinson’s disease (PD) or tracking its progression. In this talk, I will present an artificial intelligence (AI) model to detect PD and track its progression from nocturnal breathing signals. I’ll first discuss the background, problem setup, and general challenges and principles for designing such AI models in the wild for health applications: sparse supervision, data imbalance, and distribution shift. I’ll present the most general form of each principle before providing concrete instantiations of using each in practice. This will include a simple multi-task learning method that incorporates health domain knowledge and aids AI model interpretation, a framework for learning from imbalanced data with continuous targets (http://dir.csail.mit.edu), and an algorithm that enables learning from multi-domain imbalanced data as well as imbalanced domain generalization with theoretical guarantees. Finally, I will conclude with implications and applications of using AI to advance digital medicine and other real-world applications.
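As a rough illustration of the imbalanced-continuous-target problem mentioned above, the sketch below applies a label-distribution-smoothing-style reweighting: the empirical label density is smoothed with a Gaussian kernel and each sample is weighted by the inverse of the smoothed density. The synthetic targets and all hyperparameters are placeholders; this shows only one ingredient of such a framework, not the full method.

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    rng = np.random.default_rng(0)
    targets = rng.gamma(2.0, 10.0, size=5000)          # skewed continuous labels (stand-in)

    # bin the label range, smooth the empirical density, reweight by inverse smoothed density
    bins = np.linspace(targets.min(), targets.max(), 101)
    counts, _ = np.histogram(targets, bins=bins)
    smoothed = gaussian_filter1d(counts.astype(float), sigma=2)
    bin_idx = np.clip(np.digitize(targets, bins) - 1, 0, len(counts) - 1)
    weights = 1.0 / np.maximum(smoothed[bin_idx], 1e-6)
    weights *= len(weights) / weights.sum()             # normalize to mean 1
    # `weights` can then scale a per-sample regression loss during training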
Bio
Yuzhe Yang is a PhD student in computer science at MIT CSAIL. He received his B.S. in EECS from Peking University. His research interests include machine learning and AI for health and medicine, with a focus on trustworthy machine learning for model fairness, robustness and generalization, as well as building innovative AI solutions to enable new understanding of human diseases, health and precision medicine. His work on an AI-enabled biomarker for Parkinson’s disease was named one of the Top Ten Notable Advances of 2022 by Nature Medicine, and one of the Top Ten Crucial Advances in Movement Disorders in 2022 by The Lancet Neurology. His research has been published in top interdisciplinary journals and AI/ML conferences including Nature Medicine, Science Translational Medicine, NeurIPS, ICML, ICLR, CVPR, ECCV, etc. His work has been recognized by the MathWorks Fellowship, the Baidu PhD Fellowship, and media coverage from MIT Tech Review, Wall Street Journal, Forbes, BBC, The Washington Post, etc.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/16/2023) Speaker: Arjun Desai, Karan Goel, Sabri Eyuboglu

Stanford University

Title
Meerkat: Interactive Data Systems For Unstructured Data & Foundation Models
Abstract
Unstructured data (e.g. images, videos, text documents, etc.) are ubiquitous in today's digital world. However, the analysis of such data using traditional data science tools can be quite challenging. Foundation models (FMs) have shown that they can extract semantically meaningful information from diverse types of unstructured data, but they can be imprecise, brittle, and difficult to control. We’re excited to introduce Meerkat, a Python library that teams can use to interactively wrangle their unstructured data with foundation models. In this talk, we will explore how Meerkat makes it easier to work with unstructured data and FMs, learn how to build user interfaces with Meerkat, and dive into potential applications in healthcare.
Interested in building with Meerkat? Check out our website and join our Discord!
Bio
Arjun Desai is a 4th-year EE PhD student working with Akshay Chaudhari and Chris Re. He is interested in how signal processing principles can improve robustness, efficiency, and scalability in machine learning. He is excited about how these methods can help build scalable deployment and validation systems for challenging applications in healthcare and the sciences.
Karan Goel is a 5th year CS PhD student at Stanford advised by Chris Ré. He is interested in sequence modeling techniques for building large-scale foundation models, as well as problems that arise due to the deployment of ML models to practice. As part of the Meerkat project, he thinks about how the application of FMs will change systems for data science and engineering. He is a recipient of the Siebel Foundation Scholarship.
Sabri Eyuboglu is a 3rd year PhD student advised by Chris Re and James Zou. He is broadly interested in how we can bring machine learning to bear in challenging applied settings like medicine and the sciences. To that end, he’s recently been working on data management tools that help practitioners better understand their data. He is supported by the National Science Foundation GRFP.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/9/2023) Speaker: Houcemeddine Turki

University of Sfax, Tunisia

Title
Empowering Biomedical Informatics with Open Resources - Unleashing the Potential of Data and Tools
Abstract
Biomedical informatics is a highly interdisciplinary field that has gained increasing importance in recent years. Combining computer science, statistics, and information science with clinical and basic biomedical research, it seeks to leverage data-driven approaches to solve healthcare challenges and accelerate biomedical discovery. With the explosion of biomedical data generated from various sources such as genomic sequencing, medical imaging, electronic health records, and wearable devices, the field has witnessed significant growth in recent years. One of the key drivers of this growth is the increasing availability of open resources, such as open-access data repositories and software tools. These resources have democratized access to biomedical data and tools, making it easier for researchers to conduct biomedical informatics research and drive scientific discoveries.
PubMed, an open-access bibliographic database maintained by the National Center for Biotechnology Information (NCBI), offers researchers access to millions of citations for biomedical literature, including links to full-text articles and MeSH (Medical Subject Headings) terms for indexing and searching. These resources can be used to develop and validate computational models and algorithms, enabling researchers to study different aspects of diseases and therapies. Wikidata, a free and open knowledge graph maintained by the Wikimedia Foundation, is another critical open resource that is driving biomedical informatics. With Wikidata, researchers can access and contribute to a structured and linked database of biomedical information, including genes, diseases, and drugs. The Open Biomedical Ontologies Foundry (OBO Foundry) is an initiative to develop and maintain a library of interoperable, open-source ontologies for biomedical research. These ontologies provide a common framework for annotating and integrating biomedical data from different sources, enabling data sharing and collaboration.
In this presentation, I will share our way of reusing open resources, particularly NCBI Resources, Wikidata, and the Python Package Index. I will highlight the advantages and challenges of using open resources in biomedical informatics research, share success stories, and discuss the potential of these resources in accelerating scientific discovery. Finally, I will provide some insights into the future of open resources in biomedical informatics, emphasizing the need for greater collaboration and data sharing to tackle the pressing challenges in healthcare.
Bio
Houcemeddine Turki is a medical student at the University of Sfax, Tunisia, and a research assistant at the Data Engineering and Semantics Research Unit, University of Sfax, Tunisia. His research spans several interdisciplinary fields, including Biomedical Informatics, Library and Information Science, Semantic Technologies, Open Science, and Applied Linguistics. With a passion for open science, he is a dedicated contributor to Wikimedia Projects, particularly Wikipedia and Wikidata, and has held leadership roles within the Wikimedia community, including Vice-Chair of the Wikimedia Tunisia User Group (2020-2022), Board member of the Wikimedia and Libraries User Group (2019-2021), and member and advisor to the Affiliations Committee of the Wikimedia Foundation (2021-2023). Houcemeddine has played an active role in organizing several Wikimedia community conferences, such as Wikimania (2022) and WikiIndaba Conference (2018, 2020, and 2021), and is committed to advancing the mission of open access and open knowledge sharing for all.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/2/2023) Speaker: No MedAI session


(2/23/2023) Speaker: Karan Singhal

Google Research

Title
Large Language Models Encode Clinical Knowledge
Abstract
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of models today, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.
Bio
Karan co-leads teams at Google Research working on medical AI, foundation models, representation learning, and federated learning. He is broadly interested in developing and validating techniques that lead to wider adoption of safe, beneficial AI. Prior to joining Google, he received an MS and BS in Computer Science from Stanford University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/16/2023) Speaker: Sajjad Fouladvand

Stanford University

Title
A Comparative Effectiveness Study on Opioid Use Disorder Prediction Using Artificial Intelligence and Existing Risk Models
Abstract
Opioid use disorder (OUD) is a leading cause of death in the United States, placing a tremendous burden on patients, their families, and health care systems. Artificial intelligence (AI) can be harnessed with available healthcare data to produce automated OUD prediction tools. In this retrospective study, we developed AI-based models for OUD prediction and showed that AI can predict OUD more effectively than existing clinical tools, including the unweighted opioid risk tool (ORT). On 100 randomly selected test sets including 47,396 patients, our proposed transformer-based AI model predicts OUD more accurately (AUC=0.742±0.021) than logistic regression (AUC=0.651±0.025), random forest (AUC=0.679±0.026), XGBoost (AUC=0.690±0.027), a long short-term memory model (AUC=0.706±0.026), a transformer (AUC=0.725±0.024), and the unweighted ORT model (AUC=0.559±0.025). Our results show that embedding AI algorithms into clinical care may assist clinicians in risk stratification and management of patients receiving opioids.
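The reported AUC mean ± standard deviation over 100 random test sets corresponds to an evaluation loop like the hedged sketch below (synthetic data generated with scikit-learn; the actual study uses clinical features and the models listed above).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)  # stand-in cohort
    aucs = []
    for seed in range(100):  # 100 random test sets, mirroring the evaluation protocol
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed, stratify=y)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        aucs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    print(f"AUC = {np.mean(aucs):.3f} ± {np.std(aucs):.3f}")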
Bio
Sajjad Fouladvand, PhD, MSc is a postdoctoral scholar at Stanford Center for Biomedical Informatics Research. His research career thus far has been focused on developing and applying artificial intelligence (AI) algorithms to solve real-world healthcare problems. Prior to Stanford, he worked at the Institute for Biomedical Informatics at the University of Kentucky (UK) while completing his PhD in Computer Science. During this time, he also received training at Mayo Clinic’s Department of Artificial Intelligence and Informatics as an intern. At Stanford, Dr. Fouladvand is involved in conducting AI and healthcare data science research in close collaboration with clinicians, scientists, and healthcare systems with access to deep clinical data warehouses and broad population health data sources.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/9/2023) Speaker: Tiange Xiang

Stanford University

Title
Denoising diffusion models for denoising diffusion MRI
Abstract
Magnetic resonance imaging (MRI) is a common and life-saving medical imaging technique. However, acquiring high signal-to-noise ratio MRI scans requires long scan times, resulting in increased costs and patient discomfort, and decreased throughput. Thus, there is great interest in denoising MRI scans, especially for the subtype of diffusion MRI scans that are severely SNR-limited. While most prior MRI denoising methods are supervised in nature, acquiring supervised training datasets for the multitude of anatomies, MRI scanners, and scan parameters proves impractical. In this work, we present Denoising Diffusion Models for Denoising Diffusion MRI (DDM^2), a self-supervised denoising method for MRI denoising using diffusion denoising generative models. Our three-stage framework integrates statistic-based denoising theory into diffusion models and performs denoising through conditional generation. During inference, we represent input noisy measurements as a sample from an intermediate posterior distribution within the diffusion Markov chain.
Bio
Tiange Xiang is a first-year CS Ph.D. student in the Stanford Vision and Learning Lab (SVL) at Stanford University. He received his Bachelor's degree from The University of Sydney under the supervision of Prof. Weidong Cai, where he was awarded Honors Class I and The University Medal. His research interests are machine learning and computer vision.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/2/2023) Speaker: Sanmi Koyejo

Stanford University

Title
Diagnosing failures of fairness transfer across distribution shift in real-world medical settings
Abstract
Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is encountering in practice. In this work, we adopt a causal framing to motivate conditional independence tests as a key tool for characterizing distribution shifts. Using our approach in two medical applications, we show that this knowledge can help diagnose failures of fairness transfer, including cases where real-world shifts are more complex than is often assumed in the literature. Based on these results, we discuss potential remedies at each step of the machine learning pipeline.
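As a simplified illustration of using conditional independence tests to characterize a shift, the sketch below runs a residualization-based permutation test for a ⟂ b | c; testing whether the label is independent of an environment indicator given the features is one way to distinguish a pure covariate shift from a more complex one. It assumes roughly linear relationships and is not the specific test battery used in this work.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def cond_indep_pvalue(a, b, c, n_perm=1000, seed=0):
        # permutation test for a ⟂ b | c via partial correlation of residuals
        rng = np.random.default_rng(seed)
        ra = a - LinearRegression().fit(c, a).predict(c)   # residual of a given c
        rb = b - LinearRegression().fit(c, b).predict(c)   # residual of b given c
        observed = abs(np.corrcoef(ra, rb)[0, 1])
        null = [abs(np.corrcoef(rng.permutation(ra), rb)[0, 1]) for _ in range(n_perm)]
        return (1 + sum(s >= observed for s in null)) / (n_perm + 1)

    rng = np.random.default_rng(1)
    c = rng.normal(size=(2000, 3))                          # features
    env = (rng.random(2000) < 0.5).astype(float)            # environment indicator
    label = c @ np.array([1.0, -1.0, 0.5]) + 0.8 * env + rng.normal(size=2000)
    print(cond_indep_pvalue(label, env, c))                 # small p-value: label depends on env given c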
Bio
Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at Stanford University. Koyejo was previously an Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research interests are in developing the principles and practice of trustworthy machine learning, focusing on applications to neuroscience and healthcare. Koyejo completed a Ph.D. in Electrical Engineering at the University of Texas at Austin, advised by Joydeep Ghosh, and postdoctoral research at Stanford University with Russell A. Poldrack and Pradeep Ravikumar. Koyejo has been the recipient of several awards, including a best paper award from the Conference on Uncertainty in Artificial Intelligence, a Skip Ellis Early Career Award, a Sloan Fellowship, a Terman faculty fellowship, an NSF CAREER award, a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping. Koyejo spends time at Google as a part of the Brain team, serves on the Neural Information Processing Systems Foundation Board and the Association for Health Learning and Inference Board, and is president of the Black in AI organization.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(1/26/2023) Speaker: Roxana Daneshjou

Stanford University

Title
AI in Dermatology - the Pitfalls and Promises
Abstract
Artificial intelligence (AI) has the potential to have significant impacts in healthcare. The field of dermatology, which involves diagnosing skin disease, is particularly ripe for innovation given the visual nature of the clinical tasks. However, potential pitfalls of AI for dermatology include biased datasets and algorithms. Addressing these pitfalls is necessary for the development of equitable algorithms that do not exacerbate existing health disparities. With fairness in mind, AI in dermatology also has the promise of streamlining the healthcare process.
Bio
Dr. Roxana Daneshjou received her undergraduate degree at Rice University in Bioengineering, where she was recognized as a Goldwater Scholar for her research. She completed her MD/PhD at Stanford, where she worked in the lab of Dr. Russ Altman. During this time, she was a Howard Hughes Medical Institute Medical Scholar and a Paul and Daisy Soros Fellowship for New Americans Fellow. She completed dermatology residency at Stanford in the research track and now practices dermatology as a Clinical Scholar in Stanford's Department of Dermatology while also conducting artificial intelligence research with Dr. James Zou as a postdoc in Biomedical Data Science. Her research interests are in developing diverse datasets and fair algorithms for applications in precision medicine.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(1/19/2023) Speaker: Fuying Wang (TIME change to 10am PST)

University of Hong Kong

Title
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Abstract
Learning medical visual representations directly from paired radiology reports has become an emerging topic in representation learning. However, existing medical image-text joint learning methods are limited to instance-level or local supervision, ignoring disease-level semantic correspondences. In this paper, we present a novel Multi-Granularity Cross-modal Alignment (MGCA) framework for generalized medical visual representation learning by harnessing the naturally exhibited semantic correspondences between medical images and radiology reports at three different levels, i.e., pathological region-level, instance-level, and disease-level. Specifically, we first incorporate an instance-wise alignment module by maximizing the agreement between image-report pairs. Further, for token-wise alignment, we introduce a bidirectional cross-attention strategy to explicitly learn the matching between fine-grained visual tokens and text tokens, followed by contrastive learning to align them. More importantly, to leverage high-level inter-subject semantic correspondences (e.g., at the disease level), we design a novel cross-modal disease-level alignment paradigm to enforce cross-modal cluster assignment consistency. Extensive experimental results on seven downstream medical image datasets covering image classification, object detection, and semantic segmentation tasks demonstrate the stable and superior performance of our framework.
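A minimal sketch of the instance-wise alignment idea (a symmetric InfoNCE-style objective over paired image and report embeddings) is shown below; the token-wise cross-attention and disease-level clustering modules of MGCA are not depicted, and the image and report encoders are assumed to exist elsewhere.

    import torch
    import torch.nn.functional as F

    def instance_alignment_loss(img_emb, txt_emb, temperature=0.07):
        # matched image-report pairs are pulled together, unmatched pairs in the batch pushed apart
        img = F.normalize(img_emb, dim=-1)
        txt = F.normalize(txt_emb, dim=-1)
        logits = img @ txt.T / temperature                 # (batch, batch) similarity matrix
        targets = torch.arange(img.size(0), device=img.device)
        return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

    # img_emb, txt_emb would come from the image encoder and the report encoder
    loss = instance_alignment_loss(torch.randn(32, 256), torch.randn(32, 256))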
Bio
Fuying Wang is a second-year Ph.D. student in the Department of Statistics and Actuarial Science at the University of Hong Kong, supervised by Dr. Lequan Yu. Prior to this, he received his bachelor’s degree from Tsinghua University. His research interests span multimodal learning, self-supervised learning and interpretable AI. In particular, he is currently working on multimodal biomedical data analysis, self-supervised medical representation learning and interpretable machine learning for healthcare.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(1/12/2023) Speaker: Ali Mottaghi

Stanford University

Title
Adaptation of Surgical Activity Recognition Models Across Operating Rooms
Abstract
Automatic surgical activity recognition enables more intelligent surgical devices and a more efficient workflow. Integration of such technology in new operating rooms has the potential to improve care delivery to patients and decrease costs. Recent works have achieved promising performance on surgical activity recognition; however, the lack of generalizability of these models is one of the critical barriers to the wide-scale adoption of this technology. In this work, we study the generalizability of surgical activity recognition models across operating rooms. We propose a new domain adaptation method to improve the performance of the surgical activity recognition model in a new operating room for which we only have unlabeled videos. Our approach generates pseudo labels for the unlabeled video clips it is confident about and trains the model on augmented versions of those clips. We extend our method to a semi-supervised domain adaptation setting where a small portion of the target domain is also labeled. In our experiments, our proposed method consistently outperforms the baselines on a dataset of more than 480 long surgical videos collected from two operating rooms.
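A hedged sketch of the confidence-thresholded pseudo-labeling idea is shown below; `model`, `augment`, and the threshold are placeholders rather than the paper's implementation.

    import torch
    import torch.nn.functional as F

    def pseudo_label_step(model, optimizer, unlabeled_clips, augment, threshold=0.9):
        # keep only confidently pseudo-labeled clips from the target operating room
        model.eval()
        with torch.no_grad():
            probs = F.softmax(model(unlabeled_clips), dim=-1)
            conf, pseudo = probs.max(dim=-1)
            keep = conf > threshold
        if keep.sum() == 0:
            return None
        # train on augmented versions of the confident clips
        model.train()
        logits = model(augment(unlabeled_clips[keep]))
        loss = F.cross_entropy(logits, pseudo[keep])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()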
Bio
Ali Mottaghi is a 5th-year Ph.D. student working on AI in healthcare. He is advised by Serena Yeung in the Medical AI and Computer Vision Lab (MARVL). Ali is mainly interested in developing new methods for more data-efficient learning. He has previously worked at Intuitive Surgical and Curai Health.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(12/8/2022 to 1/5/2023) Speaker: Winter Break - Happy Holidays :)


(12/1/2022) Speaker: Julian Acosta

Yale University

Title
Multimodal Biomedical Artificial Intelligence - challenges and opportunities
Abstract
The increasing availability of biomedical data from large biobanks, electronic health records, medical imaging, wearable and ambient biosensors, and the lower cost of genome and microbiome sequencing have set the stage for the development of multimodal artificial intelligence solutions that capture the complexity of human health and disease. In this talk, we will discuss key applications enabled by multimodal AI in health, along with the challenges we need to overcome to achieve its potential.
Bio
Julian N. Acosta, MD, trained as a neurologist at the Fleni institute in Argentina before joining Yale University as a postdoctoral fellow in 2019, where his research focused on population genetics and advanced neuroimaging in neurovascular disease. He has also collaborated with Dr. Rajpurkar's Medical AI lab on multiple projects related to the application of artificial intelligence in healthcare, and he currently works as a Clinical Data Scientist at Rad AI Inc.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/25/2022) Speaker: Thanksgiving Break (No MedAI Session)


(11/17/2022) Speaker: Ramprasaath Selvaraju

Artera

Title
Explaining Model Decisions and Fixing Them through Human Feedback
Abstract
In this talk, I will focus on how we can build algorithms that provide explanations for decisions emanating from deep networks in order to build user trust, incorporate domain knowledge into AI, learn grounded representations, and correct for any unwanted biases learned by our AI models.
Bio
Ramprasaath is a Sr. Machine Learning Scientist at Artera. Prior to this, he was a Sr. Research Scientist at Salesforce. He completed his Ph.D. in Computer Science at the Georgia Institute of Technology, advised by Devi Parikh. He works at the intersection of machine learning, computer vision & language, explainable AI and, more recently, medical AI. He has held visiting positions at Brown University, Oxford University, Meta, Samsung, Microsoft and Tesla Autopilot. He has a Bachelor’s degree in Electrical and Electronics Engineering and a Master's in Physics from BITS-Pilani.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/10/2022) Speaker: Adriel Saporta

Courant Institute at NYU (New York University)

Title
Benchmarking saliency methods for chest X-ray interpretation
Abstract
Saliency methods, which produce heat maps that highlight the areas of the medical image that influence model prediction, are often presented to clinicians as an aid in diagnostic decision-making. However, rigorous investigation of the accuracy and reliability of these strategies is necessary before they are integrated into the clinical setting. In this work, we quantitatively evaluate seven saliency methods, including Grad-CAM, across multiple neural network architectures using two evaluation metrics. We establish the first human benchmark for chest X-ray segmentation in a multilabel classification set-up, and examine under what clinical conditions saliency maps might be more prone to failure in localizing important pathologies compared with a human expert benchmark. We find that (1) while Grad-CAM generally localized pathologies better than the other evaluated saliency methods, all seven performed significantly worse compared with the human benchmark, (2) the gap in localization performance between Grad-CAM and the human benchmark was largest for pathologies that were smaller in size and had shapes that were more complex, and (3) model confidence was positively correlated with Grad-CAM localization performance. Our work demonstrates that several important limitations of saliency methods must be addressed before we can rely on them for deep learning explainability in medical imaging.
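For reference, Grad-CAM itself can be computed with a pair of forward/backward hooks, as in the standard sketch below (a generic torchvision ResNet with random weights stands in for the chest X-ray classifiers evaluated in this work).

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    model = resnet18(weights=None).eval()
    feats, grads = {}, {}
    layer = model.layer4[-1]                                  # last convolutional block
    layer.register_forward_hook(lambda m, i, o: feats.update(value=o.detach()))
    layer.register_full_backward_hook(lambda m, gi, go: grads.update(value=go[0].detach()))

    x = torch.randn(1, 3, 224, 224)                           # stand-in image
    logits = model(x)
    logits[0, logits.argmax()].backward()                     # gradient of the top class score

    weights = grads["value"].mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
    cam = F.relu((weights * feats["value"]).sum(dim=1))       # weighted sum of feature maps
    cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalized heat map in [0, 1]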
Bio
Adriel Saporta is a PhD candidate in Computer Science at the Courant Institute at NYU, where she is advised by Professor Rajesh Ranganath and is a DeepMind Scholar. Her research interests are at the intersection of AI and health, and she co-hosts The AI Health Podcast with Harvard's Professor Pranav Rajpurkar. Previously, Adriel conducted research on Apple’s Health AI team and in Dr. Andrew Ng’s Stanford Machine Learning Group. She has held engineering and product roles across both big tech (Apple, Amazon) and start-ups (SeatGeek, Common). She holds an MBA from the Stanford Graduate School of Business, an MS in Computer Science from Stanford University, and a BA in Comparative Literature from Yale University. Born and raised in Brooklyn, Adriel is half-Cuban and half-Greek.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/3/2022) Speaker: Christian Bluethgen & Pierre Chambon

Stanford University

Title
Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains
Abstract
Multi-modal foundation models are typically trained on millions of pairs of natural images and text captions. Although such models exhibit excellent generative capabilities, they do not typically generalize well to specific domains such as medical images, whose distributions are fundamentally shifted relative to natural images. Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets. To investigate the capacity of a large pretrained latent diffusion model (Stable Diffusion) to generate medical domain-specific images, we explored the main components of the Stable Diffusion pipeline (the variational autoencoder, the U-Net and the text encoder) to fine-tune the model to generate chest X-rays, and evaluated the results on quantitative and qualitative levels. Our best-performing model can be text-conditioned to insert realistic-looking abnormalities like pleural effusions into synthetic radiological images, while maintaining high accuracy under a classifier trained to detect the abnormality on real images.
Bio
Christian is a physician-scientist, a radiologist with a clinical focus on thoracic imaging and currently a postdoctoral research fellow at the Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI). At the AIMI Center, he works on the application of deep learning for the detection, diagnosis and monitoring of interstitial lung disease and other thoracic pathologies, using large multi-modal imaging and EHR datasets. Before joining AIMI, he worked as a radiologist at the University Hospital Zurich, where he created computational simulations to visualize lung ultrasound wave propagation, developed NLP models for the classification of radiology reports, used radiomics for the differentiation of thymic neoplasms and deep learning for fracture detection and localization. In Zurich, he initiated the seminar series “Applied Machine Learning in Diagnostic Imaging” (currently in its fifth year), which brings together radiologists, researchers and clinicians from other medical specialties, and industry.
Pierre is an ML researcher who recently completed a master's degree at Stanford ICME and will soon complete a second master's at Ecole Centrale Paris in France, with a focus on mathematical and computational methods for machine learning and deep learning. He has been involved at the AIMI center for the last two years, where he developed NLP methods for domain-specific applications to radiology as well as multimodal tasks alongside vision models. As part of the MIDRC initiative, he worked on classification tasks in data- and compute-constrained settings, as well as a text de-identification tool useful for the broad sharing of medical notes within and between institutions. More recently, he tackled both image-to-text and text-to-image tasks, leading to models that can generate synthetic radiology images and reports, hopefully useful to other machine learning applications in radiology.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/27/2022) Speaker: David Ouyang

Cedars-Sinai Medical Center

Title
Development to Deployment of Cardiovascular AI
Abstract
Computer vision has advanced tremendously over the last decade, with performance of deep learning algorithms surpassing previous paradigms of image identification and segmentation. Cardiovascular care relies on precise detection, interpretation, and measurement of cardiovascular imaging, and could benefit from automation. In this talk, I will describe opportunities to explore cardiovascular medicine and its current challenges through the application of AI models and eventual assessment in clinical trials and deployment in clinical care.
Bio
David is a cardiologist and researcher in the Department of Cardiology and Division of Artificial Intelligence in Medicine at Cedars-Sinai Medical Center. As a physician-scientist and statistician focused on cardiology and cardiovascular imaging, he works on applications of deep learning, computer vision, and the statistical analysis of large datasets within cardiovascular medicine. As a clinical echocardiographer, he works on applying deep learning for precision phenotyping in cardiac ultrasound. Additionally, he is interested in multi-modal datasets, linking ECG, echo, and MRI data for a holistic look at cardiovascular disease. He majored in statistics at Rice University, obtained his MD at UCSF, and received post-graduate medical education in internal medicine and cardiology, followed by a postdoc in computer science and biomedical data science at Stanford University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/20/2022) Speaker: Break - Speaker Nominations + Feedback Forms


(10/13/2022) Speaker: Break - Speaker Nominations + Feedback Forms


(10/6/2022) Speaker: Louis Blankemeier

Stanford University

Title
Opportunistic Incidence Prediction of Multiple Chronic Diseases from Abdominal CT Imaging Using Multi-Task Learning
Abstract
Opportunistic computed tomography (CT) analysis is a paradigm where CT scans that have already been acquired for routine clinical questions are reanalyzed for disease prognostication, typically aided by machine learning. While such techniques for opportunistic use of abdominal CT scans have been implemented for assessing the risk of a handful of individual disorders, their prognostic power in simultaneously assessing multiple chronic disorders has not yet been evaluated. In this retrospective study of 9,154 patients, we demonstrate that we can effectively assess 5-year incidence of chronic kidney disease (CKD), diabetes mellitus (DM), hypertension (HT), ischemic heart disease (IHD), and osteoporosis (OST) using single already-acquired abdominal CT scans. We demonstrate that a shared multi-planar CT input, consisting of an axial CT slice occurring at the L3 vertebral level, as well as carefully selected sagittal and coronal slices, enables accurate future disease incidence prediction. Furthermore, we demonstrate that casting this shared CT input into a multi-task approach is particularly valuable in the low-label regime. With just 10% of labels for our diseases of interest, we recover nearly 99% of fully supervised AUROC performance, representing an improvement over single-task learning.
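A minimal sketch of the multi-task formulation is shown below: a shared backbone over the CT input feeds one binary 5-year-incidence head per disease, and a per-task mask lets partially labeled patients still contribute, which is the mechanism behind the gains in the low-label regime. The backbone, feature dimension, and multi-planar input handling are placeholders, not the study's architecture.

    import torch
    import torch.nn as nn

    DISEASES = ["CKD", "DM", "HT", "IHD", "OST"]

    class MultiTaskHead(nn.Module):
        def __init__(self, backbone, feat_dim):
            super().__init__()
            self.backbone = backbone                               # shared CT encoder
            self.heads = nn.ModuleDict({d: nn.Linear(feat_dim, 1) for d in DISEASES})

        def forward(self, x):
            z = self.backbone(x)                                   # shared representation
            return {d: head(z).squeeze(-1) for d, head in self.heads.items()}

    def multitask_loss(outputs, labels, mask):
        # labels/mask: dicts of (batch,) tensors; mask marks which labels are available
        bce = nn.BCEWithLogitsLoss(reduction="none")
        losses = [(bce(outputs[d], labels[d]) * mask[d]).sum() / mask[d].sum().clamp(min=1)
                  for d in DISEASES]
        return torch.stack(losses).mean()

    # toy usage with a stand-in backbone
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU())
    out = MultiTaskHead(backbone, feat_dim=128)(torch.randn(8, 3, 64, 64))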
Bio
Louis Blankemeier is a PhD student in electrical engineering at Stanford in the machine intelligence in medical imaging group led by Professor Akshay Chaudhari. His research focuses on reusing the vast amounts of patient data, often collected to answer specific clinical questions, to automatically screen for unrelated indications, a paradigm referred to as opportunistic screening. He has also spent time working on medical AI in industry at GE Healthcare and Microsoft Health AI. He holds a master’s in electrical engineering from Stanford and bachelor’s degrees in physics and electrical engineering from the University of Southern California.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/29/2022) Speaker: Ethan Steinberg

Stanford University

Title
Language Models Are An Effective Representation Learning Technique For Electronic Health Record Data
Abstract
The widespread adoption of electronic health records (EHRs) has fueled the development of using machine learning to build prediction models for various clinical outcomes. However, this process is often constrained by having a relatively small number of patient records for training the model. We demonstrate that using patient representation schemes inspired from techniques in natural language processing can increase the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training a specific model, where only a subset of the population is relevant. Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines, with the average improvement rising to 19% when only a small number of patient records are available for training the clinical prediction model. Paper Linked Here
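The transfer recipe described above can be pictured as in the sketch below: a population-level pretrained encoder (here just a hypothetical `embed` callable) produces fixed-length patient representations, and a lightweight task model is fit on the small labeled subset. This illustrates the workflow only, not the paper's implementation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    def fit_task_head(embed, records_small, labels_small, records_test, labels_test):
        # embed: pretrained patient-timeline encoder returning a fixed-length vector (placeholder)
        X_tr = np.stack([embed(r) for r in records_small])   # small labeled training cohort
        X_te = np.stack([embed(r) for r in records_test])
        clf = LogisticRegression(max_iter=1000).fit(X_tr, labels_small)
        return roc_auc_score(labels_test, clf.predict_proba(X_te)[:, 1])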
Bio
Ethan Steinberg is a Ph.D. student in Computer Science at Stanford University, co-advised by Jure Leskovec and Nigam Shah. He is interested in the intersection of machine learning and healthcare, and how we can better use deep learning models to improve our healthcare system.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/22/2022) Speaker: No Session - Fall Quarter Break


(9/15/2022) Speaker: Natalie Dullerud

Stanford University (prev. at University of Toronto)

Title
Fairness in representation learning - a study in evaluating and addressing fairness via subgroup disparities in deep metric learning
Abstract
Deep metric learning (DML) enables learning with less supervision through its emphasis on the similarity structure of representations. There has been much work on improving the generalization of DML in settings like zero-shot retrieval, but little is known about its implications for fairness. In this talk, we will discuss the evaluation of state-of-the-art DML methods trained on imbalanced data, and show the negative impact these representations have on minority subgroup performance when used for downstream tasks. We will first define fairness in DML through an analysis of three properties of the representation space -- inter-class alignment, intra-class alignment, and uniformity -- and propose finDML, the fairness in non-balanced DML benchmark, to characterize representation fairness. Utilizing finDML, we find that bias in DML representations propagates to common downstream classification tasks. Surprisingly, this bias is propagated even when training data in the downstream task is re-balanced. To address this problem, we present Partial Attribute De-correlation (PARADE) to de-correlate feature representations from sensitive attributes and reduce performance gaps between subgroups in both embedding space and downstream metrics. Beyond these salient aspects of fairness in deep metric learning, the talk will include a broader discussion of fairness metrics in representation learning, where our proposed definitions fit within that landscape, and how the use of such metrics may vary by domain.
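The representation-space properties mentioned above echo the widely used alignment and uniformity metrics, which can be computed as in the sketch below; comparing them per subgroup is one way to surface the kind of representation-level disparities finDML characterizes. The formulas follow the standard definitions and are not necessarily the exact finDML estimators.

    import torch
    import torch.nn.functional as F

    def alignment(x, y, alpha=2):
        # x, y: L2-normalized embeddings of positive pairs (e.g., same class within a subgroup)
        return (x - y).norm(dim=1).pow(alpha).mean()

    def uniformity(x, t=2):
        # log of the average Gaussian potential over all pairs; lower = more uniform
        return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

    z = F.normalize(torch.randn(256, 128), dim=1)
    z_pos = F.normalize(z + 0.1 * torch.randn_like(z), dim=1)
    print(alignment(z, z_pos).item(), uniformity(z).item())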
Bio
Natalie Dullerud is an incoming PhD student at Stanford University and recently received her Masters from the University of Toronto. She previously graduated with a Bachelor’s degree in mathematics from the University of Southern California, with minors in computer science and chemistry. At the University of Toronto, Natalie was awarded a Junior Fellowship at Massey College, and she has completed several research internships at Microsoft Research. Natalie’s research largely focuses on machine learning with differential privacy, algorithmic fairness, and applications to clinical and biological settings.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/8/2022) Speaker: Arjun Desai

Stanford University

Title
Leveraging Physics-Based Priors for Label-Efficient, Robust MRI Reconstruction
Abstract
Deep learning has enabled improved image quality and fast inference times for various inverse problems, including accelerated MRI reconstruction. However, these models require access to large amounts of fully-sampled (labeled) data and are sensitive to clinically-pervasive distribution drifts. To tackle this challenge, we propose a family of consistency-based training strategies, which leverage physics-driven data augmentations and our domain knowledge of MRI physics to improve label efficiency and robustness to relevant distribution shifts. In this talk, we will discuss how two of these methods, Noise2Recon and VORTEX, can reduce the need for labeled data by over 10-fold and increase robustness to both physics-driven perturbations and variations in anatomy and MRI sequences & contrasts. We will also discuss how these techniques can simplify composing heterogeneous augmentation and self-supervised methods into a unified framework.
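The consistency-based idea can be sketched as follows: on unlabeled (undersampled) scans, the reconstruction of a physics-driven perturbation of the input is pulled toward the reconstruction of the unperturbed input. The `model` and `augment` functions are placeholders, and this is a simplification of Noise2Recon/VORTEX rather than their implementations.

    import torch
    import torch.nn.functional as F

    def consistency_loss(model, undersampled_kspace, augment):
        # pseudo-target: reconstruction of the unperturbed input (no gradient)
        with torch.no_grad():
            target = model(undersampled_kspace)
        # reconstruction of a physics-driven perturbation (e.g., added noise, simulated motion)
        recon_aug = model(augment(undersampled_kspace))
        return F.mse_loss(recon_aug, target)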
Bio
Arjun is a 4th-year PhD student in Electrical Engineering working with Akshay Chaudhari and Chris Ré. He is broadly interested in how we can accelerate the pace at which artificial intelligence can be used safely and at scale in healthcare. His interests lie at the intersection of signal processing and machine learning, including representation learning for multimodal data, developing data-efficient & robust machine learning methods, and designing scalable clinical deployment and validation systems for medical image acquisition and analysis. Prior to Stanford, he received his B.E. in Biomedical Engineering and Computer Science from Duke University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/1/2022) Speaker: Paul Pu Liang

Carnegie Mellon University

Title
Fundamentals of Multimodal Representation Learning - Towards Generalization and Quantification
Abstract
In recent years, the quest for artificial intelligence capable of digital, physical, and social intelligence has led to an explosion of interest in multimodal datasets and algorithms. This research area of multimodal machine learning studies the computational and theoretical foundations of learning from heterogeneous data sources. This talk studies two core challenges in multimodal learning: (1) constructing multimodal models and datasets that enable generalization across many modalities and different tasks, and (2) designing quantification methods to comprehensively understand the internal mechanics of multimodal representations and gain insights for safe real-world deployment.
In the first part, we study generalization in multimodal learning. Generalization is particularly beneficial when one modality has limited resources such as the lack of annotated data, noisy inputs, or unreliable labels, and presents a step towards processing many diverse and understudied modalities. To enable the study of generalization, we introduce MultiBench, a unified large-scale benchmark across a wide range of modalities, tasks, and research areas. Using MultiBench, we study generalization with parallel modalities, as well as in non-parallel scenarios, where we are presented with many modalities, but each task is defined only over a small subset of them.
The second part studies quantification of multimodal models via MultiViz, our recent attempt at a framework to understand the internal modeling of multimodal information and cross-modal interactions. We conclude this talk by discussing how future work can leverage these ideas to drive progress towards more general, scalable, and explainable multimodal models.
Bio
Paul Liang is a Ph.D. student in Machine Learning at CMU, advised by Louis-Philippe Morency and Ruslan Salakhutdinov. His research lies in the foundations of multimodal machine learning with applications in socially intelligent AI, understanding human and machine intelligence, natural language processing, healthcare, and education. His research is generously supported by a Facebook PhD Fellowship and a Center for Machine Learning and Health Fellowship, and has been recognized by awards at the NeurIPS 2019 workshop on federated learning and ICMI 2017. He regularly organizes courses, workshops, and tutorials on multimodal learning and was a workflow chair for ICML 2019. Website at https://www.cs.cmu.edu/~pliang/
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/25/2022) Speaker: Chima Okechukwu

Georgia Institute of Technology

Title
Lessons learned from an X-ray Body Part Classifier Competition
Abstract
We discuss technical and nontechnical learnings from participating in the UNIFESP X-ray Body Part Classifier Competition. Anyone already in or looking to enter a machine learning competition would benefit from listening to this session.
Bio
Chima Okechukwu is a Masters student in the College of Computing at Georgia Institute of Technology. He is a student under Judy Gichoya in the Healthcare Innovation and Translational Lab (HITI) at Emory. He is broadly interested in machine learning and applications in medical imaging and disease classification.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/18/2022) Speaker: Break - No session


(8/11/2022) Speaker: No session (Speaker cancellation)


(8/4/2022) Speaker: Tri Dao

Stanford University

Title
FlashAttention - Fast and Memory-Efficient Exact Attention with IO-Awareness
Abstract
Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention are quadratic in sequence length. Approximate attention methods have attempted to address this problem by trading off model quality to reduce the compute complexity, but often do not achieve wall-clock speedup. We argue that a missing principle is making attention algorithms IO-aware -- accounting for reads and writes between levels of GPU memory. We propose FlashAttention, an IO-aware exact attention algorithm that uses tiling to reduce the number of memory reads/writes between GPU high bandwidth memory (HBM) and GPU on-chip SRAM. We analyze the IO complexity of FlashAttention, showing that it requires fewer HBM accesses than standard attention, and is optimal for a range of SRAM sizes. We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method. FlashAttention trains Transformers faster than existing baselines: 15% end-to-end wall-clock speedup on BERT-large (seq. length 512) compared to the MLPerf 1.1 training speed record, 3× speedup on GPT-2 (seq. length 1K), and 2.4× speedup on long-range arena (seq. length 1K-4K). FlashAttention and block-sparse FlashAttention enable longer context in Transformers, yielding higher quality models (0.7 better perplexity on GPT-2 and 6.4 points of lift on long-document classification) and entirely new capabilities: the first Transformers to achieve better-than-chance performance on the Path-X challenge (seq. length 16K, 61.4% accuracy) and Path-256 (seq. length 64K, 63.1% accuracy). This work received the Best Paper Award at the Hardware-Aware Efficient Training Workshop at ICML 2022. Paper, Code.
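The tiling idea can be illustrated in plain PyTorch: the sketch below processes keys and values one block at a time with an online softmax (a running row-wise max and normalizer), so the full sequence-length-squared score matrix is never materialized. It reproduces exact attention numerically but deliberately ignores the GPU memory hierarchy (the HBM/SRAM management handled by FlashAttention's CUDA kernels).

    import torch

    def blocked_attention(q, k, v, block=128):
        # q, k, v: (seq_len, d)
        n, d = q.shape
        scale = d ** -0.5
        out = torch.zeros_like(q)
        row_max = torch.full((n, 1), float("-inf"))
        row_sum = torch.zeros(n, 1)
        for start in range(0, n, block):
            kb, vb = k[start:start + block], v[start:start + block]
            scores = (q @ kb.T) * scale                          # scores for this key block only
            new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
            correction = torch.exp(row_max - new_max)            # rescale previous accumulators
            p = torch.exp(scores - new_max)
            row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
            out = out * correction + p @ vb
            row_max = new_max
        return out / row_sum

    # matches standard attention up to floating-point error
    q, k, v = (torch.randn(512, 64) for _ in range(3))
    ref = torch.softmax(q @ k.T / 64 ** 0.5, dim=-1) @ v
    assert torch.allclose(blocked_attention(q, k, v), ref, atol=1e-4)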
Bio
Tri Dao is a PhD student in Computer Science at Stanford, co-advised by Christopher Re and Stefano Ermon. He works at the interface of machine learning and systems, and his research interests include sequence models with long-range memory and structured matrices for compact deep learning models. His work has received the ICML 2022 Outstanding paper runner-up award.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/28/2022) Speaker: Richard Chen

Harvard University

Title
Large Images as Long Documents - Rethinking Representation Learning in Gigapixel Pathology Images using Transformers
Abstract
Tissue phenotyping is a fundamental problem in computational pathology (CPATH) that aims at characterizing objective, histopathologic features within gigapixel whole slide images (WSIs) for cancer diagnosis, prognosis, and the estimation of response-to-treatment in patients. However, unlike natural images, whole-slide imaging is a challenging computer vision domain, as images can be as large as 150K x 150K pixels, which may be intractable for tasks such as survival prediction that entail modeling complex interactions between morphological visual concepts within the tumor microenvironment. In this talk, we present recent advances that rethink whole-slide images as long documents, adapting Transformer attention for: 1) early-based multimodal fusion with other modalities such as genomics, and 2) learning hierarchical representations that capture cells, tissue patterns, and their spatial organization in the tumor microenvironment. In equipping conventional set-based deep learning frameworks in computational pathology with Transformer attention, we present: 1) improvements over a variety of baselines on slide-level cancer subtyping and survival prediction, 2) new insights on how to perform self-supervision on high-resolution images, and 3) new tasks that shift the evaluation of slide-level tasks in CPATH from the conventional weakly-supervised regime (pixel annotations not needed) to an unsupervised regime (slide annotations not needed).
Bio
Richard Chen is a 4th year Ph.D. student at Harvard University with research interests in multimodal learning, representation learning, and their applications in solving challenging problems in biology and medicine. Previously, he obtained his B.S. / M.S. in Biomedical Engineering and Computer Science at Johns Hopkins University, and also worked as a Researcher at Apple integrating multimodal sensor streams from the iPhone and Apple Watch to measure cognitive decline. In his Ph.D., Richard is currently working on novel computer vision techniques for processing gigapixel images in computational pathology, in particular: 1) representation learning of the tumor microenvironment, 2) integrative and interpretable techniques for discovering feature correspondences between histology and genomics, and 3) rethinking deep learning approaches for WSIs via advances in NLP - with the analogy that tissue patches in a gigapixel pathology image are words in a document.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/21/2022) Speaker: Hyewon Jeong

MIT

Title
Real-Time Seizure Detection using EEG - A Comprehensive Comparison of Recent Approaches under a Realistic Setting
Abstract
Electroencephalogram (EEG) is an important diagnostic test that physicians use to record brain activity and detect seizures by monitoring the signals. There have been several attempts to detect seizures and abnormalities in EEG signals with modern deep learning models to reduce the clinical burden. However, they cannot be fairly compared against each other, as they were tested in distinct experimental settings. Moreover, some of them are not trained for real-time seizure detection, which makes them difficult to use in on-device applications. In this work, for the first time, we extensively compare multiple state-of-the-art models and signal feature extractors in a real-time seizure detection framework suitable for real-world application, using various evaluation metrics, including a new one we propose to evaluate more practical aspects of seizure detection models.
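To make the real-time setting concrete, a sketch of streaming, sliding-window inference is shown below; the sampling rate, window length, threshold, and `model` are placeholders rather than the paper's configuration.

    import collections
    import numpy as np

    def stream_detector(model, eeg_stream, sfreq=200, window_sec=4, step_sec=1, threshold=0.8):
        # score overlapping windows as samples arrive, instead of classifying
        # pre-segmented clips offline; model maps (channels, window) to a seizure probability
        window, step = int(window_sec * sfreq), int(step_sec * sfreq)
        buffer = collections.deque(maxlen=window)
        for t, sample in enumerate(eeg_stream):            # sample: (channels,) at one time point
            buffer.append(sample)
            if len(buffer) == window and (t + 1) % step == 0:
                prob = model(np.stack(buffer, axis=1))      # (channels, window)
                yield t / sfreq, prob, prob > threshold     # time (s), score, alarm flag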
Bio
Hyewon Jeong, M.D., M.S. is a Ph.D. student in Electrical Engineering and Computer Science at MIT, co-advised by Marzyeh Ghassemi and Collin Stultz. Her primary research focus has been on developing and applying machine learning methods to solve real-world clinical tasks using time-series electronic health record data and signal data. Before joining MIT, she received a B.S. in Biological Sciences, M.S. in Computer Science from KAIST, and M.D. at Yonsei University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/14/2022) Speaker: Andrew Ilyas

MIT

Title
Datamodels - Predicting Predictions from Training Data
Abstract
Machine learning models tend to rely on an abundance of training data. Yet, understanding the underlying structure of this data--and models' exact dependence on it--remains a challenge. In this talk, we will present a framework for directly modeling predictions as functions of training data. This framework, given a dataset and a learning algorithm, pinpoints--at varying levels of granularity--the relationships between train and test point pairs through the lens of the corresponding model class. Even in its most basic version, our framework enables many applications, including discovering data subpopulations, quantifying model brittleness via counterfactuals, and identifying train-test leakage. Based on joint work with Sung Min Park, Logan Engstrom, Guillaume Leclerc, and Aleksander Madry.
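In its simplest form, a datamodel is a sparse linear map from a training-subset indicator vector to a model's output on a fixed test example; the sketch below fits one on synthetic data to show the mechanics (the real framework retrains the actual learning algorithm on each subset, which is not simulated here).

    import numpy as np
    from sklearn.linear_model import Lasso

    # Hypothetical setup: for each of m retrainings, masks[i] marks which of the n training
    # points were included, and margins[i] is the resulting model's output on one test example.
    rng = np.random.default_rng(0)
    n, m = 1000, 5000
    true_weights = rng.normal(size=n) * (rng.random(n) < 0.05)   # a few influential points
    masks = (rng.random((m, n)) < 0.5).astype(float)             # random 50% training subsets
    margins = masks @ true_weights + rng.normal(scale=0.1, size=m)

    # linear datamodel: predict the test prediction directly from the training mask
    datamodel = Lasso(alpha=0.01, max_iter=5000).fit(masks, margins)
    influential = np.argsort(-np.abs(datamodel.coef_))[:10]      # most influential training points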
Bio
Andrew Ilyas is a fourth-year PhD student at MIT, advised by Aleksander Madry and Constantinos Daskalakis. His research focuses on robust and reliable machine learning, with an emphasis on the ways in which (often unintended) correlations present in training data can manifest at test-time. He is supported by an Open Philanthropy Project AI Fellowship.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/7/2022) Speaker: Ruishan Liu

Stanford University

Title
AI for Clinical Trials and Precision Medicine
Abstract
Clinical trials are the gate-keeper of medicine but can be very costly and lengthy to conduct. Precision medicine transforms healthcare but is limited by available clinical knowledge. This talk explores how AI can help both — make clinical trials more efficient and generate hypotheses for precision medicine. I will first discuss Trial Pathfinder, a computational framework that simulates synthetic patient cohorts from medical records to optimize cancer trial designs (Liu et al. Nature 2021). Trial Pathfinder enables inclusive criteria and data valuation for clinical trials, benefiting diverse patients and trial sponsors. In the second part, I will discuss how to quantify the effectiveness of cancer therapies in patients with specific mutations (Liu et al. Nature Medicine 2022). This work demonstrates how computational analysis of large real-world data generates insights, hypotheses and resources to enable precision oncology.
Bio
Ruishan Liu is a postdoctoral researcher in the Department of Biomedical Data Science at Stanford University, working with Prof. James Zou. She received her PhD in the Department of Electrical Engineering at Stanford University in 2022. She is broadly interested in the intersection of machine learning and applications in human diseases, health and genomics. Her work on Trial Pathfinder was selected as 2021 Top Ten Clinical Research Achievement and Finalist for Global Pharma Award 2021.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/30/2022) Speaker: Shibani Santurkar

Stanford University

Title
Actionably interpretable ML
Abstract
Machine learning models today attain impressive accuracy on benchmark tasks. But as we move towards deploying these models in the real world, it becomes increasingly important to verify that they not only make the right prediction, but that they do so for the right reasons. The scale and complexity of current models, however, presents a major roadblock in achieving this goal.
In this talk, I will discuss a methodology to design neural networks that are accurate, yet at the same time inherently more debuggable. As we demonstrate via numerical and human experiments, our approach yields vision and language models wherein one can more easily pinpoint learned spurious correlations, explain misclassifications, and diagnose biases.
Bio
Shibani Santurkar is a postdoctoral researcher at Stanford University with Tatsu Hashimoto, Percy Liang and Tengyu Ma. Her research revolves around developing machine learning models that can perform reliably in the real world, and characterizing the consequences if they fail to do so. Shibani received a PhD in Computer Science from MIT in 2021, where she was advised by Aleksander Mądry and Nir Shavit. Prior to that, she obtained a B.Tech and M.Tech in electrical engineering from the Indian Institute of Technology Bombay. She is a recipient of the Google Fellowship and an Open Philanthropy early-career grant.
Video
Session not recorded on request
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/23/2022) Speaker: Dylan Slack

University of California, Irvine

Title
Exposing Shortcomings and Improving the Reliability of Machine Learning Explanations
Abstract
For domain experts to adopt machine learning (ML) models in high-stakes settings such as health care and law, they must understand and trust model predictions. As a result, researchers have proposed numerous ways to explain the predictions of complex ML models. However, these approaches suffer from several critical drawbacks, such as vulnerability to adversarial attacks, instability, inconsistency, and lack of guidance about accuracy and correctness. For practitioners to safely use explanations in the real world, it is vital to properly characterize the limitations of current techniques and develop improved explainability methods. This talk will describe the shortcomings of explanations and introduce current research demonstrating how they are vulnerable to adversarial attacks. I will also discuss promising solutions and present recent work on explanations that leverage uncertainty estimates to overcome several critical explanation shortcomings.
Bio
Dylan Slack is a Ph.D. candidate at UC Irvine advised by Sameer Singh and Hima Lakkaraju and associated with UCI NLP, CREATE, and the HPI Research Center. His research focuses on developing techniques that help researchers and practitioners build more robust, reliable, and trustworthy machine learning models. In the past, he has held research internships at GoogleAI and Amazon AWS and was previously an undergraduate at Haverford College advised by Sorelle Friedler where he researched fairness in machine learning.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/16/2022) Speaker: SUMMER BREAK!


(6/9/2022) Speaker: SUMMER BREAK!


(6/2/2022) Speaker: Huaxiu Yao

Stanford University

Title
Actionable Machine Learning for Tackling Distribution Shift
Abstract
To deploy machine learning algorithms in real-world applications, we must pay attention to distribution shift. When the test distribution differs from the training distribution, there will be a substantial degradation in model performance. To tackle the distribution shift, in this talk, I will present two paradigms with some instantiations. Concretely, I will first discuss how to build machine learning models that are robust to two kinds of distribution shifts, including subpopulation shift and domain shift. I will then discuss how to effectively adapt the trained model to the test distribution with minimal labeled data. The remaining challenges and promising future research directions will also be discussed.
Bio
Huaxiu Yao is a Postdoctoral Scholar in Computer Science at Stanford University, working with Prof. Chelsea Finn. Currently, his research focuses on building machine learning models that are robust to distribution shifts. He is also passionate about applying these methods to solve real-world problems with limited data. He obtained his Ph.D. degree from Pennsylvania State University. The results of his work have been published in top-tier venues such as ICML, ICLR, NeurIPS. He organized the MetaLearn workshop at NeurIPS, the pre-training workshop at ICML, and he served as a tutorial speaker at KDD, IJCAI, and AAAI.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/26/2022) Speaker: Laura Manduchi

ETH Zurich

Title
Incorporating domain knowledge in deep generative models for weakly supervised clustering with applications to survival data
Abstract
The ever-growing amount of data and the time cost associated with its labeling have made clustering a relevant task in machine learning. Yet, in many cases, a fully unsupervised clustering algorithm might naturally find a solution that is not consistent with the domain knowledge. Additionally, practitioners often have access to prior information about the types of clusters that are sought, and a principled method to guide the algorithm towards a desirable configuration is then needed. This talk will explore how to integrate domain knowledge, in the form of pairwise constraints and survival data, in deep generative models. Leveraging side information in biomedical datasets enables exploratory analysis of complex data types, resulting in medically meaningful findings.
Bio
Laura is a PhD student in Computer Science at the Institute of Machine Learning at ETH Zürich under the supervision of Julia Vogt and Gunnar Rätsch. She is a member of the Medical Data Science group and of the ETH AI Centre. Her research lies at the interplay between probabilistic modelling and deep learning, with a focus on representation learning, deep generative models, and clustering algorithms. She is particularly interested in incorporating domain knowledge in the form of constraints and probabilistic relations to obtain preferred representations of data that are robust to biases, with applications in medical imaging and X-ray astronomy.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/19/2022) Speaker: Xiaoyuan Guo

Emory University

Title
Facilitating the Curation and Future Analysis of Annotated Medical Images Across Institutions
Abstract
Medical imaging plays a significant role in clinical applications such as detection, monitoring, diagnosis, and treatment evaluation of various conditions. Supervised deep learning approaches have been popular for medical image analysis tasks. However, training such models often requires large amounts of annotated data, which are often unavailable in the medical domain. Curating annotated data across institutions is therefore a promising way to create large-scale datasets and advance supervised learning. Nonetheless, directly sharing data is prohibited by patient privacy concerns. Without exchanging data between internal and external sources, we propose to train unsupervised anomaly detectors on the internal dataset so that they learn the clean in-distribution (ID). We then share only the trained models with external institutions, which use them to detect class-wise shifted data (i.e., out-of-distribution (OOD) data); higher anomaly scores indicate larger differences between the external data and the internal distribution. We also propose quantification methods to measure the degree of shift of the detected data and the quality of the external dataset after the shifted samples are removed. Furthermore, we design a corresponding content-based medical image retrieval method that balances intra- and inter-class variance for OOD-sensitive retrieval. The shift-identification pipeline helps detect noisy and under-represented data automatically, accelerating the curation process, while the OOD-aware image retrieval supports image annotation, querying, and future analysis of external datasets.
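A minimal sketch of the share-the-model-not-the-data idea described above, using a deliberately simple Gaussian anomaly scorer and synthetic features (the actual work trains unsupervised anomaly detectors on medical images; everything here is illustrative):

```python
import numpy as np

# The internal site fits an anomaly scorer on its in-distribution (ID) data,
# only the fitted model is shared, and the external site flags its own
# high-scoring samples as potential class-wise shift (OOD) data.

def fit_gaussian_scorer(features):
    """Fit a simple diagonal Gaussian model of ID features."""
    mu = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6
    return mu, var

def anomaly_score(features, mu, var):
    # Squared, variance-normalized distance to the ID mean; higher = more anomalous.
    return (((features - mu) ** 2) / var).sum(axis=1)

rng = np.random.default_rng(0)
internal = rng.normal(0.0, 1.0, size=(500, 32))               # internal ID features
external = np.vstack([rng.normal(0.0, 1.0, size=(80, 32)),    # external ID-like data
                      rng.normal(3.0, 1.0, size=(20, 32))])   # plus a shifted subset

mu, var = fit_gaussian_scorer(internal)                        # only (mu, var) are shared
scores = anomaly_score(external, mu, var)
threshold = np.quantile(anomaly_score(internal, mu, var), 0.95)
print("Flagged as shifted:", int((scores > threshold).sum()), "of", len(external))
```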
Bio
Xiaoyuan Guo is a Computer Science PhD student at Emory University, working with Prof. Imon Banerjee, Prof. Hari Trivedi and Prof. Judy Wawira Gichoya. Her primary research interests are computer vision and medical image processing. She is experienced in medical image segmentation, out-of-distribution detection, image retrieval, unsupervised learning, etc.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/12/2022) Speaker: Ramon Correa

Arizona State University

Title
A review of Fair AI model development for image classification and prediction
Abstract
Artificial Intelligence (AI) models have demonstrated expert-level performance in image-based recognition and diagnostic tasks, resulting in increased adoption and FDA approvals for clinical applications. The new challenge in AI is to understand the limitations of models in order to reduce potential harm. In particular, unknown disparities based on demographic factors could encode existing inequalities into models, worsening patient care for some groups. In this talk, we will discuss techniques to improve model fairness for medical imaging applications alongside their limitations.
Bio
Ramon Correa is a Ph.D. student in ASU’s Data Science, Analytics, and Engineering program. His research interest involves studying model debiasing techniques. Previously, he completed his undergraduate studies at Case Western Reserve University, majoring in Biomedical Engineering.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/5/2022) Speaker: Nandita Bhaskhar

Stanford University

Title
Beyond Test Set Performance - Rethinking Generalization Strategies for Clinical Deployment
Abstract
Artificial Intelligence (AI) and Deep Learning (DL) have seen tremendous successes across various domains in medicine. However, most of these successes have been limited to academic research, with performance validated on siloed datasets. Real-world deployment of deep learning models in clinical practice is rare. In this talk, I will discuss, in a journal club format, several studies and papers that demonstrate the challenges and risks of directly deploying current-day models to the clinic. I will then lead a discussion of strategies and recommendations for developing an evaluation framework and monitoring system to make our models suitable for deployment.
Bio
Nandita Bhaskhar (see website) is a PhD student in the Department of Electrical Engineering at Stanford University advised by Daniel Rubin. She is broadly interested in developing machine learning methodology for medical applications. Her current research focuses on (i) building label-efficient models through observational supervision and self-supervision for leveraging unlabelled medical data and (ii) developing strategies for reliable model deployment by assessing, quantifying and enhancing model trust, robustness to distribution shifts, etc. Prior to Stanford, she received her B.Tech in Electronics Engineering from the Indian Institute of Information Technology, IIIT, with the highest honours.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/28/2022) Speaker: Petar Stojanov

Broad Institute

Title
Domain Adaptation with Invariant Representation Learning - What Transformations to Learn?
Abstract
Unsupervised domain adaptation, as a prevalent transfer learning setting, spans many real-world applications. With the increasing representational power and applicability of neural networks, state-of-the-art domain adaptation methods make use of deep architectures to map the input features X to a latent representation Z that has the same marginal distribution across domains. This has been shown to be insufficient for generating an optimal representation for classification, and strong assumptions are usually needed to find conditionally invariant representations. We provide reasoning for why, when the supports of the source and target data do not overlap, any map of X that is fixed across domains may not be suitable for domain adaptation via invariant features. Furthermore, we develop an efficient technique in which the optimal map from X to Z also takes domain-specific information as input, in addition to the features X. By using the property of minimal changes of causal mechanisms across domains, our model also takes the domain-specific information into account to ensure that the latent representation Z does not discard valuable information about Y. We demonstrate the efficacy of our method via synthetic and real-world data experiments. The code is available at https://github.com/DMIRLAB-Group/DSAN.
Bio
Petar (website) is a postdoctoral researcher at the Broad Institute of MIT and Harvard, where he is supervised by Prof. Gad Getz and Prof. Caroline Uhler. He received his PhD in Computer Science at Carnegie Mellon University, where he was fortunate to be advised by Prof. Jaime Carbonell and Prof. Kun Zhang. Prior to that, he was an associate computational biologist at the Getz Lab. His research interests span machine learning and computational biology. He is currently very interested in applying causal discovery methodology to improve genomic analysis of cancer mutation and single-cell RNA sequencing data, with the goal of understanding relevant causal relationships in cancer progression. His doctoral research was in transfer learning and domain adaptation from the causal perspective, a field in which he is still interested and active.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/21/2022) Speaker: Albert Gu

Stanford University

Title
Efficiently Modeling Long Sequences with Structured State Spaces
Abstract
A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies. Although conventional models including RNNs, CNNs, and Transformers have specialized variants for capturing long dependencies, they still struggle to scale to very long sequences of 10000 or more steps. This talk introduces the Structured State Space sequence model (S4), a simple new model based on the fundamental state space representation $x'(t) = Ax(t) + Bu(t),\ y(t) = Cx(t) + Du(t)$. S4 combines elegant properties of state space models with the recent HiPPO theory of continuous-time memorization, resulting in a class of structured models that handles long-range dependencies mathematically and can be computed very efficiently. S4 achieves strong empirical results across a diverse range of established benchmarks, particularly for continuous signal data such as images, audio, and time series.
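For reference, the continuous-time state space model above and the bilinear discretization commonly used in the HiPPO/S4 line of work can be written as follows (the step size $\Delta$ is a model parameter; this is a summary of standard notation, not material from the talk itself):

```latex
\dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t)

\bar{A} = \Bigl(I - \tfrac{\Delta}{2}A\Bigr)^{-1}\Bigl(I + \tfrac{\Delta}{2}A\Bigr), \qquad
\bar{B} = \Bigl(I - \tfrac{\Delta}{2}A\Bigr)^{-1}\Delta B

x_k = \bar{A}\,x_{k-1} + \bar{B}\,u_k, \qquad y_k = C\,x_k + D\,u_k
```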
Bio
Albert Gu is a final year Ph.D. candidate in the Department of Computer Science at Stanford University, advised by Christopher Ré. His research broadly studies structured representations for advancing the capabilities of machine learning and deep learning models, with focuses on structured linear algebra, non-Euclidean representations, and theory of sequence models. Previously, he completed a B.S. in Mathematics and Computer Science at Carnegie Mellon University, and an internship at DeepMind in 2019.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/14/2022) Speaker: Sabri Eyuboglu

Stanford University

Title
Discovering Systematic Errors with Domino
Abstract
Machine learning models that achieve high overall accuracy often make systematic errors on coherent slices of validation data. In this talk, I introduce Domino, a new approach for discovering these underperforming slices. I also discuss a new framework for quantitatively evaluating methods like Domino.
Bio
Sabri is a 2nd Year CS PhD Student in the Stanford Machine Learning Group co-advised by Chris Ré and James Zou. He’s broadly interested in methods that make machine learning systems more reliable in challenging applied settings. To that end, he’s recently been working on tools that help practitioners better understand the interaction between their models and their data.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/7/2022) Speaker: No session this week -- Spring Break!


(3/31/2022) Speaker: Max Lu

MIT & Harvard Medical School

Title
Weakly-supervised, large-scale computational pathology for diagnosis and prognosis
Abstract
In this talk, I will outline a general framework for developing interpretable diagnostic and prognostic machine learning models based on digitized histopathology slides. Our method does not require manual annotation of regions of interest and can be easily scaled to tens of thousands of samples. Examples of application range from cancer subtyping and prognosis to predicting the primary origins of metastatic tumors.
Bio
Max is a 1st year Computer Science PhD student at MIT advised by Dr. Faisal Mahmood, currently interested in computational pathology and spatial biology. He obtained his B.S. degree in biomedical engineering and applied math and statistics from Johns Hopkins University. Before starting his PhD, his research primarily focused on developing machine learning algorithms for large scale quantitative analysis of digital histopathology slides.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/24/2022) Speaker: Karan Singhal

Google Research

Title
Generalization and Personalization in Federated Learning with Connections to Medical AI
Abstract
Karan will present two recent works - "What Do We Mean by Generalization in Federated Learning?" (to appear at ICLR 2022, paper) and "Federated Reconstruction - Partially Local Federated Learning" (presented at NeurIPS 2021, paper, blog post). He'll give an overview of federated learning, discuss how we might think about generalization when we have multiple local data distributions, and provide an example of a method that improves final generalization to new data distributions. Throughout the talk, he'll connect the works to medical AI by discussing generalization to unseen patients and hospitals, in both federated and standard centralized settings.
Bio
Karan leads a team of engineers and researchers at Google Research working on representation learning and federated learning, with applications in medical AI. He is broadly interested in developing and validating techniques that lead to wider adoption of AI that benefits people. Prior to joining Google, he received an MS and BS in Computer Science from Stanford University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/17/2022) Speaker: Mikhail Khodak

Carnegie Mellon University

Title
Federated Hyperparameter Tuning - Challenges, Baselines, and Connections to Weight-Sharing
Abstract
Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investigate the problem of federated hyperparameter tuning. We first identify key challenges and show how standard approaches may be adapted to form baselines for the federated setting. Then, by making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx, to accelerate federated hyperparameter tuning that is applicable to widely-used federated optimization methods such as FedAvg and recent variants. Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization across devices. Empirically, we show that FedEx can outperform natural baselines for federated hyperparameter tuning by several percentage points on the Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using the same training budget.
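As a loose, synthetic illustration of the exponentiated-gradient flavor of this idea (a hypothetical simplification, not the FedEx implementation), a server can maintain a distribution over a few candidate on-device learning rates and sharpen it across rounds using clients' validation feedback:

```python
import numpy as np

# Hypothetical simplification: the server keeps a distribution over candidate
# on-device learning rates and updates it with an exponentiated-gradient step.
rng = np.random.default_rng(0)
configs = [0.3, 0.1, 0.03, 0.01]               # candidate on-device learning rates
theta = np.ones(len(configs)) / len(configs)   # server's distribution over configs
eta = 0.5                                      # exponentiated-gradient step size

def local_validation_loss(lr):
    # Stand-in for a client training locally with learning rate `lr`
    # and reporting its validation loss back to the server.
    return (np.log10(lr) + 1.5) ** 2 + 0.05 * rng.normal()

for rnd in range(20):
    # Each round, every candidate is tried on a few sampled clients.
    avg_loss = np.array([np.mean([local_validation_loss(lr) for _ in range(3)])
                         for lr in configs])
    theta *= np.exp(-eta * avg_loss)           # lower loss -> higher probability
    theta /= theta.sum()

print("Preferred on-device learning rate:", configs[int(np.argmax(theta))])
```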
Bio
Misha is a PhD student in computer science at Carnegie Mellon University advised by Nina Balcan and Ameet Talwalkar. His research focuses on foundations and applications of machine learning, in particular the theoretical and practical understanding of meta-learning and automation. He is a recipient of the Facebook PhD Fellowship and has spent time as an intern at Microsoft Research - New England, the Lawrence Livermore National Lab, and the Princeton Plasma Physics Lab. Previously, he received an AB in Mathematics and an MSE in Computer Science from Princeton University.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/10/2022) Speaker: Bin Li

University of Wisconsin-Madison

Title
Weakly supervised tumor detection in whole slide image analysis
Abstract
Histopathology is one of the essential tools for disease assessment. In modern histopathology, whole slide imaging (WSI) has become a powerful and widely used tool to visualize tissue sections in disease diagnosis, medical education, and pathological research. The use of machine learning brings great opportunities to the automatic analysis of WSIs: it could facilitate the pathologists’ workflow and, more importantly, enable higher-order or large-scale correlations that are normally very challenging in standard histopathology practice, such as differential diagnosis of hard cases and treatment response prediction. This talk will cover our recent work on the fundamental problem of weakly supervised classification and tumor localization in gigapixel WSIs, using a novel multiple instance learning (MIL) model leveraged by self-supervised learning, and will discuss the emerging challenges and opportunities in computational histopathology.
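For readers unfamiliar with MIL over whole slide images, here is a generic attention-based MIL pooling head (illustrative only, not the speaker's architecture): patch embeddings, e.g. from a self-supervised encoder, are aggregated into a slide-level prediction, and the attention weights give a weak patch-level localization signal.

```python
import torch
import torch.nn as nn

class AttentionMILHead(nn.Module):
    """Generic attention-based MIL pooling over patch embeddings (illustrative only)."""
    def __init__(self, dim=512, hidden=128, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, patch_embeddings):  # (n_patches, dim), e.g. from a self-supervised encoder
        a = torch.softmax(self.attn(patch_embeddings), dim=0)    # attention over patches
        slide_embedding = (a * patch_embeddings).sum(dim=0)      # weighted average of patches
        return self.classifier(slide_embedding), a.squeeze(-1)   # slide logits + patch weights

patches = torch.randn(1000, 512)            # hypothetical patch features for one slide
logits, weights = AttentionMILHead()(patches)
print(logits.shape, weights.shape)          # torch.Size([2]) torch.Size([1000])
```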
Bio
Bin Li is a Ph.D. candidate in Biomedical Engineering at the University of Wisconsin-Madison. He is currently a research assistant in the Laboratory for Optical and Computational Instrumentation, mentored by Prof. Kevin Eliceiri. He develops computational methods to improve the understanding of the biological and pathological mechanisms of disease development and patient care based on multi-modal microscopic image analysis.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(3/3/2022) Speaker: Siyi Tang

Stanford University

Title
Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis
Abstract
Automated seizure detection and classification from electroencephalography (EEG) can greatly improve seizure diagnosis and treatment. In this talk, I will present our recent work on graph-based modeling for EEG-based seizure detection and classification. We model EEG signals using a graph neural network and develop two EEG graph structures that capture the natural geometry of EEG sensors or dynamic brain connectivity. We also propose a self-supervised pre-training strategy to further improve the model performance, particularly on rare seizure types. Lastly, we investigate model interpretability and propose quantitative metrics to measure the model’s ability to localize seizures. ICLR paper link
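As a toy illustration of the sensor-geometry graph construction mentioned above (coordinates, kernel, and threshold are hypothetical, not the paper's recipe), electrodes can be connected with a distance-based Gaussian kernel:

```python
import numpy as np

# Hypothetical electrode coordinates; a distance-based graph connects nearby sensors.
rng = np.random.default_rng(0)
coords = rng.normal(size=(19, 3))              # 19 electrodes in 3-D (stand-in positions)

dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
sigma = dists.std()
adj = np.exp(-dists**2 / (2 * sigma**2))       # Gaussian kernel on pairwise distances
adj[dists > np.quantile(dists, 0.3)] = 0.0     # keep only the closest pairs
np.fill_diagonal(adj, 0.0)                     # no self-loops

print("Average node degree:", (adj > 0).sum(axis=1).mean())
```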
Bio
Siyi Tang is a PhD candidate in Electrical Engineering at Stanford University, advised by Prof. Daniel Rubin. Her research aims to leverage the structure in medical data to develop better medical machine learning models and enable novel scientific discovery. She is also interested in enhancing the human interpretability and clinical utility of medical machine learning algorithms. Prior to Stanford, Siyi received her Bachelor of Engineering Degree with Highest Distinction Honor from National University of Singapore.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/24/2022) Speaker: Mike Wu

Stanford University

Title
Optimizing for Interpretability in Deep Neural Networks
Abstract
Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to their adoption in many real-world applications. There exists a large body of work aiming to help humans understand these black-box functions at varying levels of granularity – for example, through distillation, gradients, or adversarial examples. These methods, however, all tackle interpretability as a separate process after training. In this talk, we explore a different approach and explicitly regularize deep models so that they are well approximated by processes that humans can step through in little time. Applications will focus on medical prediction tasks for patients in critical care and with HIV.
Bio
Mike is a fifth year PhD student in Computer Science at Stanford University advised by Prof. Noah Goodman. His primary research interests are in deep generative models and unsupervised learning algorithms, often with applications to education and healthcare data. Mike’s research has been awarded two best paper awards at AAAI and Education Data Mining as well as featured in the New York Times. Prior to Stanford, Mike was a research engineer at Facebook’s applied ML group.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/17/2022) Speaker: Weston Hughes

Stanford University

Title
Deep Learning Methods for Electrocardiograms and Echocardiograms
Abstract
In this talk, we will discuss two recently published deep learning methods we’ve developed at Stanford and UCSF for understanding ECG and echocardiogram data. First, we'll discuss the development and evaluation of a convolutional neural network for multi-class ECG interpretation that outperforms cardiologists and currently used ECG algorithms. Second, we’ll discuss a computer vision system for evaluating a range of biomarkers from echocardiogram videos. In our discussion of both papers, we’ll emphasize analyses aiming to explain and interpret the models in different ways.
Bio
Weston Hughes is a 3rd year PhD student in the Computer Science department at Stanford, co-advised by James Zou in Biomedical Data Science and Euan Ashley in Cardiology. His research focuses on applying deep learning and computer vision techniques to cardiovascular imaging data, including electrocardiograms, echocardiograms and cardiac MRIs. He is an NSF Graduate Research Fellow.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/10/2022) Speaker: Enze Xie (TIME CHANGE - 4PM to 5PM PST)

University of Hong Kong

Title
SegFormer - Simple and Efficient Design for Semantic Segmentation with Transformers
Abstract
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework that unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically structured Transformer encoder that outputs multiscale features. It does not need positional encoding, thereby avoiding the interpolation of positional codes, which leads to decreased performance when the testing resolution differs from the training resolution. 2) SegFormer avoids complex decoders. The proposed MLP decoder aggregates information from different layers, and thus combines both local attention and global attention to render powerful representations. We show that this simple and lightweight design is the key to efficient segmentation with Transformers. We scale our approach up to obtain a series of models from SegFormer-B0 to SegFormer-B5, reaching significantly better performance and efficiency than previous counterparts. For example, SegFormer-B4 achieves 50.3% mIoU on ADE20K with 64M parameters, being 5x smaller and 2.2% better than the previous best method. Our best model, SegFormer-B5, achieves 84.0% mIoU on the Cityscapes validation set and shows excellent zero-shot robustness on Cityscapes-C. Code is available here.
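A rough sketch of the all-MLP decoder idea, with hypothetical shapes and channel counts rather than the released SegFormer code: features from the encoder stages are projected to a common dimension with per-pixel linear layers (1x1 convolutions), upsampled to a common resolution, concatenated, and fused before the segmentation head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleAllMLPDecoder(nn.Module):
    """Illustrative all-MLP decoder over multi-scale features (not the released SegFormer code)."""
    def __init__(self, in_dims=(32, 64, 160, 256), embed_dim=256, n_classes=19):
        super().__init__()
        # 1x1 convolutions act as per-pixel linear (MLP) projections to a shared width.
        self.proj = nn.ModuleList([nn.Conv2d(d, embed_dim, kernel_size=1) for d in in_dims])
        self.fuse = nn.Conv2d(embed_dim * len(in_dims), embed_dim, kernel_size=1)
        self.head = nn.Conv2d(embed_dim, n_classes, kernel_size=1)

    def forward(self, feats):                  # list of (B, C_i, H_i, W_i), finest stage first
        target = feats[0].shape[-2:]           # upsample everything to the finest resolution
        x = [F.interpolate(p(f), size=target, mode="bilinear", align_corners=False)
             for p, f in zip(self.proj, feats)]
        return self.head(self.fuse(torch.cat(x, dim=1)))

feats = [torch.randn(1, c, 64 // s, 64 // s) for c, s in zip((32, 64, 160, 256), (1, 2, 4, 8))]
print(SimpleAllMLPDecoder()(feats).shape)      # torch.Size([1, 19, 64, 64])
```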
Bio
Enze Xie is currently a PhD student in the Department of Computer Science, The University of Hong Kong. His research interest is computer vision in 2D and 3D. He has published 16 papers (including 10 as first/co-first author) in top-tier conferences and journals such as TPAMI, NeurIPS, ICML and CVPR, with 1400+ citations. His work PolarMask was selected as one of the CVPR 2020 Top-10 Influential Papers. He was selected as an NVIDIA Graduate Fellowship finalist, and he won 1st place in the Google OpenImages 2019 instance segmentation track.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(2/3/2022) Speaker: Jeffrey Gu

Stanford University

Title
Towards Unsupervised Biomedical Image Segmentation using Hyperbolic Representations
Abstract
Segmentation models are extremely useful for biomedical image analysis, but training them often requires large, labelled datasets that are difficult and costly to acquire. Unsupervised learning is a promising approach for training segmentation models that avoids the need to acquire labelled datasets, but it is made difficult by the lack of a high-quality supervisory signal from expert annotations. Using the observation that biomedical images often contain an inherent hierarchical structure, we augment a VAE with an additional supervisory signal via a novel self-supervised hierarchical loss. To aid the learning of hierarchical structure, we learn hyperbolic representations instead of Euclidean representations. Hyperbolic representations have previously been employed in fields such as natural language processing (NLP) as a way to learn hierarchical and tree-like structures, making them a natural choice of representation. In this talk, I will discuss hyperbolic representations for biomedical imaging as well as our recent paper on the topic.
Bio
Jeffrey Gu is a 2nd year Ph.D. student in ICME at Stanford University advised by Serena Yeung. His research interests include representation learning, unsupervised learning, biomedical applications, and beyond. Prior to Stanford, Jeffrey completed his undergraduate studies at the California Institute of Technology, majoring in Mathematics.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(1/27/2022) Speaker: Jason Jeong

Arizona State University

Title
Applications of Generative Adversarial Networks (GANs) in Medical Image Synthesis, Translation, and Augmentation
Abstract
Medical imaging is a source of crucial information in modern healthcare. Deep learning models have been developed for various modalities such as CT, MRI, Ultrasound, and PET for automatic or semi-automatic diagnosis or assessment of diseases. While deep learning models have proven to be very powerful, training them sufficiently requires large, well-annotated datasets that are expensive to create, and medical images, especially those containing diseases, are rare. While there are a variety of solutions for improving models trained on limited and imbalanced datasets, one solution is generating these rare images with generative adversarial networks (GANs). In this presentation, I will present a quick review of the use of GANs in medical imaging tasks, specifically classification and segmentation. Then I will present and discuss our recent work on using GANs to generate synthetic dual-energy CT (sDECT) from single-energy CT (SECT). Finally, some interesting challenges and possible future directions of GANs in medical imaging will be discussed.
Bio
Jiwoong Jason Jeong is a Ph.D. student in ASU’s Data Science, Analytics, and Engineering program. His research interest involves applying GANs into the medical workflow with a focus on solving medical data imbalance and scarcity. Previously, he completed his Master’s in Medical Physics at Georgia Institute of Technology.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(1/20/2022) Speaker: Lequan Yu (TIME CHANGE - 4PM to 5PM PST)

University of Hong Kong

Title
Medical Image Analysis and Reconstruction with Data-efficient Learning
Abstract
Medical imaging is a critical step in modern healthcare procedures. Accurate interpretation of medical images, e.g., CT, MRI, Ultrasound, and histology images, plays an essential role in computer-aided diagnosis, assessment, and therapy. While deep learning provides an avenue to deliver automated medical image analysis and reconstruction via data-driven representation learning, the success is largely attributed to massive datasets with abundant annotations. However, collecting and labeling such large-scale datasets is prohibitively expensive and time-consuming. In this talk, I will present our recent works on building data-efficient learning systems for medical image analysis and reconstruction, such as computer-aided diagnosis, anatomical structure segmentation, and CT reconstruction. The proposed methods cover a wide range of deep learning and machine learning topics, including semi-supervised learning, multi-modality learning, multi-task learning, integrating domain knowledge, etc. Up-to-date progress and promising future directions will also be discussed.
Bio
Dr. Lequan Yu is an Assistant Professor at the Department of Statistics and Actuarial Science, the University of Hong Kong. Before joining HKU, he was a postdoctoral fellow at Stanford University. He obtained his Ph.D. degree from The Chinese University of Hong Kong in 2019 and Bachelor’s degree from Zhejiang University in 2015, both in Computer Science. He also experienced research internships in Nvidia and Siemens Healthineers. His research interests are developing advanced machine learning methods for biomedical data analysis, with a primary focus on medical images. He has won the CUHK Young Scholars Thesis Award 2019, Hong Kong Institute of Science Young Scientist Award shortlist in 2019, Best Paper Awards of Medical Image Analysis-MICCAI in 2017 and International Workshop on Machine Learning in Medical Imaging in 2017. He serves as the senior PC member of IJCAI, AAAI, and the reviewer for top-tier journals and conferences, such as Nature Machine Intelligence, IEEE-PAMI, IEEE-TMI, Medical Image Analysis, etc.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(12/2/21 to 1/13/22) Speaker: Winter Break -- we will see you next year :)


(11/25/21) Speaker: No session this week -- Thanksgiving break


(11/18/21) Speaker: Ramon Correa

Emory University

Title
Adversarial debiasing with partial learning - medical image case-studies
Abstract
The use of artificial intelligence (AI) in healthcare has become a very active research area in the last few years. While significant progress has been made in image classification tasks, only a few AI methods are actually being deployed in hospitals. A major hurdle to actively using clinical AI models today is the trustworthiness of these models. When scrutinized, these models reveal implicit biases during decision making, such as detecting race, ethnic groups, and subpopulations, and these biases can result in poor model performance, or racial disparity, for patients in minority groups. In our ongoing study, we develop a two-step adversarial debiasing approach with partial learning that can reduce racial disparity while preserving performance on the targeted task. The proposed methodology has been evaluated on two independent medical image case studies, chest X-ray and mammography, and showed promise in reducing racial disparity while preserving performance.
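As a generic sketch of adversarial debiasing (a simplified, hypothetical setup rather than the speaker's two-step partial-learning protocol), an adversary tries to predict the protected attribute from the shared features, and a gradient-reversal layer penalizes the encoder when it succeeds:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

# Hypothetical shapes: 256-d image features, binary diagnosis, binary protected attribute.
encoder = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
task_head = nn.Linear(128, 2)
adv_head = nn.Linear(128, 2)
opt = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters())
                       + list(adv_head.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(32, 256)                  # stand-in image features
y = torch.randint(0, 2, (32,))            # task labels
a = torch.randint(0, 2, (32,))            # protected attribute

z = encoder(x)
# Task loss pulls features toward the diagnosis; the reversed adversarial loss
# pushes them away from encoding the protected attribute.
loss = ce(task_head(z), y) + ce(adv_head(GradReverse.apply(z, 1.0)), a)
opt.zero_grad(); loss.backward(); opt.step()
```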
Bio
Ramon Correa is a Ph.D. student in ASU’s Data Science, Analytics, and Engineering program. His research interest involves studying model debiasing techniques. Previously, he completed his undergraduate studies at Case Western Reserve University, majoring in Biomedical Engineering.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/11/21) Speaker: Rocky Aikens

Stanford University

Title
Assignment Control Plots - A Visual Companion for Causal Inference Study Design
Abstract
An important step for any causal inference study design is understanding the distribution of the treated and untreated subjects in terms of measured baseline covariates. However, not all baseline variation is equally important. In the observational context, balancing on baseline variation summarized in a propensity score can help reduce bias due to self-selection. In both observational and experimental studies, controlling baseline variation associated with the expected outcomes can help increase the precision of causal effect estimates. We propose a set of visualizations that decompose the space of measured covariates into the different types of baseline variation important to the study design. These assignment-control plots and variations thereof visually illustrate core concepts of causal inference and suggest new directions for methodological research on study design. As a practical demonstration, we illustrate one application of assignment-control plots to a study of cardiothoracic surgery. While the family of visualization tools for studies of causality is relatively sparse, simple visual tools can be an asset to education, application, and methods development. (This work is in the peer-review process and is currently available as a preprint on arxiv https://arxiv.org/abs/2107.00122)
Bio
Rachael C. “Rocky” Aikens is a collaborative biostatistician. Her methodological research focuses on the development of simple, data-centered tools and frameworks to design stronger observational studies. As a collaborator, she has led the statistical analysis of randomized assessments of clinical decision support, clinical informatics applications in pediatrics, and lifestyle interventions for reducing sedentary behavior and improving nutrition. A central focus of her work is the design and deployment of simple, data-centered methodologies, grounded in the needs of applied researchers. She is finishing a doctoral degree in Biomedical Informatics at Stanford University with expected completion June 2022.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(11/4/21) Speaker: Mars Huang (TIME CHANGE - 9AM to 10AM PST)

Stanford University

Title
Towards Generalist Medical Imaging AI Using Multimodal Self-supervised Learning
Abstract
In recent years, deep learning models have demonstrated superior diagnostic accuracy compared to human physicians in several medical domains and imaging modalities. While deep learning and computer vision provide promising solutions for automating medical image analysis, annotating medical imaging datasets requires domain expertise and is cost-prohibitive at scale. The task of building effective medical imaging models is therefore often hindered by the lack of large-scale manually labeled datasets. In a healthcare system with myriad opportunities for automation, it is practically impossible to curate labeled datasets for every task, modality, and outcome needed to train supervised models. It is therefore important to develop strategies for training generalist medical AI models without the need for large-scale labeled datasets. In this talk, I will discuss how our group plans to develop generalist medical imaging models by combining multimodal fusion techniques with self-supervised learning.
Bio
Mars Huang is a 3rd year Ph.D. student in Biomedical Informatics at Stanford University, co-advised by Matthew P. Lungren and Serena Yeung. He is interested in combining self-supervised learning and multimodal fusion techniques for medical imaging applications. Previously, he completed his undergraduate studies at the University of California, San Diego, majoring in Computer Science and Bioinformatics.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/28/21) Speaker: Sarah Hooper

Stanford University

Title
Training medical image segmentation models with less labeled data
Abstract
Segmentation is a powerful tool for quantitative analysis of medical images. Because manual segmentation can be tedious and time consuming and suffers from high inter-observer variability, neural networks (NNs) are an appealing solution for automating the segmentation process. However, most approaches to training segmentation NNs rely on large, labeled training datasets that are costly to curate. In this work, we present a general semi-supervised method for training segmentation networks that reduces the required amount of labeled data by relying on a small set of labeled data and a large set of unlabeled data for training. We evaluate our method on four cardiac magnetic resonance (CMR) segmentation targets and show that by using only 100 labeled training image slices---up to a 99.4% reduction in labeled data---the proposed model achieves within 1.10% of the Dice coefficient achieved by a network trained with over 16,000 labeled image slices. We use the segmentations predicted by our method to derive cardiac functional biomarkers and find strong agreement with an expert annotator's measurements of ejection fraction, end-diastolic volume, end-systolic volume, stroke volume, and left ventricular mass.
Bio
Sarah Hooper is a PhD candidate at Stanford University, where she works with Christopher Ré and Curtis Langlotz. She is broadly interested in applying machine learning to meet needs in healthcare, with a particular interest in applications that make quality healthcare more accessible. Sarah received her B.S. in Electrical Engineering at Rice University in 2017 and her M.S. in Electrical Engineering at Stanford University in 2020.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/21/21) Speaker: Khaled Saab

Stanford University

Title
Observational Supervision for Medical Image Classification using Gaze Data
Abstract
Deep learning models have demonstrated favorable performance on many medical image classification tasks. However, they rely on expensive hand-labeled datasets that are time-consuming to create. In this work, we explore a new supervision source for training deep learning models: gaze data that is passively and cheaply collected during clinical workflow. We focus on three medical imaging tasks, including classifying chest X-ray scans for pneumothorax and brain MRI slices for metastasis, two of which we curated gaze data for. The gaze data consists of a sequence of fixation locations on the image from an expert trying to identify an abnormality, and hence contains rich information about the image that can be used as a powerful supervision source. We first identify a set of gaze features and show that they indeed contain class-discriminative information. Then, we propose two methods for incorporating gaze features into deep learning pipelines. When no task labels are available, we combine multiple gaze features to extract weak labels and use them as the sole source of supervision (Gaze-WS). When task labels are available, we use the gaze features as auxiliary task labels in a multi-task learning framework (Gaze-MTL). You can find details in our MICCAI 2021 paper.
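A minimal, hypothetical sketch of the multi-task flavor (Gaze-MTL-style, with illustrative shapes, heads, and loss weights rather than the paper's setup): a shared encoder is trained on the class label while an auxiliary head regresses a gaze-derived target, so the gaze signal shapes the shared features.

```python
import torch
import torch.nn as nn

# Hypothetical multi-task setup: shared encoder, a diagnosis head supervised by
# task labels, and an auxiliary head supervised by a gaze-derived target
# (here, a pooled fixation heatmap); everything below is illustrative.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU())
diagnosis_head = nn.Linear(256, 2)
gaze_head = nn.Linear(256, 16)            # predicts a 4x4 pooled fixation heatmap

images = torch.randn(8, 1, 64, 64)        # stand-in X-ray crops
labels = torch.randint(0, 2, (8,))
gaze_targets = torch.rand(8, 16)          # pooled gaze heatmaps from the expert's fixations

features = encoder(images)
loss = nn.functional.cross_entropy(diagnosis_head(features), labels) \
       + 0.5 * nn.functional.mse_loss(torch.sigmoid(gaze_head(features)), gaze_targets)
loss.backward()
```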
Bio
Khaled Saab is a PhD student at Stanford, co-advised by Daniel Rubin and Christopher Re. His research interests are in developing more sustainable and reliable ML models for healthcare applications. Khaled is a Stanford Interdisciplinary Graduate Fellow, one of the greatest honors Stanford gives to a doctoral student pursuing interdisciplinary research.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/14/21) Speaker: Jean-Benoit Delbrouck

Stanford University

Title
Multimodal medical research at the intersection of vision and language
Abstract
Inspired by traditional machine learning on natural images and texts, new multimodal medical tasks are emerging. From Medical Visual Question Answering to Radiology Report Generation or Summarization using X-rays, we investigate how multimodal architectures and multimodal pre-training can help improve results.
Bio
Jean-Benoit holds a PhD in engineering science from Polytechnic Mons in Belgium and is now a postdoctoral scholar at the Department of Biomedical Data Science. His doctoral thesis focused on multimodal learning on natural images and texts. His postdoctoral research focuses on applying new (or proven) methods to multimodal medical tasks at the intersection of vision and language.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(10/7/21) Speaker: Jonathan Crabbé

University of Cambridge

Title
Explainable AI - from generalities to time series
Abstract
Modern machine learning models are complicated. They typically involve millions of operations to turn their input into a prediction. Hence, from a human perspective, they are complete black boxes. When these models are used in critical areas such as medicine, finance, and the criminal justice system, this lack of transparency appears as a major hindrance to their adoption. Out of the necessity to address this problem, the field of Explainable AI (XAI) has thrived. In this talk, we will first illustrate how XAI allows us to achieve a better understanding of these complex machine learning models in general. We will then focus on models for time series data, which constitute a large portion of medical data.
Bio
Jonathan Crabbé is a PhD student in the Department of Applied Mathematics at the University of Cambridge, supervised by Mihaela van der Schaar. He joined the van der Schaar lab following an MASt in theoretical physics and applied mathematics at Cambridge, which he passed with distinction, receiving the Wolfson College Jennings Prize.
Jonathan’s work focuses on the development of explainable artificial intelligence (XAI), which he believes to be one of the most interesting challenges in machine learning. He is particularly interested in understanding the structure of the latent representations learned by state-of-the-art models. With his theoretical physics background, Jonathan is also enthusiastic about time series models and forecasting.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/30/21) Speaker: Siyi Tang

Stanford University

Title
Graph-based modeling in computational pathology
Abstract
Advances in whole-slide imaging, deep learning, and computational power have enabled substantial growth in the field of computational pathology, including automating routine pathology workflows and discovery of novel biomarkers. Convolutional neural networks (CNNs) have been the most commonly used network architecture in computational pathology. However, a different line of work that leverages cellular interactions and spatial structures in whole slide images using graph-based modeling methods is emerging. In this journal club, I will lead a discussion on graph-based modeling, particularly graph neural networks, in the field of computational pathology.
Bio
Siyi Tang is a PhD candidate in Electrical Engineering at Stanford University, advised by Prof. Daniel Rubin. Her research aims to leverage the structure in medical data to develop better medical machine learning models and enable novel scientific discovery. She is also interested in enhancing the human interpretability and clinical utility of medical machine learning algorithms. Prior to Stanford, Siyi received her Bachelor of Engineering Degree with Highest Distinction Honor from National University of Singapore.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/23/21) Speaker: Jared Dunnmon (TIME CHANGE - 2PM to 3PM PST)

Visiting Scholar, Stanford University

Title
The Many Faces of Weak Supervision in Medical Representation Learning - Harnessing Cross-Modality, Enabling Multi-task Learning, and Mitigating Hidden Stratification
Abstract
Weakly supervised machine learning models have shown substantial promise in unlocking the value of vast stores of medical information in support of clinical decision-making. In this talk, we will discuss several avenues by which these approaches can be used in real-world medical imaging applications, and the value that can be provided in each case. We first demonstrate how cross-modal weak supervision can be used to train models that achieve results statistically similar to those trained using hand labels, but with orders-of-magnitude less labeling effort. We then build on this idea to show how the large-scale multi-task learning made practical by weak supervision can provide value by supporting anatomically resolved models for volumetric medical imaging applications. Finally, we discuss recent results indicating that weakly supervised distributionally robust optimization can be used to improve model robustness in an automated way.
Bio
Dr. Jared Dunnmon is currently a Visiting Scholar in the Department of Biomedical Data Science at Stanford University. Previously, Jared was an Intelligence Community Postdoctoral Fellow in Computer Science at Stanford, where he was advised by Profs. Chris Ré and Daniel Rubin. His research interests focus on combining heterogeneous data modalities, machine learning, and human domain expertise to inform and improve decision-making around topics such as human health, energy & environment, and geopolitical stability. Jared has also worked to bridge the gap between technological development and effective deployment in a variety of contexts, including foreign policy at the U.S. Senate Foreign Relations Committee, solar electrification at Offgrid Electric, cybersecurity at the Center for Strategic and International Studies, emerging technology investment at Draper Fisher Jurvetson, nuclear fusion modeling at the Oxford Mathematical Institute, and nonlinear energy harvesting at Duke University. Jared holds a PhD from Stanford University (2017), a B.S. from Duke University, and both an MSc. in Mathematical Modeling and Scientific Computing and an M.B.A. from Oxford, where he studied as a Rhodes Scholar.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(9/16/21) Speaker: No session this week -- summer break!


(9/9/21) Speaker: No session this week -- summer break!


(9/2/21) Speaker: Beliz Gunel

Stanford University

Title
Self-training vs. Weak Supervision using Untrained Neural Nets for MR Reconstruction
Abstract
Untrained neural networks use the CNN architecture itself as an image prior for reconstructing natural images, without requiring any supervised training data. This makes them a compelling tool for solving inverse problems such as denoising and MR reconstruction, for which they achieve performance on par with some state-of-the-art supervised methods. However, untrained neural networks require tens of minutes to reconstruct a single MR slice at inference time, making them impractical for clinical deployment. We propose using ConvDecoder to generate "weakly-labeled" data from undersampled MR scans at training time. Using a few supervised pairs and the constructed weakly supervised pairs, we train an unrolled neural network that gives strong reconstruction performance with a fast inference time of a few seconds. We show that our method considerably improves over supervised and self-training baselines in the limited-data regime while mitigating the slow inference bottleneck of untrained neural networks. In this talk, I will also briefly discuss how self-training can be applied, and in fact be complementary to pre-training approaches, in other application domains such as natural language understanding.
Bio
Beliz Gunel is a fourth year PhD student in Electrical Engineering at Stanford University, advised by Professor John Pauly. Her research interests are primarily in representation learning for medical imaging and natural language processing, and building data-efficient machine learning methods that are robust to distribution drifts. She collaborates closely with Professor Akshay Chaudhari and Professor Shreyas Vasanawala, and had research internships at Google AI, Facebook AI, and Microsoft Research. Previously, she completed her undergraduate studies at University of California, Berkeley, majoring in Electrical Engineering and Computer Science.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/26/21) Speaker: No session this week -- break!


(8/19/21) Speaker: No session this week -- break!


(8/12/21) Speaker: Juan Manuel Zambrano Chaves

Stanford University

Title
Multimodal opportunistic risk assessment for ischemic heart disease
Abstract
Current risk scores for predicting ischemic heart disease (IHD) risk—the leading cause of global mortality—have limited efficacy. While body composition (BC) imaging biomarkers derived from abdominopelvic computed tomography (CT) correlate with IHD risk, they are impractical to measure manually. Here, in a retrospective cohort of 8,197 contrast-enhanced abdominopelvic CT examinations undergoing up to 5 years of follow-up, we developed improved multimodal opportunistic risk assessment models for IHD by automatically extracting BC features from abdominal CT images and integrating these with features from each patient’s electronic medical record (EMR). Our predictive methods match and, in some cases, outperform clinical risk scores currently used in IHD risk assessment. We provide clinical interpretability of our model using a new method of determining tissue-level contributions from CT along with weightings of EMR features contributing to IHD risk. We conclude that such a multimodal approach, which automatically integrates BC biomarkers and EMR data can enhance IHD risk assessment and aid primary prevention efforts for IHD. In this talk, I will also go over other recent publications related to opportunistic imaging, body composition analysis and cardiovascular disease.
Bio
Juan Manuel is pursuing a PhD in Biomedical Informatics at Stanford University, advised by Daniel Rubin and Akshay Chaudhari. He is broadly interested in developing informatics tools to aid clinicians in clinical practice. His current research focuses on multimodal data fusion, building models that leverage medical images in addition to other relevant data sources. He was previously awarded a medical degree in addition to a B.S. in biomedical engineering from Universidad de los Andes in Bogotá, Colombia.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(8/5/21) Speaker: Mayee Chen

Stanford University

Title
Mandoline - Model Evaluation under Distribution Shift
Abstract
Machine learning models are often deployed in different settings than they were trained and validated on, posing a challenge to practitioners who wish to predict how well the deployed model will perform on a target distribution. If an unlabeled sample from the target distribution is available, along with a labeled sample from a possibly different source distribution, standard approaches such as importance weighting can be applied to estimate performance on the target. However, importance weighting struggles when the source and target distributions have non-overlapping support or are high-dimensional. Taking inspiration from fields such as epidemiology and polling, we develop Mandoline, a new evaluation framework that mitigates these issues. Our key insight is that practitioners may have prior knowledge about the ways in which the distribution shifts, which we can use to better guide the importance weighting procedure. Specifically, users write simple "slicing functions" - noisy, potentially correlated binary functions intended to capture possible axes of distribution shift - to compute reweighted performance estimates. We further describe a density ratio estimation framework for the slices and show how its estimation error scales with slice quality and dataset size. Empirical validation on NLP and vision tasks shows that Mandoline can estimate performance on the target distribution up to 3x more accurately compared to standard baselines.
This is joint work done with equal contribution from Karan Goel and Nimit Sohoni, as well as Fait Poms, Kayvon Fatahalian, and Christopher Ré. In this talk I will also connect the Mandoline framework to the broader theme of interactive ML systems and some of my collaborators' research in this area.
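To make the reweighting idea concrete, here is a hedged sketch that estimates target accuracy from labeled source data using binary slice indicators. A probabilistic classifier over the slice vectors stands in for the density-ratio model described above, so this is an approximation of the approach rather than the Mandoline implementation; all data and slice definitions are synthetic.

```python
# Slice-based importance weighting in the spirit of Mandoline: estimate a
# density ratio over user-defined binary slices (not raw inputs), then
# reweight source-set correctness to estimate target accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_target_accuracy(slices_src, correct_src, slices_tgt):
    """slices_*: (n, k) binary slice matrices; correct_src: (n,) 0/1 correctness."""
    X = np.vstack([slices_src, slices_tgt])
    d = np.concatenate([np.zeros(len(slices_src)), np.ones(len(slices_tgt))])
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p_tgt = clf.predict_proba(slices_src)[:, 1]
    # w(g(x)) ~ p_target(g) / p_source(g), up to the source/target size ratio
    w = (p_tgt / (1 - p_tgt)) * (len(slices_src) / len(slices_tgt))
    w /= w.mean()  # self-normalized importance weights
    return np.mean(w * correct_src)

# Toy usage: one slice ("long document") is much more common in the target set.
rng = np.random.default_rng(0)
slices_src = rng.binomial(1, 0.2, size=(5000, 1))
slices_tgt = rng.binomial(1, 0.8, size=(5000, 1))
correct_src = np.where(slices_src[:, 0] == 1,
                       rng.binomial(1, 0.6, 5000), rng.binomial(1, 0.9, 5000))
print("naive source accuracy:", correct_src.mean())
print("reweighted estimate  :", estimate_target_accuracy(slices_src, correct_src, slices_tgt))
```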
Bio
Mayee Chen is a second year PhD student in Computer Science at Stanford University, advised by Professor Christopher Ré. She is interested in understanding the theoretical underpinnings of tools in modern machine learning and using them to develop new methods. Her current interests revolve around how to evaluate sources of supervision (e.g., weakly, semi-supervised, and self-supervised) throughout the ML pipeline, particularly through both information-theoretic and geometric lenses. Previously, she completed her undergraduate studies at Princeton University, majoring in Operations Research and Financial Engineering.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/29/21) Speaker: Shantanu Thakoor

DeepMind

Title
Bootstrapped Self-Supervised Representation Learning in Graphs
Abstract
Self-supervised graph representation learning aims to construct meaningful representations of graph-structured data in the absence of labels. Current state-of-the-art methods are based on contrastive learning, and depend heavily on the construction of augmentations and negative examples. Achieving peak performance requires computation quadratic in the number of nodes, which can be prohibitively expensive. In this talk, we will present Bootstrapped Graph Latents (BGRL), a method for self-supervised graph representation learning that removes this potentially quadratic bottleneck. We show that BGRL outperforms or matches previous methods on several established benchmark datasets, while consuming 2-10x less memory. Moreover, it enables the effective usage of more expressive GNN architectures, allowing us to further improve the state of the art. Finally, we will present our recent results on applying BGRL to the very large-scale data regime, in the OGB-LSC KDD Cup, where it was key to our entry being among the top 3 awardees in our track.
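The sketch below illustrates the negative-free, bootstrapped objective at the heart of this approach using a toy one-layer graph encoder, a predictor head, and an exponential-moving-average target network. The augmentations, architecture, and hyperparameters are placeholders chosen for brevity, not the BGRL implementation.

```python
# BYOL-style bootstrapped objective for graphs: an online encoder + predictor
# is trained to match an EMA "target" encoder applied to a different
# augmentation of the same graph, with no negative examples.
import copy
import torch
import torch.nn.functional as F

class GCNEncoder(torch.nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, out_dim)
    def forward(self, x, adj):           # adj: dense normalized adjacency (n, n)
        return F.relu(adj @ self.lin(x))

def augment(x, adj, drop_feat=0.2, drop_edge=0.2):
    x = x * (torch.rand_like(x) > drop_feat)         # feature masking
    adj = adj * (torch.rand_like(adj) > drop_edge)   # edge dropping
    return x, adj

n, d, h = 100, 16, 32
x = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.05).float()
adj = (adj + adj.T).clamp(max=1) + torch.eye(n)
adj = adj / adj.sum(dim=1, keepdim=True)              # row-normalize

online = GCNEncoder(d, h)
predictor = torch.nn.Linear(h, h)
target = copy.deepcopy(online)                        # EMA copy, no gradients
for p in target.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(list(online.parameters()) + list(predictor.parameters()), lr=1e-3)

for step in range(100):
    (x1, a1), (x2, a2) = augment(x, adj), augment(x, adj)
    q = predictor(online(x1, a1))                     # online view
    with torch.no_grad():
        z = target(x2, a2)                            # target view
    loss = -F.cosine_similarity(q, z, dim=-1).mean()  # negative-free objective
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                             # EMA update of the target
        for pt, po in zip(target.parameters(), online.parameters()):
            pt.mul_(0.99).add_(0.01 * po)
print("final loss:", loss.item())
```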
Bio
Shantanu is a Research Engineer working at DeepMind. His primary research interests are in graph representation learning and reinforcement learning. Prior to this, he received his MS from Stanford University, where he was working on AI safety and neural network verification, and B.Tech. from IIT Bombay, where he worked on program synthesis. Recently, he has been interested in applying graph representation learning methods to large-scale problems, including the OGB Large Scale Challenge.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/22/21) Speaker: Liangqiong Qu

Stanford University

Title
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
Abstract
Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as lack of convergence and potential for catastrophic forgetting in federated learning across real-world heterogeneous devices. While most research efforts focus on improving the optimization process in FL, in this talk, we will provide a new perspective by rethinking the choice of architectures in federated models. We will show that simply replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices, accelerate convergence, and reach a better global model, especially when dealing with heterogeneous data. The code related to this talk is released here, to encourage future exploration in robust architectures as an alternative to current research efforts on the optimization front.
Bio
Liangqiong Qu is currently a postdoctoral researcher at Stanford University. She received her joint PhD in Pattern Recognition and Intelligent Systems from the University of Chinese Academy of Sciences (2017) and in Computer Science from City University of Hong Kong (2017). Before joining Stanford, she was a postdoctoral researcher at the IDEA lab at the University of North Carolina at Chapel Hill from 2018 to 2019. She has published over 20 peer-reviewed articles, including publications in top-tier venues such as CVPR, MedIA, TIP, and MICCAI, and she also wrote a book chapter in Big Data in Psychiatry and Neurology.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/15/21) Speaker: No session this week -- Feedback Form here


(7/8/21) Speaker: Andre Esteva

Salesforce Research

Title
Frontiers of Medical AI - Therapeutics and Workflows
Abstract
As the artificial intelligence and deep learning revolutions have swept over a number of industries, medicine has stood out as a prime area for beneficial innovation. The maturation of key areas of AI - computer vision, natural language processing, etc. - has led to their successive adoption in certain application areas of medicine. The field has seen thousands of researchers and companies begin pioneering new and creative ways of benefiting healthcare with AI. Here we'll discuss two vitally important areas: therapeutics and workflows. In the space of therapeutics we'll discuss how multi-modal AI can support physicians in complex decision making for cancer treatments, and how natural language processing can be repurposed to create custom-generated proteins as potential therapeutics. Within workflows, we'll explore how to build a COVID-specialized search engine, and discuss ways in which this could empower health systems to securely and accurately search over their highly sensitive data.
Bio
Andre Esteva is a researcher and entrepreneur in deep learning and computer vision. He currently serves as Head of Medical AI at Salesforce Research. Notably, he has led research efforts in AI-enabled medical diagnostics and therapeutic decision making. His work has shown that computer vision algorithms can match and exceed the performance of top physicians at diagnosing cancers from medical imagery. Expanded to video, they can diagnose behavioral conditions like autism. In the space of AI-enabled therapeutics, his research leverages multi-modal datasets to train AI models that can personalize oncology treatments for patients by determining the best course of therapy for them. He has worked at Google Research, Sandia National Labs, and GE Healthcare, and has co-founded two tech startups. He obtained his PhD in Artificial Intelligence at Stanford, where he worked with Sebastian Thrun, Jeff Dean, Stephen Boyd, Fei-Fei Li, and Eric Topol.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(7/1/21) Speaker: Rikiya Yamashita

Stanford University

Title
Learning domain-agnostic visual representation for computational pathology using medically-irrelevant style transfer augmentation
Abstract
Suboptimal generalization of machine learning models on unseen data is a key challenge that hampers the clinical applicability of such models to medical imaging. Although various methods such as domain adaptation and domain generalization have evolved to combat this challenge, learning robust and generalizable representations is core to medical image understanding, and continues to be a problem. Here, we propose STRAP (Style TRansfer Augmentation for histoPathology), a form of data augmentation based on random style transfer from non-medical style sources such as artistic paintings, for learning domain-agnostic visual representations in computational pathology. Style transfer replaces the low-level texture content of an image with the uninformative style of a randomly selected style source image, while preserving the original high-level semantic content. This improves robustness to domain shift and can be used as a simple yet powerful tool for learning domain-agnostic representations. We demonstrate that STRAP leads to state-of-the-art performance, particularly in the presence of domain shifts, on two particular classification tasks in computational pathology.
Bio
Rikiya Yamashita is a radiologist turned applied research scientist working as a postdoctoral researcher in the Department of Biomedical Data Science at Stanford University. He is broadly interested in developing machine learning methodology for extracting knowledge from unstructured biomedical data. With his dual expertise, he is passionate about bridging the gap between machine learning and clinical medicine.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/24/21) Speaker: Garrett Honke

Google X, the Moonshot Factory

Title
βVAE Representation Learning and Explainability for Psychopathology with EEG and the constraints on deployment in the real world
Abstract
Despite extensive standardization, diagnostic interviews for mental health disorders encompass substantial subjective judgment. Previous studies have demonstrated that EEG-based neural measures can function as reliable objective correlates of depression, or even predictors of depression and its course. However, their clinical utility has not been fully realized because of 1) the lack of automated ways to deal with the inherent noise associated with EEG data at scale, and 2) the lack of knowledge of which aspects of the EEG signal may be markers of a clinical disorder. Here we adapt an unsupervised pipeline from the recent deep representation learning literature to address these problems by 1) learning a disentangled representation using β-VAE to denoise the signal, and 2) extracting interpretable features associated with a sparse set of clinical labels using a Symbol-Concept Association Network (SCAN). We demonstrate that our method is able to outperform the canonical hand-engineered baseline classification method on a number of factors, including participant age and depression diagnosis. Furthermore, our method recovers a representation that can be used to automatically extract denoised Event Related Potentials (ERPs) from novel, single EEG trajectories, and supports fast supervised re-mapping to various clinical labels, allowing clinicians to re-use a single EEG representation regardless of updates to the standardized diagnostic system. Finally, single factors of the learned disentangled representations often correspond to meaningful markers of clinical factors, as automatically detected by SCAN, allowing for human interpretability and post-hoc expert analysis of the recommendations made by the model.
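For readers unfamiliar with the first stage of this pipeline, the following is a minimal β-VAE loss sketch: a standard VAE objective with the KL term scaled by β to encourage disentangled latents. The tiny MLP encoder/decoder and input dimensionality are placeholders, not the EEG model used in the talk.

```python
# Minimal β-VAE training-loss sketch: reconstruction loss plus a KL term
# weighted by β > 1 to encourage disentangled latents.
import torch
import torch.nn.functional as F

class BetaVAE(torch.nn.Module):
    def __init__(self, in_dim=256, z_dim=10):
        super().__init__()
        self.enc = torch.nn.Linear(in_dim, 2 * z_dim)   # outputs mean and log-variance
        self.dec = torch.nn.Linear(z_dim, in_dim)
    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return self.dec(z), mu, logvar

def beta_vae_loss(x, recon, mu, logvar, beta=4.0):
    recon_loss = F.mse_loss(recon, x, reduction="sum") / x.size(0)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
    return recon_loss + beta * kl

model = BetaVAE()
x = torch.randn(32, 256)          # stand-in for preprocessed EEG windows
recon, mu, logvar = model(x)
print("loss:", beta_vae_loss(x, recon, mu, logvar).item())
```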
Bio
Garrett is a neuroscientist working as a Senior Research Scientist at X, the Moonshot Factory (formerly Google X). He works with projects in the early pipeline on problem areas that generally involve ML, data science, and human-emitted data.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/17/21) Speaker: Florian Dubost

Stanford University

Title
Hydranet -- Data Augmentation for Regression Neural Networks
Abstract
Deep learning techniques are often criticized for depending heavily on large quantities of labeled data. This problem is even more challenging in medical image analysis, where annotator expertise is often scarce. We propose a novel data-augmentation method to regularize neural network regressors that learn from a single global label per image. The principle of the method is to create new samples by recombining existing ones. We demonstrate the performance of our algorithm on two tasks: estimation of the number of enlarged perivascular spaces in the basal ganglia, and estimation of white matter hyperintensities volume. We show that the proposed method improves the performance over more basic data augmentation. The proposed method reached an intraclass correlation coefficient between ground truth and network predictions of 0.73 on the first task and 0.84 on the second task, using only between 25 and 30 scans with a single global label per scan for training. With the same number of training scans, more conventional data augmentation methods could only reach intraclass correlation coefficients of 0.68 on the first task and 0.79 on the second task.
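To illustrate the recombination principle, here is a hedged sketch that forms new training samples by grouping existing scans and summing their global labels (reasonable when the label is additive, e.g., a count). The grouping scheme and array shapes are assumptions for illustration, not the exact augmentation used in the talk.

```python
# Recombination-style augmentation: for regression targets that are additive
# global labels (e.g., lesion counts), new samples are formed by grouping
# several labeled scans and summing their labels.
import numpy as np

def recombine(images, labels, n_new, group_size=2, rng=None):
    """images: (n, ...) array; labels: (n,) additive global labels."""
    rng = rng or np.random.default_rng()
    new_images, new_labels = [], []
    for _ in range(n_new):
        idx = rng.choice(len(images), size=group_size, replace=False)
        new_images.append(np.stack([images[i] for i in idx]))  # grouped input
        new_labels.append(labels[idx].sum())                   # additive label
    return np.stack(new_images), np.array(new_labels)

# Toy usage: 30 "scans" with per-scan counts, expanded into 200 grouped samples.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(30, 64, 64))
counts = rng.integers(0, 10, size=30).astype(float)
aug_imgs, aug_counts = recombine(imgs, counts, n_new=200, rng=rng)
print(aug_imgs.shape, aug_counts.shape)   # (200, 2, 64, 64) (200,)
```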
Bio
Florian Dubost is a postdoctoral researcher in biomedical data science at Stanford University, CA, USA, and has six years of experience in machine learning. He holds a PhD in medical computer vision and reached top rankings in international deep learning competitions. He is a member of program committees at conference workshops in AI and medicine, authored a book on AI and neurology, and is an author and reviewer for top international journals and conferences in AI and medicine, with over 20 published articles, including 11 as first author.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/10/21) Speaker: Edward Choi

KAIST

Title
Learning the Structure of EHR with Graph Convolutional Transformer
Abstract
Large-scale electronic health records (EHR) provide a great opportunity for learning representation of clinical entities (such as codes, visits, patients). As EHR data are typically stored in a relational database, their diverse information (diagnosis, medications, etc) can be naturally viewed as a graph. In this talk, we will study how this graphical structure can be exploited, or even learned for supervised prediction tasks using a combination of graph convolution and self-attention. Additionally, we will briefly present more recent works regarding multi-modal learning using Transformers.
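As a rough illustration of mixing graph structure into self-attention, the sketch below biases attention logits over clinical-code embeddings with a prior adjacency matrix before the softmax. It is a simplified stand-in, not the exact Graph Convolutional Transformer formulation.

```python
# Structure-aware self-attention sketch: attention logits over EHR entities
# (e.g., codes within a visit) are biased by a prior adjacency matrix.
import torch
import torch.nn.functional as F

def structured_self_attention(x, prior_logits):
    """x: (n, d) entity embeddings; prior_logits: (n, n) structural prior (log-space)."""
    d = x.size(-1)
    q, k, v = x, x, x                                  # single head, untied for brevity
    scores = (q @ k.T) / d ** 0.5 + prior_logits       # content score + structural prior
    attn = F.softmax(scores, dim=-1)
    return attn @ v

n, d = 12, 32                                          # e.g., 12 clinical codes in a visit
x = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.3).float()                 # toy prior structure between codes
prior_logits = torch.log(adj * 0.9 + 0.05)             # soft prior, avoids -inf
out = structured_self_attention(x, prior_logits)
print(out.shape)                                       # torch.Size([12, 32])
```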
Bio
Edward Choi is currently an assistant professor at KAIST, South Korea. He received his PhD at Georgia Tech under the supervision of Professor Jimeng Sun, focusing on interpretable deep learning methods for longitudinal electronic health records. Before he joined KAIST, Ed was a software engineer at Google Health Research, developing deep learning models for predictive healthcare. His current research interests include machine learning, healthcare analytics, and natural language processing.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(6/3/21) Speaker: No session this week -- summer break!


(5/27/21) Speaker: Amara Tariq

Emory University

Title
Patient-specific COVID-19 Resource Utilization Prediction Using Fusion AI model
Abstract
The strain on healthcare resources brought on by the recent COVID-19 pandemic has highlighted the need for efficient resource planning and allocation through prediction of future consumption. Machine learning can predict resource utilization, such as the need for hospitalization, based on past medical data stored in electronic medical records (EMR). We experimented with fusion modeling to develop a patient-specific clinical event prediction model based on a patient's medical history and current medical indicators. A review of feature importance provides insight for future research and feedback from the community on the significance of various predictors of COVID-19 disease trajectory.
Bio
Dr. Amara Tariq received her PhD degree in Computer Science from the University of Central Florida in 2016, where she was a Fulbright Scholar. Her research was focused on automatic understanding of cross-modal semantic relationships, especially relations between images and text. After earning her PhD, she designed and taught courses focused on Artificial Intelligence and Machine Learning at the graduate and post-graduate levels in her home country, Pakistan. Her research interests evolved to include multi-modal data related to the fields of bioinformatics and health science. Since the beginning of 2020, she has been working in a post-doctoral research capacity at the Bioinformatics department, Emory University, GA. At Emory University, her research has been focused on analyzing electronic medical records, imaging studies, and clinical reports and notes for intelligent decision making regarding disease management and healthcare resource optimization. Her research has resulted in publications in top-tier venues including IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), IEEE Transactions on Image Processing (TIP), IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Journal of American College of Radiology (JACR), and npj Digital Medicine.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/20/21) Speaker: Nandita Bhaskhar

Stanford University

Title
Self-supervision & Contrastive Frameworks -- a vision-based review
Abstract
Self-supervised representation learning and contrastive techniques have attracted a lot of interest in the last couple of years, especially in computer vision. Deep learning's successes thus far have largely been associated with a supervised learning paradigm, wherein labelled datasets are used to train models on specific tasks. This need for labelled datasets has been identified as the bottleneck for scaling deep learning models across various tasks and domains, since such approaches rely heavily on costly, time-consuming dataset curation and labelling schemes.

Self-supervision allows us to learn representations from large unlabelled datasets. Instead of relying on labels for inputs, it depends on designing suitable pre-text tasks to generate pseudo-labels from the data directly. Contrastive learning refers to a special subset of these self-supervised methods that have achieved the most success recently. In this talk, I will go over the top 6 recent frameworks - SimCLR, MoCo V2, BYOL, SwAV, DINO and Barlow Twins - giving a deeper dive into their methodology and performance, comparing each framework's strengths and weaknesses, and discussing their suitability for applications in the medical domain.
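As background for the frameworks above, the following is a minimal NT-Xent (SimCLR-style) contrastive loss sketch. It shows the shared idea of pulling two views of the same image together against all other in-batch embeddings, and is not taken from any of the frameworks' codebases.

```python
# Minimal NT-Xent contrastive loss: each embedding must identify its positive
# pair (the other augmented view) among all other embeddings in the batch.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (n, d) projections of two views of the same n images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2n, d)
    sim = z @ z.T / temperature                             # cosine similarities
    n2 = z.size(0)
    sim.fill_diagonal_(float("-inf"))                       # exclude self-similarity
    targets = torch.arange(n2, device=z.device)
    targets = (targets + n2 // 2) % n2                      # positive = the other view
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(64, 128), torch.randn(64, 128)
print("loss:", nt_xent(z1, z2).item())
```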
Bio
Nandita Bhaskhar (see website) is a PhD student in the Department of Electrical Engineering at Stanford University advised by Daniel Rubin. She received her B.Tech in Electronics Engineering from the Indian Institute of Information Technology, IIIT, with the highest honours. She is broadly interested in developing machine learning methodology for medical applications. Her current research focuses on observational supervision and self-supervision for leveraging unlabelled medical data and out-of-distribution detection for reliable clinical deployment. Outside of research, her curiosity lies in a wide gamut of things including but not restricted to biking, social dance, travelling, creative writing, music, getting lost, hiking and exploring new things.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/13/21) Speaker: Xiaoyuan Guo

Emory University

Title
Segmentation and Quantification of Breast Arterial Calcifications (BAC) on Mammograms
Abstract
Measurements of breast arterial calcifications (BAC) can offer a personalized, noninvasive approach to risk-stratify women for cardiovascular disease such as heart attack and stroke. We aim to detect and segment breast arterial calcifications in mammograms accurately and suggest novel measurements to quantify detected BAC for future clinical applications. To separate BAC in mammograms, we propose a lightweight fine-vessel segmentation method, Simple Context U-Net (SCU-Net). To further quantify calcifications, we test five quantitative metrics to inspect the progression of BAC for subjects: Sum of Mask Probability Metric (PM), Sum of Mask Area Metric (AM), Sum of Mask Intensity Metric (SIM), Sum of Mask Area with Threshold Intensity Metric (TAMx) and Sum of Mask Intensity with Threshold X Metric (TSIMx). Finally, we demonstrate the ability of the metrics to longitudinally measure calcifications in a group of 26 subjects and evaluate our quantification metrics compared to calcified voxels and calcium mass on breast CT for 10 subjects.
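The sketch below computes simple per-image quantification metrics in the spirit of those listed above; the binarization and intensity thresholds are assumptions, and the thresholded variants may differ in detail from the talk's definitions.

```python
# Per-image BAC quantification metrics computed from a predicted probability
# mask and the original mammogram; thresholds are illustrative placeholders.
import numpy as np

def bac_metrics(prob_mask, image, bin_thresh=0.5, intensity_thresh=100):
    """prob_mask: (H, W) predicted calcification probabilities; image: (H, W) mammogram."""
    binary = prob_mask >= bin_thresh
    return {
        "PM":  prob_mask.sum(),                                   # sum of mask probability
        "AM":  binary.sum(),                                      # sum of mask area (pixels)
        "SIM": image[binary].sum(),                               # sum of mask intensity
        "TAMx":  (binary & (image >= intensity_thresh)).sum(),    # area above intensity threshold
        "TSIMx": image[binary & (image >= intensity_thresh)].sum(),  # intensity above threshold
    }

rng = np.random.default_rng(0)
prob = rng.random((256, 256))
img = rng.integers(0, 255, size=(256, 256))
print(bac_metrics(prob, img))
```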
Bio
Xiaoyuan Guo is a Computer Science PhD student at Emory University, working with Prof. Imon Banerjee, Prof. Hari Trivedi and Prof. Judy Wawira Gichoya. Her primary research interests are computer vision and medical image processing, especially improving medical image segmentation, classification, object detection accuracy with mainstream computer vision techniques. She is also interested in solving open-world medical tasks.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(5/6/21) Speaker: Angshuman Paul

NIH

Title
Few-shot Chest X-ray Diagnosis Using Clinical Images and the Images from the Published Scientific Literature
Abstract
Few-shot learning is the art of machine learning that tries to mimic the human cognitive ability of understanding new object classes from a few labeled training examples. In the last few years, several few-shot learning methods have been proposed for different tasks related to natural images. However, few-shot learning is relatively unexplored in the field of radiology image analysis. In this seminar, we will present two few-shot learning methods for chest x-ray diagnosis. Our first method uses a discriminative ensemble trained using labeled clinical chest x-ray images. The second method uses labeled chest x-ray images from the published scientific literature and unlabeled clinical chest x-ray images to train a machine learning model. Experiments show the superiority of the proposed methods over several existing few-shot learning methods.
Bio
Angshuman Paul (M.E., Ph.D.) is a visiting (postdoctoral) fellow at the National Institutes of Health, USA. His primary research interests are in Machine Learning, Medical Imaging, and Computer Vision. He earned his Ph.D. from the Indian Statistical Institute, India. He has held a visiting scientist position at the Indian Statistical Institute (2019) and a graduate intern position at the University of Missouri-Columbia (2011). Dr. Paul is the recipient of the NIH Intramural Fellowship (2019) from the National Institutes of Health, USA, and the best paper award in the Tenth Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP, 2016). He serves as a reviewer of several journals including IEEE Transactions on Medical Imaging, Pattern Recognition Letters, and IEEE Transactions on Image Processing.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/29/21) Speaker: Joseph Cohen

Stanford University

Title
Gifsplanation via Latent Shift - A Simple Autoencoder Approach to Counterfactual Generation for Chest X-rays
Abstract
Motivation: Traditional image attribution methods struggle to satisfactorily explain predictions of neural networks. Prediction explanation is important, especially in medical imaging, for avoiding the unintended consequences of deploying AI systems when false positive predictions can impact patient care. Thus, there is a pressing need to develop improved methods for model explainability and introspection.
Specific problem: A new approach is to transform input images to increase or decrease features that cause the prediction. However, current approaches are difficult to implement as they are monolithic or rely on GANs. These hurdles prevent wide adoption.
Our approach: Given an arbitrary classifier, we propose a simple autoencoder and gradient update (Latent Shift) that can transform the latent representation of a specific input image to exaggerate or curtail the features used for prediction. We use this method to study chest X-ray classifiers and evaluate their performance. We conduct a reader study with two radiologists assessing 240 chest X-ray predictions to identify which ones are false positives (half are) using traditional attribution maps or our proposed method. This work will be presented at MIDL 2021.
Results: We found low overlap with ground truth pathology masks for models with reasonably high accuracy. However, the results from our reader study indicate that these models are generally looking at the correct features. We also found that the Latent Shift explanation allows a user to have more confidence in true positive predictions compared to traditional approaches (0.15±0.95 on a 5-point scale with p=0.01) with only a small increase in false positive predictions (0.04±1.06 with p=0.57).
Project Page https://mlmed.org/gifsplanation/
Source code https://github.com/mlmed/gifsplanation
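For intuition, here is a minimal sketch of the latent-shift idea (not the released gifsplanation code linked above): encode an image, take the gradient of the classifier output with respect to the latent, and decode latents shifted along that gradient. The toy encoder, decoder, and classifier are placeholders.

```python
# Latent-shift counterfactual sketch: decode latents shifted along the
# gradient of the classifier output to exaggerate or curtail the predicted
# feature. Toy modules stand in for a real autoencoder and X-ray classifier.
import torch

enc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 32))
dec = torch.nn.Sequential(torch.nn.Linear(32, 64 * 64), torch.nn.Unflatten(1, (1, 64, 64)))
clf = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 1), torch.nn.Sigmoid())

x = torch.randn(1, 1, 64, 64)                      # stand-in for a chest X-ray
z = enc(x).detach().requires_grad_(True)
pred = clf(dec(z))                                 # classifier applied to the reconstruction
grad = torch.autograd.grad(pred.sum(), z)[0]       # d f(D(z)) / d z

frames = []
for lam in torch.linspace(-50, 50, steps=9):       # sweep the shift magnitude
    with torch.no_grad():
        frames.append(dec(z + lam * grad))         # counterfactual reconstructions
print(len(frames), frames[0].shape)                # 9 torch.Size([1, 1, 64, 64])
```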
Bio
Joseph Paul Cohen is a researcher and pragmatic engineer. He currently focuses on the challenges in deploying AI tools in medicine, specifically computer vision and genomics, and is affiliated with Stanford AIMI. He maintains many open source projects including Chester the AI radiology assistant, TorchXRayVision, and BlindTool – a mobile vision aid app. He is the director of the Institute for Reproducible Research, a US non-profit which operates ShortScience.org and Academic Torrents.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/22/21) Speaker: Pradeeban Kathiravelu

Emory University

Title
Understanding Scanner Utilization with Real-Time DICOM Metadata Extraction
Abstract
Understanding system performance metrics ensures better utilization of radiology resources through more targeted interventions. The images produced by radiology scanners typically follow the DICOM (Digital Imaging and Communications in Medicine) standard format. The DICOM images consist of textual metadata that can be used to calculate key timing parameters, such as the exact study durations and scanner utilization. However, hospital networks lack the resources and capabilities to extract the metadata from the images quickly and automatically compute the scanner utilization properties. Thus, they resort to using data records from the Radiology Information Systems (RIS). However, data acquired from RIS are prone to human errors, rendering many derived key performance metrics inadequate and inaccurate. Hence, there is motivation to establish a real-time image transfer from the Picture Archiving and Communication Systems (PACS), so that the DICOM images from the scanners can be received by research clusters and their metadata processed to evaluate scanner utilization metrics efficiently and quickly.

In this talk, we present Niffler (https://github.com/Emory-HITI/Niffler), an open-source DICOM Framework for Machine Learning Pipelines and Processing Workflows. Niffler analyzes the scanners' utilization as a real-time monitoring framework that retrieves radiology images into a research cluster using the DICOM networking protocol and then extracts and processes the metadata from the images. Niffler facilitates a better understanding of scanner utilization across a vast healthcare network by observing properties such as study duration, the interval between the encounters, and the series count of studies. Benchmarks against using the RIS data indicate that our proposed framework based on real-time PACS data estimates the scanner utilization more accurately. Our framework has been running stably and supporting several machine learning workflows for more than two years on our extensive healthcare network in pseudo-real-time. We further present how we use the Niffler framework for real-time and on-demand execution of machine learning (ML) pipelines on radiology images.
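As a hedged illustration of the kind of metadata computation described above (this is not Niffler itself; see the linked repository), the sketch below reads DICOM headers only with pydicom, groups images by StudyInstanceUID, and estimates per-study duration from acquisition timestamps. The directory path is a placeholder.

```python
# DICOM metadata extraction sketch: read headers only (no pixel data), group
# by study, and estimate study duration from acquisition timestamps.
from collections import defaultdict
from datetime import datetime
from pathlib import Path
import pydicom

def study_durations(dicom_dir):
    times = defaultdict(list)
    for path in Path(dicom_dir).rglob("*.dcm"):
        ds = pydicom.dcmread(path, stop_before_pixels=True)   # metadata only, fast
        date = getattr(ds, "AcquisitionDate", None)
        time = getattr(ds, "AcquisitionTime", None)
        if date and time:
            stamp = datetime.strptime(date + time.split(".")[0], "%Y%m%d%H%M%S")
            times[ds.StudyInstanceUID].append(stamp)
    return {uid: (max(ts) - min(ts)).total_seconds() / 60.0   # minutes per study
            for uid, ts in times.items() if len(ts) > 1}

if __name__ == "__main__":
    for uid, minutes in study_durations("/path/to/dicom/archive").items():
        print(uid, f"{minutes:.1f} min")
```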
Bio
Pradeeban Kathiravelu is a postdoctoral researcher at the Department of Biomedical Informatics at Emory University. He has an Erasmus Mundus Joint Doctorate in Distributed Computing from Universidade de Lisboa (Lisbon, Portugal) and Université catholique de Louvain (Louvain-la-Neuve, Belgium). His research focuses on developing latency-aware Software-Defined Systems and cloud-assisted networks for radiology workflows at the edge.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/15/21) Speaker: Jason Fries

Stanford University

Title
Weakly Supervised Learning in Medicine (Better Living through Programmatic Supervision)
Abstract
The high cost of building labeled training sets is one of the largest barriers to using supervised machine learning in medicine. Privacy concerns create additional challenges to sharing training data for modalities like patient notes, making it difficult to train state-of-the-art NLP tools for analyzing electronic health records. The COVID-19 pandemic underscores the need for faster, more systematic methods of curating and sharing training data. One promising approach is weakly supervised learning, where low cost and often noisy label sources are combined to programmatically generate labeled training data for commodity deep learning architectures such as BERT. Programmatic labeling takes a data-centric view of machine learning and provides many of the same practical benefits as software development, including better consistency, inspectability, and creating higher-level abstractions for experts to inject domain knowledge into machine learning models.

In this talk I outline our new framework for weakly supervised clinical entity recognition, Trove, which builds training data by combining multiple public medical ontologies and other imperfect label sources. Instead of manually labeling data, in Trove annotators focus on defining labelers using ontology-based properties like semantic types as well as optional task-specific rules. On four named entity benchmark tasks, Trove approaches the performance of models trained using hand-labeled data. However, unlike hand-labeled data, our labelers can be shared and modified without compromising patient privacy.
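To make programmatic labeling concrete, here is a simplified sketch in which a few noisy labeling functions (toy ontology and keyword rules, invented for the example) vote on tokens and are aggregated by majority vote. Trove itself combines ontology sources with a probabilistic label model, so this only illustrates the general workflow.

```python
# Programmatic weak supervision sketch: noisy labeling functions vote on each
# token; votes are aggregated by majority to produce weak training labels.
import numpy as np

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1
DRUG_ONTOLOGY = {"aspirin", "metformin", "lisinopril"}   # stand-in ontology terms

def lf_in_ontology(token):
    return POSITIVE if token.lower() in DRUG_ONTOLOGY else ABSTAIN

def lf_suffix_rule(token):
    return POSITIVE if token.lower().endswith(("pril", "formin")) else ABSTAIN

def lf_stopword(token):
    return NEGATIVE if token.lower() in {"the", "and", "daily"} else ABSTAIN

LFS = [lf_in_ontology, lf_suffix_rule, lf_stopword]

def weak_labels(tokens):
    votes = np.array([[lf(t) for lf in LFS] for t in tokens])   # (n_tokens, n_lfs)
    labels = []
    for row in votes:
        row = row[row != ABSTAIN]
        labels.append(ABSTAIN if len(row) == 0 else np.bincount(row).argmax())
    return votes, np.array(labels)

tokens = ["Aspirin", "81mg", "daily", "and", "lisinopril"]
votes, labels = weak_labels(tokens)
print(labels)   # [ 1 -1  0  0  1]
```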
Bio
Jason Fries (http://web.stanford.edu/~jfries/) is a Research Scientist at Stanford University working with Professor Nigam Shah at the Center for Biomedical Informatics Research. He previously completed his postdoc with Professors Chris Ré and Scott Delp as part of Stanford's Mobilize Center. He received his PhD in computer science from the University of Iowa, where he studied computational epidemiology and NLP methods for syndromic surveillance. His recent research explores weakly supervised and few-shot learning in medicine, with a focus on methods for incorporating domain knowledge into the training of machine learning models.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/8/21) Speaker: Michael Zhang

Stanford University

Title
Federated Learning with FOMO for Personalized Training and Deployment
Abstract
Federated learning (FL) is an exciting and relatively new deep learning framework that canonically trains a single global model across decentralized local datasets maintained by participating clients. With respect to making deep learning more deployable, FL is particularly promising in real-world settings where technological or privacy constraints prevent individual data from being aggregated together. However, one model may not always be optimal for all participating clients. From healthcare to recommendation systems, we would ideally like to learn and deliver a personalized model for each participating client, as data may not be identically distributed from one client to another. This problem is emphasized when we consider how we might deploy FL in practice, where individual clients may only choose to federate if they can guarantee a benefit from the model produced at the end.

In this talk, I will present some recent work on one solution, called FedFomo, where each client effectively only federates with other relevant clients to obtain stronger personalization. First we will review federated learning as a machine learning framework, emphasizing the motivations behind personalized FL. I will then go into the origin story of FedFomo's name, highlighting a simple yet effective approach based both on the "fear of missing out" and "first order model optimization". In tandem, these ideas describe how FedFomo can efficiently figure out how much each client can benefit from another's locally trained model, and then use these values to calculate optimal federated models for each client. Critically, this does not assume knowledge of any underlying data distributions or client similarities, as this information is often not known apriori. Finally, I will describe recent empirical results on FedFomo's promising performance on a variety of federated settings, datasets, and degrees of local data heterogeneity, leading to wider discussion on the future directions and impact of federated learning and distributed machine learning, when personalization is in the picture.
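The sketch below illustrates the first-order weighting intuition: a client scores other clients' models by the reduction in its own local validation loss per unit of parameter change, keeps only positive scores, and moves toward a weighted combination of the helpful models. The flat parameter vectors and toy loss are simplifications, not the FedFomo implementation.

```python
# First-order personalized federation sketch: weight other clients' models by
# (local validation loss reduction) / (parameter distance), clipped at zero.
import numpy as np

def fomo_update(own_params, other_params, val_loss, step=1.0, eps=1e-12):
    """own_params: (d,); other_params: list of (d,); val_loss: params -> float."""
    base = val_loss(own_params)
    weights = []
    for theta in other_params:
        gain = base - val_loss(theta)                       # loss reduction on *my* data
        weights.append(max(gain, 0.0) / (np.linalg.norm(theta - own_params) + eps))
    weights = np.array(weights)
    if weights.sum() <= 0:
        return own_params                                   # nobody helps: keep local model
    weights /= weights.sum()
    delta = sum(w * (theta - own_params) for w, theta in zip(weights, other_params))
    return own_params + step * delta

# Toy usage: client 0's local validation loss prefers parameters near [1, 1].
target = np.array([1.0, 1.0])
val_loss = lambda p: float(np.sum((p - target) ** 2))
own = np.array([0.0, 0.0])
others = [np.array([-1.0, 0.5]), np.array([0.9, 1.1]), np.array([0.2, 0.1])]
print(fomo_update(own, others, val_loss))
```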
Bio
Michael Zhang is a Computer Science PhD Student at Stanford, currently working with Chris Ré and Chelsea Finn. He is broadly interested in making machine learning more deployable and reliable in the "real world", especially through the lenses of improving model robustness and personalization to distribution shifts and new tasks, as well as developing new systems that enable collaborative machine learning and/or learning with less labels.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video

(4/1/21) Speaker: Amirata Ghorbani

Stanford University

Title
Equitable Valuation of Data
Abstract
As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been suggested that individuals should be compensated for the data that they generate, but it is not clear what is an equitable valuation for individual data. In this talk, we discuss a principled framework to address data valuation in the context of supervised machine learning. Given a learning algorithm trained on a number of data points to produce a predictor, we propose data Shapley as a metric to quantify the value of each training datum to the predictor performance. Data Shapley value uniquely satisfies several natural properties of equitable data valuation. We introduce Monte Carlo and gradient-based methods to efficiently estimate data Shapley values in practical settings where complex learning algorithms, including neural networks, are trained on large datasets. We then briefly discuss the notion of distributional Shapley, where the value of a point is defined in the context of underlying data distribution.
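As a rough illustration of the Monte Carlo estimator mentioned above, the sketch below samples random permutations of the training points and credits each point with its marginal change in validation accuracy. The tiny model, synthetic data, and permutation count are assumptions for illustration, not the talk's experimental setup.

```python
# Monte Carlo data-Shapley sketch: for each random permutation, add points one
# at a time and record each point's marginal change in validation accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression

def utility(idx, X_tr, y_tr, X_val, y_val):
    """Validation accuracy of a model trained on subset idx (0.5 if untrainable)."""
    if len(idx) == 0 or len(np.unique(y_tr[idx])) < 2:
        return 0.5
    model = LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx])
    return model.score(X_val, y_val)

def monte_carlo_shapley(X_tr, y_tr, X_val, y_val, n_perms=50, rng=None):
    rng = rng or np.random.default_rng(0)
    n = len(X_tr)
    values = np.zeros(n)
    for _ in range(n_perms):
        perm = rng.permutation(n)
        prev = utility([], X_tr, y_tr, X_val, y_val)
        for k in range(n):
            curr = utility(perm[: k + 1], X_tr, y_tr, X_val, y_val)
            values[perm[k]] += curr - prev          # marginal contribution of perm[k]
            prev = curr
    return values / n_perms

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = (X[:, 0] > 0).astype(int)
y[:3] = 1 - y[:3]                                   # corrupt a few labels
vals = monte_carlo_shapley(X[:30], y[:30], X[30:], y[30:], n_perms=20, rng=rng)
print("lowest-valued points:", np.argsort(vals)[:3])  # often the corrupted ones
```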
Bio
Amirata Ghorbani is a fifth year PhD student at Stanford University advised by James Zou. He primarily works on different problems in machine learning such as research on equitable methods for data valuation, algorithms to interpret machine learning models, ways to make existing ML predictors fairer, and creating ML systems for healthcare applications such as cardiology and dermatology. He has also worked as a research intern in Google Brain, Google Brain Medical, and Salesforce Research.
Video
Questions for the Speaker
Please add your questions to the speaker either to this google form or directly under the YouTube video
