Thomas Charlon is a postdoctoral research associate at Harvard Medical School. He completed his PhD in computer science at the University of Geneva (2019) while employed as a bioinformatician at Quartz Bio (a Merck Serono spin-off), where he developed algorithms for clustering and sparse coding of genome-wide data in systemic autoimmune diseases. He then independently researched withheld content on social networks in European countries and developed a Shiny web app for real-estate price estimation using 10 years of open data from the French tax office. As a research associate in the CELEHS laboratory, he focuses on standardizing analysis processes, enhancing statistical visualizations, and facilitating the dissemination of analysis results through APIs. His research applies natural language processing to unstructured text related to mental health and suicide prevention, such as scientific publications and electronic health records, to assist psychiatrists and clinicians in identifying at-risk patients.

Presentations

22x

The best of both worlds: building R / Python pipelines for biomedical LLM semantic search apps

At the CELEHS laboratory, we are particularly interested in LLM-based embeddings such as BGE and BERT. As the number of models increases, we need methods to compare their clinical usefulness. While some R packages can leverage GPU capabilities, PyTorch is far more widely used for GPU computation. R, in contrast, is efficient for data management and visualization. How should one build robust and reproducible pipelines that incorporate both? My answer is well-designed pipelines built with Docker, Makefiles, and Elasticsearch. In this talk, I will showcase my design approaches to these challenges.

See Presentation