Minicourse on Bayesian Machine Learning for Scientific Research

October 28 – November 1, 2024

São Paulo, Brazil

ICTP-SAIFR/IFT-UNESP

We will present five 3-hour lectures that will introduce participants to the world of Bayesian Machine Learning for scientific purposes. The minicourse is tailored to suit both senior and junior researchers, catering to their respective levels of experience and interest.

In the first block of each lecture, we aim to convey the big picture of the lecture’s topic, looking at the details from a supervisor’s point of view. The fine points and subtleties will be addressed here, but without rigorous proofs or supplied code. This block is intended for both seniors and juniors: for seniors as a summary that shows how to apply these tools to scientific research, and for juniors as an entry point to the second block, in which we get our hands dirty. We conclude the block with an extended coffee break, where we expect the proposed ideas to trigger discussions about each participant’s field of study and how to apply them to their data.

The second block is very hands-on and is intended for juniors, but seniors interested in getting actively involved in the calculations are welcome as well. We present, discuss and write code. Participants work through coding exercises and discuss practical applications. This block emphasizes practical skills and real-world problem-solving. We use different libraries, and we deploy statistical software especially designed to tackle the presented problems.

The minicourse is designed for researchers in any scientific field. We use mostly physics examples, but the material will be useful and insightful for any other hard-science field. We will try to adapt and discuss the problems within the participants’ fields of research.

Participants are expected to have taken courses in algebra and analysis, be familiar with multi-dimensional vectors and expressions, have some knowledge of probability and statistics, and be prepared for non-trivial abstract reasoning and thinking. Juniors, in addition, are expected to have some knowledge of Python.

There is no registration fee and limited funds are available for local expenses.

Lecturer:

  • Ezequiel Alvarez (ICAS-UNSAM, Argentina)

Organizer:

  • Rogério Rosenfeld (IFT-UNESP/ICTP-SAIFR, Brazil)

 

Announcement:

Click HERE for online application

Application deadline: September 18, 2024

 

Program

Reading Materials: HERE

  • Lecture 1: Introduction to Bayesian techniques
    – Theory: Bayes’ theorem, fields of application (games, puzzles, problems, machine learning, etc.). Bayes’ theorem in scientific research: Bayesian Machine Learning and Bayesian Workflow. Trade-off in replacing Neural Networks with Bayesian techniques when simulations are not reliable enough. Learning from the data. Graphical Models. Mixture Models as a general problem faced by scientists. Algorithms for tackling Mixture Models. Simple Gaussian Mixtures.
    – Hands-on: Introduction to the STAN statistical language. Solving basic inference problems with STAN. Gaussian Mixture.

 

  • Lecture 2: Simple Bayesian examples
    – Theory: Simple, well-known Bayesian Machine Learning problems (Eight-schools, etc.). Parameters, hyperparameters and Hierarchical Bayes. Bernoulli mixture. Self-conjugate priors on a simple counting problem.
    – Hands-on: Notebooks and numerical analysis to solve the problems presented in the Theory section.

 

  • Lecture 3: Mixture Models
    – Theory: Internal structure of the data. Latent variables. Graphical Models. Constructing non-trivial Probability Density Functions (PDFs) from trivial PDFs. Mixture Models. Conditionally-independent variables. Explicit expression for likelihood in Mixture Models. Explicit examples of Mixture Models, e.g. pp > hh > bbAA.
    – Hands-on: Introduction to different distributions (Dirichlet, truncated exponential, truncated Normal, etc.). Solving real mixture model problems using STAN. Hacks and tricks in Mixture Models.

 

  • Lecture 4: Studying consistency between data, model and results
    – Theory: How to check whether the results make sense. Testing the modeling against the data. Probability of the data. Posterior predictive check.
    – Hands-on: Implementing a posterior predictive check. Making statements about the modeling and about the inference results. Testing for unbiasedness in chain sampling. Rhat.

 

  • Lecture 5: Mixture Model for non-parametric distributions
    – Theory: Including structured priors to take advantage of expected properties in the distributions. Exploiting continuity in the distributions. Gaussian Processes. Exploiting unimodality in distributions. Tagging in Mixture Models, ROC curve comparison. Limitations and adaptations of the presented Mixture Models. Discussion of variables that are not conditionally independent: using and modeling correlations.
    – Hands-on: Scripts for inference with structured priors. Sampling smooth and unimodal distributions. Parallelizing inference for complex datasets in STAN.
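
Several lectures in the program revolve around the explicit likelihood of a Mixture Model with conditionally independent data points, which for K classes reads log L = sum_n log sum_k pi_k p(x_n | theta_k). For readers who want a feel for this expression before the course, the following is a minimal sketch for a one-dimensional Gaussian mixture. It is not course material: the hands-on sessions use STAN, whereas this sketch uses plain NumPy, and the function names, parameter values and synthetic data are all illustrative assumptions.

```python
import numpy as np


def log_normal_pdf(x, mean, std):
    """Log-density of a Normal(mean, std) evaluated at x (broadcasts over arrays)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def mixture_log_likelihood(x, weights, means, stds):
    """Total log-likelihood of data x under a K-component Gaussian mixture."""
    x = np.asarray(x, dtype=float)[:, None]              # shape (N, 1)
    weights = np.asarray(weights, dtype=float)[None, :]  # shape (1, K)
    means = np.asarray(means, dtype=float)[None, :]      # shape (1, K)
    stds = np.asarray(stds, dtype=float)[None, :]        # shape (1, K)
    # log(pi_k) + log N(x_n | mu_k, sigma_k) for every point n and class k
    log_terms = np.log(weights) + log_normal_pdf(x, means, stds)  # shape (N, K)
    # Log-sum-exp over classes, shifted by the row maximum for numerical stability
    row_max = log_terms.max(axis=1, keepdims=True)
    per_point = row_max[:, 0] + np.log(np.exp(log_terms - row_max).sum(axis=1))
    return float(per_point.sum())


# Synthetic data (made-up values): about 30% from N(-2, 1) and 70% from N(3, 1)
rng = np.random.default_rng(0)
labels = rng.random(2000) < 0.3
data = np.where(labels, rng.normal(-2.0, 1.0, 2000), rng.normal(3.0, 1.0, 2000))

# Compare the generating parameters with a deliberately poor choice
ll_true = mixture_log_likelihood(data, [0.3, 0.7], [-2.0, 3.0], [1.0, 1.0])
ll_wrong = mixture_log_likelihood(data, [0.5, 0.5], [0.0, 0.5], [1.0, 1.0])
print(ll_true, ll_wrong)
```

The log-likelihood at the generating parameters should come out higher than at the deliberately poor ones. A Bayesian inference code, in STAN or otherwise, combines a function of this kind with priors on the weights, means and widths.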

Additional Information

How to reach the Institute: The program will be held at ICTP South American Institute, located at IFT-UNESP, which is across the street from a major bus and subway terminal (Terminal Barra Funda). The address closest to the entrance of the IFT-UNESP building is R. Jornalista Aloysio Biondi, 120 – Barra Funda, São Paulo. The easiest way to reach us is by subway or bus; please find instructions here.