Blog

Markov chain Monte Carlo Sampling (1)

In this first post of Tweag's four-part series on Markov chain Monte Carlo (MCMC) sampling algorithms, you will learn why and when to use them, along with the theoretical underpinnings of this powerful class of sampling methods. We discuss the famous Metropolis-Hastings algorithm and build intuition for choosing its free parameters. Interactive Python notebooks invite you to play around with MCMC yourself and thus deepen your understanding of the Metropolis-Hastings algorithm.

Code Line Patterns

We visualize large collections of Haskell and Python source code as 2D maps using methods from natural language processing (NLP) and dimensionality reduction, and find a surprisingly rich structure for both languages. Clustering on the 2D maps allows us to identify common patterns in source code that give rise to these structures. Finally, we discuss this first analysis in the context of advanced machine learning-based tools that perform automatic code refactoring and code completion.