Explaining Bayesian A/B Testing with Python Implementation

Image by Alessandro Crosato from Unsplash

A/B testing has applications across many industries, from identifying the best market groups to target to medical drug testing, and it allows businesses to make decisions based on the results. There are two common ways to approach A/B testing, the frequentist approach and the Bayesian approach, both stemming from the foundations of hypothesis testing. In this article, I’ll cover the explanation and implementation of the Bayesian approach to A/B testing. …
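As a rough illustration of the Bayesian approach, here is a minimal sketch that assumes a conversion-style test with a uniform Beta(1, 1) prior on each variant's rate. The function name prob_b_beats_a and the visitor counts are hypothetical and are not taken from the article.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from each variant's posterior."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # With a Beta(1, 1) prior, the posterior of a conversion rate is
        # Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical counts, for illustration only.
p = prob_b_beats_a(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"P(B > A) is roughly {p:.3f}")
```

The result is read directly as the probability that variant B has the higher rate, given the data and the assumed prior.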

This article will explain the frequentist approach to A/B testing and provide an example, with code, of when and how to use it.

Image by Jason Dent from Unsplash

A/B testing is commonly used across industries to make decisions in different aspects of the business. From writing emails to choosing landing pages and implementing specific feature designs, A/B testing can be used to make the best decision based on statistical analysis. This article will cover the basics of the frequentist approach to A/B testing and outline an example of how to derive a decision through A/B testing. I will also provide the associated Python implementation for a specific example.
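One common frequentist test for comparing two conversion rates is the two-proportion z-test. The sketch below assumes that test and uses hypothetical counts; the article's own example may use a different statistic.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants share one rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under the null hypothesis both groups share a single pooled rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, for illustration only.
z, p_value = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The null hypothesis is rejected when the p-value falls below the significance level chosen before the test, commonly 0.05.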

Table of Contents

  • What is A/B Testing
  • Frequentist Approach
    - Null & Alternative Hypothesis
    - Sample Mean Estimate
    - Confidence…

Explaining and Implementing kMeans Algorithm in Python

Image by Kelly Sikkema from Unsplash

This article will outline a conceptual understanding of the k-Means algorithm and its associated Python implementation using the sklearn library. k-Means is a clustering algorithm with many real-world use cases. The algorithm partitions a dataset into K clusters, and it is applied in many industries for tasks including pattern detection, medical diagnostics, stock analysis, community detection, market segmentation and image segmentation. It is often used to gain intuition about the dataset you’re working with, by grouping similar data points together into a cluster. …
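To make the grouping idea concrete, here is a plain-Python sketch of the standard k-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is an illustration only and does not use sklearn; the function name kmeans and the sample points are hypothetical.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster a list of equal-length tuples into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster ended up empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-blob dataset, for illustration only.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(data, k=2)
print(sorted(centroids))
```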

Explaining and Implementing SVM in Python

Image from Unsplash

Support Vector Machines (SVM) are a core algorithm used by data scientists. SVM can be applied to both regression and classification problems but is most commonly used for classification. Its popularity stems from the model's strong accuracy and computation speed (depending on the size of the data). Because SVM operates through kernels, it is also excellent at solving non-linear problems. The premise behind how SVM works is quite simple: given data plotted on a plane, the algorithm creates a line / hyperplane to separate the data into different classes.
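The separating-hyperplane idea can be sketched as a decision rule: a point is classified by the sign of w·x + b. The weights and bias below are hand-picked, hypothetical values rather than the output of a trained SVM, so this shows only how a hyperplane splits the plane, not how SVM finds it.

```python
def decision_value(weights, bias, point):
    """Signed score w.x + b; its sign tells which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1


# Hypothetical hand-picked line x1 + x2 - 6 = 0, not a fitted model.
weights, bias = (1.0, 1.0), -6.0
print(classify(weights, bias, (1.0, 2.0)))  # point on the negative side
print(classify(weights, bias, (4.0, 5.0)))  # point on the positive side
```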

In continuation of my…

Understanding & Implementation of Decision Tree & Random Forest

Image from Unsplash

Decision Trees and Random Forests are robust algorithms commonly used in industry because of their interpretability and performance. One of their strongest attributes is that they let users see which features contribute the most to the prediction, and how important each feature is based on its depth in the tree.

This article will provide a conceptual understanding of the decision tree and random forest algorithms. Although these algorithms are robust enough for both classification and regression problems, this article will focus on classification examples. You can apply a similar thought process described below…

Understand the Random Walk with Restart algorithm and its associated implementation in Python

Image from Unsplash

The scope of this article is to explain the concepts behind the random walk with restart algorithm. A rigorous mathematical treatment will not be provided here, but I have left links to resources where those interested can investigate the mathematics behind this algorithm further. I will also provide a documented Python script at the end of the article that implements the algorithm.

Table of Contents

  • Introduction
  • What is a Random Walk?
  • Random Walk with Restart
  • Advantages
  • Disadvantages
  • Python Implementation
  • Resources


This algorithm is highly applicable in research as well as the industry. If you’re…

Understand how Markov Chains work and implement them in Python to generate text


In this article, I will explain Markov chains and provide a Python implementation. This article will not be a deep dive into the mathematics behind Markov chains; instead, it will prioritize a conceptual understanding of how they work and how to implement them in Python. At the bottom of this article I have left the resources I used, along with other materials that go deeper into the mathematics behind Markov chains.

A Markov chain is a stochastic model created by Andrey Markov, which outlines the probability associated with a sequence of events occurring based on the state in the…
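As an illustration of the text-generation use case named in the title, here is a minimal first-order sketch in which the next word depends only on the current word. The function names build_chain and generate and the sample sentence are hypothetical; the article's own implementation may differ.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each word to the list of words observed immediately after it."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, picking each next word at random."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical training text, for illustration only.
words = "the cat sat on the mat".split()
chain = build_chain(words)
print(generate(chain, start="the", length=5))
```

Because repeated followers are stored more than once, frequent transitions are drawn proportionally more often.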

Understand the Monte Carlo method and how to implement it in Python


In this post, I will introduce, explain and implement the Monte Carlo method. This method of simulation is one of my favourites because it is simple yet refined enough to resolve complex problems. It was invented in the 1940s by Stanislaw Ulam, a Polish mathematician. It was named after a gambling town in Monaco because the principles of randomness mimic a game of roulette. Monte Carlo simulations are commonly used to quantify risk in areas like stock prices, sales forecasting, predictive modelling, etc.
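A classic small example of the idea, chosen here for illustration and not taken from the article, is estimating pi by random sampling: draw points uniformly in the unit square and count the share that fall inside the quarter circle of radius 1.

```python
import random


def estimate_pi(samples=100_000, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4 * inside / samples


print(f"pi is roughly {estimate_pi():.3f}")
```

Increasing the number of samples tightens the estimate, at the cost of more computation.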

How does the Monte Carlo Method Work?

Monte Carlo simulations are a method of simulating statistical…


This blog post continues my series of reviews of masterclasses I’ve watched. This week I’ll cover The Art of Negotiation by Chris Voss. As a preface, this review reflects my own opinions. If you disagree with it, feel free to comment and let me know your thoughts on this masterclass.

As with many other masterclasses I’ve watched, I had very high expectations for this one based on recommendations from my peers. I had never heard of Chris Voss, nor had I read any of…

Understand the KNN algorithm and its implementation in Python using the sklearn library

Image from: https://unsplash.com/photos/lW25Zxpkln8

In this article I will give a general overview of the K Nearest Neighbours algorithm, along with its implementation, drawbacks and associated resources. Supervised learning is a subsection of machine learning generally associated with classification and regression problems. Supervised learning implies that you are training a model using a labelled dataset. K Nearest Neighbours (KNN) falls under the supervised learning umbrella and is one of the core algorithms in machine learning. It’s a widely used, simple yet efficient example of a non-parametric, lazy learner classification algorithm.

  • Lazy Learner implies that it doesn’t learn a discriminative function from the training data but rather…
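The lazy-learner behaviour can be sketched in plain Python: nothing is fitted in advance, and each prediction is a majority vote among the k stored points closest to the query. This is an illustration only and does not use sklearn; the function name knn_predict and the sample data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    # No training step: the stored data is searched at prediction time.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled dataset, for illustration only.
points = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, query=(1.5, 1.5)))
```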

