
Explanation of self-attention (single head)
Attention (*Source: https://jasperwalshe.com/perspective-and-performance/) 1. Main ideas Convolutional neural networks (CNN) only focus...

Pattern Recognition - Chapter 6: Bayesian parameter estimation
1. Introduction to Bayesian parameter estimation (maximum a posteriori) 2. An example 2.1. The pre-observation 2.2. The post-observation...

Introduction to Bayesian networks
(*Source: Stefan Conrady - Bayesian Network Presentation) 1. Human and machine (*Source: Stefan Conrady - Bayesian Network...

Review of Techfest Munich 2017
I took part in the Techfest Munich hackathon at TU Munich in September 2017. It was the first time I had joined a hackathon. It was really...

Review of Pattern Recognition Chapter 5: Maximum-Likelihood estimation
Chapter 5 is very difficult to understand, which is why I am publishing this post to summarize it. Basically, we...

Pattern Recognition - Chapter 5: Maximum-Likelihood estimation
1. Maximum-Likelihood estimate (MLE) Equation (1) (*Source: Richard O. Duda et al. Pattern Recognition) Equation (2) Equation (3) 2. MLE...

Pattern Recognition - Chapter 4: Discriminant function with Gaussian distribution
Equation (1) 1. Independent features with the same (common) covariance (*Source: Tso B. and Mather P. Classification Methods for Remotely...

Pattern Recognition - Chapter 3: Bayes decision rule
(*Source: Richard O. Duda et al. Pattern Recognition) 1. Bayes decision rule 1.1. Risk function 1.1.1. Simple cost function (*Source:...

Pattern Recognition - Chapter 2: Normal distribution
1. Normal distribution (*Source: https://en.wikipedia.org/wiki/Normal_distribution) 2. Multivariate normal densities (*Source:...

Pattern Recognition - Chapter 1: Basic probability theory
1. Discrete - Continuous random variable 1.1. Discrete random variable 1.2. Continuous random variable 2. Statistical independence 2.1....

Three basic functions in functional programming (Python)
There are three functions that facilitate functional programming: map, filter and reduce. To explain these three functions, I will tell you...

Optimize code for Parallel Processing
1. Parallel strategies In the age of the internet, computers need to process big data, and the problem is how to reduce running time. To reduce...