Machine Learning

2026

Unpacking Manifold-Constrained Hyper-Connections: A Deep Dive into DeepSeek's Architecture

23 January 2026·26 mins

A technical deep dive into DeepSeek’s Manifold-Constrained Hyper-Connections (mHC), exploring how doubly stochastic matrices and the Birkhoff polytope address gradient instability while expanding network capacity.

2025

A Deep Dive into On-Policy TD Control: The SARSA Algorithm

18 September 2025·9 mins

Deep dive into SARSA, a foundational on-policy TD control algorithm for reinforcement learning.

Monte Carlo Learning in RL

15 September 2025·13 mins

Guide to Monte Carlo methods in RL: learning from complete episodes and full returns.