Posts

2026

Unpacking Manifold-Constrained Hyper-Connections: A Deep Dive into DeepSeek's Architecture

23 January 2026·26 mins

A technical deep dive into DeepSeek’s Manifold-Constrained Hyper-Connections (mHC), exploring how Doubly Stochastic Matrices and the Birkhoff Polytope solve gradient instability while expanding network capacity.

Understanding RMSNorm: My Notes on Faster Layer Normalization

3 January 2026

Originally published on NeuraForge

2025

The Deadly Triad in Reinforcement Learning: Why Agents Fail and How DQN Fixed It

22 September 2025

Originally published on NeuraForge

A Deep Dive into Q-Learning: The Off-Policy TD Control Algorithm

19 September 2025

Originally published on NeuraForge

A Deep Dive into On-Policy TD Control: The SARSA Algorithm

18 September 2025·9 mins

Deep dive into SARSA, a foundational on-policy TD control algorithm for reinforcement learning.

Temporal Difference: Bootstrapping in Reinforcement Learning

17 September 2025·2 mins

Understanding the TD learning update rule and bootstrapping in reinforcement learning.

Monte Carlo Learning in RL

15 September 2025·13 mins

Guide to Monte Carlo methods in RL: learning from complete episodes and full returns.

Model Free RL: Prediction, Control, and the MRP-MDP Duality

14 September 2025

Originally published on NeuraForge

My Three Months at Relativity: Building AI for Legal Tech

7 September 2025·6 mins

Reflections on building AI for legal tech during my Applied Science internship at Relativity.

Reinforcement Learning Essentials: MDPs & Optimal Control

9 August 2025

Originally published on NeuraForge