International Journal of Innovative Research in Computer and Communication Engineering

ISSN Approved Journal | Impact factor: 8.771 | ESTD: 2013 | Follows UGC CARE Journal Norms and Guidelines



TITLE Implementation of a NEAT-Based Reinforcement Learning System for Autonomous Vehicle Navigation on ESP32
ABSTRACT This paper presents the complete design and implementation of an autonomous vehicle navigation system that unifies NeuroEvolution of Augmenting Topologies (NEAT) with a dense Reinforcement Learning (RL) reward signal to evolve controllers capable of collision-free path following in complex, obstacle-laden environments. NEAT evolves neural network topologies, adding and removing nodes and connections, across 40 generations of 50 genomes each, using cumulative RL reward as the fitness function. A multi-objective A* planner with a five-stage path-optimisation pipeline (including RDP simplification, line-of-sight pruning, obstacle pushing, and collision validation) supplies the reference path, and a vehicle model covering more than 30 terrain types and 5 wheel profiles ensures physical plausibility. The best-evolved genome is deployed over Wi-Fi to an ESP32 microcontroller, which executes move/turn commands via a PID-regulated L293D motor driver with quadrature encoder feedback. Experimental results demonstrate convergence at generation 30, a 71% reduction in collisions compared to A*-only planning, a 94% simulation success rate, and a 10 Hz real-time execution rate on hardware.
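The evolutionary loop the abstract describes (NEAT topologies scored by cumulative RL reward over 40 generations of 50 genomes) can be summarised as below. This is a minimal sketch assuming the neat-python library; run_episode is a hypothetical stand-in for the paper's simulator, and neat_config.ini is an assumed configuration file (pop_size = 50, two inputs, two outputs), not the authors' actual setup.

```python
import neat

def run_episode(net, steps=100):
    # Hypothetical stand-in for the paper's simulator: the vehicle chases
    # a fixed waypoint and accrues a dense reward (negative distance per step).
    x, y, reward = 0.0, 0.0, 0.0
    tx, ty = 5.0, 0.0
    for _ in range(steps):
        vx, vy = net.activate([tx - x, ty - y])  # assumes 2 inputs, 2 outputs
        x, y = x + 0.1 * vx, y + 0.1 * vy
        reward -= ((tx - x) ** 2 + (ty - y) ** 2) ** 0.5
    return reward

def eval_genomes(genomes, config):
    # NEAT fitness = cumulative RL reward, as the abstract describes.
    for _genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        genome.fitness = run_episode(net)

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     "neat_config.ini")        # assumed config: pop_size = 50
population = neat.Population(config)
winner = population.run(eval_genomes, 40)      # 40 generations, as reported
```

The winning genome is what would then be exported for on-vehicle execution; the actual sensor inputs, reward terms, and network sizes are those of the paper, not this toy episode.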
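The RDP (Ramer-Douglas-Peucker) stage of the path-optimisation pipeline [9] drops waypoints whose perpendicular deviation from the chord between segment endpoints falls below a tolerance. A self-contained illustration follows; the tolerance value and the pipeline's grid integration are assumptions, as the paper's exact parameters are not given here.

```python
from math import hypot

def rdp(points, eps):
    """Simplify a polyline, keeping only points deviating more than eps."""
    if len(points) < 3:
        return points[:]
    (x1, y1), (x2, y2) = points[0], points[-1]
    seg_len = hypot(x2 - x1, y2 - y1) or 1e-12
    # Perpendicular distance from p to the chord (cross-product formula).
    def dist(p):
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / seg_len
    # Find the interior point farthest from the chord.
    idx, dmax = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
                    key=lambda t: t[1])
    if dmax <= eps:
        return [points[0], points[-1]]
    # Recurse on both halves; drop the duplicated split point when splicing.
    return rdp(points[:idx + 1], eps)[:-1] + rdp(points[idx:], eps)

# Example: a noisy, nearly straight A* path collapses to its endpoints.
path = [(0, 0), (1, 0.05), (2, -0.04), (3, 0.02), (4, 0)]
print(rdp(path, eps=0.1))  # -> [(0, 0), (4, 0)]
```

Line-of-sight pruning and obstacle pushing would then operate on this reduced waypoint list before collision validation.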
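On the hardware side, the L293D motor driver is regulated by PID from quadrature-encoder feedback [6]. The firmware itself presumably runs in C/C++ on the ESP32; the Python sketch below only illustrates the discrete-time PID update at the reported 10 Hz control rate, with hypothetical gains and setpoints.

```python
class PID:
    """Discrete PID update: u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        # Return a duty-cycle correction from encoder feedback.
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: regulate wheel speed (encoder ticks/s) at the 10 Hz loop rate.
pid = PID(kp=0.8, ki=0.2, kd=0.05, dt=0.1)   # gains are assumptions
duty_correction = pid.update(setpoint=120.0, measured=95.0)
```

In the deployed system this correction would be applied to the PWM duty cycle driving the L293D's enable pins, with the move/turn commands arriving over Wi-Fi as the abstract states.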
AUTHOR Prof. Vaishnavi Sonawane and Masood Madki (Diploma Student), Department of Artificial Intelligence & Machine Learning, AISSMS Polytechnic, Pune, India
VOLUME 182
DOI 10.15680/IJIRCCE.2026.1403073
PDF pdf/73_Implementation of a NEAT-Based Reinforcement Learning System for Autonomous Vehicle Navigation on ESP32.pdf
KEYWORDS
References
[1] K. O. Stanley and R. Miikkulainen, "Evolving neural networks through augmenting topologies," Evol. Comput., vol. 10, no. 2, pp. 99-127, 2002.
[2] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018.
[3] A. Y. Ng, D. Harada, and S. Russell, "Policy invariance under reward transformations: Theory and application to reward shaping," in Proc. ICML, 1999.
[4] P. E. Hart, N. J. Nilsson, and B. Raphael, "A formal basis for the heuristic determination of minimum cost paths," IEEE Trans. Syst. Sci. Cybern., vol. 4, no. 2, pp. 100-107, 1968.
[5] M. Likhachev, G. Gordon, and S. Thrun, "ARA*: Anytime A* with provable bounds on sub-optimality," in Adv. Neural Inf. Process. Syst., 2003.
[6] K. J. Åström and T. Hägglund, PID Controllers: Theory, Design, and Tuning, ISA Press, 1995.
[7] Texas Instruments, "L293x Quadruple Half-H Drivers," Datasheet SLRS008H, 2022.
[8] Espressif Systems, "ESP32 Technical Reference Manual," v5.1, 2023.
[9] D. H. Douglas and T. K. Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature," Cartographica, vol. 10, no. 2, pp. 112-122, 1973.
[10] J. Togelius et al., "Search-based procedural content generation: A taxonomy and survey," IEEE Trans. Comput. Intell. AI Games, 2011.
[11] N. Jakobi, P. Husbands, and I. Harvey, "Noise and the reality gap: The use of simulation in evolutionary robotics," in Proc. ECAL, 1995.
[12] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics, MIT Press, 2005.
Copyright © IJIRCCE. All rights reserved.