Online learning control based on projected gradient temporal difference and advanced heuristic dynamic programming | IEEE Conference Publication