Google misinterpreted Karpathy's (joke) tweet: a search for "what is the best learning rate for adam optimizer" says that "3e-4 is the best hands down".