New AI technique speeds up language models on edge devices
http://feedproxy.google.com/~r/venturebeat/SZYF/~3/qKicMC1LRoM/
Researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT-IBM Watson AI Lab recently proposed Hardware-Aware Transformers (HAT), an AI model training technique that builds on Google’s Transformer architecture. They claim that HAT can achieve a 3x inference speedup on devices like the Raspberry Pi 4 while shrinking model size 3.7x compared with a baseline.
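The reported numbers refer to end-to-end inference latency and parameter count on the target device. As a rough illustration of how such a speedup would be measured, the sketch below times the forward pass of a small Transformer encoder on CPU using PyTorch; the layer sizes, sequence length, and repeat count are illustrative assumptions, not HAT’s actual search space or configuration.

```python
import time
import torch
import torch.nn as nn

# Minimal sketch (not the HAT method itself): measure wall-clock inference
# latency of a small Transformer encoder on CPU, the kind of measurement a
# reported speedup on a Raspberry Pi 4 would be based on. All sizes below
# are illustrative assumptions.

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, dim_feedforward=1024),
    num_layers=3,
)
model.eval()

tokens = torch.randn(30, 1, 256)  # (sequence length, batch, embedding dim)

with torch.no_grad():
    # Warm-up pass so one-time allocation costs don't skew the timing.
    model(tokens)
    start = time.perf_counter()
    for _ in range(20):
        model(tokens)
    elapsed = time.perf_counter() - start

print(f"Average latency: {elapsed / 20 * 1000:.1f} ms per forward pass")
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")
```

Comparing this latency and parameter count between a baseline model and a hardware-optimized one is, in essence, what the 3x and 3.7x figures summarize.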