Efficient AI

One of the great benefits of lockdown for me is the time I have to catch up on papers that are not directly related to my day-to-day work. In the past week I’ve been catching up on some of the more general outputs from NeurIPS 2020. One of the papers that really caught my eye was “Ultra-Low Precision 4-bit Training of Deep Neural Networks” by Xiao Sun et al.

There’s no doubt that AI in its current form takes a lot of energy. You only have to look at some of the estimated costs of GPT-3 to see how the trend is pushing towards larger, more complex models on larger, more complex hardware to get state-of-the-art results. These AI super-models take a tremendous amount of power to train, with costs out of the reach of individuals and most businesses. AI edge computing has been looking at moving ongoing training into smaller models running on edge devices, but to get the accuracy and the speed, the default option is expensive dedicated hardware and more memory. Is there another way?
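The paper itself proposes specialised 4-bit floating-point formats and gradient scaling, but the basic idea of low-precision representation is easy to illustrate. Below is a minimal sketch, assuming a plain symmetric integer quantiser written in NumPy (not the paper’s actual scheme): each float32 value is mapped to one of 16 levels, cutting memory and bandwidth by roughly 8x at the cost of some quantisation error.

```python
import numpy as np

# Illustrative sketch only: a symmetric linear quantiser mapping float32
# values to signed 4-bit integers (16 levels). Real 4-bit training uses
# more sophisticated formats; this just shows the memory trade-off.

def quantize_int4(x: np.ndarray):
    """Quantise a float32 array to signed 4-bit integers plus a scale."""
    scale = max(np.max(np.abs(x)) / 7.0, 1e-12)  # symmetric int4 range is [-7, 7]
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 array from the 4-bit representation."""
    return q.astype(np.float32) * scale

weights = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int4(weights)
approx = dequantize_int4(q, scale)

# Each value now needs 4 bits instead of 32 (in practice two values would be
# packed per byte), but the reconstruction is only approximate.
print("max abs error:", np.abs(weights - approx).max())
```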

Continue reading Efficient AI

ImageNet in 4 Minutes? What the paper really shows us

ImageNet has been a deep learning benchmark dataset since it was created. It was the competition built around it that showed deep learning networks could outperform non-ML techniques, and it’s been used by academics as a standard for testing new image classification systems. A few days ago an exciting paper was published on arXiv describing training ImageNet in four minutes. Not weeks, days or hours, but minutes. On the surface this is a great leap forward, but it’s important to dig a little deeper. The Register’s sub-headline gave the game away: the result relies on a relaxed view of accuracy and thousands of GPUs.

So if you don’t have a relaxed view of accuracy or thousands of GPUs lying around, what’s the point? Is there anything that can be taken from this paper?

Continue reading ImageNet in 4 Minutes? What the paper really shows us