Samsung Electronics Introduces A High-Speed, Low-Power NPU Solution for AI Deep Learning

Deep learning algorithms are a core element of artificial intelligence (AI) as they are the processes by which a computer is able to think and learn like a human being does. A Neural Processing Unit (NPU) is a processor that is optimized for deep learning algorithm computation, designed to efficiently process thousands of these computations simultaneously.


Samsung Electronics last month announced its goal to strengthen its leadership in the global system semiconductor industry by 2030 by expanding development of its proprietary NPU technology. The company recently delivered an update on this goal at the Conference on Computer Vision and Pattern Recognition (CVPR), one of the top academic conferences in the field of computer vision.


This update is the company’s development of its On-Device AI lightweight algorithm, introduced at CVPR with a paper titled “Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss”. On-Device AI technologies compute and process data directly within the device itself. Over 4 times lighter and 8 times faster than existing algorithms, Samsung’s latest algorithm is a dramatic improvement over previous solutions and is regarded as key to solving potential issues for low-power, high-speed computations.



Streamlining the Deep Learning Process

Samsung Advanced Institute of Technology (SAIT) has announced that it has successfully developed On-Device AI lightweight technology that performs computations 8 times faster than existing server-based deep learning that uses 32-bit data. By representing the data in groups of under 4 bits while maintaining accurate data recognition, this method of deep learning algorithm processing is simultaneously much faster and much more energy efficient than existing solutions.
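
To illustrate why moving from 32-bit values to values of 4 bits or fewer makes data lighter, the following minimal sketch packs eight 4-bit codes into a single 32-bit word. It is an illustration only, not Samsung’s implementation: the function names `pack_4bit` and `unpack_4bit` and the sample codes are hypothetical, and the article does not state how the quantized data are actually stored.

```python
def pack_4bit(values):
    """Pack unsigned 4-bit integers (0..15) into 32-bit words, eight per word."""
    words = []
    for start in range(0, len(values), 8):
        word = 0
        for offset, v in enumerate(values[start:start + 8]):
            if not 0 <= v <= 15:
                raise ValueError("value does not fit in 4 bits")
            word |= v << (4 * offset)  # place each code in its own 4-bit slot
        words.append(word)
    return words


def unpack_4bit(words, count):
    """Recover `count` 4-bit values from the packed 32-bit words."""
    values = []
    for word in words:
        for offset in range(8):
            values.append((word >> (4 * offset)) & 0xF)
    return values[:count]


# Hypothetical sample data: ten 4-bit codes occupy two 32-bit words.
codes = [3, 15, 0, 7, 1, 9, 12, 4, 5, 2]
packed = pack_4bit(codes)
restored = unpack_4bit(packed, len(codes))
```

In this sketch, ten codes that would take ten 32-bit words at full precision fit in two words, and unpacking recovers the codes exactly.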



Samsung’s new On-Device AI processing technology determines, through ‘learning’, the intervals of the significant data that influence overall deep learning performance. This ‘Quantization¹ Interval Learning (QIL)’ retains data accuracy by re-organizing the data so that it is represented in fewer bits than its original size. In experiments run by SAIT, quantizing a 32-bit, server-grade deep learning algorithm to levels of less than 4 bits with QIL provided higher accuracy than other existing solutions.
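
The idea of quantizing within an interval can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the QIL algorithm itself: the interval is described by a hypothetical `center` and `half_width` that are fixed by hand here, whereas the approach described above learns the intervals. The function name and the sample weights are also assumptions.

```python
def quantize_in_interval(w, center, half_width, bits):
    """Map a real weight to one of a few signed levels.

    Magnitudes at or below (center - half_width) become 0, magnitudes at or
    above (center + half_width) saturate at 1, and magnitudes in between are
    scaled linearly to [0, 1] before rounding to the nearest level.
    """
    levels = 2 ** (bits - 1) - 1          # e.g. 3 positive levels for 3 bits
    low, high = center - half_width, center + half_width
    magnitude = abs(w)
    if magnitude <= low:
        scaled = 0.0
    elif magnitude >= high:
        scaled = 1.0
    else:
        scaled = (magnitude - low) / (high - low)
    code = int(scaled * levels + 0.5)     # round to the nearest integer code
    sign = -1 if w < 0 else 1
    return sign * code / levels


# Hypothetical weights and interval, chosen only for illustration.
weights = [-0.9, -0.31, -0.05, 0.02, 0.25, 0.5, 0.8]
quantized = [quantize_in_interval(w, center=0.4, half_width=0.3, bits=3)
             for w in weights]
```

With 3 bits, every weight lands on one of at most seven values between -1 and 1. Small weights collapse to zero and large ones saturate, so the available levels are spent on the interval that matters most.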


When the data of a deep learning computation is represented in groups of fewer than 4 bits, logical operations such as ‘and’ and ‘or’ can be used alongside the arithmetic operations of addition and multiplication. This means that computation using the QIL process can achieve the same results as existing processes while using only 1/40 to 1/120 as many transistors².
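
One well-known way that logical operations can stand in for multiplication on low-bit data is a bit-plane dot product, sketched below. The article does not confirm that Samsung’s hardware works this way; the function names and the sample vectors are hypothetical, and the sketch assumes unsigned 2-bit values.

```python
def bit_plane(values, bit):
    """Pack bit number `bit` of every value into one integer, one position per value."""
    plane = 0
    for position, v in enumerate(values):
        plane |= ((v >> bit) & 1) << position
    return plane


def bitwise_dot(a, b, bits):
    """Dot product of two equal-length vectors of unsigned `bits`-bit integers.

    Each pair of bit planes is combined with AND, the set bits are counted,
    and the count is weighted by the planes' combined power of two.
    """
    total = 0
    for i in range(bits):
        plane_a = bit_plane(a, i)
        for j in range(bits):
            plane_b = bit_plane(b, j)
            matches = bin(plane_a & plane_b).count("1")  # population count
            total += matches << (i + j)
    return total


# Hypothetical 2-bit vectors; `reference` uses ordinary multiplication.
a = [3, 0, 2, 1, 3, 2]
b = [1, 3, 2, 2, 0, 3]
result = bitwise_dot(a, b, bits=2)
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, although `bitwise_dot` multiplies no pair of elements: it uses only AND, bit counting, shifts and additions. The number of plane pairs grows with the square of the bit width, which is one reason the approach suits very low-bit data.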


Because this system requires less hardware and less electricity, it can be mounted directly in-device at the place where the data for an image or fingerprint sensor is obtained, ahead of transmitting the processed data on to the necessary end points.



The Future of AI Processing and Deep Learning

This technology will help develop Samsung’s system semiconductor capacity and strengthen one of the core technologies of the AI era: On-Device AI processing. Unlike AI services that rely on cloud servers, On-Device AI technologies compute data directly within the device itself.



On-Device AI technology can reduce the cost of building cloud infrastructure for AI operations, since it operates on its own, and it provides quick and stable performance for use cases such as virtual reality and autonomous driving. Furthermore, On-Device AI technology can safely store personal biometric information used for device authentication, such as fingerprint, iris and face scans, on the mobile device itself.


“Ultimately, in the future we will live in a world where all devices and sensor-based technologies are powered by AI,” noted Chang-Kyu Choi, Vice President and head of the Computer Vision Lab at SAIT. “Samsung’s On-Device AI technologies are lower-power, higher-speed solutions for deep learning that will pave the way to this future. They are set to expand the memory, processor and sensor market, as well as other next-generation system semiconductor markets.”


A core feature of On-Device AI technology is its ability to compute large amounts of data at a high speed without consuming excessive amounts of electricity. Samsung’s first solution to this end was the Exynos 9 (9820), introduced last year, which featured a proprietary Samsung NPU inside the mobile System on Chip (SoC). This product allows mobile devices to perform AI computations independent of any external cloud server.


Many companies are turning their attention to On-Device AI technology. Samsung Electronics plans to enhance and extend its AI technology leadership by applying this algorithm not only to mobile SoCs, but also to memory and sensor solutions in the near future.


Four individuals who played key roles in developing Samsung’s On-Device AI lightweight algorithm. From left to right: Jae-Joon Han, Chang-Young Son, Sang-Il Jung and Chang-Kyu Choi of Samsung Advanced Institute of Technology.


1 Quantization is the process of decreasing the number of bits in data by binning the given data into sections with a limited number of levels; each section is represented by a certain bit value, and all data within a section are regarded as having the same value.
2 Transistors are devices that control the flow of current or voltage in a semiconductor by acting as amplifiers or switches.
