Special Session 1: Implementation of Deep Neural Networks and Its Applications
Yiran Chen, University of Pittsburgh

Presentation 1

Title: Conditional Deep Learning: Energy Efficient and Enhanced Pattern Recognition

Speaker: Priyadarshini Panda and Kaushik Roy, Purdue University

Abstract: Brain-inspired computing models such as Artificial Neural Networks (ANNs) and Spiking Neural Networks (SNNs) have emerged as powerful tools for pattern recognition and classification problems. Nano-devices emulating the functionality of neurons and synapses are a crucial requirement for such neuromorphic computing platforms. Recent experiments on lateral spin valves (LSVs) have demonstrated the switching of nano-magnets when the net spin potential in the non-magnetic channel due to injected spin-polarized current exceeds a certain threshold, thereby emulating the biological neuron. Programmable domain wall strips can be interfaced with such "spin-neurons" to inject weighted spin-polarized current into the channel to mimic the synaptic functionality. The low-resistance, magneto-metallic neurons operate at a small terminal voltage of ~20 mV while performing analog computation on their inputs, leading to the possibility of ultra-low-power, low-voltage ANN-based neuromorphic computing. SNNs, on the other hand, are a more biologically realistic computing model than ANNs: they perform unsupervised learning by spike transmission and require the online programming of synapses based on the temporal information of spikes. Programmable resistive synapses based on ferromagnet-heavy metal hetero-structures offer the possibility of implementing Spike Timing Dependent Plasticity (STDP) mechanisms by utilizing the highly energy-efficient spin-orbit torque. Results indicate that the proposed design schemes can achieve a ~100X reduction in computation energy compared to state-of-the-art CMOS designs.
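As a rough illustration of the two behaviors the abstract describes, the sketch below models a neuron that fires when its weighted input exceeds a threshold, and a synapse updated by Spike Timing Dependent Plasticity. It is a minimal software sketch, not the device model from the talk: the pair-based exponential update is the common textbook form, and the function names and all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are assumptions chosen for illustration.

```python
import math


def spin_neuron_fires(inputs, weights, threshold):
    """Return True when the weighted input sum exceeds the threshold.

    Software stand-in for the thresholding behavior described in the
    abstract; it does not model any spin-device physics.
    """
    net = sum(x * w for x, w in zip(inputs, weights))
    return net > threshold


def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for dt_ms = t_post - t_pre.

    All default parameter values are hypothetical.
    """
    if dt_ms > 0:
        # Pre-synaptic spike came first: strengthen the synapse.
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        # Post-synaptic spike came first: weaken the synapse.
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


if __name__ == "__main__":
    print(spin_neuron_fires([1, 0, 1], [0.4, 0.9, 0.3], threshold=0.5))
    for dt in (-40.0, -5.0, 5.0, 40.0):
        print(dt, stdp_delta_w(dt))
```

The magnitude of the update shrinks as the two spikes move further apart in time, which is the temporal dependence the abstract refers to.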

Presentation 2

Title: Reflections of Deep Networks in Multimedia: From an Industry Perspective

Speaker: Liangliang Cao, Yahoo Labs

Abstract: Deep neural networks have become a hot topic in recent years. They open the door to big jumps in recognition accuracy and dramatic speedups in model evaluation. In the multimedia field, we have seen a number of cases where deep neural networks have created huge value for companies’ businesses. This talk will first introduce some fundamental concepts of deep learning, then analyze a few successful projects in industry, and finally discuss some future challenges of deep neural networks.

Presentation 3

Title: Real-Time Pedestrian Detection with Convolutional Neural Network on Customized Hardware

Speaker: Yu Wang, Tsinghua University and DeePhi Tech.

Abstract: Pedestrian detection is a core problem in computer vision and plays an important role in many real-world applications such as Advanced Driver Assistance Systems (ADAS) and video surveillance. The adoption of Convolutional Neural Networks (CNNs) for pedestrian detection has significantly increased detection accuracy, but the complexity of CNNs makes them hard to run on embedded devices in real time. In this paper, we present a novel customized hardware platform on an embedded FPGA for pedestrian detection. The platform runs in real time while consuming less than 4 W of power.

Presentation 4

Title: ApesNet: A Pixel-wise Efficient Segmentation Network

Speaker: Yiran Chen, University of Pittsburgh

Abstract: Analyzing road scenes using cameras could have a crucial impact in many domains, such as autonomous driving, personal navigation, and mapping of large-scale environments. It requires the ability to model appearance (road, building) and shape (cars, pedestrians), and to understand the spatial relationship (context) between different classes such as road and sidewalk. In typical road scenes, the majority of the pixels belong to large classes such as road and building, and hence the network must produce smooth segmentations. We present a deep convolutional neural network based segmentation engine that delineates moving and other objects based on their shape despite their small size, and hence retains boundary information in the extracted image representation. From a computational perspective, optimizations of the network architecture are demonstrated to be efficient in terms of both memory and computation time during inference.

Special Session 2: Power-Thermal Efficiency and Self-Awareness in Mobile Multimedia Devices
Muhammad Shafique, Vienna University of Technology (TU Wien), Austria

Presentation 1

Title: Dynamic Power Management in Mobile Systems: Opportunities and Challenges

Authors: Raid Ayoub and Michael Kishinevsky, Intel, USA

Abstract: The power consumption of high-end mobile systems continues to rise as a result of unprecedented market demand for high performance. High power consumption leads to shorter battery life and high temperatures, which severely impact the user experience and the reliability of the devices. As a result, efficient power management solutions have become a critical design aspect for mobile systems. Traditional SW-level power management solutions have mostly been studied for the PC and server domains, with a focus on the CPU subsystem. These approaches are not suitable for the mobile domain due to the emergence of additional power-hungry components (e.g., GPU and display) that are commonly used in these systems. On the opportunities side, the number of power control variables at the SW level continues to increase, enabling better power management solutions. The interesting research challenge is how to efficiently control these variables to optimize system power. In this work, we focus on SW-level dynamic power management approaches for the CPU-GPU and display subsystems in high-end mobile systems. We address the system-level aspect in terms of energy savings, performance improvement, and user satisfaction.

Presentation 2

Title: Towards Self-Aware Mobile Multimedia Systems through Intelligent Cross-Layer Coordination

Authors: Fadi Kurdahi and Nikil Dutt, University of California Irvine (UCI), USA, and Axel Jantsch, Vienna University of Technology (TU Wien), Austria

Abstract: Although there is a rich history of cross-layer design for mobile multimedia systems to achieve desired QoS, we are facing ever more challenges from the intertwined goals of energy efficiency and thermal design constraints, as well as resilience to errors emanating from the application, environment, and hardware platforms. We posit that next-generation computing platforms for mobile multimedia must necessarily deploy intelligent cross-layer design achieved through self-awareness principles inspired by biology and nature. Self-awareness means that the system maintains an abstract but still comprehensive model of its own state and performance, allowing it to understand how it is expected to perform, whether it deviates from these expectations, and which resources are required to accomplish specific goals. Such an approach will move us from current strategies (using limited cross-layer coordination) to a holistic cross-layer strategy that enables intelligent cross-layer management policies which can adaptively tune themselves based on the current state of the system. The talk will present multimedia design exemplars that embrace this intelligent cross-layer approach, and highlight the role of self-awareness in achieving dynamic adaptivity. The talk will also outline different attributes and levels of self-awareness and speculate on the needs and overheads of these attributes.

Presentation 3

Title: Providing Sustainable Performance in Thermally Constrained Mobile Devices

Authors: Ayse K. Coskun and Onur Sahin, Boston University, USA

Abstract: State-of-the-art smartphones can generate excessive amounts of heat during high computational activity or long durations of use. While throttling mechanisms ensure safe component and outer-surface temperatures, frequent throttling can significantly degrade user-perceived performance. This work explores the impact of multiple thermal constraints in a real-life smartphone on user experience. In addition to high processor temperatures, which have traditionally been a major point of interest, we show that applications can also quickly elevate battery and device skin temperatures to critical levels. We introduce and evaluate various thermally efficient runtime management techniques that can slow down heating under performance guarantees, so as to sustain a desirable level of performance for as long as possible.

Presentation 4

Title: Scheduling Challenges and Opportunities in Integrated CPU+GPU Processors

Authors: Sherief Reda and Kapil Dev, Brown University, USA

Abstract: Heterogeneous processors with architecturally different devices (CPU and GPU) integrated on the same die provide good performance and energy efficiency for a wide range of workloads. However, they also create challenges and opportunities in terms of scheduling workloads on the appropriate device. Current scheduling practices mainly use the characteristics of kernel workloads to decide the CPU/GPU mapping. However, we observe that runtime conditions such as power and CPU load also affect the mapping decision. Consequently, in this paper, we propose techniques to characterize OpenCL kernel workloads at runtime and map them to the appropriate device under time-varying physical (i.e., chip power limit) and CPU load conditions, in particular the number of CPU cores available to the OpenCL kernel. We implement our dynamic scheduler on a real CPU-GPU processor and evaluate it using various OpenCL benchmarks. Compared to the state-of-the-art kernel-level scheduling method, the proposed scheduler provides up to 31% and 10% improvements in runtime and energy, respectively.
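To make the idea of a runtime-aware mapping decision concrete, the sketch below picks a device for one kernel given the current chip power limit and the number of free CPU cores. It is only an illustrative sketch and not the authors' scheduler: `KernelProfile`, `choose_device`, the per-kernel time and power estimates, and the assumption that CPU runtime and power scale linearly with core count are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KernelProfile:
    """Hypothetical per-kernel estimates gathered at runtime."""
    cpu_time_one_core_s: float   # estimated runtime on a single CPU core
    gpu_time_s: float            # estimated runtime on the GPU
    cpu_power_per_core_w: float  # estimated power per active CPU core
    gpu_power_w: float           # estimated GPU power


def choose_device(profile: KernelProfile,
                  available_cores: int,
                  power_limit_w: float) -> Optional[str]:
    """Return "cpu", "gpu", or None if neither fits the power limit."""
    candidates = []
    if available_cores > 0:
        cpu_power = profile.cpu_power_per_core_w * available_cores
        if cpu_power <= power_limit_w:
            # Assumes ideal linear speedup across cores (a simplification).
            cpu_time = profile.cpu_time_one_core_s / available_cores
            candidates.append((cpu_time, "cpu"))
    if profile.gpu_power_w <= power_limit_w:
        candidates.append((profile.gpu_time_s, "gpu"))
    if not candidates:
        return None
    # Pick the device with the smallest estimated runtime.
    return min(candidates)[1]


if __name__ == "__main__":
    k = KernelProfile(cpu_time_one_core_s=8.0, gpu_time_s=3.0,
                      cpu_power_per_core_w=5.0, gpu_power_w=15.0)
    print(choose_device(k, available_cores=4, power_limit_w=25.0))
    print(choose_device(k, available_cores=1, power_limit_w=25.0))
    print(choose_device(k, available_cores=4, power_limit_w=16.0))
```

With the same kernel, the chosen device changes as the number of free cores or the power limit changes, which is the effect of runtime conditions that the abstract points out.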