Dr Vasilios Kelefouras

Lecturer in Computer Science

School of Engineering, Computing and Mathematics (Faculty of Science and Engineering)

Biography

Since September 2018, I have been a Lecturer (Assistant Professor) in the Department of Computing at the University of Plymouth. In simple terms, my research lies in the broad field of optimizing software applications, e.g., Deep Neural Network (DNN) applications, in terms of execution time, energy consumption, and memory footprint. It spans a diverse range of areas, including low-level hardware-dependent compiler optimizations and techniques for CPUs, GPUs, and FPGAs; high-level compression-based optimizations for DNNs, such as filter pruning, low-rank factorization, and quantization; task scheduling; and memory management strategies.

In 2013, I received my PhD from the Department of Electrical and Computer Engineering at the University of Patras; I applied for and won the Greek PhD Research Scholarship.
From September 2013 until December 2016, I worked as a postdoctoral researcher at the VLSI Lab of the Department of Electrical and Computer Engineering, University of Patras. Additionally, from October 2015 until December 2016, I worked as a postdoctoral researcher at the Embedded System Design and Application Lab of the Technological Educational Institute of Western Greece. From January 2017 until December 2017, I was a Research Fellow in the Distributed Systems and Services Research Group, School of Computing, University of Leeds (UK). Finally, from December 2017 until September 2018, I worked as a Lecturer at Sheffield Hallam University.
Teaching

Teaching interests

I currently lead the following modules:
  • Parallel Computing (COMP3001)
  • Computer Systems (COMP1001)
Admin Roles
  • Undergraduate Admissions Tutor
  • Academic Liaison Person for Partner Colleges 
  • Equality, Diversity and Inclusion (EDI) Committee member
Research

Research interests

  • Compiler Optimizations for High Performance Computing
  • Optimizing Deep Neural Networks in terms of execution time and memory size
  • Optimization of Matrix/Tensor Computations
  • Loop Transformations
  • Data Movement Optimization in Cache Memories
  • Task Scheduling for High Performance Computing
  • Compiler Optimizations for reducing Energy Consumption
  • Compiler Optimizations for Faulty Cache Memories
I have strong R&D experience in optimizing software applications in terms of execution time, energy consumption, and memory size on a wide range of hardware platforms, including embedded systems, GPUs, and FPGAs. I have published more than 45 research papers in high-quality journals and conferences, including IEEE and ACM Transactions. I am currently supervising three PhD students.

Research Highlights:
Compiler Optimizations for accelerating Deep Neural Networks [1]: Convolution layers are the main performance bottleneck in many classes of Deep Neural Networks, especially in Convolutional Neural Networks, which are widely used in artificial intelligence applications such as computer vision. In this research work [1], a novel analytical methodology is developed for generating high-performance convolution layers on CPUs. The experimental results, which cover 112 different convolution layers and two hardware platforms, show that the convolution layers of ResNet-50, DenseNet-121 and SqueezeNet run between 1.1x and 7.2x faster than with the state-of-the-art Intel oneDNN library.

[1] V. Kelefouras and G. Keramidas, “Design and Implementation of Deep Learning 2D Convolutions on modern CPUs,” IEEE Transactions on Parallel and Distributed Systems, 2023 
Compiler Optimizations for accelerating Smoothing, Sharpening and Edge Detection image/video processing software applications [2]: A novel methodology is developed [2] to speed up Smoothing, Sharpening and Edge Detection algorithms on CPUs. The popular OpenCV library can use the optimized Intel IPP library to accelerate such routines on Intel CPUs (this is not the default and must be specified during the installation phase). Based on our experimental results, which cover 20 different image sizes and two hardware platforms, the proposed methodology achieves speedups from 2.8x to 40x over the Intel IPP / OpenCV library for the GaussianBlur() and filter2D() OpenCV routines.

[2] V. Kelefouras and G. Keramidas, “Design and implementation of 2D convolution on x86/x64 processors,” IEEE Transactions on Parallel and Distributed Systems, 2022 

Other research

  • Won the HiPEAC Technology Transfer Award (https://www.hipeac.net/awards/#/tech-transfer/2022/) for the proposal 'DSE methodology for Tensor Train Decomposition in NEOX AI-SDK'. The technology has been transferred to Think Silicon, a provider of ultra-low-power graphics processing units and Machine Learning accelerators.
  • Won the Best Paper Award at SAMOS XXII Conference: 22nd International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation. The paper is entitled “A Design Space Exploration Methodology for Enabling Tensor Train Decomposition in Edge Devices”

Research degrees awarded to supervised students

PhD Students:
  1. “Accelerating Machine Learning Algorithms on Heterogeneous Multi-GPU Clusters”, 2nd Supervisor (2020-present)
  2. “Hardware-Software Co-design Methodologies for accelerating Machine Learning algorithms”, 2nd Supervisor (2022-present)
  3. “Improving the performance of HiRep Lattice Simulations software by exploiting the CPU/GPU hardware architecture details and algorithm characteristics”, Main Supervisor (DoS) (2020-present)
  4. “Optimising Flow Routing Using Network Performance Analysis”, 2nd Supervisor for Dr Muna Al-Saadi (2020-2023)

Grants & contracts

Co-I on 'Development of GIB optimisation for investigating local scour around complex structure for marine restoration and protection'. This research grant (£30,000) is funded by Engys Ltd and the University RnD solutions fund. The partners are Engys Ltd and ARC Marine. April 2023 - December 2023.
PI on 'Optimisation of sparse matrix solvers for Computational Fluid Dynamics', funded by the RnD solutions fund (£2,000). June 2023 - July 2023.
Publications

Key publications

Journals
Articles
Kelefouras V & Keramidas G (2023) 'Design and Implementation of Deep Learning 2D Convolutions on modern CPUs' IEEE Transactions on Parallel and Distributed Systems Open access
Kolosov D, Kelefouras V, Kourtessis P & Mporas I (2023) 'Contactless Camera-Based Heart Rate and Respiratory Rate Monitoring Using AI on Hardware' Sensors 23, (9) Open access
Al-Saadi M, Khan A, Kelefouras V, Walker DJ & Al-Saadi B (2023) 'SDN-Based Routing Framework for Elephant and Mice Flows Using Unsupervised Machine Learning' Network 3, (1) 218-238 Open access
Kolosov D, Kelefouras V, Kourtessis P & Mporas I (2022) 'Anatomy of Deep Learning Image Classification and Object Detection on Commercial Edge Devices: A Case Study on Face Mask Detection' IEEE Access 10, 109167-109186 Open access
Kelefouras V & Djemame K (2022) 'Workflow Simulation and Multi-Threading Aware Task Scheduling for Heterogeneous Computing' Journal of Parallel and Distributed Computing (Elsevier) Open access
Kelefouras V, Djemame K, Keramidas G & Voros N (2022) 'A methodology for efficient tile size selection for affine loop kernels' International Journal of Parallel Programming Open access
Kelefouras V & Keramidas G (2022) 'Design and Implementation of 2D Convolution on x86/x64 Processors' IEEE Transactions on Parallel and Distributed Systems Open access
Mporas I, Perikos I, Kelefouras V & Paraskevas M (2020) 'Illegal Logging Detection Based on Acoustic Surveillance of Forest' Applied Sciences 10, (20) 7379 Open access
Kelefouras V & Djemame K (2019) 'A methodology correlating code optimizations with data memory accesses, execution time and energy consumption' Journal of Supercomputing Open access
Kelefouras V, Keramidas G & Voros N (2018) 'Combining Software Cache Partitioning and Loop Tiling for Effective Shared Cache Management' ACM Transactions on Embedded Computing Systems 17, (3) 1-25 Open access
Kelefouras V (2017) 'A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details' Computing 99, (9) 865-888 Open access
Kritikakou A, Catthoor F, Kelefouras V & Goutis C (2016) 'Array Size Computation under Uniform Overlapping and Irregular Accesses' ACM Transactions on Design Automation of Electronic Systems 21, (2) 1-35
Kelefouras V, Kritikakou A, Mporas I & Kolonias V (2016) 'A high-performance matrix–matrix multiplication methodology for CPU and GPU architectures' The Journal of Supercomputing 72, (3) 804-844
Michail HE, Athanasiou GS, Kelefouras VI, Theodoridis G, Stouraitis T & Goutis CE (2015) 'Area-Throughput Trade-Offs for SHA-1 and SHA-256 Hash Functions’ Pipelined Designs' Journal of Circuits, Systems, and Computers 25, (04) 1650032 Open access
Kelefouras V, Kritikakou A & Goutis C (2015) 'A methodology for speeding up loop kernels by exploiting the software information and the memory architecture' Computer Languages, Systems & Structures 41, 21-41
Kelefouras V, Kritikakou A, Papadima E & Goutis C (2015) 'A methodology for speeding up matrix vector multiplication for single/multi-core architectures' The Journal of Supercomputing 71, (7) 2644-2667
Kritikakou A, Catthoor F, Kelefouras V & Goutis C (2014) 'A scalable and near-optimal representation of access schemes for memory management' ACM Transactions on Architecture and Code Optimization 11, (1) 1-25
Kelefouras V, Kritikakou A & Goutis C (2014) 'A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD' The Journal of Supercomputing 68, (3) 1418-1440
Kelefouras V, Kritikakou A & Goutis C (2013) 'A methodology for speeding up edge and line detection algorithms focusing on memory architecture utilization' The Journal of Supercomputing 68, (1) 459-487
Kritikakou A, Catthoor F, Kelefouras V & Goutis C (2013) 'Near-optimal and scalable intrasignal in-place optimization for non-overlapping and irregular access schemes' ACM Transactions on Design Automation of Electronic Systems 19, (1) 1-30
Kelefouras VI, Kritikakou AS, Siourounis K & Goutis CE (2013) 'A Methodology for Speeding up MVM for Regular, Toeplitz and Bisymmetric Toeplitz Matrices' Journal of Signal Processing Systems 77, (3) 241-255
Kritikakou A, Catthoor F, Athanasiou GS, Kelefouras V & Goutis C (2013) 'Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints' ACM Transactions on Architecture and Code Optimization 10, (2) 1-25
Kritikakou A, Catthoor F, Kelefouras V & Goutis C (2013) 'A systematic approach to classify design-time global scheduling techniques' ACM Computing Surveys 45, (2) 1-30
Michail HE, Athanasiou GS, Kelefouras V, Theodoridis G & Goutis CE (2012) 'On the exploitation of a high-throughput SHA-256 FPGA design for HMAC' ACM Transactions on Reconfigurable Technology and Systems 5, (1) 1-28
Kelefouras VI, Athanasiou GS, Alachiotis N, Michail HE, Kritikakou AS & Goutis CE (2011) 'A Methodology for Speeding Up Fast Fourier Transform Focusing on Memory Architecture Utilization' IEEE Transactions on Signal Processing 59, (12) 6217-6226
Alachiotis N, Kelefouras VI, Athanasiou GS, Michail HE, Kritikakou AS & Goutis CE (2010) 'A data locality methodology for matrix–matrix multiplication algorithm' The Journal of Supercomputing 59, (2) 830-851
Kokhazadeh M, Keramidas G, Kelefouras V & Stamoulis I 'A Practical Approach for Employing Tensor Train Decomposition in Edge Devices' International Journal of Parallel Programming, Springer Open access
Chapters
Anthimopulos T, Keramidas G, Kelefouras V & Stamoulis I (2023) 'A Comparative Study of Neural Network Compilers on ARMv8 Architecture' Architecture of Computing Systems Springer Nature Switzerland 18-33
Kelefouras V, Djemame K, Keramidas G & Voros N (2022) 'An Analytical Model for Loop Tiling Transformation' Lecture Notes in Computer Science Springer International Publishing 95-107
Djemame K, Kavanagh R, Kelefouras V, Aguilà A, Ejarque J, Badia R, Pérez DG, Pezuela C, Deprez J-C & Guedria L (2018) 'Towards an Energy-Aware Framework for Application Development and Execution in Heterogeneous Parallel Architectures' Hardware Accelerators in Data Centers Springer
Conference Papers
Gkountelos D, Kokhazadeh M, Bournas C, Keramidas G & Kelefouras V (2023) 'Towards Highly Compressed CNN Models for Human Activity Recognition in Wearable Devices' 2023 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) IEEE
Kokhazadeh M, Keramidas G, Kelefouras V & Stamoulis I (2022) 'A Design Space Exploration Methodology for Enabling Tensor Train Decomposition in Edge Devices' International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) Open access
Dorr T, Schade F, Masing L, Becker J, Keramidas G, Antonopoulos CP, Mavropoulos M, Kelefouras V & Voros N (2022) 'Safety by Construction: Pattern-Based Application of Safety Mechanisms in XANDAR' 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) IEEE
Siddiqui F, Khan R, Sezer S, McLaughlin K, Masing L, Dorr T, Schade F, Becker J, Ahlbrecht A & Zaeske W (2022) 'XANDAR: A holistic Cybersecurity Engineering Process for Safety-critical and Cyber-physical Systems' 2022 IEEE 95th Vehicular Technology Conference (VTC2022-Spring) IEEE Open access
Djemame K, Datsev D & Kelefouras V (2022) 'Evaluation of language runtimes in open-source serverless platforms' 12th International Conference on Cloud Computing and Services Science Open access
Masing L, Dorr T, Schade F, Becker J, Keramidas G, Antonopoulos CP, Mavropoulos M, Tiganourias E, Kelefouras V & Antonopoulos K (2022) 'XANDAR: Exploiting the X-by-Construction Paradigm in Model-based Development of Safety-critical Systems' 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE) IEEE Open access
Becker J, Masing L, Dorr T, Schade F, Keramidas G, Antonopoulos CP, Mavropoulos M, Tiganourias E, Kelefouras V & Antonopoulos K (2021) 'XANDAR: X-by-Construction Design framework for Engineering Autonomous & Distributed Real-time Embedded Software Systems' 2021 31st International Conference on Field-Programmable Logic and Applications (FPL) IEEE
Kelefouras V (2021) 'An analytical model for loop tiling transformation' International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation Open access
Tiganourias E, Mavropoulos M, Keramidas G, Kelefouras V, Antonopoulos CP & Voros N (2021) 'A Hierarchical Profiler of Intermediate Representation Code based on LLVM' 2021 10th Mediterranean Conference on Embedded Computing (MECO) IEEE Open access
Al-Saadi M, Khan A, Kelefouras V, Walker DJ & Al-Saadi B (2021) 'Unsupervised Machine Learning-Based Elephant and Mice Flow Identification' Springer International Publishing 357-370
Kelefouras V & Djemame K (2018) 'Workflow Simulation Aware and Multi-Threading Effective Task Scheduling for Heterogeneous Computing' 25th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC) Bengaluru, India Open access
Kelefouras V & Djemame K (2018) 'A methodology for efficient code optimizations and memory management' ACM International Conference on Computing Frontiers 2018 (CF '18) Ischia, Italy Open access
Kelefouras V, Keramidas G & Voros N (2017) 'Cache partitioning + loop tiling: A methodology for effective shared cache management' IEEE Computer Society Annual Symposium on VLSI Bochum, Germany Open access
Emeretlis A, Kelefouras V, Theodoridis G, Nanou M, Politi C, Georgoulakis K & Glentis O (2015) 'FPGA Implementation of a MIMO DFE in 40 Gb/s DQPSK Optical Links' EUSIPCO Nice, France
Kritikakou A, Catthoor F, Kelefouras V & Goutis C (2015) 'Near-optimal & Scalable Representation of Access Schemes for Memory Management' European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC) Amsterdam, The Netherlands
Emeretlis A, Kelefouras V, Theodoridis G & Glentis O (2014) 'Efficient FPGA Implementations of Volterra DFEs for Optical Systems' IEEE Dallas Circuits and Systems Conference (DCAS)
Kritikakou A, Catthoor F, Athanasiou GS, Kelefouras V & Goutis C (2012) 'A template-based methodology for efficient microprocessor and FPGA accelerator co-design' 2012 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XII) IEEE
Michail H, Kelefouras V, Panagianopoulou D, Gregoriades A, Kotsiolis A & Goutis C (2010) 'HW/SW co-Design Integrating High-Speed Authentication Module for IPSec/IPv6' Oral Presentation in the fifth International Conference on Digital Telecommunications (ICDT 2010) Athens/Glyfada, Greece
Michail H, Apostolopoulou D, Anastasiou L, Porpodas V, Athanasiou G, Kelefouras V & Goutis C (2009) 'Novel Hardware Implementation of the Cipher Message Authentication Code (CMAC)' Oral Presentation in 1st Panhellenic Conference on Electronics and Telecommunications (PACET '08) Patras, Greece
Kelefouras V 'XANDAR: An X-by-Construction Framework for Safety, Security, and Real-Time Behavior of Embedded Software Systems' Design, Automation and Test in Europe (DATE) Open access
Personal

Reports & invited lectures

  • Visiting Lecturer on the “Compilers for Embedded Systems” module of the master's programme “Integrated Software and Hardware Systems” in the Department of Computer Engineering and Informatics at the University of Patras (winter semesters 2013-2014, 2014-2015, 2015-2016, 2016-2017, 2017-2018, 2018-2019, 2019-2020, 2021-2022)
  • Invited speaker at the 1st European Compiler Seminar, organized by the Customized Parallel Computing group of Tampere University (http://tuni.fi/cpc), Spring 2022

Other academic activities

Program Committee Member
  • Computing Frontiers (CF) Conference (https://www.computingfrontiers.org/2023/)