Danilo Croce
Danilo Croce is a tenure-track Assistant Professor in Computer Science at the University of Rome, Tor Vergata, Faculty of Engineering. He holds a Ph.D. in Informatics Engineering from the same university. His expertise spans both theoretical and applied Machine Learning, with a primary focus on Natural Language Processing, Information Retrieval, and Data Mining. He is particularly interested in innovative kernels within Support Vector Machines, Deep Neural Networks, Transformer-based Neural Architectures, and Large Language Models, with an emphasis on advanced syntactic and semantic processing in Natural Language Processing and Computer Vision. Throughout his academic career, he has served in various roles in numerous national and European projects. He has authored over 150 publications in international journals, conference proceedings, and workshops (H-index: 24); 10 of these were recognized as “Best Paper”, and his systems have achieved more than 15 top ranks in national and international benchmarking campaigns. Danilo has been a member of the Program Committee of more than 25 international conferences and workshops, and he is a regular reviewer for leading international conferences including ACL, EMNLP, COLING, and AAAI.
Claudiu Daniel Hromei
Claudiu Daniel Hromei is a third-year Ph.D. student at the University of Rome Tor Vergata, with a keen interest in the dynamic field of Human-Robot Interaction. His research primarily revolves around Natural Language Processing (NLP) and Generative AI. In 2020, Claudiu earned his Master’s Degree in Computer Science from Tor Vergata, working on Human-Robot Interaction with a specific focus on medical scenarios. Over time, his research has explored kernel-based Support Vector Machines and Large Language Models (LLMs), especially in the era of ChatGPT. His work extends to the development and maintenance of LLMs while ensuring their sustainability through techniques such as Low-Rank Adaptation (LoRA) and model quantization with lower floating-point precision. Additionally, he focuses on architectures that integrate multi-modal inputs with LLMs.
Large Language Models and How to Instruction Tune Them (in a Sustainable Way)
In recent years, the evolution of Large Language Models (LLMs) has marked a profound transformation in computational linguistics and computer science. This tutorial aims to provide participants with a comprehensive understanding of the state-of-the-art LLMs introduced in the literature. We will particularly focus on the progression that led models such as GPT and LLaMA to be adapted for language inference tasks and to be instruction fine-tuned, eventually leading to the development of models such as ChatGPT, Alpaca, and Vicuna.
A notable challenge with LLMs has been their computational complexity, which makes them appear infeasible for common usage. To (partially) address these issues, this tutorial will delve into techniques such as quantization and, more prominently, Low-Rank Adaptation (LoRA). These techniques make it possible to fine-tune an LLM with 7 or 13 billion parameters on a standard 16 GB GPU. The application of these methods has enabled the adaptation of foundational LLMs pre-trained on large-scale text collections to a myriad of tasks, leading to a remarkable proliferation of models released on the web daily.
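As a concrete illustration of the idea, the following minimal sketch shows how 4-bit quantization and LoRA can be combined so that only a small set of adapter weights is trained while the base model stays frozen in compressed form. It assumes the Hugging Face transformers, peft, and bitsandbytes libraries; the model identifier and hyperparameters below are illustrative, not the exact configuration used in the tutorial.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative; any causal LM works

# Load the base model with 4-bit quantized weights so a 7B model fits in 16 GB.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare the quantized model for training and attach low-rank adapters:
# only the small rank-r LoRA matrices receive gradients.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor for the adapter output
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Because the frozen base weights are stored in 4 bits and gradients flow only through the adapter matrices, both the memory footprint and the number of trainable parameters drop by orders of magnitude, which is what makes fine-tuning on a single consumer-grade GPU practical.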
Building on this, we will explore the development of a unified architecture that participated, under the name ExtremITA, in all tasks of EVALITA 2023. By effectively combining prompt engineering and sustainable learning techniques, this monolithic architecture based on LLaMA tackled twenty-two complex semantic processing tasks in the Italian language, across varied semantic dimensions, including Affect Detection, Authorship Analysis, Computational Ethics, Named Entity Recognition, Information Extraction, and Discourse Coherence.
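The core idea behind such a monolithic model is to verbalize every task into a shared instruction format, so that one fine-tuned LLM serves all of them. The sketch below is a hypothetical illustration of this recipe; the task names and template are placeholders, not the actual ExtremITA prompt templates.

```python
from typing import Optional

# Hypothetical multi-task verbalization: each example becomes a single text
# string with a task marker, an instruction, the input, and (at training time)
# the gold output, so one model can be instruction-tuned on all tasks at once.
TEMPLATES = {
    "emotion": "Classify the emotion expressed in the following Italian text.",
    "ner": "Extract the named entities mentioned in the following Italian text.",
}

def verbalize(task: str, text: str, answer: Optional[str] = None) -> str:
    prompt = f"[{task}] {TEMPLATES[task]}\nInput: {text}\nOutput:"
    # The gold answer is appended for training; at inference it is generated.
    return f"{prompt} {answer}" if answer is not None else prompt

print(verbalize("emotion", "Che bella giornata!", "joy"))
```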
Lastly, attendees will gain insights into training a state-of-the-art foundational model (LLaMA2) on data from the competition, replicating the ExtremITA approach. This will equip them with the knowledge to develop a monolithic model capable of addressing all tasks and extending beyond them.
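Once the adapter has been trained, applying the monolithic model to any of the tasks reduces to loading the frozen base model, attaching the adapter, and generating from a verbalized prompt. The following sketch assumes the same libraries as above; the adapter path is a placeholder for a local checkpoint, and the prompt follows the hypothetical template from the previous example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model and attach the trained LoRA adapter.
# "path/to/extremita-adapter" is a placeholder for a local checkpoint.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "path/to/extremita-adapter")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Query the single model with a task-marked instruction prompt.
prompt = (
    "[ner] Extract the named entities mentioned in the following Italian text.\n"
    "Input: Danilo vive a Roma.\nOutput:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```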