Program of ML in PL Conference 2024

/ Contributed talks

Klaudia Balcer

Maciej Chrabaszcz

NASK - National Research Institute / Warsaw University of Technology

Co-authors:

Hubert Baniecki,Piotr Komorowski,Szymon Płotka,Przemysław Biecek

Contributed talk 5: Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models

Friday / 8 November 11:05 - 11:30 Hall B (CfC Session 2)

Abstract:

Analysis of 3D segmentation models, especially in the context of medical imaging, is often limited to segmentation performance metrics that overlook the crucial aspect of explainability and bias. Currently, effectively explaining these models with saliency maps is challenging due to the high dimensions of input images multiplied by the ever-growing number of segmented class labels. To this end, we introduce Agg$^2$Exp, a methodology for aggregating fine-grained voxel attributions of the segmentation model's predictions. Unlike classical explanation methods that primarily focus on the local feature attribution, Agg$^2$Exp enables a more comprehensive global view on the importance of predicted segments in 3D images. Our benchmarking experiments show that gradient-based voxel attributions are more faithful to the model's predictions than perturbation-based explanations. As a concrete use-case, we apply Agg$^2$Exp to discover knowledge acquired by the Swin UNEt TRansformer model trained on the TotalSegmentator v2 dataset for segmenting anatomical structures in computed tomography medical images. Agg$^2$Exp facilitates the explanatory analysis of large segmentation models beyond their predictive performance.

Biography:

Maciej Chrabąszcz is a dedicated researcher in the field of Artificial Intelligence, with a particular focus on AI model behavior analysis, alignment, and efficient computing. As a PhD student in Computer Science, his work contributes to the critical areas of AI development and understanding. Having completed his Master's in Mathematical Statistics at the Warsaw University of Technology (WUT), Maciej is now pursuing his doctoral studies at the same institution. Concurrently, he contributes his expertise to NASK - National Research Institute.

Barbara Klaudel

France Rose

University of Cologne

Co-authors:

Monika Michaluk, Timon Blindauer, Bogna M. Ignatowska-Jankowska, Liam O’Shaughnessy, Greg J. Stephens, Talmo D. Pereira, Marylka Y. Uusisaari, Katarzyna Bozek

Contributed talk 9: Uncertainty-aware self-supervised learning on multi-dimensional time series for animal behavior

Friday / 8 November 15:30 - 15:55 Main Hall (CfC Session 3)

Abstract:

Studying freely moving animals is essential to understand how animals behave and make decisions -- e.g. when they escape predators, find mates, or raise their young -- in an undisturbed manner. Although animal behavior has been studied for decades, animal movements can only now be recorded at high throughput thanks to recent technical progress. On one hand, videos from synchronized cameras can be coupled with deep learning pose estimation methods, automatically tracking the trajectories of a few keypoints. On the other hand, motion capture systems directly outputs the 3D trajectories of physical reflectors apposed on the body (reflectors on a suit for humans, reflecting piercings for rodents). However, these methods are not perfect and contain missing data. Since animal behavior cannot be easily scripted and additional recordings are not always possible due to constraints in experimental design, missing data is a more pressing problem in animal compared to human behavior analysis. So far, few works have effectively addressed these issues in animal recordings, with most relying on linear interpolation and smoothing (e.g. Kalman filter) only suitable for short gaps, or lacking large-scale testing. We hypothesized that recent advances in deep learning architectures and self-supervised learning (SSL) can help recover missing data by learning dynamics within and between keypoints. Specifically masked modeling has proven to be successful in recent large language models and computer vision transformers. Mimicking the missing data during training via masked modeling, we tested several neural network architectures: Gated Recurrent Unit (GRU), Temporal Convolutional network (TCN), Spatio-Temporal Graph Convolutional Network (ST-GCN), Space-Time-Separable Graph Convolutional Network (STS-GCN), and a custom transformer encoder named DISK (Deep Imputation for Skeleton data). For testing, we gathered seven datasets, covering five species (human, fly, mouse, rat, fish), in 2D and 3D, from one to two animals, and a variety of number of keypoints (from 3 to 38 per animal). Furthermore we adapted a probabilistic head, initially proposed for probabilistic forecasting of time-series, to assess the reliability of the imputed data at inference time. We found that DISK outperformed other architectures and linear interpolation baseline (42% to 89% root mean square error improvement compared to linear interpolation, calculated between true coordinates and imputed ones on a held-out test set - one value per dataset). DISK probabilistic head outputs an estimated error linearly correlated with the real error (Pearson correlation coefficient: 0.746 to 0.890 - one value per dataset). This estimated error allows to filter out less reliable predictions and control the amount of noise in the imputed dataset. As SSL methods are known to learn general properties about input data, we further explored the latent space of DISK and showed motion sequences clustered by behavior categories (e.g. attack, mount, investigation). While animal behavior experiments are expensive and complex, tracking errors make sometimes large portions of the experimental data unusable. DISK allows for filling in the missing information and for taking full advantage of the rich behavioral data. Available as a stand-alone imputation package (github.com/bozeklab/DISK.git), DISK is applicable to results of any tracking method (cameras or motion capture) and allows for any type of downstream an.

Biography:

France Rose is a post-doctoral researcher at the University Hospital of Cologne. Her research topics cover biomedical image and time-series analysis. At the time of exploding data generation in Biology and Medical Sciences, it is exciting to meet the needs in image analysis and challenge current scientific knowledge.

Natasha Alkhatib

Cybersecurity and AI researcher

Contributed talk 10: How LLMs are Revolutionizing the cybersecurity field

Friday / 8 November 14:30 - 14:55 Hall A (CfC Session 4)

Abstract:

The ever-evolving threat landscape demands constant adaptation. Traditional methods struggle. Large Language Models (LLMs) emerge, wielding the power of language. This talk explores LLMs' revolution in cybersecurity. LLMs are AI models trained on massive text and code datasets. This grants them an understanding of complex linguistic patterns, invaluable in cybersecurity. Firstly, LLMs excel at advanced threat detection. Analyzing vast amounts of data, they identify subtle anomalies indicating brewing attacks. Traditional methods rely on pre-defined rules, vulnerable to novel attack vectors. LLMs, with their ability to learn and adapt, identify unseen threats, providing a crucial early warning system. Secondly, LLMs offer proactive threat analysis. By ingesting vast quantities of threat intelligence data, including past attack methods and attacker motivations, LLMs uncover patterns and predict future attack vectors. This allows security teams to take a pre-emptive approach, focusing resources on fortifying potential weaknesses before attackers exploit them. Imagine an LLM analyzing a hacker forum, identifying discussions about targeting a specific software vulnerability. This foresight empowers security professionals to patch the vulnerability before a widespread breach. Furthermore, LLMs can revolutionize vulnerability research . Traditionally, identifying vulnerabilities is time-consuming and laborious. LLMs, with their ability to analyze vast code repositories, pinpoint potential vulnerabilities through code patterns and language constructs associated with known weaknesses. This streamlines the vulnerability discovery process, allowing security teams to address critical issues before attackers identify them. While LLMs offer a powerful new frontier, challenges remain. Issues surrounding explainability, bias in training data, and potential misuse require careful consideration. However, the potential benefits are undeniable. As these models continue to evolve and integrate with existing security solutions, they hold the promise of a more secure and resilient digital landscape.

Biography:

Dr.Natasha Alkhatib Researcher & Engineer - Cybersecurity & AI for Automotive Dr.Natasha Alkhatib is a researcher and engineer with expertise in cybersecurity and artificial intelligence (AI) for the automotive industry. Her passion for securing vehicles against cyber threats led her to pursue a Ph.D. thesis at the prestigious Institut Polytechnique de Paris. Her doctoral research focused on leveraging AI to develop robust solutions against cyberattacks in connected and autonomous vehicles. Currently, Dr.Natasha Alkhatib applies her expertise at ETAS Bosch, a leading provider of embedded systems for the automotive industry. In this role, she is instrumental in developing AI-based solutions that enhance the cybersecurity of automotive products. She plays a key role in ensuring the safety and security of future generations of vehicles.

Klaudia Bałazy

NVIDIA | Jagiellonian University

Co-authors:

Mohammadreza Banaei, Karl Aberer, Jacek Tabor

Contributed talk 11: Efficient Fine-Tuning of LLMs: Exploring PEFT Methods and LoRA-XS Insights

Friday / 8 November 15:00 - 15:25 Hall A (CfC Session 4)

Abstract:

The rapid scaling of large language models (LLMs) has underscored the need for parameter-efficient fine-tuning (PEFT) methods to manage increasing computational and storage demands. Among these methods, Low-Rank Adaptation (LoRA) has emerged as a prominent solution, often matching or exceeding the performance of full fine-tuning with significantly fewer parameters. Despite its success, LoRA faces challenges related to the storage of numerous task-specific or user-specific modules on top of a base model. In this talk, I will discuss the importance of parameter-efficient fine-tuning in natural language processing (NLP) and provide an overview of various PEFT approaches for large language models. I will introduce our latest research, LoRA-XS (Low-Rank Adaptation with eXtremely Small number of parameters), which leverages Singular Value Decomposition (SVD) to further enhance parameter efficiency. I will also highlight emerging trends and future possibilities in efficient fine-tuning.

Biography:

Klaudia Bałazy is a Senior Deep Learning Engineer at NVIDIA and a PhD student at the Jagiellonian University. She is also an active member of the Group of Machine Learning Research (GMUM). Her research primarily focuses on enhancing the efficiency of deep learning solutions, with particular emphasis on model compression, dynamic neural networks, and the parameter efficiency of large language models. Klaudia holds both a Master's and an Engineer's degree in Computer Science from the AGH University of Science and Technology. Throughout her career, she has led and participated in various AI-based projects across several tech startups, contributing to the development of practical AI applications.

Adam Dziedzic

Filip Ręka

AGH University of Krakow

Poster 6: Generating music with Large Language Models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Emilia Wiśnios

Independent

Co-authors:

Gracjan Góral

Poster 9: When All Options Are Wrong: Evaluating Large Language Model Robustness with Incorrect Multiple-Choice Options

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

The ability of Large Language Models (LLMs) to identify multiple-choice questions that lack a correct answer is a crucial aspect of educational assessment quality and an indicator of their critical thinking skills. This paper investigates the performance of various LLMs on such questions, revealing that models experience, on average, a 55\% reduction in performance when faced with questions lacking a correct answer. The study also highlights that Llama 3. 1-405B demonstrates a notable capacity to detect the absence of a valid answer, even when explicitly instructed to choose one. The findings emphasize the need for LLMs to prioritize critical thinking over blind adherence to instructions and caution against their use in educational settings where questions with incorrect answers might lead to inaccurate evaluations. This research establishes a benchmark for assessing critical thinking in LLMs and underscores the ongoing need for model alignment to ensure their responsible and effective use in educational and other critical domains.

Biography:

University of Warsaw graduate with a Master's in Machine Learning. Specializes in Natural Language Processing (NLP), large language models, and the intersection of NLP with political science.

Warsaw University of Technology

Poster 13: Position: Do Not Explain Vision Models Without Context

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship of the two objects. If it’s obvious, why don’t explanation methods for vision models use contextual information? The role of context has been widely covered in Natural Language Processing and Time Series but much less in Computer Vision. I will explain what contextual information within images is, using some real-world examples. I will outline how the issue of spatial context was addressed in the Deep Learning models and contrast it with the small number of works concerning the topic within the field of Explainable AI (XAI). I will show examples of failures of popular XAI methods when the spatial context plays a significant role. Finally, I will argue that there is a need to change the approach to explanations from 'where' to 'how'.

Biography:

Paulina Tomaszewska is a PhD student at the Warsaw University of Technology. She gained experience in the field of AI at universities in Singapore, South Korea, Austria and Switzerland. Her research covers Explainable AI, the importance of context in images and digital pathology.

Joanna Kaleta

Warsaw University of Technology; Sano Centre for Computational Medicine

Co-authors:

Kacper Kania, Tomasz Trzcinski, Marek Kowalski

Poster 14: LumiGauss: High-Fidelity Outdoor Relighting with 2D Gaussian Splatting

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Decoupling lighting from geometry using unconstrained photo collections is notoriously challenging. Solving it would benefit many users as creating complex 3D assets takes days of manual labor. Many previous works have attempted to address this issue, often at the expense of output fidelity, which questions the practicality of such methods. We introduce LumiGauss - a technique that tackles 3D reconstruction of scenes and environmental lighting through 2D Gaussian Splatting. Our approach yields high-quality scene reconstructions and enables realistic lighting synthesis under novel environment maps. We also propose a method for enhancing the quality of shadows, common in outdoor scenes, by exploiting spherical harmonics properties. Our approach facilitates seamless integration with game engines and enables the use of fast precomputed radiance transfer. We validate our method on the NeRF-OSR dataset, demonstrating superior performance over baseline methods. Moreover, LumiGauss can synthesize realistic images when applying novel environment maps.

Biography:

Joanna Kaleta is a PhD student at the Warsaw University of Technology and the Sano Centre for Computational Medicine. She holds a Master’s degree in Computer Science from the Warsaw University of Technology. Joanna’s current research focuses on the intersection of computer graphics and deep learning, particularly in neural rendering. At Sano, she is part of the Health Informatics team, where she applies deep learning methods to image-guided therapy, working on advancements in medical technology.

Alicja Dobrzeniecka

Lingaro / NASK National Research Institute

Poster 15: Continual Learning of Multi-Modal Models

Saturday / 9 November 10:30 - 12:00 (Poster Session 1)

Abstract:

AI models can become obsolete after training as new data becomes available. Re-training large models is costly and energy inefficient. Continual Learning attempts to find a solution to one of the most challenging bottlenecks of current AI models - the fact that data distribution changes over time. In my poster I would like to show the capabilities of Continual Learning methods for multimodal models, and in particular for vision-language models such as CLIP. Vision-Language models can handle both textual and visual data, which has a wide range of use cases such as image analysis, object recognition and scene understanding, image captioning, answering visual questions, and more. I will present the current state of the art in applying Continual Learning to vision-language models, their limitations and opportunities for improvement, and the results of experiments on selected methods.

Biography:

Alicja Dobrzeniecka have been studying and researching AI for a number of years. She hold a Master of Science in Artificial Intelligence from the Vrije Universiteit Amsterdam and a Bachelor of Arts in Philosophy from the University of Gdansk. She has recently published an article entitled "A Bayesian Approach to Uncertainty in Word Embedding Bias Estimation" in Computational Linguistics in MIT Press Direct. My Master's thesis focused on the interpretability of large language models such as BERT. Alicja share some of my research with a wider audience by publishing on the Medium platform. She has commercial experience as a Data Scientist, developing machine learning and deep learning models for business. In her last role, she worked on the use of LLMs for machine translation applications. Alicja currently focused on exploring the area of Continual Learning for multimodal models, which she believe will be a crucial direction for AI in the near future due to energy and resource constraints.

Valeriya Khan

IDEAS NCBR, Warsaw University of Technology

Co-authors:

Kamil Deja, Bartłomiej Twardowski, Tomasz Trzcinski

Poster 16: Assessing the Impact of Unlearning Methods on Text-to-Image Diffusion Models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Text-to-image diffusion models like Stable Diffusion and Imagen set a new standard in generating photorealistic images. However, their widespread use raises concerns about the nature of the content they produce, particularly when models are trained on large datasets that may include inappropriate or copyrighted material. In response, various unlearning methods have been developed to effectively remove unwanted information. This research evaluates the impact of unlearning methods on the overall performance of text-to-image diffusion models. Specifically, we examine how unlearning certain content influences the models' ability to generate accurate and diverse images across different concepts. Through a series of experiments, we investigate potential trade-offs, such as unintended reductions in image quality or diminishing features related to the remaining classes. Our findings offer valuable insights into balancing the need to eliminate specific content with the goal of preserving the broader functionality and integrity of diffusion models.

Biography:

Valeriya Khan is a PhD student at IDEAS NCBR and Warsaw University of Technology with focus on continual learning and unlearning of generative models.

Katarzyna Zaleska

Warsaw University of Technology

Co-authors:

Łukasz Staniszewski*, Kamil Deja

Poster 17: Style and Object Low-Rank Continual Personalization of Diffusion Models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Dominik Lewy

Lingaro Group

Co-authors:

Karol Piniarski

Poster 20: Beyond Benchmarks: What to consider when evaluating foundational models for commercial use?

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

This presentation provides a comprehensive overview of critical considerations for utilizing foundational models within commercial use cases, with a focus on Computer Vision and Natural Language Processing domains. It outlines a systematic framework comprising essential steps for verification. Additionally, the presentation illuminates the process through examples of evaluation protocols, offering practical insights into assessing model performance and applicability in real-world scenarios. The analysis will concern mainly generative models, particularly text-to-image synthesis, and Large Language Models (LLMs). Through this detailed exploration, participants will gain a deeper understanding of the strategic and technical prerequisites for leveraging foundational models to drive innovation and efficiency in commercial applications.

Biography:

Dominik has over 10 years of hands-on experience in Machine Learning, Deep Learning, Data Exploration and Business Analysis projects primarily in the FMCG industry. He is a technical leader setting goals and preparing road maps for projects. He is also a PhD candidate at Warsaw University of Technology where he focuses on the study of neural networks for image processing. He tries to be a bridge between commercial and academic worlds. His main research interest is digital image processing in context of facilitating adoption of deep learning algorithms in business context where training data is scarce or non-existing.

Jędrzej Warczyński

Poznan University Of Technology

Co-authors:

Mateusz Lango, Onfrej Dusek

Poster 21: Interpretable Rule-Based Data-to-Text Generation Using Large Language Models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

In the field of natural language generation (NLG), converting structured data into coherent text poses significant challenges. "Interpretable Rule-Based Data-to-Text Generation Using Large Language Models," introduces a novel approach that integrates the interpretability and precision of rule-based systems with the generative power of large language models (LLMs). This method focuses on generating Python code to transform RDF triples into readable text, achieving a balance between accuracy and flexibility. Approach: The core innovation lies in automating the creation of a rule-based system using LLMs. The process involves three key steps: Rule Generation: An LLM is prompted to write Python code that specifies how to convert given RDF triples into natural language text. Rule Testing: The generated code is checked for syntactic correctness and its output is compared to desired references to ensure alignment. Rule Refinement: The code undergoes iterative refinement using silver-standard references, reducing hallucinations and enhancing accuracy. This approach leverages the strengths of both rule-based and neural methods, creating a system that runs efficiently on a single CPU without the need for GPU resources. Experimental Results: Evaluations on the WebNLG dataset demonstrate that this system outperforms zero-shot LLMs in BLEU and BLEURT scores, and significantly reduces hallucinations compared to a fine-tuned BART model. The system's interpretability allows for easy modification and extension by developers, providing high control over the output. Highlights: The system achieves higher text quality than zero-shot LLMs. It produces fewer hallucinations than a fine-tuned BART baseline. The rule-based approach offers full interpretability and control over generated text. It operates efficiently on a single CPU, eliminating the need for costly GPU resources. This research presents a promising step towards creating efficient, interpretable, and flexible NLG systems by combining the strengths of rule-based and neural approaches. It opens new avenues for further advancements in the field, particularly in multilingual text generation.

Biography:

Jędrzej Warczyński is a computer scientist pursuing a Master's degree in Artificial Intelligence at Poznań University of Technology. He earned his Bachelor's degree in Computer Science with honors from the same institution. With over two years of experience as a full-stack Java developer, Jędrzej has contributed to building robust web applications. His research focuses on natural language processing and natural language generation (NLG). His recent paper, "Interpretable Rule-Based Data-to-Text Generation Using Large Language Models," was accepted for oral presentation at INLG 2024.

Pisula Juan Ignacio

University of Cologne

Co-authors:

Katarzyna Bozek

Poster 22: Addressing data heterogeneity in federated learning with Mixture-of-Experts models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Federated Learning (FL) offers a solution to collaborative learning when sharing private data is not possible. However, domain shift among the different clients in the federation remains an important challenge. When this occurs, the Federated Averaging (FedAvg) strategy typically performs poorly, as each client optimizes towards its local empirical risk minimum, which may be inconsistent with the global direction. Handling this issue is not only of theoretical interest, but could be critical in real-world scenarios, for example, in medical applications where each client acquires geographically-biased data using its own protocol. The problem of non-independent, identically distributed (non-iid) data in FL has been studied mainly on situations where it is the distribution of labels that shifts among clients, and there is limited work on data originated from different domains. In the FL literature, non-iid distributions are commonly addressed with novel federated algorithms that train a better global model, or that include local models that mitigate the biases of their respective clients. In this work, we study how the domain shift problem can be overcome by using Mixture-of-Experts architectures (MoEs). The MoE layers that we employ compute their output as a linear combination of the outputs of a pool of experts, where the coefficients are predicted by a router network. Furthermore, if the routing to the experts is sparse, the computation of unused experts can be spared, providing a boost in inference speed. Our experiments show that the ability of MoEs to process different inputs with different experts can be exploited to automatically deal with data heterogeneity among clients, and a single global model can be trained even with a naive FedAvg strategy without compromising performance. Additionally, we report an increase in accuracy when the gradients of the MoE layers are estimated using a heuristic. Overall, we show that MoEs make a solid solution to federated scenarios where data heterogeneity is a concern.

Biography:

Electronics engineer born and raised in La Pampa.

Jolanta Śliwa

AGH University of Krakow

Co-authors:

Paulina Jędrychowska, Bogumiła Papiernik,Oskar Simon

Poster 23: Application of machine learning to support pen & paper RPG game design

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

In recent years one can observe continuous and dynamic development of the pen & paper RPG market. One of the problems that the industry is facing is the need to design new opponents and to determine the scale of challenge they pose to players. The traditional way of establishing this figure requires many hours of practical tests. Currently there is no automatic way to estimate the level. The prediction of the scale of challenge can be reduced to ordinal regression. As part of this thesis several machine learning models with different sets of properties were tested in order to create a solution that would estimate the level of the designed monster in a fast and precise way. Additionally, with the use of explainable AI and counterfactual examples the authors developed suggestions on how to modify monster's properties to increase or decrease its level to desired value. The result of this thesis is a web application that allows designing monsters using the functionalities described above.

Biography:

Jolanta Śliwa is a Data Science student at the AGH University of Krakow. As part of my engineering thesis, she co-developed an application that supports the design of opponents in a pen & paper RPG game, using Machine Learning. For this reason, Jolanta have recently been spending her free time playing this type of game, and she also immerse myself in the fascinating world of animation.

Bartłomiej Fliszkiewicz

Military University of Technology

Poster 24: Repurposing Pharmaceuticals for Organophosphorus Poisoning

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Organophosphorus (OP) compounds found in pesticides and chemical warfare agents (CWA), continue to pose a significant global health risk. There are still over 385 million cases of unintended acute pesticide poisoning annually, mostly in southern Asia and east Africa, resulting in approximately 11,000 deaths, despite a global push to reduce the use of pesticides. There is also a significant number of intended OP poisoning, mostly in suicide attempts. The number of OP poisoning could escalate rapidly in the event of terrorist incidents, warfare and other crises. Due to geographic focus of the problem and the relatively small number of cases, there is little interest in developing new drugs against OP poisoning. Notably the most used antidote, pralidoxime (2-PAM), was developed in 1950s. Most studied antidotes are charged molecules and therefore poorly penetrate the blood-brain-barrier. Repurposing existing pharmaceuticals offers a strategic solution to the lack of interest in developing novel antidotes, as the drug discovery is both costly and time-consuming. This study employs a structure-based method for repurposing compouds from the ChEMBL database. a machine learning model constructed with Light Gradient Boosting Machine algorithm is applied to classify compounds as actives or inactives in treating organophosphorus poisoning. The training database was created by curating PubChem compounds tested against acetylcholinesterase (gene ID 43), focusing on bioassays containing the terms „reactivation” and „nimp”, „gb”, „sarin”, „sp-gbc” or „sp-gb-am”. The model was trained using the structural representations of 62 molecules. The approach was evaluated using the Leave-One-Out cross validation method, yielding an area under the ROC curve of 0.93. Since 52 of the training molecules contained an oxime moiety, the classification was limited to such compounds. Among 34 oximes from ChEMBL database 16 were classified as actives and chosen for further analysis including protein - ligand docking.

Biography:

Bartek is a research assistant at the Department of Radiology and Contamination Monitoring at the Military University of Technology. His scientific interest is in cheminformatics and drug design. He is planning to obtain a PhD soon and start postgraduate studies in bioinformatics. As a hobby project Bartek developed an Android app called Gaslands Builder.

Jan Dubiński

Warsaw University of Technology; IDEAS NCBR

Co-authors:

Piotr Warchoł, Maciej Kafel

Poster 25: Efficiently enhancing product design process with Stable Diffusion

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

Navigating the landscape of product design demands a nuanced approach, necessitating considerable time and expertise. The integration of AI-aided design systems, though promising, demands substantial resource allocation. In response to these imperatives, this work introduces an innovative solution poised to enhance and expedite the product design process. Leveraging the state-of-the-art Stable Diffusion Model, a cutting-edge framework for generative image generation, our approach propounds a streamlined and resource-efficient methodology to efficiently empower the product design process. Our solution demonstrates three key capabilities: 1) facilitating the generation of new product designs and styles observed on e-commerce platforms; 2) swiftly creating product prototypes based on existing products or new designer sketches; 3) enabling precise modifications to product designs according to the designer's preferences. Leveraging the dreambooth technique, we seamlessly incorporate new styles or products with minimal input data, diversifying design possibilities dynamically. Precision is attained through the ControlNET mechanism, informed by a visual prior, aligning output with a desired product shape. Finally, a masking mechanism allows for product editing to enhance customization. Noteworthy, our solution requires only a single 8GB RAM GPU. Successfully developed, tested, and applied at Eljot Sp. z o. o., specialists in wooden product design, our solution showcases the potential to revolutionize and accelerate the product design process.

Biography:

Jan Dubiński is currently pursuing a PhD degree in deep learning at the Warsaw University of Technology. He is a member of the ALICE Collaboration at LHC CERN. Jan has been working on fast simulation methods for High Energy Physics experiments at the Large Hadron Collider at CERN. The methods developed in this research leverage generative deep learning models such as GANs to provide a computationally efficient alternative to existing Monte Carlo-based methods. More recently, he has focused on issues related to the security of machine learning models and data privacy. His latest efforts aim to improve the security of self-supervised and generative methods, which are often overlooked compared to supervised models.

Bartosz Cywiński

Warsaw University of Technology

Co-authors:

Kamil Deja, Tomasz Trzciński, Bartłomiej Twardowski, Łukasz Kuciński

Poster 26: GUIDE: Guidance-based Incremental Learning with Diffusion Models

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

We introduce GUIDE, a novel continual learning approach that directs diffusion models to rehearse samples at risk of being forgotten. Existing generative strategies combat catastrophic forgetting by randomly sampling rehearsal examples from a generative model. Such an approach contradicts buffer-based approaches where sampling strategy plays an important role. We propose to bridge this gap by incorporating classifier guidance into the diffusion process to produce rehearsal examples specifically targeting information forgotten by a continuously trained model. This approach enables the generation of samples from preceding task distributions, which are more likely to be misclassified in the context of recently encountered classes. Our experimental results show that GUIDE significantly reduces catastrophic forgetting, outperforming conventional random sampling approaches and surpassing recent state-of-the-art methods in continual learning with generative replay.

Biography:

Karol Szymański

Tooploox

Co-authors:

Szymon Płaneta

Poster 31: Comparing Large Language Models in Retrieval-Augmented Generation: A Multi-Metric Evaluation

Friday / 8 November 17:00 - 18:30 (Poster Session 1)

Abstract:

The rapid evolution of generative AI has led to widespread use of Large Language Models (LLMs) in various industries. However, a comprehensive comparison highlighting their strengths and weaknesses is often lacking. This study aims to fill that gap by evaluating popular open-source and commercial LLMs, including GPT-3.5, GPT-4, GPT-4 Turbo, Mistral, and Llama13B, in conjunction with Retrieval Augmented Generation (RAG) systems. Our methodology involved a standardized dataset, a set of relevant questions, and a suite of metrics like answer correctness, faithfulness, and context relevance. The results revealed significant performance variations across models, with GPT-4 generally providing the most accurate answers. Interestingly, open-source models like Mistral demonstrated competitive performance, particularly in faithfulness. Furthermore, while GPT-4 was the only model to admit to lack of necessary information, others tended to generate hallucinated responses when unable to provide accurate answers. This study underscores the importance of choosing the right LLM for specific use cases and the potential of open-source models as viable alternatives to their commercial counterparts.

Biography:

Karol Szymański completed his Master’s degree in 2017, focusing on the application of autoencoders in herding tasks. After graduating, he worked at Intel and Amazon, gaining experience in the industry. Since 2020, he has been working at Tooploox, where he focuses on building deep learning-based solutions for image processing.

Aleksander Obuchowski

TheLion.AI

Co-authors:

Mikołaj Badocha, Kinga Marszałkowska, Maciej Gierczak, Barbara Klaudel

Poster 32: Eskulap - The First Polish Open-source Medical Large Language Model

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Mateusz Kapusta

Astronomical Observatory, University of Warsaw

Poster 40: Iris-ML: Simulation-Based Inference for the Spectral Energy Distribution fitting.

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Markov chain sampling is a versatile algorithm used in modern astronomy for inference tasks. Unfortunately, it is not always suitable for large inference tasks where the evaluation of the likelihood function is computationally expensive. Here, an alternative approach is presented, in the form of a Simulation-Based Inference. I use it to tackle the Spectral Energy Distribution fitting problem. The basic idea for this type of analysis is to uncover the true physical properties of objects by fitting complicated physical models to broadband brightness measurements. Mastering analysis of the photometric data is essential for modern astronomical research. Based on the transformer architecture for the preprocessing, the proposed model greatly accelerates the sampling process with the help of the MAF Normalizing Flow. Such models are a great step forward compared to the usually used MCMC, as they allow for a much faster sampling procedure. They will become more influential, as the next generation of astronomical surveys will produce unprecedented amounts of data, that need to be processed.

Biography:

Mateusz Kapusta have been working in the field of Observational Astronomy for 3 years, mainly applying various Bayesian models in real-case astronomical scenarios. He is involved in research in Simulation-Based Inference, with an application to big-astronomical surveys.

Emilia Majerz

AGH University of Krakow

Co-authors:

Aleksandra Pasternak

Poster 41: Siamese Ensembles for image data augmentation

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Even the most powerful neural network architectures can be useless when provided with a small amount of training data. In this scenario, the use of data augmentation techniques can help with generalization. However, they should be applied carefully, as some modifications can alter the labels of the samples, which may be difficult to spot without expert knowledge. In this work, we introduce a simple label-preserving image data augmentation technique, especially suitable for small datasets. This network training method allows for expanding the data by using pairs of images instead of single samples in an ensemble learning-like manner and is inspired by Siamese neural networks, with two networks working together to achieve a common goal. It can be easily implemented to improve the accuracy of various image classification tasks and be particularly useful for smaller, medical or technology industry-related datasets. In our experiments, we focus on a difficult and very specific aircraft dataset, containing images of fuselage of aircraft structures, with corroded and non-corroded surfaces. We also provide results on standard baseline data. Our preliminary experiments showed that the proposed augmentation improves the classification accuracy by even over a dozen percentage points, and the gain in accuracy is especially visible in the case of a smaller dataset size.

Biography:

Emilia Majerz is a PhD candidate at the AGH University of Krakow, working on theory-inspired Machine Learning. She hold an MSc in Data Science and a BEng in Computer Science. Her main research area is incorporating Physics knowledge into Machine Learning models, focusing on the detectors of ALICE at CERN.

Moritz Staudinger

TU Wien

Co-authors:

Wojciech Kusa, Florina Piroi, Aldo Lipani, Allan Hanbury

Poster 42: Beyond ChatGPT: A Reproducibility and Generalizability Study of Large Language Models for Query Generation

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Systematic literature reviews (SLRs) are a cornerstone of academic research, yet they are often labour-intensive and time-consuming due to the detailed literature curation process. The advent of generative AI and large language models (LLMs) promises to revolutionize this process by assisting researchers in several tedious tasks, one of them being the generation of effective Boolean queries that will select the publications to consider including in a review. This paper presents a extensive study of Boolean query generation using LLMs for systematic reviews, reproducing and extending the work of Wang et al. and Alaniz et al. Our study investigates the replicability and reliability of results achieved using ChatGPT and compares its performance with open-source alternatives like Mistral and Zephyr to provide a comprehensive analysis of LLMs for query generation. Therefore, we implemented a pipeline, which automatically creates a Boolean query for a given review topic by using a previously selected LLM, retrieves all documents for this query from the PubMed database and then evaluates the results. With this pipeline we first assess whether the results obtained using ChatGPT for query generation are reproducible and consistent. We then generalize our results by analyzing and evaluating open-source models and evaluating their efficacy in generating Boolean queries.

Biography:

Hubert Rybka

Jagiellonian University, Faculty of Chemistry

Co-authors:

Tomasz Danel, Sabina Podlewska

Poster 48: PROFIS: Design of structurally-novel drug candidates by probing molecular fingerprint space with RNNs

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

The contemporary landscape of drug discovery is characterized by the increasing complexity of the tasks, the rising cost of research and development, and the demand for faster and more efficient ways to bring innovative therapeutics to market. As a solution to these challenges, computational methods have become more prevalent, with generative ML paving the way to faster and more effective drug discovery in the recent years. Before any computational algorithm can process a molecular structure, it needs to be encoded in a way that allows the machine to parse it. Several textual encoding methods have emerged, including SMILES (Simplified Molecular Input Line Entry System), its more recent and ML-suited counterpart DeepSMILES, and SELFIES (Self-referencing Embedded Strings). Another common way to represent molecular structures is to use molecular fingerprints (FPs). Those are structural representations of chemical compounds in the form of binary or numerical vectors that capture critical information about a molecule's constituent atoms, bonds, and substructures. In contrast to molecular graphs or textual encodings, FPs have the potential to extract information about biochemically relevant functional groups and present it in a compact, machine-readable format, and have a great potential to be used as features for ML-based QSAR (quantitative structure-activity relationship) modeling. In this study, we propose a novel generative model, PROFIS, which allows for the design of target-focused compound libraries by probing continuous fingerprint space with RNNs. PROFIS is an innovative molecular VAE that maps molecular fingerprints into a continuous, low-dimensional space and decodes molecule structures in a sequential notation, ensuring alignment with the initial FP description. In the task of generating potential novel ligands, PROFIS employs a Bayesian search algorithm in tandem with a QSAR model to traverse the space of embedded molecular FPs and identify subspaces that correspond to potential good binders. The latent vectors sampled from those subspaces are then decoded into textual formats, such as SMILES or DeepSMILES using a recurrent neural network. Since many FPs do not determine the full chemical structure, our method can generate diverse molecules that match the particular FP description. The generated structures are target-specific, which allows for generating potential ligands tailored to a specific receptor. We prove that PROFIS exhibits excellent scaffold-hopping capabilities, enabling the exploration of novel chemical space, an essential feature of computational tools for de novo ligand generation. We present the application of our protocol in the task of ligand generation for the dopamine D2R. However, the developed methodology is universal and can be applied to any biological target provided a dataset of known ligands is available. To facilitate the widespread use of PROFIS, we share all the scripts needed to run the developed protocol via GitHub.

Biography:

Hubert Rybka graduated from the Faculty of Chemistry, Jagiellonian University in 2023 with a Master's degree in Chemistry. Currently pursuing PhD at Łukasz Skalniak's group of Bioorganic and Medicinal Chemistry, employing computational methods for modern, data-based drug design. Research interests include ML-assisted molecular design, cheminformatics, and molecular dynamics of biologically relevant systems. When not doing research - a rock climber and a friend of small animals.

Jakub Poziemski

Institute of biochemiostry and biophysics Polish Academy of Sciences

Co-authors:

Paweł Siedlecki

Poster 49: Application of vision transformers to protein-ligand affinity prediction.

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

The transformer architecture has revolutionized many areas related to AI. It was originally adopted for natural language processing (NLP), but in recent years there has been rapid development of transformer architectures for computer vision (CV) data, the so-called Vision Transformers (ViT). ViT is achieving spectacular results in many CV areas, displacing architectures based on convolutional neural networks. (CNN). In this paper, we present a successful application of ViT to the problem of protein-ligand affinity prediction based on 3D crystallographic complexes. Despite the relatively small dataset and the very complex nature of the problem, ViT achieves results comparable to the best methods used for this problem. The paper also includes extensive model diagnostics that provide information on important aspects of the input data and its representation.

Biography:

Jakub Poziemski completed my bachelor's and master's degree in Bioinformatics and Systems Biology at the University of Warsaw, Faculty of Mathematics, Informatics and Mechanics. He is currently a PhD student at the Institute of Biochemistry and Biophysics of the Polish Academy of Sciences (IBB PAS) in the Chemoinformatics and Molecular Modeling Laboratory. His PhD thesis focuses on protein-ligand affinity prediction using artificial intelligence and machine learning methods. He has 8 years of experience in the areas of AI and ML, with expertise in natural language processing (NLP), AI applications in bioinformatics and chemoinformatics, programming in Python, data analysis and visualization. Jakub has gained experience in both commercial and scientific projects.

Paweł Skierś

Warsaw University of Technology

Co-authors:

Kamil Deja

Abstract:

Novel diffusion models (DMs) can synthesize photo-realistic images with integrated high-quality text. In this work, we demonstrate through attention activation patching that less than 0.5% of DMs' parameters influence the text generation within the images. In contrast to prior work, our localization approach is broadly applicable across various diffusion model architectures, including both U-Net and Transformer-based, utilizing diverse text encoders. Building on this observation, by precisely targeting specific parameters of the model, we improve the efficiency and performance of existing image-editing methods, which often inadvertently modify not only the text but also the other visual elements within an image. Furthermore, we demonstrate that fine-tuning solely the localized parameters enhances the general text-generation capabilities of large diffusion models, providing a more efficient fine-tuning approach.

Biography:

Ignacy Stępka

Carnegie Mellon University, Poznan University of Technology

Co-authors:

Nicholas Gisolfi, Artur Dubrawski

Poster 57: Adaptive fill-in: how to mitigate the loss of an agent in decentralized federated learning

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

In decentralized learning, agents collaborate by training models on their local data while regularizing based on information from their neighboring agents, aiming to achieve a common model and maximize overall performance. However, the permanent loss of an agent, especially one with unique knowledge about the data distribution (non-iid), can significantly degrade system performance. To address this issue, we introduce a model-inversion technique as an adaptive fill-in strategy for agent reconstruction. This method reconstructs data points similar to those used by the lost agent during training and utilizes them to create and deploy a new agent, effectively restoring system performance and maintaining the optimization process. We demonstrate the effectiveness of this approach across various data distribution scenarios, including non-overlapping data distributions, distinct class assignments, and uniform distributions. Via experimental analysis, we show that our adaptive patching method not only recovers performance after a persistent agent failure but also accelerates convergence compared to other baseline approaches.

Biography:

Ignacy Stepka is a fourth-year Artificial Intelligence student at Poznan University of Technology. His research experience includes work at the Robotics Institute of Carnegie Mellon University, where he has contributed to a project on the resilience of decentralized learning algorithms in adverse scenarios, funded by the U.S. Army. He has also developed formal verification approaches for Bayesian Networks in critical care trauma delivery under a DARPA initiative. At Poznan University of Technology, Ignacy's research focuses on robust counterfactual explanations. Previously, he worked on methods utilizing multi-criteria analysis for generating counterfactual explanations, and more recently, he has developed a statistical framework to ensure their robustness against model shifts. In addition to his research, Ignacy has gained significant professional experience over three years at the Poznan Supercomputing and Networking Center, where he has contributed to EU HORIZON-funded projects. His work includes predictive maintenance for Volkswagen assembly lines, explainable AI analyses for air traffic management, and anomaly detection in large HPC clusters. Ignacy is also actively involved in the academic community, having led seminar sessions on Machine Unlearning and Explainable AI as part of his university's student research group, GHOST.

Mikołaj Zieliński

Commonwealth Scientific and Industrial Research Organisation, Poznan University of Technology

Co-authors:

Dominik Belter, Peyman Moghadam

Poster 58: Smart sampling for object removal operations in Neural Radiance Fields

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Neural Radiance Fields (NeRF) have emerged as a powerful tool for generating immersive space representations. However, once trained, modifying these representations poses significant challenges due to the implicit storage of scene information within the weights of these coordinated neural networks. Existing approaches can be categorized into two main types: methods that modify the input dataset using inpainting techniques and methods that manipulate the density and sampling functions. The first category often involves time-consuming retraining, as edits cannot be applied to an already trained model without starting the training process anew. In contrast, the second category enables real-time editing without the need for model retraining. In the context of editing, the second category of methods provides significant flexibility, allowing for on-the-fly adjustments. However, object removal often results in distortions and artifacts in the scene behind the removed object. These distortions arise from na\"ive object removal techniques that suppress unwanted density function values, resulting in undersampled regions that should be reconstruced. Although the network properly encodes the knowledge of these regions, their reconstruction is impaired due to insufficient sampling. To address these issues, we propose a novel sampling technique that accounts for spatial regions containing the object to be removed and avoids sampling from these areas. Our approach focuses on sampling from regions underrepresented by existing methods, resulting in enhanced sampling of the regions behind the removed object. This technique mitigates distortion issues and improves the quality of rendered novel views. Additionally, our method reduces the number of samples required for successful rendering. Unlike other approaches, we demonstrate that our sampling strategy enables precise reconstruction of scene geometry, provided the network has seen the reconstructed regions from different angles during training. In cases where this is not possible, the network may exhibit hallucinations. However, it can still interpolate to approximate the geometry of the unseen regions.

Biography:

Mikołaj Zieliński is a PhD student at Poznań University of Technology and an intern at the Commonwealth Scientific and Industrial Research Organisation (CSIRO). He completed a Master’s degree in Automation and Robotics, with my research focusing on neural space representations. My work is dedicated to developing advanced representations to improve how robots manipulate objects and navigate their environments. Outside of my academic and professional pursuits, He enjoy machining, drinking tea and travelling. My hobbies often influence my approach to both my research and everyday life.

Piotr Stefański

University of Economics in Katowice

Poster 59: Improved Scene Classification in Dynamic Combat Sports by Video Frame Segmentation

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

#Abstract The current literature on image classification focuses on using images in which significant information objects contain a large part of the image. The problem of classifying images that contain significant objects over an area of only a few percent of the total image is largely ignored. This paper addresses the problem of classifying single video frames in which significant and informative objects contain less than 1.5% of the registered scene. Original video frames were used for the baseline approach, in addition, an algorithm for image segmentation was proposed, which for the balanced accuracy measure obtained an increase of 35 percentage points over the baseline approach. #Introduction Cameras generate increasing amounts of data that cannot be analyzed manually. Thus, solutions are needed that will automatically provide valuable information about the recorded scene. Such a problem is addressed in the field of Computer Vision, where researchers have developed several algorithms for classifying the recorded scene. The paper proposes an approach for segmenting an image before the classification process. The approach applies the operation of subtracting the n-th earliest frame recorded by the camera. Experiments proved that the approach for the balanced accuracy measure obtains an increase of 35 percentage points over the baseline approach using the original video frames. #Experiments As part of the experiments, a binary classification of video frames was created, and the classifier was trained to classify a single frame into the "punch" or "no punch" class. The database contained 11,345 examples of the "punch" class and 100,614 examples of "no punch." As part of the experiments, the proposed algorithm was tested for the set n = {1, 2, 3, 5, 8, 13, 21, 34}, to compare the results and evaluate the impact of the proposed algorithm, a classifier was also trained on the original images(0_original approach). To train the classifiers the own convolutional neural network was used, with fewer convolutional layers and parameters compared to, for example, ResNet50 to speed up the training process. To statistically validate the results, the classifiers were trained and evaluated 30 times. The proposed algorithm was compared with the 0_original baseline approach. In addition, three other algorithms from the literature were tested and evaluated during experiments: - Background subtraction based on K-nearest neighbours. - Background subtraction based on the Gaussian mixture. - Background subtraction based on BSUV-Net algorithm based on the convolution neural networks.

Biography:

Piotr Stefański is a graduate of Computer Science at the University of Economics in Katowice, where he received a master's degree with a very good grade. From the beginning of my career, he combined learning with practice, working as a programmer and then as a team leader. He devoted my bachelor's thesis to the development of a tool for automatic verification of data from photos of ID cards, which was implemented in business. After graduation, Piotr began my research career as an assistant at the University, while preparing a doctorate in technical informatics and telecommunications at the Wroclaw University of Technology. He initiated cooperation between the University and industry, which resulted, among other things, in the development of an algorithm for gambling addiction detection and a publication delivered at an international conference. My research focuses on image processing and neural networks, which results in leading a research club and participating in projects related to the application of vision technologies. In addition to my academic work, He is involved in the community as a volunteer firefighter, developing a support system for rescue operations using drones, image processing and machine learning algorithms.

Gracjan Góral

University of Warsaw, IDEAS NCBR, IMPAN

Co-authors:

Alicja Ziarko, Michał Nauman, Maciej Wołczyk

Poster 60: Seeing Through Their Eyes: Evaluating Visual Perspective Taking in Vision Language Models

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Visual perspective-taking (VPT) is the ability to understand another person's viewpoint, allowing individuals to predict the actions of others. For example, a driver can avoid accidents by considering what pedestrians see. Humans generally develop this skill in early childhood, but it is unclear whether recently developed Vision Language Models (VLMs) have this ability. As these models are increasingly used in real-world applications, it is important to understand how they perform on complex tasks like VPT. In this paper, we introduce two manually curated datasets, called "Lego" and "Dots," to test VPT skills, and we use these datasets to evaluate 12 commonly used VLMs. We observe a significant drop in performance across all models when perspective-taking is required. Furthermore, we find that performance in object detection tasks does not strongly correlate with performance on VPT tasks, indicating that existing benchmarks may not be adequate for understanding this problem.

Biography:

Knows everything... except the language. Former math student, currently 'wrestling' with language models (though it's unclear who's winning). PhD candidate at the University of Warsaw, under the watchful eye of IDEAS NCBR.

Aleksandra Dagil

University College London (UCL) and True North Partners

Poster 61: Probabilistic Time Series Forecasting Transformer Model: comparative analysis with statistical ARIMA method for short-term wind power prediction

Saturday / 9 November 10:30 - 12:00 (Poster Session 2)

Abstract:

Despite the widespread use of transformer architectures in Natural Language Processing and Computer Vision, they have not yet become the state-of-the-art in wind-power time series forecasting, where statistical methods remain more popular. This research aims to bridge that gap by comparing the accuracy of wind power predictions using two approaches: a novel Probabilistic Time Series Forecasting Transformer Model and the traditional AutoRegressive Integrated Moving Average (ARIMA) method. By analyzing 134 time series from the SDWPF dataset and using three metrics—Mean Absolute Percentage Error (MAPE), Symmetric Mean Absolute Percentage Error (sMAPE), and Mean Arctangent Absolute Percentage Error (MAAPE)—I demonstrate that the transformer model consistently outperforms ARIMA. The transformer model shows higher accuracy across all metrics for most time series and exhibits less variation in performance between different time series. These findings suggest that transformer models have significant potential for broader adoption in very short-term and short-term wind power forecasting.

Biography:

Aleksandra graduated first-class from MSc Data Science at UCL, and BA Economics from University of Oxford (specilizing in Econometrics), where she obtianed the Award of Undergraduate Exhibition and the Junior Schlorship for outstaning academic work. She is also an alumni of Bona Fide scholarship awarded by Fundacja Orlen. Her research experience is in time series forecasts, which she gained at Bank of England in comparing methodologies for inflation forecasts and at UCL, where she developed wind-power predictions. She also fine-tuned the BERT and GPT-3 models for a classification task (fake news detection). Aleksandra is now working as a Data Sciencist at a London-based financial consultancy, True North Partners, deploying statistical and ML methods for credit risk modelling and Anti-Money Laundering applications.