
Final Theses

Presentations at IDA


See also presentations announced at ISY and ITN in Norrköping (in Swedish)

If nothing is stated about the presentation language then the presentation is in Swedish.


WExUpp - upcoming presentations
2024-05-13 - AIICS
Identifying Behaviour Patterns among Forklift Drivers to Predict Accident Risk (original title: Identifiera beteendemönster hos truckförare för att förutspå risk för olycka)
Victoria Winqvist, Unn Zachrison
Advanced level (30 credits)
14:15, Alan Turing (In Swedish)
[Abstract]
This thesis explores the possibility of identifying risk behaviour patterns among forklift drivers through the analysis of telemetry data using unsupervised clustering algorithms. The objective is to predict whether certain behaviour patterns increase the risk of accidents. With the increasing accessibility of Internet of Things technology, data from forklifts has become more available, allowing for the study of driver behaviour. The telemetry data utilised is sourced from Toyota Material Handling Manufacturer Sweden’s internal database, collected from Data Handling Units that are installed on forklifts across Europe. This data, known as "shock data," is recorded when a force is applied to the forklift, such as in a collision. The thesis investigates combinations of various clustering algorithms and dataset modifications. The results are evaluated using several quantitative measures and visualisation, along with analysis of time distribution, geographical placement, comparison of forklift models, and comparison with "no shock data." The evaluation identifies K-Prototypes and K-Means as the best performing algorithms, while indicating that soft clustering and density-based clustering are not well suited to the data. The best performing algorithms reveal two recurring driver behaviour patterns: the first is driving forward at high speed with the lift motor idle, and the second is driving backward at low speed while lowering the forks. However, a majority of the data points remain unclassified into specific behaviour patterns, suggesting that the dataset or the methods used may not be sufficient. This prompts further exploration into the inclusion of additional features such as steering angle and forklift height.
The thesis demonstrates the feasibility of identifying risk behaviour patterns, with potential for future research expanding on the findings to further contribute to the prevention of workplace accidents involving forklifts.
2024-05-16 - ADIT
Analys, design och utvärdering av databasscheman i Azure Data Explorer
Angelica Ferlin, Linn Petersson
Advanced level (30 credits)
13:00, Babbage (In English)
[Abstract]
Data warehouses are used today to store large amounts of data. This thesis investigates the impact that different database schema designs have on query execution time within the cloud platform Azure Data Explorer. It is a relatively new platform, and limited research exists on how the database schema should be designed in Azure Data Explorer. Further, the design of the database schema has a direct impact on the query execution times. The design should also align with the use case of the data warehouse. This thesis conducts a requirements analysis, determines the use case, and designs three database schemas. The three database schemas are implemented and evaluated through a performance test. Schema 1 is designed utilizing results tables from stored functions, while schema 2 utilizes sub-functions divided by different departments or products, aimed at minimizing the data accessed per query. Finally, schema 3 uses the results tables from the sub-functions found in schema 2. The performance test shows that schema 3 gives the best overall improvement in query execution time compared to the other designs and the original design. The findings emphasize the critical role of database schema design in influencing query performance. Additionally, it is concluded that combining more than one approach to enhance query performance increases the potential performance gain.
2024-05-17 - AIICS
Automatic De-Identification of Magnetic Resonance Images
Victor Dahlsberg, Adam Sundberg
Advanced level (30 credits)
15:00, Alan Turing (In English)
[Abstract]
Magnetic resonance (MR) images of the head need to be de-identified to enable research and education while complying with rules and regulations such as HIPAA and GDPR. This thesis explores a new approach to de-identifying MR images by utilizing generative machine learning (ML) techniques. The presented solution combines a vector quantized variational autoencoder (VQ-VAE) with a latent diffusion model (LDM) featuring a modified reverse process to enable postconditional inpainting of 3D MR images. The solution takes two inputs: the image to be de-identified and a binary mask of the regions that should not be modified by the inpainting process. Given these, the VQ-VAE encodes the inputs into a latent representation, where the modified LDM blends the original image with new sampled data. The output of the LDM is then decoded to produce a new de-identified MR image that looks like an unmodified image but contains synthesized data to hide the identity of the original subject.

Three different defacing tools are used to produce binary masks of different sizes to test the solution. The results show that the size of the mask has a large impact on how different an inpainted image is compared to the original, and how different multiple inpainted images are from each other. Furthermore, downstream performance is measured by applying skull stripping tools to original, defaced, and inpainted images. Inpainted images are shown to perform as well as or slightly better than defaced images in this task. Finally, the inpainted images are fed to a classifier trained to classify whether an image contains a face or not. The model predicts that the images contain a face more than 93 % of the time.
2024-05-20 - HCS
Damage Assessment on Remote Sensing Imagery with Foundation Models
Gustaf Lindgren
Advanced level (30 credits)
08:15, Alan Turing (In English)
[Abstract]
There is currently an ongoing paradigm shift in machine learning: instead of training task-specific models from scratch, foundation models, i.e., large pre-trained models, are adapted for various downstream tasks. Foundation models excel in zero- and few-shot learning, which makes them well suited to domains with limited labeled data, such as disaster assessment on remote sensing imagery (RSI).

This thesis explores how the foundation models CLIP and SAM can be utilized to classify RSI affected by natural disasters and segment intact and damaged infrastructure without extensive retraining. For the scene classifications, various text prompt techniques are tested as well as zero-shot prompting with images. Moreover, few-shot learning methods such as linear probing and prompt learning are explored. For the open vocabulary semantic segmentation task, "pipelines" are implemented that leverage the open vocabulary classification abilities of CLIP and zero-shot image segmentation capabilities of SAM.

This work demonstrates that foundation models can be used effectively for detecting flooding on RSI, and there were promising results on other disaster types as well. While handcrafted text prompts yielded the best accuracy, the zero- and few-shot learning methods with images offered a better trade-off between accuracy and consistency. Although the performance of the zero-shot segmentation pipelines was generally poor, they showcased the potential of SAM for accurate segmentations on disaster imagery when provided with prompts of sufficient quality.
2024-05-20 - AIICS
Forecasting Patient Occupancy in Hospital Wards Using a Supervised Machine Learning Approach
Axel Falk, Philip Folkunger
Advanced level (30 credits)
09:15, John von Neumann (In English)
[Abstract]
The healthcare sector faces challenges in balancing resource allocation and meeting patient demand, especially in the Emergency Department (ED) and other wards. This study explores the potential of supervised machine learning models to predict occupancy rates across different hospital departments using data from a hospital in western Norway from 2020 to 2023. The research combines Fourier analysis, seasonal decomposition (STL), and cross-correlation techniques to identify cyclical patterns and dependencies within the data. Various supervised machine learning models, including Linear Regression, Random Forests, XGBoost, and neural networks, are evaluated using k-fold cross-validation and performance metrics such as MAPE and MAE. The results reveal distinct daily and weekly patterns in hospital occupancy rates, with notable anomalies during holidays and weekends. The study finds that occupancy rates are consistent over time, as the ED, Cardiology Ward (CW), and Total Patients (TP) series are stationary, with stable mean values and variances. Both TP and ED exhibit daily seasonality, while all three series display weekly seasonality. Machine learning models perform differently across wards. The smallest prediction errors using only time features were 5.595 MAE for ED, 1.794 MAE for CW, and 0.096 MAPE for TP. Cross-correlation analysis revealed strong correlations in daily cycles between ED and TP when lagged in time, suggesting that ED and TP occupancy rates are closely linked, while CW shows slightly different patterns. The study concludes that simpler models, like Linear Regression, may offer a more efficient and effective approach for hospital occupancy forecasting.
2024-05-22 - HCS
Ensuring Driver Safety When Interacting with Touch Screens for In-Car Work Tools (original title: Säkerställande av förarsäkerhet vid interaktion med touchskärmar för arbetsverktyg i bilar)
Johanna Lundin
Advanced level (30 credits)
10:15, IDA Alan Turing (In Swedish)
[Abstract]
The integration of devices within cars is continuously evolving, enabling us to interact with them to an increasingly greater extent. This has transformed the way we drive, communicate and access information on the go. Despite this, there is a lack of research on how to guarantee driver safety while interacting with these systems, especially for in-car systems used in professional settings as a work tool. This master's thesis was conducted in collaboration with NIRA Dynamics and aimed to investigate how the interface of in-car touch screen work tools can be designed to ensure usability and safety for the driver. The study included development of a prototype in the form of a new touch screen interface for a data acquisition system used by NIRA's test drivers for testing their products. The prototype design was developed iteratively based on the test drivers' opinions as well as theory about important design aspects related to designing in-vehicle systems for high safety and usability. The resulting prototype was evaluated using the System Usability Scale in order to compare the prototype to the original system design and assess to what extent the new interface contributed to increased safety for the driver. The study revealed that some of the main issues that needed to be taken into account in the prototype design were prioritization of information, placement and grouping of elements, and a reduced number of clicks and less scrolling. The final usability evaluation was conducted through user tests, and the results indicated that the usability of the prototype was higher than that of the original system design, thereby indicating increased safety for the driver. Overall, this thesis contributes to research on mitigating the risks to drivers related to interaction with in-car software systems.
2024-06-05 - ADIT
How does the use of Autonomous Penetration Testing Strengthen The Continuous Integration Flow?
Jonatan Eshak
Advanced level (30 credits)
13:15, Herbert Simon (In English)
[Abstract]
The thesis introduces the problems developers face when creating, optimizing, and testing their systems. Its focus is the testing of a web application using autonomous penetration testing integrated into a GitLab CI/CD pipeline. The thesis asks whether the use of OpenAPI, a specification made to ease the documentation of the APIs in a system, creates an environment where one can save time on integration and increase efficiency through knowledge of the performance of specific endpoints. It further examines the applications where autonomous penetration testing with OpenAPI could be preferred. The thesis also compares the use cases of autonomous black box testing and white box testing, to answer when one can be preferred over the other and when it is helpful to have both. The thesis covers the theory of penetration testing: how it is conducted, which strategies are common, and which attack methods are standard. It also covers autonomous versus manual penetration testing, web applications and APIs (to describe OpenAPI and the Swagger tool), and the continuous integration flow, including its design and how a developer builds one from scratch. The thesis brings up five significant articles related to the work: one that discusses the problems faced when designing a black box vulnerability scanner for web servers, articles on implementing continuous integration for automatic performance testing and for automating security scanning, and one on introducing continuous fuzzing to study the scalability of fuzzing in CI/CD pipelines. These related works support the purpose and method of this thesis and its goal of measuring autonomous penetration testing in a CI pipeline.
The method used to answer the research question is to build a website with OpenAPI to serve as a target, and to integrate the website into the GitLab CI pipeline along with vulnerability scanning tools configured to perform black box, grey box, and white box testing. The results show that while the black box scan is much more thorough, since it has to search and test every point of discovery, it is so at the cost of time. The grey box scan shows similar results to the black box scan, although it only focuses on finding vulnerabilities from API endpoints. The white box scan found more critical vulnerabilities, mainly in packages installed and stored in an environment directory. These vulnerabilities also differ from those found by the black box and grey box scans, showing a need to use both kinds of scan to discover as many unique vulnerabilities as possible.


Page responsible: Final Thesis Coordinator
Last updated: 2022-06-03