
Final Theses

Presentations at IDA


See also presentations announced at ISY and ITN in Norrköping (in Swedish)

If nothing is stated about the presentation language then the presentation is in Swedish.


WExUpp - upcoming presentations
2024-05-23 - SaS-UND
Energy consumption of video streaming – A literature review and a model
John Lindström
Advanced level (30 credits)
13:00, Alan Turing (In English)
[Abstract]
Energy consumption and the correlated greenhouse gas emissions are a big global problem.
They affect all parts of society, and each industrial sector must work toward reducing its
carbon footprint. This thesis details the research of different methods to model the energy
consumption of video streaming, and works towards creating a final model. The video
streaming process is broken down into a core process consisting of head-end, distribution
and transmission, and terminals. The process that contributes the most to energy consumption
at the head-end is found to be video encoding. This thesis explores video encoding in
depth and how it is affected by parameters such as hardware, codec choice, codec preset
selection, and video details such as resolution, framerate, and duration, but these parameters
are found to be insufficient to model the energy consumption of video encoding. In
distribution and transmission, the highest contributor is found to be content delivery networks.
The energy consumption of content delivery networks is investigated; however, no
appropriate model is found. For terminals, the most important factor is the kind of terminal
used. The energy consumption of televisions, desktop computers, laptops, and mobile
terminals is investigated, and models are presented for each. The thesis also discusses the
different models, their advantages, and their shortcomings. Additionally, an application to
visualize features of the model is created and presented.
2024-05-23 - HCS
Bemästring och rörelse över tid i fysikbaserade 2D träningsspel (Mastery and Movement over Time in Physics-Based 2D Exergames)
Filip Josefsson, Jim Magnusson
Advanced level (30 credits)
13:00, Donald Knuth (In Swedish)
[Abstract]
In the modern world, people are not moving enough. People can sit for multiple hours in a row at an office job, which can lead to health risks such as obesity, depression and cancer. One way of counteracting this sedentary behavior is to encourage people to move more by playing exergames, video games that require the player to move and exert themselves to play. The benefit of playing exergames, however, only applies when they are played continually over a longer period of time. To see how well exergames work over a longer period of time, two casual physics-based exergames were developed. Each exergame was then played 100 times by each of the two authors to see how score and movement changed over time. It was found that both games required greater movement to get higher scores. The amount of movement in the exergames decreased slightly over time in some cases but, most often, stayed at the same level or increased.
2024-05-24 - HCS
Utforskande av exergaming: En studie om spelarbeteenden och speldesign (Exploring Exergaming: A Study of Player Behavior and Game Design)
Matthias Gerdin, Lisa Green
Advanced level (30 credits)
09:00, IDA Alan Turing (In Swedish)
[Abstract]
This thesis investigates the potential use of exergames, games controlled by the player’s body movements, to promote large and powerful arm movements as a means to combat sedentary behavior. Two score-driven games, Smash It and Pop It, are developed to explore whether a scoring system could incentivise players to maintain or increase their arm movements as their skill level increases. The study employs a methodology involving data collection from participants engaging with the games, with a focus on analyzing movement and gameplay performance. Results indicate promising outcomes, with both games demonstrating effectiveness in motivating desired movements. While Smash It offers players freedom in gameplay, Pop It provides a more linear experience, allowing developers greater control over player actions.
2024-05-24 - HCS
The Impact of Level Design on Movement in a 2D Platformer Exergame
Josef Karlsson
Advanced level (30 credits)
13:15, Kurt Gödel (In Swedish)
[Abstract]
This thesis investigates how level design impacts user movement in a 2D platformer exergame and how this movement evolves as players gain experience. A single participant played the exergame 100 times, with 30 minutes of rest separating each session. Findings indicate that level designs that require or encourage waiting behavior reduce user movement and limit the potential for increased movement as the user gains experience.
2024-05-29 - ADIT
Traffic Load Balancing in Virtual Private Networks - An implementation and comparison of algorithms
Gustav Elmqvist
Advanced level (30 credits)
09:15, John von Neumann (In English)
[Abstract]
This thesis explores the implementation of load balancing for Virtual Private Network (VPN) servers, focusing on practical software-based methods. DNS-based load balancing was found to be suitable for VPN traffic due to its minimal interference with the data flow. Through research on load-balancing methods and strategies, a DNS-based load balancer was selected and tested by sending traffic with iperf3 in a Docker container environment with OpenVPN clients and servers. The algorithms were evaluated based on the performance metrics of total traffic throughput, load distribution, and fairness between clients. Tests were conducted both in a configuration where all servers had identical computational resources and in a configuration where they had varying capacities. Results indicated similar performance among the tested algorithms for servers with equal capabilities, while more significant differences emerged for servers with varied computational resources. This result highlights the importance of taking server capacity into account when designing systems for server load balancing.
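To illustrate why server capacity matters, the sketch below contrasts plain round robin with smooth weighted round robin, a capacity-aware variant. It is not necessarily one of the algorithms evaluated in the thesis, and the server names and weights are hypothetical.

```python
from collections import Counter


def round_robin(servers, n):
    """Cycle through the servers in order, ignoring their capacity."""
    return [servers[i % len(servers)] for i in range(n)]


def smooth_weighted_round_robin(weights, n):
    """Pick servers in proportion to their weight, spreading picks evenly.

    `weights` maps a (hypothetical) server name to its relative capacity.
    """
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server earns credit equal to its weight on each round.
        for server, weight in weights.items():
            current[server] += weight
        # The server with the most credit is chosen and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Hypothetical setup: vpn-a has three times the capacity of vpn-b.
capacities = {"vpn-a": 3, "vpn-b": 1}
print(Counter(round_robin(list(capacities), 8)))
print(Counter(smooth_weighted_round_robin(capacities, 8)))
```

With equal capacities the two strategies behave the same; with unequal capacities only the weighted variant sends proportionally more clients to the larger server.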
2024-05-29 - SaS-UND
Machine Learning-Based Automated Vulnerability Classification in C/C++ Software
Artin Fazeli
Advanced level (30 credits)
13:15, Alan Turing (In English)
[Abstract]
The degree of impact caused by software vulnerabilities is escalating as software systems become increasingly integrated into the everyday lives of human beings. Different methods, such as static and dynamic analysis, are commonly used to classify software vulnerabilities. However, these methods are often plagued by certain limitations, including high false positive and false negative rates. It is crucial to examine C/C++ software vulnerabilities, as C/C++ is widely implemented in many industries and critical infrastructures, where software vulnerabilities could have catastrophic consequences if exploited by malicious actors. This thesis examines the feasibility of utilizing machine learning-based models for automated C/C++ software vulnerability classification. Additionally, the effect of hyperparameter tuning on the predictive performance of the utilized models is explored. The models investigated were divided into two main groups, namely, traditional machine learning models and transformer-based models. All models were trained, evaluated, and compared using a large and diverse C/C++ dataset. The findings suggest that autoregressive large language models, particularly Llama 2 and Code Llama utilizing a decoder-only transformer architecture, demonstrate significant potential for accurate C/C++ vulnerability classification, achieving F1-scores of 0.912 and 0.905, respectively. The results further indicate that hyperparameter tuning has a limited positive effect on predictive performance. Moreover, specific traditional machine learning models, like the SVM model, outperformed many of the transformer-based models, potentially indicating limitations in training procedures and the architectures of many pre-trained language models. Nevertheless, autoregressive large language models exhibit significant potential for precise C/C++ software vulnerability classification and should remain a focal point for future research.
2024-05-31 - AIICS
Strategies for Accurate Context Retrieval in Retrieval-Augmented Generation Systems Across Diverse Datasets
Axel Andersson, Hugo Björk
Advanced level (30 credits)
13:15, John von Neumann (In English)
2024-06-04 - ADIT
Evaluation of Unsupervised Anomaly Detection in Structured API Logs
Gabriel Hult
Advanced level (30 credits)
13:15, Charles Babbage (In English)
[Abstract]
With large quantities of API logs being stored, it becomes difficult to manually inspect them and determine whether the requests are benign or anomalous, indicating incorrect access to an application or perhaps actions with malicious intent. Today, companies can rely on third-party penetration testers who occasionally attempt various techniques to find vulnerabilities in software applications. However, for a company to be self-sustaining, implementing a system capable of detecting abnormal, potentially malicious traffic would be beneficial. By doing so, attacks can be proactively prevented, mitigating risks faster than waiting for third parties to detect these issues. A potential solution is to apply machine learning, specifically anomaly detection, which means detecting patterns that deviate from normal behavior. This thesis covers the process of using structured log data to find anomalies. Various unsupervised anomaly detection models were evaluated on their ability to detect anomalies in API logs. These models were K-means, Gaussian Mixture Model, Isolation Forest and One-Class Support Vector Machine. The findings from the evaluation show that the best baseline model without tuning can reach a precision of 63% and a recall of 72%, resulting in an F1-score of 0.67, an AUC score of 0.76 and an accuracy of 0.71. By tuning the models, the best model could reach a precision of 67% and a recall of 80%, resulting in an F1-score of 0.73, an AUC score of 0.83 and an accuracy of 0.75. The pros and cons of each model are presented and discussed along with insights related to anomaly detection and its applicability in API log analysis and API security.
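As a quick check of how the reported F1-scores follow from the precision and recall figures in this abstract, the harmonic mean can be computed directly. This is an illustrative sketch only; the function name `f1_score` is our own and is not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract: best baseline model and best tuned model.
baseline = f1_score(0.63, 0.72)
tuned = f1_score(0.67, 0.80)
print(f"baseline F1 = {baseline:.2f}, tuned F1 = {tuned:.2f}")
```

Rounded to two decimals, this gives 0.67 and 0.73, matching the values stated in the abstract.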
2024-06-05 - HCS
Automating Figma Design to Web Component Conversion: Enhancing Workflow Efficiency in Web Development
Oliver Börjesson
Advanced level (30 credits)
10:00, Kurt Gödel (In English)
[Abstract]
This thesis aims to address the significant challenges in the collaboration between designers and developers in web and application development. The focus is on developing a tool or methodology that bridges the gap between design and implementation, thereby creating a more seamless, efficient, and error-free workflow. Existing methodologies and tools often favor either designers or developers and are time-consuming, leading to repeated revisions and misunderstandings. This research proposes a design-to-code conversion tool that streamlines the process, reduces errors, and enhances productivity. Through a comprehensive literature review, prototype development, and empirical evaluation, this study explores the technical requirements, best practices, and user feedback mechanisms necessary for creating an effective tool. The results demonstrate that while the prototype offers promising improvements in design-to-code translation, further refinements are necessary to address dynamic positioning and broader design tool compatibility. The study concludes with recommendations for future work, including support for more complex design elements and the integration of machine learning techniques to enhance the tool's capabilities.
2024-06-05 - ADIT
How does the use of Autonomous Penetration Testing Strengthen The Continuous Integration Flow?
Jonatan Eshak
Advanced level (30 credits)
13:15, Herbert Simon (In English)
[Abstract]
The thesis introduces the problems developers face when creating, optimizing, and testing their systems. Its focus is the testing of a web application using autonomous penetration testing integrated into a GitLab CI/CD pipeline. The thesis aims to answer whether the use of OpenAPI, a specification made to ease the documentation of APIs in a system, creates an environment where one can save time on integration and increase efficiency through knowledge of the performance of specific endpoints. It further examines the applications where autonomous penetration testing with OpenAPI could be preferred. The thesis also compares the use cases of autonomous black box testing and white box testing to answer when one is preferable to the other and when it is helpful to have both. The thesis covers the theory of penetration testing and how it is conducted, including common strategies and standard attack methods. It also covers the theory of autonomous versus manual penetration testing (PT), of web applications, and of APIs, in order to describe OpenAPI and what Swagger is as a tool, as well as the theory of the continuous integration flow, its design, and how a developer builds one from scratch. The thesis also brings up five significant articles related to the work, including an article that discusses the problems faced when designing a black box vulnerability scanner for web servers, articles on implementing continuous integration for automatic performance testing and for automating security scanning, and an article on introducing continuous fuzzing to study the scalability of fuzzing in CI/CD pipelines. These related works support the purpose and method of this thesis and its goal of measuring autonomous penetration testing in a CI pipeline.
The method used to answer the research question is to build a website with OpenAPI to serve as a target and to integrate it into the GitLab CI pipeline along with vulnerability scanning tools configured to perform black box, grey box, and white box testing. The results show that while black box testing is considerably more thorough, since it has to search and test every point of discovery, it does so at the cost of time. Grey box testing shows results similar to black box testing, although it focuses only on finding vulnerabilities from API endpoints. White box testing revealed more critical vulnerabilities, mainly in packages installed and stored in an environment directory. These vulnerabilities also differ from those found by the black box and grey box scans, showing a need to use both kinds of scan to discover as many unique vulnerabilities as possible.
2024-06-05 - ADIT
A Comparative Analysis of Open Source Dynamic Application Security Testing Tools
Isak Chorell, Christoffer Ekberg
Advanced level (30 credits)
15:15, Muhammad al-Khwarizmi (In English)
[Abstract]
In today’s digital era, the increase in internet usage presents a growing challenge for cyber security. An increase in cyber attacks underscores the need for robust software systems that can withstand them. One way of detecting vulnerabilities is by using Dynamic Application Security Testing (DAST) tools, which simulate cyber attacks without knowledge of the internal structure of their target. This thesis investigates the four open source DAST tools Black Widow, Nuclei, Wapiti and ZAP and their ability to identify security vulnerabilities in web applications. A comparative analysis was performed, focusing on the tools' vulnerability detection capabilities, how different web applications affect their results, and their practical applicability. Each DAST tool was run against web applications, both with and without intentional vulnerabilities, and measures such as scan time and reported vulnerabilities were collected. The tools were also run against a benchmark in order to calculate the metrics accuracy, precision, recall and F-measure. The results show that ZAP reported the most vulnerabilities, with Cross Site Scripting and SQL injection as the most common types, but also had the largest number of false positives. However, on the benchmark, none of the DAST tools had any false positives. It was also found that the architecture of the web application strongly influenced the tools' attack capabilities. In conclusion, DAST tools can help to improve the security of web applications but come with some drawbacks and limitations. To achieve a more comprehensive scan, one can use more than one DAST tool, but this comes at the cost of longer scan times and more manual effort to review the reported vulnerabilities.
2024-06-17 - ADIT
Player Type Classification in Ice Hockey Using Soft Clustering
Anton Olivestam, Axel Rosendahl
Advanced level (30 credits)
13:15, Charles Babbage (In English)
[Abstract]
Ice hockey is a team sport, and thus it is of the utmost importance to assemble a team of players who are capable of performing at their optimal level when playing together. For a team to perform at the highest level, it is essential that the team consists of players with different player types who excel in different situations on the ice. Despite the importance of team composition, there is a shortage of previous research in this area. Previous studies have been limited to a single league or by the quantity of available data. This thesis investigates the classification of player types in ice hockey using soft clustering techniques. The aim is to determine if fuzzy clustering and Gaussian mixture models (GMM) can effectively categorize player styles, and to evaluate the suitability of each method. The player types were derived from play-by-play data from three different leagues over three seasons. We represent each player's individual playing style in two different player vectors. One vector employs frequent sequences combined with event frequency, while the other vector features distinct skills characterizing a playing style. After constructing the vectors, we applied both clustering algorithms. Our findings show that both FCM and GMM successfully derived playing styles. FCM provided fuzzier clusters, meaning that it performed better at handling the fact that players may assume multiple playing styles. Additionally, the vectors based on the distinct skills of a hockey player led to highly interpretable clusters compared to the vectors using frequent sequences and event frequency.
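The idea of soft membership can be sketched with a one-dimensional Gaussian mixture, where each player receives a probability for every cluster rather than a single label. The mixture parameters and the feature value below are invented for illustration and do not come from the thesis.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def responsibilities(x, components):
    """Soft cluster memberships of x under a 1-D Gaussian mixture.

    `components` is a list of (weight, mean, std) tuples; the returned
    probabilities sum to 1 across the components.
    """
    weighted = [w * gaussian_pdf(x, mean, std) for (w, mean, std) in components]
    total = sum(weighted)
    return [value / total for value in weighted]


# Hypothetical mixture: two playing styles along a single made-up feature.
styles = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]
print(responsibilities(5.0, styles))  # halfway between the two styles
print(responsibilities(1.0, styles))  # close to the first style
```

A player halfway between the two component means is assigned equally to both styles, which is the kind of mixed membership a hard clustering method cannot express.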
2024-06-18 - ADIT
Enhancing Log Analysis through Data Storage and Visualization Tools
Paul Blåberg Kristoffersson, David Envall
Advanced level (30 credits)
09:15, Babbage (In English)
[Abstract]
The technology behind autonomous vehicles and advanced driver-assistance systems requires rigorous testing to ensure safety, reliability, and efficiency. This testing leads to a continuous stream of log data that requires an effective storage solution, as well as the ability to visualize the log data to facilitate analysis. This thesis investigates how such a storage solution can be designed, with scalability and efficiency in mind. To achieve this, requirement elicitation is performed, a data schema is designed, and a data storage solution is integrated with real log data provided by Magna Electronics, the collaborator for this thesis. The scalability and efficiency of the storage solution are then assessed with two different evaluation methods. Additionally, a visualization solution in the form of an interactive dashboard is integrated with the new storage solution. The usability of the dashboard in terms of troubleshooting, trend analysis, and log analysis is evaluated through user tests and a follow-up interview using the Usability Metric for User Experience questionnaire. The results from the scalability evaluation indicate that the ability to handle increased workloads without losing performance, both in terms of querying during increasing data loads and in terms of increasing numbers of concurrent read and write operations, is necessary. The results of the efficiency evaluation indicate that by utilizing queries instead of manual operations, the speed of tasks can be increased. The result of the usability evaluation is that an integrated visualization can be useful in the log analysis process.
2024-06-18 - ADIT
Run-Time Optimization of ElasticSearch for Medical Documents
Ludvig Bolin, Emil Carlsson
Advanced level (30 credits)
13:15, Babbage (In English)
[Abstract]
ElasticSearch is a database management system used to index and search documents, and as with all database management systems, performance is important. The aim of this thesis is to investigate whether the configuration of an ElasticSearch system can be tuned to improve either indexing or search performance using different optimization algorithms. With that goal in mind, this thesis has evaluated three different optimization algorithms as a means to generate performance-improving ElasticSearch configurations: two local algorithms, Simulated Annealing and Simultaneous Perturbation Stochastic Approximation, and one global algorithm, a Genetic Algorithm.
The benchmarking tool ESRally is used as an objective function for the local algorithms. Since the global algorithm requires near-instant evaluation, two machine-learning models are trained to predict configuration performance in said benchmarks instead. The machine-learning models, Random Forest and Regression-Enhanced Random Forest, performed with similar accuracy. Both models could predict the indexing performance of a configuration well but could not predict the search performance to the same extent.
The configurations generated by the various optimization algorithms are then evaluated in a simulation replaying four hours of real traffic from an ElasticSearch instance used in a hospital for medical data indexing and searching. Unfortunately, most configurations generated by the various algorithms failed to improve search performance. On the other hand, all the algorithms succeeded in generating configurations that outperform the default configuration in the simulation regarding indexing performance, with Simultaneous Perturbation Stochastic Approximation producing the best-performing configuration.
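The local search idea mentioned in this abstract can be sketched as a generic simulated annealing loop. This is a minimal illustration under stated assumptions, not the thesis implementation: the two integer settings and the `toy_cost` function are hypothetical stand-ins for a real configuration and a real benchmark run.

```python
import math
import random


def simulated_annealing(cost, start, neighbor, steps=500, t0=1.0,
                        cooling=0.99, seed=0):
    """Minimize `cost` by random local moves with a decaying temperature."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cooling
    return best, best_cost


def toy_cost(config):
    """Hypothetical objective; a real run would call a benchmark instead."""
    a, b = config
    return (a - 3) ** 2 + (b - 7) ** 2


def neighbor(config, rng):
    """Change one setting by +/-1, keeping it within the range 0..10."""
    values = list(config)
    i = rng.randrange(len(values))
    values[i] = min(10, max(0, values[i] + rng.choice([-1, 1])))
    return tuple(values)


best, best_cost = simulated_annealing(toy_cost, (0, 0), neighbor)
print(best, best_cost)
```

Because every evaluation of the objective would be a full benchmark run in practice, the number of steps is the main cost driver, which is consistent with the abstract's point that the global algorithm needed a faster, learned stand-in for the benchmark.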


Page responsible: Final Thesis Coordinator
Last updated: 2022-06-03