TDDD82 Project semester including bachelor's project: Secure, mobile systems
Anna Pestrea (annpe689)
Niklas Granberg (nikgr371)
Daniel Holmberg (danho525)
Victor Nyberg (vicny076)
Oscar Andell (oscan898)
Albin Andersson (alban042)
|Project 4||Andrei Gurtov|
Erik Andersson (erian599)
David Combler (davco958)
Eric Lindskog (erili798)
Jesper Wrang (jeswr740)
Daniel Jonsson (Danjo390)
Jesper Holmström (jesho280)
Linnea Lundström (linlu607)
Sebastian Ragnarsson (sebra023)
Samuel Johansson (samjo788)
Karol Wojtulewicz (karwo001)
Robin Ellgren (robel708)
Tobias Löfgren (toblo956)
Gustav Aaro (gusaa960)
Daniel Roos (danro880)
Each pair writes, in an email, a sorted list of all projects (highest priority first). The email must also state which two people are in the group. Do not attach any documents or the like; everything must be written directly in the email. Note that all projects must be included in the list. The email should look like the following example:
Marcus Bendtsen (marbe800)
Jakob Pogulis (jakpo779)
The course staff will then assign the bachelor's thesis projects. We start from your preferences but cannot guarantee that you will get the projects you ranked highest.
Because these bachelor's thesis projects are scientific in nature, the work is carried out in English. Presentation and opposition are held in Swedish.
Some projects have specific requirements, which are stated in the project text; make sure that you meet them.
Energy footprint of wireless communication in crisis scenarios
The application that you demonstrate should be able to use 3G/4G or WiFi to transfer crisis-situation information. One requirement in a crisis context is that the system be energy efficient on the handheld devices, but energy efficiency in the 3G/4G network is strongly affected both by the number of transfers and by how uplink/downlink packets are distributed over time. In this mini-project you are to measure the exact energy consumption of an app's interactions over the 3G/4G/WiFi network. Because repeatable experiments with different apps to measure their energy footprint are both difficult and time-consuming, we have a tool, EnergyBox, that allows the interaction with an app (the app's "packet traces") to be used to estimate the energy consumption. In this mini-project you will identify methods for systematically generating different transmission traces. The data flows sent by the app can be small notifications (e.g., the position on a map) or complex, interrelated flows (e.g., a voice message or video recording to be shared with different actors via the server). For voice and video transfer to work, transmission must take place without much jitter. You are expected to use systematic methods to create realistic (test) data flows as output from your app and then measure the app's energy footprint by measuring these different typical cases. The typical cases must include voice and video recording with reasonable jitter levels. Note that the application may need to be completed *before* these measurements are made, so that you can actually generate typical packet flows in a realistic scenario before their energy footprint is measured.
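To make the jitter requirement concrete, the following is a minimal sketch of one common way to quantify jitter from a packet trace: the running interarrival jitter estimate defined for RTP in RFC 3550. The trace representation (two lists of send and receive timestamps in milliseconds) and the function name are assumptions made for this illustration; they are not the EnergyBox input format.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate J = J + (|D| - J) / 16 (RFC 3550 style).

    send_times and recv_times are hypothetical per-packet timestamps
    (same unit, e.g. milliseconds), one entry per packet, in order.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("send_times and recv_times must have equal length")

    jitter = 0.0
    for i in range(1, len(send_times)):
        # D: change in transit time between consecutive packets.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    # Packets sent every 20 ms; the third packet arrives 5 ms late.
    print(interarrival_jitter([0, 20, 40], [5, 25, 50]))
```

A trace with constant transit delay yields a jitter of zero, so the estimate can serve as a simple check on whether a generated voice or video trace stays within the jitter level that is considered reasonable for the test case.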
Systematic testing of Android applications
Testing an application's dependability is a theoretically hard problem. To test reliability (with respect to functional requirements), in principle all possible inputs to the application must be generated in order to ensure that the app does what it should. However, earlier work exists that helps the developer test functional requirements by generating random inputs, so that the quality of the testing can be measured with metrics such as "coverage". These methods, however, are not directly suited to ensuring that the application is secure. Tests that focus on security must start from the worst undesired activations and input sequences (so-called "adversary thinking"). In this project you will investigate which methods are suitable for systematic security testing of software, and choose the most suitable one for testing your application with respect to security requirements (confidentiality and integrity). These include so-called penetration tests, and part of the work is carried out by constructing attack scenarios. You must be able to gather evidence (through testing) that unauthorized parties cannot access or modify data that they are not entitled to. Data may be stored on the client or the server side, so the tests need to cover both sides. If the application must be completed before certain functions can be tested, this is also part of the work. However, the application must be sufficiently complex for the attack/test construction itself to be non-trivial.
Denial of service
A major threat to the availability of a system like the one you have developed is that the back-end server can be subject to crashes and overloads. The system can also suffer overload that is created intentionally through denial of service attacks. In order to understand how an application can be hardened to resist such threats, you are expected to compare how two alternatives survive in the presence of these kinds of threats: 1) the system that you implemented in your project, and 2) a system where the backend is placed in a cloud service. The project should first study which research (and best practice) approaches exist for modeling an adversary that intends to render the system unavailable. Then use this systematic study to quantify which of the two alternatives above is more resilient when subjected to carefully prepared adversary scenarios that aim to breach availability. These may include denial of service attacks if you are able to create a safe environment for studying such attacks and measuring durations of "time to failure".
Performance Evaluation of Security Monitoring Framework
Description: Managing security policies for heterogeneous network devices remains a challenge, with strong demand from industry for efficient solutions. Examples of proposed architectures include Interface for Metadata Access Points (IF-MAP), which is an open specification for a client/server protocol developed by the Trusted Computing Group (TCG), and Security Automation and Continuous Monitoring (SACM) at the IETF. IF-MAP is like Twitter for network devices, running over an HTTPS/SOAP interface.
Plan: You start by performing a survey of the area of security monitoring based on available open standards. Google Scholar and IEEE Xplore can be used for that. Then you construct a prototype system based on open-source IF-MAP components and measure the performance of basic operations such as adding/removing a device or collecting policy information. You can also contribute to open-source development. Since most IF-MAP software dates back to 2012, you should analyze whether it has been superseded by new software based on the latest IETF standards.
omapd is an open-source IF-MAP server. It currently implements the IF-MAP v1.1 and v2.0 specifications published by the Trusted Computing Group (TCG). Irond is another alternative. Several open-source clients are available for different operating systems, e.g., in Python or C, such as DECOIT. Thus the first goal is to select the most suitable client and server software and test their compatibility. The testbed should include several client devices and at least one server.
Qualifications: Knowledge of network and security protocols, software skills
Implementation and Evaluation of IEEE 802.15.9 Key Management Protocol for Sensors
Description: The new IEEE specification IEEE 802.15.9 provides key management support for IEEE 802.15.4 short-range wireless radios used between sensors (similar to Zigbee). It can be seen as a "pairing" protocol for securely connecting two devices, as in Bluetooth. Such protocols can be used, for example, when constructing a smart home, e.g., to make sure that only your light switch can turn bulbs on and off.
First you should study the 802.15.9 architecture and relevant protocols. Different variants of key exchange protocols are available as open source. However, the complete architecture has not yet been implemented or evaluated. As a first step you should find out which open-source software is available for the protocols (HIP, IKEv2, 802.1X, PANA). Then you can start constructing a testbed including a desktop as a server and a sensor node, a UAV drone, or an Android phone as a client.
Plan: The goal is to deploy and improve some KMP implementations from the standard (especially HIP DEX and BEX) and evaluate their performance on a small sensor platform, e.g., imote2 (3 devices can be provided by the supervisor). Imote2 runs a simplified version of Linux. Then you can measure, e.g., connection establishment time, battery consumption, and connection RTT for various KMPs.
Qualifications: Network and security protocols, Linux skills
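The timing measurements named in the plan (connection establishment time, RTT) could be collected with a small harness along the following lines. This is only a sketch: `operation` is a hypothetical callable that performs one complete exchange (for example, one KMP handshake through whatever implementation the testbed ends up using), and the helper names are made up for this illustration.

```python
import statistics
import time


def time_operation(operation, repetitions=20):
    """Run `operation` repeatedly and return the durations in milliseconds.

    `operation` is a hypothetical zero-argument callable that performs
    one complete exchange, e.g. a single connection establishment.
    """
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize_samples(samples):
    """Summarize a list of durations with simple order statistics."""
    return {
        "n": len(samples),
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload: sleep for about 2 ms instead of a real handshake.
    print(summarize_samples(time_operation(lambda: time.sleep(0.002), 5)))
```

Reporting the median together with the minimum and maximum, rather than a single run, makes it easier to compare KMPs on a platform where individual measurements vary.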
Performance evaluation of branched video streaming using HTTP/2 server push
We have designed and implemented a novel media player that allows interactive video to be stitched together in ways that let users interactively select different non-linear paths through the media. Our current framework uses Adobe's Open Source Media Framework (OSMF) and was originally presented at ACM Multimedia 2014 (http://www.ida.liu.se/~nikca89/papers/mm14.pdf). Last year, two groups of IT students implemented similar solutions in Dash.js, with somewhat different focuses and degrees of success. In this year's project, the idea is to build on their success (possibly also leveraging their code) to carefully evaluate the performance differences seen when using server push (a feature in HTTP/2) vs. client-driven prefetching (as described in the above paper), in the context of branched streaming. At a high level, the extensions will include careful prefetching and improved buffer management. Good programming skills and some systems work to set up an experimental testbed using dummynet or another multi-machine setup that allows accurate network emulation are expected to be needed to successfully complete the project.
Note that with the server-push features in HTTP/2 (rather than with client-driven prefetching), the client would still need to present the user with playback path options and where to switch to different alternatives. However, the intelligence would now be placed on the server side, which actively pushes content that the client is likely to need in the future. Ideally, we would like to look at hybrid approaches too, which involve both client-based prefetching (potentially with some sharing of buffer conditions) and server-based push.
This project is system oriented and will require good systems skills. For example, you will need to set up and configure servers to work with HTTP/2 and server push. Also, a testbed needs to be implemented for the performance tests (involving server, client, and network).
We are planning to sign an agreement that ensures that we keep the intellectual property rights to the design and the software. The goal would be to create a demonstrator of our software, which eventually will be made available with the next academic publication. Your contributions will be properly acknowledged and the publication process should not hinder you from publishing your thesis. (Explanation: The code is expected to be non-public until a research article is eventually published based on the software, at which time we would plan to release the source code (and acknowledge the people who have contributed and helped with the code). Until that point in time, the code and any technical solutions and ideas should remain non-public.)
Longitudinal measurements of popularity dynamics (and influencers, for example) on Twitter
You will help extend and improve a data collection framework that we have started to develop, in order to answer a number of research questions related to popularity dynamics in news content (as seen with the help of Twitter and Bit.ly). The research questions will be discussed offline, but the framework uses both the Twitter API and the Bit.ly API (for URL shorteners). In this project, you will further improve crawling and parsing tools that extract bit.ly URLs from Twitter posts and then compare/classify the contents of the observed websites. You should work with the Twitter API to identify pointers to newspaper websites (from a selected set of news websites, which we will pick during the project) and all bit.ly short URLs (from which we will again focus on links posted to news articles from the selected set of news websites). For selected links, your tool should carefully (and periodically, as per a design we will discuss) collect retweet and follower statistics, and the Bit.ly API will be used to carefully (and periodically, again using a specific periodicity) collect statistics about how often the links are clicked. The final product of the project is a crawling/parsing framework that (i) identifies tweets/retweets linking to news websites, (ii) classifies the corresponding articles (especially to identify similar/identical articles), (iii) collects periodic retweet/follower statistics about these posts, and (iv) uses the bit.ly URLs to collect click statistics about as large a fraction of these as possible. As a proof of concept, we would ideally like to see a dataset spanning a good few weeks and some preliminary analysis that shows that the methodology is sound and works.
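As an illustration of the URL extraction step, here is a minimal sketch that pulls bit.ly links out of a post's text. It assumes that the text with expanded links is already available as a plain string; how that text is obtained from the Twitter API is not shown, and the function name and pattern are choices made for this example rather than part of the existing scripts.

```python
import re

# Matches http(s) links whose host is bit.ly (optionally with "www.").
BITLY_URL = re.compile(r"https?://(?:www\.)?bit\.ly/[A-Za-z0-9_-]+")


def extract_bitly_urls(text):
    """Return the distinct bit.ly URLs in `text`, in order of first appearance."""
    seen = set()
    urls = []
    for url in BITLY_URL.findall(text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    post = "News: http://bit.ly/2abcDEF, again http://bit.ly/2abcDEF and https://bit.ly/xyz-1."
    print(extract_bitly_urls(post))
```

Deduplicating while preserving order keeps one entry per short link, which is convenient when the same link is later polled periodically for click statistics.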
Again, it is important to note that the focus will be on the design and implementation of a careful methodology (that includes a mix of longitudinal aspects, the Bit.ly API, similarity matching of text from news articles, and retweet/follower statistics from the Twitter API) and the collection of a large longitudinal dataset (according to a methodology that we can discuss in person).
The goal is that the datasets collected with the tool (hopefully already during the thesis) can be used to help answer some example research questions. As with the code projects above, the dataset and tools should not be shared publicly until we potentially publish a research article using these tools and datasets. This project requires good programming skills and willingness to work with other people's code. Familiarity with using web APIs is also recommended. (Current scripts are mostly in Python.)
Spreading of "fake" news and the social (?) network that propagates it
Fake news and other misinformation distributed through social media currently threatens and undermines our entire society. However, measuring and modeling how fake news and biased content is shared is non-trivial. In this project, you will develop a methodology using the Twitter API. The methodology will likely involve both manual classification of articles and automated tracking of tweets and retweets related to these articles, including the number of followers of tweeters and retweeters (ideally we would like to have the social network of the influencers) and all other information that may help model and understand the information propagation on Twitter. Ideally, we would like to be able to identify multiple layers of information propagation and influencers. You should therefore try to collect as much information as possible about the propagation of a set of "fake news" articles and a set of "regular news" articles, and about the tweeters/retweeters of these news items. Also, try to be precise in your definition of what is "fake news" (e.g., fact related, biased, etc.) and try to develop the methodology and tools so that we can use them, at larger scale, after the course. (Note: Ideally, we would like to identify the news early and track the articles over time, as in the above Twitter/bitly project. Some manual classification may therefore have to take place after the collection.) A solid initial dataset together with some preliminary analysis would also be expected. The goal is that the datasets collected with the tool (collected during the thesis) can be used to help answer some example research questions. (As with the code and dataset projects above, the dataset and tools should not be shared publicly until we potentially publish a research article using these tools and datasets.) This project likely requires significant manual effort, good programming skills, and willingness to work with other people's code. Familiarity with using web APIs is also recommended.
(Current scripts are mostly in Python.)
To distribute or not to distribute?
With the help of Software Defined Networking (SDN) we can start building a network OS. Some of the functionalities one would like to implement may be best implemented close to the edge, while other functionalities may be best implemented closer to the core. In this project, we are interested in understanding how to best build network-wide system utility functions for machine learning. In particular, you will study machine learning (ML) algorithms in the networking context. First, you should identify statistical machine learning methods (e.g., Bayesian) that are more or less suitable for distributed processing. We are interested in solutions that split the processing both in parallel and in series (allowing aggregation of pre-processed information from the prior step, for example). Second, you will design simple experiments that aim to answer questions related to when to distribute (e.g., in a tree network) and when not to distribute the processing. As a motivating use-case scenario, we will likely use a computer network in which the operator would like to perform basic traffic classification (e.g., Jeff Erman's 2006 paper: https://pages.cpsc.ucalgary.ca/~mahanti/papers/clustering.pdf) or, ideally, would like to learn clients' buffer conditions (e.g., our MMSys 2017 paper: http://www.ida.liu.se/~nikca89/papers/mmsys17.pdf). Third, you will perform experiments and present performance evaluation results comparing and contrasting alternative designs and their scaling properties.
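One simple example of processing that distributes well over a tree network is the computation of mergeable summaries: each edge node summarizes its local observations, and nodes closer to the core only combine summaries. The sketch below uses count, mean, and sum of squared deviations (enough to fit a one-dimensional Gaussian) with the standard pairwise merge formula. It is an illustration under stated assumptions, not a proposed design; the class and function names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Mergeable summary of a set of observations (hypothetical helper)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    @property
    def variance(self):
        # Population variance; zero for an empty summary.
        return self.m2 / self.count if self.count else 0.0


def merge(a, b):
    """Combine two summaries without access to the raw observations."""
    if a.count == 0:
        return b
    if b.count == 0:
        return a
    n = a.count + b.count
    delta = b.mean - a.mean
    mean = a.mean + delta * b.count / n
    m2 = a.m2 + b.m2 + delta * delta * a.count * b.count / n
    return Summary(n, mean, m2)


def summarize_values(values):
    """Build a summary locally, e.g. at an edge node."""
    result = Summary()
    for value in values:
        result = merge(result, Summary(1, float(value), 0.0))
    return result


if __name__ == "__main__":
    edge_a = summarize_values([1, 2, 3])   # computed at one edge node
    edge_b = summarize_values([10, 20])    # computed at another edge node
    core = merge(edge_a, edge_b)           # aggregated closer to the core
    print(core.count, core.mean, core.variance)
```

Because the merged result matches what a single node would compute from all raw observations, only three numbers per subtree need to travel upward. Methods whose state cannot be merged this way are the ones for which the question of when to distribute becomes interesting.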
Distributed Client-driven Certificate Transparency Log
Effective and secure communication is essential in everyday life, but perhaps even more so in emergency situations. In a recent initiative, Google and other organizations have pushed for the use of Certificate Transparency (CT). In this initiative, certificates are expected to be logged in publicly auditable CT logs. Furthermore, servers are expected to provide clients (i.e., the browsers) with proofs that the certificates have been logged in CT logs. Already today, Chrome demands that Extended Validation (EV) certificates (issued after Jan. 1, 2015) are logged before displaying the visual cues to the user that normally come with EV certificates. To prove that a certificate has been logged, Signed Certificate Timestamps (SCTs) are used. In this thesis project, you are expected to create a distributed CT log, possibly even as a complementary alternative to CT, which users run in the background while browsing the web. The system is expected to be implemented as a Distributed Hash Table (DHT) in which either certificates, SCTs, or both are stored. This idea is still half-baked and some work would be needed to compare and contrast alternative system designs. (E.g., it may be difficult to get the Merkle tree aspect sorted out if aiming to provide the append-only property. Yet, it would be nice to incorporate the Merkle tree aspect too. Again, more thought is needed regarding alternative design choices and what is best here.) However, we would like to leverage existing solutions such as open source code for trackerless BitTorrent clients (that already provide good DHT implementations), for example. You are also expected to design the system (with similar goals as CT, but with a distributed design in mind), implement a proof-of-concept solution, and use performance experiments to demonstrate that the solution works. (As with the code projects above, the tools should not be shared publicly until we potentially publish a research article using these tools and datasets.)
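For readers unfamiliar with the Merkle tree aspect mentioned above, the following minimal sketch computes a Merkle tree root over a list of log entries, using the hashing convention described in RFC 6962 (SHA-256, a 0x00 prefix for leaves, a 0x01 prefix for interior nodes, and a split at the largest power of two smaller than the number of entries). The byte strings used as entries are placeholders; how entries would be stored in or fetched from a DHT is left open here, as it is in the project description.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    """Hash of a single log entry (leaf prefix 0x00)."""
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an interior node from its two children (node prefix 0x01)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries) -> bytes:
    """Merkle tree root over an ordered list of byte-string entries."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


if __name__ == "__main__":
    log = [b"cert-a", b"cert-b", b"cert-c"]  # placeholder entries
    print(merkle_root(log).hex())
    log.append(b"cert-d")  # appending an entry changes the root
    print(merkle_root(log).hex())
```

The root commits to every entry and to their order, which is what makes an append-only log auditable. The open design question raised in the text is how to maintain such a structure when the entries are spread over many DHT nodes.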
For additional information about CT, we suggest reading our PAM 2017 (http://www.ida.liu.se/~nikca89/papers/pam17.pdf) and PAM 2018 (http://www.ida.liu.se/~nikca89/papers/pam18.pdf) papers, watching the following YouTube video for some motivation for why CT is important (https://www.youtube.com/watch?v=tJFfDOQT46k), or both.
Virtual and real worlds are becoming increasingly integrated. For example, in the context of disaster recovery (e.g., think of earthquake or avalanche scenarios), emergency workers may be provided with virtual environments of the original city/house design while exploring a disaster area in its current shape. As users interact with these environments in remote areas with limited or unreliable network connectivity, it is important to design delivery protocols that provide the best possible service given the limited available bandwidth. An important first step in designing such protocols is to understand how users interact with the virtual environment and the (virtual) objects. Here, users' distance to these objects plays a particularly important role in the type and amount of optimization that can be done when optimizing the delivery of tomorrow's virtual reality (VR), augmented reality (AR), or hybrid user experiences.
In this project, you will work with the Oculus SDK to develop a measurement methodology that collects as much information as possible about head movements, positioning information about the user, and information about the objects that a user looks at within different virtual environments, such as the depth/distance at which they occur and the fraction of the viewing field they take up. To do this, you also need to extract depth information and information about the objects and views that are presented to the user as they explore an environment. Ideally, a large set of environments of different classes would be used and explored in the study. (E.g., this would allow fewer users and instead allow the focus to be on comparing different environments ...) If time permits, processing statistics regarding rendering information about the different environments and different components should be collected.
The goal is that the datasets collected with the tool (hopefully already during the thesis) can be used to help answer some example research questions. As with the code projects above, the dataset and tools should not be shared publicly until we potentially publish a research article using these tools and datasets. This project requires good programming skills and willingness to work with other people's code.
Page responsible: Marcus Bendtsen
Last updated: 2018-02-15