Scalable and Efficient Content Distribution
Funded by: Center for Industrial Information Technology (CENIIT) (Start date: Jan. 2011)
Project leader: Niklas Carlsson,
Associate Professor (docent; universitetslektor)
Department of Computer and Information Science (IDA), Linköping University
Email: niklas.carlsson@liu.se
Overview:
Content-based services (such as video streaming) play a central role in society
and our everyday lives. Today, content delivery applications (such as YouTube,
Netflix, and Spotify) consume a majority of the Internet bandwidth,
and the demand is continually increasing. This research project provides efficient
and improved ways to disseminate content and information. We explore the best protocols
and architectures to serve large numbers of users/clients. Of particular interest are
design, modeling, and performance evaluation that provide solid insights towards the
best possible performance, scalability, efficiency, and/or quality of service.
Background and Industrial Perspective
With tremendous improvements in network bandwidth and computer capabilities,
many new high-bandwidth content distribution services have emerged in the
entertainment, business, and scientific communities. In contrast to traditional
content distribution systems such as TV and radio broadcasts, many of these new
services operate on an on-demand basis and only serve clients when explicit
requests for service are made. Further, many of these new services allow
the delivered content to be personalized.
Today, content delivery applications (such as YouTube, Netflix, and Spotify)
consume a majority of the Internet
bandwidth. With continued growth in demand for such applications anticipated,
the problem of cost-efficient and/or sustainable content delivery becomes
increasingly important. For efficient delivery, protocols and architectures
must scale well with the request loads; i.e., it is important that protocols
are designed such that the marginal delivery cost decreases as demand increases.
Using scalable techniques can allow a content distributor to handle higher demands
more efficiently, and/or to offer its existing customers better service while
reducing its resource requirements and/or delivery costs.
A variety of techniques have been studied to improve the scalability and efficiency
of content delivery, including replication, service aggregation,
and peer-to-peer techniques. With replication, multiple servers
(possibly geographically distributed as in a CDN) share the load of processing
client requests and may enable delivery from a nearby server. With aggregation,
multiple client requests are served together in a manner that is more efficient
than individual service. Finally, with peer-to-peer techniques, clients may contribute
to the total service capacity of the system by providing service to other clients.
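As a rough illustration of aggregation, the following minimal sketch counts how many streams a server would start if all requests arriving within a fixed window of the first unserved request shared one stream (e.g., via multicast). The function name `streams_needed`, the batching window, and the workloads are hypothetical and chosen only for illustration; this is not a protocol taken from the project.

```python
import math

def streams_needed(arrival_times, batch_window):
    """Count streams when every request arriving within `batch_window`
    of the first unserved request shares one stream (hypothetical policy)."""
    streams = 0
    window_end = -math.inf
    for t in sorted(arrival_times):
        if t > window_end:               # no open batch covers this request
            streams += 1                 # start a new shared stream
            window_end = t + batch_window
    return streams

# Hypothetical workloads: one request per second, then two per second, for 60 s.
light_load = [float(i) for i in range(60)]
heavy_load = [i * 0.5 for i in range(120)]

individual = len(light_load)                     # one stream per request
batched_light = streams_needed(light_load, 10)   # requests within 10 s share a stream
batched_heavy = streams_needed(heavy_load, 10)

# Streams per request: the cost per request drops as the load grows.
per_request_light = batched_light / len(light_load)
per_request_heavy = batched_heavy / len(heavy_load)
```

Under these assumptions the number of streams is bounded by the window length rather than the request rate, so doubling the load halves the per-request cost. This is the kind of decreasing marginal delivery cost that scalable designs aim for.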
While much work has considered various scalable solutions, there is a lack of literature
considering the problem of cost-efficient content delivery, in which the application
incurs both a network delivery cost (e.g., from cross ISP traffic or, more generally,
operation/energy costs at Internet routers) and costs at the servers
(e.g., due to cost of ownership, energy, or disk bandwidth).
While the problem is complicated by the fact that the cost objective and the absolute
cost tradeoff may be different from case to case, contributions that help identify
the most efficient delivery architectures and protocols are becoming increasingly
important for organizations distributing content over the Internet. This importance
may be further augmented by the potential of increasing energy costs and carbon taxes, for example.
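The cost tradeoff described above can be sketched as a simple server-selection rule. The example below is only an assumption-laden illustration: the server names, the cost values, and the `weight` parameter (which expresses how much server cost matters relative to network cost) are all hypothetical and do not come from the project.

```python
def cheapest_server(network_cost, server_cost, weight=1.0):
    """Return the server that minimizes network cost + weight * server cost.

    `network_cost` and `server_cost` map server names to per-request costs;
    `weight` is a hypothetical knob for the relative importance of server cost.
    """
    return min(
        network_cost,
        key=lambda s: network_cost[s] + weight * server_cost[s],
    )

# Hypothetical per-request costs for two candidate servers.
network_cost = {"nearby": 1.0, "remote": 4.0}   # e.g., cross-ISP traffic
server_cost = {"nearby": 5.0, "remote": 1.0}    # e.g., energy or disk bandwidth

equal_weight_choice = cheapest_server(network_cost, server_cost, weight=1.0)
network_heavy_choice = cheapest_server(network_cost, server_cost, weight=0.2)
```

With equal weights the remote server is cheaper overall (total 5.0 versus 6.0), while discounting server cost makes the nearby server preferable. The best choice therefore depends on the cost objective, which is why the tradeoff differs from case to case.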
There are also many other trends that make content and service delivery a topic with
increasing commercial impact. For example, increased bandwidth capacities, improved
wireless access, and new high-capacity devices, such as the Apple iPhone, have enabled
the emergence of many new, popular, and exciting applications, which themselves have the
potential to attract new customer audiences. While this step towards greater mobility
may be seen as the actualization of anything-anywhere-anytime services (that ideally
allow users access to anything they want, whenever and wherever they are), the trend
has also had great performance implications, as much more bandwidth is being used for
data traffic generated by smartphones and other mobile devices. These and other
trends have gathered significant interest from the industry, which currently is
investing in research on new technologies that help disseminate content efficiently.
In summary, there is much potential for new services and technologies of content delivery.
Research Vision and Project Goals
We will build a highly recognized research group that specializes in distributed systems and networks.
For this goal to be achieved it is important that the group remains aligned with the most important
problems, helps educate the next generation of top scientists, and consistently produces high-quality
research, shared through publications in premier conferences/journals or more targeted high-profile
venues. For greater impact we also plan to release open source implementations and datasets.
Overall, we want to be known for doing great research! This will involve cooperation with both
academic and industrial partners.
At a technical level, the primary long-term goal of this research project is to contribute
towards providing simple and resource-effective ways to disseminate content and information.
Ultimately, content/information should be easily accessible whenever and wherever you are,
at a low environmental and monetary cost.
Our research will primarily focus on performance and resource usage/cost aspects of protocols,
services, and systems. Our efforts will be focused towards the problems that we believe have
the potential to yield the most significant contributions, and we should always try to use the
best tools for the problem at hand. Following the direction of the applicant, the research
will include a combination of analytic modeling, simulations, measurements, system implementation,
as well as real-world experiments.
Project Description
While the research group has a relatively broad interest, we are very much
interested in exploring the best protocols and architectures to serve large
numbers of users/clients. Of particular interest are design, modeling, and
performance evaluation that provide solid insights towards the best possible
performance, scalability, efficiency, and/or quality of service.
The long-term goal of this research project is to contribute towards providing
simple and effective ways to disseminate content and information.
We will design, model, and evaluate candidate solutions, so as to
provide solid insights towards the best possible performance, scalability,
efficiency, and/or quality of service. Our work will use a combination of
measurements, mathematical modeling, system implementation, and real-world
experiments. When possible, analytic lower bounds will be developed for
the purpose of rigorous protocol and system evaluation.
-
To better understand the underlying conditions under which protocols and systems operate,
we perform measurement-based studies
in which network dynamics, workloads, and/or various system performance aspects are characterized.
These results help answer questions such as what are the current limitations and bottlenecks;
what makes some content, services, and systems more popular and/or successful than others; etc.
However, these studies also give insight into more fundamental questions regarding access patterns,
content selection, etc. Insights, datasets, and models created during this part of the project
are also leveraged when we design and evaluate new distributed systems and protocols.
-
We design and evaluate the performance of various new scalable protocols and
architectures that help improve the quality of service and resource usage.
Ongoing work includes the design of hybrid system solutions that minimize the
total service cost associated with large-scale systems, in which information
and content are replicated at many locations and with the help of different technologies
(e.g., peer-to-peer, cloud, proxies, and datacenters).
We are particularly interested in solutions that scale with both the number of clients and
the diversity of service that these clients are provided (e.g., a large catalogue of contents),
not only one or the other.
Project Environment and Relevance
Industrial Relevance
Content-based services (such as video streaming)
play a very big part in society and our everyday lives.
In addition to our initial observations that content delivery contributes a majority of
current Internet traffic, and that the portion is continually increasing, we note that
more and more users expect to watch whatever content they want to watch, online, whenever
and wherever they are. This trend towards anything-anytime-anywhere service is likely to
continue, and companies are trying to take advantage of these trends.
Simultaneously, companies are becoming increasingly aware of their energy
consumption and overall delivery costs. As carbon taxes and energy-price
differentiation (due to differences in energy source, for example) become common,
energy efficiency and sustainable content delivery approaches are likely to have a
very direct impact on the bottom line of many companies.
Finally, we note that there is a very wide range of companies that deliver
(or plan to deliver) content over the Internet. Some companies are using
peer-assisted solutions, while others are selecting server-based solutions.
We expect both approaches to generate commercial interest and revenue;
our research will not be restricted to either approach.
Research Environment
Our new networks and systems group has begun as a subgroup under ADIT.
During the building of the group, this will ensure access
to some local resources. The group is led by
Niklas Carlsson,
Associate Professor (docent; universitetslektor). Niklas is supervisor
(or co-supervisor) for four local PhD students (Vengatanathan Krishnamoorti, Rahul Hiran, Anna Vapen,
and Mats Gustafsson), one researcher (Cyriac James), and a number of thesis students.
Through thesis opportunities, for example,
we are actively trying to identify good candidate members for the group.
As of today, we have also hosted a visiting PhD student for six months
(from NICTA, Australia, whom the applicant has helped supervise
for the last two years of her thesis),
hired a postdoc for three months, and hired two researchers
(one of whom has since become a PhD student in the group).
The work by these individuals has contributed greatly to the project.
This work includes the design of throughput optimal second-spectrum auctions
to better utilize the wireless spectrum, models and statistical analysis to
gain insights into what makes some content more popular than others, an analytic
framework that allows for the design and evaluation of optimal replica selection
policies for content delivery systems with replicated service, and an evaluation
framework for, and performance evaluation of, proxy-assisted HTTP-based adaptive streaming
protocols.
Within this framework, we have also hosted senior researchers from NICTA
(Australia) and University of Calgary (Canada), hosted a visiting undergraduate
internship student from France, and helped bring two PhD thesis projects to completion
(Youmna Boghol, NICTA, and Aniket Mahanti, University of Calgary).
Cooperation and Industrial Partnership
Much of the proposed research is done in cooperation with researchers in both
academia and industry. Collaborators and partners include both Swedish and international
companies and institutions.
-
We are actively sharing findings and datasets, and discussing various research projects, with
our Swedish industry partners: Ericsson (Kista and Linköping), Peerialism (Kista), and Spotify (Stockholm).
-
We are actively writing research papers, organizing workshops, and making regular research visits
with our primary international industry partners and their employees: National ICT Australia (NICTA)
and HP Labs (Palo Alto, CA).
-
We are actively writing research papers and have regular research visits/meetings
with our (primary) academic collaborators:
Derek Eager (University of Saskatchewan, Canada), Carey Williamson (University of Calgary, Canada),
Aniket Mahanti (University of Auckland, New Zealand),
and György Dan (Royal Institute of Technology, Sweden).
While much of the existing work has been done with international partners, we are actively
increasing our collaboration and discussions with Swedish industry.
This includes master's thesis projects and discussions about future PhD student internships.
However, we are also interested in other forms of collaboration with existing and future
industry partners.
Relationship to Other CENIIT Projects
We believe that computer networks research is a very important research area in which the
Department and the University must strengthen their research. For this reason, the
primary applicant was hired (in Sept 2010) and is currently in the process of building a new research
group. We further believe that the direction of the project aligns very well with CENIIT's goals.
Recruitment and Outreach
Research opportunities:
If you are interested in working on this project (or related projects), we are looking for
hardworking and ambitious people who are interested in research and/or problem solving. If you
fit this description, please send an email to Niklas Carlsson (niklas.carlsson@liu.se).
More information about the research,
thesis projects, and positions can be found here.
Companies and organizations:
If you are a company or organization that is interested in any of the topics included within this
project (and/or the general research interest of the group), we would be interested to discuss
potential mutual interests. Please send an email to Niklas Carlsson (niklas.carlsson@liu.se).
Publications
This project started in January 2011. Since then we have had two successful and productive years,
with many papers published in premier journals
(e.g., IEEE/ACM ToN, ACM TWEB, ACM TOIT, and Performance Evaluation),
a magazine (IEEE Network), and selective conferences (including ACM SIGKDD and IFIP Performance).
For an up-to-date list of publications related to the project we refer to
Niklas Carlsson's publication list.