Scalable and Efficient Content Distribution
Start date of CENIIT Project: Jan. 2011
Niklas Carlsson (project leader)
Assistant Professor, IDA, Linköping University
niklas.carlsson@liu.se
1. Background and Industrial Perspective
With tremendous improvements in network bandwidth and computer capabilities, many new high-bandwidth content distribution services have emerged in the entertainment, business, and scientific communities. In contrast to traditional content distribution systems such as TV and radio broadcasts, many of these new services operate on an on-demand basis and only serve clients when explicit requests for service are made. Further, many of these new services allow the delivered content to be personalized.
Today, content delivery applications consume a majority of the Internet bandwidth. With continued growth in demand for such applications anticipated, the problem of cost-efficient and/or sustainable content delivery becomes increasingly important. For efficient delivery, protocols and architectures must scale well with the request loads; i.e., it is important that protocols are designed such that the marginal delivery costs decrease with increasing demand. Using scalable techniques can allow a content distributor to handle higher demands more efficiently, and/or to offer its existing customers better service while reducing its resource requirements and/or delivery costs.
A variety of techniques have been studied to improve the scalability and efficiency of content delivery, including replication, service aggregation, and peer-to-peer techniques. With replication, multiple servers (possibly geographically distributed as in a CDN) share the load of processing client requests and may enable delivery from a nearby server. With aggregation, multiple client requests are served together in a manner that is more efficient than individual service. Finally, with peer-to-peer techniques, clients may contribute to the total service capacity of the system by providing service to other clients.
While much work has considered various scalable solutions, there is a lack of literature considering the problem of cost-efficient content delivery, in which the application incurs both a network delivery cost (e.g., from cross-ISP traffic or, more generally, operation/energy costs at Internet routers) and costs at the servers (e.g., due to cost of ownership, energy, or disk bandwidth). While the problem is complicated by the fact that the cost objective and the absolute cost tradeoff may be different from case to case, contributions that help identify the most efficient delivery architectures and protocols are becoming increasingly important for organizations distributing content over the Internet. This importance may be further augmented by the potential of increasing energy costs and carbon taxes, for example.
There are also many other trends that make content and service delivery a topic with increasing commercial impact. For example, increased bandwidth capacities, improved wireless access, and new high-capacity devices, such as the Apple iPhone, have enabled the emergence of many new, popular, and exciting applications, which themselves have the potential to attract new customer audiences. While this step towards greater mobility may be seen as the actualization of anything-anywhere-anytime services (that ideally allow users access to anything they want, whenever and wherever they are), the trend has also had great performance implications, as much more bandwidth is being used for data traffic generated by smartphones and other mobile devices. These and other trends have gathered significant interest from the industry, which currently is investing in research on new technologies that help disseminate content efficiently. In summary, there is much potential for new services and technologies for content delivery.
2. Project Description
Within this project, we plan to build a highly recognized research group that specializes in distributed systems and networks. While the research group has relatively broad interests, we are very much interested in exploring the best protocols and architectures to serve large numbers of users/clients. Of particular interest are design, modeling, and performance evaluation that provide solid insights towards the best possible performance, scalability, efficiency, and/or quality of service.
The long-term goal of this research project is to contribute towards providing simple and effective ways to disseminate content and information.
There are many interesting avenues for future research that we would like to pursue. Below is a brief description of some example directions this project will investigate in the near future:
-
Energy and cost efficient delivery: Energy costs to operate large distributed systems can be substantial. We are very interested in new protocols and architectures that take advantage of heterogeneous workloads, power consumption laws, and varying energy prices to reduce the total energy costs and energy consumption (carbon footprint) in such systems.
Building on strong knowledge of and interest in performance evaluation of systems, much of our recent work on sustainable ICT focuses on how to reduce the environmental footprint of systems and services (such as a server farm used to deliver content to millions of users), and/or how to operate these systems in a way that minimizes their environmental footprint. However, we are also interested in new ways to leverage ICT to reduce the footprint of other (non-traditional ICT) tasks.
-
Peer-assisted delivery: One highly scalable approach to content delivery is to harness the upload bandwidth of the clients to offload the original content source. With peer-assisted protocols, each file is typically split into many small pieces, each of which may be downloaded from different peers. While these techniques are very flexible, they also come with their unique challenges. Through our research on scalable content delivery we have identified many interesting problems for future research, including topics such as how to best serve the long tail of mildly popular content that does not achieve sufficient request rates to be served efficiently using peer-assisted technologies.
On this topic, we are also very much interested in catch-up services, with which a content provider (such as SVT) can efficiently offer customers the ability to watch full-length episodes/shows at any time after they first aired.
-
Locality of service: Locality of service is very important for quality of service, but also when minimizing the overall delivery cost of distributed services (e.g., the aggregate energy consumption). We are designing locality- and energy-efficient protocols for peer-assisted systems, and are also planning a wide-area implementation of a peer-assisted streaming system, which can be used to compare the locality achieved by different policies and protocols. Potential future work includes characterizing the topology of the Internet and its impact on energy-efficient server placement and server selection, as well as its impact on the scalability and quality of service of various distributed systems (e.g., content distribution systems, peer-to-peer file-sharing systems, and distributed network games).
-
Smartphones and mobile data traffic: While the emergence of mobile data traffic creates opportunities for application developers, this trend also brings numerous challenges for systems and networking architects and designers. We are currently characterizing the mobile data traffic at a large university and tracking all the usage (both over the Internet and over the voice network) of a group of users. As part of this project, we plan to investigate how both new and existing solutions, such as cooperative caching, can help in improving the scalability, efficiency and quality of these applications.
3. Research Vision and Project Goals
We plan to build a strong new research group that specializes in design, modeling, and performance evaluation of distributed systems and networks.
The overarching goal of this project is to establish a highly recognized research program that helps educate the next generation of scientists and that consistently produces high-quality research, whose findings are shared through publications in premier conferences/journals or more targeted high-profile venues. For greater impact we are also planning to release open source implementations of our solutions.
At a technical level, the primary long-term goal of this research project is to contribute towards providing simple and resource-effective ways to disseminate content and information in large networks such as the Internet. Ultimately, content and information should be easily accessible anytime and anywhere, at a very low environmental and monetary cost.
Our research will focus on performance and resource usage/cost aspects of protocols, services, and systems. Furthermore, we will focus our efforts on the problems that we believe have the potential to yield the most significant contributions, and always try to use the best tools for the problem at hand. Following the direction of the primary applicant, the research will include a combination of analytic modeling, simulations, measurements, system implementation, and real-world experiments.
4. Project Environment and Relevance
4.1. Industrial Relevance
In addition to our initial observations that content delivery accounts for a majority of current Internet traffic, and that the portion is continually increasing, we note that more and more users expect to watch whatever content they want to watch, online, whenever and wherever they are. This trend towards anything-anytime-anywhere service is likely to continue, and companies are trying to take advantage of these trends. Simultaneously, companies are becoming increasingly aware of their energy consumption and overall delivery costs. As carbon taxes and energy-price differentiation (due to differences in energy source, for example) become common, energy efficiency and sustainable content delivery approaches are likely to have a very direct impact on the bottom line of many companies.
Finally, we note that there is a very wide range of companies that deliver (or plan to deliver) content over the Internet. Some companies are using peer-assisted solutions, while others are selecting server-based solutions.
We expect both approaches to generate commercial interest and revenue; our research will not be restricted to either approach.
4.2. Research Environment
The new computer networks and systems group will begin as a subgroup under the larger computer security and database group. During the building of the group, this will ensure access to some local resources. The primary people involved in this project are:
-
Niklas Carlsson, assistant professor and primary applicant (start date: September 1, 2010)
-
Nahid Shahmehri, professor and head of the ADIT group
While building the group, additional resources and collaborations will be leveraged around the world, as well as within Sweden (e.g., KTH, Ericsson, and Peerialism).
4.3. Relationship to Other CENIIT Projects
We believe that computer networks research is a very important research area in which the Department and the University must strengthen their research. For this reason, the primary applicant was hired and is currently in the process of building a new research group. We further believe that the direction of the project aligns very well with CENIIT's goals.
4.4. Cooperation and Industrial Partnership
Much of the proposed research will be done in cooperation with researchers in both academia and industry. Collaborators and partners include both Swedish and international companies and institutions.
While the primary applicant recently relocated to Sweden, Swedish collaborations have already been initiated and mutual interest in collaboration and technology has been established. We believe this further illustrates the commercial value of this research. We are committed to making sure that these relationships are maintained and strengthened, and that they contribute to our partners' success as well as our own.
Swedish industry partners:
-
Ericsson: Through initial meetings we have established overlap in research interest, with both sides believing that there is much room for mutual benefit. We have planned future meetings and intend to collaborate in the general area of computer networks research.
-
Peerialism: Through initial meetings, we have established a common ground for our interest in peer-assisted content delivery systems and now plan to leverage the combined knowledge to devise new and improved protocols and services.
International industry partners:
-
HP Labs: Niklas Carlsson has strong ongoing collaborations with HP Labs. In addition to numerous research projects, for the past two years, the primary applicant has organized ACM GreenMetrics (a workshop collocated with ACM SIGMETRICS, ACM's flagship conference on computer systems and networks performance) together with Martin Arlitt (Senior Research Scientist, HP Labs, Palo Alto, CA) and Jerry Rolia (Principal Scientist, HP Labs, Bristol, UK).
-
National ICT Australia (NICTA): Niklas Carlsson has worked (and is actively working) with Anirban Mahanti (Senior Researcher) on numerous papers and projects (including two PhD students at NICTA, Sydney, Australia, and two student theses which won best thesis awards at IIT Delhi, India).
Primary academic collaborations:
-
University of Saskatchewan, Canada: Niklas has a PhD from the University of Saskatchewan, has worked there as a Postdoctoral Fellow (January 2007 to June 2008), and has ongoing projects with researchers there (including with Derek Eager, Professor).
-
University of Calgary, Canada: Niklas has worked as a Research Associate (July 2008 to August 2010) at the University of Calgary and has ongoing projects with researchers there (including with Carey Williamson, iCore Chair and Department Head).
-
Royal Institute of Technology (KTH), Sweden: Niklas is also actively publishing research papers with researchers at KTH (including with György Dan, Assistant Professor).
Recruitment
If you are interested in working on this project (or a related project), we are looking for hardworking and ambitious people who are interested in doing high-quality research. If you fit this description, please send Niklas an email. There are opportunities for both new PhD positions (doktorand in Swedish) and masters projects (exjobb in Swedish). More information about the research and positions can be found here.
Publications
For an up-to-date list of publications related to the project, see Niklas Carlsson's publication list.