

Cloud computing is an innovative technology that relies on sharing resources over the Internet, much like a utility. The cloud allows one to acquire computing resources such as processing power and storage on demand according to a pay-as-you-go model. In cloud computing, services are offered according to three different models: Infrastructure as a Service (IaaS) (e.g., Virtual Machines, Storage), Platform as a Service (PaaS) (e.g., Web Servers, Databases), and Software as a Service (SaaS) (e.g., Email, Virtual Desktop). There are multiple ways of deploying a cloud to offer services to cloud clients: Private Cloud (intra-organization), Public Cloud (open for public use over the Internet), and Hybrid Cloud (a combination of private and public cloud).

Though cloud-computing technology is fairly mature and has been adopted by various industries, there are still plenty of opportunities for research. We believe that this Best Readings on Cloud Computing will serve as a valuable bibliographical resource for those who are starting to work on cloud computing and are looking for a single point of access to technical papers spread across a large number of conferences and journals. The list of papers is grouped under a broad range of technical topics (including cloud services, QoS (quality of service), data centers, and security) so that readers can easily browse through the list based on their interests.

Issued June 2015


Kaliappa Ravindran, City University of New York
Masum Z Hasan, Cisco Systems Inc.

Gerard Parr, University of Ulster
Arun Adiththan, City University of New York


R.K. Gullapalli, C. Muthusamy, and A.V. Babu, “Control Systems Application in Java-based Enterprise and Cloud Environments: A Survey,” International Journal of Advanced Computer Science and Applications (IJACSA), vol. 2, no. 8, pp. 103-113, 2011.

The paper explores the use of "feedback control systems" theory for Web and Application Server environments hosted on data-centers and clouds. The paper first presents a review of how the control systems theory has been employed in CPU load balancing and power management applications. It then discusses how the control systems theory can be employed in Java-based Web, Application and Enterprise Server environments.

Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in Mobile Cloud Computing: Taxonomy and Open Challenges,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 369-392, May 2013.

This paper discusses the taxonomy of technical challenges in mobile cloud computing (MCC). Component heterogeneity is analyzed at various abstraction levels: hardware, platform, feature, API, and network. The role of heterogeneity handling approaches like virtualization, middleware, and service-oriented architecture is discussed in the context of MCC.

N. Fernando, S.W. Loke, and W. Rahayu, “Mobile Cloud Computing: A Survey,” Journal of Future Generation Computer Systems (FGCS), vol. 29, no. 1, pp. 84-106, January 2013.

The paper discusses a taxonomy of issues in executing mobile applications on resource providers (i.e., clouds) external to the mobile device. Major challenges are the resource sharing and frequent disconnection of devices. The discussion surveys different approaches to tackle these issues, and analyzes the technical challenges to be met.

S. Abolfazli, Z. Sanaei, E. Ahmed, A. Gani, and R. Buyya, “Cloud-Based Augmentation for Mobile Devices: Motivation, Taxonomies, and Open Challenges,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 337-368, July 2013.

The paper describes Cloud-based Mobile Augmentation (CMA) approaches, which employ resource-rich clouds to enhance the computing capabilities of mobile devices so that they can execute resource-intensive mobile applications. The paper presents a taxonomy of CMA approaches, highlighting the effects of remote resources on the quality and reliability of augmentation processes. State-of-the-art CMA approaches are discussed in the dimensions of device proximity to a cloud and device mobility across different clouds.


Mobile Cloud Computing has papers that highlight challenges in utilizing cloud-based computational resources to enable execution of complex tasks in resource-constrained mobile devices. Cloud Service-Level Agreements lists papers on creating and managing contracts between a cloud service provider and its customers according to metrics such as availability, performance, and security. Quality of Service (QoS) in the Cloud lists papers on QoS management, which focus on the problem of allocating resources to meet the QoS targets specified in the service-level agreement. Cloud Performance Management has papers that consider challenges related to monitoring the capability of cloud components in delivering expected service. Finally, papers on security issues/concerns associated with cloud computing are given in Cloud Security.

Topic: Data Centers

M.Z. Hasan, M. Morrow, L. Tucker, S.L.D. Gudreddi, and S. Figueira, “Seamless Cloud Abstraction, Model and Interfaces,” in Proc. ITU Kaleidoscope 2011: The Fully Networked Human? - Innovations for Future Networks and Services (K-2011), December 2011.

This paper presents a seamless cloud framework that enables cloud service consumers (end users or another cloud service provider) to seamlessly integrate resources controlled by private, public, and hybrid cloud service providers on demand. The framework also allows isolation of different tenants in a multi-tenant environment and provides differentiated quality of service.

V. Mann, A. Vishnoi, and S. Bidkar, “Living on the Edge: Monitoring Network Flows at the Edge in Cloud Data Centers,” in Proc. International Conference on Communication Systems and Networks (COMSNETS), January 2013.

The paper presents a network-wide flow monitoring service that generates key inputs to the network control plane for efficient traffic engineering. This work considers the challenges in network monitoring in the context of virtualization in cloud data centers such as the introduction of “vswitch” (virtual switch). This paper highlights the need for redesigning some of the flow-based monitoring techniques developed for hardware switches for software implementations.

H. Nguyen, Z. Shen, X. Gu, S. Subbiah, and J. Wilkes, “AGILE: Elastic Distributed Resource Scaling for Infrastructure-as-a-Service,” in Proc. International Conference on Autonomic Computing (ICAC), USENIX, ACM SIGARCH, June 2013.

The paper studies the topic of dynamically adjusting the number of VMs assigned for a cloud application to keep up with load changes and interference from other users. The paper employs two complementary techniques: i) resource demand prediction to start new application server instances before a performance degradation occurs, and ii) dynamic VM cloning to reduce application startup times.

V. Mann, A. Gupta, P. Dutta, A. Vishnoi, P. Bhattacharya, R. Poddar, and A. Iyer, “Remedy: Network-Aware Steady State VM Management for Data Centers,” in Proc. IFIP Conference on Networking, May 2012.

The paper describes methods for network-aware management of VMs in data centers. The idea is to ensure that VM migrations do not degrade the network performance experienced by other flows in the network. The parameters considered are the cost of migration, the bandwidth available for migration, and the bandwidth savings achieved by migration. The authors describe an OpenFlow controller application, Remedy, that detects the most congested links in the network and migrates one or more VMs in a network-aware manner to decongest these links.

V. Mann, A. Vishnoi, A. Iyer, and P. Bhattacharya, “VMPatrol: Dynamic and Automated QoS for Virtual Machine Migrations,” in Proc. IEEE/IFIP International Conference on Network and Service Management (CNSM'12), October 2012.

The paper presents a QoS framework -- VMPatrol -- for VM migrations in a data center. VMPatrol uses a cost of migration model to allocate a minimal bandwidth for migrating a task flow such that it completes within the specified time limit while causing minimal interference to other flows in the network. The idea of VMPatrol has been tested on real and virtual software testbeds.

J. A. Wickboldt, L. Z. Granville, F. Schneider, D. Dudkowski, and M. Brunner, “A New Approach to the Design of Flexible Cloud Management Platforms,” in Proc. IEEE/IFIP International Conference on Network and Service Management (CNSM) and Workshop on Systems Virtualization Management (SVM), October 2012.

The paper presents a conceptual architecture for cloud platforms that adds flexible and robust network configuration support. It complements the existing work on cloud-management platforms that deal mainly with computing and storage resources but do not support advanced requirements such as delay and bandwidth guarantees.

A. Nahir, A. Orda, and D. Raz, “Distributed Oblivious Load Balancing Using Prioritized Job Replication,” in Proc. IEEE/IFIP International Conference on Network and Service Management (CNSM'12), October 2012.

The paper discusses the load balancing of computational jobs in large distributed server systems such as Amazon Elastic Compute Cloud (EC2). The authors’ scheme assigns regular job requests to randomly chosen servers, and then creates replicas to be sent to different servers and executed at low priority such that there is no burden on the servers. The elimination of job scheduling overhead improves the overall system performance.

Topic: Mobile Cloud Computing

T. Verbelen, P. Simoens, F.D. Turck, and B. Dhoedt, “Cloudlets: Bringing the Cloud to the Mobile User,” in Proc. ACM Workshop on Mobile Cloud Computing and Services (MCS '12), June 2012.

The paper suggests methods to bring a cloud closer to the mobile devices (in a logical sense), so that complex computations spawned by mobile applications can be offloaded to the cloud. The paper introduces an abstract object, cloudlet, that can be instantiated on a variety of cloud components. With cloudlets, a mobile application can be flexibly composed with any cloud device and with any available resources, and they do not require a fixed infrastructure. A mobile real-time augmented reality application implemented with cloudlets is also discussed.

M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The Case for VM-based Cloudlets in Mobile Computing,” IEEE Pervasive Computing, vol. 8, no. 4, pp. 14-23, October 2009.

The paper discusses the challenges in enhancing the computational capabilities of mobile users with cloud-based back-end VMs. It proposes an architecture that allows a mobile user to rapidly instantiate customized service software on a nearby cloudlet (e.g., speech recognition tool), which can then be accessed over a wireless LAN. The mobile device functions as a thin client for the service, interacting with a trusted resource-rich cloudlet running on the VMs. The cloudlet-based architecture enables meeting the high resource demands of applications such as high-definition video and high-resolution images.

M. Joselli, M. Zamith, J.R. Silva, L. Valente, and E. Clua, “An Architecture for Mobile Games with Cloud Computing Module,” in Proc. Brazilian Symposium on Computer Games and Digital Entertainment (SBGames 2012), November 2012.

The paper presents an architecture to support mobile games with a cloud back-end. In this architecture, a cloud module interacts with compute-intensive services hosted on a cloud such as image & speech recognition and game display rendering. The architecture allows for robust connections to social networks for publishing the game progress and statistics.

B. G. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti, “CloneCloud: Elastic Execution between Mobile Device and Cloud,” in Proc. Sixth Conference on Computer Systems (EuroSys '11), April 2011.

The paper presents the design and implementation of a system, "CloneCloud", that allows mobile applications to benefit from a back-end cloud by offloading computational tasks. The tasks run on device clones operating in a computational cloud, to which device-level threads are migrated. CloneCloud allows the decomposition of an application at finer granularity, while optimizing the execution time and energy use.

P. Bahl, R.Y. Han, L.E. Li, and M. Satyanarayanan, “Advancing the State of Mobile Cloud Computing,” in Proc. ACM Workshop on Mobile Cloud Computing and Services (MCS '12), June 2012.

The paper explores fundamental research questions at the intersection of mobile and cloud computing. It discusses the rationale for offloading computational tasks spawned by mobile applications, the induction of a middle tier in the cloud architecture to support mobility, and the system-level programming models such as thread granularity and RMI. Apple iCloud and Amazon Silk browser are example mobile applications considered.

A.P. Miettinen and J.K. Nurminen, “Energy Efficiency of Mobile Clients in Cloud Computing,” in Proc. USENIX Workshop on Hot Topics in Cloud Computing (HotCloud'10), June 2010.

The paper discusses the critical factors affecting the energy consumption of mobile clients in cloud-computing environments. The idea is that the savings from offloading the computation should exceed the energy cost incurred for additional communication. The characteristics of mobile handheld devices that determine the balance between local and remote computing are also discussed.
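As a rough illustration of this break-even condition (a sketch under simplifying assumptions, not the authors' model), the function below compares the energy of running a task locally with the energy of sending its data to the cloud. The linear energy model and all parameter names (`cycles`, `local_joules_per_cycle`, `bytes_to_send`, `radio_joules_per_byte`) are hypothetical and chosen only for the example.

```python
def should_offload(cycles, local_joules_per_cycle,
                   bytes_to_send, radio_joules_per_byte):
    """Return True when offloading is expected to save energy.

    Hypothetical linear model (an assumption of this sketch):
    local energy grows with the CPU cycles executed on the device,
    and communication energy grows with the bytes transferred.
    """
    local_energy = cycles * local_joules_per_cycle
    comm_energy = bytes_to_send * radio_joules_per_byte
    # Offload only if the computation energy saved exceeds the
    # extra energy spent on communication.
    return local_energy > comm_energy
```

Under this model, a compute-heavy task with little data to transfer favors offloading, while a light task with a large payload is better run on the device.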

I. Giurgiu, O. Riva, D. Juric, I. Krivulev, and G. Alonso, “Calling the Cloud: Enabling Mobile Phones as Interfaces to Cloud Applications,” in Proc. ACM/IFIP/USENIX International Middleware Conference (Middleware'09), November/December 2009.

The paper presents a middleware platform that can automatically distribute different layers of an application between the mobile phone and the remote server (e.g., running on a cloud). It offers a flexible middle ground for the software partitioning between the phone and the server, when compared to the existing approaches where applications either run on the phone or run on the server and are remotely accessed by the phone. The approach optimizes a variety of objective functions, such as latency, data transferred, and cost. At the software platform level, the middleware approach builds on existing technology for distributed module management.

P. Angin and B.K. Bhargava, “Real-time Mobile-Cloud Computing for Context-Aware Blind Navigation,” International Journal of Next-Generation Computing, vol. 2, no. 2, pp. 1-13, July 2011.

The paper describes an approach for context-aware navigation by exploiting the computational power of resources from Cloud providers as well as the location-specific resources on the Internet. The paper describes an extensible system architecture that minimizes reliance on the infrastructure, thus allowing for wide usability, in particular, for navigation in unfamiliar environments (especially for the blind and visually impaired).

E. Cuervo, A. Balasubramanian, D. K. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, “MAUI: Making Smartphones Last Longer with Code Offload,” in Proc. International Conference on Mobile Systems, Applications, and Services (MobiSys'10), June 2010.

This paper presents a system, MAUI, that allows energy-conscious offload of mobile code to the infrastructure. MAUI does not rely on programmer support to partition an application, and the migration can be fine-grained without requiring full VM migration. Sample applications are the resource-intensive face recognition, voice-based language translation, and arcade games. MAUI decides at runtime which methods should be remotely executed, using an optimization engine that achieves the best energy savings possible under the mobile device’s connectivity constraints.

E. Miluzzo, R. Caceres, and Y.F. Chen, “Vision: mClouds – Computing on Clouds of Mobile Devices,” in Proc. ACM Workshop on Mobile Cloud Computing and Services (MCS'12), June 2012.

This paper presents a vision in which mobile devices become a core component of mobile cloud-computing architectures. Mobile devices can be capable of forming mobile clouds themselves, which is in contrast with the current ideas on empowering mobile devices with the capabilities of stationary resources residing in large data centers. The new vision, "mClouds", allows mobile devices to accomplish tasks locally without backend communications with remote resources.

T. Xing, D. Huang, S. Ata, and D. Medhi, “MobiCloud: a Geo-distributed Mobile Cloud Computing Platform,” in Proc. International Conference on Network and Service Management (CNSM), October 2012.

The paper describes a general service and resource-provisioning platform, "MobiCloud", designated for mobile devices. MobiCloud is a geo-distributed mobile cloud-computing platform made up of system components, infrastructure resources, and software services. The MobiCloud platform allows users to migrate their hitherto locally processed workloads onto the cloud with more resources for better performance and functionality.

Topic: Cloud Service-Level Agreements

K. Ravindran, “Model-based Engineering Methods for Certification of Cloud-based Network Systems,” in Proc. IEEE International Conference on Communication Systems and Networks (COMSNETS), January 2013.

This paper presents model-based engineering techniques to assess how well a cloud-based application system meets the QoS objectives under uncontrolled external environment conditions. In the model-based dependability assessment approach for a cloud system, QoS, timeliness, and fault-tolerance attributes are considered.

C. Chen, P. Maniatis, A. Perrig, A. Vasudevan, and V. Sekar, “Towards Verifiable Resource Accounting for Outsourced Computation,” in Proc. ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE '13), March 2013.

The paper describes a system management tool, ALIBI, that offers verifiable resource accounting for computations outsourced to a cloud. The absence of verifiable mechanisms in current systems leads to an undesirable mutual mistrust between service providers and consumers. ALIBI places a trusted reference monitor underneath the service provider’s software platform to observe the resource allocation to customer guest VMs. The tool tracks the guest VM's memory use and CPU-cycle consumption.

K. Stamou, J.H. Morin, B. Gateau, and J. Aubert, “Service Level Agreements as a Service - Towards Security Risks Aware SLA Management,” in Proc. International Conference on Cloud Computing and Services Science (CLOSER), April 2012.

The paper advocates that SLAs for cloud services should be more customer-oriented and aware of security and risk management. This is in contrast with current practices, where cloud SLAs provide limited support outside of basic Quality of Service (QoS) parameters. The paper suggests a design that decouples the SLA formulation process from the actual cloud service provisioning. The idea of "SLA as a Service in itself" offers customers the flexibility of more customized and fine-grained agreements than the current methods.

N. H. Shirazi, A.S. Filho, and D. Hutchison, “Service Level Agreement Monitoring for Resilience in Computer Networks,” in Proc. Annual Postgraduate Symposium on the Convergence of Telecommunications, Networking and Broadcasting, June 2012.

The paper describes methods to investigate the resilience and security requirements of network service providers that have been defined in a Service-level Agreement (SLA). The goal is to have the service providers use these methods to check SLA compliance before delivering a service, and thus increase the service-level performance and quality. The methods are useful in today's cloud environments, which are based on a service-based network abstraction (in contrast with the traditional resource-based view).

K. Ravindran, “QoS Auditing for Evaluation of SLA in Cloud-based Distributed Services,” in Proc. IEEE World Congress on Services, June/July 2013.

The paper describes methods for QoS auditing under various security threats and resource depletions faced by applications running on a cloud-based distributed service. Given the less-than-100% trust between various sub-systems that an application is composed of, the paper advocates a probabilistic analysis of application behavior relative to the negotiated service guarantees.

Q. Huang, L. Ye, X. Liu, and X. Du, “Auditing CPU Performance in Public Cloud,” in Proc. IEEE World Congress on Services, June/July 2013.

The paper describes measurement tools to verify the CPU speed of VMs leased from a semi-trusted cloud relative to a negotiated SLA (service-level agreement). It describes experiments with a measurement algorithm that can detect cloud cheating on CPU speed (i.e., SLA violations) in a stealthy way.

D. Breitgand, Z. Dubitzky, A. Epstein, A. Glikson, and I. Shapira, “SLA-aware Resource Over-Commit in an IaaS Cloud,” in Proc. IEEE/IFIP International Conference on Network and Service Management (CNSM'12), October 2012.

The paper describes a model for resource over-commit in clouds, also known as statistical sharing of cloud resources. It discusses the tradeoff between a higher resource utilization by increasing the over-commit ratio and a risk of resource congestion. An SLA to express the probability of launching a VM to support an expanded workload is proposed. The paper describes an algorithmic framework to estimate the total physical capacity required for SLA compliance under over-commit.

A. Chilwan, “Dependability Differentiation in Cloud Services,” M.S. Thesis, Dept. of Telematics, Norwegian University of Science and Technology, July 2011.

The thesis discusses cloud-provisioning strategies to offer different levels of service dependability for users based on the SLAs between them. The focus is on a key dependability attribute for clouds; namely, service availability as perceived by a cloud user. The work develops analytical models for differentiating cloud availability by replicating Virtual Machines (VMs), using diverse replication schemes. It discusses the resource and financial aspects of cloud services to achieve dependability differentiation.
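To make the link between replication and availability concrete, the sketch below uses the textbook parallel-redundancy formula, in which a service is up as long as at least one replica is up. This is not the thesis's analytical model: it assumes independent VM failures and identical replicas, and the function names are hypothetical.

```python
def replicated_availability(vm_availability, replicas):
    """Availability of a service that survives while any replica is up.

    Assumes replicas fail independently (an assumption of this sketch).
    """
    if not 0.0 < vm_availability <= 1.0:
        raise ValueError("vm_availability must be in (0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at once.
    return 1.0 - (1.0 - vm_availability) ** replicas


def replicas_needed(vm_availability, target):
    """Smallest replica count whose availability meets the target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 1
    while replicated_availability(vm_availability, n) < target:
        n += 1
    return n
```

Different replica counts then correspond to different dependability levels that a provider could offer at different prices, which is the kind of differentiation the thesis studies with more detailed models.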

C. Wang, Q. Wang, K. Ren, N. Cao, and W. Lou, “Towards Secure and Dependable Storage Services in Cloud Computing,” IEEE Transactions on Services Computing, vol. 5, no. 2, pp. 220-232, April 2012.

The paper studies a distributed storage integrity auditing mechanism for an environment where users remotely store their data on a cloud. With users relinquishing the physical possession of their outsourced data, the audit mechanism allows users to obtain a strong storage correctness guarantee and achieve fast error localization.

Topic: Quality of Service (QoS) in the Cloud

C.C. Lamb, P.A. Jamkhedkar, G.L. Heileman, and C.T. Abdallah, “Managed Control of Composite Cloud Systems,” in Proc. International Conference on System of Systems Engineering (SoSE), June 2011.

The paper describes a functionality that enables users to easily configure and manage cloud infrastructure resources at a service level. The paper suggests an automated measurement of QoS metrics, and then using the collected metrics within control loops to manage and provision cloud resources.

T. Wood, E. Cecchet, K.K. Ramakrishnan, P. Shenoy, J.V. Merwe, and A. Venkataramani, “Disaster Recovery as a Cloud Service: Economic Benefits & Deployment Challenges,” in Proc. USENIX Conference on Hot Topics in Cloud Computing (HotCloud'10), June 2010.

The paper describes the issues and challenges in providing Disaster Recovery (DR) as a service offered by clouds. The paper provides DR solutions on automated virtual platforms that incur lower costs while minimizing the recovery time after a failure. This is in contrast with current DR services that come at very high cost and offer only weak guarantees on data loss and the restart time after a failure.

S. Choy, B. Wong, G. Simon, and C. Rosenberg, “The Brewing Storm in Cloud Gaming: A Measurement Study on Cloud to End-User Latency,” in Proc. Annual Workshop on Network and Systems Support for Games (NetGames), November 2012.

The paper examines the technical issues in supporting large gaming applications over clouds. The feasibility of cloud gaming rests on the ability of cloud infrastructure to meet the strict latency requirements for acceptable perception-quality in a fast multi-player interactive setting. The paper reports a measurement study to assess the performance aspects of a cloud-hosted game application; namely, the number of players, game servers, and network bandwidth.

A. Klein, F. Ishikawa, and S. Honiden, “Towards Network-aware Service Composition in the Cloud,” in Proc. ACM Conference on Worldwide Web (WWW 2012), April 2012.

The paper suggests methods to select an optimal set of services for a composition, in terms of QoS, from a large pool of functionally equivalent services. The work is relevant in Cloud Computing where both the number of distinct services and their distribution across the network are on the rise, increasing the impact of the network on the quality of a composition. The approach distinguishes between the QoS of services themselves and the QoS of the network, which allows the estimated latency to match closely with the actual latency, resulting in near-optimal service compositions in the cloud.

T. Hoßfeld, R. Schatz, M. Varela, and C. Timmerer, “Challenges of QoE Management for Cloud Applications,” IEEE Communications Magazine, vol. 50, no. 4, pp. 28-36, April 2012.

The paper makes a case that quality of experience as perceived by users (QoE) will be a guiding paradigm for cloud management. With more personal and business applications migrating to the cloud, the service-level quality that strongly impacts the QoE becomes the key differentiator between providers. The paper discusses the technical challenges in shifting services to the cloud, from a standpoint of how this shift impacts QoE management. Multimedia cloud applications are discussed as a key driver.

T. Wood, H.A. Lagar-Cavilla, K.K. Ramakrishnan, P. Shenoy, and J.V. Merwe, “PipeCloud: Using Causality to Overcome Speed-of-Light Delays in Cloud-Based Disaster Recovery,” in Proc. ACM Symposium on Cloud Computing (SOCC'11), October 2011.

This paper describes methods for cloud-based disaster recovery (DR). The methods are based on pipelined synchronous replication of computation/data to recover an application to the point of crash. The economies of scale and on-demand provisioning enabled by Cloud hosting allow meeting the infrequent, yet urgent, needs of DR. The disaster failover methods are demonstrated on the Amazon EC2 platform.

Topic: Cloud Performance Management

F. Wuhib, R. Stadler, and H. Lindgren, “Dynamic Resource Allocation with Management Objectives -- Implementation for an OpenStack Cloud,” in Proc. IEEE/IFIP International Conference on Network and Service Management (CNSM'12), pp. 309-315, October 22-26, 2012.

The paper reports the design, implementation and evaluation of a resource management system that builds upon OpenStack, an open-source cloud platform for private and public clouds. The design supports a broad set of management objectives among which the system can switch at runtime. The management objectives related to load-balancing and energy efficiency are mapped onto the controllers of a resource allocation subsystem, which may include live VM migration.

G. Jung, K.R. Joshi, M.A. Hiltunen, R.D. Schlichting, and C. Pu, “Performance and Availability Aware Regeneration For Cloud Based Multitier Applications,” in Proc. IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), June 2010.

The paper examines how the cloud technology that allows easier migration, replication, and allocation of VM resources can be leveraged to provide high system availability while maintaining performance. A key idea is the dynamic redundancy by regeneration of software components whenever failures occur (in contrast with existing solutions based on a fixed redundancy level). This is augmented with a smart control of component placement and resource allocation (based on application-layer information) to minimize performance degradation.

A. Li, X. Yang, S. Kandula, and M. Zhang, “CloudCmp: Comparing Public Cloud Providers,” in Proc. ACM SIGCOMM Conference on Internet Measurement (IMC), November 2010.

The paper describes the methods to compare different cloud providers in the face of their varying approaches to the infrastructure, virtualization, and software services. The authors develop a measurement tool, CloudCmp, which compares the performance and cost of cloud providers using a common set of metrics. The tool is useful for customers to select the best-performing providers for their applications.

N.R. Herbst, S. Kounev, and R. Reussner, “Elasticity in Cloud Computing: What It Is, and What It Is Not,” in Proc. International Conference on Autonomic Computing (ICAC), June 2013.

The paper gives a concrete definition of the term "elasticity", which is heavily used in the context of cloud computing. It suggests representative metrics of elasticity, and a benchmarking methodology to capture them, to enable the comparison of different cloud systems. It also separates the elasticity notion from the other well-known system attributes such as scalability and efficiency.

R. Krebs, C. Momm, and S. Kounev, “Metrics and Techniques for Quantifying Performance Isolation in Cloud Environments,” in Proc. ACM SIGSOFT Conference on Quality of Software Architectures (QoSA), June 2012.

The paper presents different types of metrics to quantify the level of performance isolation of cloud-based systems when resources are shared among multiple services. The metrics are useful for the cloud providers to make different service offerings that meet the performance isolation capabilities needed by applications. The metrics are measurable externally by running benchmarks from the outside, treating the cloud as a black box.

Topic: Cloud Security

A. Srivastava, H. Raj, J. Giffin, and P. England, “Trusted VM Snapshots in Untrusted Cloud Infrastructures,” in Proc. International Conference on Research in Attacks, Intrusions, and Defenses (RAID), September 2012.

The paper discusses the verifiability of trust bestowed on a cloud infrastructure from the customer side. The creation of customer trust requires integrity measurements of the leased VMs at various run-time points: i.e., “snapshots” of VM states. To ensure the trustworthiness of the snapshots themselves, the paper advocates against the common methods of employing privileged VMs entrusted with the snapshot generation task. The paper instead suggests a hypervisor-based measurement tool whose integrity cannot be compromised by a rogue privileged VM or its administrators.

T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, “Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,” in Proc. ACM Conference on Computer and Communications Security (CCS), November 2009.

The paper describes how the business model of cloud infrastructure resource sharing across multiple applications exposes new vulnerabilities. The paper shows that it is possible to map the internal cloud infrastructure, and identify therein where a particular target VM is likely to reside. Such a vulnerability, when exploited, can manifest in cross-VM side-channel attacks for information snooping on a target VM. The paper reports a study of potential vulnerabilities using the Amazon EC2 service as a case study.