Cloud computing provides massive computation power and storage capacity, which enable users to deploy computation- and data-intensive applications without infrastructure investment. During the processing of such applications, a large volume of intermediate datasets is generated, and these are often stored to save the cost of recomputing them. In this paper, aiming at the minimum cost benchmark for cost-effectively storing large volumes of generated application datasets in the cloud, we propose a novel, highly cost-effective and practical storage strategy that automatically decides at runtime whether a generated dataset should be stored in the cloud. To protect the sensitive datasets among those that are stored, we further propose a novel approach based on an upper-bound constraint on privacy leakage that identifies which intermediate datasets need to be encrypted and which do not, so that the privacy-preserving cost can be reduced while the privacy requirements of data holders are still satisfied.
Dataset storage, computation, cloud computing, data storage privacy, privacy preserving, intermediate dataset, privacy upper bound.