Data deduplication is an important data compression technique that eliminates duplicate copies of repeating data, and it is widely used in cloud storage to reduce storage space and save bandwidth. To protect the confidentiality of sensitive data while still supporting deduplication, the convergent encryption technique has been proposed to encrypt data before outsourcing. To better protect data security, this project makes the first attempt to formally address the problem of authorized data deduplication. Unlike traditional deduplication systems, the duplicate check considers the differential privileges of users in addition to the data itself. Several new deduplication constructions supporting authorized duplicate check in a cloud architecture are proposed. Security analysis demonstrates that the proposed scheme is secure in terms of the definitions specified in the proposed security model. The project also aims to improve the performance of primary storage systems and to minimize the performance overhead of deduplication.
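To make the two ideas named in the abstract concrete, the following is a minimal sketch, not the construction proposed in this work. It assumes that the convergent key is the SHA-256 hash of the data, and that a privilege-aware duplicate-check token can be modeled as an HMAC of the data hash under a per-privilege key. The function names (`convergent_key`, `encrypt`, `duplicate_token`) and the hash-based keystream are hypothetical stand-ins chosen for illustration; a real system would use a vetted cipher such as AES.

```python
import hashlib
import hmac


def convergent_key(data: bytes) -> bytes:
    """Derive the key from the data itself (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def _keystream(key: bytes, length: int) -> bytes:
    """Toy deterministic keystream built from SHA-256 over a counter.

    Illustration only: it stands in for a real cipher and is not
    meant to be used for actual protection of data.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def duplicate_token(privilege_key: bytes, data: bytes) -> bytes:
    """Hypothetical privilege-aware duplicate-check token.

    Users holding the same privilege key produce the same token for the
    same data; users with a different privilege key do not.
    """
    digest = hashlib.sha256(data).digest()
    return hmac.new(privilege_key, digest, hashlib.sha256).digest()


if __name__ == "__main__":
    doc = b"quarterly report"
    key = convergent_key(doc)
    ciphertext = encrypt(key, doc)
    # Identical plaintexts yield identical ciphertexts, so the storage
    # server can deduplicate without seeing the plaintext.
    print(ciphertext == encrypt(convergent_key(doc), doc))
    # Tokens match only when both the data and the privilege key match.
    print(duplicate_token(b"staff", doc) == duplicate_token(b"admin", doc))
```

Because the key depends only on the content, two users who upload the same file independently produce the same ciphertext, which is what allows deduplication over encrypted data. Binding the token to a privilege key restricts a successful duplicate check to users who hold that privilege.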
I/O Deduplication, Data Redundancy, Performance, Storage Capacity