Cost-Aware Big Data Processing Across-2
GUIDE
Mr.M.Krishnaraj B.E, M.Tech.,
Assistant Professor
Department of Information Technology,
Panimalar Institute of Technology.
HARDWARE REQUIREMENTS
Hard Disk : 80GB and Above
RAM : 4GB and Above
Processor : P IV and Above
TECHNOLOGY USED
• J2EE
• Cloud computing
• Framework: Apache
ARCHITECTURE
DATA-FLOW DIAGRAM
LEVEL 0
DATA-FLOW DIAGRAM
LEVEL 1
DATA-FLOW DIAGRAM
LEVEL 2
USE CASE DIAGRAM
COLLABORATION DIAGRAM
ACTIVITY DIAGRAM
Coding Part
ADMIN LOGIN
USER CREATION
USER LOGIN
UPLOAD DATA
LINK GENERATED
LOGOUT
CONCLUSION
• The proposed approach is expected to find wide
application in companies that serve customers globally,
since analyzing geographically dispersed datasets is an
efficient way to support their marketing decisions.
• Because each subproblem in the MiniBDP algorithm has
an analytical or efficiently computable solution, the
algorithm can run in an online manner, so the proposed
approach can be readily implemented in a real system
to reduce operating cost.
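To make the idea of a cost-aware decision taken online (one time slot at a time) more concrete, the sketch below picks, for a single slot, the datacenter where aggregating and processing all data is cheapest. It is only an illustration and is not the MiniBDP algorithm itself: the class name CostAwarePlacement, the methods costAt and cheapestDatacenter, the greedy rule, and all prices are assumptions introduced here, and the sketch assumes every datacenter is also a data source.

```java
/**
 * Illustrative sketch only: a greedy one-slot placement rule,
 * NOT the MiniBDP algorithm. All names and prices are hypothetical.
 * Assumes datacenter i is also data source i.
 */
final class CostAwarePlacement {

    /** Cost of moving every source's data to the target and processing it there. */
    static double costAt(int target, double[] dataGb,
                         double[][] transferPricePerGb, double[] computePricePerGb) {
        double totalGb = 0.0;
        double transferCost = 0.0;
        for (int src = 0; src < dataGb.length; src++) {
            totalGb += dataGb[src];
            if (src != target) {
                // Data already at the target incurs no transfer cost.
                transferCost += dataGb[src] * transferPricePerGb[src][target];
            }
        }
        return transferCost + totalGb * computePricePerGb[target];
    }

    /** Index of the datacenter with the lowest total cost for this slot. */
    static int cheapestDatacenter(double[] dataGb,
                                  double[][] transferPricePerGb, double[] computePricePerGb) {
        int best = 0;
        double bestCost = costAt(0, dataGb, transferPricePerGb, computePricePerGb);
        for (int d = 1; d < computePricePerGb.length; d++) {
            double cost = costAt(d, dataGb, transferPricePerGb, computePricePerGb);
            if (cost < bestCost) {
                best = d;
                bestCost = cost;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical slot: GB generated at each site and per-GB prices.
        double[] dataGb = {10.0, 2.0, 1.0};
        double[][] transferPricePerGb = {
            {0.00, 0.05, 0.05},
            {0.05, 0.00, 0.05},
            {0.05, 0.05, 0.00}
        };
        double[] computePricePerGb = {0.02, 0.01, 0.015};

        int best = cheapestDatacenter(dataGb, transferPricePerGb, computePricePerGb);
        System.out.printf("Process this slot at datacenter %d (cost %.2f)%n",
                best, costAt(best, dataGb, transferPricePerGb, computePricePerGb));
    }
}
```

Calling cheapestDatacenter again with the next slot's data volumes and prices gives the online behaviour in its simplest form. A real deployment would also have to account for queue backlogs, capacity limits, and splitting work across several datacenters, which this sketch ignores.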
Future Work
• Deploying the proposed algorithm in real
systems such as Amazon EC2.
• Accounting for data replication in the cost
minimization, since replication adds the cost
of copying data across datacenters.
• Extending the original model to support other
types of jobs, such as astronomical image
processing.
THE END