COLLECTION ANALYSIS FOR CLOUD USERS
ABHISHEK MEHTA 17BCE002
& ANUJ SANGHAVI
DEPARTMENT OF COMPUTER ENGINEERING Ahmedabad 382481
Mini Project - I
Submitted in fulfillment of the requirements
For the degree of
Bachelor of Technology in Computer Engineering
Guided By PROF. VIVEK PRASAD
This is to certify that the project entitled “COLLECTION ANALYSIS FOR CLOUD USERS” submitted by ABHISHEK MEHTA (17BCE002) & ANUJ SANGHAVI (17BCE005), towards the partial fulfillment of the requirements for the degree of Bachelor of Technology in Computer Engineering of Nirma University is the record of work carried out by them under my supervision and guidance. In my opinion, the submitted work has reached a level required for being accepted for examination.
Prof. Vivek Kumar Prasad
Assistant Professor
Department of Computer Engineering,
Institute of Technology,
Nirma University,
Ahmedabad

Dr. Madhuri Bhavsar
HOD, Dept. of Computer Engineering
Institute of Technology,
Nirma University,
Ahmedabad
We would like to express our deepest appreciation to all those who helped us complete this report. We gratefully acknowledge the support of Prof. Vivek Kumar Prasad, under whose guidance we were able to complete the task in the given period of time. We also appreciate the constructive suggestions given by our friends, which improved the content of the report.
This report discusses cloud computing and several of its aspects, such as its features and types. It addresses the problem of cloud resource management and allocation by combining cloud computing with analytics. It then discusses techniques such as ANN, SVM and random forest for prediction in cloud resource management.
1 Introduction
1.1 General
1.2 Scope of Study
2 Literature Survey
2.1 General
3 Cloud Computing
3.1 General
3.2 Features of Cloud Computing
3.3 Types of Cloud Computing
3.4 Types of Clouds
3.5 Pros and Cons
4 Analytics
4.1 Applications of Analytics
5 Cloud Resource Management
6 Linear Regression
7 ANN
7.1 Working of ANN
7.2 Activation Function
8 SVM
9 Random Forests
10 Result
11 Summary and Conclusions
11.1 Summary
11.2 Conclusion
12 References
13 Appendix
1 Introduction
1.1 General Cloud computing uses servers and networks to allocate resources dynamically to remote end users. As the number of users grows, so does the number of requests, which calls for proper and efficient allocation of finite resources. To cope with the large number of resource requests, we use scheduling algorithms that allocate resources efficiently. The prime motive is to reduce cost and to improve reliability and availability. We compare different scheduling algorithms on the basis of their efficiency, working, cost and feasibility of deployment in a cloud computing environment.
1.2 Scope of Study Our aim is to manage the available resources efficiently in order to solve problems related to price, availability and overloading. An optimized model is required to solve these issues, so our motive is to use machine-learning and deep-learning techniques to construct such a model. Deep learning mainly focuses on learning data representations in a way that resembles how humans learn. We make use of Linear Regression, ANN, SVM and Random Forest algorithms.
2 Literature Survey
2.1 General The literature survey draws mainly on abstracts from books and research papers. Papers that gave insight into resource management using cloud computing and deep-learning algorithms were chosen in accordance with the topics to be covered in the project.
3 Cloud Computing
3.1 General Cloud computing means that the computer hardware and software we use, instead of sitting on our desktop, is provided to us as a service by another company and accessed over the Internet. Exactly where the hardware and software are located and how it all works does not matter to the user; it is just somewhere up in the "Cloud" that the Internet represents.
3.2 Features of Cloud Computing Managed: the user no longer needs to worry about how the service being used is provided. The user simply concentrates on the job at hand and leaves the problem of providing reliable computing to someone else.
On demand: the service follows a pay-as-you-use model, in which the user typically buys only the required amount of cloud services and adjusts the amount used as requirements change.
3.3 Types of Cloud Computing Infrastructure as a Service (IaaS) means providing the user with computing infrastructure such as virtual machines. Software as a Service (SaaS) means the user uses a complete application running on someone else's system. Platform as a Service (PaaS) means the user develops applications using Web-based tools so that they run on systems software and hardware provided by another company.
3.4 Types of Clouds
Private clouds serve a single company and are managed either exclusively by that company or by one of the big cloud providers on its behalf. Public clouds are provided by companies such as Amazon, Google, and IBM; all users share space and time on the same cloud and access it the same way. A hybrid cloud combines the best of both worlds, public and private.
3.5 Pros and Cons
Pros:
• Lowers upfront costs and reduces infrastructure costs.
• The user pays only for what is used.
• Overall environmental benefit.
Cons:
• Greater dependency on service providers.
• Potential privacy and security risks for valuable data stored in the cloud.
• Dependency on a reliable Internet connection.
4 Analytics Analytics is the process of finding patterns in data and applying those patterns to make effective decisions. It can be understood as the connector between data and effective decision making.
4.1 Applications of Analytics
Cloud Analytics: Cloud analytics can be defined as analysis performed using cloud computing. It uses tools and techniques to help companies extract information from massive data and present it in a way that is easily categorised and readily available via a web browser.
Software Analytics: Software analytics is the process of collecting information about the way a piece of software is used and produced.
Digital Analytics: Digital analytics is an activity that transforms digital data into recommendations and optimizations. For example, the keywords that users search for are tracked, and that data is used for marketing purposes. A growing number of brands and marketing firms rely on digital analytics for their digital marketing work.
5 Cloud Resource Management Cloud resource management is a challenging job, since the complexity of the system makes it impossible to have accurate global state information. It is also affected by unpredictable interactions with the environment, e.g., system failures and attacks. Cloud service providers face large, fluctuating loads, which pose a problem for cloud elasticity. Cloud resource management can be optimized using machine-learning and deep-learning techniques. Techniques such as random forest, ANN and SVM are discussed here.
6 Linear Regression
Linear regression is a basic and commonly used type of predictive analysis. The overall idea of regression is to examine two things: (1) Does a set of predictor variables do a good job of predicting an outcome (dependent) variable? (2) Which variables in particular are significant predictors of the outcome variable, and in what way do they, as indicated by the magnitude and sign of the beta estimates, impact the outcome variable? These regression estimates are used to explain the relationship between one dependent variable and one or more independent variables. The simplest form of the regression equation, with one dependent and one independent variable, is defined by the formula y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable.
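The formula y = c + b*x can be fitted to data by ordinary least squares. The following is a minimal sketch under stated assumptions: the function name `fit_line` and the sample data (CPU cores needed against number of active users) are hypothetical and are not taken from the project.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = c + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # c makes the fitted line pass through the point of means.
    c = mean_y - b * mean_x
    return c, b


# Hypothetical data: active users and CPU cores required.
users = [10, 20, 30, 40, 50]
cores = [3.0, 5.0, 7.0, 9.0, 11.0]

c, b = fit_line(users, cores)
predicted = c + b * 60  # estimated cores needed for 60 users
print(c, b, predicted)
```

Because the sample data lie exactly on a line, the fitted constant is about 1 and the coefficient about 0.2; with real measurements the line would only approximate the data.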
7 ANN
An ANN (artificial neural network) is modelled on the two basic building blocks of biological neural nets: neurons and synapses. The structure of a node is shown as follows.
7.1 Working of ANN Data flows strictly in one direction. It is fed into the input layer, passed through the hidden layer, where an activation function is applied, and finally transferred to the output layer.
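The one-directional flow described above can be sketched as a small forward pass. This is only an illustration: the layer sizes, the weights, the choice of a sigmoid in the hidden layer and a plain weighted sum at the output are all assumptions, not the network used in the project.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases, activation=None):
    """One fully connected layer: a weighted sum per neuron, then activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    # Input layer -> hidden layer (activation applied) -> output layer.
    hidden = layer(inputs, hidden_w, hidden_b, activation=sigmoid)
    return layer(hidden, out_w, out_b)


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.5, -0.5], [0.25, 0.75]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, -1.0]]
out_b = [0.0]

output = forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
print(output)
```

Nothing computed in a later layer is fed back to an earlier one, which is what makes this a feed-forward network.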
7.2 Activation Function
An activation function converts the knowledge gained by the neuron into a meaningful quantity, which is then propagated to other neurons in the network. The sigmoid function, given below, ranges between 0 and 1 and is widely used where a probabilistic interpretation is needed.
f(x) = 1 / (1 + exp(-x))
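A short sketch of this activation function follows. The two-branch form is an implementation choice made here to avoid overflow for large negative inputs; it computes the same value as the formula above.

```python
import math


def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(-x) can overflow, so use the equivalent
    # form exp(x) / (1 + exp(x)).
    e = math.exp(x)
    return e / (1.0 + e)


values = [sigmoid(x) for x in (-1000.0, -2.0, 0.0, 2.0, 1000.0)]
print(values)
```

The outputs increase with x, stay within the range 0 to 1, and equal 0.5 at x = 0, which is why the value is often read as a probability.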
8 SVM
SVMs (support vector machines) are supervised learning techniques used to analyse data for classification and regression. SVMs work by separating data points in space with the hyperplane that has the maximum margin. An SVM thus finds the optimal hyperplane, which is then used to predict other data points.
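The idea of a separating hyperplane can be illustrated with a small linear classifier. The sketch below is an assumption-laden simplification: it trains a soft-margin linear SVM by subgradient descent on the regularised hinge loss rather than by the quadratic-programming solvers that SVM libraries normally use, and the data and parameter values are hypothetical.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable 2-D data with labels +1 and -1.
points = [(2.0, 2.0), (3.0, 1.0), (1.0, 3.0),
          (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print(w, b, predict(w, b, (2.5, 2.0)), predict(w, b, (-1.0, -2.0)))
```

Points whose margin is at least 1 contribute nothing to the update, so the final hyperplane is determined by the points closest to the boundary.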
9 Random Forests
Random forest is a supervised learning algorithm that uses decision trees for regression and classification. It often gives good results even without hyperparameter tuning. It builds many decision trees and combines their outputs for a more accurate prediction. Random forests generally perform well in terms of accuracy, and their training time is typically much less than that of neural networks.
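The way a forest combines many trees can be sketched in a few lines. To stay short, this illustration uses depth-1 trees (decision stumps), each trained on a bootstrap sample and one randomly chosen feature, and combines them by majority vote; a full random forest grows deeper trees. The data (CPU and memory utilisation mapped to an "overloaded" label) and all names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature (a depth-1 tree)."""
    values = sorted(set(r[feature] for r in rows))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # only one distinct value: predict a constant
        label = majority(labels)
        return (feature, 0.0, label, label)
    return (feature,) + best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) rows with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        # A random feature per tree keeps the trees from being identical.
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical data: (CPU %, memory %) -> 1 if the host is overloaded.
rows = [(10, 20), (20, 15), (15, 30), (30, 25),
        (70, 80), (80, 75), (90, 85), (75, 90)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict(forest, (25, 25)), predict(forest, (85, 80)))
```

Training each tree is cheap and independent of the others, which is one reason forests tend to train quickly.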
10 Result
The epoch versus loss graph is given above. This graph makes it evident that the loss drops as the number of epochs increases.
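The general pattern of loss falling with epochs can be reproduced on a toy problem. The sketch below does not reproduce the project's model or its graph; it assumes a simple linear model y = c + b*x trained by full-batch gradient descent on hypothetical data, and records the mean squared error at each epoch.

```python
def train(xs, ys, lr=0.05, epochs=50):
    """Gradient descent on y = c + b*x; returns the loss at every epoch."""
    c, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(epochs):
        errors = [(c + b * x) - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)  # mean squared error
        # Gradients of the mean squared error with respect to c and b.
        grad_c = 2.0 * sum(errors) / n
        grad_b = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        c -= lr * grad_c
        b -= lr * grad_b
    return losses


# Hypothetical measurements, roughly y = 1 + 2x with noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

losses = train(xs, ys)
for epoch in (0, 9, 49):
    print(epoch + 1, losses[epoch])
```

With a sufficiently small learning rate the loss falls at every epoch and flattens out as the parameters approach their best values, which is the shape such a graph typically shows.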
11 Summary and Conclusions
11.1 Summary: First, we discussed what cloud computing is, along with its features and types. We then discussed analytics and combined analytics with cloud computing to address the problem of cloud resource management and allocation. Finally, we discussed techniques such as ANN, SVM and random forest for prediction in cloud resource management.
11.2 Conclusion: Cloud computing is currently one of the most popular technologies, but it has some flaws, such as inefficient resource allocation. Machine-learning and deep-learning techniques such as ANN, SVM and Random Forest offer a solution to this problem.
12 References
1. Z. Xiao, W. Song, and Q. Chen, “Dynamic Resource Allocation Using Virtual Machines for Cloud Computing Environment,” IEEE Trans. Parallel and Distributed Systems, vol. 24, no. 6, June 2013.
2. L. Siegele, “Let It Rise: A Special Report on Corporate IT,” The Economist, vol. 389, pp. 3-16, Oct. 2008.
3. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the Art of Virtualization,” Proc. ACM Symp. Operating Systems Principles (SOSP ’03), Oct. 2003.
4. “Amazon elastic compute cloud (Amazon EC2),” 2012.
5. C. Clark, K. Fraser, S. Hand, J.G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, “Live Migration of Virtual Machines,” Proc. Symp. Networked Systems Design and Implementation (NSDI ’05), May 2005.
13 Appendix
1. https://azure.microsoft.com/en-in/overview/what-is-cloud-computing/
2. https://intellipaat.com/blog/what-is-data-analytics/
3. http://www.cs.jhu.edu/~terzis/twotier.html
4. https://towardsdatascience.com/the-random-forest-algorithm-d457d499ffcd
5. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/