















This is a machine learning project that checks the quality of red wine.
















7th SEM. DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
The work embodied in the training report entitled "Red Wine Quality", submitted to the Department of Computer Science and Engineering at UIET, Panjab University Swami Sarvanand Giri Regional Centre, Hoshiarpur, for the award of the degree of Bachelor of Engineering, has been done by me. The training report is entirely based on my own work and has not been submitted elsewhere for the award of any other degree. All ideas and references have been duly acknowledged.
Deepshikha
Countersigned by: (Supervisor)
This project is based on the application of Logistic Regression to check the quality of red wine. It trains a model on the data, predicts the output, and reports the accuracy of the predictions. The dataset for the model was downloaded from www.kaggle.com. Our project includes
S.No. Title
- 1 Declaration
- 2 Acknowledgement
- 3 Abstract

Anaconda Navigator is a desktop graphical user interface (GUI) included in Anaconda® distribution that allows you to launch applications and easily manage conda packages, environments, and channels without using command-line commands. Navigator can search for packages on Anaconda Cloud or in a local Anaconda Repository. It is available for Windows, macOS, and Linux. To get Navigator, get the Navigator Cheat Sheet and install Anaconda. The Getting started with Navigator section shows how to start Navigator from the shortcuts or from a terminal window.
In order to run, many scientific packages depend on specific versions of other packages. Data scientists often use multiple versions of many packages and use multiple environments to separate these different versions.
The command-line program conda is both a package manager and an environment manager. This helps data scientists ensure that each version of each package has all the dependencies it requires and works correctly. Navigator is an easy, point-and-click way to work with packages and environments without needing to type conda commands in a terminal window. You can use it to find the packages you want, install them in an environment, run the packages, and update them – all inside Navigator.
The following applications are available by default in Navigator:
- JupyterLab
- Jupyter Notebook
- Spyder
- VSCode
- Glueviz
- Orange 3 App
Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library. Most Python implementations (including CPython) include a read–eval–print loop (REPL), permitting them to function as a command line interpreter for which the user enters statements sequentially and receives results immediately. Other shells, including IDLE and IPython, add further abilities such as auto-completion, session state retention and syntax highlighting. As well as standard desktop integrated development environments, there are Web browser-based IDEs; SageMath (intended for developing science and math-related Python programs); PythonAnywhere, a browser-based IDE and hosting environment; and Canopy IDE, a commercial Python IDE emphasizing scientific computing.
What is Unsupervised Learning?
The model learns through observation and finds structure in the data. Given a dataset, it automatically finds patterns and relationships by grouping the data into clusters. What it cannot do is label the clusters: it cannot say that one group is apples and another is mangoes, but it will separate the apples from the mangoes. Suppose we present images of apples, bananas and mangoes to the model. Based on patterns and relationships in the images, it creates clusters and divides the dataset among them. When new data is fed to the model, it is added to one of the existing clusters.

What is Reinforcement Learning?
It is the ability of an agent to interact with an environment and find out which actions lead to the best outcome. It follows the trial-and-error ("hit and trial") method. The agent is rewarded or penalized with a point for each correct or wrong answer, and the model trains itself on the basis of the reward points it gains. Once trained, it is ready to make predictions on new data presented to it.
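The clustering idea described under unsupervised learning can be illustrated with a small sketch. This is not code from the project: it is a minimal k-means example in plain Python, and the function names (`kmeans`, `nearest_centroid`), the made-up 2-D sample points, and the choice of starting the centroids at the first `k` points are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Group unlabeled 2-D points into k clusters (basic k-means)."""
    # Assumption for this sketch: start from the first k points.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, clusters


# Made-up, unlabeled sample points forming two visibly separate groups.
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

centroids, clusters = kmeans(samples, k=2)
print(clusters)

# A new point is simply added to the nearest existing cluster.
new_cluster = nearest_centroid((7.5, 8.5), centroids)
print(new_cluster)
```

The model separates the two groups without ever being told what they are, and a new point is placed in whichever existing cluster it is closest to.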
What is Regression?
Regression analysis is a powerful statistical analysis technique. One or more independent variables in a data-set are used to predict the value of a dependent variable of interest. We come across regression in an intuitive way all the time, for example when predicting the weather from a data-set of past weather conditions. It uses many techniques to analyse and predict the outcome, but the emphasis is mainly on the relationship between the dependent variable and one or more independent variables. Logistic regression analysis predicts the outcome as a binary variable, which has only two possible outcomes.

Logistic Regression In Python
It is a technique for analysing a data-set that has a dependent variable and one or more independent variables, and for predicting the outcome as a binary variable, meaning it will have only two outcomes. The dependent variable is categorical in nature. The dependent variable is also referred to as the target variable, and the independent variables are called the predictors.
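To make the idea of a binary outcome concrete, here is a minimal sketch of logistic regression trained by gradient descent in plain Python. It is not the project's code, which is not shown in this section: the helper names (`sigmoid`, `train_logistic`, `predict`), the learning rate, the number of epochs, and the tiny one-predictor data-set are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # predicted probability minus target
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(features, weights, bias, threshold=0.5):
    """Return the binary outcome (0 or 1) for one row of predictors."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= threshold else 0


# Made-up data: one predictor, binary target variable.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(X, y)
predictions = [predict(row, weights, bias) for row in X]
print(predictions)
```

The sigmoid turns the weighted sum of the predictors into a probability, and the threshold turns that probability into one of the two possible outcomes.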
The task here is to predict the quality of red wine on a scale of 0–10 given a set of features as inputs. I have solved it as a classification problem using Logistic Regression. The input variables are fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates and alcohol. The output variable (based on sensory data) is quality (score between 0 and 10).

Dependencies
The code is in Python. Other than this, I installed the following libraries using pip.