Week 2: Distribution of a variable, International Business notes

Course: data analysis, Professor: walter walter, Degree: International Business Economics, University: UPF

Type: Notes

2012/2013

Uploaded on 20/10/2013

pepito-145
Week 2: Distribution of a variable

EXPLORING DATA

  1. Begin by examining each variable by itself. Then move on to study the relationships among the variables.
  2. Begin with a graph or graphs. Then add numerical summaries of specific aspects of the data.

The proper choice of graph depends on the nature of the variable. To examine a single variable, we usually want to display its distribution.

DISTRIBUTION OF A VARIABLE

The distribution of a variable tells us what values it takes and how often it takes these values. The values of a categorical variable are labels for the categories. The distribution of a categorical variable lists the categories and gives either the count or the percent of individuals who fall in each category. (It tells us which values a variable takes and in which frequencies.)
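As a minimal sketch of this definition, the snippet below lists the categories of a categorical variable with their counts and percents. The `majors` list is made-up illustration data and `categorical_distribution` is a hypothetical helper name; neither comes from the course notes.

```python
from collections import Counter

# Hypothetical sample: field of study for ten students (made-up data).
majors = ["Business", "Economics", "Business", "Law", "Business",
          "Economics", "Law", "Business", "Other", "Economics"]

def categorical_distribution(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

distribution = categorical_distribution(majors)
for category, (count, percent) in distribution.items():
    print(f"{category:<10} {count:>3} {percent:5.1f}%")
```

The percents add up to 100 because every individual falls in exactly one category.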

Distribution of categorical variables

Pie charts show the distribution of a categorical variable as a “pie” whose slices are sized by the counts or percents for the categories. Pie charts are awkward to make by hand, but software will do the job for you. A pie chart must include all the categories that make up a whole. Use a pie chart only when you want to emphasize each category’s relation to the whole.
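To illustrate how slices are sized by counts, here is a small sketch that converts category counts into slice angles. The counts are made-up data and `pie_slice_angles` is a hypothetical helper; the only assumption is that the whole pie spans 360 degrees.

```python
def pie_slice_angles(counts):
    """Map each category to its slice angle in degrees (slices sum to 360)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("a pie chart needs at least one observation")
    return {cat: 360 * n / total for cat, n in counts.items()}

# Hypothetical counts (made-up data); every category of the whole is included.
counts = {"Business": 4, "Economics": 3, "Law": 2, "Other": 1}
angles = pie_slice_angles(counts)
for category, angle in angles.items():
    print(f"{category:<10} {angle:6.1f} degrees")
```

Leaving out a category would make the remaining slices overstate their share, which is why the pie must cover all the categories that make up the whole.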

Bar graphs represent each category as a bar. The bar heights show the category counts or percents. Bar graphs are easier to make than pie charts and also easier to read. The bars can be ordered alphabetically by field of study (with “Other” at the end), although it is often better to arrange the bars in order of height, which helps us see immediately which majors appear most often. Bar graphs are more flexible than pie charts.
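The two orderings can be compared with a rough text-only sketch. The counts are made-up data, and `text_bar_graph` is a hypothetical helper that draws each bar sideways as a run of `#` characters whose length equals the count.

```python
def text_bar_graph(counts, by_height=True):
    """Return one text line per category; bar length equals the count."""
    if by_height:
        # Tallest bar first, so the most frequent category stands out.
        items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    else:
        # Alphabetical by category label.
        items = sorted(counts.items())
    return [f"{cat:<10} {'#' * n} {n}" for cat, n in items]

# Hypothetical counts (made-up data).
counts = {"Business": 3, "Economics": 4, "Law": 2, "Other": 1}
lines = text_bar_graph(counts)
print("\n".join(lines))
```

Calling `text_bar_graph(counts, by_height=False)` gives the alphabetical ordering instead.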

Both graphs can display the distribution of a categorical variable, but a bar graph can also compare any set of quantities that are measured in the same units.

Bar graphs and pie charts are mainly tools for presenting data: they help your audience grasp data quickly. They are of limited use for data analysis because it is easy to understand data on a single categorical variable without a graph. We will move on to quantitative variables, where graphs are essential tools.

Distribution of numerical variables

Quantitative variables often take many values. The distribution tells us what values the variable takes and how often it takes these values. A graph of the distribution is clearer if nearby values are grouped together. The most common graph of the distribution of one quantitative variable is a histogram.
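Grouping nearby values into classes of equal width and counting each class can be sketched as follows. The scores are made-up data, `histogram_counts` is a hypothetical helper, and the convention that each class includes its lower bound but not its upper bound is an assumption of this sketch. The bars are printed sideways, one row per class.

```python
def histogram_counts(data, low, width, n_classes):
    """Count observations in n_classes equal-width classes starting at low.

    Class i covers low + i*width <= x < low + (i+1)*width.
    """
    counts = [0] * n_classes
    for x in data:
        i = int((x - low) // width)
        if not 0 <= i < n_classes:
            raise ValueError(f"{x} falls outside the chosen classes")
        counts[i] += 1
    return counts

# Hypothetical data (made-up exam scores); classes 40-49, 50-59, ..., 90-99.
scores = [43, 57, 58, 61, 64, 66, 69, 72, 75, 75, 78, 81, 84, 90, 97]
counts = histogram_counts(scores, low=40, width=10, n_classes=6)
for i, n in enumerate(counts):
    start = 40 + 10 * i
    print(f"{start:>3}-{start + 9:<3} {'*' * n}")
```

Choosing a different `width` changes the look of the histogram, so it is worth trying more than one choice of classes.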

Making a histogram

  Step 1. Choose the classes. Divide the range of the data into classes (intervals) of equal width.
  Step 2. Count the individuals in each class.
  Step 3. Draw the histogram. Mark the scale for the variable whose distribution you are displaying on the horizontal axis. The vertical axis contains the scale of counts. Each bar represents a class. The

To make a stemplot:

  1. Separate the last digit (the leaf) from the preceding digits (the stem). Stems may have as many digits as needed, but each leaf contains only a single digit.
  2. Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column. Be sure to include all the stems needed to span the data, even when some will have no leaves.
  3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.
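The three steps above can be sketched in a few lines. This is a minimal illustration that assumes a non-empty list of non-negative whole numbers; the data are made up and `stemplot` is a hypothetical helper name.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines for non-negative integers (leaf = last digit)."""
    leaves = defaultdict(list)
    # Sorting first means each stem's leaves end up in increasing order.
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # step 1: split off the last digit
        leaves[stem].append(leaf)
    lines = []
    # Step 2: include every stem from smallest to largest, even if empty.
    for stem in range(min(leaves), max(leaves) + 1):
        # Step 3: write the leaves to the right of their stem.
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}")
    return lines

# Hypothetical data (made-up scores); note that no value falls in the 80s.
scores = [43, 57, 58, 61, 64, 66, 69, 75, 75, 78, 90, 97]
plot = stemplot(scores)
print("\n".join(plot))
```

The stem 8 still appears in the output with no leaves, which keeps the gap in the data visible.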

If the numbers given are decimals rather than whole numbers, we have to specify the unit in which the leaves are measured, for example 0.1, if