# Data Analysis - Thesis - Project Seminar - Lecture Slides, Slides for Sociology. Aligarh Muslim University

The course will concentrate on helping students to identify a social work problem that meets the expectation of adding to social work practice and knowledge. Key points of these lecture slides are: Data Analysis, Statis...
Data Analysis

Statistics

docsity.com

Levels of Measurement

• Nominal – Categorical; no implied rankings among the categories. Also includes written observations and written responses from qualitative interviews or open-ended survey questions.

• Ordinal – Categorical data with implied rankings or data obtained through respondent ranking of categories. In some cases, a ranking process may be set up for a particular variable.

• Interval – No fixed zero point. Data is numerical, not categorical. Rank order among values is explicit, with an equal distance between points in the data set: -2, -1, 0, +1, +2

• Ratio – Fixed zero point; otherwise the same as interval.

In general, the type of data can be inferred using the following criteria:

• Any categorical data is either nominal or ordinal.

• All qualitative data is nominal.

• All scores on standardized scales are either interval or ratio. (Note: almost all the scales we use in social work, except IQ scores, are ratio.)

• The level of measurement determines what statistical method we can use.

In some cases, we can convert a variable into another level of measurement

We can change a variable from ratio to either ordinal or nominal.

Converting Data (Use Recode in SPSS)

Data set: 5, 8, 4, 2, 9, 6, 10, 7, 3, 1

| Categories | Occurrences |
| --- | --- |
| 1 to 2 | 2 |
| 3 to 5 | 3 |
| 6 to 8 | 3 |
| 9 to 10 | 2 |
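
The recode shown in this table can be sketched outside SPSS as well. The following is a minimal Python illustration, not the SPSS Recode procedure itself; the function name `recode` and the `bins` list are hypothetical names chosen for this example.

```python
# The ten scores from the slide, recoded from ratio level into four ranked categories.
scores = [5, 8, 4, 2, 9, 6, 10, 7, 3, 1]

# (label, low, high) bins, inclusive on both ends, matching the categories in the table.
bins = [("1 to 2", 1, 2), ("3 to 5", 3, 5), ("6 to 8", 6, 8), ("9 to 10", 9, 10)]


def recode(value):
    """Return the label of the category whose range contains value."""
    for label, low, high in bins:
        if low <= value <= high:
            return label
    raise ValueError(f"{value} falls outside every category")


recoded = [recode(s) for s in scores]

# Count how many scores landed in each category.
occurrences = {label: recoded.count(label) for label, _, _ in bins}
print(occurrences)
```

The categories are mutually exclusive and cover every possible score, so each value is assigned to exactly one category.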

• Ratio data can be converted to another level of measurement; we can't do this with nominal data.

• People can simply write down information about how they fit a particular attribute (age, income).

• We have more statistical options with ratio data. Inferential statistics generally require that the dependent variable be ratio level.

Primary types of data analysis are:

• Qualitative

• Descriptive (Used to describe the distribution of a single variable or the relationship between two nominal variables: mean, frequencies, cross-tabulation)

• Inferential (Used to establish relationships among variables; assumes random sampling and a normal distribution)

• Nonparametric (Used to establish relationships among variables for small samples or data sets that are not normally distributed)

Much of what you will use in your research will be descriptive statistics.

For example, the most basic type of descriptive statistic is the frequency. A frequency is the number of times a specific value, or data within a specific category, occurs.

Most often we convert frequencies to percentages. The formula is f/n × 100, where f = frequency and n = the total number of values in a data set. For example, if the age 25 occurs 5 times in a data set of 50, the percentage is 5/50 = 10%.
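
As a small illustration of the f/n calculation, the sketch below builds a hypothetical data set of 50 ages in which the age 25 occurs 5 times. The other ages and the function name `percent` are assumptions made only for this example.

```python
from collections import Counter

# Hypothetical data set: 50 ages in which the age 25 occurs 5 times.
ages = [25] * 5 + [30] * 20 + [40] * 25


def percent(value, data):
    """Percentage of observations equal to value: f / n * 100."""
    f = Counter(data)[value]  # frequency of the value
    n = len(data)             # total number of values in the data set
    return 100 * f / n


print(percent(25, ages))  # 10.0
```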

Examples of use of frequency data

• 40% of respondents are male.

• The mean level of income was \$35,000.

• 40% of all female voters cast their vote for Arnold compared to 52% of the male voters.

*Note: the other descriptive statistic we use is the standard deviation. It describes the degree to which data points vary from the mean of a distribution. In a research article, you will see the standard deviation included with the mean.

Application of Standard Deviation (SD)

• Data set 1: M = \$35,000, SD = \$5,000

• Data set 2: M = \$23,000, SD = \$500

• This is interpreted as there being less variability in income among members of the second data set. That is, scores are grouped more tightly around the mean.
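
To make the comparison concrete, here is a minimal sketch using Python's `statistics` module. The two income lists are hypothetical values invented for this example (they are not data from the slides), chosen so that the means match the two figures quoted above.

```python
from statistics import mean, stdev

# Hypothetical incomes in dollars for two small groups (illustrative only).
group_a = [28000, 32000, 35000, 38000, 42000]
group_b = [22500, 22800, 23000, 23200, 23500]

for name, incomes in (("A", group_a), ("B", group_b)):
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(f"Group {name}: M = {mean(incomes):,.0f}, SD = {stdev(incomes):,.0f}")
```

Group B's smaller SD reflects incomes that sit closer to their mean, which is the pattern described for the second data set.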

Normal Distribution

• Mean=median=mode

• Bell-shaped curve

• 50% of scores fall below and 50% fall above the mean.

• Data set can be assessed in terms of how much data falls within one, two or three standard deviations from the mean.

• Generally is unimodal although some distributions may be bimodal or trimodal.

• Theoretically, at least, inferential statistics may only be used when a set of scores conforms to a normal distribution. However, this assumption is often violated.
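
The share of a normal distribution that falls within one, two or three standard deviations of the mean can be checked with a short sketch. It uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The results are roughly 68%, 95% and 99.7%, which is why a real data set is often compared against these benchmarks.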

Frequencies are used in almost all types of data analysis.

Frequency tables can be formatted in a variety of ways.

(Some analyses add valid percent and cumulative percent.)

| Age | Number | Percent |
| --- | --- | --- |
| 0-18 | 10 | 20.0% |
| 19-34 | 15 | 30.0% |
| 35-64 | 15 | 30.0% |
| 65 & over | 10 | 20.0% |
| Total | 50 | 100% |
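
A percent column and a cumulative percent column for this table could be computed as in the sketch below. The counts come from the table; the variable names are illustrative, and the layout is only one of many possible formats.

```python
from itertools import accumulate

# Age groups and counts from the frequency table.
table = [("0-18", 10), ("19-34", 15), ("35-64", 15), ("65 & over", 10)]

total = sum(count for _, count in table)
percents = [100 * count / total for _, count in table]

# Running total of the percent column.
cumulative = list(accumulate(percents))

for (label, count), pct, cum in zip(table, percents, cumulative):
    print(f"{label:<10}{count:>4}{pct:>8.1f}%{cum:>8.1f}%")
```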

We can also use tables to determine if there is a relationship between two nominal variables, although we cannot assess the strength of the relationship. This is called a cross-tabulation.

| Starting Salary | Female | Male |
| --- | --- | --- |
| \$20,000 to \$29,999 | 19 (70%) | 5 (23%) |
| \$30,000 to \$39,999 | 7 (26%) | 14 (64%) |
| \$40,000 to \$54,999 | 1 (4%) | 3 (13%) |
| Total | 27 (100%) | 22 (100%) |
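
The percentages in this cross-tabulation are column percentages: each count is divided by the total for its column. A minimal sketch of that calculation follows, using the counts from the table. The dictionary layout is an assumption for this example, and exact rounding may differ by a point from the printed figures.

```python
# Counts from the starting-salary table: rows are salary bands, columns are groups.
counts = {
    "$20,000 to $29,999": {"Female": 19, "Male": 5},
    "$30,000 to $39,999": {"Female": 7, "Male": 14},
    "$40,000 to $54,999": {"Female": 1, "Male": 3},
}
groups = ("Female", "Male")

# Total number of respondents in each column.
column_totals = {g: sum(row[g] for row in counts.values()) for g in groups}

# Each cell as a percentage of its column total.
column_percents = {
    band: {g: 100 * row[g] / column_totals[g] for g in groups}
    for band, row in counts.items()
}

for band, row in column_percents.items():
    print(band, {g: f"{pct:.1f}%" for g, pct in row.items()})
```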

Categories in both Qualitative and Quantitative Analysis must be:

• Mutually exclusive (no overlap)

• Exhaustive (all possible categories should be included)

Cross-tabulation is the basis for chi-square. Chi-square:

• Tests whether the relationship between the two variables in the table is statistically significant.

• Is not technically an inferential statistic – does not require a normal distribution – but is often grouped with inferential statistics.

• Usually requires a random sample although data collected from everyone in a population group is usually considered sufficient for a chi-square analysis.
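
As a rough sketch of how a chi-square statistic is built from a cross-tabulation, the code below uses the starting-salary counts (rows are salary bands; columns are female and male). It is an illustration rather than a substitute for SPSS output. The p value uses a closed form that holds only when there are 2 degrees of freedom, and two of the expected counts here fall below 5, so the result should be read with caution.

```python
from math import exp

# Observed counts from the starting-salary cross-tabulation.
observed = [[19, 5], [7, 14], [1, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count expected in this cell if the two variables were unrelated.
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df == 2 only, the upper-tail probability has the closed form exp(-x / 2).
p_value = exp(-chi_square / 2) if df == 2 else None

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
```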

Means can also be used to make comparisons among groups.

| Income | M | SD |
| --- | --- | --- |
| Male | \$35,000 | \$5,000 |
| Female | \$22,000 | \$750 |

You may use means on your project

• If your variables include ratio data

• If you want to compare groups on a ratio variable

• If you want to summarize scores on a standardized instrument or a Likert scale

Some inferential statistics look at the strength of the relationship between mean scores on ratio-level variables and membership in a particular demographic group:

• t-tests (compare two groups)

• Analysis of variance (compares three or more groups)

These answer the question: Is the difference in means between the two (or more) groups large enough to be statistically significant?
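
For a two-group comparison, the test statistic can be sketched by hand as below. This is a minimal illustration of the pooled-variance (equal variances assumed) independent t statistic; the function name `pooled_t` and the two tiny samples are hypothetical. Turning t into a p value requires a t table or statistical software, which this sketch does not attempt.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic, assuming equal variances in both groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical scores for two groups (illustrative only).
group_a = [2, 4, 6]
group_b = [1, 2, 3]

t = pooled_t(group_a, group_b)
print(f"t = {t:.3f}, df = {len(group_a) + len(group_b) - 2}")
```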

We also use correlations to measure the strength of a relationship between two variables. Correlations can only be used:

• To assess the strength of the relationship between two ratio-level variables.

• To measure associations rather than cause-and-effect relationships.

• With data sets in which there are 30 or more observations.
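
The correlation coefficient itself is straightforward to compute. The sketch below implements Pearson's r from its definition; the function name `pearson_r` and the data are hypothetical, and only five observations are used to keep the example short, far fewer than the 30 recommended above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the two means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical paired scores (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print(pearson_r(x, y))                  # perfectly positive relationship
print(pearson_r(x, list(reversed(y))))  # perfectly negative relationship
```

Values near +1.00 or -1.00 indicate a strong association; neither says anything about which variable, if either, causes the other.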

Inferential statistics commonly used include:

• Independent t-test (compares two groups on one variable). (Test statistic = t)

• Paired-samples t-test (compares ratio-level scores on pre- and post-test data). (Test statistic = t)

• ANOVA – compares three or more groups on ratio data. (Test statistic = F)

• Correlation – measures the association between two ratio-level variables. (Test statistic = r)

• Regression analysis (dependent ratio variable; can include more than one independent variable, and the independent variables in the regression model can be a combination of ratio, ordinal, and nominal data). (Test statistic is R², F, or partial correlation coefficients)

Inferential statistics require that we assess the probability that a relationship actually exists between two variables:

• State the research and null hypotheses.

• State the degree to which we will risk being wrong about whether or not a relationship actually exists between two variables (level of significance – usually under .10).

• Choose an appropriate statistical test and compute it.

• Compare the probability level (p value) on your computer printout to the level of significance. If the p value is lower than the level of significance, reject the null hypothesis. If the p value is higher than the level of significance, do not reject the null hypothesis.

For example:

• There is a positive relationship between scores on the self-esteem scale and depression. Level of significance is .05. r = .75, p = .01. Reject the null hypothesis and accept the research hypothesis.

• Women will have higher test scores than men. Level of significance = .10. t = .30, p = .60. Do not reject the null hypothesis; the research hypothesis is not supported.
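
The decision rule applied in these two examples can be written as a one-line comparison. The sketch below is illustrative only; the function name `decide` and its return strings are assumptions made for this example.

```python
def decide(p_value, alpha):
    """Reject the null hypothesis only when p is below the level of significance."""
    return "reject null" if p_value < alpha else "fail to reject null"


# First example: r = .75, p = .01, level of significance = .05
print(decide(p_value=0.01, alpha=0.05))  # reject null

# Second example: t = .30, p = .60, level of significance = .10
print(decide(p_value=0.60, alpha=0.10))  # fail to reject null
```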

Other info

• Chi-square is interpreted in the same way as inferential statistics.

• Most statistics books contain tables that let you determine p values if you calculate test statistics by hand.

• SPSS printouts always contain p values for inferential statistics.

• Theoretical assumptions are often violated in research articles.

• Sample size affects whether a relationship between two or more variables is large enough to be statistically significant.

• Relationships between two variables can be either positive or negative. Strong positive relationships are close to +1.00 and strong negative relationships are close to -1.00.
