











































































Created: 2013
Creating an Analysis Plan. Atlanta, GA: Centers for Disease Control and Prevention (CDC), 2013.
Introduction
The Creating an Analysis Plan training module is one of three modules that will provide you with the skills needed to analyze and interpret quantitative^1 noncommunicable disease (NCD) data. When you apply these quantitative analysis skills, you will turn data into information that can be used to make informed decisions on public health program and policy recommendations.
An analysis plan helps you think through the data you will collect, what you will use it for, and how you will analyze it. Creating an analysis plan is an important way to ensure that you collect all the data you need and that you use all the data you collect. Analysis planning can be an invaluable investment of time. It can help you select the most appropriate research methods and statistical tools. It will ensure that the way you collect your data and structure your database will help you get reliable analytic results.
Given information about a noncommunicable disease (NCD) health problem and a request for health-related information, you will be able to create an analysis plan that includes the following:
(^1) Collecting, analyzing, and reporting qualitative data is a valuable epidemiologic skill that requires careful consideration but will not be covered in this module.
[Figure: Data into Action. Training modules shown: Analyzing and Interpreting Large Datasets; Managing Data; Creating an Analysis Plan.]
- Calculating measures of association, and
- Assessing for confounding and effect measure modification.
The workbook should take between 6 and 7 hours to complete.
The workbook is designed for FETP residents who specialize in NCDs; however, you can also complete the module if you are working in the infectious disease area.
Before participating in this training module, you must complete training in:
The format of the Participant Workbook consists of 3 sections. You will read information about creating an analysis plan and complete 2 exercises to practice the skills and knowledge learned. At the end of the training module you will access the Activity Workbook and complete a skill assessment which combines all skills taught.
The following icons are used in this workbook:
Image Type: Image Meaning
Activity Icon: Activity, exercise, assessment or case study that you will complete
Stop Icon: Stop and consult with your facilitator/mentor for further instruction
Tip Icon: Supplemental information, or key idea to note and remember
Section 1: Overview of Data Analysis
There are several steps you must complete before you analyze data. For this training, these steps have been divided into two modules – Create an Analysis Plan and Manage Data. The main tasks are as follows:
1. Create an analysis plan
   - Identify research questions and/or hypotheses.
   - Select and access a dataset.
   - List inclusion/exclusion criteria.
   - Review the data to determine the variables to be used in the main analysis.
   - Select the appropriate statistical methods and software.
   - Create table shells.
2. Manage the data
   - Create a data dictionary.
   - Create a working copy of the dataset.
   - Clean the data in the working file.
   - Create an analysis file.
The focus of this and subsequent data analysis modules is on analyzing NCD data. In order to accurately create an analysis plan, you should be familiar with the steps in analyzing NCD data, for example, from a large complex survey. The steps are as follows:
1. Conduct univariable analyses:
   i. Review characteristics of the population: Describe the sample population by person, place, and time characteristics. Summarize variables using population-level frequencies and calculate stratified frequencies across important subgroups (if any). Determine the frequency distribution of these characteristics.
   ii. Determine frequency of outcome variables: The objective of many surveys is to estimate the prevalence of one or more health-related behaviors, practices, or outcomes, such as seat-belt use, smoking, hypertension, or diabetes. Univariable analyses should include the frequency distribution of these variables and also confidence intervals for the prevalence estimates.
2. Conduct bivariable analyses:
   i. Display the data in two-variable tables: Sometimes the characterization of the population can be extended to two-variable tables, such as age by sex. For surveys in which the objective is to estimate prevalence, the data are often analyzed by population characteristics. For example, you can use two-variable tables to determine whether the prevalence varies by sex or education level. For analytic studies in which the objective is to quantify associations between exposures and outcomes, the two-variable table displays the core result, with rows representing levels of exposure and columns representing presence or absence of the outcome.
   ii. Compute and interpret measures of association: Determine the magnitude of association between an exposure variable and an outcome variable. If there are two or more populations, consider comparing their demographic data to determine whether they were different before the study/analysis was conducted.
   iii. Calculate confidence intervals and/or statistical significance: Use confidence intervals to quantify the precision of the estimates in your analysis. Use t-tests for continuous data, chi-square tests for categorical data, and other statistical tests as appropriate for the data to determine whether the results are “statistically significant.”
   iv. Assess for effect measure modification: Effect measure modification (also known as “effect modification”) is present when an effect measure for an exposure-disease relationship, such as a risk ratio, differs across levels of a third variable such as sex, age, or geographic location. This is evaluated through statistical assessment of interaction between variables.
   v. Assess the effect of potential confounders:
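The bivariable steps listed above (a two-variable table, a measure of association, a confidence interval, and a chi-square test) can be sketched in a few lines of Python. This is a minimal illustration, not the module's prescribed method: the function name `two_by_two_measures`, the cell labels `a` to `d`, and the counts are all hypothetical, and the formulas assume a simple random sample rather than a weighted complex survey.

```python
import math

def two_by_two_measures(a, b, c, d, z=1.96):
    """Summarize a two-variable (2x2) table.

    Rows are exposure levels, columns are outcome presence/absence:
        a: exposed, outcome present      b: exposed, outcome absent
        c: unexposed, outcome present    d: unexposed, outcome absent
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)

    # Approximate 95% confidence interval for the risk ratio,
    # computed on the log scale and then transformed back.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ci_low = math.exp(math.log(risk_ratio) - z * se_log_rr)
    ci_high = math.exp(math.log(risk_ratio) + z * se_log_rr)

    # Pearson chi-square statistic (1 degree of freedom,
    # no continuity correction).
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )

    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
        "rr_ci_95": (ci_low, ci_high),
        "chi_square": chi_square,
    }

# Hypothetical counts, invented for illustration only.
result = two_by_two_measures(a=40, b=60, c=20, d=80)
for name, value in result.items():
    print(f"{name}: {value}")
```

With 1 degree of freedom, a chi-square statistic above 3.84 corresponds to p < 0.05. Running the same function separately within each level of a third variable (for example, by sex) and comparing the stratum-specific risk ratios is one informal way to look for effect measure modification.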
a. Nominal: (define)_____________________________________
i. Nominal data with just two values can be called
ii. Example:
iii. Nominal data can be assigned a ___________________________ to facilitate analysis. b. Ordinal: (define)_______________________________________________
i. Example:__________________________________________
are:______________________________________________ a. Interval (define):
i. Example:
b. Ratio (define) :
Example:
The databases typically are representative of a population either through a census (all persons included) or a sample (number of people selected to represent the population). For example, NHANES 1999–2000 interviewed 9,965 persons in the United States, and the database includes hundreds of variables. Before attempting data analysis for large datasets, it is very important that you locate the survey sampling methodology, questionnaire, data dictionary, and any other supporting documentation.
Tip: Because you most likely did not create the dataset, you must take the time to understand the dataset in its entirety.
Some of the questions you should answer about the dataset are:
(^2) Use weights to account for complex survey design (including oversampling), survey non-response, and post-stratification. When a sample is weighted, it is representative of the population.
Determine the original purpose of the data and the sponsor or collector of the data. Then determine the study design and methods. Identify whether the data include:
Determine whether the dataset contains the variables you need to answer the research questions. Assess how complete and how recent the data are. Determine whether you need to conduct a new survey to obtain the required data.
Verify that appropriate instruments were used to collect the data. Keep the questionnaire and codebook (data dictionary) accessible. You can use the data dictionary to learn the coding scheme and the variable names. (In the Managing Data module you will learn how to create a data dictionary.)
Describe the criteria you will use to determine which records to analyze. For example, if you have data from an entire country or region but you work in a particular district, your inclusion criteria might include “all records of participants residing in District X.” Similarly, if you are assigned to the Diabetes Unit and you are analyzing hospital discharge data, your inclusion criteria might be “all hospital discharge records with ICD-10 codes E10 to E14.” You might exclude readmissions within 3 days of a previous discharge (which are likely a continuation of the previous problem or a complication from the previous hospitalization rather than a new episode). If your intention is to look at discharge planning, you would exclude any patient who died while hospitalized.
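The hospital discharge example above can be written down as an explicit filter, which is a useful check that the criteria are unambiguous. The sketch below is illustrative only: the field names (`record_id`, `patient_id`, `icd10`, `admit_date`, `discharge_date`, `died_in_hospital`), the function `select_records`, and the sample records are all hypothetical. It also assumes that "within 3 days" means 3 days or fewer after the patient's most recent discharge for any diagnosis.

```python
from datetime import date

# ICD-10 categories E10 to E14 (diabetes mellitus).
DIABETES_PREFIXES = ("E10", "E11", "E12", "E13", "E14")

def select_records(records, exclude_deaths=False, readmit_window_days=3):
    """Apply the inclusion and exclusion criteria to discharge records."""
    selected = []
    last_discharge = {}  # patient_id -> most recent earlier discharge date
    ordered = sorted(records, key=lambda r: (r["patient_id"], r["admit_date"]))
    for rec in ordered:
        previous = last_discharge.get(rec["patient_id"])
        last_discharge[rec["patient_id"]] = rec["discharge_date"]

        # Inclusion: diabetes discharge codes only.
        if not rec["icd10"].startswith(DIABETES_PREFIXES):
            continue
        # Exclusion: readmission shortly after a previous discharge.
        if previous is not None:
            gap_days = (rec["admit_date"] - previous).days
            if gap_days <= readmit_window_days:
                continue
        # Optional exclusion for discharge-planning analyses.
        if exclude_deaths and rec["died_in_hospital"]:
            continue
        selected.append(rec)
    return selected

# Hypothetical records, invented for illustration only.
records = [
    {"record_id": 1, "patient_id": "P1", "icd10": "E11.9",
     "admit_date": date(2012, 1, 1), "discharge_date": date(2012, 1, 5),
     "died_in_hospital": False},
    {"record_id": 2, "patient_id": "P1", "icd10": "E11.9",
     "admit_date": date(2012, 1, 7), "discharge_date": date(2012, 1, 10),
     "died_in_hospital": False},
    {"record_id": 3, "patient_id": "P2", "icd10": "I10",
     "admit_date": date(2012, 1, 15), "discharge_date": date(2012, 1, 18),
     "died_in_hospital": False},
    {"record_id": 4, "patient_id": "P3", "icd10": "E10.1",
     "admit_date": date(2012, 2, 1), "discharge_date": date(2012, 2, 3),
     "died_in_hospital": True},
    {"record_id": 5, "patient_id": "P1", "icd10": "E11.9",
     "admit_date": date(2012, 3, 1), "discharge_date": date(2012, 3, 4),
     "died_in_hospital": False},
]

all_episodes = select_records(records)
planning_set = select_records(records, exclude_deaths=True)
print([r["record_id"] for r in all_episodes])
print([r["record_id"] for r in planning_set])
```

Writing the criteria this way forces decisions that a prose description can leave open, such as whether the readmission window counts discharges for other diagnoses.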
The Analysis Plan should contain a list of variables^3 to analyze that will be kept in the analysis file (a computer file derived from the original data). For
(^3) The listing of variables (i.e., data dictionary) is taught in the Managing Data module.
There are different statistical methods you will use depending on the research questions. For example, if you want to estimate the prevalence of a behavioral risk factor such as smoking or an outcome such as hypertension, you would first conduct a univariable analysis, then stratify by subgroups. If you need to determine the magnitude of association between an exposure variable and an outcome variable, you will conduct bivariable analysis. There are many quantitative statistical software packages to use for the analysis. Some examples are:
Suppose, for example, you are planning to conduct descriptive analysis on the most recent BRFSS study. Because analysis of BRFSS can involve weights for clustering of samples, you will likely need statistical software that can account for this weighting. You will also need a person with training in complex analyses or someone who can help you learn these analysis techniques.
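To see why weighting matters for a prevalence estimate, the sketch below compares an unweighted and a weighted estimate, overall and stratified by sex. It is a minimal illustration under stated assumptions: the function `weighted_prevalence`, the field names (`sex`, `smoker`, `weight`), and the four respondents are hypothetical, and it computes point estimates only. Confidence intervals for a complex survey such as BRFSS require design-based variance estimation in survey-capable software, which this sketch does not attempt.

```python
def weighted_prevalence(rows, outcome, weight="weight"):
    """Weighted proportion of rows where the 0/1 outcome equals 1."""
    total_weight = sum(r[weight] for r in rows)
    weighted_cases = sum(r[weight] * r[outcome] for r in rows)
    return weighted_cases / total_weight

# Hypothetical respondents, invented for illustration only.
# Each weight is the number of people in the population that
# the respondent represents.
respondents = [
    {"sex": "F", "smoker": 1, "weight": 100.0},
    {"sex": "F", "smoker": 0, "weight": 300.0},
    {"sex": "M", "smoker": 1, "weight": 100.0},
    {"sex": "M", "smoker": 0, "weight": 500.0},
]

# Unweighted estimate: every respondent counts equally.
unweighted = sum(r["smoker"] for r in respondents) / len(respondents)

# Weighted estimate for the whole sample.
overall = weighted_prevalence(respondents, "smoker")

# Weighted estimates stratified by subgroup.
by_sex = {}
for sex in sorted({r["sex"] for r in respondents}):
    subgroup = [r for r in respondents if r["sex"] == sex]
    by_sex[sex] = weighted_prevalence(subgroup, "smoker")

print(f"unweighted: {unweighted:.3f}")
print(f"weighted overall: {overall:.3f}")
for sex, prevalence in by_sex.items():
    print(f"weighted, sex={sex}: {prevalence:.3f}")
```

In this invented example the smokers carry smaller weights than the non-smokers, so the weighted estimate is well below the unweighted one.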
Stop
Let the facilitator or mentor know you are ready for the group discussion.
Use the space below to record any key points from the facilitator-led discussion:
Hypertension case study
The past few decades have brought a new global phenomenon called the “nutrition transition” in many low- and middle-income countries.^4 This transition includes a large shift from traditional diets and lifestyles to one
(^4) Popkin, Barry. (2002) Stages of the Nutrition Transition: Dynamic Global Shifts Appear to be Accelerating. Available online as of 5/7/2008 at: http://www.cgdev.org/doc/events/9.10.07/Barry_Popkin_Presentation.pdf
deaths and about 50% of all NCDs (WHO, 2011a). In 2008, CVD caused an estimated 17 million deaths and led to 151 million DALYs. Common behavioral risk factors, including tobacco use, physical inactivity, unhealthy diet and the harmful use of alcohol, are responsible for approximately 80% of the global CVD burden.^8
Source: Adapted from Global health risks: mortality and burden of disease attributable to selected major risks. Geneva, World Health Organization,
*Includes communicable diseases, maternal and prenatal conditions, and nutritional deficiencies.
Raised blood pressure, or hypertension^9, is the leading risk factor for mortality and is ranked third as a cause of disability-adjusted life-years.^10 It
(^8) Gaziano TA, Bitton A, Anand S, Abrahams-Gessel S, Murphy A. Curr Probl Cardiol. 2010 Feb;35(2):72-115.
(^9) Hypertension is defined as blood pressure at or above 140/90 mmHg most of the time. If blood pressure is at or above 120/80 mmHg but below 140/90 mmHg, it is called pre-hypertension. Normal blood pressure is below 120/80 mmHg. (Chobanian, A, Bakris, G. et al. The Seventh Report of
[Figure: Global Burden of Disease. Bar chart of disease mortality by condition (horizontal axis 0% to 40%). Conditions shown: diabetes, chronic respiratory diseases, other chronic diseases, injuries, cancer, other conditions*, cardiovascular diseases.]
has been estimated that hypertension resulted in 51% of stroke deaths and 45% of coronary heart disease deaths in 2008.^11 Mean blood pressure has decreased significantly in nearly all high-income countries due to widespread diagnosis and treatment along with access to low-cost medications. In contrast, mean blood pressure has been stable or increasing in most African countries; approximately 40% (and up to 50%) of adults in many of these countries are estimated to have high blood pressure. Most of these people remain undiagnosed, although many could be treated with low-cost medications; this would significantly reduce the risk of death and disability from heart disease and stroke.^12
Effective prevention strategies for NCDs, and specifically for reducing the burden of hypertension and CVDs, do exist. For example, efforts to reduce sodium consumption have been identified as a cost-effective means of reducing and reversing hypertension. One approach is to increase consumer awareness and education through campaigns that encourage households to change their diets by choosing low-sodium alternatives and salt substitutes. Another proven strategy is to work with industry to encourage manufacturers to voluntarily reduce the salt content of processed foods and condiments. Recent estimates indicate that implementation of a salt reduction program could avert 8.5 million deaths globally. Combining this with implementing the WHO Framework Convention on Tobacco Control could save an additional 5.5 million lives. This would be at a cost of less than USD $0.40 per person per year in low-income and lower middle-income countries, and
the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. JAMA. 2003;289:2560-91. www.nhlbi.nih.gov/guidelines/hypertension/jnc7full.pdf)
(^10) Asaria P, Chisholm D, Mathers C, Ezzati M, Beaglehole R. Chronic disease prevention: health effects and financial costs of strategies to reduce salt intake and control tobacco use. The Lancet 2007; 370(9604):2044-2053.
(^11) World Health Statistics: A Snapshot of Global Health. World Health Organization, 2012. Geneva, Switzerland. http://who.int/gho/publications/world_health_statistics/2012/en/index.html
(^12) Yach D, Hawkes C, Gould C, Hofman KJ. The Global Burden of Chronic Diseases: Overcoming Impediments to Prevention and Control. JAMA. 2004;291(21):2616-2622. doi:10.1001/jama.291.21.2616.