Data analysis

Figure: data analysis at the Armstrong Flight Research Center in Palmdale, California
Data analysis, the process of systematically collecting, cleaning, transforming, describing, modeling, and interpreting data, generally employing statistical techniques. Data analysis is an important part of both scientific research and business, where demand has grown in recent years for data-driven decision making. Data analysis techniques are used to gain useful insights from datasets, which can then be used to make operational decisions or guide future research. With the rise of "big data," the storage of vast quantities of data in large databases and data warehouses, there is an increasing need to apply data analysis techniques to generate insights about volumes of data far too large to be handled by traditional data-processing tools.

Datasets are collections of information. Generally, data and datasets are themselves collected to help answer questions, make decisions, or otherwise inform reasoning. The rise of information technology has led to the generation of vast amounts of data of many kinds, such as text, pictures, videos, personal information, account data, and metadata, the last of which provide information about other data. It is common for apps and websites to collect data about how their products are used or about the people using their platforms. Consequently, there is vastly more data being collected today than at any other time in human history. A single business may track billions of interactions with millions of consumers at hundreds of locations with thousands of employees and any number of products. Analyzing that volume of data is generally only possible using specialized computational and statistical techniques.

The desire of businesses to make the best use of their data has led to the development of the field of business intelligence, which covers a variety of tools and techniques that allow businesses to perform data analysis on the information they collect.

Data collection

For data to be analyzed, it must first be collected and stored. Raw data must be processed into a format that can be used for analysis and be cleaned so that errors and inconsistencies are minimized. Data can be stored in many ways, but one of the most useful is in a database. A database is a collection of interrelated data organized so that certain records (collections of data related to a single entity) can be retrieved on the basis of various criteria. The most familiar kind of database is the relational database, which stores data in tables with rows that represent records (tuples) and columns that represent fields (attributes). A query is a command that retrieves a subset of the information in the database according to certain criteria. A query may retrieve only records that meet certain criteria, or it may join fields from records across multiple tables by use of a common field.
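To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module; the customers/orders schema and all values are hypothetical. It shows a query that filters records by a criterion and a query that joins two tables on a common field.

```python
import sqlite3

# Build a small in-memory relational database (hypothetical schema).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Ada", "London"), (2, "Grace", "New York")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(10, 1, 99.50), (11, 1, 12.00), (12, 2, 45.25)])

# A query that retrieves only records meeting a criterion.
cur.execute("SELECT name FROM customers WHERE city = 'London'")
print(cur.fetchall())  # [('Ada',)]

# A query that joins fields across tables via a common field.
cur.execute("""
    SELECT customers.name, SUM(orders.total)
    FROM customers JOIN orders ON orders.customer_id = customers.id
    GROUP BY customers.name
""")
print(cur.fetchall())  # [('Ada', 111.5), ('Grace', 45.25)]
conn.close()
```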

Frequently, data from many sources is collected into large archives of data called data warehouses. The process of moving data from its original sources (such as databases) to a centralized location (generally a data warehouse) is called ETL (which stands for extract , transform , and load ).

  • The extraction step identifies and copies or exports the desired data from its source, such as by running a database query to retrieve the desired records.
  • The transformation step is the process of cleaning the data so that they fit the analytical need for the data and the schema of the data warehouse. This may involve changing formats for certain fields, removing duplicate records, or renaming fields, among other processes.
  • Finally, the clean data are loaded into the data warehouse, where they may join vast amounts of historical data and data from other sources, as sketched below.
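A minimal ETL sketch in Python, assuming the pandas library is available; the source records, field names, and the CSV file standing in for a data warehouse are all hypothetical:

```python
import pandas as pd

# Extract: pull the desired records from a source (a hypothetical export).
raw = pd.DataFrame({
    "CustID": [1, 2, 2, 3],
    "signup": ["2024-01-05", "2024-02-10", "2024-02-10", "bad-date"],
    "spend_usd": ["100", "250", "250", "75"],
})

# Transform: clean the data so they fit the warehouse schema.
clean = (
    raw.drop_duplicates()                          # remove duplicate records
       .rename(columns={"CustID": "customer_id"})  # rename fields to the schema
       .assign(
           signup=lambda d: pd.to_datetime(d["signup"], errors="coerce"),
           spend_usd=lambda d: pd.to_numeric(d["spend_usd"]),
       )
       .dropna(subset=["signup"])                  # drop rows with unparseable dates
)

# Load: append the clean rows to the warehouse (a CSV stands in for it here).
clean.to_csv("warehouse_customers.csv", mode="a", index=False)
print(clean)
```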

After data are effectively collected and cleaned, they can be analyzed with a variety of techniques. Analysis often begins with descriptive and exploratory data analysis. Descriptive data analysis uses statistics to organize and summarize data, making it easier to understand the broad qualities of the dataset. Exploratory data analysis looks for insights into the data that may arise from descriptions of distribution, central tendency, or variability for a single data field. Further relationships between data may become apparent by examining two fields together. Visualizations may be employed during analysis, such as histograms (graphs in which the height of a bar indicates how many values fall within an interval) or stem-and-leaf plots (which divide data into buckets, or "stems," with individual data points serving as "leaves" on the stem).
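As an illustration, the sketch below computes a descriptive summary of a single hypothetical field and draws a histogram, assuming numpy and matplotlib are installed:

```python
import numpy as np
import matplotlib.pyplot as plt

values = np.array([12, 15, 15, 18, 21, 22, 22, 22, 30, 95])  # hypothetical field

# Descriptive summary: central tendency and variability.
print("mean:", values.mean())          # sensitive to the outlier 95
print("median:", np.median(values))    # robust measure of central tendency
print("std dev:", values.std(ddof=1))  # sample standard deviation

# Exploratory visualization: a histogram of the distribution.
plt.hist(values, bins=5, edgecolor="black")
plt.xlabel("value")
plt.ylabel("frequency")
plt.title("Distribution of a single data field")
plt.show()
```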


Data analysis frequently goes beyond descriptive analysis to predictive analysis, making predictions about the future using predictive modeling techniques. Predictive modeling uses machine learning, regression analysis methods (which mathematically calculate the relationship between an independent variable and a dependent variable), and classification techniques to identify trends and relationships among variables. Predictive analysis may involve data mining, which is the process of discovering interesting or useful patterns in large volumes of information. Data mining often involves cluster analysis, which tries to find natural groupings within data, and anomaly detection, which detects instances in data that are unusual and stand out from other patterns. It may also look for association rules within datasets, that is, strong relationships among variables in the data.
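The sketch below illustrates these ideas on toy data, assuming scikit-learn and numpy are installed: a regression model relating an independent variable to a dependent one, a cluster analysis that looks for natural groupings, and a simple z-score rule for anomaly detection. All data and thresholds are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

# Regression: relationship between an independent and a dependent variable.
ad_spend = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # independent (toy data)
sales = np.array([2.1, 4.2, 5.9, 8.1, 9.8])               # dependent
model = LinearRegression().fit(ad_spend, sales)
print("predicted sales at spend=6:", model.predict([[6.0]])[0])

# Cluster analysis: look for natural groupings in the data.
points = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1], [8, 8], [8.2, 7.9]])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(points)
print("cluster labels:", labels)  # two natural groups

# Simple anomaly detection: flag points far from the mean (z-score rule).
z = (sales - sales.mean()) / sales.std()
print("anomalies:", sales[np.abs(z) > 2])  # none in this toy sample
```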

Data Analysis


What is Data Analysis?

According to the federal government, data analysis is "the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data" (Responsible Conduct in Data Management). Important components of data analysis include searching for patterns, remaining unbiased in drawing inference from data, practicing responsible data management, and maintaining "honest and accurate analysis" (Responsible Conduct in Data Management).

To understand data analysis further, it can be helpful to take a step back and ask, "What is data?" Many of us associate data with spreadsheets of numbers and values; however, data can encompass much more than that. According to the federal government, data is "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular 110). This broad definition can include information in many formats.

Some examples of types of data are as follows:

  • Photographs 
  • Hand-written notes from field observation
  • Machine learning training data sets
  • Ethnographic interview transcripts
  • Sheet music
  • Scripts for plays and musicals 
  • Observations from laboratory experiments ( CMU Data 101 )

Thus, data analysis includes the processing and manipulation of these data sources in order to gain additional insight from data, answer a research question, or confirm a research hypothesis. 

Data analysis falls within the larger research data lifecycle (University of Virginia).

Why Analyze Data?

Through data analysis, a researcher can gain additional insight from data and draw conclusions to address the research question or hypothesis. Use of data analysis tools helps researchers understand and interpret data. 

What are the Types of Data Analysis?

Data analysis can be quantitative, qualitative, or mixed methods. 

Quantitative research typically involves numbers and "close-ended questions and responses" (Creswell & Creswell, 2018, p. 3). Quantitative research tests variables against objective theories, usually measured and collected on instruments and analyzed using statistical procedures (Creswell & Creswell, 2018, p. 4). Quantitative analysis usually uses deductive reasoning.

Qualitative research typically involves words and "open-ended questions and responses" (Creswell & Creswell, 2018, p. 3). According to Creswell & Creswell, "qualitative research is an approach for exploring and understanding the meaning individuals or groups ascribe to a social or human problem" (2018, p. 4). Thus, qualitative analysis usually uses inductive reasoning.

Mixed methods research uses methods from both quantitative and qualitative research approaches. Mixed methods research works under the "core assumption... that the integration of qualitative and quantitative data yields additional insight beyond the information provided by either the quantitative or qualitative data alone" (Creswell & Creswell, 2018, p. 4).


Statistical Analysis in Research: Meaning, Methods and Types


The scientific method is an empirical approach to acquiring new knowledge by making skeptical observations and analyses and developing a meaningful interpretation. It is the basis of research and a primary pillar of modern science. Researchers seek to understand the relationships between factors associated with the phenomena of interest. In some cases, research involves such vast amounts of data that it is impractical to observe or manipulate each data point individually. As a result, statistical analysis in research becomes a means of evaluating relationships and interconnections between variables, with tools and analytical techniques for working with large data. Since researchers use statistical power analysis to assess the probability of finding an effect in a given investigation, the method is relatively accurate. Hence, statistical analysis in research simplifies analytical work by focusing on the quantifiable aspects of phenomena.

What is Statistical Analysis in Research? A Simplified Definition

Statistical analysis uses quantitative data to investigate patterns, relationships, and trends in order to understand real-life and simulated phenomena. The approach is a key analytical tool in various fields, including academia, business, government, and science in general. This definition implies that the primary focus of the scientific method is quantitative research. Notably, the investigator targets constructs developed from general concepts, as researchers can quantify their hypotheses and present their findings in simple statistics.

When a business needs to learn how to improve its product, it collects statistical data about the production line and customer satisfaction. Qualitative data are valuable and often identify the most common themes in stakeholders' responses. Quantitative data, on the other hand, rank those themes by how critical they are to the affected persons. For instance, descriptive statistics highlight central tendency, frequency, variation, and position information. While the mean shows the average response regarding a certain aspect, the variance indicates how widely the responses are spread. In any case, statistical analysis creates simplified measures for understanding the phenomenon under investigation. It is also a key component of academia as the primary approach to data representation, especially in research projects, term papers, and dissertations.

Most Useful Statistical Analysis Methods in Research

Using statistical analysis methods in research is inevitable, especially in academic assignments, projects, and term papers. It is always advisable to seek assistance from your professor, or you can try the research paper writing service CustomWritings before you start your academic project or write the statistical analysis in a research paper. Consulting an expert when developing a topic for your thesis or a short mid-term assignment increases your chances of getting a better grade. Most importantly, it improves your understanding of research methods, with insights on how to enhance the originality and quality of personalized essays. Professional writers can also help select the most suitable statistical analysis method for your thesis, influencing the choice of data and type of study.

Descriptive Statistics

Descriptive statistics is a statistical method summarizing quantitative figures to understand critical details about the sample and population. A descriptive statistic is a figure that quantifies a specific aspect of the data. For instance, instead of analyzing the behavior of a thousand students individually, a researcher can identify the most common actions among them. By doing this, the researcher utilizes statistical analysis in research, particularly descriptive statistics.

  • Measures of central tendency. Central tendency measures are the mean, median, and mode, i.e., the averages denoting specific data points. They assess the centrality of the probability distribution, hence the name. These measures describe the data in relation to the center.
  • Measures of frequency. These statistics document the number of times an event happens. They include frequency, count, ratios, rates, and proportions. Measures of frequency can also show how often a score occurs.
  • Measures of dispersion/variation. These descriptive statistics assess the intervals between the data points. The objective is to view the spread or disparity between the specific inputs. Measures of variation include the standard deviation, variance, and range. They indicate how the spread may affect other statistics, such as the mean.
  • Measures of position. Sometimes researchers can investigate relationships between scores. Measures of position, such as percentiles, quartiles, and ranks, demonstrate this association. They are often useful when comparing the data to normalized information.
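A short sketch of these four families of descriptive measures on a hypothetical sample, using Python's statistics module and numpy:

```python
import numpy as np
import statistics

scores = [55, 60, 60, 70, 75, 80, 85, 90, 90, 90]  # hypothetical sample

# Central tendency
print("mean:", statistics.mean(scores))
print("median:", statistics.median(scores))
print("mode:", statistics.mode(scores))

# Frequency
print("count of 90s:", scores.count(90))
print("proportion of 90s:", scores.count(90) / len(scores))

# Dispersion / variation
print("range:", max(scores) - min(scores))
print("sample variance:", statistics.variance(scores))
print("sample std dev:", statistics.stdev(scores))

# Position
print("quartiles:", np.percentile(scores, [25, 50, 75]))
print("90th percentile:", np.percentile(scores, 90))
```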

Inferential Statistics

Inferential statistics is critical in statistical analysis in quantitative research. This approach uses statistical tests to draw conclusions about the population from a sample. Examples of inferential techniques include t-tests, F-tests, ANOVA, the Mann-Whitney U test, and the Wilcoxon signed-rank test, with results typically interpreted through p-values.
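For illustration, the sketch below runs one parametric and one non-parametric test on two hypothetical samples, assuming scipy is installed; the 0.05 significance level is a conventional choice, not a prescription from this article:

```python
from scipy import stats

group_a = [23, 25, 28, 31, 22, 27, 26]  # hypothetical sample 1
group_b = [30, 33, 29, 35, 32, 34, 31]  # hypothetical sample 2

# Parametric: independent-samples t-test on the group means.
t_stat, p_val = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_val:.4f}")

# Non-parametric alternative: Mann-Whitney U test on the same data.
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)
print(f"U = {u_stat:.1f}, p = {p_u:.4f}")

# Conventional decision rule at the 0.05 significance level.
print("reject H0" if p_val < 0.05 else "retain H0")
```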

Common Statistical Analysis in Research Types

Although inferential and descriptive statistics can be classified as types of statistical analysis in research, they are mostly considered analytical methods. Types of research are distinguishable by the differences in the methodology employed in analyzing, assembling, classifying, manipulating, and interpreting data. The categories may also depend on the type of data used.

Predictive Analysis

Predictive research analyzes past and present data to assess trends and predict future events. An excellent example of predictive analysis is a market survey that seeks to understand customers’ spending habits to weigh the possibility of a repeat or future purchase. Such studies assess the likelihood of an action based on trends.

Prescriptive Analysis

On the other hand, a prescriptive analysis targets likely courses of action. It’s decision-making research designed to identify optimal solutions to a problem. Its primary objective is to test or assess alternative measures.

Causal Analysis

Causal research investigates the explanation behind the events. It explores the relationship between factors for causation. Thus, researchers use causal analyses to analyze root causes, possible problems, and unknown outcomes.

Mechanistic Analysis

This type of research investigates the mechanism of action. Instead of focusing only on the causes or possible outcomes, researchers may seek an understanding of the processes involved. In such cases, they use mechanistic analyses to document, observe, or learn the mechanisms involved.

Exploratory Data Analysis

An exploratory study, by contrast, is extensive, with a wider scope and minimal limitations. This type of research seeks insight into the topic of interest. An exploratory researcher does not try to generalize or predict relationships. Instead, they look for information about the subject before conducting an in-depth analysis.

The Importance of Statistical Analysis in Research

Statistical analysis provides critical information for decision-making. Decision-makers require past trends and predictive assumptions to inform their actions. In most cases, raw data are too complex or lack meaningful inference on their own. Statistical tools for analyzing such details help save time and money by deriving only the information valuable for assessment. An excellent example of statistical analysis in research is a randomized controlled trial (RCT) for a Covid-19 vaccine: such a trial assesses the vaccine's effectiveness, side effects, duration of protection, and other benefits, and its results matter directly to stakeholders. Hence, statistical analysis in research is a helpful tool for understanding data.

Indian J Anaesth, vol. 60(9), 2016 Sep

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of sample size estimation, power analysis and statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[ 1 ] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[ 2 ]

A variable is a characteristic that varies from one individual member of a population to another.[ 3 ] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[ 3 ] [ Figure 1 ].

Figure 1. Classification of variables

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), it is called dichotomous (or binary) data. The various causes of re-intubation in an intensive care unit due to upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment are examples of categorical variables.

Ordinal variables have a clear ordering between the variables. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property. For example, the system of centimetres is an example of a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[ 4 ] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[ 4 ] use a random sample of data taken from a population to describe and make inferences about the whole population. They are valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1.

Table 1. Example of descriptive and inferential statistics

Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[ 6 ] Mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. The mean may be influenced profoundly by extreme values. For example, the average ICU stay of organophosphorus poisoning patients may be influenced by a single patient who stays in the ICU for around 5 months because of septicaemia. Such extreme values are called outliers. The formula for the mean is

$\bar{x} = \frac{\sum x}{n}$

where x = each observation and n = number of observations. Median[ 6 ] is defined as the middle of a distribution in ranked data (with half of the variables in the sample above and half below the median value), while mode is the most frequently occurring variable in a distribution. Range defines the spread, or variability, of a sample.[ 7 ] It is described by the minimum and maximum values of the variables. If we rank the data and, after ranking, group the observations into percentiles, we can get better information about the pattern of spread of the variables. In percentiles, we rank the observations into 100 equal parts. We can then describe the 25th, 50th, 75th or any other percentile amount. The median is the 50th percentile. The interquartile range is the middle 50% of the observations about the median (25th-75th percentile). Variance[ 7 ] is a measure of how spread out the distribution is. It gives an indication of how closely an individual observation clusters about the mean value. The variance of a population is defined by the following formula:

$\sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}$

where $\sigma^2$ is the population variance, $\bar{X}$ is the population mean, $X_i$ is the i-th element from the population and N is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$

where $s^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the i-th element from the sample and n is the number of elements in the sample. The formula for the variance of a population has 'N' as the denominator, whereas the sample variance uses 'n − 1'. The expression 'n − 1' is known as the degrees of freedom and is one less than the number of observations: once the sample mean is fixed, each observation is free to vary except the last one, which must take a defined value. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of variance is used. The square root of the variance is the standard deviation (SD).[ 8 ] The SD of a population is defined by the following formula:

$\sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}$

where σ is the population SD, $\bar{X}$ is the population mean, $X_i$ is the i-th element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$

where s is the sample SD, $\bar{x}$ is the sample mean, $x_i$ is the i-th element from the sample and n is the number of elements in the sample. An example of the calculation of variance and SD is illustrated in Table 2.

Table 2. Example of mean, variance, standard deviation
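These formulas can be verified directly in code. A minimal check on a hypothetical sample, assuming numpy; ddof=1 selects the n − 1 (sample) denominator, while the default ddof=0 gives the population version:

```python
import numpy as np

x = np.array([4.0, 8.0, 6.0, 5.0, 3.0, 7.0])  # hypothetical observations

print("mean:", x.mean())

# Population formulas (denominator N).
print("population variance:", x.var(ddof=0))
print("population SD:", x.std(ddof=0))

# Sample formulas (denominator n - 1, the degrees of freedom).
print("sample variance:", x.var(ddof=1))
print("sample SD:", x.std(ddof=1))
```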

Normal distribution or Gaussian distribution

Most biological variables usually cluster around a central value, with symmetrical positive and negative deviations about this point.[ 1 ] The standard normal distribution curve is a symmetrical, bell-shaped curve. In a normal distribution, about 68% of the scores are within 1 SD of the mean, around 95% are within 2 SDs of the mean and about 99.7% are within 3 SDs of the mean [ Figure 2 ].

Figure 2. Normal distribution curve
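The 68-95-99.7 pattern quoted above can be reproduced from the normal cumulative distribution function. A quick check, assuming scipy is installed:

```python
from scipy.stats import norm

# Probability mass within k standard deviations of the mean.
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.4f}")
# within 1 SD: 0.6827, within 2 SD: 0.9545, within 3 SD: 0.9973
```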

Skewed distribution

It is a distribution with an asymmetry of the variables about its mean. In a negatively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the right, leading to a longer left tail. In a positively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the left, leading to a longer right tail.

Figure 3. Curves showing negatively skewed and positively skewed distributions

Inferential statistics

In inferential statistics, data are analysed from a sample to make inferences in the larger collection of the population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term 'null hypothesis' (H0, 'H-naught', 'H-null') denotes that there is no relationship (difference) between the population variables in question.[ 9 ]

The alternative hypothesis (H1 or Ha) denotes that a relationship (difference) between the population variables is expected to be true.[ 9 ]

The P value (or the calculated probability) is the probability of the observed event occurring by chance if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [ Table 3 ].

Table 3. P values with interpretation

If the P value is less than the arbitrarily chosen value (known as α or the significance level), the null hypothesis (H0) is rejected [ Table 4 ]. However, if the null hypothesis (H0) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding alpha error, beta error and sample size calculation and factors influencing them are dealt with in another section of this issue by Das S et al.[ 12 ]

Table 4. Illustration for null hypothesis

PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

Two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to the small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are used to analyse ordinal and categorical data.

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t-test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t-test

Student's t-test is used to test the null hypothesis that there is no difference between the means of two groups. It is used in three circumstances:

  • To test if a sample mean (as an estimate of a population mean) differs significantly from a given population mean (the one-sample t-test). The formula is:

$t = \frac{\bar{X} - \mu}{SE}$

where $\bar{X}$ = sample mean, μ = population mean and SE = standard error of the mean.

  • To test if the population means estimated by two independent samples differ significantly (the unpaired t-test). The formula is:

$t = \frac{\bar{X}_1 - \bar{X}_2}{SE(\bar{X}_1 - \bar{X}_2)}$

where $\bar{X}_1 - \bar{X}_2$ is the difference between the means of the two groups and SE denotes the standard error of the difference.

  • To test if the population means estimated by two dependent samples differ significantly (the paired t-test). A usual setting for the paired t-test is when measurements are made on the same subjects before and after a treatment.

The formula for the paired t-test is:

$t = \frac{\bar{d}}{SE(\bar{d})}$

where $\bar{d}$ is the mean difference and SE denotes the standard error of this difference.

The group variances can be compared using the F-test. The F-test is the ratio of variances (var 1/var 2). If F differs significantly from 1.0, then it is concluded that the group variances differ significantly.
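A compact sketch of the three t-test circumstances and the variance-ratio F-test on hypothetical data, assuming scipy and numpy; scipy exposes the t-tests directly, while the variance-ratio F-test is computed manually here:

```python
import numpy as np
from scipy import stats

before = np.array([140, 152, 138, 147, 160, 151])  # hypothetical measurements
after = np.array([135, 145, 136, 140, 151, 144])
other = np.array([150, 148, 162, 155, 158, 149])

# One-sample t-test: sample mean vs. a known population mean of 145.
print(stats.ttest_1samp(before, popmean=145))

# Unpaired t-test: two independent samples.
print(stats.ttest_ind(before, other))

# Paired t-test: same subjects before and after a treatment.
print(stats.ttest_rel(before, after))

# F-test for equality of variances: ratio of the sample variances.
f = before.var(ddof=1) / other.var(ddof=1)
dfn, dfd = len(before) - 1, len(other) - 1
p = 2 * min(stats.f.sf(f, dfn, dfd), stats.f.cdf(f, dfn, dfd))  # two-sided
print(f"F = {f:.2f}, p = {p:.3f}")
```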

Analysis of variance

The Student's t-test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

However, the between-group (or effect variance) is the result of our treatment. These two estimates of variances are compared using the F-test.

A simplified formula for the F statistic is:

$F = \frac{MS_b}{MS_w}$

where $MS_b$ is the mean squares between the groups and $MS_w$ is the mean squares within groups.
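A one-way ANOVA across three hypothetical groups, assuming scipy is installed:

```python
from scipy import stats

# Hypothetical outcome scores for three independent groups.
g1 = [85, 86, 88, 75, 78, 94]
g2 = [81, 83, 87, 90, 79, 84]
g3 = [91, 92, 94, 96, 88, 90]

f_stat, p_val = stats.f_oneway(g1, g2, g3)  # F = MSb / MSw under the hood
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")
```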

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, a repeated measures ANOVA is used when all variables of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations as they do not require the normality assumption.[ 15 ] Non-parametric tests may fail to detect a significant difference when compared with a parametric test; that is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

Table 5. Analogue of parametric and non-parametric tests

Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

The sign test examines a hypothesis about the median θ0 of a population. It tests the null hypothesis H0: θ = θ0. When the observed value (Xi) is greater than the reference value (θ0), it is marked as a + sign. If the observed value is smaller than the reference value, it is marked as a − sign. If the observed value is equal to the reference value (θ0), it is eliminated from the sample.

If the null hypothesis is true, there will be an equal number of + signs and − signs.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.

Wilcoxon's signed rank test

There is a major limitation of sign test as we lose the quantitative information of the given data and merely use the + or – signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ0 but also takes into consideration the relative sizes, adding more statistical power to the test. As in the sign test, if there is an observed value that is equal to the reference value θ0, this observed value is eliminated from the sample.

Wilcoxon's rank sum test ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.

Mann-Whitney test

It is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

The Mann–Whitney test compares all data (xi) belonging to the X group and all data (yi) belonging to the Y group and calculates the probability of xi being greater than yi: P(xi > yi). The null hypothesis states that P(xi > yi) = P(xi < yi) = 1/2, while the alternative hypothesis states that P(xi > yi) ≠ 1/2.

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.

Kruskal-Wallis test

The Kruskal–Wallis test is a non-parametric test to analyse the variance.[ 14 ] It analyses if there is any difference in the median values of three or more independent samples. The data values are ranked in an increasing order, and the rank sums calculated followed by calculation of the test statistic.
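As a sketch, the code below applies Wilcoxon's signed rank test against a hypothetical reference median and the Kruskal-Wallis test to three hypothetical samples, assuming scipy:

```python
import numpy as np
from scipy import stats

# Wilcoxon signed-rank test: does the sample median differ from θ0 = 50?
x = np.array([47, 52, 49, 55, 44, 51, 46, 58])  # hypothetical sample
stat, p = stats.wilcoxon(x - 50)  # test on differences from the reference
print(f"Wilcoxon: statistic = {stat}, p = {p:.3f}")

# Kruskal-Wallis test: do three independent samples share a median?
a = [27, 30, 25, 29, 33]
b = [31, 35, 34, 38, 32]
c = [26, 24, 28, 27, 25]
h, p_kw = stats.kruskal(a, b, c)
print(f"Kruskal-Wallis: H = {h:.2f}, p = {p_kw:.3f}")
```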

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. It is an alternative to repeated measures ANOVA, used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]

Tests to analyse the categorical data

The Chi-square test, Fisher's exact test and McNemar's test are used to analyse categorical or nominal variables. The Chi-square test compares frequencies and tests whether the observed data differ significantly from the expected data if there were no differences between groups (i.e., the null hypothesis). It is calculated as the sum of the squared difference between observed (O) and expected (E) data (or the deviation, d) divided by the expected data, by the following formula:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

A Yates correction factor is used when the sample size is small. Fisher's exact test is used to determine if there are non-random associations between two categorical variables. It does not assume random sampling, and instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability. McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired-dependent samples. It is used to determine whether the row and column frequencies are equal (that is, whether there is 'marginal homogeneity'). The null hypothesis is that the paired proportions are equal. The Mantel-Haenszel Chi-square test is a multivariate test as it analyses multiple grouping variables. It stratifies according to the nominated confounding variables and identifies any that affects the primary outcome variable. If the outcome variable is dichotomous, then logistic regression is used.
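For illustration, a chi-square test and Fisher's exact test on a hypothetical 2 × 2 contingency table, assuming scipy is installed:

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2 x 2 table: treatment vs. control, improved vs. not improved.
table = np.array([[18, 7],
                  [11, 14]])

chi2, p, dof, expected = chi2_contingency(table)  # Yates correction applied by default for 2x2
print(f"chi-square = {chi2:.2f}, p = {p:.3f}, dof = {dof}")
print("expected frequencies:\n", expected)

# Fisher's exact test: preferred when expected cell counts are small.
odds_ratio, p_exact = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, exact p = {p_exact:.3f}")
```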

SOFTWARE AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are available currently. The commonly used systems are the Statistical Package for the Social Sciences (SPSS, from IBM Corporation), Statistical Analysis System (SAS, developed by SAS Institute, North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman of the R core team), Minitab (developed by Minitab Inc.), Stata (developed by StataCorp) and MS Excel (developed by Microsoft).

There are a number of web resources which are related to statistical power analyses. A few are:

  • StatPages.net – provides links to a number of online power calculators
  • G-Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates power, or the sample size needed to attain a given power, for one effect in a factorial ANOVA design
  • SPSS makes a program called SamplePower. It outputs a complete report on the computer screen which can be cut and pasted into another document.

It is important that a researcher knows the concepts of the basic statistical methods used for conduct of a research study. This will help to conduct an appropriately well-designed study leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, an adequate knowledge of statistics and the appropriate use of statistical tests are important. An appropriate knowledge about the basic statistical methods will go a long way in improving the research designs and producing quality medical research which can be utilised for formulating the evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest

There are no conflicts of interest.


Unit of Analysis: Definition, Types & Examples


The unit of analysis is the people or things whose qualities will be measured. It is an essential part of a research project: the main thing a researcher examines in a study.

A unit of analysis is the object about which you hope to have something to say at the end of your analysis, perhaps the major subject of your research.

In this blog post, we will explore and clarify the concept of the “unit of analysis,” including its definition, various types, and a concluding perspective on its significance.

What is a unit of analysis?

A unit of analysis is the thing you want to discuss after your research, probably what you would regard to be the primary emphasis of your research.

The researcher plans to comment on the primary topic or object in the research as a unit of analysis. The research question plays a significant role in determining it. The “who” or “what” that the researcher is interested in investigating is, to put it simply, the unit of analysis.

In Man, the State, and War (first published in 1959), Kenneth Waltz divides the study of war into three distinct levels of analysis: the individual, the state, and the international system.

Understanding the reasoning behind the unit of analysis is vital. The likelihood of fruitful research increases if the rationale is understood. An individual, group, organization, nation, social phenomenon, etc., are a few examples.


Types of “unit of analysis”

In business research, there are almost unlimited types of possible analytical units. Even though the most typical unit of analysis is the individual, many research questions can be answered more precisely by looking at other types of units. Let's find out.

1. Individual Level

The most prevalent unit of analysis in business research is the individual. These are the primary analytical units. The researcher may be interested in looking into:

  • Employee actions
  • Perceptions
  • Attitudes or opinions.

Employees may come from wealthy or low-income families, as well as from rural or metropolitan areas.

A researcher might investigate if personnel from rural areas are more likely to arrive on time than those from urban areas. Additionally, he can check whether workers from rural areas who come from poorer families arrive on time compared to those from rural areas who come from wealthy families.

In each case, the individual (employee) serves as the analytical unit. Analyzing employees as the unit of analysis can shed light on business issues, including customer and human resource behavior.

For example, employee work satisfaction and consumer purchasing patterns impact business, making research into these topics vital.

Psychologists typically concentrate on research on individuals. This research may significantly aid a firm’s success, as individuals’ knowledge and experiences reveal vital information. Thus, individuals are heavily utilized in business research.

2. Aggregates Level

Social science research does not always focus on individuals. By combining individuals' responses, social scientists frequently describe and explain social interactions, communities, and groupings. They also study collectives of individuals, including communities, groups, and countries.

Aggregate levels can be divided into Groups (groups with an ad hoc structure) and Organizations (groups with a formal organization).

The following levels of the unit of analysis are made up of groups of people. A group is defined as two or more individuals who interact, share common traits, and feel connected to one another. 

Many definitions also emphasize interdependence or objective resemblance (Turner, 1982; Platow, Grace, & Smithson, 2011) and those who identify as group members (Reicher, 1982) .

As a result, society and gangs serve as examples of groups. According to Webster’s Online Dictionary (2012), they can resemble some clubs but be far less formal.

Siblings, identical twins, family, and small group functioning are examples of studies with many units of analysis.

In such circumstances, a whole group might be compared to another. Families, gender-specific groups, pals, Facebook groups, and work departments can all be groups.

By analyzing groups, researchers can learn how they form and how age, experience, class, and gender affect them. When aggregated, an individual’s data describes the group they belong to.

Sociologists study groups, as do economists and businesspeople who form teams to complete projects. They continually research groups and group behavior.

Organizations

The next level of the unit of analysis is organizations, which are groups of people set up formally. Organizations could include businesses, religious groups, parts of the military, colleges, academic departments, supermarkets, business groups, and so on.

Social organization includes features such as sex composition, styles of leadership, organizational structure, systems of communication, and so on (Susan & Wheelan, 2005; Chapais & Berman, 2004). Lim, Putnam, and Robert (2010) note that well-known social organizations include religious institutions.

Moody, White, and Douglas (2003) say social organizations are hierarchical. Hasmath, Hildebrandt, and Hsu (2016) say social organizations can take different forms; for example, they can be created by institutions such as schools or governments.

Sociology, economics, political science, psychology, management, and organizational communication are some social science fields that study organizations (Douma & Schreuder, 2013) .

Organizations are different from groups in that they are more formal and have better organization. A researcher might want to study a company to generalize its results to the whole population of companies.

One way to look at an organization is by the number of employees, the net annual revenue, the net assets, the number of projects, and so on. He might want to know if big companies hire more or fewer women than small companies.

Organization researchers might be interested in how companies like Reliance, Amazon, and HCL affect our social and economic lives. People who work in business often study business organizations.


3. Social Level

The social level has two types:

Social Artifacts Level

Things are studied alongside humans. Social artifacts are human-made objects from diverse communities. Social artifacts are items, representations, assemblages, institutions, knowledge, and conceptual frameworks used to convey, interpret, or achieve a goal (IGI Global, 2017).

Cultural artifacts are anything humans generate that reveals their culture (Watts, 1981).

Social artifacts include books, newspapers, advertising, websites, technical devices, films, photographs, paintings, clothes, poems, jokes, students' late excuses, scientific breakthroughs, furniture, machines, structures, and so on. The list is effectively endless.

Humans build social artifacts to support social behavior. Just as people or groups imply a population in business research, each social artifact implies a class of similar items.

Objects of the same class include business books, magazines, articles, and case studies. A research study might characterize a business magazine by its number of articles, publication frequency, price, content, and editor.

The population of related magazines might then be evaluated for description and explanation. Marx W. Wartofsky (1979) defined artifacts as primary artifacts utilized in production (like a camera), secondary artifacts connected to primary artifacts (like a camera user manual), and tertiary artifacts related to representations of secondary artifacts (like a sculpture of a camera user manual).

The scientific study of an artifact reveals its creators and users. The artifact researcher may be interested in advertising, marketing, distribution, buying, etc.

Social Interaction Level

Social artifacts include social interaction. Such as:

  • Eye contact with a coworker
  • Buying something in a store
  • Friendship decisions
  • Road accidents
  • Airline hijackings
  • Professional counseling
  • WhatsApp messaging

A researcher might study young employees' smartphone addictions. Some addictions may involve social media, while others involve online games and movies that inhibit real-world connection.

Smartphone addictions are examined as a societal phenomenon. Observation units are probably individuals (employees).

Anthropologists typically study social artifacts. They may be interested in the social order. A researcher who examines social interactions may be interested in how broader societal structures and factors impact daily behavior, festivals, and weddings.


Even though there is no perfect way to do research, it is generally agreed that researchers should try to find a unit of analysis that keeps the context needed to make sense of the data.

Researchers should consider the details of their research when deciding on the unit of analysis. 

They should remember that consistent use of these units throughout the analysis process (from coding to developing categories and themes to interpreting the data) is essential to gaining insight from qualitative data and protecting the reliability of the results.

QuestionPro does much more than merely serve as survey software. We have a solution for every sector of the economy and every kind of issue. We also have systems for managing data, such as our research repository, Insights Hub.



Thematic Analysis: A Step by Step Guide

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


What is Thematic Analysis?

Thematic analysis is a qualitative research method used to identify, analyze, and interpret patterns of shared meaning (themes) within a given data set, which can be in the form of interviews, focus group discussions, surveys, or other textual data.

Thematic analysis is a useful method for research seeking to understand people’s views, opinions, knowledge, experiences, or values from qualitative data.

This method is widely used in various fields, including psychology, sociology, and health sciences.

Thematic analysis minimally organizes and describes a data set in rich detail. Often, though, it goes further than this and interprets aspects of the research topic.

Key aspects of Thematic Analysis include:

  • Flexibility: It can be adapted to suit the needs of various studies, providing a rich and detailed account of the data.
  • Coding: The process involves assigning labels or codes to specific segments of the data that capture a single idea or concept relevant to the research question.
  • Themes: Representing a broader level of analysis, encompassing multiple codes that share a common underlying meaning or pattern. They provide a more abstract and interpretive understanding of the data.
  • Iterative process: Thematic analysis is a recursive process that involves constantly moving back and forth between the coded extracts, the entire data set, and the thematic analysis being produced.
  • Interpretation: The researcher interprets the identified themes to make sense of the data and draw meaningful conclusions.

It’s important to note that the types of thematic analysis are not mutually exclusive, and researchers may adopt elements from different approaches depending on their research questions, goals, and epistemological stance.

The choice of approach should be guided by the research aims, the nature of the data, and the philosophical assumptions underpinning the study.

| Feature | Coding Reliability TA | Codebook TA | Reflexive TA |
|---|---|---|---|
| How themes are conceptualized | Conceptualized as topic summaries of the data | Typically conceptualized as topic summaries | Conceptualized as patterns of shared meaning underpinned by a central organizing concept |
| Coding process | Uses a coding frame or codebook, which may be predetermined or generated from the data, to find evidence for themes or allocate data to predefined topics. Ideally, two or more researchers apply the coding frame separately to the data to avoid contamination | Typically involves early theme development and the use of a codebook and a structured approach to coding | An active process in which codes are developed from the data through the analysis; the researcher's subjectivity shapes the coding and theme development process |
| Research values | Emphasizes securing the reliability and accuracy of data coding, reflecting (post)positivist research values; prioritizes minimizing subjectivity and maximizing objectivity in coding | Combines elements of coding reliability and reflexive TA, but qualitative values tend to predominate; the "accuracy" or "reliability" of coding is not a primary concern | Emphasizes the researcher's role in knowledge construction and acknowledges that their subjectivity shapes the research process and outcomes |
| Typical uses | Research where minimizing subjectivity and maximizing objectivity in coding are highly valued | Applied research, particularly when information needs are predetermined, deadlines are tight, and teams are large and may include qualitative novices; pragmatic concerns often drive its use | Exploring complex research issues where the researcher's active role in knowledge construction is acknowledged and valued; suits a wide range of data, including interview transcripts, focus groups, and policy documents |
| When themes are developed | Often predetermined or generated early, either prior to data analysis or after some familiarization with the data | Typically developed early in the analysis process | Developed later in the analytic process, emerging from the coded data |
| Researcher subjectivity | Minimized, aiming for objectivity in coding | Acknowledged, though structured coding methods are used | Viewed as a valuable resource that inevitably shapes the research findings |

1. Coding Reliability Thematic Analysis

Coding reliability TA emphasizes using coding techniques to achieve reliable and accurate data coding, which reflects (post)positivist research values.

This approach emphasizes the reliability and replicability of the coding process. It involves multiple coders independently coding the data using a predetermined codebook.

The goal is to achieve a high level of agreement among the coders, which is often measured using inter-rater reliability metrics.

This approach often involves a coding frame or codebook determined in advance or generated after familiarization with the data.

In this type of TA, two or more researchers apply a fixed coding frame to the data, ideally working separately.

Some researchers even suggest that at least some coders should be unaware of the research question or area of study to prevent bias in the coding process.

Statistical tests are used to assess the level of agreement between coders, or the reliability of coding. Any differences in coding between researchers are resolved through consensus.

This approach is more suitable for research questions that require a more structured and reliable coding process, such as in content analysis or when comparing themes across different data sets.
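For illustration, inter-rater agreement between two coders can be quantified with Cohen's kappa. The minimal Python sketch below uses scikit-learn; the extracts and code labels are invented, not drawn from any real study:

```python
# A minimal, hypothetical check of inter-rater agreement between two coders.
# The code labels are invented; scikit-learn provides the kappa metric.
from sklearn.metrics import cohen_kappa_score

# Codes that two researchers independently assigned to the same ten extracts
coder_a = ["fear", "support", "fear", "stigma", "support",
           "fear", "stigma", "support", "fear", "stigma"]
coder_b = ["fear", "support", "stigma", "stigma", "support",
           "fear", "stigma", "fear", "fear", "stigma"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values above ~0.8 are often read as strong agreement
```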

2. Codebook Thematic Analysis

Codebook TA, such as template, framework, and matrix analysis, combines elements of coding reliability TA and reflexive TA.

Codebook TA, while employing structured coding methods like those used in coding reliability TA, generally prioritizes qualitative research values, such as reflexivity.

In this approach, the researcher develops a codebook based on their initial engagement with the data. The codebook contains a list of codes, their definitions, and examples from the data.

The codebook is then used to systematically code the entire data set. This approach allows for a more detailed and nuanced analysis of the data, as the codebook can be refined and expanded throughout the coding process.

It is particularly useful when the research aims to provide a comprehensive description of the data set.

Codebook TA is often chosen for pragmatic reasons in applied research, particularly when there are predetermined information needs, strict deadlines, and large teams with varying levels of qualitative research experience.

The use of a codebook in this context helps to map the developing analysis, which is thought to improve teamwork, efficiency, and the speed of output delivery.
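As an illustration only, the bookkeeping behind a codebook can be sketched in a few lines of Python; the codes, definitions, and example extracts below are hypothetical:

```python
# A minimal, hypothetical codebook: each code has a definition and example extracts.
codebook = {
    "financial strain": {
        "definition": "References to difficulty paying for tuition, rent, or bills",
        "examples": ["I took a second job just to cover textbooks."],
    },
    "family expectations": {
        "definition": "Pressure or encouragement from family about studying",
        "examples": ["My parents never went to college, so this means everything to them."],
    },
}

# Coders consult the definitions while working through the data set
for code, entry in codebook.items():
    print(f"{code}: {entry['definition']}")
```

In practice the same structure is usually kept in a spreadsheet or a qualitative analysis package; the point is simply that every code travels with a definition and examples.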

3. Reflexive Thematic Analysis

This approach emphasizes the role of the researcher in the analysis process. It acknowledges that the researcher’s subjectivity, theoretical assumptions, and interpretative framework shape the identification and interpretation of themes.

In reflexive TA, analysis starts with coding after data familiarization. Unlike other TA approaches, there is no codebook or coding frame. Instead, researchers develop codes as they work through the data.

As their understanding grows, codes can change to reflect new insights—for example, they might be renamed, combined with other codes, split into multiple codes, or have their boundaries redrawn.

If multiple researchers are involved, differences in coding are explored to enhance understanding, not to reach a consensus. Even finalized coding remains open to new insights and recoding.

Reflexive thematic analysis involves a more organic and iterative process of coding and theme development. The researcher continuously reflects on their role in the research process and how their own experiences and perspectives might influence the analysis.

This approach is particularly useful for exploratory research questions and when the researcher aims to provide a rich and nuanced interpretation of the data.

Six Steps Of Thematic Analysis

The process is characterized by a recursive movement between the different phases, rather than a strict linear progression.

This means that researchers might revisit earlier phases as their understanding of the data evolves, constantly refining their analysis.

For instance, during the reviewing and developing themes phase, researchers may realize that their initial codes don’t effectively capture the nuances of the data and might need to return to the coding phase. 

This back-and-forth movement continues throughout the analysis, ensuring a thorough and evolving understanding of the data.


Step 1: Familiarization With the Data

Familiarization is crucial, as it helps researchers figure out the type (and number) of themes that might emerge from the data.

Familiarization involves immersing yourself in the data by reading and rereading textual data items, such as interview transcripts or survey responses.

You should read through the entire data set at least once, and possibly multiple times, until you feel intimately familiar with its content.

  • Read and re-read the data: The researcher reads through the entire data set (e.g., interview transcripts, survey responses, or field notes) multiple times to gain a comprehensive understanding of its breadth and depth. This helps the researcher develop a holistic sense of the participants' experiences, perspectives, and the overall narrative of the data.
  • Listen to the audio recordings of the interviews: This helps you pick up on tone, emphasis, and emotional responses that may not be evident in the written transcripts; for instance, you might note a participant's hesitation or excitement when discussing a particular topic. This is an especially important step if you didn't collect the data or transcribe it yourself.
  • Take notes on initial ideas and observations: Note-making at this stage should be observational and casual, not systematic and inclusive, as you aren't coding yet. Think of the notes as memory aids and triggers for later coding and analysis. They are primarily for you, although they might be shared with research team members.
  • Immerse yourself in the data to gain a deep understanding of its content: This is not about just absorbing surface meaning, as you would with a novel, but about thinking about what the data mean.

By the end of the familiarization step, the researcher should have a good grasp of the overall content of the data, the key issues and experiences discussed by the participants, and any initial patterns or themes that emerge.

This deep engagement with the data sets the stage for the subsequent steps of thematic analysis, where the researcher will systematically code and analyze the data to identify and interpret the central themes.

Step 2: Generating Initial Codes

Codes are concise labels or descriptions assigned to segments of the data that capture a specific feature or meaning relevant to the research question.

The process of qualitative coding helps the researcher organize and reduce the data into manageable chunks, making it easier to identify patterns and themes relevant to the research question.

Think of it this way:  If your analysis is a house, themes are the walls and roof, while codes are the individual bricks and tiles.

Coding is an iterative process, with researchers refining and revising their codes as their understanding of the data evolves.

The ultimate goal is to develop a coherent and meaningful coding scheme that captures the richness and complexity of the participants’ experiences and helps answer the research questions.

Coding can be done manually (e.g., printed transcripts with a pen or highlighter) or by means of software (e.g., NVivo, MAXQDA, or ATLAS.ti).
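Outside of dedicated packages, the basic bookkeeping of coding, attaching one or more labels to a span of text, can be sketched in plain Python. Everything in this example (sources, extracts, codes) is invented for illustration and is not any tool's actual data format:

```python
# Hypothetical illustration of the bookkeeping behind qualitative coding:
# each coded segment records its source, the extract, and one or more codes.
coded_segments = [
    {"source": "interview_01", "extract": "I was terrified to tell my flatmates.",
     "codes": ["fear of reactions"]},
    {"source": "interview_01", "extract": "Once I came out, things got easier.",
     "codes": ["coming out", "relief"]},
    {"source": "interview_02", "extract": "I still avoid the topic at work.",
     "codes": ["fear of reactions", "concealment"]},
]

# Collate all extracts that share a given code
def extracts_for(code):
    return [seg["extract"] for seg in coded_segments if code in seg["codes"]]

print(extracts_for("fear of reactions"))
```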


Decide On Your Coding Approach

  • Will you use predefined deductive codes (based on theory or prior research), or let codes emerge from the data (inductive coding)?
  • Will a piece of data have one code or multiple?
  • Will you code everything or selectively? Broader research questions may warrant coding more comprehensively.

If you decide not to code everything, it’s crucial to:

  • Have clear criteria for what you will and won’t code
  • Be transparent about your selection process in research reports
  • Remain open to revisiting uncoded data later in analysis

Do A First Round Of Coding

  • Go through the data and assign initial codes to chunks that stand out
  • Create a code name (a word or short phrase) that captures the essence of each chunk
  • Keep a codebook – a list of your codes with descriptions or definitions
  • Be open to adding, revising or combining codes as you go

After generating your first code, compare each new data extract to see if an existing code applies or a new one is needed.

Coding can be done at two levels of meaning:

  • Semantic:  Provides a concise summary of a portion of data, staying close to the content and the participant’s meaning. For example, “Fear/anxiety about people’s reactions to his sexuality.”
  • Latent:  Goes beyond the participant’s meaning to provide a conceptual interpretation of the data. For example, “Coming out imperative” interprets the meaning behind a participant’s statement.

Most codes will be a mix of descriptive and conceptual. Novice coders tend to generate more descriptive codes initially, developing more conceptual approaches with experience.

This step ends when:

  • All data is fully coded.
  • Data relevant to each code has been collated.

It also ends when you have enough codes to capture the data's diversity and patterns of meaning, with most codes appearing across multiple data items.

The number of codes you generate will depend on your topic, data set, and coding precision.

Step 3: Searching for Themes

Searching for themes begins after all data has been initially coded and collated, resulting in a comprehensive list of codes identified across the data set.

This step involves shifting from the specific, granular codes to a broader, more conceptual level of analysis.

Thematic analysis is not about “discovering” themes that already exist in the data, but rather actively constructing or generating themes through a careful and iterative process of examination and interpretation.

1. Collating codes into potential themes:

The process of collating codes into potential themes involves grouping codes that share a unifying feature or represent a coherent and meaningful pattern in the data.

The researcher looks for patterns, similarities, and connections among the codes to develop overarching themes that capture the essence of the data.

By the end of this step, the researcher will have a collection of candidate themes and sub-themes, along with their associated data extracts.

However, these themes are still provisional and will be refined in the next step of reviewing the themes.

The searching for themes step helps the researcher move from a granular, code-level analysis to a more conceptual, theme-level understanding of the data.

This process is similar to sculpting, where the researcher shapes the “raw” data into a meaningful analysis.

This involves grouping codes that share a unifying feature or represent a coherent pattern in the data:
  • Review the list of initial codes and their associated data extracts
  • Look for codes that seem to share a common idea or concept
  • Group related codes together to form potential themes
  • Some codes may form main themes, while others may be sub-themes or may not fit into any theme

Thematic maps can help visualize the relationship between codes and themes. These visual aids provide a structured representation of the emerging patterns and connections within the data, aiding in understanding the significance of each theme and its contribution to the overall research question.

Example: Studying first-generation college students, the researcher might notice that the codes "financial challenges," "working part-time," and "scholarships" all relate to the broader theme of "Financial Obstacles and Support."
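As a rough illustration of this grouping step, candidate themes can be represented as a simple mapping from theme names to codes. The Python sketch below extends the hypothetical first-generation-student example; the themes and codes are invented:

```python
# Hypothetical candidate-theme map: each theme groups codes with a shared meaning.
candidate_themes = {
    "Financial Obstacles and Support": [
        "financial challenges", "working part-time", "scholarships",
    ],
    "Academic Challenges": [
        "imposter syndrome", "balancing work and school",
    ],
}

# Codes that have not yet found a home can sit in a miscellaneous pile
unassigned_codes = ["campus housing", "commuting"]

for theme, codes in candidate_themes.items():
    print(f"{theme}: {', '.join(codes)}")
```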

Shared Meaning vs. Shared Topic in Thematic Analysis

Braun and Clarke distinguish between two different conceptualizations of themes: topic summaries and shared meaning.

  • Topic summary themes , which they consider to be underdeveloped, are organized around a shared topic but not a shared meaning, and often resemble “buckets” into which data is sorted.
  • Shared meaning themes  are patterns of shared meaning underpinned by a central organizing concept.

When grouping codes into themes, it's crucial to ensure they share a central organizing concept or idea, reflecting a shared meaning rather than just belonging to the same topic.

Thematic analysis aims to uncover patterns of shared meaning within the data that offer insights into the research question.

For example, codes centered around the concept of “Negotiating Sexual Identity” might not form one comprehensive theme, but rather two distinct themes: one related to “coming out and being out” and another exploring “different versions of being a gay man.”

Avoid: Themes as Topic Summaries (Shared Topic)

In this approach, themes simply summarize what participants mentioned about a particular topic, without necessarily revealing a unified meaning.

These themes are often underdeveloped and lack a central organizing concept.

It’s crucial to avoid creating themes that are merely summaries of data domains or directly reflect the interview questions. 

Example : A theme titled “Incidents of homophobia” that merely describes various participant responses about homophobia without delving into deeper interpretations would be a topic summary theme.

Tip: Using interview questions as theme titles without further interpretation, or relying on generic social functions ("social conflict") or structural elements ("economics") as themes, often indicates a lack of shared meaning and thorough theme development. Such themes might lack a clear connection to the specific dataset.

Ensure: Themes as Shared Meaning

Instead, themes should represent a deeper level of interpretation, capturing the essence of the data and providing meaningful insights into the research question.

These themes go beyond summarizing a topic by identifying a central concept or idea that connects the codes.

They reflect a pattern of shared meaning across different data points, even if those points come from different topics.

Example : The theme “‘There’s always that level of uncertainty’: Compulsory heterosexuality at university” effectively captures the shared experience of fear and uncertainty among LGBT students, connecting various codes related to homophobia and its impact on their lives.

2. Gathering data relevant to each potential theme

Once a potential theme is identified, all coded data extracts associated with the codes grouped under that theme are collated. This ensures a comprehensive view of the data pertaining to each theme.

This involves reviewing the collated data extracts for each code and organizing them under the relevant themes.

For example, if you have a potential theme called “Student Strategies for Test Preparation,” you would gather all data extracts that have been coded with related codes, such as “Time Management for Test Preparation” or “Study Groups for Test Preparation”.

You can then begin reviewing the data extracts for each theme to see if they form a coherent pattern. 

This step helps to ensure that your themes accurately reflect the data and are not based on your own preconceptions.

It’s important to remember that coding is an organic and ongoing process.

You may need to re-read your entire data set to see if you have missed any data that is relevant to your themes, or if you need to create any new codes or themes.

The researcher should ensure that the data extracts within each theme are coherent and meaningful.

Example : The researcher would gather all the data extracts related to “Financial Obstacles and Support,” such as quotes about struggling to pay for tuition, working long hours, or receiving scholarships.

Here’s a more detailed explanation of how to gather data relevant to each potential theme:

  • Start by creating a visual representation of your potential themes, such as a thematic map or table
  • List each potential theme and its associated sub-themes (if any)
  • This will help you organize your data and see the relationships between themes
  • Go through your coded data extracts (e.g., highlighted quotes or segments from interview transcripts)
  • For each coded extract, consider which theme or sub-theme it best fits under
  • If a coded extract seems to fit under multiple themes, choose the theme that it most closely aligns with in terms of shared meaning
  • As you identify which theme each coded extract belongs to, copy and paste the extract under the relevant theme in your thematic map or table
  • Include enough context around each extract to ensure its meaning is clear
  • If using qualitative data analysis software, you can assign the coded extracts to the relevant themes within the software
  • As you gather data extracts under each theme, continuously review the extracts to ensure they form a coherent pattern
  • If some extracts do not fit well with the rest of the data in a theme, consider whether they might better fit under a different theme or if the theme needs to be refined
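Mechanically, collating coded extracts under each candidate theme is a grouping operation. The minimal Python sketch below assumes invented code-to-extract and theme-to-code mappings in the spirit of the earlier hypothetical examples:

```python
# Hypothetical sketch: collate coded extracts under each candidate theme.
coded_extracts = {
    "financial challenges": ["Tuition wiped out my savings."],
    "working part-time": ["I work 20 hours a week on top of classes."],
    "scholarships": ["The scholarship is the only reason I'm still enrolled."],
}
candidate_themes = {
    "Financial Obstacles and Support": [
        "financial challenges", "working part-time", "scholarships",
    ],
}

# Group every extract under the theme its code belongs to
extracts_by_theme = {
    theme: [ex for code in codes for ex in coded_extracts.get(code, [])]
    for theme, codes in candidate_themes.items()
}
print(extracts_by_theme["Financial Obstacles and Support"])
```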

3. Considering relationships between codes, themes, and different levels of themes

Once you have gathered all the relevant data extracts under each theme, review the themes to ensure they are meaningful and distinct.

This step involves analyzing how different codes combine to form overarching themes and exploring the hierarchical relationship between themes and sub-themes.

Within a theme, there can be different levels of themes, often organized hierarchically as main themes and sub-themes.

  • Main themes  represent the most overarching or significant patterns found in the data. They provide a high-level understanding of the key issues or concepts present in the data. 
  • Sub-themes , as the name suggests, fall under main themes, offering a more nuanced and detailed understanding of a particular aspect of the main theme.

The process of developing these relationships is iterative and involves:

  • Creating a Thematic Map : The relationship between codes, sub-themes and main themes can be visualized using a thematic map, diagram, or table. Refine the thematic map as you continue to review and analyze the data.
  • Examine how the codes and themes relate to each other : Some themes may be more prominent or overarching (main themes), while others may be secondary or subsidiary (sub-themes).
  • Refining Themes : This map helps researchers review and refine themes, ensuring they are internally consistent (homogeneous) and distinct from other themes (heterogeneous).
  • Defining and Naming Themes : Finally, themes are given clear and concise names and definitions that accurately reflect the meaning they represent in the data.

[Figure: Thematic map of qualitative data from focus groups]

Consider how the themes tell a coherent story about the data and address the research question.

If some themes seem to overlap or are not well-supported by the data, consider combining or refining them.

If a theme is too broad or diverse, consider splitting it into separate themes or sub-themes.

Example : The researcher might identify “Academic Challenges” and “Social Adjustment” as other main themes, with sub-themes like “Imposter Syndrome” and “Balancing Work and School” under “Academic Challenges.” They would then consider how these themes relate to each other and contribute to the overall understanding of first-generation college students’ experiences.

Step 4: Reviewing Themes

The researcher reviews, modifies, and develops the preliminary themes identified in the previous step.

This phase involves a recursive process of checking the themes against the coded data extracts and the entire data set to ensure they accurately reflect the meanings evident in the data.

The purpose is to refine the themes, ensuring they are coherent, consistent, and distinctive.

According to Braun and Clarke, a well-developed theme “captures something important about the data in relation to the research question and represents some level of patterned response or meaning within the data set”.

A well-developed theme will:

  • Go beyond paraphrasing the data to analyze the meaning and significance of the patterns identified.
  • Provide a detailed analysis of what the theme is about.
  • Be supported with a good amount of relevant data extracts.
  • Be related to the research question.

Revisions at this stage might involve creating new themes, refining existing themes, or discarding themes that do not fit the data.

Level One: Reviewing Themes Against Coded Data Extracts

  • Researchers begin by comparing their candidate themes against the coded data extracts associated with each theme.
  • This step helps to determine whether each theme is supported by the data and whether it accurately reflects the meaning found in the extracts. Determine if there is enough data to support each theme.
  • Look at the relationships between themes and sub-themes in the thematic map. Consider whether the themes work together to tell a coherent story about the data. If the thematic map does not effectively represent the data, consider making adjustments to the themes or their organization.
  • It’s important to ensure that each theme has a singular focus and is not trying to encompass too much. Themes should be distinct from one another, although they may build on or relate to each other.
  • Discarding codes : If certain codes within a theme are not well-supported or do not fit, they can be removed.
  • Relocating codes : Codes that fit better under a different theme can be moved.
  • Redrawing theme boundaries : The scope of a theme can be adjusted to better capture the relevant data.
  • Discarding themes : Entire themes can be abandoned if they do not work.

Level Two: Evaluating Themes Against the Entire Data Set

  • Once the themes appear coherent and well-supported by the coded extracts, researchers move on to evaluate them against the entire data set.
  • This involves a final review of all the data to ensure that the themes accurately capture the most important and relevant patterns across the entire dataset in relation to the research question.
  • During this level, researchers may need to recode some extracts for consistency, especially if the coding process evolved significantly, and earlier data items were not recoded according to these changes.

Step 5: Defining and Naming Themes

The themes are finalized when the researcher is satisfied with the theme names and definitions.

If the analysis is carried out by a single researcher, it is recommended to seek feedback from an external expert to confirm that the themes are well-developed, clear, distinct, and capture all the relevant data.

Defining themes  means determining the exact meaning of each theme and understanding how it contributes to understanding the data.

This process involves formulating exactly what we mean by each theme. The researcher should consider what a theme says, if there are subthemes, how they interact and relate to the main theme, and how the themes relate to each other.

Themes should not be overly broad or try to encompass too much, and should have a singular focus. They should be distinct from one another and not repetitive, although they may build on one another.

In this phase the researcher specifies the essence of each theme.

  • What does the theme tell us that is relevant for the research question?
  • How does it fit into the ‘overall story’ the researcher wants to tell about the data?

Naming themes involves developing a clear and concise name that effectively conveys the essence of each theme to the reader. A good theme name is informative, concise, and catchy.

  • The researcher develops concise, punchy, and informative names for each theme that effectively communicate its essence to the reader.
  • Theme names should be catchy and evocative, giving the reader an immediate sense of what the theme is about.
  • Avoid using jargon or overly complex language in theme names.
  • The name should go beyond simply paraphrasing the content of the data extracts and instead interpret the meaning and significance of the patterns within the theme.
  • The goal is to make the themes accessible and easily understandable to the intended audience. If a theme contains sub-themes, the researcher should also develop clear and informative names for each sub-theme.
  • Theme names can include direct quotations from the data, which helps convey the theme's meaning. However, researchers should avoid using data collection questions as theme names, since doing so often produces summaries of topics rather than fully realized themes.

For example, “‘There’s always that level of uncertainty’: Compulsory heterosexuality at university” is a strong theme name because it captures the theme’s meaning. In contrast, “incidents of homophobia” is a weak theme name because it only states the topic.

For instance, a theme labeled “distrust of experts” might be renamed “distrust of authority” or “conspiracy thinking” after careful consideration of the theme’s meaning and scope.

Step 6: Producing the Report

A thematic analysis report should provide a convincing and clear, yet complex story about the data that is situated within a scholarly field.

A balance should be struck between the narrative and the data presented, ensuring that the report convincingly explains the meaning of the data, not just summarizes it.

To achieve this, the report should include vivid, compelling data extracts illustrating the themes and incorporate extracts from different data sources to demonstrate the themes’ prevalence and strengthen the analysis by representing various perspectives within the data.

The report is typically written in the first person and active voice, unless the reporting requirements state otherwise.

The analysis can be presented in two ways:

  • Integrated Results and Discussion section: This approach is suitable when the analysis has strong connections to existing research and when the analysis is more theoretical or interpretive.
  • Separate Discussion section: This approach presents the data interpretation separately from the results.

Regardless of the presentation style, researchers should aim to "show" what the data reveals and "tell" the reader what it means in order to create a convincing analysis.

  • Presentation order of themes: Consider how best to structure the presentation of the themes in the report. This may involve presenting the themes in order of importance, chronologically, or in a way that tells a coherent story.
  • Subheadings: Use subheadings to clearly delineate each theme and its sub-themes, making the report easy to navigate and understand.

The analysis should go beyond a simple summary of participants' words and instead interpret the meaning of the data.

Themes should connect logically and meaningfully and, if relevant, should build on previous themes to tell a coherent story about the data.

Although it is tempting to rely on a single source when it eloquently expresses a particular aspect of a theme, drawing extracts from multiple data sources strengthens the analysis by representing a wider range of perspectives within the data.

Potential Pitfalls to Avoid

  • Failing to analyze the data : Thematic analysis should involve more than simply presenting data extracts without an analytic narrative. The researcher must provide an interpretation and make sense of the data, telling the reader what it means and how it relates to the research questions.
  • Using data collection questions as themes : Themes should be identified across the entire dataset, not just based on the questions asked during data collection. Reporting data collection questions as themes indicates a lack of thorough analytic work to identify patterns and meanings in the data.
  • Conducting a weak or unconvincing analysis : Themes should be distinct, internally coherent, and consistent, capturing the majority of the data or providing a rich description of specific aspects. A weak analysis may have overlapping themes, fail to capture the data adequately, or lack sufficient examples to support the claims made.
  • Mismatch between data and analytic claims : The researcher’s interpretations and analytic points must be consistent with the data extracts presented. Claims that are not supported by the data, contradict the data, or fail to consider alternative readings or variations in the account are problematic.
  • Misalignment between theory, research questions, and analysis : The interpretations of the data should be consistent with the theoretical framework used. For example, an experiential framework would not typically make claims about the social construction of the topic. The form of thematic analysis used should also align with the research questions.
  • Neglecting to clarify assumptions, purpose, and process : A good thematic analysis should spell out its theoretical assumptions, clarify how it was undertaken, and for what purpose. Without this crucial information, the analysis is lacking context and transparency, making it difficult for readers to evaluate the research.

Reducing Bias

When researchers are both reflexive and transparent in their thematic analysis, it strengthens the trustworthiness and rigor of their findings.

The explicit acknowledgement of potential biases and the detailed documentation of the analytical process provide a stronger foundation for the interpretation of the data, making it more likely that the findings reflect the perspectives of the participants rather than the biases of the researcher.

Reflexivity

Reflexivity, which involves critically examining one's own assumptions and biases, is crucial in qualitative research to ensure the trustworthiness of findings.

It requires acknowledging that researcher subjectivity is inherent in the research process and can influence how data is collected, analyzed, and interpreted.

Identifying and Challenging Assumptions:

Reflexivity encourages researchers to explicitly acknowledge their preconceived notions, theoretical leanings, and potential biases.

By actively reflecting on how these factors might influence their interpretation of the data, researchers can take steps to mitigate their impact.

This might involve seeking alternative explanations, considering contradictory evidence, or discussing their interpretations with others to gain different perspectives.

Transparency

Transparency refers to clearly documenting the research process, including coding decisions, theme development, and the rationale behind analytic choices.

This openness helps ensure the trustworthiness and rigor of the findings, allowing others to understand how the analysis was conducted, assess its credibility, and potentially replicate it.

Documenting Decision-Making:

Transparency requires researchers to provide a clear and detailed account of their analytical choices throughout the research process.

This includes documenting the rationale behind coding decisions, the process of theme development, and any changes made to the analytical approach during the study.

By making these decisions transparent, researchers allow others to scrutinize their work and assess the potential for bias.

Practical Strategies for Reflexivity and Transparency in Thematic Analysis:

  • Maintaining a reflexive journal:  Researchers can keep a journal throughout the research process to document their thoughts, assumptions, and potential biases. This journal serves as a record of the researcher’s evolving understanding of the data and can help identify potential blind spots in their analysis.
  • Engaging in team-based analysis:  Collaborative analysis, involving multiple researchers, can enhance reflexivity by providing different perspectives and interpretations of the data. Discussing coding decisions and theme development as a team allows researchers to challenge each other’s assumptions and ensure a more comprehensive analysis.
  • Clearly articulating the analytical process:  In reporting the findings of thematic analysis, researchers should provide a detailed account of their methods, including the rationale behind coding decisions, the process of theme development, and any challenges encountered during analysis. This transparency allows readers to understand the steps taken to ensure the rigor and trustworthiness of the analysis.

Advantages

  • Flexibility:  Thematic analysis is a flexible method, making it adaptable to different research questions and theoretical frameworks. It can be employed with various epistemological approaches, including realist, constructionist, and contextualist perspectives. For example, researchers can focus on analyzing meaning across the entire data set or examine a particular aspect in depth.
  • Accessibility:  Thematic analysis is an accessible method, especially for novice qualitative researchers, as it doesn't demand extensive theoretical or technical knowledge compared to methods like Discourse Analysis (DA) or Conversation Analysis (CA). It is considered a foundational qualitative analysis method.
  • Rich Description:  Thematic analysis facilitates a rich and detailed description of data. It can provide a thorough understanding of the predominant themes in a data set, offering valuable insights, particularly in under-researched areas.
  • Theoretical Freedom:  Thematic analysis is not restricted to any pre-existing theoretical framework, allowing for diverse applications. This distinguishes it from methods like Grounded Theory or Interpretative Phenomenological Analysis (IPA), which are more closely tied to specific theoretical approaches.

Disadvantages

  • Subjectivity and Interpretation:  The flexibility of thematic analysis, while an advantage, can also be a disadvantage. The method’s openness can lead to a wide range of interpretations of the same data set, making it difficult to determine which aspects to emphasize. This potential subjectivity might raise concerns about the analysis’s reliability and consistency.
  • Limited Interpretive Power:  Unlike methods like narrative analysis or biographical approaches, thematic analysis may not capture the nuances of individual experiences or contradictions within a single account. The focus on patterns across interviews could result in overlooking unique individual perspectives.
  • Oversimplification:  Thematic analysis might oversimplify complex phenomena by focusing on common themes, potentially missing subtle but important variations within the data. If not carefully executed, the analysis may present a homogenous view of the data that doesn’t reflect the full range of perspectives.
  • Lack of Established Theoretical Frameworks:  Thematic analysis does not inherently rely on pre-existing theoretical frameworks. While this allows for inductive exploration, it can also limit the interpretive power of the analysis if not anchored within a relevant theoretical context. The absence of a theoretical foundation might make it challenging to draw meaningful and generalizable conclusions.
  • Difficulty in Higher-Phase Analysis:  While thematic analysis is relatively easy to initiate, the flexibility in its application can make it difficult to establish specific guidelines for higher-phase analysis. Researchers may find it challenging to navigate the later stages of analysis and develop a coherent and insightful interpretation of the identified themes.
  • Potential for Researcher Bias:  As with any qualitative research method, thematic analysis is susceptible to researcher bias. Researchers’ preconceived notions and assumptions can influence how they code and interpret data, potentially leading to skewed results.

Further Information

  • Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101.
  • Braun, V., & Clarke, V. (2013). Successful qualitative research: A practical guide for beginners. Sage.
  • Braun, V., & Clarke, V. (2019). Reflecting on reflexive thematic analysis. Qualitative Research in Sport, Exercise and Health, 11(4), 589–597.
  • Braun, V., & Clarke, V. (2021). One size fits all? What counts as quality practice in (reflexive) thematic analysis? Qualitative Research in Psychology, 18(3), 328–352.
  • Braun, V., & Clarke, V. (2021). To saturate or not to saturate? Questioning data saturation as a useful concept for thematic analysis and sample-size rationales. Qualitative Research in Sport, Exercise and Health, 13(2), 201–216.
  • Braun, V., & Clarke, V. (2022). Conceptual and design thinking for thematic analysis. Qualitative Psychology, 9(1), 3.
  • Braun, V., & Clarke, V. (2022b). Thematic analysis: A practical guide. Sage.
  • Braun, V., Clarke, V., & Hayfield, N. (2022). 'A starting point for your journey, not a map': Nikki Hayfield in conversation with Virginia Braun and Victoria Clarke about thematic analysis. Qualitative Research in Psychology, 19(2), 424–445.
  • Finlay, L., & Gough, B. (Eds.). (2003). Reflexivity: A practical guide for researchers in health and social sciences. Blackwell Science.
  • Gibbs, G. R. (2013). Using software in qualitative analysis. In U. Flick (Ed.), The Sage handbook of qualitative data analysis (pp. 277–294). Sage.
  • McLeod, S. (2024, May 17). Qualitative data coding. Simply Psychology. https://www.simplypsychology.org/qualitative-data-coding.html
  • Terry, G., & Hayfield, N. (2021). Essentials of thematic analysis. American Psychological Association.

Example TA Studies

  • Braun, V., Terry, G., Gavey, N., & Fenaughty, J. (2009). 'Risk' and sexual coercion among gay and bisexual men in Aotearoa/New Zealand: Key informant accounts. Culture, Health & Sexuality, 11(2), 111–124.
  • Clarke, V., & Kitzinger, C. (2004). Lesbian and gay parents on talk shows: Resistance or collusion in heterosexism? Qualitative Research in Psychology, 1(3), 195–217.


Quantitative Research – Methods, Types and Analysis


What is Quantitative Research


Quantitative research is a type of research that collects and analyzes numerical data to test hypotheses and answer research questions. This research typically involves a large sample size and uses statistical analysis to make inferences about a population based on the data collected. It often involves the use of surveys, experiments, or other structured data collection methods to gather quantitative data.

Quantitative Research Methods


Quantitative Research Methods are as follows:

Descriptive Research Design

Descriptive research design is used to describe the characteristics of a population or phenomenon being studied. This research method is used to answer the questions of what, where, when, and how. Descriptive research designs use a variety of methods such as observation, case studies, and surveys to collect data. The data is then analyzed using statistical tools to identify patterns and relationships.

Correlational Research Design

Correlational research design is used to investigate the relationship between two or more variables. Researchers use correlational research to determine whether a relationship exists between variables and to what extent they are related. This research method involves collecting data from a sample and analyzing it using statistical tools such as correlation coefficients.
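As a minimal illustration, a Pearson correlation coefficient can be computed with SciPy; the numbers below are invented:

```python
# A minimal sketch: Pearson correlation between two variables (invented data).
from scipy.stats import pearsonr

hours_studied = [2, 4, 5, 7, 8, 10, 11, 13]
exam_scores = [52, 58, 60, 68, 71, 80, 83, 90]

r, p_value = pearsonr(hours_studied, exam_scores)
print(f"r = {r:.2f}, p = {p_value:.4f}")  # r near +1 or -1 indicates a strong linear relationship
```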

Quasi-experimental Research Design

Quasi-experimental research design is used to investigate cause-and-effect relationships between variables. This research method is similar to experimental research design, but it lacks full control over the independent variable. Researchers use quasi-experimental research designs when it is not feasible or ethical to manipulate the independent variable.

Experimental Research Design

Experimental research design is used to investigate cause-and-effect relationships between variables. This research method involves manipulating the independent variable and observing the effects on the dependent variable. Researchers use experimental research designs to test hypotheses and establish cause-and-effect relationships.

Survey Research

Survey research involves collecting data from a sample of individuals using a standardized questionnaire. This research method is used to gather information on attitudes, beliefs, and behaviors of individuals. Researchers use survey research to collect data quickly and efficiently from a large sample size. Survey research can be conducted through various methods such as online, phone, mail, or in-person interviews.

Quantitative Research Analysis Methods

Here are some commonly used quantitative research analysis methods:

Statistical Analysis

Statistical analysis is the most common quantitative research analysis method. It involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis can be used to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.
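A simple, hypothetical example of such a test is comparing two group means with an independent-samples t-test in SciPy; the scores below are invented:

```python
# A minimal, hypothetical hypothesis test: do treatment and control groups
# differ in mean score? (Invented data for illustration.)
from scipy.stats import ttest_ind

control = [72, 75, 68, 70, 74, 69, 71]
treatment = [78, 82, 80, 77, 85, 79, 81]

t_stat, p_value = ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # a small p-value suggests a real difference in means
```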

Regression Analysis

Regression analysis is a statistical technique used to analyze the relationship between one dependent variable and one or more independent variables. Researchers use regression analysis to identify and quantify the impact of independent variables on the dependent variable.
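Below is a minimal sketch of a simple linear regression using the statsmodels formula interface; the variables and values are invented for illustration:

```python
# A minimal OLS regression sketch with statsmodels (invented data):
# how does advertising spend relate to sales?
import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50, 60],
    "sales": [25, 41, 58, 70, 92, 105],
})

model = smf.ols("sales ~ ad_spend", data=data).fit()
print(model.params)    # intercept and slope quantify the effect of ad_spend
print(model.rsquared)  # proportion of variance in sales explained
```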

Factor Analysis

Factor analysis is a statistical technique used to identify underlying factors that explain the correlations among a set of variables. Researchers use factor analysis to reduce a large number of variables to a smaller set of factors that capture the most important information.
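As a rough sketch, scikit-learn's FactorAnalysis can extract a small number of factors from a set of observed variables. The data here are random noise, purely to show the shape of the workflow; real use would employ actual survey responses:

```python
# A minimal factor-analysis sketch: reduce six observed survey items
# to two underlying factors. (Random data; loadings are not meaningful.)
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.normal(size=(100, 6))  # 100 respondents, 6 survey items

fa = FactorAnalysis(n_components=2)
fa.fit(responses)
print(fa.components_.shape)  # (2, 6): loadings of each item on the two factors
```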

Structural Equation Modeling

Structural equation modeling is a statistical technique used to test complex relationships between variables. It involves specifying a model that includes both observed and unobserved variables, and then using statistical methods to test the fit of the model to the data.

Time Series Analysis

Time series analysis is a statistical technique used to analyze data that is collected over time. It involves identifying patterns and trends in the data, as well as any seasonal or cyclical variations.
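For example, a series with a trend and a yearly cycle can be split into components with statsmodels' seasonal_decompose; the monthly series below is synthetic:

```python
# A minimal time-series sketch: decompose synthetic monthly data
# into trend, seasonal, and residual components.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

months = pd.date_range("2020-01", periods=48, freq="MS")
values = (np.linspace(100, 160, 48)                      # upward trend
          + 10 * np.sin(2 * np.pi * np.arange(48) / 12)  # yearly seasonality
          + np.random.default_rng(0).normal(0, 2, 48))   # noise
series = pd.Series(values, index=months)

result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())
```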

Multilevel Modeling

Multilevel modeling is a statistical technique used to analyze data that is nested within multiple levels. For example, researchers might use multilevel modeling to analyze data that is collected from individuals who are nested within groups, such as students nested within schools.
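A minimal sketch of a random-intercept multilevel model with statsmodels, using invented data for students nested within schools:

```python
# A minimal multilevel-model sketch: students (level 1) nested in schools
# (level 2), with a random intercept per school. (Invented data.)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 10, 30
df = pd.DataFrame({
    "school": np.repeat(np.arange(n_schools), n_students),
    "hours": rng.uniform(0, 10, n_schools * n_students),
})
school_effect = rng.normal(0, 5, n_schools)[df["school"]]
df["score"] = 50 + 3 * df["hours"] + school_effect + rng.normal(0, 4, len(df))

model = smf.mixedlm("score ~ hours", df, groups=df["school"]).fit()
print(model.summary())  # fixed effect of hours plus school-level variance
```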

Applications of Quantitative Research

Quantitative research has many applications across a wide range of fields. Here are some common examples:

  • Market Research : Quantitative research is used extensively in market research to understand consumer behavior, preferences, and trends. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform marketing strategies, product development, and pricing decisions.
  • Health Research: Quantitative research is used in health research to study the effectiveness of medical treatments, identify risk factors for diseases, and track health outcomes over time. Researchers use statistical methods to analyze data from clinical trials, surveys, and other sources to inform medical practice and policy.
  • Social Science Research: Quantitative research is used in social science research to study human behavior, attitudes, and social structures. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform social policies, educational programs, and community interventions.
  • Education Research: Quantitative research is used in education research to study the effectiveness of teaching methods, assess student learning outcomes, and identify factors that influence student success. Researchers use experimental and quasi-experimental designs, as well as surveys and other quantitative methods, to collect and analyze data.
  • Environmental Research: Quantitative research is used in environmental research to study the impact of human activities on the environment, assess the effectiveness of conservation strategies, and identify ways to reduce environmental risks. Researchers use statistical methods to analyze data from field studies, experiments, and other sources.

Characteristics of Quantitative Research

Here are some key characteristics of quantitative research:

  • Numerical data : Quantitative research involves collecting numerical data through standardized methods such as surveys, experiments, and observational studies. This data is analyzed using statistical methods to identify patterns and relationships.
  • Large sample size: Quantitative research often involves collecting data from a large sample of individuals or groups in order to increase the reliability and generalizability of the findings.
  • Objective approach: Quantitative research aims to be objective and impartial in its approach, focusing on the collection and analysis of data rather than personal beliefs, opinions, or experiences.
  • Control over variables: Quantitative research often involves manipulating variables to test hypotheses and establish cause-and-effect relationships. Researchers aim to control for extraneous variables that may impact the results.
  • Replicable : Quantitative research aims to be replicable, meaning that other researchers should be able to conduct similar studies and obtain similar results using the same methods.
  • Statistical analysis: Quantitative research involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis allows researchers to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.
  • Generalizability: Quantitative research aims to produce findings that can be generalized to larger populations beyond the specific sample studied. This is achieved through the use of random sampling methods and statistical inference.

Examples of Quantitative Research

Here are some examples of quantitative research in different fields:

  • Market Research: A company conducts a survey of 1000 consumers to determine their brand awareness and preferences. The data is analyzed using statistical methods to identify trends and patterns that can inform marketing strategies.
  • Health Research : A researcher conducts a randomized controlled trial to test the effectiveness of a new drug for treating a particular medical condition. The study involves collecting data from a large sample of patients and analyzing the results using statistical methods.
  • Social Science Research : A sociologist conducts a survey of 500 people to study attitudes toward immigration in a particular country. The data is analyzed using statistical methods to identify factors that influence these attitudes.
  • Education Research: A researcher conducts an experiment to compare the effectiveness of two different teaching methods for improving student learning outcomes. The study involves randomly assigning students to different groups and collecting data on their performance on standardized tests.
  • Environmental Research: A team of researchers conducts a study to investigate the impact of climate change on the distribution and abundance of a particular species of plant or animal. The study involves collecting data on environmental factors and population sizes over time and analyzing the results using statistical methods.
  • Psychology : A researcher conducts a survey of 500 college students to investigate the relationship between social media use and mental health. The data is analyzed using statistical methods to identify correlations and potential causal relationships.
  • Political Science: A team of researchers conducts a study to investigate voter behavior during an election. They use survey methods to collect data on voting patterns, demographics, and political attitudes, and analyze the results using statistical methods.

How to Conduct Quantitative Research

Here is a general overview of how to conduct quantitative research:

  • Develop a research question: The first step in conducting quantitative research is to develop a clear and specific research question. This question should be based on a gap in existing knowledge, and should be answerable using quantitative methods.
  • Develop a research design: Once you have a research question, you will need to develop a research design. This involves deciding on the appropriate methods to collect data, such as surveys, experiments, or observational studies. You will also need to determine the appropriate sample size, data collection instruments, and data analysis techniques.
  • Collect data: The next step is to collect data. This may involve administering surveys or questionnaires, conducting experiments, or gathering data from existing sources. It is important to use standardized methods to ensure that the data is reliable and valid.
  • Analyze data : Once the data has been collected, it is time to analyze it. This involves using statistical methods to identify patterns, trends, and relationships between variables. Common statistical techniques include correlation analysis, regression analysis, and hypothesis testing.
  • Interpret results: After analyzing the data, you will need to interpret the results. This involves identifying the key findings, determining their significance, and drawing conclusions based on the data.
  • Communicate findings: Finally, you will need to communicate your findings. This may involve writing a research report, presenting at a conference, or publishing in a peer-reviewed journal. It is important to clearly communicate the research question, methods, results, and conclusions to ensure that others can understand and replicate your research.

When to use Quantitative Research

Here are some situations when quantitative research can be appropriate:

  • To test a hypothesis: Quantitative research is often used to test a hypothesis or a theory. It involves collecting numerical data and using statistical analysis to determine if the data supports or refutes the hypothesis.
  • To generalize findings: If you want to generalize the findings of your study to a larger population, quantitative research can be useful. This is because it allows you to collect numerical data from a representative sample of the population and use statistical analysis to make inferences about the population as a whole.
  • To measure relationships between variables: If you want to measure the relationship between two or more variables, such as the relationship between age and income, or between education level and job satisfaction, quantitative research can be useful. It allows you to collect numerical data on both variables and use statistical analysis to determine the strength and direction of the relationship.
  • To identify patterns or trends: Quantitative research can be useful for identifying patterns or trends in data. For example, you can use quantitative research to identify trends in consumer behavior or to identify patterns in stock market data.
  • To quantify attitudes or opinions : If you want to measure attitudes or opinions on a particular topic, quantitative research can be useful. It allows you to collect numerical data using surveys or questionnaires and analyze the data using statistical methods to determine the prevalence of certain attitudes or opinions.

Purpose of Quantitative Research

The purpose of quantitative research is to systematically investigate and measure the relationships between variables or phenomena using numerical data and statistical analysis. The main objectives of quantitative research include:

  • Description : To provide a detailed and accurate description of a particular phenomenon or population.
  • Explanation : To explain the reasons for the occurrence of a particular phenomenon, such as identifying the factors that influence a behavior or attitude.
  • Prediction : To predict future trends or behaviors based on past patterns and relationships between variables.
  • Control : To identify the best strategies for controlling or influencing a particular outcome or behavior.

Quantitative research is used in many different fields, including social sciences, business, engineering, and health sciences. It can be used to investigate a wide range of phenomena, from human behavior and attitudes to physical and biological processes. The purpose of quantitative research is to provide reliable and valid data that can be used to inform decision-making and improve understanding of the world around us.

Advantages of Quantitative Research

There are several advantages of quantitative research, including:

  • Objectivity: Quantitative research is based on objective data and statistical analysis, which reduces the potential for bias or subjectivity in the research process.
  • Reproducibility: Because quantitative research involves standardized methods and measurements, it is more likely to be reproducible and reliable.
  • Generalizability: Quantitative research allows for generalizations to be made about a population based on a representative sample, which can inform decision-making and policy development.
  • Precision: Quantitative research allows for precise measurement and analysis of data, which can provide a more accurate understanding of phenomena and relationships between variables.
  • Efficiency: Quantitative research can be conducted relatively quickly and efficiently, especially when compared to qualitative research, which may involve lengthy data collection and analysis.
  • Large sample sizes: Quantitative research can accommodate large sample sizes, which can increase the representativeness and generalizability of the results.

Limitations of Quantitative Research

There are several limitations of quantitative research, including:

  • Limited understanding of context: Quantitative research typically focuses on numerical data and statistical analysis, which may not provide a comprehensive understanding of the context or underlying factors that influence a phenomenon.
  • Simplification of complex phenomena: Quantitative research often involves simplifying complex phenomena into measurable variables, which may not capture the full complexity of the phenomenon being studied.
  • Potential for researcher bias: Although quantitative research aims to be objective, there is still the potential for researcher bias in areas such as sampling, data collection, and data analysis.
  • Limited ability to explore new ideas: Quantitative research is often based on pre-determined research questions and hypotheses, which may limit the ability to explore new ideas or unexpected findings.
  • Limited ability to capture subjective experiences: Quantitative research is typically focused on objective data and may not capture the subjective experiences of individuals or groups being studied.
  • Ethical concerns: Quantitative research may raise ethical concerns, such as invasion of privacy or the potential for harm to participants.


Research Methods | Definitions, Types, Examples

Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design. When planning your methods, there are two key decisions you will make.

First, decide how you will collect data. Your methods depend on what type of data you need to answer your research question:

  • Qualitative vs. quantitative: Will your data take the form of words or numbers?
  • Primary vs. secondary: Will you collect original data yourself, or will you use data that has already been collected by someone else?
  • Descriptive vs. experimental: Will you take measurements of something as it is, or will you perform an experiment?

Second, decide how you will analyze the data.

  • For quantitative data, you can use statistical analysis methods to test relationships between variables.
  • For qualitative data, you can use methods such as thematic analysis to interpret patterns and meanings in the data.

Table of contents

  • Methods for collecting data
  • Examples of data collection methods
  • Methods for analyzing data
  • Examples of data analysis methods
  • Other interesting articles
  • Frequently asked questions about research methods

Data is the information that you collect for the purposes of answering your research question. The type of data you need depends on the aims of your research.

Qualitative vs. quantitative data

Your choice of qualitative or quantitative data collection depends on the type of knowledge you want to develop.

For questions about ideas, experiences and meanings, or to study something that can’t be described numerically, collect qualitative data.

If you want to develop a more mechanistic understanding of a topic, or your research involves hypothesis testing, collect quantitative data.


You can also take a mixed methods approach, where you use both qualitative and quantitative research methods.

Primary vs. secondary research

Primary research is any original data that you collect yourself for the purposes of answering your research question (e.g. through surveys, observations and experiments). Secondary research is data that has already been collected by other researchers (e.g. in a government census or previous scientific studies).

If you are exploring a novel research question, you’ll probably need to collect primary data. But if you want to synthesize existing knowledge, analyze historical trends, or identify patterns on a large scale, secondary data might be a better choice.


Descriptive vs. experimental data

In descriptive research, you collect data about your study subject without intervening. The validity of your research will depend on your sampling method.

In experimental research, you systematically intervene in a process and measure the outcome. The validity of your research will depend on your experimental design.

To conduct an experiment, you need to be able to vary your independent variable, precisely measure your dependent variable, and control for confounding variables. If it’s practically and ethically possible, this method is the best choice for answering questions about cause and effect.


Research methods for collecting data

  • Experiment (primary, quantitative): to test cause-and-effect relationships.
  • Survey (primary, quantitative): to understand general characteristics of a population.
  • Interview/focus group (primary, qualitative): to gain more in-depth understanding of a topic.
  • Observation (primary, qualitative or quantitative): to understand how something occurs in its natural setting.
  • Literature review (secondary, qualitative or quantitative): to situate your research in an existing body of work, or to evaluate trends within a research topic.
  • Case study (primary or secondary, qualitative or quantitative): to gain an in-depth understanding of a specific group or context, or when you don’t have the resources for a large study.

Your data analysis methods will depend on the type of data you collect and how you prepare it for analysis.

Data can often be analyzed both quantitatively and qualitatively. For example, survey responses could be analyzed qualitatively by studying the meanings of responses or quantitatively by studying the frequencies of responses.

Qualitative analysis methods

Qualitative analysis is used to understand words, ideas, and experiences. You can use it to interpret data that was collected:

  • From open-ended surveys and interviews, literature reviews, case studies, ethnographies, and other sources that use text rather than numbers.
  • Using non-probability sampling methods.

Qualitative analysis tends to be quite flexible and relies on the researcher’s judgement, so you have to reflect carefully on your choices and assumptions and be careful to avoid research bias.

Quantitative analysis methods

Quantitative analysis uses numbers and statistics to understand frequencies, averages and correlations (in descriptive studies) or cause-and-effect relationships (in experiments).

You can use quantitative analysis to interpret data that was collected either:

  • During an experiment.
  • Using probability sampling methods.

Because the data is collected and analyzed in a statistically valid way, the results of quantitative analysis can be easily standardized and shared among researchers.
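For instance, a simple quantitative analysis might test a correlation between two variables. The sketch below uses simulated data with a built-in positive relationship; the variable names are invented for illustration, and scipy is assumed to be available:

    # Minimal sketch: testing a relationship between two variables on simulated data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    study_hours = rng.uniform(0, 10, 100)
    exam_score = 50 + 4 * study_hours + rng.normal(0, 8, 100)  # built-in positive relationship

    r, p = stats.pearsonr(study_hours, exam_score)
    print(f"Pearson r = {r:.2f}, p = {p:.3g}")

Because the procedure is fully specified (the test, the variables, the significance threshold), another researcher can rerun exactly the same analysis on new data.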

Research methods for analyzing data

  • Statistical analysis (quantitative): to analyze data collected in a statistically valid manner (e.g. from experiments, surveys, and observations).
  • Meta-analysis (quantitative): to statistically analyze the results of a large collection of studies. Can only be applied to studies that collected data in a statistically valid manner.
  • Thematic analysis (qualitative): to analyze data collected from interviews, focus groups, or textual sources; to understand general themes in the data and how they are communicated.
  • Content analysis (either): to analyze large volumes of textual or visual data collected from surveys, literature reviews, or other sources. Can be quantitative (i.e. frequencies of words) or qualitative (i.e. meanings of words).


If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Chi square test of independence
  • Statistical power
  • Descriptive statistics
  • Degrees of freedom
  • Pearson correlation
  • Null hypothesis
  • Double-blind study
  • Case-control study
  • Research ethics
  • Data collection
  • Hypothesis testing
  • Structured interviews

Research bias

  • Hawthorne effect
  • Unconscious bias
  • Recall bias
  • Halo effect
  • Self-serving bias
  • Information bias

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

The research methods you use depend on the type of data you need to answer your research question.

  • If you want to measure something or test a hypothesis, use quantitative methods. If you want to explore ideas, thoughts and meanings, use qualitative methods.
  • If you want to analyze a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables, use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.

In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.



Data analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data. According to Shamoo and Resnik (2003), various analytic procedures “provide a way of drawing inductive inferences from data and distinguishing the signal (the phenomenon of interest) from the noise (statistical fluctuations) present in the data”.

While data analysis in qualitative research can include statistical procedures, many times analysis becomes an ongoing iterative process where data is continuously collected and analyzed almost simultaneously. Indeed, researchers generally analyze for patterns in observations through the entire data collection phase (Savenye, Robinson, 2004). The form of the analysis is determined by the specific qualitative approach taken (field study, ethnography, content analysis, oral history, biography, unobtrusive research) and the form of the data (field notes, documents, audiotape, videotape).

An essential component of ensuring data integrity is the accurate and appropriate analysis of research findings. Improper statistical analyses distort scientific findings, mislead casual readers (Shepard, 2002), and may negatively influence the public perception of research. Integrity issues apply equally to the analysis of non-statistical data.

Considerations/issues in data analysis

There are a number of issues that researchers should be cognizant of with respect to data analysis. These are discussed in turn below.

A tacit assumption of investigators is that they have received training sufficient to demonstrate a high standard of research practice. Unintentional ‘scientific misconduct’ is likely the result of poor instruction and follow-up. A number of studies suggest this may be the case more often than believed (Nowak, 1994; Silverman, Manson, 2003). For example, Sica found that adequate training of physicians in medical schools in the proper design, implementation and evaluation of clinical trials is “abysmally small” (Sica, cited in Nowak, 1994). Indeed, a single course in biostatistics is the most that is usually offered (Christopher Williams, cited in Nowak, 1994).

A common practice of investigators is to defer the selection of analytic procedure to a research team ‘statistician’. Ideally, investigators should have substantially more than a basic understanding of the rationale for selecting one method of analysis over another. This allows investigators to better supervise staff who conduct the data analyses and to make informed decisions.


While methods of analysis may differ by scientific discipline, the optimal stage for determining appropriate analytic procedures occurs early in the research process and should not be an afterthought. According to Smeeton and Goda (2003), “Statistical advice should be obtained at the stage of initial planning of an investigation so that, for example, the method of sampling and design of questionnaire are appropriate”.

The chief aim of analysis is to distinguish whether an observed event reflects a true effect or a false one. Any bias occurring in the collection of the data, or in the selection of the method of analysis, will increase the likelihood of drawing a biased inference. Bias can occur when recruitment of study participants falls below the minimum number required to demonstrate statistical power, or when a sufficient follow-up period needed to demonstrate an effect is not maintained (Altman, 2001).



When failing to demonstrate statistically different levels between treatment groups, investigators may resort to breaking down the analysis into smaller and smaller subgroups in order to find a difference. Although this practice may not be inherently unethical, these analyses should be proposed before beginning the study, even if the intent is exploratory in nature. If the study is exploratory in nature, the investigator should make this explicit so that readers understand that the research is more of a hunting expedition than primarily theory driven. Although a researcher may not have a theory-based hypothesis for testing relationships between previously untested variables, a theory will have to be developed to explain an unanticipated finding. Indeed, in exploratory science there are no a priori hypotheses, and therefore no hypothesis tests. Although theories can often drive the processes used in the investigation of qualitative studies, many times patterns of behavior or occurrences derived from analyzed data can result in developing new theoretical frameworks rather than being determined a priori (Savenye, Robinson, 2004).

It is conceivable that multiple statistical tests could yield a significant finding by chance alone rather than reflecting a true effect. Integrity is compromised if the investigator only reports tests with significant findings, and neglects to mention a large number of tests failing to reach significance. While access to computer-based statistical packages can facilitate application of increasingly complex analytic procedures, inappropriate uses of these packages can result in abuses as well.
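A small simulation makes the multiple-testing problem concrete. The group sizes and number of tests below are arbitrary, but with 20 independent tests at alpha = 0.05 there is roughly a 64% chance (1 − 0.95^20) of at least one spurious "significant" result:

    # Simulation: repeated tests on pure noise produce chance "significant" findings.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    false_positives = 0
    n_tests = 20
    for _ in range(n_tests):
        a = rng.normal(0, 1, 30)   # group A: no true effect
        b = rng.normal(0, 1, 30)   # group B: identical distribution
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            false_positives += 1
    print(f"{false_positives} of {n_tests} tests 'significant' despite no real effect")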



Every field of study has developed its accepted practices for data analysis. Resnik (2000) states that it is prudent for investigators to follow these accepted norms. Resnik further states that the norms are ‘…based on two factors:

(1) the nature of the variables used (i.e., quantitative, comparative, or qualitative),

(2) assumptions about the population from which the data are drawn (i.e., random distribution, independence, sample size, etc.)’.

If one uses unconventional norms, it is crucial to clearly state this is being done, and to show how this new and possibly unaccepted method of analysis is being used, as well as how it differs from other more traditional methods. For example, Schroder, Carey, and Vanable (2003) juxtapose their identification of new and powerful data analytic solutions developed to count data in the area of HIV contraction risk with a discussion of the limitations of commonly applied methods.



While the conventional practice is to establish a standard of acceptability for statistical significance, in certain disciplines it may also be appropriate to discuss whether attaining statistical significance has a true practical meaning, i.e., clinical significance. Jeans (1992) defines ‘clinical significance’ as “the potential for research findings to make a real and important difference to clients or clinical practice, to health status or to any other problem identified as a relevant priority for the discipline”.

Kendall and Grove (1988) define clinical significance in terms of what happens when “… troubled and disordered clients are now, after treatment, not distinguishable from a meaningful and representative non-disturbed reference group”. Thompson and Noferi (2002) suggest that readers of counseling literature should expect authors to report either practical or clinical significance indices, or both, within their research reports. Shepard (2002) questions why some authors fail to point out that the magnitude of observed changes may be too small to have any clinical or practical significance: “sometimes, a supposed change may be described in some detail, but the investigator fails to disclose that the trend is not statistically significant”.
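A hedged numerical illustration of the distinction, on entirely simulated data: with a very large sample, even a trivial half-point change on a 100-point scale reaches statistical significance, which is why effect sizes should be reported alongside p-values. The scale, sample size, and effect are invented for the example:

    # Illustration: a clinically trivial change can still be statistically significant.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 20000
    before = rng.normal(100, 15, n)
    after = before + 0.5 + rng.normal(0, 5, n)  # true improvement of only 0.5 points

    t, p = stats.ttest_rel(before, after)
    d = (after - before).mean() / (after - before).std(ddof=1)  # standardized effect size

    print(f"p = {p:.2e} (statistically significant)")
    print(f"Cohen's d = {d:.2f} (likely too small to matter clinically)")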

No amount of statistical analysis, regardless of its level of sophistication, will correct poorly defined objective outcome measurements. Whether done unintentionally or by design, this practice increases the likelihood of clouding the interpretation of findings, thus potentially misleading readers.

At the root of this issue is the need to reduce the likelihood of statistical error. Common challenges include the exclusion of outliers, filling in missing data, altering or otherwise changing data, data mining, and developing graphical representations of the data (Shamoo, Resnik, 2003).


At times investigators may enhance the impression of a significant finding by determining how to present derived data (as opposed to data in its raw form), which portion of the data is shown, why, how and to whom (Shamoo, Resnik, 2003). Nowak (1994) notes that even experts do not agree in distinguishing between analyzing and massaging data. Shamoo (1989) recommends that investigators maintain a sufficient and accurate paper trail of how data was manipulated for future review.



The integrity of data analysis can be compromised by the environment or context in which data was collected, i.e., face-to-face interviews vs. focus groups. The interaction occurring within a dyadic relationship (interviewer-interviewee) differs from the group dynamic occurring within a focus group because of the number of participants and how they react to each other’s responses. Since the data collection process could be influenced by the environment/context, researchers should take this into account when conducting data analysis.

Analyses could also be influenced by the method in which data was recorded. For example, research events could be documented by:

a. recording audio and/or video and transcribing later
b. either a researcher-administered or self-administered survey
c. either closed-ended or open-ended survey questions
d. preparing ethnographic field notes from a participant/observer
e. requesting that participants themselves take notes, compile and submit them to researchers.

While each methodology employed has rationale and advantages, issues of objectivity and subjectivity may be raised when data is analyzed.

During content analysis, staff researchers or ‘raters’ may use inconsistent strategies in analyzing text material. Some ‘raters’ may analyze comments as a whole while others may prefer to dissect text material by separating words, phrases, clauses, sentences or groups of sentences. Every effort should be made to reduce or eliminate inconsistencies between “raters” so that data integrity is not compromised.

A major challenge to data integrity could occur with the unmonitored supervision of inductive techniques. Content analysis requires raters to assign topics to text material (comments). The threat to integrity may arise when raters have received inconsistent training or bring different prior training experiences. Previous experience may affect how raters perceive the material or even perceive the nature of the analyses to be conducted. Thus one rater could assign topics or codes to material in a way that is significantly different from another rater. Strategies to address this include clearly stating a list of analysis procedures in the protocol manual, consistent training, and routine monitoring of raters.

Researchers performing quantitative or qualitative analyses should be aware of challenges to reliability and validity. For example, in the area of content analysis, Gottschalk (1995) identifies three factors that can affect the reliability of analyzed data: stability, reproducibility, and accuracy.

The potential for compromising data integrity arises when researchers cannot consistently demonstrate stability, reproducibility, or accuracy of data analysis.

According to Gottschalk (1995), the validity of a content analysis study refers to the correspondence of the categories (the classification that raters assigned to text content) to the conclusions, and the generalizability of results to a theory (did the categories support the study’s conclusion, and was the finding adequately robust to support or be applied to a selected theoretical rationale?).



Upon coding text material for content analysis, raters must classify each code into an appropriate category of a cross-reference matrix. Relying on computer software to determine a frequency or word count can lead to inaccuracies. “One may obtain an accurate count of that word's occurrence and frequency, but not have an accurate accounting of the meaning inherent in each particular usage” (Gottschalk, 1995). Further analyses might be appropriate to discover the dimensionality of the data set or identify new meaningful underlying variables.

Whether statistical or non-statistical methods of analyses are used, researchers should be aware of the potential for compromising data integrity. While statistical analysis is typically performed on quantitative data, there are numerous analytic procedures specifically designed for qualitative material including content, thematic, and ethnographic analysis. Regardless of whether one studies quantitative or qualitative phenomena, researchers use a variety of tools to analyze data in order to test hypotheses, discern patterns of behavior, and ultimately answer research questions. Failure to understand or acknowledge data analysis issues presented can compromise data integrity.

References:

Gottschalk, L.A. (1995). Content analysis of verbal behavior: New findings and clinical applications. Hillsdale, NJ: Lawrence Erlbaum Associates.

Jeans, M.E. (1992). Clinical significance of research: A growing concern. Canadian Journal of Nursing Research, 24, 1-4.

Kendall, P.C., & Grove, W. (1988). Normative comparisons in therapy outcome. Behavioral Assessment, 10, 147-158.

Lefort, S. (1993). The statistical versus clinical significance debate. Image, 25, 57-62.

Nowak, R. (1994). Problems in clinical trials go far beyond misconduct. Science, 264(5165): 1538-41.

Resnik, D. (2000). Statistics, ethics, and research: an agenda for education and reform. Accountability in Research, 8: 163-88.

Schroder, K.E., Carey, M.P., & Vanable, P.A. (2003). Methodological challenges in research on sexual risk behavior: I. Item content, scaling, and data analytic options. Annals of Behavioral Medicine, 26(2): 76-103.

Shamoo, A.E. (1989). Principles of Research Data Audit. New York: Gordon and Breach.

Shamoo, A.E., & Resnik, D.B. (2003). Responsible Conduct of Research. Oxford University Press.

Shepard, R.J. (2002). Ethics in exercise science research. Sports Medicine, 32(3): 169-183.

Silverman, S., & Manson, M. (2003). Research on teaching in physical education doctoral dissertations: a detailed investigation of focus, method, and analysis. Journal of Teaching in Physical Education, 22(3): 280-297.

Smeeton, N., & Goda, D. (2003). Conducting and presenting social work research: some basic statistical considerations. British Journal of Social Work, 33: 567-573.

Thompson, B., & Noferi, G. (2002). Statistical, practical, clinical: How many types of significance should be considered in counseling research? Journal of Counseling & Development, 80(4): 64-71.

 

Research: Meaning and Purpose

  • First Online: 27 October 2022


  • Kazi Abusaleh
  • Akib Bin Anwar


The objective of the chapter is to provide a conceptual framework for research and the research process and to draw out the importance of research in the social sciences. Various books and research papers were reviewed to write the chapter. The chapter defines ‘research’ as a deliberate and systematic scientific investigation into a phenomenon to explore, analyse, and make predictions about issues or circumstances, and characterizes ‘research’ as a systematic and scientific mode of inquiry, a way to test existing knowledge and theories, and a well-designed process to answer questions in a reliable and unbiased way. The chapter then categorizes research into eight types under four headings, explains six steps to carry out research scientifically, and finally sketches the importance of research in the social sciences.




About this chapter

Abusaleh, K., Anwar, A.B. (2022). Research: Meaning and Purpose. In: Islam, M.R., Khan, N.A., Baikady, R. (eds) Principles of Social Research Methodology. Springer, Singapore. https://doi.org/10.1007/978-981-19-5441-2_2


Content Analysis

Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e. text). Using content analysis, researchers can quantify and analyze the presence, meanings, and relationships of such words, themes, or concepts. As an example, researchers can evaluate language used within a news article to search for bias or partiality. Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the text.

Description

Sources of data could be from interviews, open-ended questions, field research notes, conversations, or literally any occurrence of communicative language (such as books, essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text in its analysis. To analyze the text using content analysis, the text must be coded, or broken down, into manageable categories for analysis (i.e. “codes”). Once the text is coded, the codes can then be further grouped into broader “code categories” to summarize the data even further.

Three different definitions of content analysis are provided below.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.” (from Holsti, 1968)

Definition 2: “An interpretive and naturalistic approach. It is both observational and narrative in nature and relies less on the experimental elements normally associated with scientific research (reliability, validity, and generalizability).” (from Ethnography, Observational Research, and Narrative Inquiry, 1994-2012)

Definition 3: “A research technique for the objective, systematic and quantitative description of the manifest content of communication.” (from Berelson, 1952)

Uses of Content Analysis

Identify the intentions, focus or communication trends of an individual, group or institution

Describe attitudinal and behavioral responses to communications

Determine the psychological or emotional state of persons or groups

Reveal international differences in communication content

Reveal patterns in communication content

Pre-test and improve an intervention or survey prior to launch

Analyze focus group interviews and open-ended questions to complement quantitative data

Types of Content Analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis develops the conceptual analysis further by examining the relationships among concepts in a text. Each type of analysis may lead to different results, conclusions, interpretations and meanings.

Conceptual Analysis

Typically people think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination and the analysis involves quantifying and counting its presence. The main goal is to examine the occurrence of selected terms in the data. Terms may be explicit or implicit. Explicit terms are easy to identify. Coding of implicit terms is more complicated: you need to decide the level of implication and base judgments on subjectivity (an issue for reliability and validity). Therefore, coding of implicit terms involves using a dictionary or contextual translation rules or both.

To begin a conceptual content analysis, first identify the research question and choose a sample or samples for analysis. Next, the text must be coded into manageable content categories. This is basically a process of selective reduction. By reducing the text to categories, the researcher can focus on and code for specific words or patterns that inform the research question.

General steps for conducting a conceptual content analysis:

1. Decide the level of analysis: word, word sense, phrase, sentence, themes

2. Decide how many concepts to code for: develop a pre-defined or interactive set of categories or concepts. Decide either: A. to allow flexibility to add categories through the coding process, or B. to stick with the pre-defined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications to one’s research question.

Option B allows the researcher to stay focused and examine the data for specific concepts.

3. Decide whether to code for existence or frequency of a concept. The decision changes the coding process.

When coding for the existence of a concept, the researcher counts a concept only once, no matter how many times it appears in the data.

When coding for the frequency of a concept, the researcher would count the number of times a concept appears in a text.

4. Decide on how you will distinguish among concepts:

Should words be coded exactly as they appear, or coded as the same when they appear in different forms? For example, “dangerous” vs. “dangerousness”. The point here is to create coding rules so that these word segments are transparently categorized in a logical fashion. The rules could make all of these word segments fall into the same category, or the rules could be formulated so that the researcher can distinguish these word segments into separate codes.

What level of implication is to be allowed? Words that imply the concept or words that explicitly state the concept? For example, “dangerous” vs. “the person is scary” vs. “that person could cause harm to me”. These word segments may not merit separate categories, due to the implicit meaning of “dangerous”.

5. Develop rules for coding your texts. After decisions of steps 1-4 are complete, a researcher can begin developing rules for translation of text into codes. This will keep the coding process organized and consistent. The researcher can code for exactly what he/she wants to code. Validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their translation rules. In content analysis, abiding by the translation rules is equivalent to validity.

6. Decide what to do with irrelevant information: should this be ignored (e.g. common English words like “the” and “and”), or used to reexamine the coding scheme in the case that it would add to the outcome of coding?

7. Code the text: This can be done by hand or by using software. By using software, researchers can input categories and have coding done automatically, quickly and efficiently, by the software program. When coding is done by hand, a researcher can recognize errors far more easily (e.g. typos, misspelling). If using computer coding, text could be cleaned of errors to include all available data. This decision of hand vs. computer coding is most relevant for implicit information where category preparation is essential for accurate coding.

8. Analyze your results: Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine, ignore, or reassess the coding scheme. Interpret results carefully as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.
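A minimal Python sketch of how steps 3-7 might look in practice for a single concept. The texts, the concept, and the regex-based translation rule are all hypothetical; real studies would use a full coding dictionary rather than one pattern:

    # Minimal sketch of conceptual content analysis: a coding rule maps word
    # variants to one concept, then we count frequency (total occurrences)
    # and existence (does the document contain the concept at all).
    import re

    texts = [
        "The area is dangerous. Dangerousness was rated high.",
        "Residents felt safe this year.",
        "No safety incidents were recorded.",
    ]

    # Translation rule: any word starting with "danger" codes to the concept DANGER.
    rule = re.compile(r"\bdanger\w*", re.IGNORECASE)

    for i, text in enumerate(texts, start=1):
        hits = rule.findall(text)
        print(f"Text {i}: frequency={len(hits)}, existence={bool(hits)}")

Note how the rule deliberately collapses “dangerous” and “dangerousness” into one code, which is exactly the kind of decision step 4 above asks the researcher to make explicit.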

Relational Analysis

Relational analysis begins like conceptual analysis, where a concept is chosen for examination. However, the analysis involves exploring the relationships between concepts. Individual concepts are viewed as having no inherent meaning and rather the meaning is a product of the relationships among concepts.

To begin a relational content analysis, first identify a research question and choose a sample or samples for analysis. The research question must be focused so the concept types are not open to interpretation and can be summarized. Next, select text for analysis. Select the text carefully, balancing having enough information for a thorough analysis, so that results are not limited, against having so much information that the coding process becomes too arduous to supply meaningful and worthwhile results.

There are three subcategories of relational analysis to choose from prior to going on to the general steps.

Affect extraction: an emotional evaluation of concepts explicit in a text. A challenge to this method is that emotions can vary across time, populations, and space. However, it could be effective at capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis: an evaluation of the co-occurrence of explicit concepts in the text. Text is defined as a string of words called a “window” that is scanned for the co-occurrence of concepts. The result is the creation of a “concept matrix”, or a group of interrelated co-occurring concepts that would suggest an overall meaning.

Cognitive mapping: a visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text such as a graphic map that represents the relationships between concepts.
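To make the proximity-analysis idea concrete, here is a minimal sketch (the text, concept list, and window size are hypothetical) that scans a fixed-size word window and tallies pairwise co-occurrences, a rudimentary version of the “concept matrix” described above:

    # Minimal sketch of proximity analysis: scan a fixed-size word "window"
    # and count co-occurrences of pre-defined concepts.
    from itertools import combinations
    from collections import Counter

    text = ("treatment reduced anxiety while support improved mood "
            "anxiety returned without support").split()
    concepts = {"treatment", "anxiety", "support", "mood"}
    window = 5

    pairs = Counter()
    for start in range(len(text) - window + 1):
        seen = concepts & set(text[start:start + window])
        for pair in combinations(sorted(seen), 2):
            pairs[pair] += 1  # concepts co-occurring inside the same window

    for (a, b), n in pairs.most_common():
        print(f"{a} ~ {b}: {n} co-occurring windows")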

General steps for conducting a relational content analysis:

1. Determine the type of analysis: Once the sample has been selected, the researcher needs to determine what types of relationships to examine and the level of analysis: word, word sense, phrase, sentence, themes.

2. Reduce the text to categories and code for words or patterns. A researcher can code for existence of meanings or words.

3. Explore the relationship between concepts: once the words are coded, the text can be analyzed for the following:

Strength of relationship: degree to which two or more concepts are related.

Sign of relationship: are concepts positively or negatively related to each other?

Direction of relationship: the types of relationship that categories exhibit. For example, “X implies Y” or “X occurs before Y” or “if X then Y” or if X is the primary motivator of Y.

4. Code the relationships: a difference between conceptual and relational analysis is that the statements or relationships between concepts are coded.

5. Perform statistical analyses: explore differences or look for relationships among the identified variables during coding.

6. Map out representations: such as decision mapping and mental models.

Reliability and Validity

Reliability: Because of the human nature of researchers, coding errors can never be eliminated but only minimized. Generally, 80% is an acceptable margin for reliability. Three criteria comprise the reliability of a content analysis:

Stability: the tendency for coders to consistently re-code the same data in the same way over a period of time.

Reproducibility: the tendency for a group of coders to classify category membership in the same way.

Accuracy: extent to which the classification of text corresponds to a standard or norm statistically.
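As a minimal illustration of the 80% benchmark mentioned above, the sketch below computes simple percent agreement between two hypothetical raters coding the same ten text segments. Real studies would typically also report a chance-corrected statistic such as Cohen’s kappa:

    # Minimal sketch: percent agreement between two raters (hypothetical codes).
    rater_1 = ["danger", "safe", "danger", "neutral", "safe",
               "danger", "neutral", "safe", "danger", "safe"]
    rater_2 = ["danger", "safe", "neutral", "neutral", "safe",
               "danger", "neutral", "danger", "danger", "safe"]

    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    agreement = matches / len(rater_1)
    print(f"Percent agreement: {agreement:.0%}")  # 80% for this toy data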

Validity : Three criteria comprise the validity of a content analysis:

Closeness of categories: this can be achieved by utilizing multiple classifiers to arrive at an agreed upon definition of each specific category. Using multiple classifiers, a concept category that may be an explicit variable can be broadened to include synonyms or implicit variables.

Conclusions: What level of implication is allowable? Do conclusions correctly follow the data? Are results explainable by other phenomena? This becomes especially problematic when using computer software for analysis and distinguishing between synonyms. For example, the word “mine,” variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Software can obtain an accurate count of that word’s occurrence and frequency, but not be able to produce an accurate accounting of the meaning inherent in each particular usage. This problem could throw off one’s results and make any conclusion invalid.

Generalizability of the results to a theory: dependent on the clear definitions of concept categories, how they are determined and how reliable they are at measuring the idea one is seeking to measure. Generalizability parallels reliability as much of it depends on the three criteria for reliability.

Advantages of Content Analysis

Directly examines communication using text

Allows for both qualitative and quantitative analysis

Provides valuable historical and cultural insights over time

Allows a closeness to data

Coded form of the text can be statistically analyzed

Unobtrusive means of analyzing interactions

Provides insight into complex models of human thought and language use

When done well, is considered a relatively “exact” research method

Content analysis is a readily-understood and an inexpensive research method

A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time.

Disadvantages of Content Analysis

Can be extremely time consuming

Is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation

Is often devoid of theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study

Is inherently reductive, particularly when dealing with complex texts

Tends too often to simply consist of word counts

Often disregards the context that produced the text, as well as the state of things after the text is produced

Can be difficult to automate or computerize

Textbooks & Chapters  

Berelson, Bernard. Content Analysis in Communication Research. New York: Free Press, 1952.

Busha, Charles H. and Stephen P. Harter. Research Methods in Librarianship: Techniques and Interpretation. New York: Academic Press, 1980.

de Sola Pool, Ithiel. Trends in Content Analysis. Urbana: University of Illinois Press, 1959.

Krippendorff, Klaus. Content Analysis: An Introduction to its Methodology. Beverly Hills: Sage Publications, 1980.

Fielding, NG & Lee, RM. Using Computers in Qualitative Research. SAGE Publications, 1991. (Refer to Chapter by Seidel, J. ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’.)

Methodological Articles  

Hsieh HF & Shannon SE. (2005). Three Approaches to Qualitative Content Analysis. Qualitative Health Research. 15(9): 1277-1288.

Elo S, Kaarianinen M, Kanste O, Polkki R, Utriainen K, & Kyngas H. (2014). Qualitative Content Analysis: A focus on trustworthiness. Sage Open. 4:1-10.

Application Articles  

Abroms LC, Padmanabhan N, Thaweethai L, & Phillips T. (2011). iPhone Apps for Smoking Cessation: A content analysis. American Journal of Preventive Medicine. 40(3):279-285.

Ullstrom S. Sachs MA, Hansson J, Ovretveit J, & Brommels M. (2014). Suffering in Silence: a qualitative study of second victims of adverse events. British Medical Journal, Quality & Safety Issue. 23:325-331.

Owen P. (2012). Portrayals of Schizophrenia by Entertainment Media: A Content Analysis of Contemporary Movies. Psychiatric Services. 63:655-659.

Choosing whether to conduct a content analysis by hand or by using computer software can be difficult. Refer to ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’ listed above in “Textbooks and Chapters” for a discussion of the issue.

QSR NVivo:  http://www.qsrinternational.com/products.aspx

Atlas.ti:  http://www.atlasti.com/webinars.html

R- RQDA package:  http://rqda.r-forge.r-project.org/

Rolly Constable, Marla Cowell, Sarita Zornek Crawford, David Golden, Jake Hartvigsen, Kathryn Morgan, Anne Mudgett, Kris Parrish, Laura Thomas, Erika Yolanda Thompson, Rosie Turner, and Mike Palmquist. (1994-2012). Ethnography, Observational Research, and Narrative Inquiry. Writing@CSU. Colorado State University. Available at: https://writing.colostate.edu/guides/guide.cfm?guideid=63 .

As an introduction to Content Analysis by Michael Palmquist, this is the main resource on Content Analysis on the Web. It is comprehensive, yet succinct. It includes examples and an annotated bibliography. The information contained in the narrative above draws heavily from and summarizes Michael Palmquist’s excellent resource on Content Analysis but was streamlined for the purpose of doctoral students and junior researchers in epidemiology.

At Columbia University Mailman School of Public Health, more detailed training is available through the Department of Sociomedical Sciences course P8785, Qualitative Research Methods.



Mixed effects models but not t-tests or linear regression detect progression of apathy in Parkinson’s disease over seven years in a cohort: a comparative analysis

  • Anne-Marie Hanff 1,2,3,4
  • Rejko Krüger 1,2,5
  • Christopher McCrum 4
  • Christophe Ley 6, on behalf of the NCER-PD Consortium

BMC Medical Research Methodology, volume 24, Article number: 183 (2024)


Introduction

While there is an interest in defining longitudinal change in people with chronic illness like Parkinson's disease (PD), statistical analysis of longitudinal data is not straightforward for clinical researchers. Here, we aim to demonstrate how the choice of statistical method may influence research outcomes (e.g., progression in apathy), specifically the size of longitudinal effect estimates, in a cohort.

Methods

In this retrospective longitudinal analysis of 802 people with typical Parkinson's disease in the Luxembourg Parkinson's study, we compared the mean apathy scores at visit 1 and visit 8 by means of the paired two-sided t-test. Additionally, we analysed the relationship between the visit numbers and the apathy score using linear regression and longitudinal two-level mixed effects models.

Results

Mixed effects models were the only method able to detect progression of apathy over time. While the effects estimated for the group comparison and the linear regression were smaller with high p-values (+1.016/7 years, p = 0.107, and −0.056/7 years, p = 0.897, respectively), effect estimates for the mixed effects models were positive with a very small p-value, indicating a significant increase in apathy symptoms of +2.345/7 years (p < 0.001).

Conclusions

The inappropriate use of paired t-tests and linear regression to analyse longitudinal data can lead to underpowered analyses and an underestimation of longitudinal change. While mixed effects models are not without limitations and need to be altered to model the time sequence between the exposure and the outcome, they are worth considering for longitudinal data analyses. In case this is not possible, limitations of the analytical approach need to be discussed and taken into account in the interpretation.

Background

In longitudinal studies, "an outcome is repeatedly measured, i.e., the outcome variable is measured in the same subject on several occasions" [1]. When assessing the same individuals over time, the different data points are likely to be more similar to each other than measurements taken from other individuals. Consequently, special statistical techniques are required which take into account the fact that the repeated observations of each subject are correlated [1]. Parkinson's disease (PD) is a heterogeneous neurodegenerative disorder resulting in a wide variety of motor and non-motor symptoms, including apathy, defined as a disorder of motivation characterised by reduced goal-directed behaviour and cognitive activity and blunted affect [2]. Apathy increases over time in people with PD [3]. Specifically, apathy has been associated with the progressive denervation of ascending dopaminergic pathways in PD [4, 5], leading to dysfunctions of circuits implicated in reward-related learning [5].

T-tests are often misused to analyse changes over time [6]. Consequently, we aim to demonstrate how the choice of statistical method may influence research outcomes, specifically the size and interpretation of longitudinal effect estimates in a cohort. The findings are thus intended for illustrative and educational purposes related to statistical methodology. In a retrospective analysis of data from the Luxembourg Parkinson's study, a nation-wide, monocentric, observational, longitudinal-prospective dynamic cohort [7, 8], we assess change in apathy using three different statistical approaches (paired t-test, linear regression, mixed effects model). We defined the following target estimand: in people diagnosed with PD, what is the change in the apathy score from visit 1 to visit 8? To estimate this change, we formulated the statistical hypotheses as follows:
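In standard form, with $\mu_1$ and $\mu_8$ denoting the mean apathy score at visit 1 and visit 8, respectively, the hypotheses are presumably of the form

$$H_0:\; \mu_8 - \mu_1 = 0 \quad \text{vs.} \quad H_1:\; \mu_8 - \mu_1 \neq 0.$$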

While apathy was the dependent variable, we included the visit number as an independent variable (linear regression, mixed effects model) and as a grouping variable (paired t-test). The outcome apathy was measured by the discrete score from the Starkstein apathy scale (0–42, higher = worse) [9], a scale recommended by the Movement Disorders Society [10]. These data were obtained from the National Centre of Excellence in Research on Parkinson's disease (NCER-PD). The establishment of data collection standards, completion of the questionnaires at home at the participants' convenience, a mobile recruitment team for follow-up visits, and a standardized telephone questionnaire with a reduced assessment were part of the efforts in the primary study to address potential sources of bias [7, 8]. Ethical approval was provided by the National Ethics Board (CNER Ref: 201407/13). We used data from up to eight visits, which were performed annually between 2015 and 2023. Among the participants are people with typical PD and PD dementia (PDD), living mostly at home in Luxembourg and the Greater Region (geographically close areas of the surrounding countries Belgium, France, and Germany). People with atypical PD were excluded. The sample at the date of data export (22 June 2023) consisted of 802 individuals, of whom 269 (33.5%) were female. The average number of observations per participant was 3.0. Fig. S1 reports the number of individuals at each visit, while the characteristics of the participants are described in Table 1.

As illustrated in the flow diagram (Fig. 1), the sample analysed by the paired t-test is highly selective: of the 802 participants at visit 1, the t-test included only the 63 participants with data from visit 8. This arises from the fact that, first, we analyse the dataset from a dynamic cohort, i.e., the data at visit 1 were not collected at the same time point. Thus, 568 of the 802 participants had joined the study less than eight years before the data export, leaving only 234 participants eligible for the eighth yearly visit. Second, after excluding non-participants at visit 8 due to death (n = 41) and other reasons (n = 130), only 63 participants at visit 8 were left. To discuss the selective study population of a paired t-test, we compared the characteristics (age, education, age at diagnosis, apathy at visit 1) of the remaining 63 participants at visit 8 (included in the paired t-test) and the 127 non-participants at visit 8 (excluded from the paired t-test) [12].

Figure 1. Flow diagram of patient recruitment.
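To make this selection step concrete, the following minimal R sketch (not the authors' code, which is available on OSF) assumes a long-format data frame "d" with hypothetical columns "subject_id", "visit", and "apathy":

# Participants contribute a pair to the paired t-test only if they have an
# apathy score at BOTH visit 1 and visit 8.
ids_v1 <- unique(d$subject_id[d$visit == 1 & !is.na(d$apathy)])
ids_v8 <- unique(d$subject_id[d$visit == 8 & !is.na(d$apathy)])
paired_ids <- intersect(ids_v1, ids_v8)

length(ids_v1)      # 802 participants with a score at visit 1
length(paired_ids)  # only 63 complete pairs remain for the t-test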

The paired two-sided t-test compared the mean apathy score at visit 1 with the mean apathy score at visit 8. We draw the reader's attention to the fact that this implies a rather small sample size, as it includes only those people with data from both the first and the eighth visit. The linear regression analysed the relationship between the visit number and the apathy score (using the "stats" package [13]), while we performed a longitudinal two-level mixed effects model analysis with a random intercept on the subject level, a random slope for the visit number, and the visit number as fixed effect (using the "lmer" function of the "lme4" package [14]). The latter two approaches use all available data from all visits, while the paired t-test does not. We illustrated the analyses in plots with the function "plot_model" of the R package sjPlot [15]. We conducted the data analysis using R version 3.6.3 [13], and the R syntax for all analyses is provided on the OSF project page (https://doi.org/10.17605/OSF.IO/NF4YB).
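For illustration, the three analyses can be sketched in R as follows. This is a minimal sketch, not the authors' OSF code; it assumes the hypothetical data frame "d" and the "paired_ids" vector from the sketch above. Note that "lme4" reports t-values without p-values; p-values such as those reported here are commonly obtained with the "lmerTest" package.

library(lme4)

# 1) Paired two-sided t-test: visit 8 vs. visit 1, complete pairs only.
d18  <- subset(d, subject_id %in% paired_ids & visit %in% c(1, 8),
               select = c(subject_id, visit, apathy))
wide <- reshape(d18, idvar = "subject_id", timevar = "visit",
                direction = "wide")        # yields columns apathy.1 and apathy.8
t.test(wide$apathy.8, wide$apathy.1, paired = TRUE)

# 2) Linear regression: uses all observations but treats them as independent.
summary(lm(apathy ~ visit, data = d))

# 3) Two-level mixed effects model: visit as fixed effect, random intercept
#    and random slope for visit on the subject level.
fit_lmm <- lmer(apathy ~ visit + (1 + visit | subject_id), data = d)
summary(fit_lmm)
confint(fit_lmm, parm = "beta_", method = "Wald")  # Wald CIs for the fixed effects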

Panel A in Fig. 2 illustrates the means and standard deviations of apathy for all participants at each visit, while the flow chart (Fig. S1) illustrates the number of participants at each stage. On average, we see lower apathy scores at visit 8 compared to visit 1 (higher score = worse). By definition, the paired t-test analyses pairs; in this case, only participants with complete apathy scores at visit 1 and visit 8 are included, reducing the analysed sample to 63 pairs of observations. Consequently, the t-test compares mean apathy scores in a subgroup of participants with data at both visits, leading to different observations from Panel A, as illustrated and described in Panel B: the apathy score has increased at visit 8, hence symptoms of apathy have worsened. The outcome of the t-test, along with the code, is given in Table 2. Interestingly, the effect estimate for the increase in apathy was not statistically significant (+1.016 points, 95% CI: −0.225, 2.257, p = 0.107). A possible reason for this non-significance is a loss of statistical power due to the small sample size included in the paired t-test. To visualise the loss of information between visit 1 and visit 8, we illustrated the complex individual trajectories of the participants in Fig. 3. Moreover, as described in Table S1 in the supplement, the participants at visit 8 (63/190) analysed in the t-test were significantly different from the non-participants at visit 8 (127/190): they were younger, had better education, and, most importantly, their apathy scores at visit 1 were lower. Consequently, those with a better overall situation kept coming back, while this was not the case for those with a worse outcome at visit 1, which explains the observed (non-significant) increase. This selective attrition may result in a biased estimation of change in apathy when analysed by the compared statistical methods.
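A sketch of how such an attrition comparison might look in R, assuming a hypothetical data frame "eligible" with one row per eligible participant, their baseline characteristics (e.g., "apathy_v1"), and using "paired_ids" from above (this is not the authors' code):

# Compare baseline apathy between participants analysed in the t-test and
# non-participants at visit 8 (analogous comparisons for age and education).
eligible$included <- eligible$subject_id %in% paired_ids
t.test(apathy_v1 ~ included, data = eligible)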

Figure 2. Bar charts illustrating apathy scores (means and standard deviations) per visit (Panel A: all participants; Panel B: subgroup analysed in the t-test). The red line indicates the mean apathy at visit 1.

Figure 3. Scatterplot illustrating the individual trajectories. The red line indicates the regression line.

From the results in Table 2, we see that the linear regression coefficient, representing change in apathy symptoms per year, is not significantly different from zero, indicating no change over time. One possible explanation is the violation of the assumption of independent observations for linear regression. In contrast, the effect estimates for the linear mixed effects model indicated a significant increase in apathy symptoms from visit 1 to visit 8 of +2.680 points (95% CI: 1.880, 3.472; p < 0.001). Consequently, mixed effects models were the only method able to detect an increase in apathy symptoms over time, and choosing mixed effects models for the analysis of longitudinal data reduces the risk of false negative results. The differences in the effect sizes are also reflected in the regression lines in Panels A and B of Fig. 4.

Figure 4. Scatterplots illustrating the relationship between visit number and apathy (Panel A: linear regression; Panel B: linear mixed effects model). Apathy is measured on a whole-number interval scale; jitter is applied on the x- and y-axes to illustrate the data points. The red line indicates the regression line.

The effect sizes differed depending on the choice of statistical method. Thus, the paired t-test and the linear regression produced output that would lead to different interpretations than the mixed effects models. More specifically, compared to the t-test and the linear regression (which indicated non-significant changes in apathy of only +1.016 and −0.064 points from visit 1 to visit 8, respectively), the linear mixed effects model found an increase of +2.680 points from visit 1 to visit 8 on the apathy scale. This increase is more than twice as high as indicated by the t-test and suggests that linear mixed models are a more sensitive approach for detecting meaningful changes perceived by people with PD over time.

Mixed effects models are a valuable tool in longitudinal data analysis, as they expand upon linear regression models by considering the correlation among repeated measurements within the same individuals through the estimation of a random intercept [1, 16, 17]. Specifically, to account for correlation between observations, linear mixed effects models use random effects to explicitly model the correlation structure, thus removing correlation from the error term. A random slope in addition to a random intercept allows both the rate of change and the mean value to vary by participant, capturing individual differences. This distinguishes them from group comparisons or standard linear regressions, in which such explicit modelling of correlation is not possible. Because the linear regression does not consider the correlation among repeated observations, it underestimates longitudinal change, which explains the smaller effect sizes and non-significant results of the regression. By including random effects, linear mixed effects models can better capture the variability within the data.
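In standard two-level notation, the model described above can be written as a sketch of the general form (not necessarily the authors' exact specification):

$$y_{ij} = \beta_0 + \beta_1\,\mathrm{visit}_{ij} + u_{0j} + u_{1j}\,\mathrm{visit}_{ij} + \varepsilon_{ij},$$

where $y_{ij}$ is the apathy score of subject $j$ at visit $i$, $\beta_0$ and $\beta_1$ are the fixed intercept and slope, $u_{0j}$ and $u_{1j}$ are the subject-level random intercept and slope (assumed jointly normal with mean zero), and $\varepsilon_{ij}$ is the residual error. Because all observations of subject $j$ share $u_{0j}$ and $u_{1j}$, the model induces the within-subject correlation that ordinary linear regression ignores.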

Another common challenge in longitudinal studies is missing data. Compared to the paired t-test and the regression, mixed effects models can also include participants with missing data at single visits and account for the individual trajectories of each participant, as illustrated in Fig. 2 [18]. Although multiple imputation could increase the sample size, those results need to be interpreted with caution in case the data are not missing at random [18, 19]. Note that we do not elaborate further on this topic, since it is separate from the comparison of statistical methods. Finally, the assumptions of the different statistical methods need to be respected. The paired t-test assumes a normal distribution, homogeneity of variance, and pairs of the same individuals in both groups [20, 21]. While mixed effects models do not rely on independent observations, as is the case for linear regression, all other assumptions of standard linear regression analysis (e.g., linearity, homoscedasticity, no multicollinearity) also hold for mixed effects model analyses. Thus, additional steps, e.g., checking the linearity of relationships or applying data transformations, are required before the analysis of clinical research questions [17].
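As a brief illustration, generic "lme4" diagnostics such as the following could be applied to the fitted object "fit_lmm" from the sketch above (a sketch of standard checks, not the specific checks performed in the study):

plot(fit_lmm)                                    # fitted vs. residuals: linearity, homoscedasticity
qqnorm(resid(fit_lmm)); qqline(resid(fit_lmm))   # approximate normality of residuals
re <- ranef(fit_lmm)$subject_id                  # subject-level random effects
qqnorm(re[["(Intercept)"]])                      # normality of random intercepts
qqnorm(re[["visit"]])                            # normality of random slopes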

While mixed effects models are not without limitations and need to be altered to model the time sequence between the exposure and the outcome [1], they are worth considering for longitudinal data analyses. Assuming an increase of apathy over time [3], mixed effects models were the only method able to detect statistically significant changes in the defined estimand, i.e., the change in apathy from visit 1 to visit 8. Possible reasons are a loss of statistical power due to the small sample size included in the paired t-test and the violation of the assumption of independent observations for linear regression. Specifically, the effects estimated for the group comparison and the linear regression were smaller, with high p-values, indicating a statistically non-significant change in apathy over time. The effect estimates for the mixed effects models were positive, with a very small p-value, indicating a statistically significant increase in apathy symptoms from visit 1 to visit 8, in line with clinical expectations. Mixed effects models can be used to estimate different types of longitudinal effects, while an inappropriate use of paired t-tests and linear regression to analyse longitudinal data can lead to underpowered analyses and an underestimation of longitudinal change, and thus of clinical significance. Therefore, researchers should more often consider mixed effects models for longitudinal analyses. Where this is not possible, the limitations of the analytical approach need to be discussed and taken into account in the interpretation.

Availability of data and materials

The LUXPARK database used in this study was obtained from the National Centre of Excellence in Research on Parkinson's disease (NCER-PD). The NCER-PD database is not publicly available as it is linked to the Luxembourg Parkinson's study and its internal regulations. The NCER-PD Consortium is willing to share its available data. Its access policy was devised based on the study ethics documents, including the informed consent form approved by the national ethics committee. Requests for access to datasets should be directed to the Data and Sample Access Committee by email at [email protected].

The code is available on OSF (https://doi.org/10.17605/OSF.IO/NF4YB).

Abbreviations

PD: Parkinson's disease

H0: Null hypothesis

H1: Alternative hypothesis

PDD: Parkinson's disease dementia

NCER-PD: National Centre of Excellence in Research on Parkinson's disease

OSF: Open Science Framework

CI: Confidence interval

References

1. Twisk JWR. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide. Cambridge: Cambridge University Press; 2013.

2. Levy R, Dubois B. Apathy and the functional anatomy of the prefrontal cortex-basal ganglia circuits. Cereb Cortex. 2006;16(7):916–28.

3. Poewe W, Seppi K, Tanner CM, Halliday GM, Brundin P, Volkmann J, et al. Parkinson disease. Nat Rev Dis Primers. 2017;3:17013.

4. Pagonabarraga J, Kulisevsky J, Strafella AP, Krack P. Apathy in Parkinson's disease: clinical features, neural substrates, diagnosis, and treatment. Lancet Neurol. 2015;14(5):518–31.

5. Drui G, Carnicella S, Carcenac C, Favier M, Bertrand A, Boulet S, Savasta M. Loss of dopaminergic nigrostriatal neurons accounts for the motivational and affective deficits in Parkinson's disease. Mol Psychiatry. 2014;19(3):358–67.

6. Liang G, Fu W, Wang K. Analysis of t-test misuses and SPSS operations in medical research papers. Burns Trauma. 2019;7:31.

7. Hipp G, Vaillant M, Diederich NJ, Roomp K, Satagopam VP, Banda P, et al. The Luxembourg Parkinson's Study: a comprehensive approach for stratification and early diagnosis. Front Aging Neurosci. 2018;10:326.

8. Pavelka L, Rawal R, Ghosh S, Pauly C, Pauly L, Hanff A-M, et al. Luxembourg Parkinson's study – comprehensive baseline analysis of Parkinson's disease and atypical parkinsonism. Front Neurol. 2023;14:1330321.

9. Starkstein SE, Mayberg HS, Preziosi TJ, Andrezejewski P, Leiguarda R, Robinson RG. Reliability, validity, and clinical correlates of apathy in Parkinson's disease. J Neuropsychiatry Clin Neurosci. 1992;4(2):134–9.

10. Leentjens AF, Dujardin K, Marsh L, Martinez-Martin P, Richard IH, Starkstein SE, et al. Apathy and anhedonia rating scales in Parkinson's disease: critique and recommendations. Mov Disord. 2008;23(14):2004–14.

11. Goetz CG, Tilley BC, Shaftman SR, Stebbins GT, Fahn S, Martinez-Martin P, et al. Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov Disord. 2008;23(15):2129–70.

12. Little RJA. A test of missing completely at random for multivariate data with missing values. J Am Stat Assoc. 1988;83(404):1198–202.

13. R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2023. Available from: https://www.R-project.org/.

14. Bates D, Maechler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67:1–48.

15. Lüdecke D. sjPlot: Data visualization for statistics in social science. 2022. R package version 2.8.11. Available from: https://CRAN.R-project.org/package=sjPlot.

16. Twisk JWR. Applied Multilevel Analysis: A Practical Guide for Medical Researchers. Cambridge: Cambridge University Press; 2006.

17. Twisk JWR. Applied Mixed Model Analysis: A Practical Guide. Cambridge: Cambridge University Press; 2019.

18. Long JD. Longitudinal Data Analysis for the Behavioral Sciences Using R. SAGE; 2012.

19. Twisk JWR, de Boer M, de Vente W, Heymans M. Multiple imputation of missing values was not necessary before performing a longitudinal mixed-model analysis. J Clin Epidemiol. 2013;66(9):1022–8.

20. Student. The probable error of a mean. Biometrika. 1908;6(1):1–25.

21. Polit DF. Statistics and Data Analysis for Nursing Research. England: Pearson; 2014.


Acknowledgements

We would like to thank all participants of the Luxembourg Parkinson’s Study for their important support of our research. Furthermore, we acknowledge the joint effort of the National Centre of Excellence in Research on Parkinson’s Disease (NCER-PD) Consortium members from the partner institutions Luxembourg Centre for Systems Biomedicine, Luxembourg Institute of Health, Centre Hospitalier de Luxembourg, and Laboratoire National de Santé generally contributing to the Luxembourg Parkinson’s Study as listed below:

Geeta ACHARYA 2, Gloria AGUAYO 2, Myriam ALEXANDRE 2, Muhammad ALI 1, Wim AMMERLANN 2, Giuseppe ARENA 1, Michele BASSIS 1, Roxane BATUTU 3, Katy BEAUMONT 2, Sibylle BÉCHET 3, Guy BERCHEM 3, Alexandre BISDORFF 5, Ibrahim BOUSSAAD 1, David BOUVIER 4, Lorieza CASTILLO 2, Gessica CONTESOTTO 2, Nancy DE BREMAEKER 3, Brian DEWITT 2, Nico DIEDERICH 3, Rene DONDELINGER 5, Nancy E. RAMIA 1, Angelo Ferrari 2, Katrin FRAUENKNECHT 4, Joëlle FRITZ 2, Carlos GAMIO 2, Manon GANTENBEIN 2, Piotr GAWRON 1, Laura Georges 2, Soumyabrata GHOSH 1, Marijus GIRAITIS 2,3, Enrico GLAAB 1, Martine GOERGEN 3, Elisa GÓMEZ DE LOPE 1, Jérôme GRAAS 2, Mariella GRAZIANO 7, Valentin GROUES 1, Anne GRÜNEWALD 1, Gaël HAMMOT 2, Anne-Marie HANFF 2, 10, 11, Linda HANSEN 3, Michael HENEKA 1, Estelle HENRY 2, Margaux Henry 2, Sylvia HERBRINK 3, Sascha HERZINGER 1, Alexander HUNDT 2, Nadine JACOBY 8, Sonja JÓNSDÓTTIR 2,3, Jochen KLUCKEN 1,2,3, Olga KOFANOVA 2, Rejko KRÜGER 1,2,3, Pauline LAMBERT 2, Zied LANDOULSI 1, Roseline LENTZ 6, Laura LONGHINO 3, Ana Festas Lopes 2, Victoria LORENTZ 2, Tainá M. MARQUES 2, Guilherme MARQUES 2, Patricia MARTINS CONDE 1, Patrick MAY 1, Deborah MCINTYRE 2, Chouaib MEDIOUNI 2, Francoise MEISCH 1, Alexia MENDIBIDE 2, Myriam MENSTER 2, Maura MINELLI 2, Michel MITTELBRONN 1, 2, 4, 10, 12, 13, Saïda MTIMET 2, Maeva Munsch 2, Romain NATI 3, Ulf NEHRBASS 2, Sarah NICKELS 1, Beatrice NICOLAI 3, Jean-Paul NICOLAY 9, Fozia NOOR 2, Clarissa P. C. GOMES 1, Sinthuja PACHCHEK 1, Claire PAULY 2,3, Laure PAULY 2, 10, Lukas PAVELKA 2,3, Magali PERQUIN 2, Achilleas PEXARAS 2, Armin RAUSCHENBERGER 1, Rajesh RAWAL 1, Dheeraj REDDY BOBBILI 1, Lucie REMARK 2, Ilsé Richard 2, Olivia ROLAND 2, Kirsten ROOMP 1, Eduardo ROSALES 2, Stefano SAPIENZA 1, Venkata SATAGOPAM 1, Sabine SCHMITZ 1, Reinhard SCHNEIDER 1, Jens SCHWAMBORN 1, Raquel SEVERINO 2, Amir SHARIFY 2, Ruxandra SOARE 1, Ekaterina SOBOLEVA 1,3, Kate SOKOLOWSKA 2, Maud Theresine 2, Hermann THIEN 2, Elodie THIRY 3, Rebecca TING JIIN LOO 1, Johanna TROUET 2, Olena TSURKALENKO 2, Michel VAILLANT 2, Carlos VEGA 2, Liliana VILAS BOAS 3, Paul WILMES 1, Evi WOLLSCHEID-LENGELING 1, Gelani ZELIMKHANOV 2,3

1 Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg

2 Luxembourg Institute of Health, Strassen, Luxembourg

3 Centre Hospitalier de Luxembourg, Strassen, Luxembourg

4 Laboratoire National de Santé, Dudelange, Luxembourg

5 Centre Hospitalier Emile Mayrisch, Esch-sur-Alzette, Luxembourg

6 Parkinson Luxembourg Association, Leudelange, Luxembourg

7 Association of Physiotherapists in Parkinson's Disease Europe, Esch-sur-Alzette, Luxembourg

8 Private practice, Ettelbruck, Luxembourg

9 Private practice, Luxembourg, Luxembourg

10 Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg

11 Department of Epidemiology, CAPHRI School for Public Health and Primary Care, Maastricht University Medical Centre+, Maastricht, the Netherlands

12 Luxembourg Center of Neuropathology, Dudelange, Luxembourg

13 Department of Life Sciences and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg

Funding

This work was supported by grants from the Luxembourg National Research Fund (FNR) within the National Centre of Excellence in Research on Parkinson's disease [NCER-PD (FNR/NCER13/BM/11264123)]. The funding body played no role in the design of the study; the collection, analysis, or interpretation of data; or the writing of the manuscript.

Author information

Authors and affiliations

1. Transversal Translational Medicine, Luxembourg Institute of Health, Strassen, Luxembourg

Anne-Marie Hanff & Rejko Krüger

2. Translational Neurosciences, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg

Anne-Marie Hanff & Rejko Krüger

3. Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University Medical Centre+, Maastricht, The Netherlands

Anne-Marie Hanff

4. Department of Nutrition and Movement Sciences, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Centre+, Maastricht, The Netherlands

Anne-Marie Hanff & Christopher McCrum

5. Parkinson Research Clinic, Centre Hospitalier du Luxembourg, Luxembourg, Luxembourg

Rejko Krüger

6. Department of Mathematics, University of Luxembourg, Esch-sur-Alzette, Luxembourg

Christophe Ley


Contributions

A-MH: Conceptualization, Methodology, Formal analysis, Investigation, Visualization, Project administration, Writing – original draft, Writing – review & editing. RK: Conceptualization, Methodology, Funding, Resources, Supervision, Project administration, Writing – review & editing. CMC: Conceptualization, Methodology, Supervision, Writing – original draft, Writing – review & editing. CL: Conceptualization, Methodology, Writing – original draft, Writing – review & editing.

Corresponding author

Correspondence to Anne-Marie Hanff.

Ethics declarations

Ethics approval and consent to participate

The study involved human participants and was reviewed and approved by the National Ethics Board, Comité National d'Ethique de Recherche (CNER Ref: 201407/13). The study was performed in accordance with the Declaration of Helsinki, and patients/participants provided their written informed consent to participate in this study. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this work is consistent with those guidelines.

Consent for publication

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Cite this article

Hanff, AM., Krüger, R., McCrum, C. et al. Mixed effects models but not t-tests or linear regression detect progression of apathy in Parkinson’s disease over seven years in a cohort: a comparative analysis. BMC Med Res Methodol 24 , 183 (2024). https://doi.org/10.1186/s12874-024-02301-7

Received: 21 March 2024

Accepted: 01 August 2024

Published: 24 August 2024

DOI: https://doi.org/10.1186/s12874-024-02301-7


Keywords

  • Cohort studies
  • Epidemiology
  • Disease progression
  • Lost to follow-up
  • Statistical model


