Welcome!

This Shiny app was created as part of an Open Education Resource for students at the University of British Columbia. The goal was to create an app that simplifies data analysis in Biology labs for students with minimal statistics or coding experience, while maintaining Open Science principles such as reproducibility. To enhance reproducibility, all of the R code used to generate plots, descriptive statistics, and analyses is displayed alongside the outputs.

Step 1: Choose or upload a dataset
You can upload your own dataset by following the instructions under the Choose Data tab. Alternatively, two built-in datasets are available for use in this app: mtcars (from the datasets package that ships with R) and penguins (from the palmerpenguins package).
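For reference, both built-in datasets can also be loaded in a regular R session. The lines below are a minimal sketch, assuming the palmerpenguins package is installed; they are not necessarily how the app itself loads the data.

```r
# mtcars ships with R in the 'datasets' package, which is attached by default
data(mtcars)
str(mtcars)  # 32 observations of 11 numeric variables

# penguins comes from the palmerpenguins package (assumed to be installed;
# if it is not, run install.packages("palmerpenguins") first)
if (requireNamespace("palmerpenguins", quietly = TRUE)) {
  penguins <- palmerpenguins::penguins
  str(penguins)
}
```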

Step 2: Create a plot
Follow the instructions under the Plot tab to visualize your data and save a copy of your new plot.

Step 3: Calculate descriptive statistics
Follow the instructions under the Descriptive Stats tab to calculate statistics for selected variables from your dataset based on their type (i.e. quantitative: discrete or continuous; categorical: nominal or ordinal).

Step 4: Perform statistical analyses
Follow the instructions under the Analysis tab to perform statistical tests using your selected data. The type of test performed depends on the types of variables in your dataset.

Instructions

To upload your data:
  1. Browse for and select a csv or txt file containing your data
  2. If the first row of your file contains column headers, make sure the header option is selected
  3. Choose the type of separator used in your data file. If your file is a csv, choose comma.

A table containing your data will then be displayed on the screen.
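The upload options above correspond closely to arguments of R's read.table function. The following is a minimal sketch of reading a comma-separated file with headers; the file contents, column names, and the object name my_data are hypothetical and chosen only for illustration.

```r
# Create a small example file so the sketch is self-contained
# (the column names and values here are hypothetical)
path <- tempfile(fileext = ".csv")
writeLines(c("species,mass_g", "A,10.5", "B,12.1", "A,9.8"), path)

# header = TRUE corresponds to selecting the header option;
# sep = "," corresponds to choosing the comma separator
my_data <- read.table(path, header = TRUE, sep = ",")

head(my_data)  # preview the first rows
str(my_data)   # check that each column has the expected type
```

For a tab-separated txt file, sep = "\t" would be used instead.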

Data Table

Instructions

To create and save a plot:
  1. Choose the y (response) variable
  2. Choose the data type of the y variable (either 'Quantitative' or 'Categorical')
  3. Choose the x (explanatory) variable
  4. Choose the data type of the x variable (either 'Quantitative' or 'Categorical')
  5. Depending on your choices for the above, you may be prompted to choose the type of plot you'd like to create
  6. To save a copy of your plot, click the Download Plot button
NOTE:
Only the following 3 combinations will successfully produce a graph:
  • Y (response) variable is quantitative and X (explanatory) variable is quantitative
  • Y (response) variable is quantitative and X (explanatory) is categorical
  • Y (response) variable is categorical and X (explanatory) is categorical
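As an illustration of the three combinations, the sketch below draws one plot of each kind from mtcars using base R graphics and writes them to a file. The choice of variables and plot types is an assumption made for this example; the code the app actually runs is the one shown under Source Code.

```r
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)  # treat number of cylinders as categorical
mtcars$am  <- factor(mtcars$am, labels = c("automatic", "manual"))

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)  # open a file device, similar in spirit to downloading a plot

# Quantitative y, quantitative x: scatter plot
plot(mpg ~ wt, data = mtcars,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Quantitative y, categorical x: box plot
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")

# Categorical y, categorical x: grouped bar plot of counts
barplot(table(mtcars$am, mtcars$cyl), beside = TRUE, legend.text = TRUE,
        xlab = "Cylinders", ylab = "Count")

dev.off()  # close the device so the file is written
```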

Source Code

The source code shows you the R script that is used to generate your plot.

Instructions

The descriptive statistics displayed are based on the data type of variables you selected (quantitative or categorical).
NOTE:
Only the following 3 combinations will produce meaningful descriptive statistics:
  • Variable 1 (response variable) is quantitative and Variable 2 (explanatory variable) is quantitative
  • Variable 1 (response variable) is quantitative and Variable 2 (explanatory variable) is categorical
  • Variable 1 (response variable) is categorical and Variable 2 (explanatory variable) is categorical

Descriptive Statistics

Interpreting the Output
For two quantitative variables
  • Each variable is shown in the first column. Row '1' represents 'Variable 1' whereas row '2' indicates 'Variable 2'.
  • Common descriptive statistics for quantitative variables are shown. These include: sample size (n), mean, standard deviation (sd), median, and inter-quartile range (iqr).
For a quantitative response variable (Variable 1) and a categorical explanatory variable (Variable 2)
  • Common descriptive statistics for the quantitative response variable are shown grouped by the categories within the categorical explanatory variable.
  • Specifically, rows represent levels (categories) of the categorical variable and columns show the sample size (n), mean, standard deviation (sd), median, and inter-quartile range (iqr) for that category.
For two categorical variables (nominal or ordinal)
  • A contingency table shows the frequencies (counts) of observations within each combination of categories across the two categorical variables.
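The three kinds of output described above can be reproduced in base R. This is a minimal sketch using mtcars; the helper name describe and the choice of variables are hypothetical, and the app's own code (shown under Source Code) may use different functions.

```r
data(mtcars)

# Helper (hypothetical name) returning the five statistics listed above
describe <- function(x) {
  c(n      = sum(!is.na(x)),
    mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    iqr    = IQR(x, na.rm = TRUE))
}

# Two quantitative variables: one row per variable
two_quant <- rbind(describe(mtcars$mpg), describe(mtcars$wt))
rownames(two_quant) <- c("1", "2")
print(two_quant)

# Quantitative response grouped by a categorical explanatory variable:
# one row per category (here, number of cylinders)
by_group <- t(sapply(split(mtcars$mpg, mtcars$cyl), describe))
print(by_group)

# Two categorical variables: contingency table of counts
counts <- table(cyl = mtcars$cyl, am = mtcars$am)
print(counts)
```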

Descriptive Statistics Output for Your Variables

Source Code

The source code shows you the R script that is used to generate the descriptive statistics for the chosen variables.

Instructions

Statistical tests are performed automatically based on the variables you select and their data types.

t-test
  • This analysis is used when examining a single quantitative (numeric) response variable in relation to a single categorical explanatory variable that has only 2 groups.
  • When performing a t-test in this app, you will be asked for a few additional parameters.
    • Type in the significance level you would like to use for the t-test. For example, if you'd like a 5% significance level, type in 0.05.
    • One assumption of the standard (Student's) two-sample t-test is that the variance of each group is approximately equal. Welch's t-test, which R's t.test function performs by default, does not require this assumption. For now, we will assume that both of your samples have equal variance. As such, please select 'Yes' when prompted for this option.
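The two parameters above map naturally onto arguments of R's t.test function. The sketch below compares fuel efficiency between automatic and manual cars in mtcars; the variable choice is hypothetical, and how the app passes these options internally is an assumption.

```r
data(mtcars)
alpha <- 0.05  # significance level, as typed into the app

# var.equal = TRUE corresponds to selecting 'Yes' for equal variances
# (Student's t-test); var.equal = FALSE, R's default, gives Welch's t-test
result <- t.test(mpg ~ am, data = mtcars,
                 var.equal = TRUE, conf.level = 1 - alpha)
print(result)

# The null hypothesis of equal means is rejected if the p-value is below alpha
result$p.value < alpha
```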

ANOVA
  • This analysis is used when examining a single quantitative (numeric) response variable in relation to a single categorical explanatory variable that has more than 2 groups.
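A one-way ANOVA of this kind can be sketched with R's aov function. The example uses mtcars with number of cylinders as the grouping variable, which is a choice made for illustration rather than the app's own code.

```r
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)  # 3 groups: 4, 6 and 8 cylinders

fit <- aov(mpg ~ cyl, data = mtcars)
anova_table <- summary(fit)
print(anova_table)  # F statistic and p-value for differences among group means
```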

Fisher's exact test
  • This analysis is used when testing for an association between two categorical variables. It is only used when both categorical variables have exactly 2 levels/groups. For example, one variable is sex (male/female) and the other is survival (yes/no).
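For a 2 x 2 case, the test can be sketched with R's fisher.test function. The example uses two binary variables from mtcars, chosen only for illustration.

```r
data(mtcars)

# am: transmission (0 = automatic, 1 = manual)
# vs: engine shape (0 = V-shaped, 1 = straight)
tab <- table(am = mtcars$am, vs = mtcars$vs)
print(tab)

result <- fisher.test(tab)
print(result)  # reports the p-value and an odds ratio estimate
```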

Chi-square contingency analysis
  • This analysis is used when testing for an association between two categorical variables. It is only used when at least one of the categorical variables has more than 2 levels/groups. For example, one variable is flower colour (pink/red) and the other is season (spring/summer/fall).
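The flower colour example can be sketched with R's chisq.test function. The counts below are entirely hypothetical and exist only to make the example runnable; they are not data from the app.

```r
# Hypothetical counts of flowers by colour and season (illustration only)
counts <- matrix(c(30, 20, 10,
                   15, 25, 20),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(colour = c("pink", "red"),
                                 season = c("spring", "summer", "fall")))

result <- chisq.test(counts)
print(result)

# Expected counts under independence; the chi-square approximation is
# usually considered reliable when these are not too small
result$expected
```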

Analysis Results

Interpreting the Output

Your Analysis Results

Source Code

The source code shows you the R script that is used to perform the statistical analysis on the selected variables.