Types of Data Distributions

Understanding the different types of distributions is important for running statistical analyses. Each type of distribution has its own properties, its own shape, and its own kinds of data that it is suited to.

In this course, we are going to show you 3 common types of distributions: Binomial, Poisson, and Normal. In your studies, you may need to use other types of distributions, but many more advanced distributions build on these 3.

Binomial Distribution

The binomial distribution describes the number of “successful outcomes” in a fixed number of independent trials, each with the same probability of success. True to its name, the binomial distribution is binary: each trial has only 1 of 2 outcomes, such as yes-no, success-fail, or presence-absence. Binary data is technically a form of discrete data (you can’t have half a no), and the binomial distribution models only 1 of these outcomes at a time: the number of successes, or of failures, or of yeses, or of nos, but not of yes AND no together.

Along the x axis, we plot the number of successful (or unsuccessful) trials in our study, and along the y axis the probability of each number. A biological example is yearly survival in a group of tracked individuals: the x axis shows how many of them survive the year, from 0 up to a known maximum (the number of individuals tracked). This distribution is binomial because each individual has only 1 of 2 outcomes: survive or don’t survive. Another example is the presence (or absence) of a particular gene in multiple populations, where the x axis shows the number of individuals in each population that carry that gene.
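To make the survival example concrete, here is a minimal sketch in Python of how binomial probabilities can be computed. The numbers are hypothetical and chosen only for illustration: we assume 10 tracked individuals, each with an independent 0.7 chance of surviving the year. The function name binomial_pmf is ours, not part of any library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical values: 10 tracked individuals, 0.7 chance each of surviving the year.
n, p = 10, 0.7

# One probability for every possible number of survivors (the x axis runs from 0 to n).
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} survivors: {prob:.4f}")
```

Each printed row is one bar of the binomial distribution, and the probabilities across all rows add up to 1.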

Poisson Distribution

The Poisson distribution is a discrete probability distribution that models the number of events occurring in a fixed interval of time or space (i.e. count data). It is similar to the binomial, but where the binomial models the number of successes out of a fixed number of trials, the Poisson models the number of events in a specified time or space. Further, each binomial trial has only 2 outcomes and the count can never exceed the number of trials, while a Poisson count has no upper limit. The Poisson is often used to model the number of individuals in a population, the number of mutations in a genome, or the number of rare events in a time interval.

The Poisson distribution is characterized by a single parameter, lambda, which is equal to both the mean and the variance of the distribution. At low values of lambda the distribution is right skewed, while at higher values of lambda it becomes more symmetric and is closely approximated by a normal distribution.
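The role of lambda can be checked numerically. The sketch below, in Python, computes the mean, variance, and skewness directly from the Poisson probabilities for three arbitrarily chosen values of lambda (1, 5, and 50). The helper names poisson_pmf and summarize are ours, and the cut-off k_max is an assumption that the probability of larger counts is negligible for these lambdas.

```python
from math import exp, lgamma, log

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean count is lam."""
    # Computed on the log scale to avoid overflow for large k.
    return exp(-lam + k * log(lam) - lgamma(k + 1))

def summarize(lam, k_max=400):
    """Return (mean, variance, skewness) computed from the probabilities.

    The infinite sum is truncated at k_max (an assumption: counts above
    k_max have negligible probability for the lambdas used here).
    """
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    skew = sum((k - mean) ** 3 * p for k, p in zip(ks, probs)) / var**1.5
    return mean, var, skew

# Arbitrary illustrative values of lambda.
results = {lam: summarize(lam) for lam in (1, 5, 50)}

for lam, (mean, var, skew) in results.items():
    print(f"lambda={lam:>2}: mean={mean:.3f} variance={var:.3f} skewness={skew:.3f}")
```

For each lambda, the mean and variance should both come out equal to lambda, and the positive skewness (the right skew) should shrink toward 0 as lambda grows.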

Normal Distribution

The normal distribution is a continuous probability distribution that is often used in biology as well as many other fields. The normal distribution is characterized by its bell-shaped curve, with the mean, median, and mode all being equal. Many biological measurements, such as the height of individuals in a population or the weight of seeds produced by a plant, can be modeled using the normal distribution. The normal distribution is important in statistical analysis because many statistical tests rely on the assumption that the data is normally distributed.
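As a small illustration of these properties, the sketch below uses NormalDist from Python's standard statistics module. The height values (mean 170 cm, standard deviation 10 cm) are hypothetical and chosen only to make the example readable.

```python
from statistics import NormalDist

# Hypothetical example: heights with mean 170 cm and standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# For a normal distribution the mean, median, and mode coincide.
print(heights.mean, heights.median, heights.mode)

# The bell curve is symmetric: points equally far from the mean have equal density.
print(heights.pdf(160), heights.pdf(180))

# Share of individuals expected between 160 cm and 180 cm
# (within 1 standard deviation of the mean).
within_one_sd = heights.cdf(180) - heights.cdf(160)
print(f"{within_one_sd:.3f}")
```

Unlike the binomial and Poisson, the normal distribution is continuous, so we ask about the probability of a range of values (here 160 to 180 cm) rather than of a single exact value.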

One thing to note is that we often use the normal distribution even for data that is not continuous. Remember how we said that at sufficiently high values of lambda, the Poisson distribution (which models discrete data) is closely approximated by the normal? Keep in mind that these distributions are used to model data, not to describe it perfectly. While many statistical tests assume one type of distribution or another, there is a bit of wiggle room in these assumptions (as we discuss later). This course will not go into these situations, but you should keep this property of distributions in mind while continuing your own statistical journey!
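One way to see how the Poisson approaches the normal is to measure the gap between the two directly. The sketch below, in Python, compares Poisson probabilities with the density of a normal distribution that has the same mean (lambda) and the same variance (lambda). The gap measure used here (the sum of absolute differences at each whole-number count) and the lambda values 2, 20, and 200 are our own choices for illustration, not a standard test.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean count is lam."""
    # Computed on the log scale to avoid overflow for large k.
    return exp(-lam + k * log(lam) - lgamma(k + 1))

def gap_to_normal(lam):
    """Sum over counts k of |Poisson probability - normal density at k|,
    using a normal distribution with matching mean and variance."""
    normal = NormalDist(mu=lam, sigma=sqrt(lam))
    # Assumption: counts more than about 10 standard deviations above
    # the mean contribute a negligible amount to the sum.
    k_max = int(lam + 10 * sqrt(lam)) + 10
    return sum(abs(poisson_pmf(k, lam) - normal.pdf(k)) for k in range(k_max + 1))

# Arbitrary illustrative values of lambda.
gaps = {lam: gap_to_normal(lam) for lam in (2, 20, 200)}

for lam, gap in gaps.items():
    print(f"lambda={lam:>3}: gap={gap:.4f}")
```

The gap shrinks as lambda increases but never reaches exactly 0, which is the sense in which the normal distribution models large counts well without describing them perfectly.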
