- Introduction
One of the important concepts in statistics is the probability distribution, which is widely used for analysis and prediction, especially in banking and the stock market. In the stock market, it helps analysts decide whether a stock is worth investing in and estimate the returns it may provide. For a given period of interest, a probability distribution is used to estimate what is likely to happen in the near future.
The more data is available (the longer the known history of a product), the more accurate the calculations can be. With probability distributions, a larger volume of data reduces the sampling error.
In this article, we’re going to learn the basics of probability distributions and their types, with several examples. We will start with the basic terminology, then construct basic probability distribution tables, and finally look at the difference between discrete and continuous distributions.
- What is Probability?
First, let’s discuss what probability means in statistics. In simple terms, probability is the branch of mathematics concerned with how likely a given event is to occur. It is represented by a number between 0 and 1. The closer the value is to 0, the less likely the event is to occur; the closer the value is to 1, the more likely it is. A value of exactly 1 means the event is certain to occur.
For example, if you are certain to go to the office, the probability of that event is 1. If a holiday tomorrow is very unlikely, its probability is close to 0. To study probability distributions properly, it helps to understand the basic terminology first. Let’s start with a quick look at the commonly used terms:
- Variable: A variable is a quantity we are observing. It is represented by a lowercase letter (a, y, z, etc.) and takes a numerical value.
- Random Variable: A random variable is a variable whose value is determined by the outcome of a random process, so it is not known in advance. Random variables are usually represented by uppercase letters.
To understand this, let’s take an example: a random variable is represented by an uppercase B. The probability that B takes a particular numerical value s is represented by:
P(B=s) (B is in upper case because it is a random variable, and s is in lower case because it is a particular value)
Now, if B is certain to take the value s, this can be written as:
P(B=s) = 1, stating that the event is certain to occur.
Now let’s discuss the key concept of the probability distribution.
- What is Probability Distribution?
A probability distribution can be defined as a statistical function (an equation or a table) that gives the possible values a random variable can take within a given range, together with how likely each value is. In practice, these values and their likelihoods are often obtained by observing the past behavior of the random variable. The result can be plotted on a graph between the minimum and maximum possible values. Where a value is likely to fall on that plot depends on several factors, namely the distribution’s mean, standard deviation and skew.
In other words, when the probability of each event in a given range is plotted on a chart, the result is commonly known as a probability distribution. This article aims to give you a proper foundation for understanding probability distributions.
For a better understanding, let’s consider some examples.
Example 1:
Consider a coin with two sides: heads and tails, represented as H and T. Flip the coin once, and you will see either heads or tails. Flip the coin twice, and there are 4 possible outcomes: heads twice (HH), heads then tails (HT), tails then heads (TH), or tails twice (TT). Flip the coin 3 times, and you will see one of the following 8 outcomes:
HHT, HHH, THH, HTT, TTT, TTH, THT, HTH.
Now, let B be the random variable that represents the number of tails (T) seen after tossing the coin 3 times, and let c be a particular value of B. Tails can occur 0 times, 1 time, 2 times or 3 times, so c lies between 0 and 3. Consider the probability of seeing tails exactly once: P(B=1) is 3/8, because 3 of the 8 equally likely outcomes (HHT, HTH and THH) contain exactly one tail and two heads.
Now, let’s prepare the probability distribution table:
Number of Tails (c) | P(B=c) | Probability |
0 | P(B=0) | 1/8 |
1 | P(B=1) | 3/8 |
2 | P(B=2) | 3/8 |
3 | P(B=3) | 1/8 |
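The table above can be reproduced by listing every outcome and counting tails. The following is a minimal Python sketch, assuming a fair coin; the function name `tails_distribution` is made up for this illustration and is not part of the article.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def tails_distribution(num_tosses):
    """Return {c: P(B=c)}, where B is the number of tails in num_tosses fair tosses."""
    # Enumerate every equally likely sequence, e.g. ('H', 'T', 'H').
    outcomes = list(product("HT", repeat=num_tosses))
    # Count how many sequences contain each possible number of tails.
    counts = Counter(seq.count("T") for seq in outcomes)
    total = len(outcomes)
    return {c: Fraction(counts[c], total) for c in sorted(counts)}


distribution = tails_distribution(3)
for c, p in distribution.items():
    print(f"P(B={c}) = {p}")
# P(B=0) = 1/8
# P(B=1) = 3/8
# P(B=2) = 3/8
# P(B=3) = 1/8
```

Exact fractions are used instead of floating-point numbers so that the printed values match the table and the probabilities sum to exactly 1.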
Example 2:
Now consider a die and roll it once. As the die has six sides, the possible outcomes are 1, 2, 3, 4, 5 and 6. Let B be the number of times a 2 appears in that single roll, and let c be a particular value of B. A 2 appears in one of the six equally likely outcomes, so P(B=1) is 1/6. In the remaining five outcomes you won’t get a 2, so P(B=0) is 5/6.
Now let’s prepare the probability distribution table for this example:
Number of 2s on Rolling the Die Once (c) | P(B=c) | Probability |
0 | P(B=0) | 5/6 |
1 | P(B=1) | 1/6 |
In general, the outcome of a single die roll can be written as:
Outcome of the die roll | 1 | 2 | 3 | 4 | 5 | 6 |
Probability of occurrence | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
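Both die tables can be derived from the same starting point. The sketch below is one possible way to do it in Python, assuming a fair six-sided die; the names `face_distribution` and `count_distribution` are illustrative only.

```python
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]

# Each face of a fair die is equally likely: probability 1/6.
face_distribution = {face: Fraction(1, len(FACES)) for face in FACES}


def count_distribution(target):
    """Return {c: P(B=c)}, where B counts how often `target` shows in one roll."""
    # Probability that the single roll shows the target face.
    p_hit = sum(p for face, p in face_distribution.items() if face == target)
    return {0: 1 - p_hit, 1: p_hit}


twos = count_distribution(2)
print(twos[0], twos[1])  # 5/6 1/6
```

Note that in both tables the probabilities add up to 1, which holds for any probability distribution.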
- Probability Distribution: Discrete and Continuous
A probability distribution can be either discrete or continuous. If the variable being measured is discrete, we get a discrete probability distribution, which can be written as a table. If the variable is continuous by nature, we get a continuous probability distribution, which is described by a function rather than a table.
Still confused about the difference between discrete and continuous variables? Here are some examples:
Continuous Variable: when a variable can take any value within a range of observation, it is a continuous variable.
A continuous probability distribution applies when you are concerned with the probabilities of a random variable that has continuous outcomes. Examples are the height of an adult picked at random from a crowd, or the time taken by students to complete an examination. In both examples, the random variable is better suited to a continuous probability distribution.
When a probability function is used to describe a continuous probability distribution, it is generally called a probability density function (pdf).
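With a probability density function, a probability corresponds to an area under the curve rather than to a single value of the function. The sketch below illustrates this for the height example. It assumes, purely for illustration, that heights follow a normal distribution with a mean of 170 cm and a standard deviation of 10 cm; the article gives no such figures, and the helper names are made up here.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x (a density, not a probability)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def area_under_pdf(pdf, low, high, steps=10_000):
    """Approximate the area under pdf between low and high (trapezoidal rule)."""
    width = (high - low) / steps
    total = 0.5 * (pdf(low) + pdf(high))
    for i in range(1, steps):
        total += pdf(low + i * width)
    return total * width


# Hypothetical parameters, chosen only for illustration.
MEAN_CM, SD_CM = 170.0, 10.0


def height_pdf(x):
    return normal_pdf(x, MEAN_CM, SD_CM)


# Probability that a height lies between 160 cm and 180 cm.
p_between = area_under_pdf(height_pdf, 160.0, 180.0)
# The area over (almost) the whole range should be very close to 1.
p_total = area_under_pdf(height_pdf, MEAN_CM - 8 * SD_CM, MEAN_CM + 8 * SD_CM)
print(round(p_between, 3))  # 0.683
print(round(p_total, 3))    # 1.0
```

The probability of any single exact height is zero under this model, which is why continuous distributions are described by areas over intervals.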
Discrete Variable: if a variable can take only distinct, countable values (typically integers), it is a discrete variable. For example, a pizza can be sliced into 4, 6, 8, 12 or more slices, but it can’t be cut into 7.5 slices.
[Figure: the probability distribution for a process that has two possible outcomes.]
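A process with two possible outcomes can also be explored by simulation. The following is a rough Python sketch, assuming a fair coin (success probability 0.5) and a fixed random seed so that runs are repeatable; both choices and the function name `empirical_distribution` are assumptions made for this illustration.

```python
import random
from collections import Counter


def empirical_distribution(num_trials, p_success=0.5, seed=0):
    """Estimate {outcome: relative frequency} for a two-outcome process."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Record 1 for a success and 0 for a failure in each trial.
    counts = Counter(
        1 if rng.random() < p_success else 0 for _ in range(num_trials)
    )
    return {outcome: counts[outcome] / num_trials for outcome in (0, 1)}


for n in (100, 10_000, 100_000):
    estimate = empirical_distribution(n)
    print(n, estimate)
```

With more trials, the observed frequencies tend to settle closer to the true probabilities, which is the sense in which more data reduces sampling error.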
- Conclusion
To conclude, a probability distribution can be stated as a list of possible outcomes together with their associated probabilities. More formally, it is a function that describes the likelihood of each value that a random variable can take in a given range. Small distributions can be summarised with tables, whereas large distributions are easier to summarise with functions.
A probability distribution can be either discrete or continuous. If the variable is discrete, the result is a discrete probability distribution, and the output of its probability mass function is a probability. If the variable is continuous, the result is a continuous probability distribution, and the probability is represented by the area under the curve produced by its probability density function. In both cases, the parameters of the probability function play a central role in defining the outcomes of the random variable.
Probability distributions are used for analysis and prediction, and they have their own significance in real-life applications. There are a variety of probability distributions that you can use to model different types of data, and the correct one depends on your data. The type of distribution behind your question and data determines which statistical method you need to use, so understanding probability distributions is an essential foundation before performing real-life statistical inference.