Probability

The probabilities of rolling several numbers using two dice.

Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, 0 indicates impossibility of the event and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes ("heads" and "tails") are equally probable: the probability of "heads" equals the probability of "tails". Since no other outcomes are possible, the two probabilities add up to 1, so each of "heads" and "tails" has probability 1/2 (which could also be written as 0.5 or 50%).

These concepts have been given an axiomatic mathematical formalization in probability theory, which is used widely in areas of study such as statistics, mathematics, science, finance, gambling, artificial intelligence, machine learning, computer science, game theory, and philosophy to, for example, draw inferences about the expected frequency of events. Probability theory is also used to describe the underlying mechanics and regularities of complex systems.

Terminology of probability theory

Experiment: An operation which can produce some well-defined outcomes is called an Experiment.

Example: When we toss a coin, we know that either heads or tails shows up. So, the operation of tossing a coin may be said to have two well-defined outcomes, namely, (a) heads showing up; and (b) tails showing up.

Random Experiment: When we roll a die, we know that any of the numerals 1, 2, 3, 4, 5, or 6 may appear on the upper face, but we cannot say in advance which number will show up.

Such an experiment, in which all possible outcomes are known but the exact outcome cannot be predicted in advance, is called a Random Experiment.

Sample Space: All the possible outcomes of an experiment, taken as a whole, form the Sample Space.

Example: When we roll a die, we can get any outcome from 1 to 6. All the possible numbers which can appear on the upper face form the Sample Space (denoted by S). Hence, the Sample Space of a die roll is S={1,2,3,4,5,6}.

Outcome: Any possible result out of the Sample Space S for a Random Experiment is called an Outcome.

Example: When we roll a die, we might obtain 3 or when we toss a coin, we might obtain heads.

Event: Any subset of the Sample Space S is called an Event (denoted by E). When an outcome which belongs to the subset E takes place, it is said that the Event has occurred. When an outcome which does not belong to the subset E takes place, the Event has not occurred.

Example: Consider the experiment of throwing a die. Here the Sample Space is S={1,2,3,4,5,6}. Let E denote the event 'a number less than 4 appears'. Thus the Event is E={1,2,3}. If the number 1 appears, we say that Event E has occurred. Similarly, if the outcome is 2 or 3, we say that Event E has occurred, since these outcomes belong to the subset E.
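The terms above can be illustrated with a minimal Python sketch. The names `S`, `E`, `event_occurred`, and `classical_probability` are chosen here for illustration only, and the formula P(E) = |E| / |S| assumes that all outcomes are equally likely, as with a fair die.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

# Event E: "a number less than 4 appears" (a subset of S).
E = {outcome for outcome in S if outcome < 4}


def event_occurred(outcome, event):
    """Return True when the observed outcome belongs to the event."""
    return outcome in event


def classical_probability(event, sample_space):
    """P(E) = |E| / |S|; valid only when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


print(sorted(E))                    # [1, 2, 3]
print(event_occurred(2, E))         # True: 2 belongs to E
print(event_occurred(5, E))         # False: 5 does not belong to E
print(classical_probability(E, S))  # 1/2
```

Each call to `event_occurred` corresponds to one trial of the random experiment.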

Trial: By a trial, we mean performing a random experiment.

Example: (i) Tossing a fair coin, (ii) rolling an unbiased die.

Theory

Like other theories, the theory of probability is a representation of its concepts in formal terms—that is, in terms that can be considered separately from their meaning. These formal terms are manipulated by the rules of mathematics and logic, and any results are interpreted or translated back into the problem domain.

There have been at least two successful attempts to formalize probability, namely the Kolmogorov formulation and the Cox formulation. In Kolmogorov's formulation (see also probability space), sets are interpreted as events and probability as a measure on a class of sets. In Cox's theorem, probability is taken as a primitive (i.e., not further analyzed), and the emphasis is on constructing a consistent assignment of probability values to propositions. In both cases, the laws of probability are the same, except for technical details.

There are other methods for quantifying uncertainty, such as the Dempster–Shafer theory or possibility theory, but those are essentially different and not compatible with the usually-understood laws of probability.

Applications

Probability theory is applied in everyday life in risk assessment and modeling. The insurance industry and markets use actuarial science to determine pricing and make trading decisions. Governments apply probabilistic methods in environmental regulation, entitlement analysis, and financial regulation.

An example of the use of probability theory in equity trading is the effect of the perceived probability of any widespread Middle East conflict on oil prices, which have ripple effects in the economy as a whole. An assessment by a commodity trader that a war is more likely can send that commodity's prices up or down, and signals other traders of that opinion. Accordingly, the probabilities are neither assessed independently nor necessarily rationally. The theory of behavioral finance emerged to describe the effect of such groupthink on pricing, on policy, and on peace and conflict.

In addition to financial assessment, probability can be used to analyze trends in biology (e.g., disease spread) as well as in genetics (e.g., Punnett squares). As with finance, risk assessment can be used as a statistical tool to calculate the likelihood of undesirable events occurring, and can assist with implementing protocols to avoid encountering such circumstances. Probability is used to design games of chance so that casinos can expect a profit in the long run, yet provide payouts to players that are frequent enough to encourage continued play.

Another significant application of probability theory in everyday life is reliability. Many consumer products, such as automobiles and consumer electronics, use reliability theory in product design to reduce the probability of failure. Failure probability may influence a manufacturer's decisions on a product's warranty.

The cache language model and other statistical language models that are used in natural language processing are also examples of applications of probability theory.

Mathematical treatment

Consider an experiment that can produce a number of results. The collection of all possible results is called the sample space of the experiment, sometimes denoted as $\Omega$. The power set of the sample space is formed by considering all different collections of possible results. For example, rolling a die can produce six possible results. One collection of possible results gives an odd number on the die. Thus, the subset {1,3,5} is an element of the power set of the sample space of die rolls. These collections are called "events". In this case, {1,3,5} is the event that the die falls on some odd number. If the results that actually occur fall in a given event, the event is said to have occurred.
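For a small sample space, the power set can be listed explicitly. The following Python sketch (the helper name `power_set` is an illustrative choice) enumerates every event for a single die roll; a sample space with 6 results has 2^6 = 64 events, including the empty event and the whole sample space.

```python
from itertools import chain, combinations

sample_space = [1, 2, 3, 4, 5, 6]


def power_set(results):
    """Return every subset of `results` as a frozenset (one per event)."""
    subsets = chain.from_iterable(
        combinations(results, size) for size in range(len(results) + 1)
    )
    return [frozenset(subset) for subset in subsets]


events = power_set(sample_space)
odd = frozenset({1, 3, 5})

print(len(events))    # 64
print(odd in events)  # True: "the die shows an odd number" is an event
```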

A probability is a way of assigning every event a value between zero and one, with the requirement that the event made up of all possible results (in our example, the event {1,2,3,4,5,6}) is assigned a value of one. To qualify as a probability, the assignment of values must satisfy the requirement that for any collection of mutually exclusive events (events with no common results, such as the events {1,6}, {3}, and {2,4}), the probability that at least one of the events will occur is given by the sum of the probabilities of all the individual events.
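The two requirements can be checked numerically for the die example. This is a sketch only, assuming a fair die so that every result has weight 1/6; the function name `prob` is illustrative.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def prob(event):
    """Uniform assignment for a fair die: each result has weight 1/6."""
    return Fraction(len(set(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))


# The mutually exclusive events named in the text.
events = [{1, 6}, {3}, {2, 4}]
union = set().union(*events)

# No result is shared, so the sizes of the parts add up to the size of the union.
pairwise_disjoint = sum(len(e) for e in events) == len(union)

sum_of_parts = sum(prob(e) for e in events)

print(prob(SAMPLE_SPACE))  # 1: the event of all possible results
print(pairwise_disjoint)   # True
print(sum_of_parts)        # 5/6
print(prob(union))         # 5/6: equals the sum of the individual probabilities
```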

The probability of an event A is written as $P(A)$, $p(A)$, or ${\text{Pr}}(A)$. This mathematical definition of probability can extend to infinite sample spaces, and even uncountable sample spaces, using the concept of a measure.

The opposite or complement of an event A is the event [not A] (that is, the event of A not occurring), often denoted as $A'$, $A^{c}$, ${\overline {A}}$, $A^{\complement }$, $\neg A$, or ${\sim }A$; its probability is given by $P({\text{not }}A)=1-P(A)$. As an example, the chance of not rolling a six on a six-sided die is $1-({\text{chance of rolling a six}})=1-{\tfrac {1}{6}}={\tfrac {5}{6}}$.
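The complement rule can be verified by counting, as in this short Python sketch (a fair die is assumed, and the variable names are illustrative):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of a fair six-sided die
A = {6}                 # event "a six is rolled"
not_A = S - A           # complement: every result that is not in A

p_A = Fraction(len(A), len(S))
p_not_A = Fraction(len(not_A), len(S))

print(p_A)      # 1/6
print(p_not_A)  # 5/6, the same value as 1 - p_A
```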

If two events A and B both occur on a single performance of an experiment, this is called the intersection of A and B. The probability of the intersection is called the joint probability of A and B, denoted as $P(A\cap B)$.

Independent events

If two events A and B are independent, then the joint probability is

$P(A{\mbox{ and }}B)=P(A\cap B)=P(A)P(B).$

For example, if two coins are flipped, then the chance of both being heads is ${\tfrac {1}{2}}\times {\tfrac {1}{2}}={\tfrac {1}{4}}$.
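The two-coin example can be reproduced by listing all four equally likely outcomes. The sketch below assumes fair coins; the helper `prob` and the event functions are illustrative names.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of flipping two fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Fraction of the equally likely outcomes for which `event` holds."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))


def first_is_heads(outcome):
    return outcome[0] == "H"


def second_is_heads(outcome):
    return outcome[1] == "H"


def both_heads(outcome):
    return first_is_heads(outcome) and second_is_heads(outcome)


p_a = prob(first_is_heads)
p_b = prob(second_is_heads)
p_joint = prob(both_heads)

print(p_a, p_b)            # 1/2 1/2
print(p_joint)             # 1/4
print(p_joint == p_a * p_b)  # True: the product rule for independent events
```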

Mutually exclusive events

If either event A or event B can occur but never both simultaneously, then they are called mutually exclusive events.

If two events are mutually exclusive, then the probability of both occurring is denoted as $P(A\cap B)$ and

$P(A{\mbox{ and }}B)=P(A\cap B)=0.$

If two events are mutually exclusive, then the probability of either occurring is denoted as $P(A\cup B)$ and

$P(A{\mbox{ or }}B)=P(A\cup B)=P(A)+P(B)-P(A\cap B)=P(A)+P(B)-0=P(A)+P(B).$

For example, the chance of rolling a 1 or 2 on a six-sided die is $P(1{\mbox{ or }}2)=P(1)+P(2)={\tfrac {1}{6}}+{\tfrac {1}{6}}={\tfrac {1}{3}}.$

Not mutually exclusive events

If the events are not mutually exclusive, then

$P\left(A{\mbox{ or }}B\right)=P(A\cup B)=P\left(A\right)+P\left(B\right)-P\left(A{\mbox{ and }}B\right).$

For example, when drawing a single card at random from a regular deck of cards, the chance of getting a heart or a face card (J, Q, K), or one that is both, is ${\tfrac {13}{52}}+{\tfrac {12}{52}}-{\tfrac {3}{52}}={\tfrac {22}{52}}={\tfrac {11}{26}}$, since among the 52 cards of a deck, 13 are hearts, 12 are face cards, and 3 are both. The 3 cards that are both are included in the "13 hearts" and again in the "12 face cards", but should be counted only once, so they are subtracted once.
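The card example can be checked by building the deck and counting. This Python sketch assumes a standard 52-card deck with every card equally likely to be drawn; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# Each card is a (rank, suit) pair; 13 ranks x 4 suits = 52 cards.
deck = set(product(RANKS, SUITS))

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}


def prob(event):
    """Probability of drawing a card from `event`, all cards equally likely."""
    return Fraction(len(event), len(deck))


direct = prob(hearts | faces)  # count the union directly
by_formula = prob(hearts) + prob(faces) - prob(hearts & faces)

print(len(hearts), len(faces), len(hearts & faces))  # 13 12 3
print(direct)      # 11/26
print(by_formula)  # 11/26
```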

Conditional probability

Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written $P(A\mid B)$, and is read "the probability of A, given B". It is defined by

$P(A\mid B)={\frac {P(A\cap B)}{P(B)}}.$

If $P(B)=0$ then $P(A\mid B)$ is formally undefined by this expression. In this case $A$ and $B$ are independent, since $P(A\cap B)=P(A)P(B)=0$ . However, it is possible to define a conditional probability for some zero-probability events using a σ-algebra of such events (such as those arising from a continuous random variable).

For example, in a bag of 2 red balls and 2 blue balls (4 balls in total), the probability of taking a red ball is $1/2$; however, when taking a second ball without putting the first one back, the probability of it being a red ball or a blue ball depends on the ball previously taken. If a red ball was taken first, then the probability of picking a red ball again is $1/3$, since only 1 red and 2 blue balls remain. If a blue ball was taken first, the probability of then taking a red ball is $2/3$.
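The bag example can be reproduced from the definition $P(A\mid B)=P(A\cap B)/P(B)$ by listing every ordered pair of draws. The sketch below assumes the first ball is not returned to the bag and that every ordered pair is equally likely; the ball labels and function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "B1", "B2"]  # 2 red and 2 blue balls

# Every ordered (first, second) draw without replacement: 4 x 3 = 12 pairs.
draws = list(permutations(balls, 2))


def prob(event):
    """Fraction of the equally likely ordered draws for which `event` holds."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); undefined when P(b) is zero."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) = 0, so P(A | B) is undefined by this formula")
    return prob(lambda d: a(d) and b(d)) / p_b


def first_red(draw):
    return draw[0].startswith("R")


def first_blue(draw):
    return draw[0].startswith("B")


def second_red(draw):
    return draw[1].startswith("R")


print(prob(first_red))                      # 1/2
print(conditional(second_red, first_red))   # 1/3
print(conditional(second_red, first_blue))  # 2/3
```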

Inverse probability

In probability theory and applications, Bayes' rule relates the odds of event $A_{1}$ to event $A_{2}$, before (prior to) and after (posterior to) conditioning on another event $B$. The odds on $A_{1}$ to $A_{2}$ are simply the ratio of the probabilities of the two events. When arbitrarily many events $A$ are of interest, not just two, the rule can be rephrased as "posterior is proportional to prior times likelihood", $P(A\mid B)\propto P(A)P(B\mid A)$, where the proportionality symbol means that the left-hand side is proportional to (i.e., equals a constant times) the right-hand side as $A$ varies, for fixed or given $B$ (Lee, 2012; Bertsch McGrayne, 2012). In this form it goes back to Laplace (1774) and to Cournot (1843); see Fienberg (2005).
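The proportional form can be turned into actual posterior probabilities by dividing by the sum over all hypotheses. The Python sketch below uses a made-up scenario that is not from the text: three hypothetical bags of 4 balls, containing 1, 2, or 3 red balls, are assumed equally likely beforehand, and the event $B$ is "a red ball is drawn".

```python
from fractions import Fraction

# Hypothetical priors P(A): each bag is assumed equally likely.
prior = {
    "1 red": Fraction(1, 3),
    "2 red": Fraction(1, 3),
    "3 red": Fraction(1, 3),
}

# Likelihoods P(B | A): chance of drawing a red ball from each 4-ball bag.
likelihood = {
    "1 red": Fraction(1, 4),
    "2 red": Fraction(2, 4),
    "3 red": Fraction(3, 4),
}

# Posterior is proportional to prior times likelihood.
unnormalized = {a: prior[a] * likelihood[a] for a in prior}

# The constant of proportionality is 1 / P(B), where P(B) sums over all A.
p_b = sum(unnormalized.values())
posterior = {a: weight / p_b for a, weight in unnormalized.items()}

# Odds form of Bayes' rule: posterior odds = prior odds x likelihood ratio.
prior_odds = prior["3 red"] / prior["1 red"]
likelihood_ratio = likelihood["3 red"] / likelihood["1 red"]
posterior_odds = posterior["3 red"] / posterior["1 red"]

print(p_b)                 # 1/2
print(posterior["3 red"])  # 1/2
print(posterior_odds)      # 3
```

With equal priors the prior odds are 1, so the posterior odds equal the likelihood ratio.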

Summary of probabilities

Event	Probability
A	$P(A)\in [0,1]\,$
not A	$P(A^{\complement })=1-P(A)\,$
A or B	${\begin{aligned}P(A\cup B)&=P(A)+P(B)-P(A\cap B)\\P(A\cup B)&=P(A)+P(B)\qquad {\mbox{if A and B are mutually exclusive}}\\\end{aligned}}$
A and B	${\begin{aligned}P(A\cap B)&=P(A\mid B)P(B)=P(B\mid A)P(A)\\P(A\cap B)&=P(A)P(B)\qquad {\mbox{if A and B are independent}}\\\end{aligned}}$
A given B	$P(A\mid B)={\frac {P(A\cap B)}{P(B)}}={\frac {P(B\mid A)P(A)}{P(B)}}\,$