Statistics
Lecture 7
Continuous probability distributions
•David Bartl
•Statistics
•INM/BASTA

Outline of the lecture
•Continuous probability distributions
•Continuous uniform distribution
•Gaussian normal distribution
•Exponential distribution
•Some other continuous probability distributions

Experiment – Trial – Random variable

Random variable – Dataset
•To conclude, a random variable assigns a numerical value to each outcome of the random experiment.
•A dataset is a collection of measurements and observations, i.e. a collection of data. A data unit is an entity of the population under study, and a data item (or variable) is a characteristic of each data unit. We consider numerical (quantitative) variables now.
•We hypothesize that the data items in the dataset are realizations of the random variable, i.e. the random variable generates the data through the trials of the random experiment.

Examples of continuous random variables

Random variable

Assumptions to simplify the matters

Probability mass function

Probability density function

Cumulative distribution function

The density & the cumulative distribution function
[Figure: the density and the cumulative distribution function; labels: z=0, p=0,5, z=0,7, p=0,62]

Continuous uniform distribution

Uniform distribution (continuous)

Uniform distribution (continuous) in Excel

Uniform distribution (continuous): Example
•The bus operates every 15 minutes. A passenger who does not know the timetable arrives at the bus stop at a random time.
•The operator comes to the telephone set every 15 minutes. A customer makes a telephone call at a random time and waits until the operator comes.
•In both cases, the waiting time is modelled as uniformly distributed between 0 and 15 minutes.
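The bus example above can be sketched in Python. This is a minimal illustration, not the lecture's own material: it assumes that the waiting time is uniformly distributed on the interval from 0 to 15 minutes, and the names PERIOD, uniform_cdf, uniform_mean and simulate_waits are hypothetical helpers introduced only for this sketch.

```python
import random

PERIOD = 15.0  # minutes between two buses (taken from the example)


def uniform_cdf(x, a=0.0, b=PERIOD):
    """P(X <= x) for X uniformly distributed on the interval [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_mean(a=0.0, b=PERIOD):
    """Expected value of the uniform distribution on [a, b]."""
    return (a + b) / 2


def simulate_waits(n, seed=0):
    """Simulate n passengers arriving at a random moment of the 15-minute cycle.

    The waiting time is the time remaining until the next bus.
    """
    rng = random.Random(seed)
    return [PERIOD - rng.uniform(0.0, PERIOD) for _ in range(n)]


waits = simulate_waits(100_000)
simulated_mean = sum(waits) / len(waits)
simulated_share = sum(w <= 5.0 for w in waits) / len(waits)

print("expected wait (formula):   ", uniform_mean())
print("expected wait (simulation):", simulated_mean)
print("P(wait <= 5 min) (formula):   ", uniform_cdf(5.0))
print("P(wait <= 5 min) (simulation):", simulated_share)
```

Under these assumptions the formula gives an expected wait of 7.5 minutes and a probability of one third of waiting at most 5 minutes; the simulated values should be close to these.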
•Notice the difference:
  - we have just one request here
  - when applying the discrete Poisson distribution or the continuous exponential distribution, we have a series of requests coming at a constant rate

Gaussian normal distribution
•Normal distribution
•Normalized normal distribution
•Central Limit Theorem

Normal distribution
•It has been observed in practice that the distribution of many real phenomena follows the typical “bell curve”, i.e. the phenomena can be approximated by the classical Gaussian normal distribution. These phenomena include:
  - the results of repeated measurements of lengths, distances, weights, etc.
  - the results of various tests and examinations (exams)
•We explain these practical observations by hypothesizing that many random quantities present in the experiment add up, partly cancel one another, and (by the Central Limit Theorem) yield approximately the normal distribution.
[Figures: normal distribution plots; source: Wikipedia]
[Figure: normal distribution; labels: z=0, p=0,5]

Normal distribution: The 68–95–99.7 rule
•The 68–95–99.7 rule / The three-sigma (3σ) rule
[Figure: standard deviation diagram; source: Wikipedia]

Normalized normal distribution
Table: the entry in row r and column c is Φ(z) − 0.5 = P(0 ≤ Z ≤ z) for z = r + c, where Φ is the cumulative distribution function of the normalized normal distribution.

z     0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08     0.09
0.0   0.00000  0.00399  0.00798  0.01197  0.01595  0.01994  0.02392  0.02790  0.03188  0.03586
0.1   0.03983  0.04380  0.04776  0.05172  0.05567  0.05962  0.06356  0.06749  0.07142  0.07535
0.2   0.07926  0.08317  0.08706  0.09095  0.09483  0.09871  0.10257  0.10642  0.11026  0.11409
0.3   0.11791  0.12172  0.12552  0.12930  0.13307  0.13683  0.14058  0.14431  0.14803  0.15173
0.4   0.15542  0.15910  0.16276  0.16640  0.17003  0.17364  0.17724  0.18082  0.18439  0.18793
0.5   0.19146  0.19497  0.19847  0.20194  0.20540  0.20884  0.21226  0.21566  0.21904  0.22240
0.6   0.22575  0.22907  0.23237  0.23565  0.23891  0.24215  0.24537  0.24857  0.25175  0.25490
0.7   0.25804  0.26115  0.26424  0.26730  0.27035  0.27337  0.27637  0.27935  0.28230  0.28524
0.8   0.28814  0.29103  0.29389  0.29673  0.29955  0.30234  0.30511  0.30785  0.31057  0.31327
0.9   0.31594  0.31859  0.32121  0.32381  0.32639  0.32894  0.33147  0.33398  0.33646  0.33891
1.0   0.34134  0.34375  0.34614  0.34850  0.35083  0.35314  0.35543  0.35769  0.35993  0.36214
1.1   0.36433  0.36650  0.36864  0.37076  0.37286  0.37493  0.37698  0.37900  0.38100  0.38298
1.2   0.38493  0.38686  0.38877  0.39065  0.39251  0.39435  0.39617  0.39796  0.39973  0.40147
1.3   0.40320  0.40490  0.40658  0.40824  0.40988  0.41149  0.41309  0.41466  0.41621  0.41774
1.4   0.41924  0.42073  0.42220  0.42364  0.42507  0.42647  0.42786  0.42922  0.43056  0.43189
1.5   0.43319  0.43448  0.43574  0.43699  0.43822  0.43943  0.44062  0.44179  0.44295  0.44408

Normal distribution in Excel

Normal distribution: Example

Central Limit Theorem (CLT)
•There are several versions or variants of the Central Limit Theorem.
•Its earliest version is now known as the de Moivre-Laplace Theorem. It states that the normal distribution approximates the discrete binomial distribution.
•We shall then mention the Lindeberg-Lévy Theorem, which is a comprehensible variant of the Central Limit Theorem.

CLT: de Moivre-Laplace Theorem (local form)

CLT: de Moivre-Laplace Theorem (integral form)

CLT: Lindeberg-Lévy Theorem

Exponential distribution
•There are some events, such as
  - customers coming to a shop during one hour (between 10:00 and 11:00, say)
  - telephone calls incoming during one hour (between 10:00 and 11:00, say)
  - requests incoming to a server during one minute (between 10:00 and 10:01)
  - meteorites of diameter ≥ 1 meter hitting the Earth during a year
  - decay events from a radioactive source
•that (as we suppose) have some properties in common.
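Looking back at the table of the normalized normal distribution and the 68–95–99.7 rule: the following Python sketch shows one way such values can be recomputed. It assumes that the tabulated quantity is Φ(z) − 0.5, which agrees with the entries 0.00000 for z = 0.00 and 0.34134 for z = 1.00, and it uses the identity Φ(z) = (1 + erf(z/√2))/2. The function names are illustrative only.

```python
import math


def std_normal_cdf(z):
    """Cumulative distribution function of the normalized normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def table_entry(z):
    """Value assumed to be tabulated above: P(0 <= Z <= z) = Phi(z) - 0.5."""
    return std_normal_cdf(z) - 0.5


def within_k_sigma(k):
    """P(|Z| <= k): probability of falling within k standard deviations of the mean."""
    return std_normal_cdf(k) - std_normal_cdf(-k)


# Two entries that can be compared with the table (rows 0.5 and 1.0, column 0.00).
print(round(table_entry(0.50), 5))
print(round(table_entry(1.00), 5))

# The 68-95-99.7 rule, in percent.
for k in (1, 2, 3):
    print(k, round(100 * within_k_sigma(k), 1))
```

Any other entry of the table can be checked in the same way, e.g. table_entry(0.28) for row 0.2 and column 0.08.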
Exponential distribution
•Suppose that a random event occurs repeatedly and satisfies the following assumptions:
•the event can occur at any time
•the average number of occurrences of the event during a time interval of a fixed length is constant; it depends neither on the beginning of the interval nor on the number of occurrences of the event before the beginning of the interval
•the average number of occurrences of the event during a time interval is proportional to the length of the interval
•…
[Figures: exponential distribution plots; source: Wikipedia]
[Figure: exponential distribution, density f(x) and cumulative distribution function F(x); labels: x=0,69, F(x)=0,5, p=0,5]

Exponential distribution: Examples
•The time until the next telephone call.
•The time until a radioactive particle decays.
•The time between clicks of a Geiger-Müller counter.
•The time until the next default in risk modelling.
•The time until the next failure / accident / …

Some continuous probability distributions derived from the normal distribution
•Pearson’s χ² distribution
•Student’s t distribution
•Fisher-Snedecor F distribution

Pearson’s χ² distribution
[Figures: chi-squared distribution plots; source: Wikipedia]

The gamma function

The gamma function – another definition (due to Euler)

chi-squared distribution in Excel

Student’s t distribution
[Figures: t-distribution plots; source: Wikipedia]

t-distribution in Excel

Fisher-Snedecor F distribution
[Figures: F-distribution plots; source: Wikipedia]

F-distribution in Excel
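The slides list Pearson's chi-squared distribution among the distributions derived from the normal distribution, but the defining formulas are not preserved here. The Python sketch below therefore assumes the standard construction: the sum of the squares of k independent random variables with the normalized normal distribution has the chi-squared distribution with k degrees of freedom, whose mean is k and whose variance is 2k. The names chi2_sample, simulate_chi2, sample_mean and sample_variance are hypothetical.

```python
import random


def chi2_sample(k, rng):
    """One draw: the sum of squares of k independent normalized normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate_chi2(k, n, seed=0):
    """Generate n draws with k degrees of freedom (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [chi2_sample(k, rng) for _ in range(n)]


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


k = 4
draws = simulate_chi2(k, 50_000)
print("sample mean:    ", sample_mean(draws), "(theory:", k, ")")
print("sample variance:", sample_variance(draws), "(theory:", 2 * k, ")")
```

The simulated mean and variance should lie close to k and 2k respectively; the agreement improves as the number of draws grows.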