# Statistics/ Econometrics

Click on the underlined name of the video to play the video on YouTube.

*FILE* means that there is an associated file. Many of these files will not be available for a while, since I have to recreate the links for all of them. You can try the menu at the top: under "More", go to "Files", and some of them might work...

Hans Rosling, a talented data educator, has died at age 68. He presented 10 TED talks and developed gapminder.org for understanding data around the world; he will be missed. Link to my tribute video.

Introductory Statistics Course:
Stats: Basics
- **Stats Intro Vocabulary 1**: Boring but necessary stats vocabulary (first half): e.g. qualitative, categorical, parameter, statistic, ratio vs. interval data, etc.
- **Stats Intro Vocabulary 2**: Continuation of above.
- **(OLD) Stats Tables and Graphs 1**: Frequency Distribution, Graphing Categorical data, Bar Graph, Excel pivot table.
- **(OLD) Stats Tables and Graphs 2**: Frequency Distribution, Graphing Quantitative data, Excel pivot table, Histogram.
- **Categorical Data in Excel 2016: Update**: Here we make frequency distributions two ways: first using the COUNTIF function, and then using a Pivot Table.
- **Quantitative Data in Excel 2016: Frequency and Histogram**: Using Quantitative Data to create a Frequency Distribution, and graph a Histogram using an Excel 2016 pivot table.
- **Histograms: Accurate Limits**: How to force Excel to label the endpoints of our classes properly.
- **Histogram: Add Normal Curve**: How to add a Normal distribution curve to an existing histogram.
- **Quantitative Data and CUMULATIVE distributions in Excel**: In this video we make a Cumulative Frequency Distribution, and make a graph of an Ogive.
- **Crosstabulations in Excel**: Here we use Excel's PivotTable feature to make a cross-tabulation in Excel.
- **Making Scatterplots and Trendlines**: How to make and understand scatterplots and trendlines in Excel 2016.
- **How to ID a point in a Scatterplot**: I show a couple of quick ways to find a scatter plot point in your data set.

Stats: Numerical Descriptive Statistics
- **Mean, Median, Mode**: Some details about how to find and interpret the mean, median, and mode.
- **Percentiles, Quartiles, and Simple Box Plots**: How to calculate and interpret percentiles, quartiles, range, interquartile range, and make a simple boxplot.
- **Dispersion: Variance, Std. Dev. Etc.**: Here we calculate variance, standard deviation, coefficient of variation, range, and interquartile range.
- **Skewness and Kurtosis**: How to think about skewness and kurtosis measures.

Stats: Introduction to Probability
- **Intro Probability A**: Preliminary rules and symbols used in probability.
- **Intro Probability B**: Complement and addition rules for probability.
- **Intro Probability C**: Conditional probability rule, multiplication rule, and statistical independence.
- **Intro Probability D**: Multiplication rule for independent events, and what I call Burkey's rule of common sense for probability.
- **Intro Probability E**: Using probability rules with a joint probability table: calculating unions, finding intersections and marginal probabilities in a table.
- **Intro Probability F**: Using a joint probability table to calculate conditional probabilities, and check for statistical independence.
- **Probability G Bayes Rule**: A derivation of Bayes' rule, a numerical example so that you can see WHY it works, and an example calculation using drug tests.

Stats: Discrete Probability Distributions: Tables, Poisson, Binomial, and Hypergeometric
- **Discrete Numeric Probability Distributions Overview**: An overview of what a discrete numeric probability distribution is, and three examples: the Poisson, binomial, and hypergeometric. I just give an overview here, no formulas. Further videos focus on each distribution individually.
- **Basic Formulas for Discrete Distributions**: Mean, Variance, and Standard Deviation for discrete probability distributions, working with tables.
- **Poisson**: The Poisson probability distribution and how it works.
- **Binomial**: The binomial probability distribution and how to use it.
- **Hypergeometric**: The hypergeometric probability distribution and how to use it. An example using cards is given.

Discrete Distributions: Applications with Lottery Tickets
- **Scratchers**: Basic Formulas applied to Scratch-Off Style Tickets.
- **Pick Three and Pick Four**: The Binomial Distribution, used to analyze Pick 3 and Pick 4 tickets (each ball pulled from separate urns).
- **Lotto**: The Hypergeometric Distribution, used to analyze multiple balls pulled from one urn.

Stats: Continuous Probability Distributions (Uniform, Normal, Exponential)
- **Continuous Uniform Distribution Calculations**: Introduction to continuous distributions; how to solve problems with the continuous Uniform distribution. Includes variance, standard deviation, expected value, and probability calculations.
- **Uniform distribution variance: Why the 12?**: Everyone who studies the uniform distribution wonders: Where does the 12 come from in (b-a)^2/12? Here I show you!
- **Normal Distribution Intro**: Here we take an introductory look at the Normal distribution and the Empirical Rule.
- **Is my data Normally Distributed?**: Using qqplots and other techniques to see how close a distribution is to Normal. Handout
- **Normal Dist. Calculations 1: Finding Probabilities**: How to use a z score table and solve basic problems like: Given an x, what is the probability of being more than x? Or less than x? Or between 2 x's?
- **Normal Calcs 2: Finding X's**: Here we do what I call a "backward problem": given a probability, find the x value that would result in that probability.
- **Normal Dist. Calcs 3: Finding Two Symmetric X's with Alpha**: Here we solve a common problem: Given that, say, 95% is in the middle of the normal distribution, how do I find two x's the same distance on either side? I introduce the concept of Alpha here, the probability that "ain't in the middle".

Introduction to Sampling Distributions
- **Introduction to Sampling Distributions: Central Limit Theorem and LLN**: Here we introduce the idea of a sampling distribution and what it is about. I also discuss the meaning of the central limit theorem and the law of large numbers, two commonly confused ideas.
- **Sampling Distribution: Means and Standard Errors**: Here we calculate some problems with sample means and sampling distributions.
- **Sampling Distribution: Proportions**: Here we calculate some problems with sample proportion sampling distributions.

How to Make Confidence Intervals
- **Confidence Intervals: An Introduction**: Here is an overview of the idea of a confidence interval, and some of the terminology used.
- **Confidence Intervals for Means**: Here we focus on doing confidence intervals for means, with an introduction to the t distribution.
- **Confidence intervals for means: Practice Problem**: Here we do a practice problem making confidence intervals for means with both the t and z, and discuss why you use each.
- **Confidence Intervals for Proportions**: Here we look at how to make confidence intervals for proportions.

Crash Course (or Review) of Hypothesis Testing
- **Review of Hypothesis Testing1**: Review of basic statistical ideas needed for hypothesis testing: deviation, standard deviation, empirical rule, and z scores.
- **Review of Hyp.Testing2**: An overview of the idea behind hypothesis testing.
- **Review of Hyp.Testing3**: Standard errors, and a z test of proportions for coin flipping data.
- **Review of Hypothesis Testing4 z**: Review of several kinds of z-based hypothesis tests: z test for one proportion, two proportions, one mean, and two means.
- **Review of Hypothesis Testing5 t**: A comparison of many different types of t tests that you might see, and how they are similar. Hypothesis testing using a t with one mean, paired t tests, and two independent samples.
- **Review of Hypothesis Testing6 x2**: A review of what the chi square distribution is, and 3 common chi square tests: testing a sample variance, a test of independence (contingency table), and a goodness of fit test.
- **Review of Hyp.Testing6 x2 ci**: Using a chi-square distribution to make a confidence interval for a population variance based on a sample variance.
- **Review of Hypothesis Testing7 F**: Introduction to the F distribution and F tests: comparing two sample variances and a One-Way Anova. *FILE* HotDog.xls

Introductory Econometrics Course with R:
Each section includes lectures on the ideas and theories, as well as a "How To" section using the R Statistics Package (FREE!). In my course, you learn the WHY, and WHAT everything means.

Video 1: Learn R Free within R! Intro to the "swirl" package.

Econometrics One: Getting Started with Modeling Linear Relationships
- **Econometrics Preliminaries**: An introduction to the course, and what you should know before you start.
- **Econometrics basic intuition**: Introducing econometric modeling at a basic level (Lecture 1).
- **Econometrics intuition b**: Part 2 of lecture 1.
- **Econometrics intuition c**: Part 3 of lecture 1.
- **Econometrics intuition D1**: An example of modeling height and weight with a dummy variable for gender.
- **Econometrics intuition D2**: Example modeling HT/WT with dummies, pt. 2.
- **3d visualization**: Visualizing a multiple regression in 3D using Maple.
- **R Intro 1**: The basics of using R for statistics. Link to R website (FREE!)
- **Mailbag: dummy interaction**: In a regression, what does interacting TWO dummy variables mean? Here is the answer!
- **Mailbag: Notation**: An answer to a viewer question: Why do we see alphas, betas, beta-hat, B's for slopes, and e's, epsilons and u's for error terms? Why can't the notation be easier? (Contains some basic + advanced content)

Econometrics Two: Modeling NON-Linear Relationships (Curves)
- **Econometrics Curves intuition a1**: Motivation: How residual plots can tell you about a nonlinear relationship, and intro to logarithms: Important!
- **Econometrics Curves a2**: Short continuation of above (1.5 minutes).
- **Econometrics Curves intuition b**: Using logarithms to model curves. Review of logs again, log-linear models, and start of lin-log models.
- **Econometrics Curves c**: Lin-log models and log-log models.
- **Econometrics Curves d1**: Modeling with polynomials.
- **Econometrics Curves intuition d2**: Second part of modeling with polynomials (short, 1.5 minutes).
- **R intro 2**: Using R to estimate non-linear models.

Econometrics Three: OLS Formulas  *File* 3 OLS Formulas.xls
- **OLS Form a**: Ordinary least squares regression. How do you calculate OLS slopes and y intercepts?
- **OLS Form b**: Using numbers to calculate a slope and y intercept "by hand".
- **OLS Form c**: Interpretation and calculation of R squared and TSS, RSS, ESS, etc.
- **OLS Form d**: Interpreting and calculating R-squared and adjusted R-squared.
- **R Intro 3a**: The third lecture on how to use R, focusing on OLS Formulas.
- **R Intro 3b**: Continuation of above.
- **OLS: A graphical View**: An intuitive, graphical view of the RSS, ESS, and TSS formulas that is easy to understand and remember.
- **Derivation of OLS Formulas**: A derivation of the OLS Slope and Intercept Formulas using calculus.
- **Special Cases: Derivations**: Here I derive two special case formulas, where we know the slope OR intercept should be = 0.
- **Centered Data**: In this brief visualization, I illustrate why subtracting the means from your data in a regression does not change the slopes, but sets the y intercept to zero.

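
As a rough companion to the "by hand" OLS videos above, here is a minimal sketch of the one-variable slope, intercept, and sum-of-squares formulas. It is written in plain Python rather than the R used in the videos, the data are made-up toy numbers (not from the course files), and the function name `ols_by_hand` is hypothetical. It assumes the naming convention where ESS is the explained sum of squares and RSS is the residual sum of squares; some textbooks swap those labels.

```python
# Hypothetical toy data, not taken from the course files.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols_by_hand(x, y):
    """One-variable OLS using the textbook deviation-from-mean formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Slope = sum of cross-deviations / sum of squared x deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx

    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar

    fitted = [intercept + slope * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
    ess = tss - rss                                          # explained

    r_squared = ess / tss
    k = 1  # number of slope coefficients in this model
    adj_r_squared = 1 - (rss / (n - k - 1)) / (tss / (n - 1))

    return {
        "slope": slope,
        "intercept": intercept,
        "TSS": tss,
        "RSS": rss,
        "ESS": ess,
        "R2": r_squared,
        "adj_R2": adj_r_squared,
    }


result = ols_by_hand(x, y)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For this toy data the slope is 0.6 and the intercept is 2.2, with an R squared of 0.6. Adjusted R squared is lower because it penalizes the degrees of freedom used by the slope.
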
Econometrics Four: USING 'Metrics
- **Using 'metrics a**: Here we look at how to evaluate a regression and some steps in conducting a regression project.
- **Using 'metrics b**: The steps in doing a regression project.
- **Using 'metrics c**: Here we look at a real research study, and evaluate it IN DETAIL (4 videos total).
- **Using 'metrics d**: Continuation of the above.
- **Using 'metrics e**: Yet another continuation...
- **Using 'metrics f**: Yet ANOTHER continuation of above.
- **R Intro 4a**: Here we talk about modeling using some data in R.
- **R Intro 4b**: Copying, pasting, and formatting regression results for use in a paper or report. *FILE* R4sumstats
- **Burkey's writing tips**: Common writing problems that economists complain about. Some common mistakes everyone should avoid, even me!

Econometrics 5: Assumptions of the CLRM (Gauss-Markov Theorem)  *FILE* 5 CLRM.pdf
- **Assumptions of CLRM a**: The assumptions of the Classical Linear Regression Model. Part a discusses some preliminary ideas.
- **Assumptions of CLRM b**: Part b shows a graphical discussion of "minimum variance (efficiency)" versus "unbiased" estimators.
- **Assumptions of CLRM c**: Overview of the Gauss-Markov Theorem.
- **Assumptions of CLRM d**: Assumption 1: Linear in the coefficients, correctly specified, and additive error term.
- **Assumptions of CLRM e**: Assumptions 2 and 3: Error term has zero mean and errors are unrelated to the explanatory variables.
- **Assumptions of CLRM f**: Assumptions 4 and 5: No serial correlation and no heteroskedasticity.
- **Assumptions of CLRM g**: Assumptions 6 and 7: No perfect multicollinearity and normally distributed error term.

Econometrics 6: Statistical Inference (Also see Crash Course on Hypothesis Testing, listed under Statistics)
- **Inference A1**: A review of basic information about the central limit theorem and the normal distribution, the foundation of most statistical inference calculations.
- **Inference A2**: Continuation of above.
- **Inference B**: A review of the Law of Large Numbers and Standard Errors, important for understanding inference.
- **Inference C**: The epistemology of inference: that is, what can we learn about the world from collecting data, and how do we know it?
- **Inference D**: Statistical Inference: Actually Doing it, Pt. 1.
- **Inference E**: Statistical Inference: Actually Doing it, Pt. 2: An introduction to the p-value.
- **Inference F**: Now that we have DONE some hypothesis tests... we must sit back and think... so what? What does it all mean?
- **Inference G**: A trip to the courtroom for a trial: A metaphor for understanding type 1 and type 2 errors, power, confidence levels, and the relationship between alpha and beta in hypothesis testing.
- **Inference H**: Discover Your Inner Alpha: Why do people talk about alpha=.05 in hypothesis testing? Why do some people use .10 or .01? Discover your inner alpha in this experiment, your willingness to commit a type 1 error.
- **Inference I**: Why use a z, t, F, or chi square distribution? In this part, you can see where these distributions come from.
- **Inference J**: Why use a z, t, F, or chi square distribution? In Part J we look at some common statistical tests, and you get to see why they have a particular distribution.

Econometrics Seven: Specification and Variable Selection

For information about functional form selection, see the first two sections: modeling linear/nonlinear relationships.

- **Econometrics Specification 1**: An introduction to specification. How do you select the right set of explanatory variables?
- **Econometrics Specification 2**: Using Venn Diagrams, we look at where the INFORMATION used to calculate slopes/standard errors comes from, and look at multicollinearity. This is important when you are concerned about R squared, statistical significance, etc.
- **Econometrics Spec. 2b**: Continuation of above. We focus on omitted variables bias this time.
- **Econometrics Specification 3**: Specification: Selecting the Variables for an Econometric Model. There is no one right way, but many wrong ways... I discuss some common methods, and I throw in some quotes about econometrics for fun.

Econometrics Eight: Multicollinearity
- **Econometrics Multicollinearity A**: Multicollinearity increases standard errors. So, this video focuses on the calculation of standard errors in a regression. We calculate some by hand (using R and Excel).
- **Multicollinearity B**: Continuation/discussion of above, Variance Inflation Factors, and what they REALLY mean.

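
To make the Variance Inflation Factor idea above concrete, here is a minimal sketch for the simplest case of exactly two explanatory variables, where the R squared from regressing one regressor on the other equals their squared correlation, so VIF = 1 / (1 - r^2). It is written in plain Python rather than the R and Excel used in the videos; the data and the function names `correlation` and `vif_two_regressors` are hypothetical and are not taken from the course.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two explanatory variables.

    With only two regressors, the auxiliary R squared is r squared,
    so both slopes share the same VIF.
    """
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical toy regressors, not taken from the course files.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

vif = vif_two_regressors(x1, x2)
# The slope's standard error scales with the square root of the VIF,
# relative to the case where the regressors are uncorrelated.
se_inflation = math.sqrt(vif)

print(f"correlation: {correlation(x1, x2):.4f}")
print(f"VIF: {vif:.4f}")
print(f"standard error inflation factor: {se_inflation:.4f}")
```

Here the two regressors have a correlation of 0.8, so the variance of each slope estimate is about 2.78 times what it would be with uncorrelated regressors, and the standard error is about 1.67 times as large. With more than two regressors, the auxiliary R squared would have to come from a full regression of each regressor on all the others.
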
Econometrics 9: Heteroskedasticity
- **Heteroskedasticity A**: What heteroskedasticity is, why we care, and intro to detection.
- **Heteroskedasticity B**: White Test, brief info on the Park Test and Goldfeld-Quandt, and White/Huber/Eicker Sandwich Estimators.
- **Heteroskedasticity C** *File white function*: Applied Heteroskedasticity in R: Check for heteroskedasticity "by hand" with White/Breusch-Pagan tests, and define a function to apply the White/Huber/Eicker correction using the lmtest and car libraries.

Econometrics 10: Two Stage Least Squares
- **Econometrics TSLS in R Part 1**: A brief overview of endogeneity, and how to do Two Stage Least Squares "by hand".
- **Econometrics TSLS in R part 2**: Using R's library for two stage least squares "automatically".
- **Econometrics TSLS in R part 3**: How to calculate the TSLS correction factor to fix standard errors, and how to correct heteroskedasticity using White's correction with TSLS.

See my brief Spatial Econometrics course here: https://spatial.burkeyacademy.com/

- **Panel Data Models**: An introduction to Panel Data and Fixed Effects Models: We use R, too! *File: Panel Data*
- **Fixed Effects v Random Effects**: What is the difference, and why should I care? Hausman test. We use R!