Statistical Tests

Course Aim

This is a basic course. It is intended for students who have not yet learned the basics of statistical methods and who expect to conduct experimental studies or numerical simulations in the future.

Course Description

Develop the basic methodology of hypothesis testing for statistical analysis of experimental and simulation studies. Through lectures and exercises using Python, explore the fundamentals of probability theory, population statistics, and statistical methods including p-values, t-test, U-test, Welch test, confidence intervals, single and multivariate analyses, and correlations. Extend these concepts with discussion of information theory, mutual information, and experimental design.

Course Contents

Every week, a lecture on one topic is followed by an exercise in Python.

1. Introduction
The history and basic concepts of hypothesis testing are explained. The fundamentals of probability distributions are also introduced.

2. Sampling and Central Limit Theorem
The central limit theorem is the core of various hypothesis testing methods. The law of large numbers and the central limit theorem are explained in the context of sampling from a population. I will also explain the degrees of freedom in data sampling.
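
As a rough illustration of the sampling idea, the sketch below draws repeated samples from a uniform population and compares the spread of the sample means with the central limit theorem's prediction. The population, sample size, and number of repetitions are arbitrary choices made for this example, not course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

def sample_mean(n):
    """Mean of n draws from a uniform(0, 1) population (mean 0.5, variance 1/12)."""
    return statistics.fmean([random.random() for _ in range(n)])

n = 30  # sample size (arbitrary choice for illustration)
means = [sample_mean(n) for _ in range(5000)]

# The central limit theorem predicts that the sample means are roughly normal
# with mean 0.5 and standard deviation sqrt(1/12) / sqrt(n).
predicted_sd = (1 / 12) ** 0.5 / n ** 0.5
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
print(observed_mean, observed_sd, predicted_sd)
```

The observed mean and standard deviation should land close to the predicted values, and increasing n shrinks the spread in proportion to 1/sqrt(n).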

3. T-test, U-test, Welch test
Comparison of means between two groups is frequently required in statistical assessment of measured data. Depending on the properties of data, however, different methods should be adopted. These methods are explained together with the basic notions of statistical significance and p-values.
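
To make the difference between the tests concrete, here is a minimal sketch that computes the pooled-variance (Student) t statistic and the Welch t statistic by hand. The two groups are made-up numbers, and the helper names `student_t` and `welch_t` are hypothetical. In practice the p-value is then read from a t distribution with the returned degrees of freedom; statistical libraries such as SciPy provide ready-made versions of these tests.

```python
import math
import statistics

# Hypothetical measurements for two groups (made-up numbers for illustration).
group_a = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
group_b = [4.2, 4.8, 4.5, 4.1, 4.9, 4.4, 4.6, 4.3]

def student_t(x, y):
    """Two-sample t statistic with a pooled variance (assumes equal variances)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    pooled = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    diff = statistics.fmean(x) - statistics.fmean(y)
    return diff / math.sqrt(pooled * (1 / nx + 1 / ny)), nx + ny - 2

def welch_t(x, y):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    sx = statistics.variance(x) / nx  # squared standard error of each mean
    sy = statistics.variance(y) / ny
    diff = statistics.fmean(x) - statistics.fmean(y)
    df = (sx + sy) ** 2 / (sx ** 2 / (nx - 1) + sy ** 2 / (ny - 1))
    return diff / math.sqrt(sx + sy), df

print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

When the two groups have the same size, the two statistics coincide and only the degrees of freedom differ. The Welch version is the safer choice when the variances may be unequal.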

4. Confidence Intervals
Experts now discourage relying on p-values alone. First, I will explain why p-values by themselves are not sufficient for statistical assessment. Then, I will show how statistical differences can be assessed more reliably within the hypothesis-testing framework by using confidence intervals of means and proportions.
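
As one simple instance, the sketch below computes a normal-approximation (Wald) confidence interval for a proportion. The function name `proportion_ci` and the counts are hypothetical, and the Wald interval is only one of several possible constructions; it can be inaccurate for small samples or proportions near 0 or 1.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 42 successes out of 100 trials.
low, high = proportion_ci(successes=42, n=100)
print(f"{low:.3f} to {high:.3f}")
```

Unlike a bare p-value, the interval shows both the estimated size of the proportion and the uncertainty around it.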

5. ANOVA, Effect Size
Statistical comparison between multiple groups is frequently required in practice. I will explain how such a comparison can be done by comparing the within-class variances and the between-class variances. Various corrections required for multiple comparisons are also explained, together with the criteria for statistical differences.
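
The comparison of between-class and within-class variances can be sketched as the F statistic of a one-way ANOVA. The helper name `one_way_f` and the three groups are hypothetical, chosen so that the arithmetic is easy to check by hand.

```python
import statistics

def one_way_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = statistics.fmean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Made-up data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_f(groups)
print(f_value, df_between, df_within)
```

For these numbers the between-group sum of squares is 26 and the within-group sum of squares is 6, so F = (26 / 2) / (6 / 6) = 13 with 2 and 6 degrees of freedom. A large F indicates that the group means differ by more than the within-group scatter would suggest.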

6. Correlation Analysis
Correlation analysis is a standard method for analyzing the relationship between statistical variables. After explaining the meaning of statistical independence, I will explain the correlation analysis of continuous and discrete variables, together with its limitations.
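
One limitation can be shown in a few lines: the Pearson correlation coefficient measures linear association only. In the sketch below (made-up data, hypothetical helper `pearson_r`), a perfectly linear relationship gives a coefficient of 1, while an exact quadratic relationship on symmetric inputs gives 0 even though the variables are fully dependent.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 1 for v in x]   # y depends linearly on x
quadratic = [v ** 2 for v in x]   # y is determined by x, but not linearly

r_linear = pearson_r(x, linear)
r_quadratic = pearson_r(x, quadratic)
print(r_linear, r_quadratic)
```

A correlation of zero therefore does not imply statistical independence.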

7. Information Theory
Information theory is a concept that was not discovered in ancient Greece. In particular, mutual information is often used for quantifying the relationship between two statistical variables. A virtue of mutual information is that, unlike correlation, it is applicable to variables with a nonlinear mutual relationship. I will explain the basics of information theory.
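
A minimal sketch of mutual information for discrete variables, estimated directly from the observed frequencies of (x, y) pairs, is given below. The function name `mutual_information` and the data are hypothetical. This plug-in estimate is biased for small samples, and continuous variables would first need binning or a different estimator.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information in bits, estimated from a list of observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# y = x**2 on symmetric x: a nonlinear relationship whose linear correlation is zero.
nonlinear = [(v, v ** 2) for v in [-3, -2, -1, 0, 1, 2, 3]]
# Every combination occurs equally often: the variables are independent.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

mi_nonlinear = mutual_information(nonlinear)
mi_independent = mutual_information(independent)
print(mi_nonlinear, mi_independent)
```

The nonlinear pairs yield a clearly positive value, whereas the independent pairs yield zero.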

Assessment

Weekly homework exercises and coding (75%), in-term test (25%)

Prerequisites or Prior Knowledge

Students are expected to have basic knowledge of elementary mathematics such as differentiation, integration, and elementary linear algebra. However, whenever necessary, mathematical details will be explained.
Students will need to write some code in Python.