pre-advising 'publish'

2026-07-10 13:25:32 -05:00 · 2024-10-04 12:59:37 -04:00
parent c90854e730
commit 6473765329
2 changed files with 16 additions and 7 deletions
--- a/report/report.pdf
+++ b/report/report.pdf
--- a/report/report.tex
+++ b/report/report.tex
@@ -84,9 +84,13 @@ There are three probability axioms:
 \begin{enumerate}
 \item \textbf{Expectation - }The average of the possible values in the sample space, weighted by their probabilities
 \[E = \sum_{A \in S}{P(A) \cdot A} \quad\text{where }E\text{ is the expected value}\]
-\item \textbf{Variance - }The spread of possible values for a random variable
-\item \textbf{Standard Deviation - }something
-\[std = \sqrt{V}\quad\text{where variance is }V\]
+\item \textbf{Variance - }The spread of possible values for a random variable, calculated as:
+\[\sigma^{2}=\frac{\sum(X - \mu)^{2}}{N}\]
+Where \(N\) is the population size, \(\mu\) is the population average, and \(X\) is each value in the population.\\
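+For samples, variance is calculated with \textbf{Bessel's Correction}, which divides by \(n - 1\) instead of \(n\) to correct the downward bias introduced by measuring deviations from the sample mean \(\bar{x}\) rather than the population mean \(\mu\):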
+\[s^{2}=\frac{\sum(X - \bar{x})^{2}}{n - 1}\]
+\item \textbf{Standard Deviation - }The square root of the variance, giving a measure of the average distance of each data point from the mean in the same units as the data.
+\[\sigma = \sqrt{\sigma^{2}}\quad\text{where }\sigma^{2}\text{ is the variance}\]
 \end{enumerate}
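+As a small worked example of the variance and standard deviation formulas above, take the hypothetical values \(2, 4, 4, 4, 5, 5, 7, 9\) (chosen for illustration, not taken from real data).  Their mean is \(5\) and the squared deviations from the mean sum to \(32\).  Treated as a whole population (\(N = 8\)):
+\[\sigma^{2}=\frac{32}{8}=4 \qquad \sigma=\sqrt{4}=2\]
+Treated as a sample (\(n = 8\)), Bessel's Correction gives a slightly larger estimate:
+\[s^{2}=\frac{32}{7}\approx 4.57 \qquad s\approx 2.14\]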

 \subsubsection{Probability Functions}
@@ -128,11 +132,9 @@ means will approach the true mean of the population.
 The Central Limit Theorem states that the sampling distribution of a sample mean approaches a normal distribution as the sample size grows, even when the 
 population distribution is not normal.
 \[
-\frac{\sqrt{n} \left( \bar{X}_n - \mu \right)}{\sigma} \xrightarrow{d} N(0, 1),
-\]
-\[
-\text{Where \( \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n}\), \( X_i \) is the sample mean, and \( N(0, 1) \) is a standard normal distribution.}
+\frac{\sqrt{n} \left( \bar{X}_n - \mu \right)}{\sigma} \xrightarrow{d} N(0, 1)
 \]
+Where \(\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n}X_i\) is the mean of a sample of \(n\) observations \(X_i\), \(\mu\) and \(\sigma\) are the population mean and standard deviation, and \(N(0, 1)\) is a standard normal distribution.\\
 This is challenging to understand solely as an equation.  As an example, take a sample of two six-sided dice rolls and average their numbers.  
 The more sample averages taken, the more they will resemble a normal distribution where the majority of samples average around 3.5.
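+To make the dice example concrete (a sketch assuming fair, independent six-sided dice), a single roll has
+\[\mu = \frac{1+2+3+4+5+6}{6} = 3.5 \qquad \sigma^{2} = \frac{35}{12} \approx 2.92\]
+so the average of \(n\) rolls has mean \(3.5\) and standard deviation \(\sigma / \sqrt{n}\).  For \(n = 2\) this is \(\sqrt{35/24} \approx 1.21\), and it shrinks as \(n\) grows, which is why sample averages cluster more tightly around \(3.5\) as more rolls are averaged.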

@@ -140,6 +142,13 @@ The more sample averages taken, the more they will resemble a normal distributio
 Confidence is described using a confidence interval, which is a range of values that the true value is expected to be in, and its associated confidence level, 
 which is a probability (expressed as a percentage) that the true value is in the confidence interval.

+It is important to note that confidence levels, such as 95\%, do not indicate that the real value is within 5\% of the point estimate.  The confidence level expresses 
+how often intervals built this way would contain the real value if the sampling were repeated many times.
+
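+At the highest level, a confidence interval is the observed statistic (generally the mean) plus or minus a margin of error, which is a critical value (about 1.96 for a 95\% confidence level under a normal approximation) multiplied by the standard error.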
+
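+To calculate the standard error of the mean, divide the sample standard deviation \(s\) by the square root of the sample size \(n\):
+\[SE = \frac{s}{\sqrt{n}}\]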
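+
+As an illustrative calculation (the numbers are hypothetical, and a normal approximation with a 95\% confidence level is assumed), suppose a sample of \(n = 100\) observations has mean \(\bar{x} = 50\) and sample standard deviation \(s = 10\).  The standard error of the mean is
+\[SE = \frac{s}{\sqrt{n}} = \frac{10}{\sqrt{100}} = 1\]
+and the interval uses the critical value \(z \approx 1.96\):
+\[\bar{x} \pm z \cdot SE = 50 \pm 1.96 \quad\Rightarrow\quad (48.04,\ 51.96)\]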
+
 % Confidence intervals can be calculated with z-tests, t-tests.  Go into parametric vs non-parametric

 \subsubsection{Statistical Inference}