Bayes Graphic

2024-10-17 14:44:37 -04:00
parent f0eb89cd4f
commit 291ed8066f
5 changed files with 40 additions and 10 deletions

Binary file not shown.

@@ -3,6 +3,7 @@
\usepackage{hyperref}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{tikz}
\usepackage[a4paper, total={6in, 10in}]{geometry}
\usepackage{setspace}
\setstretch{1.25}
@@ -187,8 +188,8 @@ to assume that the sophistication of our tools overrides imperfections in the da
In this unit I explored some common fallacies and assumptions held by analysts who may not fully grasp the content that they work with,
nor the problems they intend to solve. This required extensive research that I found was best digested in the form of books whose chapters chronicle multiple
examples of a given principle. As such, the reading was not confined to just the timeslot designated for this unit. Research started during the months leading up
to the start of the semester\footnote{Only research during the semester was logged in the timesheet} and have continued through the independent study. This structure was particularly helpful to pull me back and gain perspective of what
my goal was when I was knee-deep in feature construction and model formulation.
to the start of the semester\footnote{Only research during the semester was logged in the timesheet.} and has continued through the independent study. This
structure was particularly helpful for stepping back and regaining perspective on my goal when I was knee-deep in feature construction and model formulation.
\subsubsection{Moral Hazards and The Bob Rubin Trade}
Picking pennies in front of a steamroller.
@@ -199,8 +200,8 @@ flags for significant events in reality that do not effect the proposed course o
The 2009 recession, attributed to the collapse of the housing market bubble, is the most common example of a moral hazard because the displacement of risk from
banks who were federally required to give subprime loans to the taxpayer meant that banks could profit from subprime loans but would not be harmed when the inevitable
occurred. In popular media, the housing bubble bursting is attributed to the banks where those in the industry passed off the event as something that nobody could
have forseen.\footnote{For instance, in the 2015 movie \textit{The Big Short}, only a few savvy traders who bothered to look into the details find that banks had,
in their ignorance, built the bundled mortgages on an unstable foundation.} In reality, banks only ignored a probablistic eventuality because their models did not
have foreseen\footnote{For instance, in the 2015 movie \textit{The Big Short}, only a few savvy traders who bothered to look into the details find that banks had,
in their ignorance, built the bundled mortgages on an unstable foundation.}. In reality, banks only ignored a probabilistic eventuality because their models did not
need to account for such an event.
Most emphasize the problems with risk transference when creating models. For this study's purposes, the important lesson is that probabilistic models should not
@@ -291,9 +292,35 @@ Finally, this equation is updated to replace descriptions with technical terms:
\]
Even this equation can be misconstrued as a number of arrangements of ratios involving total occurrences from a category or non-occurrences from outside
of the category so as a final demonstration, the sample space will be visualized geometrically
\footnote{Concept credit to 3Blue1Brown on Youtube, this video is what finally clarified in my mind what the equation behind Bayes Theorem meant.\\
\url{https://www.youtube.com/watch?v=HZGCoVF3YvM}} as a 1 unit by 1 unit square.
of the category, so as a final demonstration, the sample space can be visualized geometrically as a 1 unit by 1 unit
square\footnote{Concept credit to 3Blue1Brown on YouTube; this video is what finally clarified in my mind what the frankly simple equation behind Bayes' Theorem
meant.\\\url{https://www.youtube.com/watch?v=HZGCoVF3YvM}}. The area of this square, 1 unit squared, is equivalent to a probability of 1 (or 100\%).
In such an example, a vertical line divides the square into the proportion of observations that belong to the category (the assumed-true event) and the proportion that do not.
A horizontal line drawn within each part marks the probability of an occurrence within that part.
Consider an example where a cancer test with a 95\% accuracy rate is given to 1,000 people. Of those 1,000 people, 10\% (100 people) have cancer: 95 of them test positive
(true positive) and 5 test negative (false negative). Of the remaining 900, 45 test positive (false positive) and 855 test negative (true negative). Such
an example can be expressed visually as:
\vskip 2pt
\begin{center}
\begin{tikzpicture}
\draw[gray, thick] (0,0) rectangle (3,3);
\draw[gray, thin] (3/10, 0) -- (3/10, 3);
\draw[gray, thin] (0, 0) rectangle (3/10, 3*.95);
\node[label=below:95/1000] at (-1,1) {TP};
\draw[->] (-.6, 1) -- (.15, 1);
\node[label=below:45/1000] at (1.5,-2/3) {FP};
\draw[->] (1.5, -1/3) -- (1.5, .05);
\draw[gray, thin] (3/10, 0) rectangle (3, 3*.05);
\end{tikzpicture}
\end{center}
\vskip 2pt
Using this visual, where TP represents true positives and FP represents false positives, Bayes' Theorem is simply expressed as:
\[
P(A|E) = \frac{TP}{TP + FP} = \frac{\frac{95}{1000}}{\frac{95}{1000} + \frac{45}{1000}} \approx 67.9\%
\]
This means that, given a random positive test, there is a 67.9\% chance that the patient actually has cancer. This percentage visually tracks with the graphic, as
the TP box appears to be approximately twice the size of the FP box, giving roughly a two-thirds chance that a positive result is a true positive.
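The same result can be recovered from the areas of the two boxes rather than from the raw counts. As a sketch, assume that $A$ denotes the patient having cancer and
$E$ denotes a positive test, and that the 95\% accuracy rate applies equally to patients with and without cancer (as the counts in the example imply). Each box's
area is its width (the proportion of the population) multiplied by its height (the rate of positive tests within that proportion):
\[
P(A|E) = \frac{P(E|A)\,P(A)}{P(E|A)\,P(A) + P(E|\neg A)\,P(\neg A)} = \frac{0.95 \times 0.10}{0.95 \times 0.10 + 0.05 \times 0.90} = \frac{0.095}{0.140} \approx 67.9\%
\]
Here $0.095$ is the area of the TP box and $0.045$ is the area of the FP box, which agree with the counts of 95 and 45 out of 1,000.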
\subsubsection{Bayesian Updating}

@@ -22,4 +22,5 @@ Week,Date,Type,Duration (Hours),Description
7,10/11,Advising Meetings,1,"Epistemology and Overview discussion, hex mapping"
8,10/15,Research,3,"Bayes Belief Networks"
8,10/16,Application,2.5,"Bayes visualizations and practice worksheets"
8,10/16,Reporting,2,"Early Bayesian Statistics Report"
8,10/16,Reporting,2,"Early Bayesian Statistics Report"
8,10/17,Application,2,"Bayes Geometric Visualization"

Binary file not shown.

@@ -80,13 +80,15 @@ Week & Date & Type & Duration (Hours) & Description \\
\hline
8 & 10/16 & Reporting & 2 & Early Bayesian Statistics Report \\
\hline
8 & 10/17 & Application & 2 & Bayes Geometric Visualization \\
\hline
\end{tabular}
\end{table}
\noindent Hours for Advising Meetings: 6.0\\
Hours for Application: 4.0\\
Hours for Application: 6.0\\
Hours for Reporting: 16.0\\
Hours for Research: 28.5\\
\textbf{Total Hours: 54.5}\\
\textbf{Total Hours: 56.5}\\
% CLOSE Timesheet
\end{document}