mirror of
https://github.com/asimonson1125/Implementations-of-Probability-Theory.git
synced 2026-02-24 21:59:50 -06:00
Drafted bayes report
Binary file not shown.
@@ -4,6 +4,8 @@
|
|||||||
\usepackage{amsmath}
|
\usepackage{amsmath}
|
||||||
\usepackage{amssymb}
|
\usepackage{amssymb}
|
||||||
\usepackage[a4paper, total={6in, 10in}]{geometry}
|
\usepackage[a4paper, total={6in, 10in}]{geometry}
|
||||||
|
\usepackage{setspace}
|
||||||
|
\setstretch{1.25}
|
||||||
\hyphenpenalty 1000
|
\hyphenpenalty 1000
|
||||||
|
|
||||||
\begin{document}
|
\begin{document}
|
||||||
@@ -33,7 +35,24 @@
|
|||||||
\newpage
|
\newpage
|
||||||
% Begin report
|
% Begin report
|
||||||
\section{Objective}
|
\section{Objective}
|
||||||
yada yada yah I started this independent study for my own selfish gain
|
The educational focus of Implementations of Probability Theory surrounds the application of data
|
||||||
|
models that produce non-deterministic insights through probabilistic methodology. By pursuing this
|
||||||
|
study I hope to gain a deeper understanding of how to apply data in risk calculation for mitigation
|
||||||
|
scenarios as they appear in real life, rather than the experimental lab conditions that enable algorithmic
|
||||||
|
certainty.
|
||||||
|
|
||||||
|
In contrast to the path of black-box artificial intelligence and algorithms taught in \textbf{CSCI 335: Machine Learning}, this study is tailored to methods
|
||||||
|
designed to produce confidence levels for uncertain events using certain terms, leveraging logical,
|
||||||
|
traceable, and definite calculations. Current course offerings in the realm of data science focus largely on
|
||||||
|
the storing and management of data, and it is noted that the cluster of data science was until very recently
|
||||||
|
under the branding of data management. Implementations of Probability Theory is intended to extend
|
||||||
|
learnings in previous courses, notably \textbf{CSCI 420: Principles of Data Mining}, for more advanced algorithms
|
||||||
|
used at the intersection of data and computing after the preprocessing stage.
|
||||||
|
|
||||||
|
After beginning this study the intended deliverable outline was determined to be technically implausible and has been replaced with
|
||||||
|
demonstrations of applied algorithms. Taking inspiration from the retinal mosaic as displayed in \textbf{CSCI 431: Intro to Computer Vision}
|
||||||
|
and discussion in \textbf{IGME 589: Computational Creativity and Algorithmic Art} on the appearance and nature of randomness in graphics, I hope to create
|
||||||
|
a program that can determine the likelihood that randomly distributed colors on a hexagonal grid appear as they do in an image.
|
||||||
|
|
||||||
\newpage
|
\newpage
|
||||||
\section{Units}
|
\section{Units}
|
||||||
@@ -155,4 +174,139 @@ To calculate standard error, kys.
|
|||||||
Statistical Inference is any data analysis to draw conclusions from a sample to make assertions about the population.
|
Statistical Inference is any data analysis to draw conclusions from a sample to make assertions about the population.
|
||||||
Methods include estimation via averages and confidence intervals, and hypothesis testing, which attempts to invalidate (never \textit{validate}) a hypothesis.
|
Methods include estimation via averages and confidence intervals, and hypothesis testing, which attempts to invalidate (never \textit{validate}) a hypothesis.
|
||||||
|
|
||||||
|
\newpage
|
||||||
|
\subsection{Unit 2: Probabilistic Theories and Epistemology}
|
||||||
|
When developing probabilistic models it is vital to use domain expertise to expose the product to the full range of external variables that would be expected
|
||||||
|
of a model applied to the real world. Without an appropriate understanding of both the limitations in research procedures and the true value of the data collected,
|
||||||
|
the integrity of the model becomes inherently compromised.
|
||||||
|
|
||||||
|
As data scientists, we are uniquely at risk of falling for this trap because it is hard to fully grasp domain expertise when the nature of data science
|
||||||
|
in a business setting frequently means consulting for many separate projects with a collectively massive scope. Of equal consideration, it is also easy
|
||||||
|
to assume that the sophistication of our tools overrides imperfections in the data, in spite of mantras like `Garbage In, Garbage Out'.
|
||||||
|
|
||||||
|
In this unit I explored some common fallacies and assumptions held by analysts who may not fully grasp the content that they work with,
|
||||||
|
nor the problems they intend to solve. This required extensive research that I found was best digested in the form of books whose chapters chronicle multiple
|
||||||
|
examples of a given principle. As such, the reading was not confined to just the timeslot designated for this unit. Research started during the months leading up
|
||||||
|
to the start of the semester\footnote{Only research during the semester was logged in the timesheet.} and has continued through the independent study. This structure was particularly helpful in pulling me back to regain perspective on what
|
||||||
|
my goal was when I was knee-deep in feature construction and model formulation.
|
||||||
|
|
||||||
|
\subsubsection{Moral Hazards and The Bob Rubin Trade}
|
||||||
|
Picking pennies in front of a steamroller.
|
||||||
|
When studying the effectiveness of a model the scope of review must capture the entire range of the sample space. Discarding black swans that don't impact
|
||||||
|
the client does not mean the results will not reflect on the client for an oversight. There is therefore a question of obligation for data scientists to include
|
||||||
|
flags for significant events in reality that do not affect the proposed course of action to the client.
|
||||||
|
|
||||||
|
The 2007--2009 recession, attributed to the collapse of the housing market bubble, is the most common example of a moral hazard because the displacement of risk from
|
||||||
|
banks, which were federally required to give subprime loans, to the taxpayer meant that banks could profit from subprime loans but would not be harmed when the inevitable
|
||||||
|
occurred. In popular media, the bursting of the housing bubble is attributed to the banks, while those in the industry passed off the event as something that nobody could
|
||||||
|
have foreseen.\footnote{For instance, in the 2015 movie \textit{The Big Short}, only a few savvy traders who bothered to look into the details find that banks had,
|
||||||
|
in their ignorance, built the bundled mortgages on an unstable foundation.} In reality, banks only ignored a probabilistic eventuality because their models did not
|
||||||
|
need to account for such an event.
|
||||||
|
|
||||||
|
Most emphasize the problems with risk transference when creating models. For this study's purposes, the important learning is that probabilistic models should not
|
||||||
|
drop evaluations as soon as an event leaves the scope of the immediate client.
|
||||||
|
|
||||||
|
\subsubsection{Ignoring Improbable Outliers with Outsized Impact}
|
||||||
|
In machine learning it is common for algorithms to drop the most extreme (or a random selection of) datapoints to avoid overfitting and errors in data collection.
|
||||||
|
One issue with the current implementation of this procedure is that it is often done blindly, ignorant of information that these outliers may relay. For instance,
|
||||||
|
in a selection of 300 water samples from a stream, all but a few show a normal amount of oxygen in the stream. A citizen scientist may discount the remaining pockets
|
||||||
|
as a statistical implausibility that is most likely indicative of a failure in sample testing and drop the most extreme 5\% of datapoints.
|
||||||
|
However, if these few pockets show a complete disruption of the dissolution process, the vast majority of aquatic life in the stream will eventually pass through
|
||||||
|
these pockets without oxygen and die, resulting in an outsized impact from just a few sources.
|
||||||
|
|
||||||
|
Nassim Taleb in \textit{Fooled By Randomness} describes this event with an analogy to Russian Roulette: If there was a 5/6 chance of winning a million dollars and a
|
||||||
|
1/6 chance of killing yourself, many people would at least hesitate before pulling the trigger. But what if the cylinder held 10,000 chambers and there was only a
|
||||||
|
1/10,000 chance of harm? In this case, many less-than-rational actors play the game repeatedly to acquire wealth indefinitely, forgetting or even outright ignorant
|
||||||
|
that eventually the unlikely, or, as the actor would see it, the unthinkable, happens and all of the gains are completely negated.
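
As a rough sketch of the arithmetic behind this analogy, assume each play is independent and carries the same $1/10{,}000$ chance of harm (an idealization introduced here, not a figure taken from the book). The probability of surviving $n$ plays is then
\[
P(\text{survive } n \text{ plays}) = \left(1 - \frac{1}{10{,}000}\right)^{n} \approx e^{-n/10{,}000}
\]
Under that assumption the chance of survival drops below one half after roughly $10{,}000 \ln 2 \approx 6{,}931$ plays, and it is only about $0.37$ after $10{,}000$ plays.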
|
||||||
|
|
||||||
|
\subsubsection{Fooled By Randomness}
|
||||||
|
May justify its own subsection since the others acknowledge small probabilities whereas this is outright randomness.
|
||||||
|
|
||||||
|
\subsubsection{Lindy Effect}
|
||||||
|
``For the perishable, every additional day in its life translates into a shorter additional life expectancy.
|
||||||
|
For the nonperishable, every additional day may imply a longer life expectancy.''
|
||||||
|
A tool that has been proven over time is more likely to endure than the new, unproven tool meant to replace it.
|
||||||
|
``The robustness of an item is proportional to its life!''
|
||||||
|
|
||||||
|
``Inaccurate science\ldots is constantly being published. The Lindy-conscious consumer of scientific data will take seriously only
|
||||||
|
information that has held up over a period of time.''\footnote{\url{https://www.nytimes.com/2021/06/17/style/lindy.html}}
|
||||||
|
|
||||||
|
\subsubsection{Decision Theory}
|
||||||
|
Decision theory is the study of how people make decisions with uncertain information. There are two main branches of decision theory:
|
||||||
|
\subsubsection*{Normative/Rational Decision Theory}
|
||||||
|
This branch studies how people \textit{should} make decisions. In problems with other actors, as in game theory, it is assumed that all other actors will also
|
||||||
|
act with perfect rationality, allowing for precise calculation of the actions of all of the others and their expected utility to the agent.
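
As a sketch of what this calculation looks like, assume a finite set of outcomes $s$ with known probabilities $P(s)$ and a utility $U(a, s)$ for each available action $a$ (these symbols are introduced here for illustration). The expected utility of an action is
\[
EU(a) = \sum_{s} P(s) \cdot U(a, s)
\]
and a perfectly rational agent selects $a^{*} = \operatorname*{arg\,max}_{a} EU(a)$.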
|
||||||
|
\subsubsection*{Descriptive Decision Theory}
|
||||||
|
This branch studies how people actually make decisions which includes factors such as psychological and emotional biases.
|
||||||
|
|
||||||
|
\subsubsection{Info Gap Decisions}
|
||||||
|
In info gap decision theory there is not enough information to assign probabilities to events and the goal is to select a course of action that is robust in the
|
||||||
|
face of uncertainty. Where decision theory can predict expectations in irrationality to determine expected values, info gap decisions approximate the range of
|
||||||
|
probabilities and weight them to estimate expected value. In essence, it applies probabilities to probabilities, adding an additional layer to insulate calculations
|
||||||
|
from a lack of data or lack of understanding of a topic.
|
||||||
|
|
||||||
|
\subsubsection{Methodology Considerations}
|
||||||
|
Given I have taken 10134023 instances of the last 40 years, all of which Obama has been alive, I can say with a high degree of certainty that Obama is immortal.
|
||||||
|
|
||||||
|
An event never occurring in history does not discount its possibility of occurring in the future. Similarly, events that may have been impossible in the past
|
||||||
|
are not necessarily impossible in the future.
|
||||||
|
Psychology matters as well: someone who knows they are being studied may act differently than someone who does not, so models built on observed behavior can be inaccurate.
|
||||||
|
|
||||||
|
\newpage
|
||||||
|
\subsection{Unit 3: Bayesian Statistics}
|
||||||
|
This unit was deliberately separated from statistical review due to the perceived complexity of the topic and the magnitude of usage in recent data science
|
||||||
|
breakthroughs. Bayes Theorem is a part of the curriculum for both \textbf{MATH 351 - Probability and Statistics} and \textbf{CSCI 420 - Principles of Data Mining}.
|
||||||
|
However, because the two courses approached the topic from different perspectives and neither solidified my personal confidence in its use, I chose to take extra time to learn
|
||||||
|
this important topic in my own way.
|
||||||
|
|
||||||
|
It has been said that statistics does not come naturally to the human brain, hence statistics is, by mathematical standards, a
|
||||||
|
young discipline. Resulting research on Bayesian statistics has led me to the conclusion that the opposite may be true - Bayes Theorem is quite intuitive, but
|
||||||
|
its discipline has not had the time to crystallize best practices for instructing it. For instance, updating one's beliefs to compare probabilities with the
|
||||||
|
number of documented occurrences is frequently used in philosophical discussion in the form of explanations that subsets with high likelihood of fulfilling terms
|
||||||
|
are valid classifications even when the subset size results in overall fulfilled terms being infrequently categorized as the proposed subset. Most people understand
|
||||||
|
these expressions but, when shown a table and how to calculate those ratios, the content enters the realm of collegiate instruction.
|
||||||
|
|
||||||
|
\subsubsection{Bayes Theorem}
|
||||||
|
|
||||||
|
The equation for Bayes Theorem is as follows:
|
||||||
|
|
||||||
|
\[
|
||||||
|
P(A \mid E) = \frac{P(A) \cdot P(E \mid A)}{P(A) \cdot P(E \mid A) + (1 - P(A)) \cdot P(E \mid \neg A)}
|
||||||
|
\]
|
||||||
|
|
||||||
|
This formula appears more complex than it is. The denominator, while directly translating to ``the probability of A times the probability of event E occurring in A
|
||||||
|
plus the probability of not A times the probability of E occurring in not A,''
|
||||||
|
can be expressed more simply as \(P(E)\), the probability of event E occurring.
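
For a worked instance, take the hypothetical values $P(A) = 0.01$, $P(E \mid A) = 0.9$, and $P(E \mid \neg A) = 0.05$ (chosen here for illustration, not drawn from any dataset):
\[
P(A \mid E) = \frac{0.01 \cdot 0.9}{0.01 \cdot 0.9 + 0.99 \cdot 0.05} = \frac{0.009}{0.0585} \approx 0.154
\]
Even though $E$ is eighteen times more likely under $A$ than under $\neg A$, the small prior keeps the posterior near 15\%.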
|
||||||
|
|
||||||
|
By utilizing vernacular more familiar to everyday life, Bayes Theorem can be translated into:
|
||||||
|
|
||||||
|
\[
|
||||||
|
\text{P(occurrence came from category)} = \frac{\text{\# of occurrences from category}}{\text{total \# of occurrences}}
|
||||||
|
\]
|
||||||
|
|
||||||
|
Finally, this equation is updated to replace descriptions with technical terms:
|
||||||
|
|
||||||
|
\[
|
||||||
|
\text{Posterior} = \frac{\text{Prior} \cdot \text{Likelihood}}{\text{Evidence}}
|
||||||
|
\]
|
||||||
|
|
||||||
|
Even this equation can be misconstrued as a number of arrangements of ratios involving total occurrences from a category or non-occurrences from outside
|
||||||
|
of the category so as a final demonstration, the sample space will be visualized geometrically
|
||||||
|
\footnote{Concept credit to 3Blue1Brown on YouTube; this video is what finally clarified in my mind what the equation behind Bayes Theorem meant.\\
|
||||||
|
\url{https://www.youtube.com/watch?v=HZGCoVF3YvM}} as a 1 unit by 1 unit square.
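
One way to read the square, sketched here under the assumption that the horizontal axis splits the sample space into $A$ and $\neg A$ while the vertical axis measures how often $E$ occurs within each part: the strip for $A$ has width $P(A)$, and the portion of it where $E$ holds has height $P(E \mid A)$, and likewise for $\neg A$. The two areas where $E$ holds are then
\begin{align*}
\text{area}(A \cap E) &= P(A) \cdot P(E \mid A), \\
\text{area}(\neg A \cap E) &= (1 - P(A)) \cdot P(E \mid \neg A),
\end{align*}
and the posterior is the share of the total $E$ area that lies inside the $A$ strip:
\[
P(A \mid E) = \frac{\text{area}(A \cap E)}{\text{area}(A \cap E) + \text{area}(\neg A \cap E)}
\]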
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{Bayesian Updating}
|
||||||
|
Bayesian Updating is another term that has been added to buzzword vocabulary to describe a process that isn't directly related to Bayesian Statistics but appears
|
||||||
|
to have been rediscovered by academia through study of applied Bayes Theorem. In essence, Bayesian Updating simply states that observed occurrences should not
|
||||||
|
override previous evidence and that they should instead be added to it in equal weight (equal value being a naive assumption). This evidence updating makes
|
||||||
|
applications of Bayes Theorem calculate posterior probabilities continuously as new information enters the system, rather than in a single one-time calculation.
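
As a minimal sketch of this continuous recalculation, assume two pieces of evidence $E_1$ and $E_2$ that are conditionally independent given $A$ and given $\neg A$ (an assumption made here for illustration). The posterior after $E_1$ simply takes the place of the prior when $E_2$ arrives:
\[
P(A \mid E_1, E_2) = \frac{P(A \mid E_1) \cdot P(E_2 \mid A)}{P(A \mid E_1) \cdot P(E_2 \mid A) + (1 - P(A \mid E_1)) \cdot P(E_2 \mid \neg A)}
\]
With the hypothetical values $P(A) = 0.01$, $P(E_i \mid A) = 0.9$, and $P(E_i \mid \neg A) = 0.05$, the first observation raises the probability of $A$ to about $0.154$ and the second raises it to about $0.766$.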
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{Bayesian Belief Networks}
|
||||||
|
Bayesian Belief Networks are probabilistic graphical models that preserve conditional dependence between random variables. In spite of their name,
|
||||||
|
Bayesian Belief Networks do not necessarily apply Bayesian models, though they are a way to utilize Bayes Theorem for domains with greater complexity beyond a
|
||||||
|
single posterior probability. In this type of network, edges are directed and the structure is utilized in a single direction. This is in contrast to undirected
|
||||||
|
graphical models, such as Markov random fields, that do not assume an ordering among the random variables.
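
To make the role of the directed edges concrete, the following is a minimal sketch of the factorization that a Bayesian Belief Network encodes. The variables $X_1, \ldots, X_n$ and the notation $\mathrm{pa}(X_i)$ for the parents of a node are introduced here for illustration:
\[
P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
\]
For a hypothetical three-node chain $A \rightarrow B \rightarrow C$, this reduces to $P(A, B, C) = P(A) \cdot P(B \mid A) \cdot P(C \mid B)$, so three small conditional tables stand in for one table over every combination of values.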
|
||||||
|
|
||||||
\end{document}
|
\end{document}
|
||||||
@@ -36,7 +36,7 @@ def csv2Table(inFile):
|
|||||||
rows = list(reader)
|
rows = list(reader)
|
||||||
|
|
||||||
out = "\\begin{table}[h!]\n\\centering\n"
|
out = "\\begin{table}[h!]\n\\centering\n"
|
||||||
out += "\\begin{tabular}[t]{|" + " c | c | c | c | p{6cm} |}\n"
|
out += "\\begin{tabular}[t]{|" + " c | p{1.3cm} | c | c | p{6cm} |}\n"
|
||||||
out += "\\hline\n"
|
out += "\\hline\n"
|
||||||
|
|
||||||
for row in rows:
|
for row in rows:
|
||||||
|
|||||||
@@ -1,10 +1,25 @@
|
|||||||
Week,Date,Type,Duration (Hours),Description
|
Week,Date,Type,Duration (Hours),Description
|
||||||
1,08/30,Advising Meetings,2,"Stat Review Content acknowledgement, Latex overview for reports"
|
1,08/30,Advising Meetings,2,"Stat Review Content acknowledgement, Latex overview for reports"
|
||||||
2,09/02,Reporting,3,"First applications of Latex for final report, created Timesheet System."
|
2,09/02,Reporting,3,"First applications of Latex for final report, created Timesheet System."
|
||||||
2,09/02,Research,2,"Stat Review: Sample Space through Probability Density Functions"
|
2,09/02,Research,2.5,"Stat Review: Sample Space through Probability Density Functions"
|
||||||
2,09/06,Advising Meetings,1,"Research Review and exploration of PDF expected values and confidence intervals"
|
2,09/06,Advising Meetings,1,"Research Review and exploration of PDF expected values and confidence intervals"
|
||||||
|
3,09/14,Research,3,"Reading: Fooled by Randomness by Nassim N. Taleb"
|
||||||
4,09/19,Research,2,"Producing Confidence Intervals"
|
4,09/19,Research,2,"Producing Confidence Intervals"
|
||||||
4,09/20,Research,1,"Statistical Inference and t-testing"
|
4,09/20,Research,1.5,"Statistical Inference and t-testing"
|
||||||
4,09/20,Advising Meetings,1,"Stat Review finalization, definition of reporting standard"
|
4,09/20,Advising Meetings,1,"Stat Review finalization, definition of reporting standard"
|
||||||
5,09/23,Research,2,"Parametric and Non-parametric tests"
|
5,09/23,Research,2.5,"Parametric and Non-parametric tests"
|
||||||
6,10/03,Reporting,4,"Structuring stat review report"
|
5,09/26,Research,3,"Kinsman's suggested reading: Prob and Stat by Charles Linn"
|
||||||
|
6,09/25 - 09/30,Research,5,"Reading: Fooled by Randomness by Nassim N. Taleb"
|
||||||
|
6,10/03,Reporting,4,"Structuring stat review writeup"
|
||||||
|
6,10/04,Reporting,2,"Confidence Statistics writeup"
|
||||||
|
6,10/04,Research,2.5,"Ludic Fallacy Reading: Skin in the Game by Nassim N. Taleb"
|
||||||
|
6,10/04,Advising Meetings,1,"Report review and discussion on replacing deliverables"
|
||||||
|
6,10/05,Application,1.5,"Hexagonal basis vectors"
|
||||||
|
7,10/08,Research,2,"The Black Swan by Nassim Taleb"
|
||||||
|
7,10/10,Reporting,2,"Epistemology Writeup"
|
||||||
|
7,10/10,Research,1.5,"The Lindy Effect: The Lindy Way of Living - NYT"
|
||||||
|
7,10/11,Reporting,3,"Moral Hazards, Outsized Impact, Lindy Effect in writeup"
|
||||||
|
7,10/11,Advising Meetings,1,"Epistemology and Overview discussion, hex mapping"
|
||||||
|
8,10/15,Research,3,"Bayes Belief Networks"
|
||||||
|
8,10/16,Application,2.5,"Bayes visualizations and practice worksheets"
|
||||||
|
8,10/16,Reporting,2,"Early Bayesian Statistics Report"
|
||||||
|
Binary file not shown.
@@ -28,7 +28,7 @@
|
|||||||
% OPEN Timesheet
|
% OPEN Timesheet
|
||||||
\begin{table}[h!]
|
\begin{table}[h!]
|
||||||
\centering
|
\centering
|
||||||
\begin{tabular}[t]{| c | c | c | c | p{6cm} |}
|
\begin{tabular}[t]{| c | p{1.3cm} | c | c | p{6cm} |}
|
||||||
\hline
|
\hline
|
||||||
Week & Date & Type & Duration (Hours) & Description \\
|
Week & Date & Type & Duration (Hours) & Description \\
|
||||||
\hline
|
\hline
|
||||||
@@ -36,26 +36,57 @@ Week & Date & Type & Duration (Hours) & Description \\
|
|||||||
\hline
|
\hline
|
||||||
2 & 09/02 & Reporting & 3 & First applications of Latex for final report, created Timesheet System. \\
|
2 & 09/02 & Reporting & 3 & First applications of Latex for final report, created Timesheet System. \\
|
||||||
\hline
|
\hline
|
||||||
2 & 09/02 & Research & 2 & Stat Review: Sample Space through Probability Density Functions \\
|
2 & 09/02 & Research & 2.5 & Stat Review: Sample Space through Probability Density Functions \\
|
||||||
\hline
|
\hline
|
||||||
2 & 09/06 & Advising Meetings & 1 & Research Review and exploration of PDF expected values and confidence intervals \\
|
2 & 09/06 & Advising Meetings & 1 & Research Review and exploration of PDF expected values and confidence intervals \\
|
||||||
\hline
|
\hline
|
||||||
|
3 & 09/14 & Research & 3 & Reading: Fooled by Randomness by Nassim N. Taleb \\
|
||||||
|
\hline
|
||||||
4 & 09/19 & Research & 2 & Producing Confidence Intervals \\
|
4 & 09/19 & Research & 2 & Producing Confidence Intervals \\
|
||||||
\hline
|
\hline
|
||||||
4 & 09/20 & Research & 1 & Statistical Inference and t-testing \\
|
4 & 09/20 & Research & 1.5 & Statistical Inference and t-testing \\
|
||||||
\hline
|
\hline
|
||||||
4 & 09/20 & Advising Meetings & 1 & Stat Review finalization, definition of reporting standard \\
|
4 & 09/20 & Advising Meetings & 1 & Stat Review finalization, definition of reporting standard \\
|
||||||
\hline
|
\hline
|
||||||
5 & 09/23 & Research & 2 & Parametric and Non-parametric tests \\
|
5 & 09/23 & Research & 2.5 & Parametric and Non-parametric tests \\
|
||||||
\hline
|
\hline
|
||||||
6 & 10/03 & Reporting & 4 & Structuring stat review report \\
|
5 & 09/26 & Research & 3 & Kinsman's suggested reading: Prob and Stat by Charles Linn \\
|
||||||
|
\hline
|
||||||
|
6 & 09/25 - 09/30 & Research & 5 & Reading: Fooled by Randomness by Nassim N. Taleb \\
|
||||||
|
\hline
|
||||||
|
6 & 10/03 & Reporting & 4 & Structuring stat review writeup \\
|
||||||
|
\hline
|
||||||
|
6 & 10/04 & Reporting & 2 & Confidence Statistics writeup \\
|
||||||
|
\hline
|
||||||
|
6 & 10/04 & Research & 2.5 & Ludic Fallacy Reading: Skin in the Game by Nassim N. Taleb \\
|
||||||
|
\hline
|
||||||
|
6 & 10/04 & Advising Meetings & 1 & Report review and discussion on replacing deliverables \\
|
||||||
|
\hline
|
||||||
|
6 & 10/05 & Application & 1.5 & Hexagonal basis vectors \\
|
||||||
|
\hline
|
||||||
|
7 & 10/08 & Research & 2 & The Black Swan by Nassim Taleb \\
|
||||||
|
\hline
|
||||||
|
7 & 10/10 & Reporting & 2 & Epistemology Writeup \\
|
||||||
|
\hline
|
||||||
|
7 & 10/10 & Research & 1.5 & The Lindy Effect: The Lindy Way of Living - NYT \\
|
||||||
|
\hline
|
||||||
|
7 & 10/11 & Reporting & 3 & Moral Hazards, Outsized Impact, Lindy Effect in writeup \\
|
||||||
|
\hline
|
||||||
|
7 & 10/11 & Advising Meetings & 1 & Epistemology and Overview discussion, hex mapping \\
|
||||||
|
\hline
|
||||||
|
8 & 10/15 & Research & 3 & Bayes Belief Networks \\
|
||||||
|
\hline
|
||||||
|
8 & 10/16 & Application & 2.5 & Bayes visualizations and practice worksheets \\
|
||||||
|
\hline
|
||||||
|
8 & 10/16 & Reporting & 2 & Early Bayesian Statistics Report \\
|
||||||
\hline
|
\hline
|
||||||
\end{tabular}
|
\end{tabular}
|
||||||
\end{table}
|
\end{table}
|
||||||
\noindent Hours for Advising Meetings: 4\\
|
\noindent Hours for Advising Meetings: 6.0\\
|
||||||
Hours for Reporting: 7\\
|
Hours for Application: 4.0\\
|
||||||
Hours for Research: 7\\
|
Hours for Reporting: 16.0\\
|
||||||
\textbf{Total Hours: 18}\\
|
Hours for Research: 28.5\\
|
||||||
|
\textbf{Total Hours: 54.5}\\
|
||||||
% CLOSE Timesheet
|
% CLOSE Timesheet
|
||||||
|
|
||||||
\end{document}
|
\end{document}
|
||||||