
<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Statistics&#8217; dirtiest secret</title>
	<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/</link>
	<description>"All manner of statistical analyses cheerfully undertaken."</description>
	<pubDate>Fri, 21 Nov 2008 14:58:20 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
		<item>
		<title>By: Tony</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-7258</link>
		<dc:creator>Tony</dc:creator>
		<pubDate>Tue, 17 Jun 2008 01:06:38 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-7258</guid>
		<description>Excellent article and clearly explained. My statistics lecturers did explain that association was not causation but of course this does not mean that this is regularly noted in the literature. I have also seen too many examples where data sets are blindly put into a linear model and conclusions drawn from the results. I have done the same thing myself, only to see that the same parameters applied to a similar, related data set are NOT predictive. I have no problem with this provided the perpetrator is only looking for guidance as to possible relationships and realises that this may give a guide.

I have also used regression of a derived relationship based on the physics of a process as a guide to the accuracy of the model. This gave a reasonable predictive relationship, although I was aware that I had made a number of simplifications in developing the model. My model, which was of a filtration process, was based on well-established relationships, and so the regression should have shown, and did show, a good correlation between two primary input variables and one output variable.

The development of models in this way is surely legitimate where the number of variables is small and there is reason to posit one model over others on physical grounds. I see that there are complications with climate models because of the extent of climate: the number of variables required to model climate is large, their measurement is problematic, and data sets of climate measurements are incomplete, may be inaccurate, and are inconsistent in timing and location. Given also that, irrespective of human activities, climate is continuously changing, establishing some stable basis on which to interpolate or extrapolate also seems difficult.</description>
		<content:encoded><![CDATA[<p>Excellent article and clearly explained. My statistics lecturers did explain that association was not causation but of course this does not mean that this is regularly noted in the literature. I have also seen too many examples where data sets are blindly put into a linear model and conclusions drawn from the results. I have done the same thing myself, only to see that the same parameters applied to a similar, related data set are NOT predictive. I have no problem with this provided the perpetrator is only looking for guidance as to possible relationships and realises that this may give a guide.</p>
<p>I have also used regression of a derived relationship based on the physics of a process as a guide to the accuracy of the model. This gave a reasonable predictive relationship, although I was aware that I had made a number of simplifications in developing the model. My model, which was of a filtration process, was based on well-established relationships, and so the regression should have shown, and did show, a good correlation between two primary input variables and one output variable.</p>
<p>The development of models in this way is surely legitimate where the number of variables is small and there is reason to posit one model over others on physical grounds. I see that there are complications with climate models because of the extent of climate: the number of variables required to model climate is large, their measurement is problematic, and data sets of climate measurements are incomplete, may be inaccurate, and are inconsistent in timing and location. Given also that, irrespective of human activities, climate is continuously changing, establishing some stable basis on which to interpolate or extrapolate also seems difficult.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eco sites to take seriously&#8230;&#8230; &#171; Thoughts&#8230; and more thoughts&#8230;..</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-2125</link>
		<dc:creator>Eco sites to take seriously&#8230;&#8230; &#171; Thoughts&#8230; and more thoughts&#8230;..</dc:creator>
		<pubDate>Sat, 05 Apr 2008 13:56:05 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-2125</guid>
		<description>[...] William M. Briggs The site of US-based statistician Briggs, who seeks to question and evaluate the data involved in climate change theories, ultimately coming to quite different conclusions. Don&#8217;t approach unless you&#8217;re very comfortable with numbers. Check out statistics&#8217; dirtiest secret [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] William M. Briggs The site of US-based statistician Briggs, who seeks to question and evaluate the data involved in climate change theories, ultimately coming to quite different conclusions. Don&#8217;t approach unless you&#8217;re very comfortable with numbers. Check out statistics&#8217; dirtiest secret [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: DaveO</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-1628</link>
		<dc:creator>DaveO</dc:creator>
		<pubDate>Mon, 24 Mar 2008 17:55:33 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-1628</guid>
		<description>Great discussion.

Can I add two further distortions arising from classical statistical methods:

Publication bias (perhaps what you meant by 'file drawer' in post 33?),

The notion that p refers to the probability of a particular hypothesis rather than the probability of data arising, given a particular model.

D</description>
		<content:encoded><![CDATA[<p>Great discussion.</p>
<p>Can I add two further distortions arising from classical statistical methods:</p>
<p>Publication bias (perhaps what you meant by &#8216;file drawer&#8217; in post 33?),</p>
<p>The notion that p refers to the probability of a particular hypothesis rather than the probability of data arising, given a particular model.</p>
<p>D</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Briggs</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-940</link>
		<dc:creator>Briggs</dc:creator>
		<pubDate>Sat, 01 Mar 2008 12:01:10 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-940</guid>
		<description>M. Cawdery,

Amen to your comment on meta-analyses.  I mean to write about that later.  In my experience, most meta-analyses seek to prove what the individual studies that compose them could not, while ignoring the insurmountable file-drawer and never-done-experiment problems.

Briggs</description>
		<content:encoded><![CDATA[<p>M. Cawdery,</p>
<p>Amen to your comment on meta-analyses.  I mean to write about that later.  In my experience, most meta-analyses seek to prove what the individual studies that compose them could not, while ignoring the insurmountable file-drawer and never-done-experiment problems.</p>
<p>Briggs</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: M. Cawdery</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-939</link>
		<dc:creator>M. Cawdery</dc:creator>
		<pubDate>Sat, 01 Mar 2008 11:52:37 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-939</guid>
		<description>I liked the comment ", it means that nearly all statistics results that you see published are overly boastful. "  This is so true!  

Another point is that a "correlation", i.e. a mathematical association, is taken in many medical reports to be synonymous with "CAUSAL".

Meta-analyses seem these days to be used for supporting otherwise unsupported data.  This is much used in drug studies.  Unfortunately, by selection and manipulation, such meta-analyses of failed studies can often be "proved" to support the intended theory or belief.

Statistics are very useful, but unless the original raw data on which they are based are available, the "statistically significant" claim should be treated with the utmost caution.</description>
		<content:encoded><![CDATA[<p>I liked the comment &#8220;, it means that nearly all statistics results that you see published are overly boastful. &#8221;  This is so true!  </p>
<p>Another point is that a &#8220;correlation&#8221;, i.e. a mathematical association, is taken in many medical reports to be synonymous with &#8220;CAUSAL&#8221;.</p>
<p>Meta-analyses seem these days to be used for supporting otherwise unsupported data.  This is much used in drug studies.  Unfortunately, by selection and manipulation, such meta-analyses of failed studies can often be &#8220;proved&#8221; to support the intended theory or belief.</p>
<p>Statistics are very useful, but unless the original raw data on which they are based are available, the &#8220;statistically significant&#8221; claim should be treated with the utmost caution.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: oriental medicine blog &#187; Blog Archive &#187; Statistics&#8217; dirtiest secret</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-917</link>
		<dc:creator>oriental medicine blog &#187; Blog Archive &#187; Statistics&#8217; dirtiest secret</dc:creator>
		<pubDate>Fri, 29 Feb 2008 05:34:48 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-917</guid>
		<description>[...] Briggs wrote this today. I think it is worth reading. Here is a little snippet: This is why, picking medical journals as an example, one day you will see a headline that touts &#8220;Eating Broccoli Reduces Risk of Breast Cancer,&#8221; only to later read, &#8220;The Broccolis; They Do Nothing!&#8221; It&#8217;s just too easy to find results that &#8230; [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] Briggs wrote this today. I think it is worth reading. Here is a little snippet: This is why, picking medical journals as an example, one day you will see a headline that touts &#8220;Eating Broccoli Reduces Risk of Breast Cancer,&#8221; only to later read, &#8220;The Broccolis; They Do Nothing!&#8221; It&#8217;s just too easy to find results that &#8230; [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Briggs</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-859</link>
		<dc:creator>Briggs</dc:creator>
		<pubDate>Tue, 26 Feb 2008 13:13:56 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-859</guid>
		<description>p,

Most introductory statistics courses won't cover Bayesian statistics.  Surprisingly, in classical or frequentist statistics it is &lt;strong&gt;forbidden&lt;/strong&gt; to even ask questions like "What is the probability that this hypothesis is true?"

Briggs</description>
		<content:encoded><![CDATA[<p>p,</p>
<p>Most introductory statistics courses won&#8217;t cover Bayesian statistics.  Surprisingly, in classical or frequentist statistics it is <strong>forbidden</strong> to even ask questions like &#8220;What is the probability that this hypothesis is true?&#8221;</p>
<p>Briggs</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: p</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-857</link>
		<dc:creator>p</dc:creator>
		<pubDate>Tue, 26 Feb 2008 12:32:42 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-857</guid>
		<description>"But if you are only, say, 50% sure that the model you used is the correct one, then your proposition is only true &#8220;at the 45% level&#8221; not at the 90% level"

i only ever took a first-level statistics class but this seems like the sort of thing i should have learned. is this statement just a case of the simple math i'm visualizing or is there some more advanced theory or analysis required to arrive at this conclusion?</description>
		<content:encoded><![CDATA[<p>&#8220;But if you are only, say, 50% sure that the model you used is the correct one, then your proposition is only true &#8220;at the 45% level&#8221; not at the 90% level&#8221;</p>
<p>i only ever took a first-level statistics class but this seems like the sort of thing i should have learned. is this statement just a case of the simple math i&#8217;m visualizing or is there some more advanced theory or analysis required to arrive at this conclusion?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike D.</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-851</link>
		<dc:creator>Mike D.</dc:creator>
		<pubDate>Tue, 26 Feb 2008 06:01:58 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-851</guid>
		<description>Lovely discussion. Looking forward to Part 2. I think this was what Thomas Kuhn was talking about. The paradigm is a model. Anomalies are outliers. Too many outliers force one to change one's model, but the new one is not necessarily the right one, even if the data "fit" better.</description>
		<content:encoded><![CDATA[<p>Lovely discussion. Looking forward to Part 2. I think this was what Thomas Kuhn was talking about. The paradigm is a model. Anomalies are outliers. Too many outliers force one to change one&#8217;s model, but the new one is not necessarily the right one, even if the data &#8220;fit&#8221; better.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Thacker</title>
		<link>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-846</link>
		<dc:creator>John Thacker</dc:creator>
		<pubDate>Mon, 25 Feb 2008 23:27:41 +0000</pubDate>
		<guid>http://wmbriggs.com/blog/2008/02/18/statistics-dirtiest-secret/#comment-846</guid>
		<description>People really don't realize that the common correlation coefficient assumes a linear relationship.  Uncorrelated definitely does not mean anything like independent.  I always find it useful to demonstrate that the points {(-2, 4), (-1,1), (0,0), (1,1), (2, 4)} are uncorrelated despite the obvious relationship.

The book &lt;em&gt;Counterexamples in Probability&lt;/em&gt; has some interesting examples showing, among other things, how you can put an incredibly small maximum bound on the correlation between two random variables related via Y = e^X</description>
		<content:encoded><![CDATA[<p>People really don&#8217;t realize that the common correlation coefficient assumes a linear relationship.  Uncorrelated definitely does not mean anything like independent.  I always find it useful to demonstrate that the points {(-2, 4), (-1,1), (0,0), (1,1), (2, 4)} are uncorrelated despite the obvious relationship.</p>
<p>The book <em>Counterexamples in Probability</em> has some interesting examples showing, among other things, how you can put an incredibly small maximum bound on the correlation between two random variables related via Y = e^X</p>
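<p>The five-point demonstration above can be checked with a short Python sketch that computes the Pearson correlation coefficient by hand. The helper name <code>pearson</code> is a hypothetical label chosen for illustration, and the sketch assumes the ordinary sample correlation is what is meant by "the common correlation coefficient".</p>

```python
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# The five points from the comment; every y is exactly x squared.
points = [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
xs = [x for x, _ in points]
ys = [y for _, y in points]

r = pearson(xs, ys)
print(r)  # prints 0.0
```

<p>The coefficient comes out as zero because the contributions from negative and positive x values cancel in the numerator, even though y is completely determined by x.</p>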
]]></content:encoded>
	</item>
</channel>
</rss>

