Author Topic: A Math/Statistics Question For Anyone Interested  (Read 6958 times)

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #25 on: December 19, 2014, 11:40:06 AM »
OK, that methodology computes to me, at least. The sample size is 160K, which is more than significant for the US population.

But that's precisely why there's a standard error CV, isn't it? The survey is broken up into multiple sections. If you get to a section with so few sample cases, those numbers still aren't reliable.

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #26 on: December 19, 2014, 11:41:46 AM »


The weight they created DOESN'T correct for this. They don't even claim it does. There is a high probability of error.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #27 on: December 19, 2014, 11:49:44 AM »
The weight they created DOESN'T correct for this. They don't even claim it does. There is a high probability of error.

The weight counts series incidents as the actual number of incidents reported by the victim, up to a maximum of 10 incidents. Including series victimizations in national rates results in rather large increases in the level of violent victimization; however, trends in violence are generally similar regardless of whether series victimizations are included.

In 2013, series incidents accounted for about 1% of all victimizations and 4% of all violent victimizations. Weighting series incidents as the number of incidents up to a maximum of 10 incidents produces more reliable estimates of crime levels, while the cap at 10 minimizes the effect of extreme outliers on the rates. Additional information on the series enumeration is detailed in the report Methods for Counting High Frequency Repeat Victimizations in the National Crime Victimization Survey, NCJ 237308, BJS web, April 2012.
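For anyone who wants to see the counting rule from that paragraph spelled out, here is a minimal sketch. It assumes each victim's report is just a number of incidents; the function name and the four example reports are made up for illustration and are not BJS code.

```python
SERIES_CAP = 10  # maximum incidents counted per series, per the quoted BJS text


def capped_incident_count(reported_incidents, cap=SERIES_CAP):
    """Count a series as the number of incidents reported, up to `cap`."""
    if reported_incidents < 0:
        raise ValueError("reported_incidents must be non-negative")
    return min(reported_incidents, cap)


# Hypothetical reports from four victims: 1, 6, 10 and 150 incidents.
reports = [1, 6, 10, 150]
counted = [capped_incident_count(r) for r in reports]
print(counted)       # [1, 6, 10, 10]
print(sum(counted))  # 27
```

Note that the cap limits how many incidents one respondent can contribute. It says nothing about how many respondents fall into a category.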

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #28 on: December 19, 2014, 11:52:52 AM »
Once again, your bullshit train hurtles forward. "Series Incidents" means multiple incidents perpetrated on the same victim/s.

That paragraph has nothing to do with categories that have fewer than 10 sample cases. It just means that if someone is beaten up by the same person or something like that, the report doesn't count it more than 10 times.
 ::)

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #29 on: December 19, 2014, 11:59:04 AM »
Once again, your bullshit train hurtles forward. "Series Incidents" means multiple incidents perpetrated on the same victim/s.

That paragraph has nothing to do with categories that have fewer than 10 sample cases. It just means that if someone is beaten up by the same person or something like that, the report doesn't count it more than 10 times.
 ::)

Why do they use that sample size?  You claim you don't dispute the data so what is it that you have a problem with?

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #30 on: December 19, 2014, 12:02:43 PM »
Why do they use that sample size?  You claim you don't dispute the data so what is it that you have a problem with?

They don't use a sample size. That paragraph says nothing about sample size. It says that it limits "series incidents" in the total crime count to a maximum of 10. That has nothing to do with certain categories being unreliable if they have fewer than 10 examples. These two points aren't under the same section. They are completely unrelated.

FermiDirac

  • Getbig IV
  • ****
  • Posts: 2351
  • That'll do pig, that'll do
Re: A Math/Statistics Question For Anyone Interested
« Reply #31 on: December 19, 2014, 12:03:10 PM »
With too small a sample you will get unreliable estimates of your mean and standard deviation, and the smaller the sample size, the larger the effect an outlier will have.


Some useful information for the topic at hand:

http://en.wikipedia.org/wiki/Convergence_of_random_variables

http://en.wikipedia.org/wiki/Selection_bias

http://en.wikipedia.org/wiki/Coefficient_of_variation
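
A quick sketch of the outlier point, assuming a toy sample in which every value is 1 except a single outlier of 100. The function name and the numbers are hypothetical and chosen only to make the effect visible.

```python
from statistics import mean


def mean_with_outlier(n, typical=1.0, outlier=100.0):
    """Mean of a sample of size n: n - 1 typical values plus one outlier."""
    if n < 1:
        raise ValueError("n must be at least 1")
    sample = [typical] * (n - 1) + [outlier]
    return mean(sample)


# The same single outlier moves the mean far more when n is small.
for n in (10, 100, 10_000):
    print(n, mean_with_outlier(n))
```

With 10 observations the one outlier drags the mean far away from 1. With 10,000 observations it barely registers.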

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #32 on: December 19, 2014, 12:05:06 PM »
They don't use a sample size. That paragraph says nothing about sample size. It says that it limits "series incidents" in the total crime count to a maximum of 10. That has nothing to do with certain categories being unreliable if they have fewer than 10 examples. These two points aren't under the same section. They are completely unrelated.

Apparently there are problems with the stats. They might underestimate crime.


The nation needs accurate measurements of victimization rates to allocate resources to fight crime, support victims' needs, and shape policies and programs to deter these crimes in the future. The National Crime Victimization Survey (NCVS), which is administered by the Bureau of Justice Statistics (BJS), is currently the major tool available to measure these rates and victim characteristics. As discussed in the preceding chapters, there is controversy as to whether the incidence of rape and sexual assault is being underestimated on the NCVS, in part because other sources of data have shown higher levels of victimization than estimated through the NCVS. These differences reflect, in part, the clear definitional differences and methodological differences among the sources, which in turn affect the estimated victimization levels.

The panel could not ascertain which data source provided the most accurate estimates of rape and sexual assault. Even though the other sources (excluding the Uniform Crime Reports [UCR]) showed larger estimates than did the NCVS (or National Crime Survey), the panel is not concluding that “bigger is better.” With that said, the higher rates estimated by the several reviewed surveys lend support to concerns about a potential underestimate by the NCVS. These concerns, as well as the original charge to the panel (see Box 1-1 in Chapter 1), led to the panel's close analysis of the NCVS. It is important to note that the panel's work focused on the NCVS and did not examine as closely the other sources of data on rape and sexual assault described in Chapter 5. By addressing only the NCVS in this and the next three chapters, the panel is not implying that there are more issues with the NCVS than with the others.
http://www.ncbi.nlm.nih.gov/books/NBK202273/

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #33 on: December 19, 2014, 12:07:49 PM »
Apparently there are problems with the stats. They might underestimate crime.

Whuhh??  So, the weights don't make it 100% accurate. But there are 10!!  ::)

Hulkotron

  • Getbig V
  • *****
  • Posts: 28212
  • I ate an entire box of popsicles the day prior
Re: A Math/Statistics Question For Anyone Interested
« Reply #34 on: December 19, 2014, 12:13:02 PM »
Al Doggity, I think you may be incorrectly conflating the warning about N=10 on the front page with this specific method that also uses the number 10 as the limit to how many reports can come from an individual.

If I'm understanding it correctly, if someone reported 1-10 incidents in the study period they would all get counted; if someone reported 11 or more, only the first 10 are counted. So if anything it is under-estimating crime, and is not related to the statistical problems we pointed out earlier with N≤10 for these types of things in any obvious way that I can see. It is just a way of avoiding having the results be rendered invalid or not generalizable due to a small number of outliers that may report hundreds of incidents.

I've lost track of what's going on and who is fucking who and whose mother so I'll contribute to this thread in another fashion:


Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #35 on: December 19, 2014, 12:20:25 PM »
Al Doggity, I think you may be incorrectly conflating the warning about N=10 on the front page with this specific method that also uses the number 10 as the limit to how many reports can come from an individual.

If I'm understanding it correctly, if someone reported 1-10 incidents in the study period they would all get counted; if someone reported 11 or more, only the first 10 are counted. So if anything it is under-estimating crime, and is not related to the statistical problems we pointed out earlier with N≤10 for these types of things in any obvious way that I can see. It is just a way of avoiding having the results be rendered invalid or not generalizable due to a small number of outliers that may report hundreds of incidents.

I've lost track of what's going on and who is fucking who and whose mother so I'll contribute to this thread in another fashion:




You are correct in everything, with the exception that Archer77 is the one who is trying to mix up the two reports. I actually corrected him a few posts up. And the only reason he posted that blurb is because it is a wall of text he expected no one to read.


Also, your other contributions are greatly appreciated. ;D

Grape Ape

  • Getbig V
  • *****
  • Posts: 22251
  • SC è un asino
Re: A Math/Statistics Question For Anyone Interested
« Reply #36 on: December 19, 2014, 12:22:07 PM »
With too small a sample you will get unreliable estimates of your mean and standard deviation, and the smaller the sample size, the larger the effect an outlier will have.

No shit, really?

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #37 on: December 19, 2014, 12:24:33 PM »

You are correct in everything, with the exception that Archer77 is the one who is trying to mix up the two reports. I actually corrected him a few posts up. And the only reason he posted that blurb is because it is a wall of text he expected no one to read.


Also, your other contributions are greatly appreciated. ;D

I concede, there are problems with the sample size. I'm not too proud to see your point. If anything it appears to underestimate crime. However, the DOJ considers the numbers to be accurate enough to use.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #38 on: December 19, 2014, 12:36:26 PM »
http://www.ncbi.nlm.nih.gov/books/NBK202273/table/tab_7_5/?report=objectonly

This explains why they use a sample size of ten. It's so serial incidents don't inflate the total number of rapes and sexual assaults. I may have made this point before.


Victimizations Reported as Series Victimizations in the NCVS, by Type of Crime, 1993-1999 and 2000-2009, as Percentage of All Victimizations Reported.
had difficulty recalling exactly how many times violent victimizations occurred within a 6-month reference period. The observed patterns of response clustering indicated that many victims provided estimates of the number of times the victimizations occurred rather than counting directly from memory.

Thus, when an individual is victimized so many times during a 6-month period that he or she has difficulty recalling individual incidents, that respondent may also have difficulty providing an accurate count of the number of incidents that happened and whether the incidents occurred within the reference period. Lynch, Berbaum, and Planty (2002, p. 23) further speculated about another potential measurement error problem that may exist in this category:

Series incidents in a large part may be an artifact of Census Bureau procedures. More specifically, multiple events may be treated as a series event when the respondent can clearly recall and report on these incidents, simply because it is easier for the interviewer to complete a single incident form, as opposed to multiple incident forms.

From a statistical point of view, series victimization procedures create outlier problems for estimation. In general, outlier problems can be caused by large estimation weights, large outlying data values, or moderate values. Estimation weights for the NCVS are fairly large. When estimating rape and sexual assault (a low-incidence item in the NCVS data), the data values are generally zero (no rape or sexual assault reported). When rape or sexual assault is reported as a series, the data value can be quite high. Under the new procedures the value is truncated at “10” for individuals reporting more than 10 incidents in a single series.

Even with the truncation, these outliers (representing only 6 percent of the positive responses to rape and sexual assault) tied to the NCVS weights have a substantial impact on the estimates and the standard errors of those estimates, with both increasing fairly substantially. Fortunately, the statistical literature is fairly well developed in the areas of detecting and adjusting for outliers, and some of the developed techniques (adjusting the weights, the data value, or both) may be appropriate for use in measuring rape and sexual assault.

Until 2011, NCVS deleted these outliers for the purpose of estimates reported in Criminal Victimization (although they counted a series as a single victimization, rather than deleting, in some special reports). The effect was to heavily suppress the larger numbers that were reported by ignoring these multiple victimizations. This process added to a potential underestimation of victimizations (Planty and Strom, 2007).

Beginning in 2011, BJS stopped deleting these outliers. Instead, reported series victimizations are now directly included in the estimates with no additional adjustment unless more than 10 victimizations are reported in one series. Reported values greater than 10 are truncated to the value of 10. BJS has made the change retroactively back to 1993 in its online NCVS database.

The effect of changing the method for handling these outliers in the estimates of rape and sexual assault is huge (see Figure 7-1 and Table 7-5). Across the past 18 years, this change in methodology increased the estimates of incidents of rape and sexual assault by an average of 52 percent per year, and it increased the estimates of incidence rate by 55 percent. The estimates (number of victimizations) also fluctuated more from year to year. The change ranged from a low of zero percentage change in 2007 (there were no series victimizations reported) to a high of 143 percentage change in 2009.
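
To make the three ways of handling a series concrete, here is a rough sketch comparing them on made-up data. Everything in it is an assumption for illustration: the record layout `(weight, reported, is_series)`, the rule names, the weights and the counts are hypothetical, and real NCVS estimation is more involved than a weighted sum.

```python
SERIES_CAP = 10  # truncation value described in the quoted text


def weighted_total(records, series_rule):
    """Sum weight * counted incidents over (weight, reported, is_series) records."""
    total = 0.0
    for weight, reported, is_series in records:
        if not is_series:
            counted = reported
        elif series_rule == "exclude":        # series dropped from the estimate
            counted = 0
        elif series_rule == "count_as_one":   # series counted as one victimization
            counted = 1
        elif series_rule == "cap":            # series truncated at SERIES_CAP
            counted = min(reported, SERIES_CAP)
        else:
            raise ValueError(f"unknown series_rule: {series_rule}")
        total += weight * counted
    return total


def percent_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Hypothetical records: (survey weight, incidents reported, reported as a series?)
records = [
    (2000.0, 1, False),
    (2500.0, 1, False),
    (1800.0, 2, False),
    (2200.0, 25, True),  # one series victim reporting 25 incidents
]

excluded = weighted_total(records, "exclude")
capped = weighted_total(records, "cap")
print(excluded, capped, percent_change(excluded, capped))
```

Because a single series record is multiplied by a large weight, one respondent can move the total a lot even after truncation, which is the outlier problem the passage describes.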

Hulkotron

  • Getbig V
  • *****
  • Posts: 28212
  • I ate an entire box of popsicles the day prior
Re: A Math/Statistics Question For Anyone Interested
« Reply #39 on: December 19, 2014, 02:17:52 PM »
http://www.ncbi.nlm.nih.gov/books/NBK202273/table/tab_7_5/?report=objectonly

This explains why they use a sample size of ten. It's so serial incidents don't inflate the total number of rapes and sexual assaults. I may have made this point before.

You are using the wrong term here.  This is not a sample size.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #40 on: December 19, 2014, 02:21:56 PM »
You are using the wrong term here.  This is not a sample size.

You're absolutely right, but it's the term we've been using.

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #41 on: December 19, 2014, 02:53:38 PM »
You're absolutely right, but it's the term we've been using.

No, it isn't the term we've been using. You're posting walls of text that are about something else entirely in the hope that no one will actually read what you've posted. The walls of text get longer and longer and have less and less to do with what was actually being discussed.

But my point is made. You're incapable of having an honest debate. You're not mentally equipped and you lie like a shameless rug.

Hulkotron

  • Getbig V
  • *****
  • Posts: 28212
  • I ate an entire box of popsicles the day prior
Re: A Math/Statistics Question For Anyone Interested
« Reply #42 on: December 19, 2014, 03:04:05 PM »
You're absolutely right, but it's the term we've been using.

Who is we?  I don't see anyone else misusing it. 

Saying "the sample size is too small" gives a false impression that something is wrong with this approach methodologically.  It seems very sensible to me.  I agree with Al D, it seems to me you are trying to prop up a false argument by deliberately using the wrong term.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #43 on: December 19, 2014, 03:08:31 PM »
Who is we?  I don't see anyone else misusing it.  

Saying "the sample size is too small" gives a false impression that something is wrong with this approach methodologically.  It seems very sensible to me.  I agree with Al D, it seems to me you are trying to prop up a false argument by deliberately using the wrong term.


It was Al's argument that the sample size was too small. That's been his argument from the beginning.

Right. So, basically, a sample size of 10 is not going to give you a reliable national statistic. Is that what you're saying?

He's used the term sample size multiple times in this very thread.  I posted the methodology they used. 

In several of the categories, there are 10 or fewer sample cases. You made the claim that you could make an accurate assessment on race based on these sample cases. Within the very study, it says that you cannot make an accurate assessment because the sample sizes are too small. You have stated over and over that the BJS created some kind of weight that corrects for this. That's simply not true. You are clinging to that point because you feel like it makes a point about race and crime, when no logical point can be inferred from such a small sample size. The authors of the study admit that.

Please peddle some more idiotic mumbo jumbo to stay the course.


Yes: (Interpret data with caution. Estimate based on 10 or fewer sample cases, or the coefficient of variation is greater than 50%).

That is exactly what that means. That wall of text you just posted is almost wholly irrelevant to the discussion. This one line is all you need to know. The sample size is too small to make accurate assessments. It's contained right there in the report.
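
The caution rule in that one line can be written down directly. This is a minimal sketch, assuming the coefficient of variation is the standard error divided by the estimate, expressed as a percentage. The function name, the treatment of a zero estimate and the example numbers are my own assumptions, not anything taken from the report.

```python
def interpret_with_caution(sample_cases, estimate, standard_error,
                           min_cases=10, max_cv_percent=50.0):
    """Return True if an estimate trips the report's caution flag.

    Flagged when it is based on `min_cases` or fewer sample cases, or when
    the coefficient of variation (standard error / estimate, in percent)
    is greater than `max_cv_percent`.
    """
    if sample_cases <= min_cases:
        return True
    if estimate == 0:
        # CV is undefined for a zero estimate; flagging it is an assumption.
        return True
    cv_percent = 100.0 * standard_error / abs(estimate)
    return cv_percent > max_cv_percent


# Hypothetical estimates: (sample cases, estimate, standard error)
print(interpret_with_caution(8, 5000.0, 1000.0))    # True: only 8 sample cases
print(interpret_with_caution(200, 5000.0, 1000.0))  # False: CV is 20%
print(interpret_with_caution(200, 5000.0, 3000.0))  # True: CV is 60%
```

Either condition alone is enough to trigger the flag, and neither one has anything to do with the cap of 10 incidents per series.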

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #44 on: December 19, 2014, 03:13:38 PM »

It was Al's argument that the sample size was too small.

He's used the term sample size multiple times in this very thread.


There are times when "sample size" is an appropriate term to use. I have used it correctly. You are conflating the "series incidents" methodology with "weighting" methodology. Neither has anything to do with the other.

And what you're doing is obvious and completely pathetic. But, yeah, you're data-driven and intellectually honest.  ::)

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #45 on: December 19, 2014, 03:15:19 PM »
There are times when "sample size" is an appropriate term to use. I have used it correctly. You are conflating the "series incidents" methodology with "weighting" methodology. Neither has anything to do with the other.

And what you're doing is obvious and completely pathetic. But, yeah, you're data-driven and intellectually honest.  ::)

I'm not confusing series incidents with weighting methodology. The link I posted discusses this. Series incidents were affecting total counts. This was my original argument.

http://www.ncbi.nlm.nih.gov/books/NBK202273/

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #46 on: December 19, 2014, 03:19:36 PM »
I'm not confusing series incidents with weighting methodology. The link I posted discusses this. Series incidents were affecting total counts. This was my original argument.

http://www.ncbi.nlm.nih.gov/books/NBK202273/

No, it wasn't. You just desperately glommed onto this blurb because it contained the number 10. It has nothing to do with the margin of error, which is pretty straightforward.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #47 on: December 19, 2014, 03:23:28 PM »
No, it wasn't. You just desperately glommed onto this blurb because it contained the number 10. It has nothing to do with the margin of error, which is pretty straightforward.

Yes it does. After changing the methodology they've gone back as far as '95 to correct errors.

Archer77

  • Getbig V
  • *****
  • Posts: 14174
  • Team Shizzo
Re: A Math/Statistics Question For Anyone Interested
« Reply #48 on: December 19, 2014, 03:32:12 PM »
There are series victimizations and national victimization estimates. They use a set cap in order to prevent series victimization stats from skewing the national victimization stats.



Beginning with Criminal Victimizations, 2011, BJS began including series victimizations directly in its estimates. The NCVS uses the victim's report of the number of similar victimizations, with a maximum of 10, and collects (and applies to each victimization) detailed information only for the most recent victimization. These new procedures are being applied to all types of victimizations, including rape and sexual assault (Bureau of Justice Statistics, 2012a, p. 13):

BJS now includes series victimizations using the victim's estimate of the number of times the victimizations occurred over the past 6 months, capping the number of victimizations within each series at a maximum of 10. This strategy for counting series victimizations balances the desire to estimate national rates and account for the experience of persons with repeat victimizations while noting that some estimation errors exist in the number of times these victimizations occurred. This bulletin is the first to include series victimizations throughout the entire report, and all victimizations estimates in this report reflect this new count strategy.

A technical report provides findings on the extent and nature of series victimization (Lauritsen et al., 2012, p. iii):

Including series victimizations in national rates results in rather large increases in the level of violent victimizations; however, trends in violence are generally similar regardless of whether series victimizations are included. The impact of including series victimizations may vary across years and crime types, in part reflecting the relative rarity of the offense type under consideration.



This is why it has such a coefficient of variation.

Given the findings from this research, BJS will enumerate series victimizations using the victim’s estimates of the number of times the victimizations occurred over the past 6 months, capping the number of victimizations within each series at a maximum of 10. This strategy for counting series victimizations balances the desire to estimate national rates and account for the experiences of persons with repeated victimizations while noting that some estimation errors exist in the number of times these victimizations occurred.

Al Doggity

  • Getbig V
  • *****
  • Posts: 7286
  • Old School Gemini
Re: A Math/Statistics Question For Anyone Interested
« Reply #49 on: December 19, 2014, 03:36:35 PM »
Yes it does. After changing the methodology they've gone back as far as '95 to correct errors.

And none of this has anything to do with your original point that a statistic involving a sample size of 10 or fewer gives you a good idea of a national trend. If anything, it does the opposite.  ::)