Today, we will discuss the concept of sampling. People with a statistics background will be very comfortable with it, but if you have not studied statistics, a little bit of knowledge will be helpful. You are not required to know this for the GMAT. However, questions can be framed on the sampling premise, and you will be far more comfortable solving them with some understanding in place.
A sample is a selection made from a larger group (the “population”) that lets you examine certain characteristics of the larger group using limited resources.
In a large population, say all the people in a state, it is difficult to count everyone with a certain trait, such as red hair. So you pick 100 people at random (from different families, different areas, different backgrounds) and count how many people in this selection of 100 have red hair.
Let’s say 12 have red hair. You can then generalize that approximately 12% of the whole population has red hair. The more unbiased your sample, the better the approximation.
In this example, you learned something about the entire population (about 12% has red hair) from a small sample, and hence with few resources. Finding the actual percentage of people with red hair in the entire population would take far more effort, time and money. The savings in resources usually justify sampling even though it introduces some error.
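To make the red hair example concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the population of 100,000 people, the fact that exactly 12% of them have red hair, the helper name estimate_share, and the fixed random seed. It only shows that a random sample of 100 gives an estimate close to the true share, not necessarily equal to it.

```python
import random


def estimate_share(population, sample_size, rng):
    """Estimate the share of True values from a simple random sample."""
    # rng.sample draws without replacement, so nobody is counted twice.
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Hypothetical population: 100,000 people, exactly 12% with red hair (True).
population = [True] * 12_000 + [False] * 88_000

rng = random.Random(0)  # fixed seed so the run is repeatable

estimate = estimate_share(population, 100, rng)
print(f"Estimated share with red hair: {estimate:.0%}")
```

The printed estimate will usually land near 12% but can differ by a few percentage points, which is the sampling error mentioned above.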
So that is a bit of background on sampling. It will help you make sense of the official question given below:
In a certain pond, 50 fish were caught, tagged, and returned to the pond. A few days later, 50 fish were caught again, of which 2 were found to have been tagged. If the percent of tagged fish in the second catch approximates the percent of tagged fish in the pond, what is the approximate number of fish in the pond?
This is what took place: From a pond, 50 fish were caught, tagged and returned to the pond. Then 50 were caught again and 2 of those were found to be tagged.
Why was this done?
The total number of fish in the pond is the population of the pond. It is unknown. Since counting the total number of fish in the pond was hard, they tagged 50 of them and let them disperse evenly in the population. This means they gave a certain trait to a known number of fish in the pond – they tagged 50 fish.
Then they caught 50 fish again and these fish became the sample. Out of these 50, 2 were found to be tagged. So 2 of the 50 fish caught were found to have the trait given (tagged) – 4% of our sample was tagged.
The question tells us that “… the percent of tagged fish in the second catch approximates the percent of tagged fish in the pond …” that is, the question tells us that the sample is representative of the population. So about 4% of all the fish in the pond are tagged. We also know that exactly 50 fish in the pond are tagged. This implies that 50 (the number of fish we tagged) is 4% of the entire fish population of the pond.
50 = 4% of Total Fish Population. Therefore, Total Fish Population = 50 * 100/4 = 1250. Our answer is then C.
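The same arithmetic can be written as a short Python sketch. The function name estimate_population and its parameter names are made up for this illustration. It relies on the assumption stated in the question: the tagged share in the second catch approximates the tagged share in the pond.

```python
def estimate_population(tagged_total, sample_size, tagged_in_sample):
    """Estimate total population from a tag-and-recatch experiment.

    Assumes: tagged_total / population ≈ tagged_in_sample / sample_size
    """
    if tagged_in_sample == 0:
        # With no tagged fish in the sample, the proportion gives no estimate.
        raise ValueError("sample contains no tagged fish")
    return tagged_total * sample_size / tagged_in_sample


# 50 fish tagged, 50 caught in the second catch, 2 of them tagged.
print(estimate_population(50, 50, 2))  # 1250.0
```

Note that the two 50s play different roles: the first is the number of fish tagged, the second is the size of the second catch.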
Using sampling, we were able to estimate the total population of the pond without actually counting each fish. For increased accuracy, the exercise of taking samples is often repeated many times, and then some kind of average is used to get the best approximation.
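Here is a rough sketch of what repeating the exercise could look like. It is a simulation under assumptions made up for illustration: a pond that really holds 1,250 fish with 50 tagged, 200 repeated catches of 50 fish each, and one particular choice of average (the mean number of tagged fish per catch). The helper name pooled_estimate is hypothetical, and other ways of averaging are possible.

```python
import random


def pooled_estimate(tagged_counts, tagged_total, sample_size):
    """Estimate the population using the average tagged count over several catches."""
    mean_tagged = sum(tagged_counts) / len(tagged_counts)
    if mean_tagged == 0:
        raise ValueError("no tagged fish seen in any catch")
    # Same proportion as before, with the average count in place of one count.
    return tagged_total * sample_size / mean_tagged


# Hypothetical pond for the simulation: 1,250 fish, 50 of them tagged (True).
pond = [True] * 50 + [False] * 1200

rng = random.Random(0)  # fixed seed so the run is repeatable

# 200 catches of 50 fish each; every catch is returned before the next one.
counts = [sum(rng.sample(pond, 50)) for _ in range(200)]

print(round(pooled_estimate(counts, tagged_total=50, sample_size=50)))
```

A single catch can easily contain 1 or 3 tagged fish instead of 2, which would give estimates of 2,500 or about 833. Averaging over many catches smooths out that variation.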
Karishma, a Computer Engineer with a keen interest in alternative Mathematical approaches, has mentored students in the continents of Asia, Europe and North America. She teaches the GMAT for Veritas Prep and regularly participates in content development projects such as this blog!