Tuesday, May 14, 2013

A Fair Definition of Crosswordese

Here's how we can define crosswordese without bias.

Wikipedia defines "crosswordese" as "a term generally used to describe words frequently found in crossword puzzles but seldom found in everyday conversation". IMO, a better definition is "a term generally used to describe answers that are found in crossword puzzles but are foreign to a large percentage of intelligent adults." The two are mostly the same, but there are some major differences which I feel get to the heart of what crosswordese is. Many words are seldom spoken, but that does not make them obscure. In addition, crosswordese does not have to be words (e.g. TKO), and intelligent adults should be a good yardstick for the population.

Crosswordese, using the definition I've given above, has two parts:

1) It's used in a crossword puzzle (DUH!)

2) It is foreign to a large percentage of intelligent adults.


As an example, I recently ran an online survey, polling people of about middle age and asking them, "Do you know what an Aleut is?"

Results are shown below.

NO
YES
NO
NO
YES
NO
NO
NO
NO
NO
NO
NO
YES
NO
NO
NO
NO
NO
YES
NO
NO
NO
YES
NO
YES
NO
NO
NO
NO
NO
NO
NO
NO (33 entries)

6 YESes out of 33. Assuming my data was perfectly unbiased, that gives me a 95% confidence interval of about 5%-31%.
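
A minimal sketch of one way to get that interval, assuming the simple normal-approximation (Wald) formula; other methods, such as the Wilson interval, would give somewhat different endpoints. The function name `wald_interval` is just illustrative.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 6 YES answers out of 33 responses, as in the survey above
low, high = wald_interval(6, 33)
print(f"{low:.1%} to {high:.1%}")  # 5.0% to 31.3%
```

With only 33 responses the interval is wide, which is why the conclusion below is hedged.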

So is ALEUT crosswordese?

1) It's used in a crossword puzzle: YES
2) It is foreign to a large percentage of intelligent adults: We can't be certain, but the data strongly suggests YES.

Now let's assume that a puzzle has 5 pieces of crosswordese, each known by only 30% of the population, and that knowing one entry has no bearing on knowing another.



The chance of knowing none of the 5 is 16.807%. (about 1 in 6 people!)

The chance of knowing 1 in 5 is 36.015%

The chance of knowing 2 in 5 is 30.87%

The chance of knowing 3 in 5 is 13.23%

The chance of knowing 4 in 5 is 2.835%

The chance of knowing all 5 is 0.243%. That's less than 1 in 400 people!



With just 5 entries, more than half the people would know 1 entry or fewer, and over 83% of people would be missing at least 3 entries!
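
The percentages above come from the binomial distribution. Here is a short sketch that reproduces them, assuming each of the 5 entries is known with probability 0.3 and that knowing one entry is independent of knowing another (the helper name `binomial_pmf` is only for illustration).

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.3  # 5 entries, each known by 30% of people
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"knows exactly {k} of {n}: {prob:.3%}")

# Cumulative figures used in the summary sentence
at_most_one = pmf[0] + pmf[1]   # knows 1 entry or fewer
at_most_two = at_most_one + pmf[2]  # is missing at least 3 entries
print(f"knows at most 1: {at_most_one:.2%}")
print(f"knows at most 2: {at_most_two:.2%}")
```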

Given the hundreds of crosswordese entries in puzzles, it will take a new solver a significant amount of time before they can even begin to solve puzzles. A line from a Marc Romano book says, "to do well solving crosswords, you absolutely need to keep a running mental list of 'crosswordese'."

Here's wishing that wasn't the case. 



The Secret Tricks of Independence (Probability)

*This post assumes a basic understanding of union (U), intersection (n), and complements ('), as well as a basic knowledge of how probability works and its notation.

To begin, two or more events are independent if whether one of them happens does not affect the probability of the others.

For example, if we roll two dice, what is the probability of rolling two 1's? That is 1/6 * 1/6 = 1/36.

In this example, if we were to roll a 1,2,3,4,5 or 6 on the first die, the chance of rolling a 1 on the second die is still 1/6. The probability of rolling a 1 on the first die and the probability of rolling a 1 on the second die are independent.

If two events, A and B, are independent, then we calculate the probability of both occurring by using the following formula.

P(AnB) = P(A) * P(B)

In our example then:

Let P(A) = probability of rolling a 1 on first die
Let P(B) = probability of rolling a 1 on second die
P(AnB) = 1/6 * 1/6
P(AnB) = 1/36

Remember that if the two events were dependent, this formula would not work. This could happen if, say, we were surveying blond/brunette males/females in the world. If the share of blonds among males is different from the share of blonds among females, then hair color and sex are dependent, and P(blond and male) does not equal P(blond) * P(male).
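
To see the product formula fail for dependent events, here is a small sketch that swaps in a dice example of my own choosing (it is not the hair-color survey): A is "the first die shows 1" and B is "the two dice sum to 2". It counts outcomes exactly rather than simulating them; the helper `prob` is just an illustrative name.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a function of one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_one(o):
    return o[0] == 1

def sum_is_two(o):
    return o[0] + o[1] == 2

p_a = prob(first_is_one)
p_b = prob(sum_is_two)
p_both = prob(lambda o: first_is_one(o) and sum_is_two(o))

# B can only happen when A happens, so the events are dependent
print(p_both, p_a * p_b)  # 1/36 1/216
```

The counted probability of both events is 1/36, while the product of the individual probabilities is only 1/216, so multiplying is wrong here.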

Now, let's ask ourselves a slightly different question. What is the probability of rolling a 1 on the first die and not rolling a 1 on the second die?

The answer is of course, 1/6 * 5/6 = 5/36. But can you see another way we could do it?

Since the two dice are independent...

P(rolling a 1 on first die and not rolling 1 on second die) = P(rolling a 1 on first die) * P(not rolling a 1 on second die)

Thus...

P(AnB') = P(A) * P(B')
P(AnB') = 1/6 * 5/6
P(AnB') = 5/36

Therefore, for any two independent probabilities...

P(AnB) = P(A) * P(B)
P(AnB') = P(A) * P(B')
P(A'nB) = P(A') * P(B)
P(A'nB') = P(A') * P(B')
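
As a sanity check, the sketch below counts all 36 outcomes of two dice and compares each counted probability with the matching product formula, taking A as "first die shows 1" and B as "second die shows 1" as in the example above. The helper names are only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a function of one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == 1  # first die shows 1

def b(o):
    return o[1] == 1  # second die shows 1

p_a, p_b = prob(a), prob(b)

# Each entry pairs a probability found by counting with its product formula
checks = {
    "P(AnB)": (prob(lambda o: a(o) and b(o)), p_a * p_b),
    "P(AnB')": (prob(lambda o: a(o) and not b(o)), p_a * (1 - p_b)),
    "P(A'nB)": (prob(lambda o: not a(o) and b(o)), (1 - p_a) * p_b),
    "P(A'nB')": (prob(lambda o: not a(o) and not b(o)), (1 - p_a) * (1 - p_b)),
}

for name, (counted, formula) in checks.items():
    print(name, counted, formula, counted == formula)
```

All four lines should end in True, with values 1/36, 5/36, 5/36 and 25/36.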

These four formulas are, in fact, quite easy to remember. The probability of two things both happening, whether they are complements or not, is equal to the product of the probabilities of each thing happening, provided they are independent.


This may not seem groundbreaking, but it allows for some interesting deductions that would not be possible otherwise. Take the following question for instance. Try it on your own.

Sample Question #1: A scientist is studying an animal species. Some members of the species have stripes, some have spots, some have both stripes and spots, and some have neither stripes nor spots.

24% of the members are striped but not spotted.

36% of the members have neither stripes nor spots.

The probability of the members having spots is independent of the probability of the members having stripes.

What is the probability that a randomly selected animal of this species has spots, but not stripes?

Hint: A Venn Diagram is a great way to visualize these problems

Solution:

 Let's define P(A) as the chance of having stripes, and let's define P(B) as the chance of having spots.

Our goal is P(A'nB).

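Since 36% of the members have neither stripes nor spots, P((AUB)') = P(A'nB') = .36

Because 24% are striped but not spotted, we get P(AnB') = .24
Those two groups together are everyone without spots, so P(B') = .24 + .36 = .6. Therefore, P(B) = .4, since B and B' must add up to 100%

If we knew P(AnB), we could subtract it from P(B) to reach our goal, so let's try to find P(AnB).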

Recall that P(AnB) = P(A) * P(B)

Therefore....
P(AnB) = P(A) * .4

Ouch!!! We have two variables, but only one equation. Impossible right? YES!!!! We need a new method of attack for this problem.

Remember those other formulas up above for independent events? Let's try one of them.

P(AnB') = P(A) * P(B')
Remember up above that we determined that P(AnB') = .24? Plug that in along with P(B') = 1 - .4 = .6, and you get
.24 = P(A) * .6

By simple algebra, we simplify to
P(A) = .24/.6
P(A) = .4!!!!

Therefore, with a 40% chance of the animal having stripes and a 36% chance of it having neither stripes nor spots, P(A'nB) = 100% - 36% - 40% = 24%. (As a check, P(A') * P(B) = .6 * .4 = .24.)

There is a 24% chance of the animal having spots, but not stripes.
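
The same steps can be sketched in a few lines of Python. The function name `spots_without_stripes` and its parameter names are made up for this illustration, and the function only makes sense when the two traits are independent.

```python
def spots_without_stripes(p_a_not_b, p_neither):
    """Return P(A'nB) given P(AnB') and P(A'nB'), assuming A and B are independent.

    A = has stripes, B = has spots.
    """
    p_not_b = p_a_not_b + p_neither  # P(B') = P(AnB') + P(A'nB')
    p_a = p_a_not_b / p_not_b        # from P(AnB') = P(A) * P(B')
    p_b = 1 - p_not_b                # complement rule
    return (1 - p_a) * p_b           # P(A'nB) = P(A') * P(B)

# 24% striped but not spotted, 36% neither
answer = spots_without_stripes(0.24, 0.36)
print(round(answer, 4))  # 0.24
```

This takes the route P(A') * P(B) instead of subtracting from 100%, and both give 24%.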



Practice Problem:


P(A) =  .2
P((AUB)') = .56

Events A and B are independent of each other

Find P(B|A).