Some days ago, I posted some noobish thoughts about crossword construction. I'd figured out that Nutrimatic's default word lists were good for Nutrimatic's use case, but not so great as a list of candidate words for an automatic crossword grid-filler program. I got comments from some smart people on that post.

Tinhorn pointed me at the Collaborative Word List Project. This is a big word list that's maintained by a bunch of puzzle creators. Once I heard about this, I knew it was going to be an awesome thing. I sent off for it right away. It took a while for me to get it, though. This list is "collaborative," and to collaborate, puzzle designers need to run a program on their computers. The download page had Mac and Windows versions... and of course, I'm still a lunkhead Linux user so I had to go and ask what to do. Fortunately, the programmer, Alex Boisvert, was a smartie and had designed the system well enough that he was able to figure out a solution for me. But it took a while for my questions to go back and forth.

Meanwhile, I was working on Dan Egnor's ideas on how to get "clueable" words and phrases from Nutrimatic's data. (You recall that Nutrimatic uses Wikipedia: it finds common phrases, article titles, and link "anchor text".) Dan suggested concentrating on article titles and link anchor text. That would get more phrases like "Hank Aaron" but fewer like "from his". I tried that and it helped a lot!

Which then ran me into the next hurdle: rodr. Nutrimatic thinks that "rodr" is a word. It doesn't think it's a great word, but it thinks rodr is OK. But rodr looks funny in a crossword grid, or at least I was surprised to find it there when I tried using a revised Nutri-clueables list with Crossword Constructor. And there were some other weird things in there, too. It turned out that these were artifacts of, uhm, international text. Text like "Rodríguez". Nutrimatic is set up for English; when it sees "Rodríguez" it does its best and interprets the text as two words "rodr guez". And again, for Nutrimatic's use case, that's fine.

But now I was trying to figure out how to tweak Nutrimatic to be more careful with these förêígn letters. So I'm digging around, trying to find out which software library is dropping this stuff and what other software library I might replace it with. And it's taking me a while because every time I think I've figured something out the next step is "OK, now that it's working, run this program over all of Wikipedia... and come back tomorrow when it's done." So when things weren't working as well as I thought... uhm, yeah.
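For the curious, here's roughly what the "rodr guez" problem looks like in miniature. This is only a sketch in Python, not Nutrimatic's actual code: `naive_words` is my guess at the kind of English-only splitting that produces "rodr", and `fold_accents` is one common way to fold accented letters down to plain ones before splitting. Both function names are made up for this example.

```python
import re
import unicodedata


def fold_accents(text):
    """Drop combining accent marks, so an accented 'i' becomes a plain 'i'."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_words(text):
    """Assumed English-only tokenizer: anything outside a-z ends a word."""
    return re.findall(r"[a-z]+", text.lower())


def folded_words(text):
    """Same tokenizer, but with accents folded away first."""
    return naive_words(fold_accents(text))


name = "Rodr\u00edguez"  # "Rodríguez", spelled with an escape to be unambiguous
print(naive_words(name))   # the accented letter acts as a word break
print(folded_words(name))  # the name survives as one word
```

One caveat: this trick only handles letters that decompose into a base letter plus an accent. Letters like "ø" or "ß" don't decompose that way and would still act as word breaks here.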

Meanwhile, Mr Boisvert had set me up with his software. So I finally got a copy of the heralded Collaborative Word List. And since I don't want to be just a parasite, I figure I should, y'know, contribute something. So I write a little program to look at Nutrimatic's highly-rated phrases+words and compare them to what was already in the C.W.L. I figured that out of Nutrimatic's top 50K words, there would probably be 100 worth adding to the C.W.L. And... there was "Eurovision." Yup, that was about it. "Eurovision" was popular in Nutrimatic, but not in the C.W.L. (But probably the C.W.L. people had considered "Eurovision" seeing as how their list had "EUROVISIONSONGCONTEST".) Wow, one word was... less than 100. So I looked over the C.W.L.
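The comparison itself is simple enough to sketch. This isn't the program I actually ran, just a Python illustration of the idea: it assumes both lists are already loaded into memory (the real file formats aren't shown here), that the Nutrimatic candidates come sorted best-first, and that entries should be compared in "grid form" with spaces and punctuation stripped. The names `grid_form` and `missing_from` are invented for this example.

```python
import re


def grid_form(entry):
    """Reduce an entry to what goes in a grid: 'Hank Aaron' -> 'HANKAARON'."""
    return re.sub(r"[^A-Z]", "", entry.upper())


def missing_from(candidates, existing, limit=50_000):
    """Return top-ranked candidates (in grid form) that the existing list lacks.

    candidates: entries sorted best-first (assumed ordering)
    existing:   entries already in the word list
    limit:      how far down the ranked candidates to look
    """
    have = {grid_form(e) for e in existing}
    seen = set()
    missing = []
    for entry in candidates[:limit]:
        key = grid_form(entry)
        # Skip empties, entries already present, and repeats.
        if key and key not in have and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


ranked = ["hank aaron", "eurovision", "from his"]
word_list = ["HANKAARON", "FROMHIS", "EUROVISIONSONGCONTEST"]
print(missing_from(ranked, word_list))
```

Note that this is an exact-match check, so "EUROVISIONSONGCONTEST" being in the list doesn't count as having "EUROVISION".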

If you get a list of "good" words out of Nutrimatic, you're looking at about 50 thousand words. There are more words than that in Wikipedia... but if you go for more than the most "popular" 50,000, you're kind of in the dregs. There are about 100 thousand "good" phrases, including things like "from his". The C.W.L. has 400 thousand entries. Hand-crafted. This thing is a cultural treasure. I just kind of took all the tinkering I was doing to re-jigger Nutrimatic and Wikipedia and nudged it aside. Nutrimatic is good at doing what Nutrimatic is good at; I don't need to try to force it to do something else.

Now I've got a pretty good crossword construction word list. Now when I make a crossword puzzle, I'm not complaining about the fill. Now I'm complaining about the crappy clues I write. That's progress, right?

Bones of Contention

1. K-12
5. Sailboat with lateen sails
12. Rapper known for "Shawty"
17. Island territory
18. Sixty minutes from now
20. Cost
21. Star near Sol
23. Prefix with graphic
24. Dr House: "It's not ___"
25. ___ Paulo, Brazil
26. Actress Myrna
27. Actress Sophia
28. Sang "Orinoco Flow"
30. Pilgrim portrayer
31. Retinal researchers' org.
32. Divine disagreement topic
38. Hoosegow
39. Early anesthetic
40. Term for a long-handled gardening tool
41. Suffix with chlor-
42. Skater Harding
44. Longtime congressman from New Jersey
47. Excavating machine
49. "Hulk" director Lee
50. Fingerprinting dept.
52. Asian capital
54. Advanced teaching deg.
55. Fierce fight, contained
59. Flightless flock
62. New York mayor during blackout of '77
63. Southern school, home of Mike the Tiger
64. ___ Victor
67. Shakespearean setting
69. Actress Nastassja
72. Breakfast, lunch and dinner
74. Fla. neighbor
75. Flowery verse
77. River in eastern France
79. Sporty car roof
80. Chromatic cause of arguments
85. Snare, for example
86. Humorist Lebowitz
87. Castle part
88. They make water filters
89. A as in Austria
90. Online vocational sch. teaching Mold Awareness and more
93. "Sunflowers" setting
96. "Bolero" composer
97. Signal of understatement
100. Less than a Manwich
101. Raised
102. A chip, maybe
103. Toadlike
104. Opposite of dowry?
105. Future atty.'s exam

1. Alike: Fr.
2. Humdinger
3. Flashmob, for example
4. Richard Wright's complaint to his mother.
5. Taiwanese manufacturer of motherboards
6. Hydrocarbon suffixes
7. Superboy's girlfriend
8. Golden rule word
9. Half a dance
10. Grand ___ Dam
11. Sky lights
12. Anti-abortionist
13. The rose, to Ramon
14. Comic book with slogan "Mad scientists are a disease. Meet the cure."
15. Singer Melissa
16. Anon
19. Mideast capital
22. "__ __ sow, so..."
29. Bar order
30. Skywalker portrayer
32. Amenhotep IV's god
33. Pool ball type
34. Cry of surprise
35. Actor Beatty
36. Lots
37. Lack
38. "Come ___?" (Italian greeting)
43. Air hero
45. Alumna bio word
46. South Africa's ___ Paul Kruger
48. Ambulance V.I.P.
51. ___ Gigante: Univision show
53. Talks to satellites
55. Military leader known for chicken
56. Furniture wood
57. "I wish you hadn't told me that."
58. Total
59. Emptying a place of ppl.
60. Movie with exaggerated emotions
61. Kazakhstan riparian feature
64. Hardest thing to learn about mobile phones
65. Blockhead
66. Cleopatra biter
68. In the ordinary way
70. Mumbai TV station
71. Carp
73. Light
76. Decadent
78. Comics shriek
81. What Takeru Kobayashi can do
82. Wind instrument with a keyboard
83. Gretel's brother
84. Close, as an envelope
88. Dressed in a fine or showy manner.
90. IMHO
91. ___ von Bismarck
92. Consequently
94. "¿Como ___ usted?"
95. Editor's mark
98. Wrote "Nothing But The Truth"
99. Dreyer, east of the Rocky Mountains

(No gimmick in this puzzle. The Os don't make the Big Dipper or anything.)

