New: Puzzle Hunts are Everywhere: Brute Force Web Quiz Crawler

It's another blog post about how web programming skillz can aid in game-ish activities.

A couple of years ago, Team XX-Rated hosted the Paparazzi Game. I was sorry that illness made me miss the game itself. Fortunately, the illness didn't strike until after the pre-game, so I was able to participate in that. The pre-game gave me an excuse to use glitter glue. Also, there were puzzles.

The Online Dating Style Quiz Puzzle was pretty mysterious to me. It was a multiple-choice quiz; you could submit a set of choices and get a grade on your dating style. Later on, my team-mates showed me the clever way to solve this puzzle. But I didn't spot that on my own. I wasn't sure that I could spot cleverness in this puzzle on my own. I could tell it had some references to tabloid celebrities. But I knew almost nothing about tabloid celebrities.

I tried filling in the quiz a few times. Each time, I just got the grade "Honestly, you're pretty lame and none of us on staff would want to date you. Maybe you should re-read the questions.". But one time, I got a different grade: "You're exciting, but not so much to scare your partner away. Count on the questions in this quiz to lead you in the right direction." I hoped that might inspire a clever approach, but it didn't. But by now I was pretty sure that the grade was based solely on my multiple-choice choices: there wasn't something weird going on based on my history of past attempts, timing, or other factors.

So I used brute force: try all possible combinations of choices. Or, rather, try many of them. I wrote a little program that would cycle through choices and log those that got a non-"lame" grade. To avoid hogging the server, the program would "sleep" for a second between queries. Since there were many, many combinations of choices to consider, the program would take a long time: I planned to let it run overnight but it still wouldn't have time to try every choice. But I didn't especially want it to finish. I wanted enough data back so that I could understand the problem better, maybe figure out the clever bit.

So, the script. This is not that script; I lost track of that script. This script is a reconstruction; it's probably similar to that script.

import time
import urllib

ABC = "abc"
QBC = "qbc" # question 5 has different choices

QUIZ_URL = "http://www.xx-rated.org/xxtraonline/quizzes.php"

LOG_PATH = "/home/lahosken/log.txt"

q = [None,'','','','','','','','','','', '']

# Don't log the whole page; it's mostly boilerplate.  Just log the interesting part.
def GetUsefulParts(s):
  (before, marker, after) = s.partition('<!-- InstanceBeginEditable name="content" -->')
  if after: 
    s = after
  else:
    s = before
  (s, _, _) = s.partition('<!-- InstanceEndEditable')
  retval = ""
  for line in s.splitlines():
    line = line.strip()
    if not line: continue
    if line == '<p><img src="images/quizzes.gif" width="435" height="90"></p>' : continue
    if line == '<div align="center">': continue
    if line == """<h1><span class="style8">What's your dating style?</span></h1>""": continue
    if line == '<font size="+2" color="#000000">': continue
    if line == '</font>': continue
    if line == '<h3><a href="quizzes.php">Try again!</a></h3>': continue
    retval = retval + line
  return retval

for q[1] in ABC:
  for q[2] in ABC:
    for q[3] in ABC:
      for q[4] in ABC:
        for q[5] in QBC: # careful
          for q[6] in ABC:
            for q[7] in ABC:
              for q[8] in ABC:
                for q[9] in ABC:
                  for q[10] in ABC:
                    for q[11] in ABC:
                      time.sleep(1)
                      qlist = [('q'+str(ix), q[ix]) for ix in range(1,12)]
                      qlist.append(('Submit', "What's my style?"))
                      qlist.append(('force', 'brute'))
                      post_data = urllib.urlencode(qlist)
                      page_contents = urllib.urlopen(QUIZ_URL, post_data).read()

                      if page_contents.find("you're pretty lame") > -1: continue

                      log_file = open(LOG_PATH, 'a')
                      log_file.write(post_data)
                      log_file.write('\t')
                      log_file.write(GetUsefulParts(page_contents))
                      log_file.write('\n')
                      log_file.close()

The next morning I had a lot of data: choices which had produced non-"lame" grades.

q1=a&q2;=a&q3;=a&q4;=b&q5;=c&q6;=b&q7;=c&q8;=c&q9;=c&q10;=b&q11;=a&Submit=What%27s+my+style%3F&force=brute <p>You're exciting, but not so much to scare your partner away. Count on the questions in this quiz to lead you in the right direction. </p>
q1=a&q2;=a&q3;=a&q4;=c&q5;=b&q6;=b&q7;=a&q8;=a&q9;=c&q10;=b&q11;=a&Submit=What%27s+my+style%3F&force=brute <p>You're pretty spicy although not as much as you could be. You may want to reconsider some of your choices on the next date.</p>
q1=a&q2;=a&q3;=a&q4;=c&q5;=b&q6;=b&q7;=a&q8;=b&q9;=c&q10;=b&q11;=a&Submit=What%27s+my+style%3F&force=brute <p>You're exciting, but not so much to scare your partner away. Count on the questions in this quiz to lead you in the right direction. </p>
q1=a&q2;=a&q3;=a&q4;=c&q5;=b&q6;=b&q7;=a&q8;=c&q9;=c&q10;=b&q11;=a&Submit=What%27s+my+style%3F&force=brute <p>You're exciting, but not so much to scare your partner away. Count on the questions in this quiz to lead you in the right direction. </p>

This was interesting. That "pretty spicy" grade suggested that guess was pretty close. Also, I saw that the (a) choice for question 11 showed up in many good answers, as did the (b) choice for 10, the (c) choice for 9. Probably those were correct choices. (The recurring (a) choice for question 1 is less exciting: because of the way I'd ordered the guesses, all guesses that night had (a) for question 1.) I stared at those correct answers for a while, trying to see the clever pattern. That didn't get me very far.

So I used this information to tweak the brute force script. I changed some of the for loops so that instead of considering each choice (a, b, or c) it only considered the correct choice. (Yeah, there were more elegant ways I could have coded it; this was an easy edit.)

                for q[9] in 'c': # was ABC:
                  for q[10] in 'b': # was ABC:
                    for q[11] in 'a': # was ABC:
...and re-ran the script. Now it was wasting less time on bad choices for those questions. I let it run a while longer, looked at the output again, used it to narrow down a few more choices. Ran it again. The next time I looked through the logs, there was another kind of grade: one that let me know that the script had found the perfect answer.

No doubt it would have been more satisfying to solve this puzzle with cleverness than by brute force. But brute force can be fun, too.

Labels: ,

Posted 2008-03-09