Milestone: 15 Million Hits

I'm writing this in kind of a hurry. An out-of-town cousin is in town. There have been fun activities. There will be more. Thus, apologies. I write in haste.

The 15 millionth item (modulo the usual disclaimers of reportage noise):

168.10.168.61 - - [20/Dec/2009:00:02:17 -0400] "GET /new/labels/interspecies%20diplomacy.html HTTP/1.1" 200 49246 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

This appears to be a robot: it loaded a few pages from my site, but not the accompanying graphics, style sheets, etc. It started out by loading my Evil Guest book report and then loading some pages linked-to from there.
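
For the curious, here is a minimal sketch in Python of how one might pick such a log record apart and check for the "pages but no graphics" pattern. The regular expression assumes the records follow Apache's "combined" log format, which is what they look like. The names LOG_PATTERN, parse_hit, ASSET_SUFFIXES, and fetched_pages_only are made up for this illustration; this is not the script that actually counts the hits.

```python
import re

# Assumed layout: Apache "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of file types that a browser fetches along with a page.
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def parse_hit(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def fetched_pages_only(hits):
    """True if none of the hits asked for a graphic, style sheet, or script."""
    return all(not h["path"].lower().endswith(ASSET_SUFFIXES) for h in hits)


hit = parse_hit(
    '168.10.168.61 - - [20/Dec/2009:00:02:17 -0400] '
    '"GET /new/labels/interspecies%20diplomacy.html HTTP/1.1" 200 49246 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"'
)
print(hit["ip"], hit["path"], hit["status"])
print(fetched_pages_only([hit]))
```

A visitor whose hits all pass fetched_pages_only is only a robot suspect, not a proven one: a browser with a warm cache can look the same.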

Whois and traceroute make me guess that this machine is in Georgia, belonging to the Georgia Department of Education. Its robot-like behavior makes me worry that it's been taken over by a botnet—crawling some random web pages on my site doesn't seem like a very department-of-education-ish thing to do. Then again, there are stranger things in heaven and earth etc etc. I gotta go.

Milestone: 14 Million Hits

Wow, it's this web site's 14 millionth hit. The people and the robots, they keep showing up.

118.237.143.134 - - [06/Jul/2009:04:35:09 -0400] "HEAD /favicon.ico HTTP/1.1" 404 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11"

Let's see, 118.237.143.134 means a machine in the domain gyao.ne.jp. When I try to find information about GyaO, it seems to be an internet portal specializing in video.

When I look at my records for the day, I think that this machine at GyaO visits my site once every 10 minutes, each time checking on that file /favicon.ico. That seems like a web robot-y thing to do. I guess this is a robot, not a human. Looks like it started about six hours ago, and is still trying. This web site doesn't have a /favicon.ico file, but the robot keeps checking for the file's existence anyhow. It's making HEAD requests instead of GET requests--that gives it metadata about /favicon.ico without the data itself. Though in this case the metadata is boring: 404 not found. (You might wonder what a favicon.ico file is. It's a tiny icon. It's common convention to have one. It's reasonable of GyaO to assume I'd have such a file. More about favicon.ico.)

The idea of writing a web robot to periodically check web sites for the presence of a /favicon.ico file seems strange to me. I can't figure out why you'd want such a robot, but it's fun to think about. Some of this site's robot visitors are more interesting than some of the humans.

If only there were some way to combine humans with robots. On the internet, nobody knows you're a cyborg. Or something.

I attempted to find out what makes my writing voice unique. To this end, I of course looked at letter frequencies. Maybe something I could build a web robot out of.

You might recall that my Daily Nonsense page generates random text each day based upon Markov chain patterns--frequencies by which one blob of text tends to follow another. E.g., if one character of some English text is "q", the next character is probably "u".
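
A rough sketch of that idea in Python, assuming the "blobs" are short runs of characters. The function names, the order parameter, and the sample sentence are all invented for this illustration; the real nonsense generator may well work differently.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each `order`-character blob to the characters seen right after it."""
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain


def generate(chain, seed, length, rng):
    """Extend `seed` one character at a time by sampling from the chain.

    `seed` must be as long as the order the chain was built with.
    """
    order = len(seed)
    out = seed
    while len(out) < length:
        followers = chain.get(out[-order:])
        if not followers:  # dead end: this blob was never followed by anything
            break
        out += rng.choice(followers)
    return out


# Made-up sample text in which every "q" happens to be followed by "u".
sample = "the quick quiet queen quit the quilt quite quickly"
q_followers = set(build_chain(sample, order=1)["q"])
print(q_followers)
print(generate(build_chain(sample, order=2), "qu", 40, random.Random(0)))
```

Because followers are stored with repeats, common continuations get picked more often, which is all "frequency" means here.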

What if we look at the frequency data of my writing and then subtract out the frequencies of "typical" writing?!? Surely that would result in the essence of my writing voice--as determined by science, math, and statistics. This is serious. I tried it out, and the following gem of text emerged from the process:

Getty guarand trize talked a mt reard sun't withey saill. Platereng all yourealkere tood so nice ne clogres phoulatchfriguy? Elly, hat to trying thenihis inkind, quile blany ingibs. Seem nock seen some it gaven wit fun't hey some dayelseep of ork cloo car thiceople. Pat be hady. I an imm bit donew's funce maker. Some woketty do ke rog, somay dowon't for of ming.

You might look at that and say "That's totally incoherent," but that might mean that it's totally captured my writing patterns, you know?

OK, that's pretty incoherent. But I kind of like that sentence "Pat be hady." I'll try saying that the next time someone asks me how it's going. "Pat be hady."

Anyhow, wow. 14 million. Dear reader, thank you for reading.

Milestone: 13 Million Hits

Wow, it's the site's 13000000th hit. (Sort of. Actually, it probably passed 13000000 a while back. I skipped counting a bunch of hits (most of them?) during October-November. Anyhow.)

124.185.38.69 - - [18/Jan/2009:05:48:16 -0400] "GET /departures/Seattle/11/03623_al_mary_veronica_tom_table_tm.jpg HTTP/1.1" 200 4497 "http://lahosken.san-francisco.ca.us/departures/Seattle/11/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506; InfoPath"

Let's see, what's going on here? Someone viewed the travelog of that road trip to Vancouver that Tom Lester and I took four years ago when we were both between jobs. That page shows a bunch of "thumbnail" graphics--small versions of large photos. Here, the browser is fetching one of those thumbnails, specifically a photo from Veronica & Patrick's place up at the Sixes River in Oregon--in the photo, Patrick's parents and Veronica are sitting around the kitchen table.

The IP address suggests that the user is a customer of Bigpond, an ISP service run by Telstra in Australia. Assuming that Bigpond uses a sensible naming convention, I'm guessing this customer is in Queensland:

$ dig -x 124.185.38.69
...
;; ANSWER SECTION:
69.38.185.124.in-addr.arpa. 85476 IN PTR CPE-124-185-38-69.qld.bigpond.net.au.
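
The same lookup can be sketched in Python. The standard ipaddress module builds the in-addr.arpa name that dig -x asks about, and (for a live lookup, not shown here because it needs the network) socket.gethostbyaddr would fetch the PTR name. The guess_region helper is hypothetical: it assumes the CPE-<address>.<region>.bigpond.net.au pattern seen in this one answer and simply returns the label in front of the ISP's domain.

```python
import ipaddress


def reverse_name(ip):
    """Return the in-addr.arpa name that a PTR lookup (dig -x) asks about."""
    return ipaddress.ip_address(ip).reverse_pointer


def guess_region(ptr_hostname):
    """Return the label just before 'bigpond.net.au', or None.

    That this label names a region is an assumption based on a single
    hostname, not on any documented naming convention.
    """
    labels = ptr_hostname.rstrip(".").lower().split(".")
    if len(labels) >= 5 and labels[-3:] == ["bigpond", "net", "au"]:
        return labels[-4]
    return None


print(reverse_name("124.185.38.69"))
print(guess_region("CPE-124-185-38-69.qld.bigpond.net.au."))
```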

"qld" seems like an abbreviation for Queensland, doesn't it?

Looking at previous hits for that same IP address (presumably the same user), we can see loading lots more thumbnails... Ah, and here's where they loaded the page itself:

124.185.38.69 - - [18/Jan/2009:05:48:14 -0400] "GET /departures/Seattle/11/ HTTP/1.1" 200 45255 "http://www.google.com.au/search?hl=en&sa=X&oi=spell&resnum=1&ct=result&cd=1&q=recommended+road+trip+between+LA+and+Vancouver&spell=1" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506; InfoPath.1)"

I guess they arrived at the page after doing a Google search for road trip recommendations from L.A. to Vancouver. Dang, Tom and I started a ways north of L.A. I sure hope that that Australian nevertheless leaves some slack touristy-time between L.A. and S.F. There's plenty of stuff to see there.
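
Reading the search out of a referer by eye gets old. A small Python sketch of doing it mechanically, assuming (as with this Google referer) that the search words ride in a q= parameter; search_terms is a name made up for the example.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(referer):
    """Return the value of the q= parameter in a referer URL, or None."""
    query = parse_qs(urlsplit(referer).query)
    terms = query.get("q")
    return terms[0] if terms else None


referer = (
    "http://www.google.com.au/search?hl=en&sa=X&oi=spell&resnum=1"
    "&ct=result&cd=1&q=recommended+road+trip+between+LA+and+Vancouver&spell=1"
)
print(search_terms(referer))
print(search_terms("-"))
```

Other search engines may use a different parameter name, so a None here does not prove the visitor wasn't searching.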

Milestone: 12 Million Hits

Wow, it's the site's 12 millionth hit. Let's look at the log:

66.249.73.131 - - [27/Jun/2008:00:13:43 -0400] "GET /new/archive/2005_08_01_index.html HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Here, the Google crawler was making sure that a page of blog entries from 2005 hadn't changed.
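
That 304 is the "Not Modified" half of a conditional GET: the crawler says when it last saw the page, and the server answers 304 (with no body, hence the "-" for size) if nothing has changed since. Here is a minimal Python sketch of the server's side of that decision. It is a simplification that only looks at an If-Modified-Since header; the function name and the dates in the example are hypothetical, not taken from the real request.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for(if_modified_since, last_modified):
    """Pick 200 or 304 for a GET.

    if_modified_since: the request header value, or None if absent.
    last_modified: timezone-aware datetime of the file's last change.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: just send the page
    if since.tzinfo is None:  # a "-0000" zone parses as naive; treat it as UTC
        since = since.replace(tzinfo=timezone.utc)
    return 304 if last_modified <= since else 200


# Hypothetical modification time for an archive page.
changed = datetime(2005, 8, 31, 12, 0, tzinfo=timezone.utc)
print(status_for("Thu, 01 Sep 2005 00:00:00 GMT", changed))
print(status_for("Mon, 01 Aug 2005 00:00:00 GMT", changed))
```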

I use these "millionth" hits as an excuse to babble about webmasterly stuff. I didn't think I'd have anything to talk about this time. But about a week ago, I heard about Google's "Google Trends for Websites" feature, "a fun tool that gives you a view of how popular your favorite websites are, including your own!" You give it the name of a domain (e.g., "lahosken.san-francisco.ca.us") and it shows you a pretty graph of how many visitors that site got over time, a list of other domains popular with those visitors, and queries that visitors tend to search for. Wow, interesting! So, I asked for trends for my website:

[Empty chart with words: 'No data available']

...And thus we are reminded that my web site is not very popular; Google hasn't been able to gather enough data to generate cool statistics. And I notice that it says this about all of san-francisco.ca.us, not just the lahosken part. That is, my site isn't just unpopular: it's a small part of an unpopular backwater. If you're reading this, you have obscure interests.

Milestone: 11 Million Hits (plus gratuitous Taft domain pestering)

Wow, it's the site's eleven-millionth hit.

195.225.178.21 - - [20/Feb/2008:06:20:03 -0400] "GET /anecdotal/hunt/15/darcy_ian.html HTTP/1.1" 200 853 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

If you look at this hit in isolation, it looks like someone is browsing a page with a photo of Darcy & Ian of Team Taft on a Raft (more about that team later). But this "someone" is probably a robot. If I look at other hits coming from the same IP address, there are a few of them per second; this thing moved faster than most humans would click. It loaded many photo pages--but not the accompanying photos.
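
The "a few per second" test is easy to sketch in Python. This assumes timestamps in the format the log uses and an English locale for the month abbreviation; busiest_second is an invented name, and apart from the first timestamp the sample values below are made up to show the idea.

```python
from collections import Counter
from datetime import datetime


def parse_time(stamp):
    """Parse a log timestamp such as '20/Feb/2008:06:20:03 -0400'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def busiest_second(stamps):
    """Return the largest number of hits that share one timestamp second."""
    counts = Counter(parse_time(s) for s in stamps)
    return max(counts.values(), default=0)


# Timestamps for one IP address; only the first comes from the record above.
stamps = [
    "20/Feb/2008:06:20:03 -0400",
    "20/Feb/2008:06:20:03 -0400",
    "20/Feb/2008:06:20:03 -0400",
    "20/Feb/2008:06:20:04 -0400",
]
print(busiest_second(stamps))
```

Several page loads inside a single second, again and again, is faster than most people click. Where exactly to draw the line is a judgment call.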

OK, that hit's not so interesting, but it gives me an excuse to talk about webmasterly stuff. Since the last time I hit a million-milestone, Microsoft Live Search created a site for webmasters where we can glimpse what Live thinks about our site. It's pretty cool. (This is probably one of those times when I should mention that my opinions are mine; I don't speak for my employer.)

Their site is new, still "finding its legs." I find it confusing (but figure it will improve in the months to come). It says that they indexed 19,500 pages on my site--but my site has fewer than 3000 pages; fewer than 8000 files. I asked about this on their support forum, but never got an answer. OK, that's confusing, but overall their site is really useful!

They give a list of the top five pages on my site:

  1. New
  2. Comment: All of the Comments
  3. Seattle/Vancouver Road Trip Travelog
  4. 36 Views of Seattle's Pier 86 Grain Terminal
  5. New: the Book Reports

I'm not sure how "topness" is measured here, but this is an interesting collection of pages. It might measure how many people choose to visit those pages from a Live Search--many people visit that "All of the Comments" page (but I think they go away disappointed... at first they're so happy to find a page that mentions both "St Louis" and "wh*res", but then they find out that those phrases came from totally separate emails...).

What else does this Webmaster site let me do? I can provide them with an email by which they can alert me to problems with my site. I appreciate this feature very much. If evil spammers take over my site, I want to know. Heck, the taftraft.com domain expired a few days ago, and now it's just showing boring rafting ads. Wouldn't it be nice if Live Search had some way to tell Team Taft on a Raft about that? You bet it would. (I mailed Ian at his berkeley.edu address; was there a better thing to try? Can I renew a domain for someone else? I am not enjoying the rafting ads.)

Especially interesting was a list of top sites that link to me:

  1. Graphic Novel Review » Realism/ Slice of Life
  2. Graphic Novel Review » Literary
  3. Vishwas M S Curriculim Vitae
  4. Divided Review Project: Page-by-page Review of Prank the Monkey, the ...
  5. Graphic Novel Review » Autobiography
  6. Graphic Novel Review » Fantagraphics
  7. Graphic Novel Review » Elsewhere on the Web
  8. Graphic Novel Review » Elsewhere on the Web: The Squirrel Mother and ...
  9. Graphic Novel Review » Megan Kelso
  10. Piaw's Blog

Most of these are the result of one blog post in Graphic Novel Review. In this article, the author points out that I am a philistine for not properly appreciating Megan Kelso's comic book "The Squirrel Mother". Which just goes to show that there's no such thing as bad publicity.

Thank you for reading!

Milestone: 10 Million Hits (including 26000 strange ones)

Wow, it's the site's ten-millionth hit. In decimal notation, that is a very round number. Let's take a look at the log record of that hit:

219.142.53.25 - - [23/Oct/2007:08:33:47 -0400] "GET /frivolity/LuxiSerif-Bold_pfa_u_tm.png HTTP/1.1" 503 413 "-" "MSNPTC/1.0"

This is probably a 'bot, a crawler, a program that automatically reads web pages without human intervention, following web links to find other web pages to download. I couldn't figure that out just from this record, but looking at the many many records that precede it, I see that the same, uhm, entity is downloading a lot of files without taking much time to read them. The internet address is 219.142.53.25, which is at bjtelecom.net. Beijing Telecom--so perhaps this is a Chinese user using Beijing Telecom as an ISP? My site returned a status code 503 ("Service Unavailable") which, as my site uses it, roughly means "You're asking me for stuff too quickly. Please slow down." As I look at previous requests that this bot made, I see that it did not check for the existence of a robots.txt file, which suggests it was either written by an ignoramus or else it is illegitimate or both. As I keep looking at previous requests, I also notice that the bot tries to read several nonexistent files--so I guess it probably was coded by someone incompetent. Hey, now that I look more closely I notice that /frivolity/LuxiSerif-Bold_pfa_u_tm.png, the file that this ten-millionth hit asked for--that file doesn't exist. If the 'bot had asked for /frivolity/tav/LuxiSerif-Bold_pfa_u_tm.png then it would have been onto something.
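
For contrast, here is roughly what a polite crawler does before fetching anything: read robots.txt and ask whether it is allowed. This Python sketch uses the standard urllib.robotparser module. The rules shown are hypothetical, invented for the example, and say nothing about what this site's robots.txt actually contains.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: keep every crawler out of /frivolity/.
rules = """\
User-agent: *
Disallow: /frivolity/
"""

site = "http://lahosken.san-francisco.ca.us"
print(allowed(rules, "MSNPTC/1.0", site + "/frivolity/tav/LuxiSerif-Bold_pfa_u_tm.png"))
print(allowed(rules, "MSNPTC/1.0", site + "/new/"))
```

A real crawler would download /robots.txt from the site first (RobotFileParser can do that with set_url and read) and would also slow down when it starts getting 503s.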

This crude bot is not the strangest phenomenon to hit the web site recently. I never would have noticed that little bot if it hadn't been responsible for the site's 10000000th hit.

The strangest thing recently has been the 26 thousand visits to the Book Report: Leave Me Alone, I'm Reading page. To compare, that page has had more hits in the few weeks of its existence than, say, my Japanese Ska page has had in the past few years. Hundreds of hits a day. It's not people using browsers. People-controlled browsers report referring pages. People-controlled browsers download the pictures that go with a page. These "visitors" haven't reported a referer, haven't downloaded graphics. They come from a wide variety of IP addresses. If the requests came 1000 times per second instead of 1000 times per day, I'd think they were a distributed denial-of-service attack. If I displayed advertising, I'd think they were trying to corrupt my advertising statistics. Oh, and some of them garble the file name in strange ways like the middle request here:

68.92.21.114 - - [27/Sep/2007:21:31:47 -0400] "GET /new/2007/08/book-report-leave-me-alone-im-reading.html HTTP/1.1" 200 7953 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
68.32.235.250 - - [27/Sep/2007:21:32:03 -0400] "GET /new/2007/08/book-report-leave-me-alone%0D%0A1bd5%0D%0A-im-reading.html HTTP/1.1" 404 1123 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
85.100.23.105 - - [27/Sep/2007:21:33:03 -0400] "GET /new/2007/08/book-report-leave-me-alone-im-reading.html HTTP/1.1" 200 7953 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"

Could someone be trying to crack a web server by putting gobbledegook into the requested address and hoping to choke the web server program and trick it into... writing that gobbledegook into memory somewhere where it might get executed? It seems like there are other easier ways to crack into systems, ways more likely to succeed. I have no idea what the story is behind these 26000 hits. If you know or if you have an amusing theory, please drop me a line.
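
One thing that can be checked is what the gobbledegook says once the percent-escapes are undone. A small Python sketch:

```python
from urllib.parse import unquote

garbled = "/new/2007/08/book-report-leave-me-alone%0D%0A1bd5%0D%0A-im-reading.html"

# %0D%0A is a carriage return plus line feed (CRLF).
decoded = unquote(garbled)
print(repr(decoded))

# The text stuck between the two CRLFs, read as a hexadecimal number.
inserted = decoded.split("\r\n")[1]
print(inserted, int(inserted, 16))
```

So the junk is a line break, the hexadecimal number 1bd5 (7125 in decimal), and another line break. Pure speculation: a hexadecimal number alone on a line is what a chunk-size line looks like in HTTP's chunked transfer encoding, so one amusing theory is that some sloppy program saved a chunked response without decoding it, a chunk boundary happened to land in the middle of this link, and the program then followed the mangled link. Nothing in the logs confirms that.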

Milestone: Nine Million Hits

Wow, it's the site's nine-millionth hit:

66.249.70.236 - - [10/Jul/2007:21:58:59 -0400] "GET /departures/monterey/0/3267_diver_tm.jpg HTTP/1.1" 301 376 "-" "Googlebot-Image/1.0"

It looks like some Google web crawler is making sure that my photo of a diver in the Monterey Bay Aquarium from my Monterey travelog is still there.

"Millions of hits" doesn't mean that millions of people look at the site. Plenty of people do look at it. But there are plenty of robots, too. Maybe hits aren't the best thing to count. But it's not really clear what I do want to count. Counting hits is easy. So I count the hits--whether they be from humans, robots, or whatever.

Error hits add to the count. It's easier to count them than to decide which hits are errors and which aren't. I recently decided that, web-wise, my site was going to be lahosken.san-francisco.ca.us, not www.lahosken.san-francisco.ca.us. To do this I set up a "301 redirect". That is, any time someone points their browser at the web address www.lahosken..., the web server returns an error saying "Error 301: You meant lahosken...." When your browser sees one of these "301" errors, it knows to load the corrected address. But that generates two hits: first you try to load www.lahosken..., then you successfully load lahosken.... Eventually, no-one will have the "www" in their bookmarks and so these errors will stop happening. But since I just recently set up the redirect, the old bookmarks and links and whatnot have been boosting the count.
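
In sketch form, the redirect logic amounts to the following Python. This is an illustration of the idea only, with invented names (CANONICAL_HOST, respond); the real redirect lives in the web server's configuration, which isn't shown here.

```python
CANONICAL_HOST = "lahosken.san-francisco.ca.us"


def respond(host, path):
    """Return (status, headers) for a request, redirecting the www name."""
    if host.lower() == "www." + CANONICAL_HOST:
        # 301 means "moved permanently": the Location header gives the new address.
        return 301, {"Location": "http://" + CANONICAL_HOST + path}
    return 200, {}


# One visit through an old bookmark turns into two logged hits.
first = respond("www.lahosken.san-francisco.ca.us", "/new/")
second = respond("lahosken.san-francisco.ca.us", "/new/")
print(first)
print(second)
```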

Oh, and the count... the count is not so rigorous. In theory, each night my web service provider rotates the log files: each night, some magical script somewhere renames the access log, so that I know it was "yesterday's" access log. A few hours later, my magical script runs over "yesterday's" access log; my script maintains the permanent long-term count, the thing that just ticked past nine million. Except that a few months ago, my web service provider's magic script had a hiccup. The log file didn't get renamed. My script happily read "yesterday's" log file--but that was really yester-yesterday's log file. So my script counted yester-yesterday's hits twice, artificially boosting the count. I noticed it happening. If I was super-rigorous, I would have subtracted out those numbers. I noticed it happened at least a couple of times since then. I didn't fix those either. It might have happened a few times when I didn't notice. I don't always pay so much attention. This morning, I noticed a different problem: the log file got renamed, but at a different time than usual. The result this time is that my magical script totally overlooked a day's worth of logs. They were named as if they were yester-yesterday's logs, but were really just a few hours old. I'm too lazy to fix that, too. I don't know how many times that's happened.
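
If I ever stopped being lazy, one way to guard against the double-counting would be to remember a fingerprint of each log file already counted. A minimal Python sketch under that assumption; count_new_hits and seen_digests are hypothetical names, and this is not how the real script works.

```python
import hashlib


def count_new_hits(log_text, total, seen_digests):
    """Add a log file's hits to `total` unless this exact file was counted before."""
    digest = hashlib.sha256(log_text.encode("utf-8")).hexdigest()
    if digest in seen_digests:
        return total  # same file as a previous night: don't double-count
    seen_digests.add(digest)
    hits = sum(1 for line in log_text.splitlines() if line.strip())
    return total + hits


seen = set()
total = count_new_hits("hit one\nhit two\n", 0, seen)
total = count_new_hits("hit one\nhit two\n", total, seen)  # un-rotated log, seen again
total = count_new_hits("hit three\n", total, seen)
print(total)
```

This only fixes the counted-twice problem. A log that got renamed at the wrong hour and was never read at all would still be missed; catching that would mean checking the dates inside the files rather than trusting their names.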

A few years ago, there was one of those double-counted days. I carefully fixed up my permanent count to undo the double-counting. I was more rigorous then, more careful.

Milestone: 8 Million Hits

Wow, it's the site's eight-millionth hit! Please pardon me as I now babble on about web minutiae. Starting with... let's take a look at the log of that hit:

69.41.96.6 - - [02/Apr/2007:13:03:28 -0400] "GET /slick.css HTTP/1.1" 200 2812 "http://lahosken.san-francisco.ca.us/departures/stl02/3355_illinois_side_power_plant.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11"

This hit came from IP address 69.41.96.6, and whois tells me that's at the Savannah College of Art and Design in famous Savannah, Georgia, USA. Here, they're loading the style sheet file that goes with a web page. Strangely, when I check my logs, I don't see that this person actually got the web page itself. Perhaps they had it cached from a previous visit? Perhaps my web server doesn't log everything perfectly? Perhaps I'm not as good at reading these records as I think I am? I don't know.

Last time the site hit such a milestone, I showed some site information that I got from Yahoo! Site Explorer, Yahoo!'s service for sharing information with webmasters. Yahoo! had some interesting information about my site that I couldn't get from my own records or from Google: a list of pages on other sites that linked to my site. (This is probably a good time to repeat that my opinions are mine, not my employer's.) A few days after that, I got a message from Vanessa saying that she wanted to meet for lunch. Vanessa is on the Google Webmaster Central team; i.e., she works on Google's service for sharing information with webmasters, the counterpart to Yahoo!'s Site Explorer.

On the way to lunch, we walked through a parking lot. Vanessa mentioned that she'd noticed my blog post about Yahoo! Site Explorer. Ah, Vanessa had noticed that I was saying nice things about our competitors. I glanced around nervously. The parking lot was empty, except for us. We were in the parking lot at Vanessa's request--she'd dropped something off in her car. I remembered that Vanessa was a fan of the old TV show "Buffy the Vampire Slayer". The heroine of that show solves her problems by shoving wooden stakes through their hearts. It occurred to me that if my way of dealing with problem people was to shove wooden stakes through their hearts, then a good first step might be to lure those people out to some place with no witnesses--such as an abandoned parking lot. I hadn't heard anything about Vanessa solving problems via the pointy stick method; then again, if she was sufficiently good at it, perhaps there would have been no survivors to tell me about it. I think I said something clever like "Oh ha ha ha you noticed that blog post, eh?" Vanessa was in pretty good shape; in a fair fight, she could probably put me down. My legs were longer; I could probably outrun her.

She didn't stab me, of course. She wasn't angry; she looked happy. She smiled and said that Google Webmaster Central had a new feature, a feature that I would like. And she was right. It was a list of pages on other sites that linked to my site, noting which pages on my site they linked to. It listed more sites than Yahoo! had found, too. It was a lot of data, data that made me happy.

So now I'll list some other sites that link to this site, sites that didn't get mentioned in my previous post about the data from Yahoo!.

  • GeoURL links to pages based on geography. I've tagged some of my site's pages with latitude/longitude coordinates, and thus they are listed.
  • search.centraldatabase.org It's pages of search results. Normally if you run a search engine, you ask that your search results pages not be crawled/found by other search engines. Otherwise it looks like you're trying to serve spam pages on topics like "hamachi".
  • blogger.com When people post rebuttals to my blog posts, they do so via blogger.com. So there are a bunch of blogger.com pages that have links back to my blog posts.
  • 43things.com In theory, this web service allows you to maintain a "to do" list. I played with it for a while and then pretty much stopped. Still, when I did things that resulted in web pages, I used this service to link to those web pages.
  • en.wikipedia Apparently, I am a world authority on a few niche-y topics. Thus, some of my writings get linked from there. Also, people want to know what some things look like--things that real photographers don't bother to take photos of, but which I do. Thus, my photos occasionally get linked.
  • thebishop.net Hmm, Tim Bishop hasn't posted to his blog for a while. Back when he did post, I posted some snarky replies, and those linked back to my site.
  • looksmartjapanesefood.com Hmm, a dubious-looking page of search results surrounded by annoying animated ads. Hmm, the documentation for this link-listing tool did mention something about not filtering for webspam. Yeesh, the internet is a mess. Let's move on.
  • slashdot.org Yes, I occasionally post snide remarks to Slashdot the nerdly news site.
  • embruns.net My Paris travelogue annoyed this guy so much that he wrote a rant against it.
  • del.icio.us Some people use del.icio.us to bookmark pages on this site. (del.icio.us user featured in this sample link: Irwando of Team Sharkbait, yayy!)
  • youngpoets.ca Links to the Daily Nonsense page as a "fun" site. Is this a good time to say "Happy National Poetry Month!"? Oh wait, I guess that .ca at the end of their domain name means that they are Canadian. They probably don't celebrate the USA national poetry month. Philistines.
  • wikilens.org It's that book recommendation site I use; my profile page there links back to this site.
  • vcci.or.jp If your memory is very, very good then you might remember that a couple of puzzle hunts have referred to the ancient Japanese game Genjikou. Someone in Japan wrote a report on Genjikou and linked to one of my puzzle hunt write-ups. Apparently I am a world authority in some niche-y things.
  • Static Zombie Peter Sarrett blogs about a few things, including puzzle hunts. I have been known to leave a snide comment.
  • Spectre Collie Chuck Jordan made the mistake of working with me once--he was young and needed the money. Now he must endure me posting snide comments on his blog.
  • San Francisco Trusts in Cod This music band home page links to the No-Name Sushi menu.
  • Schneier on Security Once every couple of years, I feel obliged to post a comment on Bruce Schneier's blog.
  • inside looking out Charles Ying made the mistake of working with me once-- What's that? I already used that joke? OK, I'll stop.
  • Of Time and the River used one of my photos and gave me a photo credit
  • Hacker Tourism notes that I am a self-proclaimed hacker tourist.
  • Mobygames I worked for a year in the videogames industry, and it was all worth it for that link
  • Mirror Project Occasionally I take a photo of myself in the mirror. E.g., this one right after I took my pants off.
  • Matt Cutts: Gadgets, Google, and SEO I occasionally post unhelpful advice here.
  • Hogwarts Game write-up links to my write-up, which in turn links back there. It's a puzzle-hunt linkitude love-fest.
  • LOMO.HOMES: LENE2000 A. E. Graves has a few weird cameras. When she wants to post photos from her "lomo" camera, they go here.
  • Linkstew A few years ago, I left some comments in Benjy Stewart's blog. A year ago, I left some graffiti on his office whiteboard.
  • kottke.org Jason Kottke is a highly acclaimed web developer, but any schmoe can post comments on his blog.
  • Lorem Ipsum Is there any blog out there on the WWW that I haven't posted a comment on?
  • Isotope A while back, I posted something about this local comic book store. Then they posted a blog entry noting that I had posted about them. Now here I am posting about this blog entry noting their link to me. I think we have discovered the perpetual motion machine for the web.
  • In Passing... Another place I have posted comments. Everybody in the world is welcome to know my opinion about anything.
  • Firearm Buzz This website claims to have reviewed one of my web pages and determined that it's about ninja smoke bombs. For their level of quality, "reviewed" == "looked at the title, ignored the article". Frickin' webspam garbage.
  • Eve Andersson Back when the WWW was mostly chemistry grad students posting their office hours, a set of excellent web pages burst forth about the glory of the number Pi. That page was the work of Eve Andersson. When I have some meager pi-related information to share, I offer it up to her.
  • Embjapan.de Japan Forum German language speakers discuss Japanese performers of Jamaican-style music and link to an American's web site. I feel so cosmopolitan.
  • Defective Yeti Considering the name of Matthew Baldwin's awesome blog, you wouldn't think I'd need to post a comment with the correct spelling of "Wookiee".
  • Cockeyed.com I said disparaging things about penis jokes on a website called "cockeyed.com"?
  • Markov Googler It seemed like a good halfbaked idea at the time, I'm sure.
  • Black Pine Circle School: Us Never write your "about" page after going without sleep for three days.
  • Blorvak Even though someone helped create the excellent comic "Oddjob", I still feel obliged to post cryptic comments on their blog.
  • (link omitted) Do you remember a while back when AOL intentionally exposed the web history of some of their users? One of those users visited my site. Another site set up a pretty web site listing all of the pages that this user visited. That page links to my site. I won't link to it, though. AOL figured out that they were wrong to expose that information. It sure would be nice if that web site were to stop propagating it.
  • All Consuming Yes, I am on All Consuming. I can not think of anything to say about it right now, though.
  • tourb.us This is a site where I carefully keep track of which concerts I will never get around to attending.
  • sourceforge.net I worked on a program called Skitter Tag: where Open Source meets Abandonware!!
  • Scout Technologies I posted a modern art/embedded software joke as a comment on Julie Farago's blog. Apparently, I have no shame.
  • tribe.net I can no longer remember why I wanted a tribe.net profile
  • rodcorp Good grief. How many links does Google know about? I can't keep typing up cute little remarks about all of these! I think I will stop soon.
  • mapper.ofdoom.com I could have sworn that I saw some "2.0" version of the Mapper OfDoom that used Google maps to display the maps. Whatever happened to that?
  • Tom Lester's photos from that road trip we took a few years back
  • lahosken.googlepages.com Oh gee I forgot about that until just now.
  • 43 People Another robot co-op page. Here, I name-drop more web celebrities.
  • The Ageless Project In which we learn that I am older than dirt.
  • hk.knowledge.yahoo.com In China, I am regarded as a world expert on how to write "sushi" in English.
  • Geoswiki Wow, someone out there still cares about GEOS? Bless them.
  • fury.com Many beings leave comments on Kevin Fox's blog. Unlike many of those beings, I am human and not a spambot.
  • Engineering & Where I requested some technical support
  • Yahoo!.com Remember the Yahoo! web directory?
  • de.wikipedia An editor of the German wikipedia links to my site and seems to anticipate a day when the German wikipedia needs an entry for "Laurence Hoskens"? I suspect he's going to have a difficult time researching that topic.
  • I Blame the Patriarchy Despite my ongoing oppression of women, I occasionally dare to post comments on this blog.
  • Notes from the BillMonk Chuck Groom, not to be confused with Chuck Jordan, made the mistake of working with me for a... Oh, I'm just plain out of jokes.
  • le cadavre exquise I am tired. So tired. I don't know what to write about these links anymore. Please let the links stop?
  • Andrew Chatham I never would have made this post in Andrew's blog if I thought I was going to have to try to think of something to write about the resulting link now.
  • livejournal.com If you had asked me yesterday, I would have said that I thought that livejournal was pretty cool. But right now I would rather gnaw my fingers off than try to think of something to say about it.

Aiyee! That's enough. There's still at least another hundred sites to write about but... No. No more. My brain. So tired. So very tired.

I suppose that Vanessa had her revenge on me after all.

Milestone: 6 Million Hits

[Update: I meant seven-millionth. It's seven. I'm not re-celebrating six. Sorry, I posted this in a hurry, didn't proof-read, didn't fact check, sloppy work, sloppy.]

Good gracious, it is the site's six-millionth hit.

Let's look at the log record for that six-millionth hit:

71.68.117.149 - - [01/Dec/2006:10:49:32 -0400] "GET /departures/monterey/0/3292_from_water.jpg HTTP/1.1" 200 31201 "http://www.lahosken.san-francisco.ca.us/departures/monterey/0/" "Mozilla/4.0 (compatible; MSIE 6.0; America Online Browser 1.1; rev1.2; Windows NT 5.1; SV1; FunWebProducts; ZangoToolbar 4.8.3)"

This appears to be an AOL user visiting my Monterey travelog. They're at the page, the graphics are loading, and this log shows them downloading a photo of Monterey taken from a boat tour.

Last time the site hit such a milestone, I pulled some site information from Google Webmaster Central. This time, in the interest of fairness or whatever, I'll give some information from Yahoo Site Explorer. (As ever, my opinions are mine, not my employer's.) Yahoo Site Explorer shows many uninteresting things and one interesting thing. The interesting thing is: You can ask for a list of pages that link to your site--any page in your site. What sites link to this site?

Oh, and there's more sites, apparently. How long have I been working on this list? For an hour, I think. I'm at work now. I'm supposed to be working. Uhm... I'm going to stop this list now.

Milestone: 6 Million hits

Today, this website enjoyed its six-millionth hit. That hit was all about Amazingly Big things up in the Seattle area. I'm talking about the Pier 86 Grain Terminal and Microsoft. Let's take a look:

64.4.8.135 - - [13/Jul/2006:18:23:59 -0400] "GET /departures/Seattle/10/36views.html HTTP/1.0" 200 16858 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

This is the "msnbot" search crawler, scouting the internet for content to display in MSN Search results. It just crawled a page of my not-so-recently-updated travel photos, carefully confirming that they haven't changed.

(My opinions are mine, not my employers'.) Internet search geeks talk a lot about Google Search versus Yahoo search; they don't talk so much about MSN Search. But in one important regard in June 2006, MSN did even more to help internet users than Google did. So three cheers for MSN Search, a pocket of goodness buried in a big company, crawling the web so that they can show people of many nations some photos of huge grain silos.

I don't know how coherent that was. It's past my bedtime. Good night, wonderful internetty people of many nations. I hope you continue to find the Japanese ska reviews useful and/or inciteful.

Milestone: 5 Million Hits

Good gracious, it is the site's five-millionth hit. That's five million hits in seven years (plus a couple of days).

Let's look at the log record for that five-millionth hit:

71.112.234.3 - - [12/Mar/2006:03:00:50 -0400] "GET /departures/Seattle/11/03759_great_republic_painting_tm.jpg HTTP/1.1" 200 4070 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"

This appears to be a Verizon customer in the Seattle area looking at the New Year 2005 Berkeley-Vancouver road trip travelog. They're running MSIE. I don't know much about Microsoft software, but that .NET CLR 1.1.4322 might mean that this person is running an old version of the .NET framework--nowadays you see 2.0.50727 a bunch.

Most people find the site by searching Google. (My opinions and statements are mine, not those of my employer.) What kinds of things do they search for? The most popular searches are

  • hogtied
  • poke her
  • japanese punk
  • free comix
  • unknown phone numbers
  • ninja smoke bombs
  • telegraph machine
  • japanese ska
  • david thatcher
  • mongol 800

Thanks to the power of Google sitemaps, I can regularly check out which things people search for such that this site shows up in the results--whether or not people click to go to this site. So, what are the most popular queries for which people could click to go to this site, but don't?

  • tahoo
  • pg & e
  • puzzle
  • telcan
  • dblock
  • xjapan
  • antsy
  • demoui
  • bay area night game
  • porta portal

I guess if I want to write about one of those topics, I could, and be assured that I've got a head start showing up in search results pages for those words.

Anyhow, it's been a fun seven years. Thank you for reading!

Site: Milestone

Wow, it's the site's four-millionth web hit:

66.249.64.68 - - [28/Sep/2005:23:09:05 -0400] "GET /self/MBrooce/MBbruce.html HTTP/1.0" 304 - "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"

Ah, the Googlebot spider visited to confirm that I haven't changed the main MégaBröoce page since February. The price of excellence is eternal vigilance!
