Larry Hosken: New

Book Report: She Come by it Natural

It's a biography of Dolly Parton braided together with a bit of the book author's family background. That family background might be summarized as "poor, hard-working woman," and helps the probably-privileged-upbringing reader to appreciate Parton's extra appeal to folks with that background. Parton worked hard, was poor, underestimated, disrespected… but she was talented and got very lucky, and eventually was rich, underestimated, and disrespected. But you, dear reader, will respect Parton once you read about the obstacles she overcame.

screen shot of linked tumblr post



Neeva's shutting down. Neeva was a subscription-funded search company, in contrast to the usual free-with-ads search.

I used Neeva for a while; it cleared up something I was curious about, poked a hole in some received wisdom.

Back when I worked at Google, part of the company echo chamber chatter was that collecting user data was useful for personalized search, not just useful for ads. The usual example: if the user searches for [jaguar], it's not immediately clear which sense of the word they're interested in, car or cat. But if you've remembered the user's recent searches, you might know which sense of "jaguar" they're interested in; if they recently searched for directions to the zoo, show them the cat; if they recently searched for [fiat], show them the car. I nodded along; this sounded reasonable. But as I paid attention to my searches, it eventually dawned on me that I didn't actually make any ambiguous searches like [jaguar].

But I wasn't 100% sure. Maybe I thought [how many minutes steam yam] was unambiguous, but maybe that's because Google was kindly steering me to the recipe pages, and steering me away from some pages with some hypothetical other sense of yam steaming. (If there are other meanings of "steaming yams," please don't tell me. I have delicate sensibilities.)

One of Neeva's features was that it didn't try to figure out your interests from your searches, nice from a privacy point of view. I used it to figure out whether my web searches were ambiguous.

They weren't. Apparently, my searches are clear and unambiguous. I guess that's true for pretty much everybody.

Sure, the word "hulk" is ambiguous: could be the superhero or just a word for something big. But if I'm searching the internet for something, I'm probably not thinking "it should be something big, but I don't especially care what that something is." That's not really something people do. The word "hulk" is ambiguous, but the web search [hulk] is almost certainly about the superhero.

Anyhow, pour one out for Neeva.



Only 80s kids will understand this exciting development in the field of artificial intelligence:

screen shot of chat room. one person wrote "Bard already thinking two moves ahead of me" with a screen shot of Bard AI in which Bard posts a picture of a tic-tac-toe game saying "You cant put your O in the center square, because I already have my X there. It is your turn again. Where would you like to put your O?" but the center square is empty. lahosken replies: "The only winning move is not to let your opponent play"



I pre-ordered the book 50 Years of Text Games and you can, too. It's a book of 50+ essays about computer games; the author chose one game to write about per year from 1971 to 2020. If that has you thinking "oh hey that sounds kind of familiar maybe?": some years back, I was reading early drafts of those essays, which were newsletter posts at the time. They were pretty interesting. Some were about games I'd already played, and were good for nostalgia; some were about games I hadn't played, and were good for learning. If you're not sure whether you're interested, the book's website has a list of the written-about games and a link to the newsletter where you can read the posts that got me hooked.



My reports of the death of California Open Data Portal's COVID-19 wastewater surveillance data were exaggerated or premature or something. When I emailed the CDPH folks to whine that their page linked to stale data, a nice CDPH person wrote back:

…We are in the process of updating our data processes. This data on the California Open Data Portal should be updated by the end of this week. We appreciate your patience.…

So maybe I'll have raw data to play with again in a few days? Yay!



Oof, when I blogged about "heading in to Starbucks," that was before I heard about them closing all their stores in Ithaca, a town where all the Starbucks had voted to unionize. On second thought, I guess I'll go to Peet's this morning.



[Update: When I wrote this, I assumed that the California Open Data Portal had permanently stopped updating their Cal-SuWers data. But I was wrong; it was just "on pause" for a while as they changed some processes. I regret the error. OTOH, I'm glad to have been wrong in this case…]

I continue to check my little dashboard of San Francisco COVID-19 numbers each morning when deciding whether heading in to Starbucks is reasonable or embarrassingly risking my health for coffee. For the past few days, the wastewater measurement has been kinda troubling:

Line graph showing three lines. Two of the lines are pretty stable. But the green line shot up steeply.

That brown-green line is wastewater data. For a long time it was slowly drifting down (yay!) but recently, it shot up steeply to a level up above the pretty-safe level. And then new data stopped coming in, so my dashboard graph kept showing the most-recent number, still up above the pretty-safe level.

San Francisco's wastewater data is pretty "noisy": big swings up and down. When that upward spike first appeared, I wasn't too worried. I'd seen spikes like that before; as more data came in, that seemingly-scary spike would probably turn out to be an outlier.

But then I realized I'd been looking at that probably-an-outlier number for over a week. That was unusual. So I investigated, and finally noticed in the documentation for the data my dashboard was fetching that the data hadn't updated since April 27. I was fetching my data from California's Open Data Portal, which showed data from the California Water Board. Too bad that they'd stopped updating, but maybe not surprising since California had stopped its COVID emergency measures in early April.

I checked the California Department of Public Health's dashboard. It had recent data! It hadn't stopped updating on April 27!

With some poking around, I have a guess at what happened:

Anyhow, the nice California Department of Public Health people make their computed data available, so I suppose I'll switch my dashboard to use that:

Line graph showing three lines. All three of the lines are pretty stable.

This chart looks similar-but-different. San Francisco has two sewage treatment plants. When I had the "raw" data, I could do fancy-pants calculations where I'd give one plant's measurement more "weight" if it had measured a bigger sample. Since I no longer have the "raw" data, I don't know the sample sizes anymore, so I just average together the two plants' measurements willy-nilly. It's fine.
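For the curious, here is a minimal Python sketch of the difference between those two kinds of averaging. The plant readings and weights are made-up numbers for illustration, and the function names are hypothetical; this is not the dashboard's actual code.

```python
def weighted_average(measurements, weights):
    """Average the measurements, giving each one influence proportional to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(m * w for m, w in zip(measurements, weights)) / total_weight


def simple_average(measurements):
    """Plain average: every measurement counts the same."""
    return sum(measurements) / len(measurements)


# Hypothetical readings from two plants (arbitrary units).
plant_readings = [100.0, 200.0]
# Hypothetical weights, e.g. relative sample sizes; only known with the raw data.
plant_weights = [3.0, 1.0]

print(weighted_average(plant_readings, plant_weights))  # leans toward the heavier-weighted plant
print(simple_average(plant_readings))                   # treats both plants equally
```

With these invented numbers the weighted average lands closer to the first plant's reading, while the plain average sits exactly halfway between the two. When the two plants report similar values, the two methods give similar answers.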

Most importantly, this similar-but-different graph agrees with the old graph in that it suggests that the recent fall in reported new COVID cases was for real, not just false hope from reduced testing.



RSAC (the RSA computer security convention) was in town, and for a while security advertisement abounded. This ad was my favorite. Seen from across the street, the ad seemed to say that "Huntress" was some tool that mean computer crooks would use to target my business. Only when I got closer did I see that actually this Huntress thing was somehow supposed to protect me from this targeting.

bus shelter ad with copy: Hackers are targeting your business. [something tiny] Huntress is how

bus shelter ad with copy: Hackers are targeting your business. You need to be ready. Huntress is how

Like, if you're a CISO, one minute you're making a mental note to scan for this thing to expunge it from your network… and then the next minute you're cutting a check to install it.



If you liked my writeup of being on the 2020 MIT Mystery Hunt writing team, you'll love Alex Irpan's writeup of being on the 2023 MIT Mystery Hunt writing team. Thrills! Chills! Image memes! Crises that really make you appreciate some of the internal "milestone" deadlines that the 2020 MIT Mystery Hunt Triumvirate set! OK, maybe you had to be there. If you're considering winning Mystery Hunt, I recommend this writeup.



I continue to watch my little dashboard of San Francisco COVID-19 numbers to figure out if activities like in-person grocery shopping are ordinary errands or dangerous stunts. I watch a few numbers. Folks argue about which number to watch; and I roll my eyes: folks have settled on a few numbers to watch that are all pretty good. And because they're all pretty good, they tend to agree. Except except except not lately, not in San Francisco (and, I bet, not in California generally).

graph charting three COVID-19 measures. Two are drifting down (yay) but one stubbornly stays above the "safe" line

In late March, the numbers were above the pretty-safe line (argh), but slowly drifting down (yay). Then, in the first week of April, the state of California relaxed a bunch of restrictions on hospitals. Many people who periodically had to get tested didn't have to anymore. Maybe ½ to ⅔ as many tests took place each day.

The absolute number of new cases reported in San Francisco continued to fall. (That's the red line on my graph.) Did that mean that COVID-19 was in decline in SF? The fraction of tests in San Francisco that came back positive (the purple graph line) stopped falling; instead, it held steady. Did that mean that COVID-19 was holding on in SF, but that you might think it was in decline because fewer asymptomatic-but-infected people were getting tested?

It makes sense that the fraction of positive tests might not decline as fast as the fraction of infected people. San Francisco reports statistics for the super-duper official PCR tests administered by medical professionals. If some unhappy person gives themselves an at-home test and it's positive, that person then goes to get a super-duper official test so that their health insurance will pay for their treatment. There aren't many of these almost-certainly-infected people getting tested compared to the number of probably-not-infected hospital-people; but starting in early April, the fraction of almost-certainly-infected people went up, because many probably-not-infected hospital-people stopped getting tested.
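A worked example with invented numbers may make that arithmetic easier to see. Every count below is hypothetical, chosen only to show the effect; none of it is San Francisco data. In the "after" scenario, routine testing is cut in half by the rule change and infections have also roughly halved, yet the positive fraction barely moves.

```python
def positivity(routine_tests, routine_positive, symptomatic_tests, symptomatic_positive):
    """Fraction of all reported tests that came back positive."""
    total_tests = routine_tests + symptomatic_tests
    total_positive = routine_positive + symptomatic_positive
    return total_positive / total_tests


# Hypothetical "before": lots of routine tests of probably-not-infected people.
before = positivity(routine_tests=1000, routine_positive=10,
                    symptomatic_tests=50, symptomatic_positive=40)

# Hypothetical "after": half as many routine tests, and about half as many infections.
after = positivity(routine_tests=500, routine_positive=4,
                   symptomatic_tests=25, symptomatic_positive=20)

print(f"before: 50 cases, positivity {before:.1%}")
print(f"after:  24 cases, positivity {after:.1%}")
```

In this toy scenario the reported case count drops from 50 to 24, but the positive fraction only moves from about 4.8% to about 4.6%, because the almost-certainly-infected group makes up a bigger share of the people still getting tested.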

In hindsight, I think COVID-19 really is in decline in SF. I say that because the excellent folks at our wastewater plants are finding less COVID-19. (That's the green-brown line on my graph.) Normally, I don't pay much attention to the wastewater-COVID numbers from day to day. Those measurements are very noisy! You're looking at my graph and saying "that looks pretty jagged; you should smooth out those numbers" and I'm telling you: those are the smoothed-out numbers. If you pay too-close attention to the wastewater numbers, one day you think "OMG we cured the COVID last week but somehow it didn't make the news" and the next day you think "OMG COVID runs rampant in the streets and we are all doomed." But the wastewater numbers don't care that a bunch of hospital-people no longer need to get tested periodically. Everybody in San Francisco, uhm, contributes to our wastewater tests. And though that graph is very jagged, it's also trending down.

So instead of waiting for that purple line to descend to a "safe" level, I guess I'll just multiply together the numbers I have, compare them to the multiplied-together "safe" numbers, and act accordingly. In happier times, I might keep an eye out for some smartie on Twitter to pipe up with a well-researched new "safe" level. But as Twitter continues to implode… if some smartie does try to post there, there's no guarantee that I'd notice.
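One way to read "multiply together and compare" is sketched below in Python. The post doesn't say exactly which numbers get multiplied or what the "safe" values are, so the metric names, the values, and the function name are all placeholders.

```python
import math


def combined_ratio(current, safe):
    """Product of the current readings divided by the product of the 'safe' levels.

    A result below 1 means that, taken together, the readings are under the
    combined 'safe' threshold, even if one of them is individually above its own.
    """
    current_product = math.prod(current[name] for name in safe)
    safe_product = math.prod(safe.values())
    return current_product / safe_product


# Placeholder values: positivity sits above its own "safe" level,
# while wastewater sits well below its level.
current = {"new_cases": 2.0, "positivity": 3.0, "wastewater": 0.5}
safe = {"new_cases": 2.0, "positivity": 2.0, "wastewater": 1.0}

ratio = combined_ratio(current, safe)
print("in-person errands look OK" if ratio < 1 else "maybe stay in")
```

The appeal of a product is that one stubborn measure can be outvoted by the others. The flip side is that a single very low reading can mask a genuinely worrying one, so this is a rough rule of thumb rather than anything rigorous.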

Anyhow, I guess I'll go grocery shopping in-person soon; and I'll keep on keeping on looking at my little dashboard to decide whether to do things in-person or to re-retreat.



