[Edited to add: If you have questions or concerns about Caja, the Google Caja Discuss group is a good place to ask them.]
When you write a program that's supposed to be secure, you have to plan on security from the beginning; you can't bolt it on afterwards. The idiomatic way to describe a "plan" like "we'll write the program first and figure out the security later" is: "They're asking for some magical security fairy dust to sprinkle over their code."
I'm tweaking a JavaScript program that takes HTML from someone else and renders it on a page. I thought my program was getting "sanitized" HTML; that is, HTML that had any potentially-dangerous stuff removed. If I'm showing someone else's HTML on my page, I want to make sure that HTML doesn't have, for example, an <img src="http://sneaky.org/sneaky.gif"> in it. Otherwise, the webmaster of sneaky.org will know whenever someone reads my page.
I thought the program was getting sanitized HTML, but it was getting "raw" HTML, possibly chock-full of evil. Argh, I needed to bolt on some security. I went pleading to some of the security-minded folks for help. I was embarrassed--I 'fessed up that I needed some "magical security fairy dust". The amazing part is that those security-minded folks came through--they pointed me at Caja.
Caja is primarily a system for enforcing security "capabilities" in JavaScript. But even if you don't need all of that, you might still want one part:
Caja comes with an XSS sanitizer for HTML that works with your JS code: html-sanitizer.js. You'll also need html4-defs.js. It looks like you need to build html4-defs.js via Ant. That's kinda annoying, but a lot easier than writing your own HTML sanitizer from scratch.
I looked over the source code. It's checking for bad stuff I hadn't thought to check for. I sure am glad that folks more knowledgeable than me are working on this thing.
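Here's a rough sketch of how wiring the sanitizer into a page might look. Treat it as an illustration under assumptions, not as Caja's documented usage: it assumes html-sanitizer.js exposes a global html_sanitize function that takes the HTML plus a URL policy callback, and that the callback returns the URL to keep it or null to drop it. The exact argument the callback receives can differ between Caja versions, so the policy below converts it to a string first. ALLOWED_HOSTS, sameSiteUrlPolicy, and renderUntrusted are made-up names for this example.

```javascript
// Hypothetical allowlist: the only hosts that sanitized HTML may point at.
const ALLOWED_HOSTS = new Set(['example.com', 'static.example.com']);

// URL policy sketch: return the URL to keep it, or null to drop it.
// Accepts a string or any object whose toString() yields the URL.
function sameSiteUrlPolicy(url) {
  let parsed;
  try {
    parsed = new URL(String(url));
  } catch (e) {
    return null; // relative or malformed URLs are rejected in this sketch
  }
  const isHttp = parsed.protocol === 'http:' || parsed.protocol === 'https:';
  return isHttp && ALLOWED_HOSTS.has(parsed.hostname) ? parsed.href : null;
}

// Wrapper sketch: refuse to hand back raw HTML if the sanitizer is missing.
// Assumes html_sanitize(html, urlPolicy) is provided by html-sanitizer.js.
function renderUntrusted(html) {
  if (typeof html_sanitize !== 'function') {
    throw new Error('html-sanitizer.js is not loaded');
  }
  return html_sanitize(html, sameSiteUrlPolicy);
}
```

With a policy like this, the sneaky.org image from earlier would have its URL rejected, while images hosted on the allowlisted hosts would pass through. Failing loudly when the sanitizer isn't loaded is deliberate: the bug that started this post was raw HTML slipping through unnoticed.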
Hi Larry! Great use of Caja! :) We don't have a "1.0" release of Caja but we do push updates at http://google-caja.googlecode.com/svn/maven/caja/caja/*/caja-*.jar. That jar already contains a pre-built html4-defs.js, which removes the dependence on Ant.
Plus, we also include html-sanitizer-minified.js in that jar. It is html4-defs.js, css-defs.js and html-sanitizer.js concatenated and minified: a single file with the same functionality but fewer calories, so it downloads quicker.
Regards
Your Friendly Neighborhood Cajadore
In addition to being in the JAR, html4-defs.js used to live in the svn repo:
http://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/html4-defs.js?spec=svn2833&r=2466