Working with CouchDB’s Map/Reduce Framework
This afternoon I've been doing a lot with CouchDB's map/reduce framework for querying data out of CouchDB. The terminology is pretty simple: a design document can hold multiple Views, where each View has a Map function that looks at each document in the database and emits something based on an inspection of its data, and an optional Reduce function that takes the results of the Map function calls and reduces them to a smaller dataset.
It's a pretty standard pattern in a lot of languages: first you operate on the individual elements in a collection, and then you summarize those values. In CouchDB it's all in JavaScript. That's not bad. I've done a lot of that in my day, so it's pretty easy to get back into the swing of things.
One interesting detail is that CouchDB is written in Erlang, and while I don't see myself digging into the guts of this thing, it's interesting to know where it all comes from, as it makes it a lot easier to understand why they chose JavaScript, for instance.
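To make the terminology concrete, here is a rough sketch of how a saved view sits inside a design document. The names `_design/merchants`, `by_division`, and `mapByDivision` are made up for illustration and are not from my actual database; `_count` is one of CouchDB's built-in reduce functions. The map function is stored as a string of source code.

```javascript
// Hypothetical map function: one row per document, keyed by its division.
// emit() is supplied by CouchDB's view server; it is never called here.
function mapByDivision(doc) {
  if (doc.division) {
    emit(doc.division, 1);
  }
}

// Hypothetical design document holding a single view.
const designDoc = {
  _id: "_design/merchants",
  views: {
    by_division: {
      map: mapByDivision.toString(), // views store the function source as a string
      reduce: "_count"               // built-in reduce: counts the emitted rows
    }
  }
};

console.log(JSON.stringify(designDoc, null, 2));
```

A design document can carry more than one entry under `views`, each with its own map and optional reduce.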
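The same two-step shape shows up in plain JavaScript arrays, outside of CouchDB entirely. This is only an analogy using made-up data (the `orders` array is hypothetical), not CouchDB's own API:

```javascript
// Hypothetical sample data, just for illustration.
const orders = [
  { merchant: "Acme Pizza", total: 10 },
  { merchant: "Bob's Bikes", total: 20 },
  { merchant: "Cafe Uno", total: 30 }
];

// Step 1 (map): pull one value out of each element.
const totals = orders.map(function (order) {
  return order.total;
});

// Step 2 (reduce): summarize those values into a single result.
const sum = totals.reduce(function (acc, total) {
  return acc + total;
}, 0);

console.log(totals, sum);
```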
Anyway, let's say I want to see all the merchants that have no OTCs assigned to them. I'd create a Temporary View in the CouchDB web page, and then in the View Code I'd have something like this:
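function(doc) {
  // Only QuantumLead result documents with no OTCs assigned.
  // The doc.meta and doc.otcs checks skip documents that lack those fields.
  if (doc.meta && doc.meta.label === "QuantumLead.results" &&
      doc.otcs && doc.otcs.length === 0) {
    // Multi-part key: division first, then creation time.
    var key = [doc.division, doc.meta.created];
    var blob = { name: doc.merchant.name, sf_id: doc.merchant.sf_id };
    emit(key, blob);
  }
}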
The interesting part here is that the emit() function is really where the action is. When we want to add a row to the output of this Map function, we call emit() with the key as the first argument and the value as the second. The key, as shown here, can be a multi-part key (an array), and the value can be any JavaScript value that can be serialized as JSON.
The thing I like about the use of JavaScript here is that a document's attributes are plain object properties, so they read as dotted names and not hash lookups. This makes it so much easier to reference the data within a doc by just using the key names and dots. Very nice use of JavaScript.
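To see what emit() produces, here is a small stand-alone sketch that fakes emit() and runs the same kind of map function over a few made-up documents. The sample docs, the `rows` array, and `compareKeys` are all hypothetical, and the sort is only a crude stand-in for CouchDB's real key collation:

```javascript
// Hypothetical sample documents shaped like the ones the view expects.
const docs = [
  { meta: { label: "QuantumLead.results", created: "2012-10-02" },
    division: "cleveland", otcs: [],
    merchant: { name: "Acme Pizza", sf_id: "001A" } },
  { meta: { label: "QuantumLead.results", created: "2012-10-01" },
    division: "cleveland", otcs: [{ id: 1 }],
    merchant: { name: "Cafe Uno", sf_id: "001C" } },
  { meta: { label: "QuantumLead.results", created: "2012-10-01" },
    division: "austin", otcs: [],
    merchant: { name: "Bob's Bikes", sf_id: "001B" } },
  { meta: { label: "something.else", created: "2012-10-01" },
    division: "austin", otcs: [],
    merchant: { name: "Ignored", sf_id: "001D" } }
];

// Stand-in for CouchDB's emit(): collect each key/value pair as a row.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// The map function from the view code.
function map(doc) {
  if (doc.meta.label == "QuantumLead.results" && doc.otcs.length == 0) {
    var key = [doc.division, doc.meta.created];
    var blob = { name: doc.merchant.name, sf_id: doc.merchant.sf_id };
    emit(key, blob);
  }
}

// Simplified key ordering: compare array keys element by element.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

docs.forEach(map);
rows.sort(function (a, b) { return compareKeys(a.key, b.key); });
console.log(JSON.stringify(rows, null, 2));
```

Only the two merchants with an empty `otcs` list and the right label end up in `rows`, ordered by division and then by creation time.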
So now that I have my first few Views and Documents in the system, I need to work on getting things out of these calls, and into some nicely formatted output for the important demo that's coming up.