FileMap is a lightweight system for applying Unix-style file processing tools to large amounts of data stored in files. It provides full map-reduce functionality without requiring that you switch your processing to any particular language or runtime environment, install any special software, or have root on your storage and processing nodes.
Example: Compute word frequencies in a text corpus. FileMap stores the files across a set of machines and runs the pipeline on them in parallel. The word lists are divided up across the nodes and tallied in parallel:
$ fm map -i "/etext/*" "sed -f words.sed | fm split -n 9 |> sort | uniq -c"
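For comparison, here is a rough single-machine sketch of the same word count using only standard Unix tools. It is an illustration, not part of FileMap: `word_freq` is a hypothetical helper name, and because the contents of `words.sed` are not shown here, `tr` stands in for it on the assumption that the sed script emits one word per line.

```shell
# Single-machine sketch of the word-frequency pipeline (no FileMap involved).
# Assumption: words.sed splits text into one word per line; tr approximates that.
word_freq() {
    tr -cs 'A-Za-z' '\n' |              # turn every run of non-letters into a newline
        tr 'A-Z' 'a-z' |                # fold to lowercase so "The" and "the" match
        sed '/^$/d' |                   # drop an empty leading line, if any
        sort | uniq -c                  # group identical words and count them
}

# Usage: feed text on stdin; each output line is a count followed by a word.
printf 'the cat and the hat\n' | word_freq
```

The `sort | uniq -c` tail is the same tally step that appears at the end of the FileMap command above; the difference is that FileMap runs it on each node's share of the words rather than on one machine.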
Issue tracking is hosted at Google Code.