mapred
Class Count

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by mapred.Count
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class Count
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

EPiC MapReduce main class. Depending on the parameters provided on the command line, the appropriate MapReduce class is executed. Currently, two implementations of the counting job can be called via Count:

MapRedEpic
EPiC approach - counting on encrypted fields.
MapRedPlainCountOne
Plain-text counting.

Usage of Count via command-line:

hadoop jar <JARFILE> mapred.Count [options] <input> <output>

JARFILE
The JAR file containing EPiC.
input
An HDFS path to the directory containing the input data.
Note: Only files in the specified directory whose names start with "data" are read.
output
An HDFS path to the directory containing the results.
Note: Existing output will be automatically removed when Count is started.
The options given to Hadoop must be prefixed by "-D". The following options are supported:
paramfile
Name of the encryption parameters file. The path to the parameters file must be relative to this class inside the JAR file.
Example: -Dparamfile=params.txt
request
An HDFS path to the user's request file containing the counting query. The specified path is relative to the input directory.
Example: -Drequest=request
mapred
Specifies which MapReduce approach is executed. Currently, the following values are supported:
epic
Calls MapRedEpic to execute the query, applying a variant of the approach presented in the EPiC paper. Specifically, the provided request contains the encrypted coefficients of the queried indicator polynomial. The Mappers evaluate the indicator polynomial for each record by multiplying the monomials by the coefficients and adding the products together. In the last step, the Reducer adds the results from the Mappers (each now the value of the indicator polynomial evaluated for the corresponding subset) to obtain the final result, which is returned to the user.

The approach presented in the EPiC paper is implemented in MapRedEpicReducerEvaluate, in which the Mappers compute the monomials without multiplying them by the coefficients. In the final step, the Reducer adds the results from the Mappers together and then multiplies the sums by the given coefficients to yield the final result.

An older EPiC approach (see MapRedNotSendCoeff) keeps the coefficients on the user side. The Mappers and Reducers only compute the monomials and add them together, then return the sums to the user, who is responsible for multiplying them by the precomputed coefficients to obtain the counting result. However, this requires much more communication to download the results and is therefore impractical.

plain
Calls MapRedPlainCountOne to count on plain-text values of multiple countable fields. This implementation does not support range counting, boolean expressions, etc. Another illustration of plain-text counting is implemented in MapRedPlainCountAll, which counts all possible values in one MapReduce job.
Example: -Dmapred=epic
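
The difference between the "epic" variants described above (multiplying by the coefficients at the Mappers versus at the Reducer) can be illustrated with a small plain-text sketch. This is not the EPiC implementation: it uses no encryption, and the small prime modulus, the single-field records, the Lagrange-style construction of the coefficients, and all class and method names are assumptions made only for this illustration. It shows that both orders of evaluation yield the same count as a direct plain-text count.

```java
class IndicatorPolynomialSketch {
    // Hypothetical small prime modulus; the count must stay below it.
    static final long P = 101;

    static long mod(long a) {
        return ((a % P) + P) % P;
    }

    static long modPow(long base, long exp) {
        long result = 1;
        base = mod(base);
        while (exp > 0) {
            if ((exp & 1) == 1) result = result * base % P;
            base = base * base % P;
            exp >>= 1;
        }
        return result;
    }

    // Coefficients a[0..d-1] of the polynomial that equals 1 at `target`
    // and 0 at every other value in {0, ..., domainSize - 1} (mod P).
    static long[] indicatorCoefficients(int domainSize, int target) {
        long[] coeffs = {1};
        long denominator = 1;
        for (int u = 0; u < domainSize; u++) {
            if (u == target) continue;
            // Multiply the current polynomial by (x - u).
            long[] next = new long[coeffs.length + 1];
            for (int j = 0; j < coeffs.length; j++) {
                next[j + 1] = mod(next[j + 1] + coeffs[j]);
                next[j] = mod(next[j] - coeffs[j] * u);
            }
            coeffs = next;
            denominator = mod(denominator * (target - u));
        }
        long scale = modPow(denominator, P - 2); // modular inverse (Fermat)
        for (int j = 0; j < coeffs.length; j++) coeffs[j] = coeffs[j] * scale % P;
        return coeffs;
    }

    // Variant 1: evaluate the full polynomial per record, then add the values.
    static long countMapperSide(int[] records, long[] coeffs) {
        long total = 0;
        for (int x : records) {
            long value = 0;
            for (int j = 0; j < coeffs.length; j++) {
                value = mod(value + coeffs[j] * modPow(x, j));
            }
            total = mod(total + value);
        }
        return total;
    }

    // Variant 2: add up the monomials x^j first, multiply by coefficients last.
    static long countReducerSide(int[] records, long[] coeffs) {
        long[] monomialSums = new long[coeffs.length];
        for (int x : records) {
            for (int j = 0; j < coeffs.length; j++) {
                monomialSums[j] = mod(monomialSums[j] + modPow(x, j));
            }
        }
        long total = 0;
        for (int j = 0; j < coeffs.length; j++) {
            total = mod(total + coeffs[j] * monomialSums[j]);
        }
        return total;
    }

    // Direct plain-text count, used here only as a cross-check.
    static long countPlain(int[] records, int target) {
        long count = 0;
        for (int x : records) if (x == target) count++;
        return count;
    }

    public static void main(String[] args) {
        int[] records = {2, 0, 3, 2, 1, 2, 3}; // made-up field values in {0..3}
        long[] coeffs = indicatorCoefficients(4, 2);
        System.out.println("plain:        " + countPlain(records, 2));
        System.out.println("mapper-side:  " + countMapperSide(records, coeffs));
        System.out.println("reducer-side: " + countReducerSide(records, coeffs));
    }
}
```

In the sketch, the reducer-side variant only needs the per-monomial sums, which is also what the older user-side approach would have to return to the user; the mapper-side variant reduces everything to a single value before it leaves the job.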

Author:
vohuudtr

Constructor Summary
Count()
           
 
Method Summary
static void main(java.lang.String[] args)
          Entry point of the class.
 int run(java.lang.String[] args)
          Run the job.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Constructor Detail

Count

public Count()
Method Detail

run

public int run(java.lang.String[] args)
        throws java.lang.Exception
Run the job.

Specified by:
run in interface org.apache.hadoop.util.Tool
Parameters:
args - argument list for the running job.
Throws:
java.lang.Exception - if errors occur.

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Entry point of the class.

Parameters:
args - command-line arguments provided to the class.
Throws:
java.lang.Exception - if errors occur.
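
The command-line layout described for Count (options prefixed by "-D", followed by the input and output paths) can be sketched without any Hadoop dependency. In Hadoop itself, "-D" properties are normally handled by GenericOptionsParser (typically via ToolRunner), which places them in the job Configuration and passes the remaining arguments to run; whether Count.main does exactly this is an assumption. The class below is hypothetical, only mimics that split, and handles only the joined "-Dkey=value" form shown in the examples above. The paths in main are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CountArgsSketch {
    // Properties given as -Dkey=value, in the order they appeared.
    final Map<String, String> options = new LinkedHashMap<>();
    // Everything else, e.g. the <input> and <output> paths.
    final List<String> positional = new ArrayList<>();

    static CountArgsSketch parse(String[] args) {
        CountArgsSketch parsed = new CountArgsSketch();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                parsed.options.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                parsed.positional.add(arg);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        CountArgsSketch parsed = parse(new String[] {
            "-Dparamfile=params.txt",
            "-Drequest=request",
            "-Dmapred=epic",
            "/user/alice/input",   // hypothetical HDFS input directory
            "/user/alice/output"   // hypothetical HDFS output directory
        });
        System.out.println("options:    " + parsed.options);
        System.out.println("positional: " + parsed.positional);
    }
}
```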